-
Martin Killenberg authoredMartin Killenberg authored
spec_dataValidityPropagation.dox 11.87 KiB
// put the namespace around the doxygen block so we don't have to give it all the time in the code to get links
namespace ChimeraTK {
/**
\page spec_dataValidityPropagation Technical specification: data validity propagation Specification version V1.0RC2
> **This is a release candidate in implementation**
> **NOTICE FOR FUTURE RELEASES: AVOID CHANGING THE NUMBERING!** The tests refer to the sections, incl. links and unlinked references from tests or other parts of the specification. These break, or even worse become wrong, when they are not changed consistenty!
1. General idea
---------------------------------------
\anchor dataValidity_1
- 1.1 In ApplicationCore each variable has a data validiy flag attached to it. DataValidity can be 'ok' or 'faulty'.
- 1.2 This flag is automatically propagated: If any of the inputs of an ApplicationModule is faulty, the data validity of the module becomes faulty, which means
all outputs of this module will automatically be flagged as faulty.
Fan-outs might be special cases (see 2.4).
- 1.3 If a device is in error state, all variables which are read from it shall be marked as 'faulty'. This flag is then propagated through all the modules (via 1.1.2) so it shows up in the control system.
- 1.4 The user code has the possibility to query the data validity of the module
- 1.5 The user code has the possibility to set the data validity of the module to 'faulty'. However, the user code cannot actively set the module to 'ok' if any of the module inputs are 'faulty'.
- \anchor dataValidity_1_6 1.6 The user code can flag individual outputs as bad. However, the user code cannot actively set an output to 'ok' if the data validity of the module is 'faulty'. \ref dataValidity_comment_1_6 "(*)"
- 1.7 The user code can get the data validity flag of individual inputs and take special actions.
- 1.8 The data validity of receiving variables is set to 'faulty' on construction. Like this, data is marked as faulty as long as no sensible initial values have been propagated.
### Comments
- \anchor dataValidity_comment_1_6 \ref dataValidity_1_6 "1.6": The alternative implementation to add a data validity flag to the write() function is not a good solution because it does not work with writeAll().
2. Technical implementation
---------------------------
### 2.1 MetaDataPropagatingRegisterDecorator (*)
- 2.1.1 Each input and each output of a module (or fan out) is decorated with a MetaDataPropagatingRegisterDecorator (except for the TriggerFanOut, see. \ref dataValidity_2_4 "2.4")
- 2.1.2 The decorator knows about the module it is connected to. It is called the 'owner'.
- 2.1.3 **read:** For each read operation it checks the incoming data validity and increases/decreases the data fault counter of the owner.
- \anchor dataValidity_2_1_5 2.1.5 **write:** When writing, the decorator is checking the validity of the owner and the individual flag of the output set by the user. Only if both are 'ok' the output validity is 'ok', otherwise the outgoing data is send as 'faulty'.
### 2.2 removed
### 2.3 ApplicationModule
- \anchor dataValidity_2_3_1 2.3.1 Each ApplicationModule has one data fault counter variable which is increased/decreased by EntityOwner::incrementDataFaultCounter() and EntityOwner::decrementDataFaultCounter.
- 2.3.2 All inputs and outputs have a MetaDataPropagatingRegisterDecorator.
- \anchor dataValidity_2_3_3 2.3.3 The main loop of the module usualy does not care about data validity. If any input is invalid, all outputs are automatically invalid \ref dataValidity_comment_2_3_3a "(*)". The loop just runs through normaly, even if an input has invalid data. \ref dataValidity_comment_2_3_3b "(*)"
- 2.3.4 Inside the ApplicationModule main loop the module's data fault counter is accessible. The user can increment and decrement it, but has to be careful to do this in pairs. The more common use case will be to query the module's data validity.
### 2.4 TriggerFanOut
\anchor dataValidity_2_4
The TriggerFanOut is special in the sense that it does not compute anything, but reads multiple independent poll-type inputs when a trigger arrives, and pushes them out. In contrast to an ApplicationModule or one of the data fan-outs, their data validities are not connected.
- 2.4.1 Only the push-type trigger input of the TriggerFanOut is equiped with a MetaDataPropagatingRegisterDecorator.
- \anchor dataValidity_2_4_2 2.4.2 The poll-type data inputs do not have a MetaDataPropagatingRegisterDecorator. \ref dataValidity_comment_2_4_2 "(*)".
- 2.4.3 The individual poll-type inputs propagate the data validity flag only to the corresponding outputs.
- \anchor dataValidity_2_4_4 2.4.4 Although the trigger conceptually has data type 'void', it can also be `faulty` \ref dataValidity_comment_2_4_4 "(*)". An invalid trigger is processed, but all read out data is flagged as `faulty`.
### 2.5 Interaction with exception handling
See @ref spec_execptionHandling.
- 2.5.1 The MetaDataPropagatingRegisterDecorator is always placed *around* the ExceptionHandlingDecorator if both decorators are used on a process variable. Like this a `faulty` flag raised by the ExceptionHandlingDecorator is automatically picked up by the MetaDataPropagatingRegisterDecorator.
- 2.5.2 The first failing read returns with the old data and the 'faulty' flag. Like this the flag is propagated to the outputs. Only further reads might freeze until the device is available again.
### 2.6 Application class
- 2.6.1 For device variables, the requirement of setting receiving endpoints to 'faulty' on construction can not be fulfilled. In DeviceAccess the accessors are bidirectional and provide no possibility to distinguish sender and receiver. Instead, the validity is set right after construction in Application::createDeviceVariable() for receivers.
### Comments:
- 2.1 The MetaDataPropagatingRegisterDecorator also propagates the version number, not only the data validity flag. Hence it's not called DataValidityPropagatingRegisterDecorator.
- \anchor dataValidity_comment_2_3_3a \ref dataValidity_2_3_3 "2.3.3" If a module has an output that would still be valid, even though one of the inputs is faulty, it means that the output is not connected to that particular input. This is an indicator that the module is doing unrelated things and probably should be split.
- \anchor dataValidity_comment_2_3_3b \ref dataValidity_2_3_3 "2.3.3" A change of the data validity of the module does not automatically change the validity on all outputs. A module might not always write all of its outputs. If the module's data validity is 'faulty', those outputs which are not written stay valid. This is correct because their calculation was not affected by the faulty data yet. And if an output which has faulty data is not updated when the data validity goes back to 'ok' it also stays 'faulty', which is correct as well.
- \anchor dataValidity_comment_2_4_2 \ref dataValidity_2_4_2 "2.4.2" The data validities of the different poll type inputs are not correlated, so they shall not be propagated to the TriggerFanOuts data fault counter. The version numbers of poll-type inputs are not propagated anyway, so there is nothing for the decorator to do.
- \anchor dataValidity_comment_2_4_4 \ref dataValidity_2_4_4 "2.4.4" A void variable can be invalid if the sending device fails, hence there is no new data, and then a heartbeat times out and raises an exception. This will result in a void variable with data validity `faulty` because it does not originate form a recevied message.
3. Implementation details
-------------------------
- 3.1 The decorators which manipulate the data fault counter are responsible for counting up and down in pairs, such that the counter goes back to 0 if all data is ok, and never becomes negative.
4. Circular dependencies
------------------------
If modules have circular dependencies, the algorithm described in \ref dataValidity_1 "section 1" leads to a self-excited loop: Once the DataValidity::invalid flag has made a full circle, there is always at least on input with invalid data in each module and you can't get rid of it any more. To break this circle, the following additional behaviour is implemented:
### 4.1 General behaviour
- 4.1.1 Inputs which are part of a circular dependency are marked as _circular input_. Inputs which are coming from other applicatiation modules
which are not part of the circle, from the control system module or from device modules are considered _external inputs_.
- \anchor dataValidity_4_1_2 4.1.2 All modules which have a circular dependency form a _circular network_.
- 4.1.2.1 Also entangled circles of different variables which intersect in some of the modules are part of the same circular network.
- 4.1.2.2 There can be multiple disconnected circular networks in an application.
- 4.1.3 Circular inputs and circular networks are identified at application start after the variable networks are established.
- \anchor dataValidity_4_1_4 4.1.4 As long as at least one _external input_ of any module in the _circular network_ is invalid, the invalidity flag is propagated as described in
\ref dataValidity_1 "1."
- \anchor dataValidity_4_1_5 4.1.5 Once all _external inputs_ of one _circular network_ are back to DataValidity::ok, all _circular inputs_ of the _circular network_ ignore the
invalid flag and also switch to DataValidity::ok. This breaks the circle.
- \anchor dataValidity_4_1_6 4.1.6 If all inputs of a module have DataValidity::ok, the module's output validity is also DataValidity::ok, even if other modules in
the _circular network_ have _external inputs_ which are invalid.
- The ControlSystemModule and DeviceModules are never part of a circular dependency.
### 4.2 Side effects and race conditions
- \anchor dataValidity_4_2_1 4.2.1 If a process variable has a fluctuating error state (e.g. alternating validity `ok` and `faulty`) and is the only external input, the faulty flag
is ignored in the _circular inputs_ of all modules each time the last received value of that particular variable has DataValidity::ok. (The data validity is set to `ok` "too early")
- 4.2.2 If it is strictly required that a DataValidity::faulty flag is always transported because critical decisions depend on it, circular dependencies must be avoided in the application.
- 4.2.3 Lets assume two _external inputs_ in two modules are invalid. These inputs are in different circles, but the circles are inirectly connected through a third circle, which all together form one circular. If now one of those inputs goes back to DataValidity::ok, the parcicular circle does not have an _external input_ which is `faulty`, however the module does not go back to DataValidity::ok because the _circular network_ still has a faulty _external input_. This might be considered as resolving the invalidity "too late", but it is consistent because data might have propagated from that part of the _circular network_.
### 4.3 Technical implementation
- \anchor dataValidity_4_3_1 4.3.1 In addition to the owner (see \ref dataValidity_2_3_1 "2.3.1"), the _circular network_ (\ref dataValidity_4_1_2 "4.1.2") also get an invalidity counter.(\ref dataValidity_comment_4_3_1 "*").
- 4.3.2 Each module and each _circular input_ knows its _circular network_.
- 4.3.3 If an _external input_ receives data, it increases/decreases the _circular network_'s invalidity counter, together with the owner's invaliditys counter.
- 4.3.4 If a module estimates its validity (as used in \ref dataValidity_2_1_5 "2.1.5"), it returns
- `DataValidity::ok` if the module's internal invalidity counter is 0 (\ref dataValidity_4_1_6 "4.1.6")
- `DataValidity::ok` if the module's internal invalidity counter is not 0 and the _circular network_'s invalidity counter is 0 (\ref dataValidity_4_1_5 "4.1.5")
- `DataValidity::faulty` if both counters are not 0 (\ref dataValidity_4_1_4 "4.1.4")
#### Comments
- \anchor dataValidity_comment_4_3_1 \ref dataValidity_4_3_1 "4.3.1" This counter has to be atomic because it is accessed from different module threads.
*/
} // end of namespace ChimeraTK