/**
\page exceptionHandlingDesign Exception Handling Design
\section gen_idea General Idea
Exceptions must be handled by ApplicationCore in such a way that the application developer does not have to care much about them.
In case of a ChimeraTK::runtime_error exception the Application must catch the exception and report it to the DeviceModule. The DeviceModule should handle this exception and block the device until the device can be opened again. As there could be many devices, only the faulty device must be blocked.
Even if a device is faulty it should not block the server from starting.
Once in error state, set the DataValidity flag for that module to faulty and propagate this to all of its output variables. After the exception is cleared and the operation returns without a data fault flag, set the DataValidity flag to ok. Furthermore, the device must be reinitialised automatically and the values of the process variables must be recovered, as the device might have rebooted and the variables might have been reset.
<b>Genesis</b>
- When DeviceModule is created it is registered with Application. (Added to a list in Application::registerDeviceModule.)
- An initialisation handler can be added to the device through the constructor. Initialisation handlers are callback functions which will be executed after a device recovers from an exception.
- A list of TransferElement shared pointers is created as writeAfterOpen, which is used to write constants after the device is opened.
- A list of TransferElement shared pointers is created as writeRecoveryOpen, which is populated in the function addRecoveryAccessor in the DeviceModule.
- ChimeraTK::NDRegisterAccessor is used to access the device variables inside class Application.
- Class ExceptionHandlingDecorator decorates ChimeraTK::NDRegisterAccessor to handle exceptions.
- A recovery accessor is added for each writeable register when the ChimeraTK::NDRegisterAccessor is obtained. These recovery accessors are used to restore the values of the variables after the recovery.
- setOwner() is used to set the application module or variable group as owner of the (feeding) device accessor which is decorated with an ExceptionHandlingDecorator.
<b>The Flow</b>
- The Application has started but the device is not opened yet.
  - All writes will be delayed until the device is opened.
  - Constants, too, will be written only after the device is opened.
- The device is opened for the first time inside DeviceModule::handleException().
  - If there is no exception:
    - deviceError.status is set to 0.
    - The device is initialised by iterating over the initialisationHandlers list.
    - Constant feeders are written to the device using writeAfterOpen().
- When a read or write operation on the device (ExceptionHandlingDecorator<UserType>::genericTransfer around a ChimeraTK::NDRegisterAccessor) causes a ChimeraTK::runtime_error exception, the exception is caught.
  - Inside ExceptionHandlingDecorator:
    - The dataValidity of the DeviceModule is set to faulty using setOwnerValidityFunction(DataValidity::faulty).
    - incrementDataFaultCounter is set to true.
    - The error is reported to the DeviceModule with the exception message as DeviceModule::reportException(e.what()).
    - incrementDataFaultCounter is picked up by MetaDataPropagatingDecorator and all the outputs are set to faulty.
  - In DeviceModule::reportException():
    - The error is pushed into an error queue and deviceError.status is set to 1.
    - The device is blocked until the error state is resolved, i.e. until the device can be opened again.
  - The exception is handled by DeviceModule::handleException() in a separate thread.
    - It keeps trying to open the device until it succeeds.
    - Once the device is opened:
      - deviceError.status is set to 0.
      - The device is reinitialised through the initialisationHandlers.
      - The process variables are written again through writeRecoveryOpen().
      - The device thread is notified and is no longer blocked.
-<b>Add an exception handling and reporting mechanism to the device module (DeviceModule).</b>
Description.
Add two error state variables.
- "status" (flag indicating whether an error occurred)
- "message" (string with the error message)
These variables are automatically connected to the control system in this format
- /Devices/{AliasName}/status
Add a thread-safe function reportException().
A user/application can report an exception by calling reportException() of the DeviceModule with an exception string. reportException() puts the exception into a queue and then blocks the calling thread. This queue is processed by the internal function handleException(), which updates the DeviceError variables (status=1 and message="YourExceptionString") and tries to open the device. Once the device can be opened, the DeviceError variables are updated (status=0 and message="") and the blocked threads are notified to continue. Note that the operation which led to the exception, e.g. a read or write, has to be repeated after the exception is handled.
Implementation.
- DeviceModule
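
The queue-and-block behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the DeviceModule code: the class name DeviceErrorHandler, the tryOpen callback and the accessors status() and message() are hypothetical, and the retry interval is arbitrary.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the error handling part of DeviceModule.
class DeviceErrorHandler {
 public:
  // tryOpen is an assumed callback which returns true once the device could be opened.
  explicit DeviceErrorHandler(std::function<bool()> tryOpen) : _tryOpen(std::move(tryOpen)) {
    assert(_tryOpen);
    _thread = std::thread([this] { handleException(); });
  }

  ~DeviceErrorHandler() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _shutdown = true;
    }
    _errorReported.notify_all();
    _recovered.notify_all();
    _thread.join();
  }

  // Thread safe: queue the error, then block the caller until the device is open again.
  void reportException(const std::string& message) {
    std::unique_lock<std::mutex> lock(_mutex);
    _errorQueue.push(message);
    _errorReported.notify_one();
    _recovered.wait(lock, [this] { return (_errorQueue.empty() && _status == 0) || _shutdown; });
  }

  int status() {
    std::lock_guard<std::mutex> lock(_mutex);
    return _status;
  }

  std::string message() {
    std::lock_guard<std::mutex> lock(_mutex);
    return _message;
  }

 private:
  // Runs in its own thread: publish the error, retry opening, then release the reporters.
  void handleException() {
    std::unique_lock<std::mutex> lock(_mutex);
    while(true) {
      _errorReported.wait(lock, [this] { return !_errorQueue.empty() || _shutdown; });
      if(_shutdown) return;
      _message = _errorQueue.front();
      _errorQueue.pop();
      _status = 1;
      lock.unlock(); // do not hold the lock while talking to the device
      bool opened = false;
      while(!_shutdown && !(opened = _tryOpen())) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
      lock.lock();
      if(!opened) return; // shutdown was requested before the device came back
      _status = 0;
      _message.clear();
      _recovered.notify_all();
    }
  }

  std::function<bool()> _tryOpen;
  std::mutex _mutex;
  std::condition_variable _errorReported;
  std::condition_variable _recovered;
  std::queue<std::string> _errorQueue;
  int _status{0};
  std::string _message;
  std::atomic<bool> _shutdown{false};
  std::thread _thread;
};
```

A caller would invoke reportException() from the thread in which the transfer failed and repeat the failed read or write after the call returns.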
-<b>Catch ChimeraTK::runtime_error exceptions.</b>
Description.
Catch all the ChimeraTK::runtime_error exceptions that could be thrown in read and write operations and feed the error state into the DeviceModule through the function DeviceModule::reportException(). The NDRegisterAccessors coming from the device should be used as a single central point to catch these exceptions.
Retry the failed operation after reportException() returns.
Implementation.
This is done by placing an ExceptionHandlingDecorator around all NDRegisterAccessors coming from a device.
- NDRegisterAccessors
- Application
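
The catch, report and retry cycle can be illustrated with a small helper. This is a sketch under assumptions rather than the ExceptionHandlingDecorator itself: transferWithRecovery is a hypothetical name, std::runtime_error stands in for ChimeraTK::runtime_error, and the reportException callback is assumed to return only after the device has recovered.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper, not the real ExceptionHandlingDecorator.
// transfer performs the read or write on the device.
// reportException stands in for DeviceModule::reportException() and is
// assumed to block until the device has been opened again.
inline void transferWithRecovery(const std::function<void()>& transfer,
    const std::function<void(const std::string&)>& reportException) {
  assert(transfer && reportException);
  while(true) {
    try {
      transfer();
      return; // success, nothing to report
    }
    catch(const std::runtime_error& e) {
      // std::runtime_error is used here in place of ChimeraTK::runtime_error
      reportException(e.what());
      // the device is assumed to be usable again, so the loop repeats the transfer
    }
  }
}
```

Keeping the retry loop in one place means that application code calling read() or write() never sees the exception.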
-<b>Faulty device should not block any other device.</b>
Description.
Each TriggerFanOut deals with several variable networks at the same time, which are triggered by the same trigger. Each variable network has its own feeder and one or more consumers. You do not need to change anything about the variable networks.
On the other hand, the trigger itself is a variable network, too. The TriggerFanOut has a consumer of this trigger network. This is the accessor on which the blocking read() is called in the loop. You will need to create additional consumers in the trigger network, one for each TriggerFanOut.
Implementation.
- Application (Application::typedMakeConnection)
-<b>The Server must always start even if a device is in error state.</b>
Description.
To make sure that the server always starts, the initial opening of the device should take place in the DeviceModule itself, inside the exception handling loop, so that the device can go into the error state right at the beginning and the server can start even if not all of its devices are available.
Implementation.
- DeviceModule ( DeviceModule::handleException() ).
-<b>Set/clear fault flag of module in case of exception.</b>
Background.
The DataValidity flag of a module is set to faulty if any input variable returns with a set data fault flag after a read operation, and it is cleared once none of the inputs has the data fault flag set any longer. In a write operation, the module's data fault flag status is attached to the variable to write.
More detail ...(Martin's doc)
Description.
In case of a ChimeraTK::runtime_error exception this DataValidity flag should also be set to faulty and propagated to all outputs of the module. When the operation completes after clearing the exception state, the flag should be cleared as well.
Implementation.
- ExceptionHandlingDecorator
- TriggerFanOut
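
The set/clear rule from the background paragraph can be modelled with a counter of faulty inputs. The following is a minimal sketch under assumptions: ModuleValidity and InputValidity are hypothetical names, and the real bookkeeping in ApplicationCore may differ in detail.

```cpp
#include <cassert>
#include <cstddef>

enum class DataValidity { ok, faulty };

// Hypothetical stand-in for the fault bookkeeping of a module.
class ModuleValidity {
 public:
  void incrementDataFaultCounter() { ++_dataFaultCounter; }

  void decrementDataFaultCounter() {
    assert(_dataFaultCounter > 0);
    --_dataFaultCounter;
  }

  // Validity which would be attached to every output written by the module.
  DataValidity getDataValidity() const {
    return _dataFaultCounter == 0 ? DataValidity::ok : DataValidity::faulty;
  }

 private:
  std::size_t _dataFaultCounter{0};
};

// Hypothetical stand-in for one input: only changes of its validity reach the owner.
class InputValidity {
 public:
  explicit InputValidity(ModuleValidity& owner) : _owner(owner) {}

  // Call after each read (or after a caught exception) with the validity seen.
  void update(DataValidity current) {
    if(current == _last) return;
    if(current == DataValidity::faulty) {
      _owner.incrementDataFaultCounter();
    }
    else {
      _owner.decrementDataFaultCounter();
    }
    _last = current;
  }

 private:
  ModuleValidity& _owner;
  DataValidity _last{DataValidity::ok};
};
```

With a counter instead of a single flag, the module stays faulty until every input that reported a fault has delivered valid data again.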
Additional note from code author.
Note that if the data is distributed through a triggered FanOut (i.e. variables from the device are connected to other variables through a trigger, the usual way for poll-type variables), the data read from the receiving end of the variable cannot be considered valid if the DataValidity is faulty.
Additionally, a change to a faulty validity state will signal the availability of new data on those variables, which is to be considered invalid.
Bahnhof. Variables which are constants or outputs of the ConfigReader and are connected to a DeviceModule should be written in an initialisation handler. Currently they are written in ConfigReader::prepare() etc., which might block the application initialisation if an exception occurs while writing these variables.
-<b>Initialise the device after recovery.</b>
Description.
If a device is recovered after an exception, it might need to be reinitialised (e.g. because it was power cycled). The device should be automatically reinitialised after recovery.
Implementation.
A list of std::function objects (initialisationHandlers) is added to the DeviceModule. Initialisation handlers can be added through the constructor and through the addInitialisationHandler() function. When the device recovers, all the initialisationHandlers in the list are executed.
- DeviceModule
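
A handler list of this kind could look roughly like the sketch below. The class name InitialisationHandlers, the handler signature void() and the function runAfterRecovery() are assumptions made for the illustration; only the name addInitialisationHandler() is taken from the text above.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the handler list kept by DeviceModule.
class InitialisationHandlers {
 public:
  // The signature void() is an assumption made for this sketch.
  using Handler = std::function<void()>;

  // Optionally register a first handler at construction time.
  explicit InitialisationHandlers(Handler first = nullptr) {
    if(first) _handlers.push_back(std::move(first));
  }

  void addInitialisationHandler(Handler handler) {
    assert(handler);
    _handlers.push_back(std::move(handler));
  }

  // Executed each time the device has been opened: handlers run in registration order.
  void runAfterRecovery() const {
    for(const auto& handler : _handlers) handler();
  }

 private:
  std::vector<Handler> _handlers;
};
```

Running the handlers in registration order lets a later handler rely on settings applied by an earlier one.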
-<b>Recover process variables after exception.</b>
Background.
After a device has failed and recovered, it might have re-booted and lost the values of the process variables that live in the server and are written to the device. Hence these values have to be re-written after the device has recovered.
Description.
Technically the issue is that the original value that has been written is not safely accessible when recovering. Inside the accessor the user buffer must not be touched because the recovery is taking place in a different thread. In addition we don't know where the data is (it might or might not have been swapped away, depending on whether write() or writeDestructively() has been called by the user).
The only race-condition-free way is to create a copy when writing the data to the device, so that it is available when recovering.
Implementation.
- DeviceModule
- ExceptionHandlingDecorator
A list of TransferElement shared pointers is created as writeRecoveryOpen, which is populated in the function addRecoveryAccessor in the DeviceModule.
ExceptionHandlingDecorator is extended by adding a second accessor to the same register as the target accessor it is decorating, and the data is copied in doPreWrite().
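
The copy-on-write idea can be sketched as follows. This is an illustration under assumptions, not the ExceptionHandlingDecorator code: RecoveringWriter, the writeToDevice callback and writeRecoveryValue() are hypothetical names, and a mutex-protected vector stands in for the second accessor.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a decorated writeable accessor.
template<typename UserType>
class RecoveringWriter {
 public:
  using Buffer = std::vector<UserType>;

  // writeToDevice is an assumed callback performing the actual transfer.
  explicit RecoveringWriter(std::function<void(const Buffer&)> writeToDevice)
  : _writeToDevice(std::move(writeToDevice)) {
    assert(_writeToDevice);
  }

  // The user buffer keeps its content.
  void write(const Buffer& userBuffer) {
    doPreWrite(userBuffer);
    _writeToDevice(userBuffer);
  }

  // The user buffer is swapped away and is empty afterwards.
  void writeDestructively(Buffer& userBuffer) {
    doPreWrite(userBuffer);
    Buffer transferBuffer;
    transferBuffer.swap(userBuffer);
    _writeToDevice(transferBuffer);
  }

  // Called from the recovery thread: never touches the user buffer.
  void writeRecoveryValue() {
    std::lock_guard<std::mutex> lock(_recoveryMutex);
    if(_hasRecoveryValue) _writeToDevice(_recoveryBuffer);
  }

 private:
  // Copy the data before the transfer so it survives a swap of the user buffer.
  void doPreWrite(const Buffer& userBuffer) {
    std::lock_guard<std::mutex> lock(_recoveryMutex);
    _recoveryBuffer = userBuffer;
    _hasRecoveryValue = true;
  }

  std::function<void(const Buffer&)> _writeToDevice;
  std::mutex _recoveryMutex;
  Buffer _recoveryBuffer;
  bool _hasRecoveryValue{false};
};
```

The extra copy costs memory and time on every write, but it is what allows the recovery thread to restore the last written value regardless of what the user did with the original buffer.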
*/