[wip] exception handling spec: fix behavior of asynchronous read etc.

Commit d555e3e8 authored 4 years ago by Martin Christoph Hierholzer
--- a/doc/spec_exceptionHandling.md
+++ b/doc/spec_exceptionHandling.md
@@ -49,12 +49,9 @@ When the device is functional, it be (re)initialised by using application-define
  - 2.2 Read operations will propagate the DataValidity::faulty flag to the owning module / fan out (without changing the actual value):
    - 2.2.1 The normal module algorithm code will be continued, to allow this flag to propagate to the outputs in the same way as if it had been received through the process variable itself (c.f. 1.2).
    - 2.2.2 The DataValidity::faulty flag resulting from the fault state is propagated once, even if the variable had a DataValidity::faulty flag already set previously for another reason.
-    - 2.2.3 Blocking read operations will be skipped, if the fault flag has not yet been read once by the same accessor. If the fault flag had already been read previously by the same accessor, the operation is frozen (regardless of the type of the first read). When the frozen operation is finally executed, another exception might be thrown, in which case the previously frozen operation is finally skipped.
-    - 2.2.4 Non-blocking read operations (incl. readLatest) will be skipped. The return value will be false (no new data), if the fault flag has been read once already by the same accessor and hence is already propagated (regardless of the type of the first read), true otherwise.
-    - 2.2.5 Asynchronous read operations behave analogous to 2.2.3:
-      - A TransferFuture, which was valid while the exception was received, is fulfilled immediately when the exception is received, the DataValidity::faulty is propagated to the owning module and the value is left unchanged (i.e. the underlying operation is effectively skipped).
-      - The TransferFuture of an asynchronous read operation that is started only after the exception was received will be fulfilled immediately (i.e. the underlying operation is effectively skipped), if no other read operation of (regardless of the type) of the same accessor has read the fault flag once already. Otherwise it will be fulfilled only after the device is recovered (i.e. the underlying operation is effectively frozen).
-    - 2.2.6 If the fault state had been resolved in between two read operations (regardless of the type) and the device had become faulty again before the second read is executed, it is not defined whether the second operation will frozen/delayed/skipped (depending on the type) or not. The second operation might behave either like it is a new exception or like the same fault state would still prevail. (*)
+    - 2.2.3 readLatest() (including any read operation without AccessMode::wait_for_new_data) will be skipped. The return value will be false (no new data), if the fault flag has been read once already by the same accessor and hence is already propagated (regardless of the type of the first read), true otherwise.
+    - 2.2.4 Read operations with AccessMode::wait_for_new_data (read(), readNonBlocking() and readAsync()) will be skipped, if the DataValidity::faulty flag has not yet been propagated by the same accessor (which counts as new data, i.e. readNonBlocking() will return true). Otherwise, they will behave as if there is no new data: blocking operations will be frozen, non-blocking operations will be skipped. When the frozen operation is finally executed, another exception might be thrown, in which case the previously frozen operation is finally skipped.
+    - 2.2.5 If the fault state had been resolved in between two read operations (regardless of the type) and the device had become faulty again before the second read is executed, it is not defined whether the second operation will be frozen/delayed/skipped (depending on the type) or not. The second operation might behave either like it is a new exception or like the same fault state would still prevail. (*)
  - 2.3 Write operations will be delayed. In case of a fault state (new or persisting), the actual write operation will take place asynchronously when the device is recovering. The same mechanism as used for 3.1.2 is used here, hence the order of write operations is guaranteed across accessors, but only the latest written value of each accessor prevails. (*)
    - 2.3.1 The return value of write() indicates whether data was lost in the transfer. If the write has to be delayed due to an exception, the return value will be true, if a previously delayed and not-yet-written value is discarded in the process, false otherwise.
    - 2.3.2 When the delayed value is finally written to the device during the recovery procedure, it is guaranteed that no data loss happens (writes with data loss will be retried).
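
As an illustration of 2.2.3 and 2.2.4 (not part of the commit), the per-accessor outcome of a read operation during a fault state can be sketched as a small decision function. This is a minimal model under stated assumptions: `ReadType`, `Outcome`, `AccessorState` and `readDuringFault` are hypothetical names, not ApplicationCore API, and readAsync() is left out because the text does not say whether it counts as blocking in this context.

```cpp
#include <cassert>

// Hypothetical model types; none of these names exist in ApplicationCore.
enum class ReadType { read, readNonBlocking, readLatest };
enum class Outcome { skippedNewData, skippedNoNewData, frozen };

struct AccessorState {
  bool waitForNewData;   // stands for AccessMode::wait_for_new_data being set
  bool faultyPropagated; // DataValidity::faulty already propagated by this accessor
};

// Outcome of one read operation issued while the device is in a fault state.
Outcome readDuringFault(AccessorState& acc, ReadType type) {
  if (!acc.faultyPropagated) {
    // First read after the fault: skipped, the faulty flag counts as new data.
    acc.faultyPropagated = true;
    return Outcome::skippedNewData;
  }
  // Flag already propagated: only a blocking read with wait_for_new_data freezes (2.2.4).
  if (acc.waitForNewData && type == ReadType::read) {
    return Outcome::frozen;
  }
  // readLatest(), readNonBlocking(), and any read without wait_for_new_data (2.2.3).
  return Outcome::skippedNoNewData;
}

// Usage: an accessor with wait_for_new_data sees the fault once, then read() freezes.
inline void usageExample() {
  AccessorState acc{true, false};
  assert(readDuringFault(acc, ReadType::readNonBlocking) == Outcome::skippedNewData);
  assert(readDuringFault(acc, ReadType::readNonBlocking) == Outcome::skippedNoNewData);
  assert(readDuringFault(acc, ReadType::read) == Outcome::frozen);
}
```

In this model the "skipped with new data" outcome corresponds to a return value of true for the non-blocking variants, and "skipped without new data" to false.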
@@ -77,7 +74,7 @@ When the device is functional, it be (re)initialised by using application-define

 - 1.1 In future, logic_errors may also be handled, so that configuration errors can be presented nicely to the control system. This may be important especially since logic_errors may also depend on the configuration of external components (devices). If e.g. a device is changed (e.g. the device is another control system application which has been modified), logic_errors may be thrown in the recovery phase, even though the device had been successfully initialised previously.

- 2.2.6 Not defining the behavior here avoids a conflict with 1.2 without requiring a complicated implementation which does not block in this case. Implementing this would not present any gain for the application. If there are many exceptions on the same device in a short period of time, the number of faulty data updates seen by the application modules will always depend on the speed the module is attempting to read data (unless we require every exception to be visible to every module, but this will have complex effects, too). It might break consistency of the number of updates sent through different paths in an application, but applications should anyway not rely on that and use a DataConsistencyGroup to synchronise instead. Hence, the implementation will block always if a blocking read sees a known exception
+- 2.2.5 Not defining the behavior here avoids a conflict with 1.2 without requiring a complicated implementation which does not block in this case. Implementing this would not present any gain for the application. If there are many exceptions on the same device in a short period of time, the number of faulty data updates seen by the application modules will always depend on the speed the module is attempting to read data (unless we require every exception to be visible to every module, but this will have complex effects, too). It might break consistency of the number of updates sent through different paths in an application, but applications should anyway not rely on that and use a DataConsistencyGroup to synchronise instead. Hence, the implementation will always block if a blocking read sees a known exception.

 - 2.3 / 3.1.3 If timing is important for write operations (e.g. must not write a sequence of registers too fast), or if multiple values need to be written to the same register in sequence, the application cannot fully rely on the framework's recovery procedure. The framework hence provides the process variable Devices/<alias>/deviceBecameFunctional for each device, which will be written each time the recovery procedure is completed (cf. 3.1.3). ApplicationModules which implement such a timed sequence need to receive this variable and restart the entire sequence after the recovery.
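
The delayed-write semantics of 2.3 and 2.3.1 that this comment builds on can be modelled roughly as below. This is a sketch only: `DelayedWrites`, `write` and `recover` are hypothetical names, and the assumption that the replay order follows the most recent write of each accessor is an interpretation of "the order of write operations is guaranteed across accessors", not something the spec states in this form.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of writes that are delayed while the device is faulty.
class DelayedWrites {
 public:
  // Returns true if a previously delayed, not-yet-written value of the same
  // accessor is discarded (data lost), false otherwise (cf. 2.3.1).
  bool write(const std::string& accessor, int value) {
    const bool dataLost = pending_.count(accessor) > 0;
    if (dataLost) {
      // Drop the old position so that only the latest write keeps its place.
      order_.erase(std::remove(order_.begin(), order_.end(), accessor), order_.end());
    }
    pending_[accessor] = value;
    order_.push_back(accessor);
    return dataLost;
  }

  // Called during recovery: yields the surviving values in write order.
  std::vector<std::pair<std::string, int>> recover() {
    std::vector<std::pair<std::string, int>> replay;
    for (const auto& name : order_) {
      replay.emplace_back(name, pending_.at(name));
    }
    pending_.clear();
    order_.clear();
    return replay;
  }

 private:
  std::map<std::string, int> pending_; // latest delayed value per accessor
  std::vector<std::string> order_;     // accessors in order of their latest write
};
```

Because only one value per accessor survives, a module that must write several values to the same register in sequence cannot rely on such a replay, which is why the text above asks it to restart the sequence on deviceBecameFunctional.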

@@ -128,7 +125,7 @@ A so-called ExceptionHandlingDecorator is placed around all device register acce
 - 1.4 In doPreRead() certain read operations are frozen in case of a fault state (see A.2.2):
  - 1.4.1 Obtain the recovery lock through DeviceModule::getRecoverySharedLock(), to prevent interference with an ongoing recovery procedure.
  - 1.4.2 Decide, whether freezing is done (don't freeze yet). Freezing is done if one of the following conditions is met:
-    - read type is blocking and AccessMode::wait_for_new_data is set, previousReadFailed == true, and DeviceModule::deviceHasError == true (cf. A.2.2.3), or
+    - read type is blocking and AccessMode::wait_for_new_data is set, previousReadFailed == true, and DeviceModule::deviceHasError == true (cf. A.2.2.4), or
    - no initial value has been read yet (getCurretVersion() == {nullptr}) and DeviceModule::deviceHasError == true (cf. A.4.2).
  - 1.4.3 Obtain the DeviceModule::errorLock. Only then release the recovery lock. (*)
  - 1.4.4 Wait on DeviceModule::errorIsReportedCondVar.
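
The freeze decision of 1.4.2 can be expressed as a single predicate, sketched here under assumptions: `ReadContext` and `shouldFreeze` are hypothetical names, the fields merely mirror the conditions named in the list, and `hasInitialValue` stands in for the version-number check mentioned there.

```cpp
#include <cassert>

// Hypothetical snapshot of the state inspected in doPreRead(); not framework API.
struct ReadContext {
  bool isBlocking;         // read type is blocking
  bool waitForNewData;     // AccessMode::wait_for_new_data is set
  bool previousReadFailed; // fault already seen by this accessor
  bool deviceHasError;     // stands for DeviceModule::deviceHasError
  bool hasInitialValue;    // false while no initial value has been read yet
};

// True if the read should be frozen until the device has recovered (cf. 1.4.2).
bool shouldFreeze(const ReadContext& c) {
  if (!c.deviceHasError) {
    return false; // both conditions require the device to be in a fault state
  }
  const bool knownFaultOnBlockingRead =
      c.isBlocking && c.waitForNewData && c.previousReadFailed; // cf. A.2.2.4
  const bool waitingForInitialValue = !c.hasInitialValue;       // cf. A.4.2
  return knownFaultOnBlockingRead || waitingForInitialValue;
}

// Usage: the first failed blocking read is not frozen, the following one is.
inline void usageExample() {
  ReadContext first{true, true, false, true, true};
  ReadContext second{true, true, true, true, true};
  assert(!shouldFreeze(first));
  assert(shouldFreeze(second));
}
```

Only the decision is modelled; the locking steps (1.4.1, 1.4.3) and the wait on the condition variable (1.4.4) are deliberately left out.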