Commit a6b9d8ce authored by Mikhail Karnevskiy

Improve documentation

parent b9c78de9
......@@ -547,9 +547,10 @@ def create_consumer(server_name,source_path,has_filesystem,beamtime_id,data_sour
"""
:param server_name: Server endpoint (hostname:port)
:type server_name: string
:param source_path: Path to the folder to read data from
:param source_path: Path to the folder to read data from. If has_filesystem is False, the path has to point either to the core or the beamline gpfs.
Use `auto` to automatically point to the core gpfs (see the example below).
:type source_path: string
:param has_filesystem: True if the source_path is accessible locally, otherwise will use file transfer service to get data
:param has_filesystem: True if the source_path is accessible for the client, otherwise the asapo file transfer service will be used to get the data
:type has_filesystem: bool
:param beamline: beamline name, can be "auto" if beamtime_id is given
:type beamline: string
......
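For illustration, a minimal sketch of creating such a consumer; the endpoint, beamtime id, data source and token below are placeholder values, and the positional signature follows the consumer API documented above:

```python
import asapo_consumer

# Client without direct gpfs access: has_filesystem=False and source_path="auto",
# so data is fetched via the asapo file transfer service.
consumer = asapo_consumer.create_consumer(
    "asapo-services.example.desy.de:8400",  # server endpoint (hostname:port), placeholder
    "auto",       # resolve the core gpfs path automatically
    False,        # source_path is not accessible locally
    "11012345",   # beamtime id, placeholder
    "detector",   # data source, placeholder
    "<token>",    # access token, placeholder
    3000)         # timeout in ms
```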
......@@ -12,6 +12,7 @@ FEATURES
* Consumer API: `get_stream_list` now has a flag `detailed`, which is `true` by default. If `false`, the information about the last timestamp, last message id and stream finish flag is not updated. This is done to speed up the request in case of a large number of streams (see the sketch after this list).
* Consumer API: New function `get_source_list` returns the list of all data sources for a given beamtime.
* Development tool: A new docker container that provides a standalone asapo service has been created. Now the asapo service can be launched with a single command and requires only a docker installation.
* The `query_messages` API is not used so far, but puts a significant constraint on the asapo architecture. Therefore, it may be removed in the future.
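A hedged sketch of the two new consumer calls, reusing the `consumer` from the example above; whether `detailed` is passed as a keyword argument and whether `get_source_list` takes no arguments are assumptions, not confirmed by this changelog:

```python
# Fast listing for beamtimes with many streams: skip updating last timestamp,
# last message id and the stream-finished flag (keyword name assumed).
streams = consumer.get_stream_list(detailed=False)

# List all data sources of the given beamtime (argument list assumed empty).
sources = consumer.get_source_list()
```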
IMPROVEMENTS
* Speed up the function `get_stream_list`. This is achieved by caching the list of streams in a MongoDB collection. The timestamp of the earliest message is now fixed (previously it was updated with each call of the function).
......@@ -25,5 +26,6 @@ VERSION COMPATIBILITY
INTERNAL
* The list of streams is stored in a MongoDB collection. This collection is populated when `get_stream_list` is called.
* Messages are stored in the MongoDB with internal auto-incremented `_id`. User-given `id` is stored in `message_id`, indexed, and used to retrieve the data by `id`.
* Messages are stored in the MongoDB with an internal auto-incremented `time_id`. The user-given `id` is stored in `_id`. Both keys are indexed and used to retrieve the data; the choice of the key depends on the client option (see the sketch after this list).
* Docker image `asapo-services-linux-build-env` can now run a fully-functional asapo service. This is used to launch integration tests during git-CI.
* Integration tests for the deploy stage have been added. They test python clients (current and old) against the current asapo-standalone, which runs as a git-ci service.
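To make the two-key scheme concrete, a hypothetical pymongo sketch of the layout described above; the database and collection names are made up, the real layout is internal to asapo:

```python
from pymongo import MongoClient

# Hypothetical database/collection names for one beamtime and data source.
messages = MongoClient()["beamtime_11012345"]["detector_default"]

# Retrieve by the user-given id (stored in `_id`) ...
msg = messages.find_one({"_id": 42})
# ... or by the internal auto-incremented insertion counter (`time_id`);
# which key is used depends on the client option.
msg = messages.find_one({"time_id": 42})
```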
......@@ -14,9 +14,9 @@ the workflow can be split into two more or less independent tasks - data ingesti
2i) HiDRA (or another user application) then uses the ASAPO Producer API to send messages (M1 and M2 in our case) in parallel to the ASAPO Receiver. TCP/IP or RDMA protocols are used to send the data most efficiently. The ASAPO Receiver receives the data into a memory cache
3i) - 4i) ASAPO saves data to a filesystem and adds a metadata record to a database
3i) - 4i) ASAPO saves the data to a filesystem and/or the receiver cache and adds a metadata record to a database
5i) A feedback is send to the producer client with success or error message (in case of error, some of the step above may not happen)
5i) A feedback is sent to the producer client with a success or error message (in case of an error, some of the steps above may not happen)
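A minimal producer-side sketch of steps 2i)-5i); the endpoint, beamtime id and file name are placeholders, and the callback receives the success-or-error feedback of step 5i):

```python
import asapo_producer

def callback(header, err):
    # 5i) feedback from the asapo service: err is None on success
    if err is not None:
        print("send failed:", err)

producer = asapo_producer.create_producer(
    "asapo-services.example.desy.de:8400",  # receiver endpoint, placeholder
    "processed",   # producer type
    "11012345",    # beamtime id, placeholder
    "auto",        # beamline resolved from the beamtime id
    "detector",    # data source, placeholder
    "<token>", 1, 60000)

# 2i) send one message; 3i)-4i) happen inside the asapo service
producer.send(1, "processed/scan_0001/m1.tif", b"payload", callback=callback)
producer.wait_requests_finished(2000)
```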
### Data retrieval (numbers with r on the diagram)
......
---
title: Data in ASAPO
---
### Data structure
All data that is produced, stored and consumed via ASAPO is structured on several levels.
#### Beamtime
......@@ -27,3 +30,9 @@ So, for the case without datasets (single detector) the data hierarchy is Beamtime → Data Source → Data Stream → Message:
And with datasets (multi-detector) the data hierarchy is Beamtime → Data Source → Data Stream → Dataset → Message in Dataset Substream:
![Docusaurus](/img/data-in-asapo-workflow2.png)
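Relating the hierarchy to the consumer API, a hedged sketch reusing the `consumer` from earlier; the exact signatures of `get_next` and `get_next_dataset` should be checked against the consumer reference:

```python
group_id = consumer.generate_group_id()

# Without datasets: one message at a time from a given stream
data, meta = consumer.get_next(group_id, meta_only=False, stream="scan_0001")

# With datasets (multi-detector): one dataset at a time, i.e. the set of
# messages sharing the same id, one per dataset substream
dataset = consumer.get_next_dataset(group_id, stream="scan_0001")
```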
### Data storage
Data transferred to the asapo service is stored in files in the cache buffer and/or (depending on the ingest mode) on disk. Asapo disk storage is bound to the DESY infrastructure. The mount point is located either at the beamline gpfs (so-called `raw` data type) or at the core gpfs (so-called `processed` data type). The exact path is defined inside the asapo authorization system.
Currently, asapo caches data in a circular buffer with a size of several hundred GB. This cache is not dedicated to a certain producer, but is global across the asapo service.
\ No newline at end of file
......@@ -8,12 +8,13 @@ Producer
:undoc-members:
:show-inheritance:
Injest modes:
Ingest modes:
-------------
.. data:: INGEST_MODE_TRANSFER_DATA
.. data:: INGEST_MODE_TRANSFER_METADATA_ONLY
.. data:: INGEST_MODE_STORE_IN_FILESYSTEM
.. data:: INGEST_MODE_STORE_IN_DATABASE
.. data:: CACHE_ONLY_INGEST_MODE = INGEST_MODE_TRANSFER_DATA | INGEST_MODE_STORE_IN_DATABASE
.. data:: DEFAULT_INGEST_MODE = INGEST_MODE_TRANSFER_DATA | INGEST_MODE_STORE_IN_FILESYSTEM | INGEST_MODE_STORE_IN_DATABASE
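Since the modes are bit flags, they can be combined with bitwise OR, as the two predefined combinations above show. A short sketch, reusing the `producer` from the earlier example:

```python
import asapo_producer

# Keep data in the receiver cache and database only, skip the filesystem;
# this equals the predefined CACHE_ONLY_INGEST_MODE.
mode = (asapo_producer.INGEST_MODE_TRANSFER_DATA
        | asapo_producer.INGEST_MODE_STORE_IN_DATABASE)
producer.send(2, "processed/scan_0001/m2.tif", b"payload", ingest_mode=mode)

# Register metadata only, e.g. for a file already written elsewhere;
# data may be None in this mode.
producer.send(3, "processed/scan_0001/m3.tif", None,
              ingest_mode=asapo_producer.INGEST_MODE_TRANSFER_METADATA_ONLY)
```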
......
......@@ -228,7 +228,9 @@ cdef class PyProducer:
"""
:param id: unique data id
:type id: int
:param exposed_path: Path which will be exposed to consumers
:param exposed_path: Relative path with respect to the asapo root folder. The path should start with `processed` if
the producer type is `processed`, or with `raw` if the producer type is `raw`. The asapo root folder is determined automatically
based on the beamline and beamtime. If the data is not written to the filesystem (cache-only ingest mode), the path is only used to expose the data to consumers (see the example below).
:type exposed_path: string
:param data: data to send
:type data: contiguous buffer like numpy or bytes array, can be None for INGEST_MODE_TRANSFER_METADATA_ONLY ingest mode
......
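To make the path convention concrete, a small sketch with made-up file names and ids, reusing the `producer` (type `processed`) and the ingest-mode constants from the earlier examples:

```python
# Producer of type 'processed': the exposed path must start with 'processed'.
producer.send(10, "processed/scan_0002/frame_0010.h5", b"payload")

# A producer of type 'raw' would use paths starting with 'raw' instead:
#   producer.send(10, "raw/scan_0002/frame_0010.h5", b"payload")

# Cache-only ingest mode: nothing is written to disk; the path is only the
# name under which consumers see the message.
producer.send(11, "processed/scan_0002/frame_0011.h5", b"payload",
              ingest_mode=asapo_producer.CACHE_ONLY_INGEST_MODE)
```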