Commit c9a8be2c authored by Lubomir Dolezal

docs finalize ingestion part

parent bd5d090c
2 merge requests: !36 Staging to master to prepare 1.0.0 release, !34 Shib auth
......@@ -277,6 +277,8 @@ source/target
Here, the source file storage and the target file storage are configured.
This can either be a local directory or an OpenStack Swift object storage.
If Swift is used as the source, the download container can be left unset. In that case,
the container is inferred from the given path in the format ``<bucket>/<object-name>``.
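The inference can be pictured with a minimal sketch, assuming the container is simply the first segment of a ``<bucket>/<object-name>`` path. The helper names ``container_from_path`` and ``object_from_path`` and the sample path are hypothetical and not part of the VS code base:

.. code-block:: bash

    # Hypothetical helper: print the container (first path segment).
    container_from_path() {
        printf '%s\n' "${1%%/*}"
    }

    # Hypothetical helper: print the object name (everything after the first slash).
    object_from_path() {
        printf '%s\n' "${1#*/}"
    }

    container_from_path "data25/OA/PL00/product.tar"   # prints data25
    object_from_path "data25/OA/PL00/product.tar"      # prints OA/PL00/product.tar

Under this assumption, a path with a leading slash would yield an empty container name.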
workdir
......@@ -531,5 +533,7 @@ An example of creating ``BASIC_AUTH_USERS_AUTH`` secret:
htpasswd -nb user2 YyuN9bYRvBUUU6COx7itWw5qyyARus >> auth_list.txt
docker secret create BASIC_AUTH_USERS_AUTH auth_list.txt
For configuration of the ``shibauth`` service, please consult a separate chapter :ref:`access`.
The next section :ref:`management` describes how an operator interacts with a
deployed VS stack.
......@@ -82,7 +82,7 @@ For a more concrete example the following command executes a
.. code-block:: bash
redis-cli lpush preprocess_queue "/data25/OA/PL00/1.0/00/urn:eop:DOVE:MULTISPECTRAL_4m:20180811_081455_1054_3be7/0001/PL00_DOV_MS_L3A_20180811T081455_20180811T081455_TOU_1234_3be7.DIMA.tar"
redis-cli lpush preprocess_queue "data25/OA/PL00/1.0/00/urn:eop:DOVE:MULTISPECTRAL_4m:20180811_081455_1054_3be7/0001/PL00_DOV_MS_L3A_20180811T081455_20180811T081455_TOU_1234_3be7.DIMA.tar"
Usually, with a preprocessor service running and no other items in the
``preprocess_queue`` this value will be immediately popped from the list and
......@@ -92,7 +92,7 @@ of the ``preprocess_queue``:
.. code-block:: bash
$ redis-cli lrange preprocess_queue 0 -1
/data25/OA/PL00/1.0/00/urn:eop:DOVE:MULTISPECTRAL_4m:20180811_081455_1054_3be7/0001/PL00_DOV_MS_L3A_20180811T081455_20180811T081455_TOU_1234_3be7.DIMA.tar
data25/OA/PL00/1.0/00/urn:eop:DOVE:MULTISPECTRAL_4m:20180811_081455_1054_3be7/0001/PL00_DOV_MS_L3A_20180811T081455_20180811T081455_TOU_1234_3be7.DIMA.tar
Now that the product is being preprocessed, it should be visible in the
``preprocessing_set``. As the name indicates, this is using the ``Set``
......@@ -101,7 +101,7 @@ datatype, thus requiring the ``SMEMBERS`` subcommand to list:
.. code-block:: bash
$ redis-cli smembers preprocessing_set
/data25/OA/PL00/1.0/00/urn:eop:DOVE:MULTISPECTRAL_4m:20180811_081455_1054_3be7/0001/PL00_DOV_MS_L3A_20180811T081455_20180811T081455_TOU_1234_3be7.DIMA.tar
data25/OA/PL00/1.0/00/urn:eop:DOVE:MULTISPECTRAL_4m:20180811_081455_1054_3be7/0001/PL00_DOV_MS_L3A_20180811T081455_20180811T081455_TOU_1234_3be7.DIMA.tar
Once the preprocessing of the product is finished, the preprocessor will remove
the currently worked on path from the ``preprocessing_set`` and add it either
......@@ -130,6 +130,27 @@ added to the ``registering_set``, afterwards the path is placed to either the
sets can be inspected by the ``LRANGE`` or ``SMEMBERS`` subcommands
respectively.
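Because lists and sets need different subcommands, a small wrapper can pick the right one from the key's datatype. This is only a sketch: ``inspect_key`` and the ``REDIS_CLI`` override variable are hypothetical names, while ``TYPE``, ``LRANGE`` and ``SMEMBERS`` are standard Redis commands:

.. code-block:: bash

    # Hypothetical helper: print all members of a Redis key, choosing the
    # subcommand by datatype. REDIS_CLI is an assumed override hook that
    # defaults to the regular redis-cli binary.
    inspect_key() {
        key="$1"
        cli="${REDIS_CLI:-redis-cli}"
        case "$("$cli" type "$key")" in
            list) "$cli" lrange "$key" 0 -1 ;;
            set)  "$cli" smembers "$key" ;;
            *)    echo "$key: not a list or set (or key is missing)" >&2
                  return 1 ;;
        esac
    }

    # Usage, given a reachable Redis instance:
    #   inspect_key preprocess_queue
    #   inspect_key preprocessing_set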
Ingestor and sftp
~~~~~~~~~~~~~~~~~
Triggering preprocessing and registration by pushing to the Redis queues is convenient for single ingestion campaigns, but not optimal for continuous ingestion of new products from "live" sources.
The ``Ingestor`` service, optionally together with the ``sftp`` service, allows data ingestion to be initiated by external means.
``Ingestor`` can work in two modes:
- Default: Exposes a simple ``/`` endpoint and listens for ``POST`` requests containing ``data`` with either a Browse Report JSON or a string with the path to the product in the object storage that is to be ingested. It then parses this information and internally puts it into the configured Redis queue (preprocess or register).
- Alternative: Listens for newly added Browse Report files on a configured path on a file system via ``inotify``.
These Browse Report files need to follow an agreed XML schema to be handled correctly.
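For the default mode, a request could look roughly like the following sketch. The wrapper name ``ingest_path``, the ``INGESTOR_URL`` and ``CURL`` variables, and the address ``http://localhost:8000`` are assumptions for illustration; the actual host, port and expected content type depend on the deployment and are not confirmed by this document:

.. code-block:: bash

    # Hypothetical wrapper: POST a product path to the ingestor's / endpoint.
    # INGESTOR_URL is an assumed address; CURL is an assumed override hook.
    # curl's --data option sends the string as the request body and
    # implies the POST method.
    ingest_path() {
        "${CURL:-curl}" --fail --silent --show-error \
            --data "$1" \
            "${INGESTOR_URL:-http://localhost:8000}/"
    }

    # Usage, given a reachable ingestor:
    #   ingest_path "data25/OA/PL00/product.tar"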
The ``sftp`` service enables secure access to a configured folder via SFTP, and this folder can be mounted to other VS services. This way, ``Ingestor`` can listen for files newly created through the SFTP access.
If the alternative filedaemon mode is to be used, the ``INOTIFY_WATCH_DIR`` environment variable needs to be set, and the ``command`` of the ``ingestor`` service in ``docker-compose.<stack>.ops.yml`` needs to be set to ``python3 filedaemon.py``:
.. code-block:: yaml
ingestor:
environment:
REDIS_PREPROCESS_MD_QUEUE_KEY: "preprocess_queue" # to override md_queue (json) and instead use (string)
command:
["python3", "/filedaemon.py"]
Direct Data Management
----------------------
......@@ -288,9 +309,9 @@ Deregistration
Preprocessing vs registration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The preprocessing step aims to ensure that cloud optimized GeoTIFF (COG) files are created in order to significantly speed up the viewing of large amount of data in lower zooms. There are several cases, where such preprocessing is not necessary or wanted.
The preprocessing step aims to ensure that cloud optimized GeoTIFF (COG) files are created in order to significantly speed up the viewing of large amounts of data at lower zoom levels. There are several cases where such preprocessing is not necessary or wanted.
- If data are already in COGs and in favorable projection, which will be presented to the user for most of the times, direct registration should be used. This means, paths to individual products will be pushed directly to the register queues.
- If data are already in COGs and in a favorable projection, which will be presented to the user most of the time, direct registration should be used. This means that paths to individual products are pushed directly to the register queue.
- Also, in cases where the preprocessing step would take too much time, direct registration can be preferred: it allows access to the metadata and catalog functions at the cost of slower rendering times.
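The choice between the two routes can be sketched as a small helper that pushes a product path to one queue or the other. The function name ``queue_product``, the ``REDIS_CLI`` and ``REGISTER_QUEUE_KEY`` variables, and the key name ``register_queue`` are assumptions; only ``preprocess_queue`` appears in the examples of this document:

.. code-block:: bash

    # Hypothetical helper: push a product path to the preprocess or the
    # register queue. The register key name is an assumption and can be
    # overridden through REGISTER_QUEUE_KEY.
    queue_product() {
        case "$1" in
            preprocess) queue="preprocess_queue" ;;
            register)   queue="${REGISTER_QUEUE_KEY:-register_queue}" ;;
            *) echo "usage: queue_product preprocess|register <path>" >&2
               return 2 ;;
        esac
        "${REDIS_CLI:-redis-cli}" lpush "$queue" "$2"
    }

    # Usage, given a reachable Redis instance:
    #   queue_product register "data25/OA/PL00/product.tar"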
......
......@@ -62,8 +62,7 @@ shutting down of the stack and new deployment.
Inspecting reports
------------------
Once registered, a xml report containing wcs and wms getcapabilities of the registered product is generated and can be accessed by connecting to the `SFTP` image
via the sftp protocol.
Once a product is registered, an XML report containing the WCS and WMS GetCapabilities of the registered product is generated and can be accessed by connecting to the `SFTP` service via the sftp protocol.
To log into the logging folders through port 2222 on the hosting IP (e.g. localhost if you are running the dev stack), the following command can be used:
.. code-block:: bash
......