VS issues
https://gitlab.eox.at/esa/prism/vs/-/issues
2021-11-21T16:29:00+01:00

https://gitlab.eox.at/esa/prism/vs/-/issues/152
Investigate multi project pipelines
2021-11-21T16:29:00+01:00
Nikola Jankovic

Could be useful for integration testing once repos are separated:
- https://docs.gitlab.com/ee/ci/pipelines/multi_project_pipelines.html
- https://about.gitlab.com/blog/2018/10/31/use-multiproject-pipelines-with-gitlab-cicd/

ViewServer 2.0
Nikola Jankovic

https://gitlab.eox.at/esa/prism/vs/-/issues/151
Branch v2.0 regressions
2021-11-21T16:36:08+01:00
Lubomir Doležal

- [x] Post handler in registrar does not work (reports)
- [ ] Registrar can only register products from the configured Swift object storage container, not from arbitrary ones, which would be part of the STAC item href
- [x] preprocessor cannot create a STAC item anew for already preprocessed products
- [x] preprocessor needs regex based asset name mapping of outputs, sorting is unreliable

ViewServer 2.0

https://gitlab.eox.at/esa/prism/vs/-/issues/144
Create common software library to map to STAC
2021-11-09T12:15:03+01:00
Fabian Schindler

The task of the library is to map from whatever input is provided to a (list of) STAC items.
Common use cases for major platforms should be covered by this library.
Examples:
- CREODIAS - OpenSearch - Sentinel-2:
  - interpret OpenSearch response (GeoJSON)
  - pick `productIdentifier` property
  - transform this to a valid S3 URL
  - use stactools + sentinel2 to look up that object storage prefix to create a STAC Item
- PRISM
  - read list of object storage prefixes
  - transform this to STAC items using the stored metadata there

ViewServer 2.0
Nikola Jankovic

https://gitlab.eox.at/esa/prism/vs/-/issues/141
STAC: Registrar
2021-11-09T15:15:44+01:00
Fabian Schindler

Find an approach of how STAC can be adopted as an input format and how the configuration must be adapted in order to support the change.

ViewServer 2.0
Nikola Jankovic

https://gitlab.eox.at/esa/prism/vs/-/issues/140
STAC: Preprocessor
2021-11-09T15:03:19+01:00
Fabian Schindler

Find an approach of how STAC can be adopted as an input/output format and how the configuration must be adapted in order to support the change.
- Preprocessor adds metadata to the actual content of STAC (image URL + metadata), reference to the GSC etc.

ViewServer 2.0
Lubomir Doležal

https://gitlab.eox.at/esa/prism/vs/-/issues/139
STAC: Ingestor
2021-11-21T16:22:10+01:00
Fabian Schindler

Find an approach to adopt STAC as an output format for the Ingestor.
How are the fields from the Browse Report XML mapped to the STAC item properties?
Example Browse Reports:
- Footprint browse: https://github.com/EOX-A/ngeo-b_autotest/blob/branch-4-1/data/reference_test_data/browseReport_ASA_IM__0P_20100722_213840.xml
- Regular Grid browse: https://github.com/EOX-A/ngeo-b_autotest/blob/branch-4-1/data/test_data/ASA_WSM_1PNDPA20050331_075939_000000552036_00035_16121_0775.xml
- Rectified browse: https://github.com/EOX-A/ngeo-b_autotest/blob/branch-4-1/data/test_data/S2.xml
- Model in GeoTIFF browse: https://github.com/EOX-A/ngeo-b_autotest/blob/branch-4-1/data/test_data/rotated_axes.xml

ViewServer 2.0
Mussab Abdalla

https://gitlab.eox.at/esa/prism/vs/-/issues/138
Investigate usage of Mapchete as a preprocessing tool
2021-11-22T16:06:15+01:00
Fabian Schindler

ViewServer 2.0

https://gitlab.eox.at/esa/prism/vs/-/issues/137
Splitting VS into component repositories
2021-11-22T16:14:23+01:00
Fabian Schindler

Should help with unit testing, documentation
Tasks:
- ~~Create gitlab group~~ -> `vs` group
- ~~Create repository for each specific component~~
- ~~Move source code~~
- ~~Scan for component specific issues and move them to the repo~~
- move deployment configurations to a separate repository -> https://gitlab.eox.at/esa/prism/configs

ViewServer 2.0

https://gitlab.eox.at/esa/prism/vs/-/issues/136
Adopt vsq
2021-11-22T16:44:31+01:00
Fabian Schindler

`vsq` (https://gitlab.eox.at/esa/prism/vsq) should be adopted as the main queue interface for all components as it abstracts common tasks like enqueuing/fetching, waiting for results, setting up of a daemon, etc.
- Ensure that a component can listen on and push to multiple configured queues rather than just one.
- Port the part of the new ngeo daemon that uses a separate intermediate key to avoid losing reports (a key is in at least one queue at any time). Beware that multiple registrars/preprocessors may pick up items: to avoid multiple instances working on the same item, the numeric ID of the container needs to be present in the intermediate queue.

ViewServer 2.0

https://gitlab.eox.at/esa/prism/vs/-/issues/128
Using common data exchange format between components
2021-11-22T17:45:36+01:00
Fabian Schindler

# Introduction
This is a collection of ideas and concepts with their respective advantages and drawbacks in their use.
The basic idea is to specify a common data exchange format encoding for most of the communications between the components. The intention is to better decouple the systems and allow for better composability of the available components and potentially future ones.
It is hereby proposed to use STAC Items as a data exchange format between the components. The STAC Items are transient, in the sense that they are only put into the queues and not stored on volumes/buckets. [`VSQ`](https://gitlab.eox.at/esa/prism/vsq) allows embedding the STAC Items into the JSON message structure it uses.
General advantages are:
- it is possible to encode the footprint of the product directly in the JSON (but it can be set to `null` if not immediately available)
- there are several Python libraries available to digest, create or transform items (e.g: [PySTAC](https://github.com/stac-utils/pystac) and [stac.py](https://github.com/brazil-data-cube/stac.py)) but they are optional, as it is sometimes easier to simply work with the raw Python objects.
- it combines the data/metadata assets with readily available metadata values.
- referenced assets are not required to be on the same storage, allowing more flexibility
- the transient nature eliminates the requirement to create sidecar files to store metadata from one component to the next (such as GSC files generated in the preprocessor for the registrar)
Disadvantages:
- some concepts are harder to represent with STAC, such as data directories (object storage prefixes)
- it is not automatically clear how to deal with missing metadata. e.g: the `geometry` could be `null`, but how would the components handle that?
- verbosity. As the whole STAC Item is put into the queues, it may not be handy anymore to directly inspect the queues without additional tools.
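To make the trade-offs above concrete, the following is a minimal sketch of a transient STAC Item travelling through a queue as JSON. The envelope keys (`type`, `item`), the item `id` and the asset href are hypothetical and are not taken from `VSQ`; only the STAC Item fields themselves follow the STAC specification. The `geometry` is set to `null` to show how a consumer could detect a missing footprint.

```python
import json

# A minimal STAC Item as a plain Python dict. The geometry is deliberately
# None (JSON null) to model the "footprint not yet available" case.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-product",  # hypothetical identifier
    "geometry": None,
    "properties": {"datetime": "2021-11-09T12:15:03Z"},
    "links": [],
    "assets": {
        # hypothetical object storage location
        "data": {"href": "s3://example-bucket/example-product/image.tif"},
    },
}


def encode_message(item):
    """Wrap the item in a (hypothetical) envelope and serialize it to JSON."""
    return json.dumps({"type": "stac-item", "item": item})


def decode_message(raw):
    """Parse a message; return the embedded item and whether it has a footprint."""
    message = json.loads(raw)
    decoded = message["item"]
    has_footprint = decoded.get("geometry") is not None
    return decoded, has_footprint


raw = encode_message(item)
decoded, has_footprint = decode_message(raw)
```

A component that needs the footprint could use the `has_footprint` flag to decide whether to skip, defer or enrich the item, which is one possible answer to the open question about missing metadata.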
## Components involved with registration/ingestion
This listing details the inputs and outputs of each component and assesses how the new format could be of use.
### *Ingestor*
- Input: Browse Report XML files
- Outputs: custom JSON format (basically translation of XML -> JSON) which currently only the preprocessor is able to handle properly
- Assessment: The custom JSON format could easily be replaced with the STAC Item format, which would standardize it, and allow for an easier integration with other components.
### *Preprocessor*
- Input: Object storage prefix or custom JSON format
- Output: Object storage prefix
- Assessment: Arguably, this component would benefit the most from a switch to STAC Items. Using the `assets`, it is easy to distinguish which assets are of interest. Also, metadata of the input STAC Item could simply be passed through, without the preprocessor being required to understand it. In essence, only the asset links would have to be replaced or enriched with the processed items.
### *Registrar*
- Input: Object storage prefix
- Output: none
- Assessment: The current approach is not very stable. Several "schemes" are tried in turn to check whether they can be applied to register the product. Unifying this to STAC Items would greatly reduce the number of code paths. Metadata from the STAC Item could easily be handed through and mapped to the internal metadata model. It could be interesting to allow forwarding the registered item to the next queue, so that the registrar is not necessarily the "dead end" of the whole ingestion queue (e.g. to start seeding the registered product).
### *Seeder*
- Input: ???
- Output: none
- Assessment: This component is currently not implemented in the new VS. In theory, it could retrieve seeding requests in the form of STAC Items to get the region and time of interest to seed.
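As an illustration of how a seeder could derive the region and time of interest from a STAC Item, here is a small sketch. The function names are hypothetical, and how an item without a footprint should be treated is an assumption (it is simply skipped here). The fields `bbox`, `datetime`, `start_datetime` and `end_datetime` are the ones defined by the STAC Item specification.

```python
from datetime import datetime


def parse_rfc3339(value):
    """Parse an RFC 3339 timestamp as used in STAC properties."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seeding_request(item):
    """Return (bbox, start, end) for a STAC Item, or None without a footprint."""
    bbox = item.get("bbox")
    if item.get("geometry") is None or bbox is None:
        return None  # assumption: nothing to seed without a region of interest

    props = item["properties"]
    if props.get("datetime") is not None:
        start = end = parse_rfc3339(props["datetime"])
    else:
        # STAC allows a null `datetime` if a start/end range is given instead.
        start = parse_rfc3339(props["start_datetime"])
        end = parse_rfc3339(props["end_datetime"])
    return tuple(bbox), start, end
```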
### *Harvester*
- Input: custom JSON or raw values
- Output: tbd
- Assessment: Currently there is no data format defined. STAC Items would be a "natural" fit, as STAC API is actually one of the intended backends. Some backends may be more tricky though, e.g. object storage listings are not easily translatable into STAC Items without actually reading metadata files at that location. Some OADS outputs (`.index` files, basically just CSV) could actually map quite nicely into STAC Items.
## Usage example
### Harvester -> Preprocessor -> Registrar -> Seeder
In this example scenario, the Harvester queries an external catalogue and either passes through the STAC Items or transforms them to that format. The items are written to the queue and the harvester is oblivious of which component is the next in the chain.
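The transformation step in the harvester could look roughly like the sketch below, which maps one GeoJSON feature of a catalogue response to a minimal STAC Item. The function name, the property names `productIdentifier` and `startDate`, the bucket name and the rule for building the S3 href are all assumptions for illustration; a real catalogue may need a different mapping. Note that a complete item with a non-null geometry also requires a `bbox`, which is only passed through here if the feature provides one.

```python
def feature_to_item(feature, bucket="example-bucket"):
    """Map one GeoJSON feature of a catalogue response to a minimal STAC Item."""
    props = feature["properties"]

    # Assumption: the product identifier is a path-like string that becomes an
    # object storage prefix when combined with a bucket name.
    prefix = props["productIdentifier"].strip("/")

    item = {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": feature["id"],
        "geometry": feature.get("geometry"),
        "properties": {"datetime": props.get("startDate")},
        "links": [],
        "assets": {"product": {"href": f"s3://{bucket}/{prefix}/"}},
    }
    if "bbox" in feature:
        item["bbox"] = feature["bbox"]
    return item
```

Because the result is a plain dictionary, it can be written to the queue as is, and the harvester does not need to know which component consumes it.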
The preprocessor has an immediate list of files (`assets`) to work with. There is usually no need to retrieve additional metadata, but if necessary a referenced metadata file can be opened to read it. It processes selected files from the assets, creates a copy of the input STAC Item and adds the preprocessed files as new assets. All other metadata is kept for other components to digest. This new STAC Item is sent to the next queue.
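The copy-and-extend step of the preprocessor can be sketched as follows. The function name, the asset keys and the `roles` value are hypothetical; the point is only that the input item is left untouched and all metadata the preprocessor does not understand is passed through unchanged.

```python
import copy


def add_preprocessed_assets(item, outputs):
    """Return a copy of `item` with `outputs` (asset key -> href) added as assets."""
    result = copy.deepcopy(item)  # never mutate the incoming item
    assets = result.setdefault("assets", {})
    for key, href in outputs.items():
        # Assumption: preprocessed files are plain data assets.
        assets[key] = {"href": href, "roles": ["data"]}
    return result
```

Keeping the original assets next to the new ones lets a later component choose between raw and preprocessed files based on the asset key alone.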
The registrar receives the STAC Item and, based on its contents and the configuration, starts the registration into its backends. If successful, the STAC Item is passed on to the next component without modification.
The seeder uses the stored spatiotemporal information in the STAC Item to start the seeding process.

ViewServer 2.0