Introduction
This repository holds the configuration of the PRISM View Server (PVS).
This README covers the architecture, conventions, relevant configuration, installation instructions, and canonical references.
Architecture
The PRISM View Server (PVS) uses various Docker images. The core, cache, client, ingestor, fluentd, and preprocessor images are built from this repository, while the others are pulled from Docker Hub.
Prerequisites
Object Storage (OBS)
Access keys to store preprocessed items and caches, used by all services.
Access key to read input items, used by the preprocessor.
Networks
One internal and one external network per stack.
Volumes
In base stack
- traefik-data
In logging stack
- logging_es-data
Per collection
- db-data used by database
- redis-data used by redis
- instance-data used by registrar and renderer
- report-data - sftp output of the reporting interface
- from-fepd - sftp input to ingestor
Services
The following services are defined via docker compose files.
reverse-proxy
- based on the external traefik image
- data stored in local volume on swarm master
- reads swarm changes from /var/run/docker.sock on swarm master
- provides the endpoint for external access
- configured via docker labels
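As a sketch of what the label-based configuration looks like (service name, router name, host rule, and port below are hypothetical placeholders, not the values used by the actual stacks), a service in a compose file can be exposed through traefik roughly like this:
  some-service:
    deploy:
      labels:
        # hypothetical traefik v2 labels for illustration only
        - "traefik.enable=true"
        - "traefik.http.routers.some-service.rule=Host(`pvs.example.com`)"
        - "traefik.http.services.some-service.loadbalancer.server.port=80"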
database
- based on external postgis:10 image
- DB stored in local volume on swarm master
- provides database to all other services
redis
- based on external redis image
- data stored in local volume on swarm master
- holds these keys
  - preprocessing
    - preprocess-md_queue
      - holds metadata in json including the object path for the image to be preprocessed
      - lpush by ingestor or manually
      - brpop by preprocessor
    - preprocess_queue
      - holds items (tar object path) to be preprocessed
      - lpush by ingestor or manually
      - brpop by preprocessor
    - preprocessing_set
      - holds ids of items currently being preprocessed
      - sadd by preprocessor
    - preprocess-success_set
      - holds ids of successfully preprocessed items
      - sadd by preprocessor
    - preprocess-failure_set
      - holds ids of items that failed preprocessing
      - sadd by preprocessor
  - registration
    - register_queue
      - holds items (metadata and data objects prefix - same as tar object path above) to be registered
      - lpush by preprocessor or manually
      - brpop by registrar
    - registering_set
      - holds ids of items currently being registered
      - sadd by registrar
    - register-success_set
      - holds ids of successfully registered items
      - sadd by registrar
    - register-failure_set
      - holds ids of items that failed registration
      - sadd by registrar
  - seeding
    - seed_queue
      - holds time intervals to pre-seed
      - lpush by registrar or manually
      - brpop by seeder
    - seed-success_set
    - seed-failure_set
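For manual pushes or debugging, these queues and sets can be inspected with redis-cli from inside the redis container; the object path below is only a placeholder:
redis-cli lpush preprocess_queue "<tar_object_path>"   # enqueue an item manually
redis-cli llen preprocess_queue                        # check how many items are queued
redis-cli smembers preprocess-failure_set              # list ids of failed items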
ingestor
- based on ingestor image
- by default a flask app listening on the / endpoint for POST requests with reports (see the example request below)
- can be overridden to act as an inotify watcher on a configured folder for newly appearing reports
- accepts browse reports with references to images on Swift
- extracts the browse metadata (id, time, footprint, image reference)
- lpush metadata into the preprocess-md_queue
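In the default flask mode, a browse report can thus be submitted with a plain HTTP POST; the host, port, and report file below are placeholders and depend on the deployment:
curl -X POST http://<ingestor-host>:<port>/ -H "Content-Type: application/xml" --data-binary @browse_report.xml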
TODO: seeder
- based on cache image
- connects to DB
- brpop time interval from seed_queue
- for each seed, time and extent are taken from the DB
- pre-seed using renderer
preprocessor
- based on preprocessor image (GDAL 3.1)
- connects to OBS
- brpop item from preprocess_queue or preprocess-md_queue
- sadd to preprocessing_set
- downloads image or package from OBS
- translates to COG (see the example command below)
- translates to GSC if needed
- uploads COG & GSC to OBS
- adds item (metadata and data object paths) to register_queue
- sadd to preprocess-{success|failure}_set
- srem from preprocessing_set
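The COG translation step corresponds conceptually to a GDAL command like the one below (GDAL 3.1 provides the COG driver); file names and creation options are illustrative and not necessarily those used by the preprocessor:
gdal_translate -of COG -co COMPRESS=DEFLATE -co BLOCKSIZE=512 input.tif output_cog.tif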
registrar
- based on core image
- connects to OBS & database
- uses instance-data volume
- brpop item from register_queue
- sadd ...
- register in DB
- (optional) store time:start/time:end in seed_queue
- sadd/srem ...
cache
- based on cache image
- connects to OBS & database
- provides external service for WMS & WMTS
- either serves WMTS/WMS requests from cache or retrieves on-demand from renderer to store in cache and serve
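For illustration, requests served by the cache look roughly like the following; the endpoint path and layer name are placeholders that depend on the collection configuration:
# WMS GetCapabilities
https://<cache-endpoint>?service=WMS&request=GetCapabilities
# WMS GetMap for a hypothetical layer (WMS 1.3.0 uses lat/lon axis order for EPSG:4326)
https://<cache-endpoint>?service=WMS&version=1.3.0&request=GetMap&layers=<layer>&bbox=-90,-180,90,180&crs=EPSG:4326&width=1024&height=512&format=image/png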
renderer
- based on core image
- connects to OBS & database
- provides external service for OpenSearch, WMS, & WCS
- renders WMS requests received from cache or seeder
logging stack
- uses elasticsearch:7.9 & kibana:7.9 external images
- fluentd image is built and published to the registry because of additional plugins
- ES data stored in local volume on swarm master
- external access allowed to kibana through traefik
- log parsing enabled for cache and core
sftp
- uses external atmoz/sftp image
- provides sftp access to two volumes for report exchange: registration result XMLs and ingest requirement XMLs
- accessible on swarm master on port 2222
- credentials supplied via config
Usage
Test locally using docker swarm
Initialize swarm & stack:
docker swarm init # initialize swarm
Build images:
docker build core/ --cache-from registry.gitlab.eox.at/esa/prism/vs/pvs_core -t registry.gitlab.eox.at/esa/prism/vs/pvs_core
docker build cache/ --cache-from registry.gitlab.eox.at/esa/prism/vs/pvs_cache -t registry.gitlab.eox.at/esa/prism/vs/pvs_cache
docker build preprocessor/ --cache-from registry.gitlab.eox.at/esa/prism/vs/pvs_preprocessor -t registry.gitlab.eox.at/esa/prism/vs/pvs_preprocessor
docker build client/ --cache-from registry.gitlab.eox.at/esa/prism/vs/pvs_client -t registry.gitlab.eox.at/esa/prism/vs/pvs_client
docker build fluentd/ --cache-from registry.gitlab.eox.at/esa/prism/vs/fluentd -t registry.gitlab.eox.at/esa/prism/vs/fluentd
docker build ingestor/ --cache-from registry.gitlab.eox.at/esa/prism/vs/pvs_ingestor -t registry.gitlab.eox.at/esa/prism/vs/pvs_ingestor
Or pull them from the registry:
docker login -u {DOCKER_USER} -p {DOCKER_PASSWORD} registry.gitlab.eox.at
docker pull registry.gitlab.eox.at/esa/prism/vs/pvs_core
docker pull registry.gitlab.eox.at/esa/prism/vs/pvs_cache
docker pull registry.gitlab.eox.at/esa/prism/vs/pvs_preprocessor
docker pull registry.gitlab.eox.at/esa/prism/vs/pvs_client
docker pull registry.gitlab.eox.at/esa/prism/vs/fluentd
docker pull registry.gitlab.eox.at/esa/prism/vs/pvs_ingestor
Create external network for stack to run:
docker network create -d overlay vhr18-extnet
docker network create -d overlay emg-extnet
docker network create -d overlay dem-extnet
Add the following .env files with credentials to the /env folder of the cloned repository: vhr18_db.env, vhr18_obs.env, vhr18_django.env.
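The concrete variables depend on the collection; as a rough, assumed sketch, an object storage .env file typically carries OpenStack Swift connection settings such as the following (names and values are placeholders, check the templates provided in the repository):
OS_AUTH_URL=https://auth.example.com/v3
OS_USERNAME=<username>
OS_TENANT_NAME=<project>
OS_REGION_NAME=<region>
# passwords (OS_PASSWORD, OS_PASSWORD_DOWNLOAD) are supplied as docker secrets below, not via .env files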
Create docker secrets:
Sensitive environment variables are not included in the .env files and must be created as docker secrets. All stacks currently share these secret names, therefore they must stay the same across stacks. To create the docker secrets run:
# replace the "<variable>" with the value of the secret
printf "<OS_PASSWORD_DOWNLOAD>" | docker secret create OS_PASSWORD_DOWNLOAD -
printf "<DJANGO_PASSWORD>" | docker secret create DJANGO_PASSWORD -
printf "<OS_PASSWORD>" | docker secret create OS_PASSWORD -
# for production base stack deployment, an additional basic authentication credentials list needs to be created
# the format of such a list used by traefik is username:hashedpassword (MD5, SHA1, BCrypt)
sudo apt-get install apache2-utils
htpasswd -n <username> >> auth_list.txt
docker secret create BASIC_AUTH_USERS_AUTH auth_list.txt
docker secret create BASIC_AUTH_USERS_APIAUTH auth_list_api.txt
Deploy the stack in dev environment:
docker stack deploy -c docker-compose.vhr18.yml -c docker-compose.vhr18.dev.yml -c docker-compose.logging.yml -c docker-compose.logging.dev.yml vhr18-pvs # start VHR_IMAGE_2018 stack in dev mode, for example to use local sources
docker stack deploy -c docker-compose.emg.yml -c docker-compose.emg.dev.yml -c docker-compose.logging.yml -c docker-compose.logging.dev.yml emg-pvs # start Emergency stack in dev mode, for example to use local sources
Deploy base stack in production environment:
docker stack deploy -c docker-compose.base.ops.yml base-pvs
First steps:
# To register first data, use the following command inside the registrar container:
UPLOAD_CONTAINER=<product_bucket_name> && python3 registrar.py --objects-prefix <product_object_storage_item_prefix>
# To see the catalog opensearch response in the attached web client, a browser CORS extension needs to be turned on.
Tear down stack including data:
docker stack rm vhr18-pvs # stop stack
docker volume rm vhr18-pvs_db-data # delete volumes
docker volume rm vhr18-pvs_redis-data
docker volume rm vhr18-pvs_traefik-data
docker volume rm vhr18-pvs_instance-data
Setup logging
To access the logs, navigate to http://localhost:5601 . Ignore all of the fancy enterprise capabilities and select Kibana > Discover in the hamburger menu.
On first run, you need to define an index pattern to select the data source for Kibana in Elasticsearch.
Since we only have fluentd, you can just use * as the index pattern.
Select @timestamp as the time field.
Example of a kibana query to discover logs of a single service:
https://<kibana-url>/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15m,to:now))&_a=(columns:!(path,size,code,log),filters:!(),index:<index-id>,interval:auto,query:(language:kuery,query:'%20container_name:%20"<service-name>"'),sort:!())
Development service stacks keep their logging to stdout/stderr unless the logging dev stack is used.
On a production machine, fluentd is set as the logging driver for the docker daemon by modifying /etc/docker/daemon.json to:
{
"log-driver": "fluentd",
"log-opts": {
"fluentd-sub-second-precision": "true"
}
}
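After modifying /etc/docker/daemon.json, the docker daemon typically needs to be restarted for the logging driver change to take effect:
sudo systemctl restart docker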
Setup sftp
The SFTP image allows remote access to two logging folders. You can define (edit/add) users, passwords, and UID/GID in the respective configuration file (e.g. config/vhr_sftp_users.conf, an example entry is shown below).
The default username is eox. Once the stack is deployed, you can sftp into the logging folders through port 2222 on localhost (if you are running the dev stack):
sftp -P 2222 eox@127.0.0.1
You will be logged into the /home/eox/data directory, which contains the two logging directories: to/panda and from/fepd.
NOTE: The mounted directory you are directed into is /home/user, where user is the username. Hence, when changing the username in the .conf file, the sftp mounted volume paths in docker-compose.<collection>.yml must be changed accordingly.
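For reference, the atmoz/sftp users file takes one entry per line in the form user:password:uid:gid; a hypothetical entry matching the default eox user could look like this (values are placeholders):
eox:<password>:1001:100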
Documentation
Installation
python3 -m pip install sphinx recommonmark sphinx-autobuild
Generate html and synchronize with client/html/user-guide
make html
# For watched html automatic building
make html-watch
# For pdf output and sync it to client/html/
make latexpdf
# To shrink size of pdf
gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dPrinted=false -q -o View-Server_-_User-Guide_small.pdf View-Server_-_User-Guide.pdf
# make latexpdf and make html combined
make build
The documentation is generated in the respective _build/html directory.
Create software releases
Source code release
Create a TAR from source code:
git archive --prefix release-1.0.0.rc.1/ -o release-1.0.0.rc.1.tar.gz -9 master
Save Docker images:
docker save -o pvs_core.tar registry.gitlab.eox.at/esa/prism/vs/pvs_core
docker save -o pvs_cache.tar registry.gitlab.eox.at/esa/prism/vs/pvs_cache
docker save -o pvs_preprocessor.tar registry.gitlab.eox.at/esa/prism/vs/pvs_preprocessor
docker save -o pvs_client.tar registry.gitlab.eox.at/esa/prism/vs/pvs_client
docker save -o pvs_ingestor.tar registry.gitlab.eox.at/esa/prism/vs/pvs_ingestor
docker save -o fluentd.tar registry.gitlab.eox.at/esa/prism/vs/fluentd