Configuration
This chapter details how a running VS stack can be configured and what steps are necessary to deploy the configuration changes.
In order for these configuration changes to be picked up by a running VS stack and to take effect, some steps need to be performed: either a "re-deploy" of the running stack or a complete re-creation of it.
Stack Re-deploy
As will be further described, for some configurations it is sufficient to "re-deploy" the stack, which automatically re-starts any service with a changed configuration. This is done by re-using the stack deployment command:
docker stack deploy -c docker-compose.<name>.yml -c docker-compose.<name>.dev.yml <stack-name>
Warning
When calling the docker stack deploy command, it is vital to use the same files and the same stack name with which the stack was originally created.
Stack Re-creation
In some cases a stack re-deploy is not enough, as the configuration was used for a materialized instance which needs to be reverted. The easiest way to do this is to delete the volume in question. If, for example, the renderer/registrar configuration was updated, the instance-data volume needs to be re-created.
First, the stack needs to be shut down. This is done using the following command:
docker stack rm <stack-name>
When that command has completed (it is advisable to wait for some time until all containers have actually stopped), the next step is to delete the instance-data volume:
docker volume rm <stack-name>_instance-data
Note
It is possible that this command fails with an error message stating that the volume is still in use. In this case, it is advisable to wait for a minute and to try the deletion again.
Now that the volume is deleted, the stack can be re-deployed as described above, which will trigger the automatic re-creation and initialization of the volume. For the instance-data volume, this means that the instance will be re-created and all database models with it.
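For reference, the full re-creation sequence for a hypothetical stack named vs might look as follows (the compose file names must match the ones the stack was originally created with):
docker stack rm vs
# wait until all containers have actually stopped, then:
docker volume rm vs_instance-data
# re-deploying triggers the re-creation and initialization of the volume:
docker stack deploy -c docker-compose.vs.yml -c docker-compose.vs.dev.yml vs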
Docker Compose Settings
These configurations alter the behavior of the stack itself and its contained services. A complete reference of the configuration file structure can be found in the Docker Compose documentation.
Environment Variables
These variables are passed to their respective containers' environment and change the behavior of certain functionality. They can be declared in the Docker Compose configuration file directly, but typically they are bundled by field of interest, placed into .env files and then passed to the containers. For example, there will be a <stack-name>_obs.env file to store the access parameters for the object storage. All those files are placed in the env/ directory in the instance's directory.
Environment variables and .env files are passed to the services via the docker-compose.yml directives. The following example shows how to pass .env files and direct environment variables:
services:
  # ....
  registrar:
    env_file:
      - env/stack.env
      - env/stack_db.env
      - env/stack_obs.env
      - env/stack_redis.env
    environment:
      INSTANCE_ID: "prism-view-server_registrar"
      INSTALL_DIR: "/var/www/pvs/dev/"
      INIT_SCRIPTS: "/configure.sh /init-db.sh /initialized.sh"
      STARTUP_SCRIPTS: "/wait-initialized.sh"
      WAIT_SERVICES: "redis:6379 database:5432"
      OS_PASSWORD_FILE: "/run/secrets/OS_PASSWORD"
    # ...
.env Files
The following .env files are typically used:
- <stack-name>.env: The general .env file used for all services.
- <stack-name>_db.env: The database access credentials, for all services interacting with the database.
- <stack-name>_django.env: This env file defines the credentials for the Django admin user to be used with the admin GUI.
- <stack-name>_obs.env: This contains the access parameters for the object storage(s).
- <stack-name>_redis.env: Redis access credentials and queue names.
Groups of Environment Variables
GDAL Environment Variables
This group of environment variables controls how GDAL interacts with the files it reads and writes. As GDAL supports a variety of formats and backends, most of the full list of environment variables are not applicable; only a handful are actually relevant for the VS.
- GDAL_DISABLE_READDIR_ON_OPEN: Especially when using an object storage backend with a very large number of files, it is vital to activate this setting (=TRUE) in order to suppress reading the whole directory contents, which is very slow for some object storage backends.
- CPL_VSIL_CURL_ALLOWED_EXTENSIONS: This limits the file extensions in order to disable the lookup of so-called sidecar files, which are not used for the VS. By default this value is used: =.TIF,.tif,.xml.
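In an .env file, these GDAL settings might be set like this (a sketch using the default value mentioned above):
GDAL_DISABLE_READDIR_ON_OPEN=TRUE
CPL_VSIL_CURL_ALLOWED_EXTENSIONS=.TIF,.tif,.xml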
OpenStack Swift Environment Variables
These variables define the access coordinates and credentials for the OpenStack Swift Object storage backend.
This set of variables defines the credentials for the object storage used to place the preprocessed results:
ST_AUTH_VERSION
OS_AUTH_URL_SHORT
OS_AUTH_URL
OS_USERNAME
OS_PASSWORD
OS_TENANT_NAME
OS_TENANT_ID
OS_REGION_NAME
OS_USER_DOMAIN_NAME
This set of variables defines the credentials for the object storage used to retrieve the original product files:
OS_USERNAME_DOWNLOAD
OS_PASSWORD_DOWNLOAD
OS_TENANT_NAME_DOWNLOAD
OS_TENANT_ID_DOWNLOAD
OS_REGION_NAME_DOWNLOAD
OS_AUTH_URL_DOWNLOAD
ST_AUTH_VERSION_DOWNLOAD
OS_USER_DOMAIN_NAME_DOWNLOAD
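A <stack-name>_obs.env file might therefore look like the following sketch. All values are hypothetical placeholders; note that the passwords themselves are better stored as Docker secrets, as described in the Sensitive variables section below:
ST_AUTH_VERSION=3
OS_AUTH_URL=https://auth.cloud.example.com/v3
OS_USERNAME=vs-results-user
OS_TENANT_NAME=vs-tenant
OS_REGION_NAME=RegionOne
OS_USER_DOMAIN_NAME=Default
# access to the storage holding the original product files
OS_AUTH_URL_DOWNLOAD=https://auth.cloud.example.com/v3
OS_USERNAME_DOWNLOAD=vs-products-user
OS_TENANT_NAME_DOWNLOAD=products-tenant
OS_REGION_NAME_DOWNLOAD=RegionOne
OS_USER_DOMAIN_NAME_DOWNLOAD=Default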
VS Environment Variables
These environment variables are used by the VS itself to configure various parts.
Note
These variables are used during the initial stack setup. When these variables are changed, they will not be reflected unless the instance volume is re-created.
- COLLECTION: This defines the name of the main collection. It is used in various parts of the VS and serves as the layer base name.
- UPLOAD_CONTAINER: This controls the name of the bucket where the preprocessed images are uploaded to.
- DJANGO_USER, DJANGO_MAIL, DJANGO_PASSWORD: The Django admin user account credentials to use the admin GUI.
- REPORTING_DIR: This sets the directory to write the reports of the registered products to.
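In the general <stack-name>.env file these variables might look as follows (hypothetical values for illustration only; DJANGO_PASSWORD is best provided as a Docker secret, see below):
COLLECTION=VS_EXAMPLE
UPLOAD_CONTAINER=vs-example-preprocessed
DJANGO_USER=admin
DJANGO_MAIL=admin@example.com
REPORTING_DIR=/mnt/reports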
Note
These variables are used during the initial stack setup. When these variables are changed, they will not be reflected unless the database volume is re-created.
These are the internal access credentials for the database:
POSTGRES_USER
POSTGRES_PASSWORD
POSTGRES_DB
DB
DB_USER
DB_PW
DB_HOST
DB_PORT
DB_NAME
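Put together in a <stack-name>_db.env file, this could look like the following sketch (all values are placeholders; DB_HOST and DB_PORT refer to the database service inside the stack, as seen in the WAIT_SERVICES example above):
POSTGRES_USER=vs_user
POSTGRES_PASSWORD=change-me
POSTGRES_DB=vs_db
DB_USER=vs_user
DB_PW=change-me
DB_HOST=database
DB_PORT=5432
DB_NAME=vs_db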
Configuration Files
Such files are passed to the containers in a similar way to environment variables, but usually contain more settings at once and are placed at a specific path in the container at runtime.
Configuration files are passed into the containers using the configs section of the docker-compose.yaml file. The following example shows how such a configuration file is defined and then used in a service:
# ...
configs:
  my-config:
    file: ./config/example.cfg
# ...
services:
  myservice:
    # ...
    configs:
      - source: my-config
        target: /example.cfg
The following configuration files are used throughout the VS:
<stack-name>_init-db.sh
This shell script sets up the EOxServer instance used by both the renderer and registrar.
<stack-name>_index-dev.html / <stack-name>_index-ops.html
The client's main HTML page, containing various client settings. The dev one is used for development only, whereas the ops one is used for operational deployment.
<stack-name>_mapcache-dev.xml / <stack-name>_mapcache-ops.xml
The configuration file for MapCache, the software powering the cache service. Similarly to the client configuration files, the dev and ops files are used for development and operational usage respectively. Further documentation can be found at the official site.
<stack-name>_preprocessor-config.yaml
The configuration for the preprocessing service, used to process the files to be ingested. The file uses YAML as its format and is structured in the following fashion (a schematic example is given after this list):
source/target
Here, the source file storage and the target file storage are configured. This can either be a local directory or an OpenStack Swift object storage.
workdir
The workdir can be configured to determine where the intermediate files are placed. This can be convenient for debugging and development.
keep_temp
This boolean decides whether the temporary directory for the preprocessing will be cleaned up after it has finished. This is also convenient for development.
metadata_glob
This file glob is used to determine the main metadata file to extract the product type from. This file will be searched in the downloaded package.
glob_case
Whether all globs will be used in a case-sensitive way.
type_extractor
This setting configures how the product type is extracted from the previously extracted metadata. In the xpath setting, one or more XPath expressions can be supplied to fetch the product type. Each XPath will be tried until one produces a result. These results can then be mapped using the map dictionary.
level_extractor
This section works very similarly to the type_extractor, but only for the product level. The product level is currently not used.
preprocessing
This is the actual preprocessing configuration. It is split into defaults and product type specific settings; the defaults are applied where there is no setting supplied for that specific type. The product type is the one extracted earlier.
defaults
This section allows configuring any one of the available steps. Each step configuration can be overridden in a specific product type configuration.
The available steps are as follows:
custom_preprocessor
A custom Python function to be called.
- path: The Python module path to the function to call.
- args: A list of arguments to pass to the function.
- kwargs: A dictionary of keyword arguments to pass to the function.
subdatasets
What subdatasets to extract and how to name them.
- data_file_glob: A file glob pattern to select files to extract from.
- subdataset_types: Mapping of subdataset identifier to output filename postfix for subdatasets to be extracted for each data file.
georeference
How the extracted files shall be georeferenced.
- type: The type of georeferencing to apply. One of gcp, rpc, corner, world.
- options: Additional options for the georeferencing, depending on its type:
  - order: The polynomial order to use for GCP related georeferencing.
  - projection: The projection to use for ungeoreferenced images.
  - rpc_file_template: The file glob template to use to find the RPC file. Template parameters are {filename}, {fileroot}, and {extension}.
  - warp_options: Warp options. See https://gdal.org/python/osgeo.gdal-module.html#WarpOptions for details.
  - corner_names: The metadata field names containing the corner names. Tuple of four: bottom-left, bottom-right, top-left and top-right.
  - orbit_direction_name: The metadata field name containing the orbit direction.
  - force_north_up: TODO
  - tps: Whether to use TPS transformation instead of GCP polynomials.
calc
Calculate derived data using formulas.
- formulas: A list of formulas to use to calculate derived data. Each has the following fields:
  - inputs: A map of characters in the range of A-Z to the respective inputs. Each has the following properties:
    - glob: The input file glob.
    - band: The input file band index (1-based).
  - data_type: The GDAL data type name for the output.
  - formula: The formula to apply. See https://gdal.org/programs/gdal_calc.html#cmdoption-calc for details.
  - output_postfix: The postfix to apply to the filename of the created file.
  - nodata_value: The nodata value to be used.
stack_bands
Concatenate bands and arrange them in a single file.
- group_by: A regex to group the input datasets, if they consist of multiple files. The first regex group is used for the grouping.
- sort_by: A regex to select a portion of the filename to be used for sorting. The first regex group is used.
- order: The order of the items extracted by sort_by. When the value extracted by sort_by is missing, that file will be dropped.
output
Final adjustments to generate an output file. Add overviews, reproject to a common projection, etc.
- options: Options to be passed to gdal.Warp. See https://gdal.org/python/osgeo.gdal-module.html#WarpOptions for details.
custom_postprocessor
A custom Python function to be called.
- path: The Python module path to the function to call.
- args: A list of arguments to pass to the function.
- kwargs: A dictionary of keyword arguments to pass to the function.
types
This mapping of product type identifier to step configuration allows defining specific step settings, even overriding the values from the defaults.
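To illustrate the overall shape of such a file, the following is a minimal, hypothetical sketch. All paths, globs, XPath expressions and product type names are placeholders, and the exact keys of the source/target backend configuration are only indicated, not spelled out:
source:
  type: swift
  # ... access parameters of the storage the packages are downloaded from
target:
  type: swift
  # ... access parameters of the storage the results are uploaded to
workdir: /tmp/preprocessing
keep_temp: false
metadata_glob: "*metadata*.xml"
glob_case: false
type_extractor:
  xpath:
    - //ProductType/text()    # tried in order until one yields a result
  map:
    RAW_TYPE_A: PRODUCT_TYPE_A
preprocessing:
  defaults:
    georeference:
      type: gcp
      options:
        order: 1
  types:
    PRODUCT_TYPE_A:
      georeference:
        type: rpc    # overrides the default step setting for this type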
Sensitive variables
Since environment variables include credentials that are considered sensitive, avoiding their exposure inside .env files is the right practice. In order to transmit sensitive data securely into the respective containers, docker secrets with the values of these variables should be created. Currently, three variables have to be saved as docker secrets before deploying the swarm: OS_PASSWORD, OS_PASSWORD_DOWNLOAD and DJANGO_PASSWORD.
Two other docker secrets need to be created for traefik basic authentication: BASIC_AUTH_USERS_AUTH, used for access to services, and BASIC_AUTH_USERS_APIAUTH, used for admin access to kibana and traefik. These secrets should be text files containing a list of username:hashedpassword (MD5, SHA1, BCrypt) pairs.
Additionally, the configuration of the sftp image contains sensitive information and is therefore created using docker configs. The configuration for the sftp image can, for example, be created with the following command:
printf "<user>:<password>:<UID>:<GID>" | docker config create sftp-users -
The OS_PASSWORD secret can, for example, be created with the following command:
printf "<password_value>" | docker secret create OS_PASSWORD -
An example of creating the BASIC_AUTH_USERS_AUTH secret:
htpasswd -nb user1 3vYxfRqUx4H2ar3fsEOR95M30eNJne >> auth_list.txt
htpasswd -nb user2 YyuN9bYRvBUUU6COx7itWw5qyyARus >> auth_list.txt
docker secret create BASIC_AUTH_USERS_AUTH auth_list.txt
The next section, Management, describes how an operator interacts with a deployed VS stack.