Generator approach in harvesting
For larger harvests, it makes sense to push item to the queue iteratively, instead of waiting for the whole result set.
This can be achieved quite easily by transforming the
endpoint.harvest method to
yield items instead of
return them. Also the
cql_filter function must be adapted. The
stringify function could look like this:
def stringify( result: Generator[dict], mode: str = "item", extract_property: Optional[str] = None ): if mode == "item": yield from (json.dumps(item, default=str) for item in result) elif mode == "property": yield from (item["properties"][extract_property] for item in result)
main looks quite unchanged, only the
client.lpush(harvest_config["queue"], *encoded) must be done in a for-loop.