
Commit f93246e

enhance support for limits (RFC5) (#1856) (#1892)
* add enhanced support for limits (RFC5) (#1856)
* update GitHub actions
* update GitHub actions
* update GitHub actions
* update GitHub actions
1 parent 86e5b78 commit f93246e

25 files changed

Lines changed: 304 additions & 129 deletions

.github/workflows/containers.yml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -25,7 +25,7 @@ jobs:
2525
contents: read
2626
steps:
2727
- name: Check out the repo
28-
uses: actions/checkout@v3
28+
uses: actions/checkout@master
2929

3030
- name: Set up QEMU
3131
uses: docker/setup-qemu-action@v2.1.0

.github/workflows/docs.yml

Lines changed: 2 additions & 2 deletions
@@ -23,8 +23,8 @@ jobs:
2323
include:
2424
- python-version: '3.10'
2525
steps:
26-
- uses: actions/checkout@v2
27-
- uses: actions/setup-python@v2
26+
- uses: actions/checkout@master
27+
- uses: actions/setup-python@v5
2828
name: Setup Python ${{ matrix.python-version }}
2929
with:
3030
python-version: ${{ matrix.python-version }}

.github/workflows/flake8.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,8 @@ jobs:
77
flake8_py3:
88
runs-on: ubuntu-22.04
99
steps:
10-
- uses: actions/checkout@v3
11-
- uses: actions/setup-python@v3
10+
- uses: actions/checkout@master
11+
- uses: actions/setup-python@v5
1212
name: setup Python
1313
with:
1414
python-version: '3.10'

.github/workflows/main.yml

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -37,8 +37,8 @@ jobs:
3737
- name: Chown user
3838
run: |
3939
sudo chown -R $USER:$USER $GITHUB_WORKSPACE
40-
- uses: actions/checkout@v2
41-
- uses: actions/setup-python@v2
40+
- uses: actions/checkout@master
41+
- uses: actions/setup-python@v5
4242
name: Setup Python ${{ matrix.python-version }}
4343
with:
4444
python-version: ${{ matrix.python-version }}
@@ -69,21 +69,21 @@ jobs:
6969
security-disabled: true
7070
port: 9209
7171
- name: Install and run MongoDB
72-
uses: supercharge/mongodb-github-action@1.5.0
72+
uses: supercharge/mongodb-github-action@1.12.0
7373
with:
7474
mongodb-version: 4.4
7575
- name: Install and run SensorThingsAPI
7676
uses: cgs-earth/sensorthings-action@v0.1.0
7777
- name: Install sqlite and gpkg dependencies
78-
uses: awalsh128/cache-apt-pkgs-action@latest
78+
uses: awalsh128/cache-apt-pkgs-action@v1.4.3
7979
with:
8080
packages: libsqlite3-mod-spatialite
8181
version: 4.3.0a-6build1
8282
- name: Use ubuntuGIS unstable ppa
8383
run: sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable && sudo apt update
8484
shell: bash
8585
- name: Install GDAL with Python bindings
86-
uses: awalsh128/cache-apt-pkgs-action@latest
86+
uses: awalsh128/cache-apt-pkgs-action@v1.4.3
8787
with:
8888
packages: gdal-bin libgdal-dev
8989
version: 3.8.4

docker/default.config.yml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,9 @@ server:
4646
cors: true
4747
pretty_print: true
4848
admin: ${PYGEOAPI_SERVER_ADMIN:-false}
49-
limit: 10
49+
limits:
50+
default_items: 10
51+
max_items: 50
5052
# templates: /path/to/templates
5153
map:
5254
url: https://tile.openstreetmap.org/{z}/{x}/{y}.png

docs/source/configuration.rst

Lines changed: 85 additions & 48 deletions
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,15 @@ For more information related to API design rules (the ``api_rules`` property in
4949
gzip: false # default server config to gzip/compress responses to requests with gzip in the Accept-Encoding header
5050
cors: true # boolean on whether server should support CORS
5151
pretty_print: true # whether JSON responses should be pretty-printed
52-
limit: 10 # server limit on number of items to return
52+
53+
limits: # server limits on number of items to return. This property can also be defined at the resource level to override global server settings
54+
default_items: 50
55+
max_items: 1000
56+
max_distance_x: 25
57+
max_distance_y: 25
58+
max_distance_units: m
59+
on_exceed: throttle # throttle or error (default=throttle)
60+
5361
admin: false # whether to enable the Admin API
5462
5563
# optional configuration to specify a different set of templates for HTML pages. Recommend using absolute paths. Omit this to use the default provided templates
@@ -254,6 +262,41 @@ default.
254262
.. seealso::
255263
:ref:`plugins` for more information on plugins
256264

265+
Using environment variables
266+
---------------------------
267+
268+
pygeoapi configuration supports using system environment variables, which can be helpful
269+
for deploying into `12 factor <https://12factor.net/>`_ environments for example.
270+
271+
Below is an example of how to integrate system environment variables in pygeoapi.
272+
273+
.. code-block:: yaml
274+
275+
server:
276+
bind:
277+
host: ${MY_HOST}
278+
port: ${MY_PORT}
279+
280+
Multiple environment variables are supported as follows:
281+
282+
.. code-block:: yaml
283+
284+
data: ${MY_HOST}:${MY_PORT}
285+
286+
It is also possible to define a default value for a variable in case it does not exist in
287+
the environment using a syntax like: ``value: ${ENV_VAR:-the default}``
288+
289+
.. code-block:: yaml
290+
291+
server:
292+
bind:
293+
host: ${MY_HOST:-localhost}
294+
port: ${MY_PORT:-5000}
295+
metadata:
296+
identification:
297+
title:
298+
en: This is pygeoapi host ${MY_HOST} and port ${MY_PORT:-5000}, nice to meet you!
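Note: the ``${ENV_VAR:-the default}`` syntax documented in this hunk can be illustrated with a minimal Python sketch. This is not pygeoapi's loader; the pattern, the ``expand_env`` helper, and the choice to raise ``KeyError`` for an unset variable without a default are all assumptions made for illustration.

```python
import os
import re

# Simplified pattern (an assumption, not pygeoapi's own): matches
# ${NAME} and ${NAME:-default}
ENV_PATTERN = re.compile(
    r'\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}')


def expand_env(value: str, environ=None) -> str:
    """Hypothetical helper: expand ${NAME} / ${NAME:-default} in a string."""
    env = os.environ if environ is None else environ

    def _substitute(match: re.Match) -> str:
        name = match.group('name')
        default = match.group('default')
        if name in env:
            return env[name]          # the environment wins over the default
        if default is not None:
            return default            # fall back to the inline default
        # Assumed behaviour: fail loudly when nothing can be substituted
        raise KeyError(f'environment variable {name} is not set')

    return ENV_PATTERN.sub(_substitute, value)


# MY_HOST is set, MY_PORT is not, so the inline default is used
print(expand_env('${MY_HOST}:${MY_PORT:-5000}', {'MY_HOST': 'example.org'}))
# prints "example.org:5000"
```

Passing a plain ``dict`` instead of ``os.environ`` keeps the sketch easy to try without touching the real environment.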
299+
257300
Adding links to collections
258301
---------------------------
259302

@@ -389,53 +432,6 @@ If omitted, no header will be added. Common names for this header are ``API-Vers
389432
Note that pygeoapi already adds a ``X-Powered-By`` header by default that includes the software version number.
390433

391434

392-
Validating the configuration
393-
----------------------------
394-
395-
To ensure your configuration is valid, pygeoapi provides a validation
396-
utility that can be run as follows:
397-
398-
.. code-block:: bash
399-
400-
pygeoapi config validate -c /path/to/my-pygeoapi-config.yml
401-
402-
403-
Using environment variables
404-
---------------------------
405-
406-
pygeoapi configuration supports using system environment variables, which can be helpful
407-
for deploying into `12 factor <https://12factor.net/>`_ environments for example.
408-
409-
Below is an example of how to integrate system environment variables in pygeoapi.
410-
411-
.. code-block:: yaml
412-
413-
server:
414-
bind:
415-
host: ${MY_HOST}
416-
port: ${MY_PORT}
417-
418-
Multiple environment variables are supported as follows:
419-
420-
.. code-block:: yaml
421-
422-
data: ${MY_HOST}:${MY_PORT}
423-
424-
It is also possible to define a default value for a variable in case it does not exist in
425-
the environment using a syntax like: ``value: ${ENV_VAR:-the default}``
426-
427-
.. code-block:: yaml
428-
429-
server:
430-
bind:
431-
host: ${MY_HOST:-localhost}
432-
port: ${MY_PORT:-5000}
433-
metadata:
434-
identification:
435-
title:
436-
en: This is pygeoapi host ${MY_HOST} and port ${MY_PORT:-5000}, nice to meet you!
437-
438-
439435
Hierarchical collections
440436
------------------------
441437

@@ -507,6 +503,36 @@ Examples:
507503
curl https://example.org/collections/lakes/items # only the name attribute is returned in properties
508504
curl https://example.org/collections/lakes/items/{item_id} # only the name attribute is returned in properties
509505
506+
Limiting data responses
507+
-----------------------
508+
509+
pygeoapi defines a ``limits`` configuration parameter that allows a user to define default and maximum limits for multiple data types. This parameter is defined at the server level (``server.limits``) and can be overridden at the resource level (``resources[*].limits``). An example of this setting is shown below:
510+
511+
.. code-block:: yaml
512+
513+
limits:
514+
default_items: 10 # applies to vector data
515+
max_items: 500 # applies to vector data
516+
max_distance_x: 123 # applies to all datasets
517+
max_distance_y: 456 # applies to all datasets
518+
max_distance_units: m # as per UCUM https://ucum.org/ucum#section-Tables-of-Terminal-Symbols
519+
on_exceed: error # one of error, throttle
520+
521+
The ``limits`` setting is applied as follows:
522+
523+
- can be defined at both the server and resources levels, with resource limits overriding server wide limits settings
524+
- ``on_exceed`` can be set to ``error`` or ``throttle`` (default). If a client-specified limit exceeds the limit set by the server:
525+
- when set to ``error``, an exception is returned
526+
- when set to ``throttle`` the maximum data allowed by the collection/server/provider is returned
527+
528+
Vector data (features, records)
529+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
530+
- when a limit is not specified by the client, ``limits.default_items`` can be used to set the result set size
531+
- when a limit is specified by the client, the minimum of the ``limit`` parameter and ``limits.max_items`` is calculated to set the result set size
532+
533+
Raster data (coverages, environmental data retrieval)
534+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
535+
- when a bbox or spatial subset is specified by the client, ``limits.max_distance_x``, ``limits.max_distance_y`` and ``limits.max_distance_units`` are used to determine whether a request has asked for more data than the collection is configured to provide and respond accordingly (via ``on_exceed``)
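Note: the vector data rules documented above can be tried out with a small standalone sketch. The function name ``resolve_item_limit`` and the sample values are hypothetical; the fallback values of 10 and the exception types mirror the ``evaluate_limit`` helper added in this commit, but this is an illustration of the documented rules, not the pygeoapi implementation itself.

```python
from collections import ChainMap
from typing import Optional


def resolve_item_limit(requested: Optional[int], server_limits: dict,
                       resource_limits: dict) -> int:
    """Hypothetical helper applying the documented vector limit rules."""
    # Resource-level keys shadow server-level keys, key by key
    limits = ChainMap(resource_limits, server_limits)
    default_items = limits.get('default_items', 10)
    max_items = limits.get('max_items', 10)
    on_exceed = limits.get('on_exceed', 'throttle')

    if requested is None:
        # No client limit: use the configured default
        return default_items
    if requested <= 0:
        raise ValueError('limit value should be strictly positive')
    if requested > max_items and on_exceed == 'error':
        raise RuntimeError('limit exceeded')
    # throttle: never return more than the configured maximum
    return min(requested, max_items)


server = {'default_items': 10, 'max_items': 500}
resource = {'max_items': 100}

print(resolve_item_limit(None, server, resource))   # prints 10
print(resolve_item_limit(25, server, resource))     # prints 25
print(resolve_item_limit(1000, server, resource))   # prints 100
```

With ``on_exceed: error`` set at either level, the last call would raise an exception instead of being throttled to 100.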
510536

511537
Linked Data
512538
-----------
@@ -638,6 +664,17 @@ deployment flexibility, the path can be specified with string interpolation of e
638664
The template ``tests/data/base.jsonld`` renders the unmodified JSON-LD. For more information on the capacities
639665
of Jinja2 templates, see :ref:`html-templating`.
640666

667+
Validating the configuration
668+
----------------------------
669+
670+
To ensure your configuration is valid, pygeoapi provides a validation
671+
utility that can be run as follows:
672+
673+
.. code-block:: bash
674+
675+
pygeoapi config validate -c /path/to/my-pygeoapi-config.yml
676+
677+
641678
Summary
642679
-------
643680

pygeoapi-config.yml

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
#
33
# Authors: Tom Kralidis <tomkralidis@gmail.com>
44
#
5-
# Copyright (c) 2020 Tom Kralidis
5+
# Copyright (c) 2025 Tom Kralidis
66
#
77
# Permission is hereby granted, free of charge, to any person
88
# obtaining a copy of this software and associated documentation
@@ -41,7 +41,9 @@ server:
4141
- fr-CA
4242
# cors: true
4343
pretty_print: true
44-
limit: 10
44+
limits:
45+
default_items: 20
46+
max_items: 50
4547
# templates:
4648
# path: /path/to/Jinja2/templates
4749
# static: /path/to/static/folder # css/js/img

pygeoapi/api/__init__.py

Lines changed: 41 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@
4040
Returns content from plugins and sets responses.
4141
"""
4242

43-
from collections import OrderedDict
43+
from collections import ChainMap, OrderedDict
4444
from copy import deepcopy
4545
from datetime import datetime
4646
from functools import partial
@@ -1623,3 +1623,43 @@ def validate_subset(value: str) -> dict:
16231623
subsets[subset_name] = list(map(get_typed_value, values))
16241624

16251625
return subsets
1626+
1627+
1628+
def evaluate_limit(requested: Union[None, int], server_limits: dict,
1629+
collection_limits: dict) -> int:
1630+
"""
1631+
Helper function to evaluate limit parameter
1632+
1633+
:param requested: the limit requested by the client
1634+
:param server_limits: `dict` of server limits
1635+
:param collection_limits: `dict` of collection limits
1636+
1637+
:returns: `int` of evaluated limit
1638+
"""
1639+
1640+
effective_limits = ChainMap(collection_limits, server_limits)
1641+
1642+
default = effective_limits.get('default_items', 10)
1643+
max_ = effective_limits.get('max_items', 10)
1644+
on_exceed = effective_limits.get('on_exceed', 'throttle')
1645+
1646+
LOGGER.debug(f'Requested limit: {requested}')
1647+
LOGGER.debug(f'Default limit: {default}')
1648+
LOGGER.debug(f'Maximum limit: {max_}')
1649+
LOGGER.debug(f'On exceed: {on_exceed}')
1650+
1651+
if requested is None:
1652+
LOGGER.debug('no limit requested; returning default')
1653+
return default
1654+
1655+
requested2 = get_typed_value(requested)
1656+
if not isinstance(requested2, int):
1657+
raise ValueError('limit value should be an integer')
1658+
1659+
if requested2 <= 0:
1660+
raise ValueError('limit value should be strictly positive')
1661+
elif requested2 > max_ and on_exceed == 'error':
1662+
raise RuntimeError('Limit exceeded; throwing error')
1663+
else:
1664+
LOGGER.debug('limit requested')
1665+
return min(requested2, max_)
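Note: ``evaluate_limit`` relies on ``collections.ChainMap`` to let collection settings override server settings one key at a time, rather than replacing the whole ``limits`` block. The short sketch below shows that lookup behaviour in isolation; the sample dictionaries are made-up values, not taken from any shipped configuration.

```python
from collections import ChainMap

# Made-up sample values for illustration
server_limits = {'default_items': 10, 'max_items': 50}
collection_limits = {'max_items': 500, 'on_exceed': 'error'}

# Earlier mappings take precedence, so the collection is searched first
effective = ChainMap(collection_limits, server_limits)

print(effective['max_items'])                  # prints 500 (collection wins)
print(effective['default_items'])              # prints 10 (falls through to server)
print(effective.get('on_exceed', 'throttle'))  # prints error

# With no limits configured anywhere, .get() falls back to the given default
print(ChainMap({}, {}).get('max_items', 10))   # prints 10
```

One consequence worth keeping in mind: a collection that sets only ``max_items`` still inherits ``default_items`` and ``on_exceed`` from the server block.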

pygeoapi/api/environmental_data_retrieval.py

Lines changed: 18 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,7 @@
4747
from shapely.wkt import loads as shapely_loads
4848

4949
from pygeoapi import l10n
50+
from pygeoapi.api import evaluate_limit
5051
from pygeoapi.plugin import load_plugin, PLUGINS
5152
from pygeoapi.provider.base import (
5253
ProviderGenericError, ProviderItemNotFoundError)
@@ -342,6 +343,21 @@ def get_collection_edr_query(api: API, request: APIRequest,
342343
HTTPStatus.BAD_REQUEST, headers, request.format,
343344
'InvalidParameterValue', msg)
344345

346+
LOGGER.debug('Processing limit parameter')
347+
if api.config['server'].get('limit') is not None:
348+
msg = ('server.limit is no longer supported! '
349+
'Please use limits at the server or collection '
350+
'level (RFC5)')
351+
LOGGER.warning(msg)
352+
try:
353+
limit = evaluate_limit(request.params.get('limit'),
354+
api.config['server'].get('limits', {}),
355+
collections[dataset].get('limits', {}))
356+
except ValueError as err:
357+
return api.get_exception(
358+
HTTPStatus.BAD_REQUEST, headers, request.format,
359+
'InvalidParameterValue', str(err))
360+
345361
query_args = dict(
346362
query_type=query_type,
347363
instance=instance,
@@ -353,8 +369,8 @@ def get_collection_edr_query(api: API, request: APIRequest,
353369
bbox=bbox,
354370
within=within,
355371
within_units=within_units,
356-
limit=int(api.config['server']['limit']),
357-
location_id=location_id,
372+
limit=limit,
373+
location_id=location_id
358374
)
359375

360376
try:
