Commit be242a8

Add a section on downstream testing
1 parent cff76df commit be242a8

1 file changed

Lines changed: 79 additions & 0 deletions

source/discussions/downstream-packaging.rst

@@ -192,3 +192,82 @@ on, and a particular dependency is either missing or incompatible, the build
should fail with an explanatory message, giving the packager an explicit
indication of the problem and a chance to consciously decide on the preferred
course of action.


.. _Support downstream testing:

Support downstream testing
--------------------------
A variety of downstream projects run some degree of testing on the packaged
Python projects. Depending on the particular case, this can range from minimal
smoke testing to comprehensive runs of the complete test suite. There can
be various reasons for doing this, for example:

- Verifying that the downstream packaging did not introduce any bugs.

- Testing on a platform that is not covered by upstream testing.

- Finding subtle bugs that can only be reproduced on particular hardware,
  with particular system package versions, and so on.

- Testing the released package against newer dependency versions than the ones
  present during upstream release testing.

- Testing the package in an environment closely resembling the production
  setup. This can detect issues caused by nontrivial interactions between
  different installed packages, including packages that are not dependencies
  of your package, but nevertheless can cause issues.

- Testing the released package against newer Python versions (including newer
  point releases), or less tested Python implementations such as PyPy.

Admittedly, sometimes downstream testing may yield false positives or
inconvenience you with scenarios that you are not interested in supporting.
However, perhaps even more often it provides early notice of problems,
or finds nontrivial bugs that would otherwise cause issues for your users
in production. And believe me, the majority of downstream packagers are doing
their best to double-check their results, and help you triage and fix the bugs
that they report.

There are a number of things that you can do to help us test your package
better. Some of them were already mentioned in this discussion. Some examples
are:

- Include the test files and fixtures in the source distribution, or make it
  possible to easily download them separately.

- Do not write to the package during testing. Downstream test setups sometimes
  run tests on top of the installed package, and test-time modifications can
  end up being part of the production package!

- Make the test suite work offline. Mock network interactions, using packages
  such as responses_ or vcrpy_. If that is not possible, make it possible
  to easily disable the tests that use Internet access, e.g. via a pytest
  marker. Use pytest-socket_ to verify that your tests work offline.

- Make your tests work without a specialized setup, or perform the necessary
  setup as part of test fixtures. Do not ever assume that you can connect
  to system services such as databases. In an extreme case, you could crash
  a production service!

- Do not assume that the test suite will be run with ``-Werror``. Downstreams
  often need to disable that, as it causes false positives, e.g. due to newer
  dependency versions. Assert for warnings using ``pytest.warns()`` rather
  than ``pytest.raises()``!

- Aim to make your test suite reliable. Avoid flaky tests. Avoid depending
  on specific platform details, on exact results of floating-point
  computation, on timing of operations, and so on. Fuzzing has its advantages,
  but you want to have static test cases for completeness as well.

- Split tests by their purpose, and make it easy to skip categories that are
  irrelevant or problematic. Since the primary purpose of downstream testing is
  to ensure that the package itself works, we generally are not interested
  in e.g. checking code coverage, code formatting, typing or running
  benchmarks. These tests can fail as dependencies are upgraded or the system
  is under load, without actually affecting the package itself.
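
To illustrate the ``-Werror`` point from the list above, here is a minimal
sketch that uses only the standard library. ``old_api`` and the helper names
are hypothetical and exist only for this example, and
``warnings.catch_warnings(record=True)`` is used as a stand-in for what
``pytest.warns()`` does for you. The fragile check only passes when warnings
are turned into errors, while the robust check passes under any warning
filter.

```python
import warnings


def old_api():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def fragile_check():
    """Passes only when warnings are turned into errors (as with -Werror)."""
    try:
        old_api()
    except DeprecationWarning:
        return True
    return False


def robust_check():
    """Records the warning, so it passes under any warning filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        old_api()
    return any(issubclass(w.category, DeprecationWarning) for w in caught)


def run_with_filter(action, check):
    """Run a check with the given global warning filter action."""
    with warnings.catch_warnings():
        warnings.simplefilter(action)
        return check()
```

In a pytest-based suite, the robust variant would typically be written as
``with pytest.warns(DeprecationWarning): old_api()``.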


.. _responses: https://pypi.org/project/responses/
.. _vcrpy: https://pypi.org/project/vcrpy/
.. _pytest-socket: https://pypi.org/project/pytest-socket/
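
As a rough sketch of the "make the test suite work offline" advice, the
example below replaces the network call with a canned response. It uses
``unittest.mock`` from the standard library as a simple stand-in for
responses or vcrpy. The ``fetch_version`` function, the check function and
the URL are all hypothetical and exist only for illustration.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_version(url):
    """Hypothetical function under test: read a version from a JSON endpoint."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)["version"]


def check_fetch_version_offline():
    """Exercise fetch_version() without touching the network."""
    # Canned response body that the fake urlopen() will hand back.
    canned = io.BytesIO(b'{"version": "1.2.3"}')
    with mock.patch("urllib.request.urlopen", return_value=canned) as fake:
        result = fetch_version("https://example.invalid/api")
    # The fake was called instead of the real urlopen().
    fake.assert_called_once_with("https://example.invalid/api")
    return result
```

For tests that cannot reasonably be mocked, a custom pytest marker (for
example a hypothetical ``@pytest.mark.network``) lets downstreams deselect
them with ``pytest -m "not network"``.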
