on, and a particular dependency is either missing or incompatible, the build
should fail with an explanatory message, giving the packager an explicit
indication of the problem and a chance to consciously decide on the preferred
course of action.


.. _Support downstream testing:

Support downstream testing
--------------------------

A variety of downstream projects run some degree of testing on the packaged
Python projects. Depending on the particular case, this can range from minimal
smoke testing to comprehensive runs of the complete test suite. There can
be various reasons for doing this, for example:

- Verifying that the downstream packaging did not introduce any bugs.

- Testing on a platform that is not covered by upstream testing.

- Finding subtle bugs that can only be reproduced with particular hardware,
  specific system package versions, and so on.

- Testing the released package against newer dependency versions than the
  ones present during upstream release testing.

- Testing the package in an environment closely resembling the production
  setup. This can detect issues caused by nontrivial interactions between
  different installed packages, including packages that are not dependencies
  of your package, but nevertheless can cause issues.

- Testing the released package against newer Python versions (including newer
  point releases), or less tested Python implementations such as PyPy.

Admittedly, downstream testing may sometimes yield false positives, or bother
you about scenarios that you are not interested in supporting. However,
perhaps even more often it provides early notice of problems, or finds
nontrivial bugs that would otherwise cause issues for your users in
production. And believe me, the majority of downstream packagers are doing
their best to double-check their results, and to help you triage and fix
the bugs that they report.

There are a number of things that you can do to help us test your package
better. Some of them were already mentioned in this discussion. Some
examples are:

- Include the test files and fixtures in the source distribution, or make it
  possible to easily download them separately.

- Do not write to the package directory during testing. Downstream test
  setups sometimes run tests on top of the installed package, and test-time
  modifications can end up being part of the production package!

- Make the test suite work offline. Mock network interactions, using packages
  such as responses_ or vcrpy_. If that is not possible, make it easy to
  disable the tests that use Internet access, e.g. via a pytest marker.
  Use pytest-socket_ to verify that your tests work offline.

- Make your tests work without a specialized setup, or perform the necessary
  setup as part of test fixtures. Do not ever assume that you can connect
  to system services such as databases — in an extreme case, you could crash
  a production service!

- Do not assume that the test suite will be run with ``-Werror``. Downstreams
  often need to disable that, as it causes false positives, e.g. due to newer
  dependency versions. Assert for warnings using ``pytest.warns()`` rather
  than ``pytest.raises()``!

- Aim to make your test suite reliable. Avoid flaky tests. Avoid depending
  on specific platform details; do not rely on the exact results of
  floating-point computations, the timing of operations, and so on. Fuzzing
  has its advantages, but you want to have static test cases for
  completeness as well.

- Split tests by their purpose, and make it easy to skip categories that are
  irrelevant or problematic. Since the primary purpose of downstream testing
  is to ensure that the package itself works, we are generally not
  interested in e.g. checking code coverage, code formatting, typing, or
  running benchmarks. These tests can fail as dependencies are upgraded or
  the system is under load, without the package itself actually being
  affected.
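
To illustrate the point about not writing to the package: a test that needs
to produce files can use a temporary directory instead of writing next to the
installed modules. This is a minimal sketch; ``test_save_report`` and the
file name are made up for illustration.

```python
import pathlib
import tempfile

def test_save_report():
    # Write into a throwaway directory, never into the package tree
    # (e.g. next to __file__), so an installed copy of the package
    # stays pristine.
    with tempfile.TemporaryDirectory() as tmp:
        out = pathlib.Path(tmp) / "report.txt"
        out.write_text("all checks passed\n")
        assert out.read_text() == "all checks passed\n"
```

When using pytest, the built-in ``tmp_path`` fixture achieves the same with
less boilerplate.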
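
For the offline-testing advice, here is a stdlib-only sketch using
``unittest.mock``; the responses and vcrpy packages offer richer, HTTP-aware
equivalents. ``fetch_version`` and its URL are hypothetical.

```python
import urllib.request
from unittest import mock

def fetch_version(url="https://example.invalid/version"):
    # Hypothetical code under test: fetch a version string over HTTP.
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode().strip()

def test_fetch_version_offline():
    # Substitute a canned response for urlopen(), so the test never
    # touches the network.
    fake = mock.MagicMock()
    fake.read.return_value = b"1.2.3\n"
    fake.__enter__.return_value = fake
    with mock.patch("urllib.request.urlopen", return_value=fake):
        assert fetch_version() == "1.2.3"
```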
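
The fixture advice might look like the following sketch, which uses an
in-memory SQLite database in place of a system-wide database server; in a
real suite this would typically be a pytest fixture, and all names here are
illustrative.

```python
import sqlite3

def make_test_db():
    # Create a throwaway in-memory database instead of assuming that
    # a (possibly production!) database server is reachable.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    return conn

def test_insert_user():
    conn = make_test_db()
    try:
        conn.execute("INSERT INTO users VALUES ('alice')")
        count = conn.execute("SELECT count(*) FROM users").fetchone()[0]
        assert count == 1
    finally:
        conn.close()
```

With a pytest ``yield`` fixture, the cleanup in the ``finally`` block happens
automatically after each test.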
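
To illustrate the warning-assertion advice: the sketch below passes whether
or not warnings are turned into errors, because it uses ``pytest.warns()``
rather than expecting an exception. ``old_api`` is a made-up deprecated
function.

```python
import warnings

import pytest

def old_api():
    # Hypothetical deprecated function.
    warnings.warn("use new_api() instead", DeprecationWarning)
    return 42

def test_old_api_warns():
    # pytest.warns() catches the warning regardless of -W settings;
    # pytest.raises(DeprecationWarning) would only pass under -Werror.
    with pytest.warns(DeprecationWarning):
        assert old_api() == 42
```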
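
The advice on splitting tests can be sketched with pytest markers; the
marker names below are hypothetical, and would additionally need to be
registered in the pytest configuration so that downstreams can deselect
them, e.g. with ``pytest -m "not benchmark and not network"``.

```python
import pytest

@pytest.mark.benchmark
def test_throughput():
    # Performance measurement: prone to failing on a loaded system.
    assert sum(range(1000)) == 499500

@pytest.mark.network
def test_can_reach_index():
    # Needs Internet access; downstreams may want to deselect it.
    pytest.skip("requires network access")

def test_core_behaviour():
    # Plain functional test: this is what downstream testing cares about.
    assert max([3, 1, 2]) == 3
```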


.. _responses: https://pypi.org/project/responses/
.. _vcrpy: https://pypi.org/project/vcrpy/
.. _pytest-socket: https://pypi.org/project/pytest-socket/