In this we translate the logic of the database fields and add details that we should know
There are some resources that can be used as reference:
A tree is a set of repos and branches, normally just one pair, but more are possible. Essentially, a tree is a copy of the code base maintained separately for a particular purpose.
Maestro maintains its own database of trees, where each is given a
name (Checkouts.tree_name), those names are generally
accepted and often recognized, but were chosen by
Maestro (KernelCI Legacy before it) developers and maintainers over a
period of time, and are not standardized, other CI systems are free
to report whatever they want there. Most choose not to report anything
in that field.
A checkout is a record of a CI system checking out a particular revision from a particular git repository and branch. The json schema also says that checkouts "represents the way the tested source code was obtained and its original location. E.g. checking out a particular commit from a git repo, and applying a set of patches on top."
Tests are linked to a build
Important thing to look at the tests table is the path column, which is the
type of test that it is performing. They will be split by dots, which means that
it is a child test from that test. eg: boot.start is a child of boot
-
start_timeis the time that a test has started, we can use that to graph X-Axis graphs of tests over time -
statuscolumn show a status that can bePASSSKIPorERROR -
misca JSON field that is used to store miscellaneous data about the test like archicture, compiler, runtime, kernel type, etc. However, it does not have a fixed structure, and therefore can contain any type of information the CI system finds useful to store. It's possible to note this phenomenon when comparing the misc field between origins. Themiscfield might have data related to errors, but's generally not used since it's unreliable
example of a typical misc field of Maestro:
{
"arch": "x86_64",
"runtime": "lava-collabora",
"compiler": "gcc-12",
"kernel_type": "bzimage"
}environment_misca JSON field that we can use for things about the environment like detecting the platform, this is not set in stone and can change, so we should double check the validation. On Maestro it's expected thatplatformshould be a field inside theenvironment_miscJSON.
{
"job_id": "14750408",
"platform": "hp-x360-14a-cb0001xx-zork"
}NOTE: The contents of the columns
miscandenvironment_miscare not standardized, and shouldn't be relied upon to ever be used consistently by more than one CI system ("origin"), or even by the originating CI system itself, over time. The most it could be used for is a proof-of-concept.
environment_compatiblecontains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least.
NOTE: The concept of "Hardware" used througout the dashboard also includes the
platformfound in theenvironment_misccolumn. Up to this point, the rule is to prioritizeplatformoverenvironment_compatiblesince, at least on Maestro,platformprovides a more informative representation of the hardware used during the test/boot/build. This, however, does not apply to the Hardware Listing and Details pages that still prioritize theenvironment_compatibleoverplatformFor more information about
environment_compatiblethese resources can be used as a reference:
output_filesis a JSON field that has an array of objects, each of them having data about the artifacts generated by the test. The Builds table also has aninput_filescolumn that respects the same structure
[
{
"url":"https://storage.kernelci.org/broonie-sound/for-next/v6.11-rc3-218-gc76d5dfbfc97/arm64/defconfig/gcc-12/lab-broonie/lava-meson-sm1-s905d3-libretech-cc.json",
"name":"lava_json"
},
{
"url":"https://storage.kernelci.org/broonie-sound/for-next/v6.11-rc3-218-gc76d5dfbfc97/arm64/defconfig/gcc-12/lab-broonie/kselftest-alsa-meson-sm1-s905d3-libretech-cc.txt",
"name":"txt"
},
{
"url":"https://storage.kernelci.org/broonie-sound/for-next/v6.11-rc3-218-gc76d5dfbfc97/arm64/defconfig/gcc-12/lab-broonie/kselftest-alsa-meson-sm1-s905d3-libretech-cc.html",
"name":"html"
}
]Revision has no respective table, but it is a collection of checkouts with
the same git_commit_hash and patchset_hash. But patchset_hash is not commonly used these days.
Some indexes were defined to improve the database performance, they were chosen based on if the field is used on query filters. Fields that are only selected or only filtered in the Python layer don't need to be indexed, and the excess of indexes can impact insertion performance.
For most fields, a default b-tree index is used; for array fields, a GIN index is used since it can handle containment checks. One expression index was also created for the test platform, which lives inside the environment_misc JSON field.
Django also creates some indexes automatically:
- Index for primary key as b-tree;
- Index for primary key as b-tree with
text_pattern_ops, specialized forLIKEoperations. - Index for foreign key as b-tree;
- Index for foreign key as b-tree with
text_pattern_ops, specialized forLIKEoperations.
One of the operations that we are doing benefits from this double indexes, which is the filtering of dummy builds (builds where id LIKE maestro:_dummy%).
There is only one index that diverges from model to database, which is tests_origin_time_platform. This is because we want an index on the json field environment_misc ->> 'platform' -- with ->> -- but Django only uses -> (check https://docs.djangoproject.com/en/5.2/topics/db/queries/#module-django.db.models.fields.json). The divergence is created in the migration where this index is added.