Reusable pre-commit hook for validating Airflow DAG files without importing Airflow. The validator parses DAG files with Python's AST, so it can run on developer machines and in CI without a full Airflow runtime.
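To illustrate why no Airflow install is needed, here is a minimal sketch of AST-based inspection. It is not the validator's actual code: `find_dag_calls` and the `"<non-literal>"` marker are hypothetical names used only for this example, and matching on the call name `DAG` is a simplifying assumption.

```python
import ast

# The DAG file is only parsed, never executed, so the airflow import
# inside SOURCE does not have to resolve on this machine.
SOURCE = '''
from airflow import DAG

with DAG(dag_id="dp_sales", schedule="@daily") as dag:
    pass
'''


def find_dag_calls(source: str) -> list[dict[str, object]]:
    """Return the keyword arguments of every DAG(...) call found in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both DAG(...) and models.DAG(...)
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DAG":
            continue
        kwargs: dict[str, object] = {}
        for kw in node.keywords:
            if kw.arg is None:  # **kwargs expansion, nothing to record
                continue
            try:
                kwargs[kw.arg] = ast.literal_eval(kw.value)
            except (ValueError, TypeError):
                kwargs[kw.arg] = "<non-literal>"  # e.g. datetime(2024, 1, 1)
        found.append(kwargs)
    return found


dag_calls = find_dag_calls(SOURCE)
print(dag_calls)
```

Because the source is parsed rather than imported, a missing or broken Airflow installation cannot affect the result.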
Add this to the consuming repo's .pre-commit-config.yaml:

    repos:
      - repo: https://github.com/bsushmith/airflow-pre-commit
        rev: v1.0.0
        hooks:
          - id: airflow-dag-validator
            args: [--config, .dag-validator.toml, --show-warnings]

For local testing before publishing this repo:

    repos:
      - repo: /absolute/path/to/airflow-pre-commit
        rev: HEAD
        hooks:
          - id: airflow-dag-validator
            args: [--config, .dag-validator.toml, --show-warnings]

Then run:

    pre-commit install
    pre-commit run airflow-dag-validator --all-files

By default, the hook validates Python files under dags/.
Create .dag-validator.toml in the consuming repo:

    [checks]
    dag_id_present = true
    owner_present = true
    schedule_present = true
    start_date_present = true
    airflow_imports = true
    owner_naming_convention = false
    dag_tags_present = false
    task_failure_callback = false

    [output]
    show_warnings = true
    show_summary = false
    quiet_success = true
    format = "text"

    [paths]
    default_dag_folder = "dags"

The CLI auto-detects .dag-validator.toml, dag-validator.toml, or .airflow-dag-validator.toml. You can also pass --config path/to/config.toml.
quiet_success = true keeps pre-commit output readable when pre-commit splits many DAG files into batches: successful
batches stay silent, while failed batches still print errors.
CLI flags override config where applicable:

    dag-validator --disable-check task_failure_callback dags/example.py
    dag-validator --enable-check task_failure_callback --fail-on-warnings dags/example.py
    dag-validator --checks core --dag-folder dags
    dag-validator --list-checks

Every active config value can also be set from the CLI:

| TOML key | CLI flag |
|---|---|
| `validation.default_severity = "error"` | `--default-severity error` |
| `checks.<check_name> = true` | `--enable-check <check_name>` |
| `checks.<check_name> = false` | `--disable-check <check_name>` |
| `check_settings.task_failure_callback.strict = true` | `--task-failure-callback-strict` |
| `check_settings.task_failure_callback.strict = false` | `--no-task-failure-callback-strict` |
| `check_settings.task_failure_callback.exempt_operators = [...]` | repeat `--task-failure-callback-exempt-operator <operator>` |
| `check_settings.owner_present.valid_owners = [...]` | repeat `--valid-owner <regex>` |
| `check_settings.owner_naming_convention.pattern = "..."` | `--owner-naming-pattern <regex>` |
| `custom_checks.modules = [...]` | repeat `--custom-check-module <module>` |
| `output.format = "json"` | `--output-format json` or `--json` |
| `output.format = "text"` | `--output-format text` or `--no-json` |
| `output.show_warnings = true` | `--show-warnings` |
| `output.show_warnings = false` | `--no-show-warnings` |
| `output.show_summary = true` | `--summary` |
| `output.show_summary = false` | `--no-summary` |
| `output.quiet_success = true` | `--quiet-success` |
| `output.quiet_success = false` | `--no-quiet-success` |
| `paths.default_dag_folder = "dags"` | `--dag-folder dags` |
| `paths.exclude_patterns = [...]` | repeat `--exclude-pattern <glob>` |
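As a rough illustration of how paired flags can override TOML values, the sketch below merges a parsed config with a handful of the flags listed above. It is an assumption about one possible design, not the tool's implementation: `build_parser` and `merge` are hypothetical helpers, and only four flags are modeled. A default of `None` is what distinguishes "flag not given" from an explicit `--no-...`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser covering a small subset of the flags in the table (sketch only)."""
    parser = argparse.ArgumentParser(prog="dag-validator-sketch")
    # BooleanOptionalAction (Python 3.9+) also creates the --no-... variant.
    parser.add_argument("--show-warnings", action=argparse.BooleanOptionalAction, default=None)
    parser.add_argument("--quiet-success", action=argparse.BooleanOptionalAction, default=None)
    # Repeatable flags collect every occurrence into a list.
    parser.add_argument("--enable-check", action="append", default=[])
    parser.add_argument("--disable-check", action="append", default=[])
    return parser


def merge(config: dict, argv: list[str]) -> dict:
    """Return config values with any CLI-provided values layered on top."""
    args = build_parser().parse_args(argv)
    merged = {
        "checks": dict(config.get("checks", {})),
        "output": dict(config.get("output", {})),
    }
    for key in ("show_warnings", "quiet_success"):
        value = getattr(args, key)
        if value is not None:  # None means the flag was not passed
            merged["output"][key] = value
    for name in args.enable_check:
        merged["checks"][name] = True
    for name in args.disable_check:
        merged["checks"][name] = False
    return merged


config = {"checks": {"task_failure_callback": False}, "output": {"show_warnings": True}}
merged = merge(config, ["--enable-check", "task_failure_callback", "--no-show-warnings"])
```

Here `merged["checks"]["task_failure_callback"]` becomes `True` and `merged["output"]["show_warnings"]` becomes `False`, while keys the CLI did not mention keep their config values.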
Custom checks can live in the consuming repo. Add module names in .dag-validator.toml:

    [custom_checks]
    modules = ["dag_validation_checks"]

or pass them on the CLI:

    dag-validator --custom-check-module dag_validation_checks --list-checks

Then create dag_validation_checks.py in the consuming repo:

    import ast
    from pathlib import Path
    from typing import Any

    from dag_validator.checks import BaseCheck


    class DagIdPrefixCheck(BaseCheck):
        @property
        def name(self) -> str:
            return "dag_id_prefix"

        @property
        def description(self) -> str:
            return "Ensures DAG IDs start with dp_"

        @property
        def severity(self) -> str:
            return "error"

        def check(self, dag_info: dict[str, Any], tree: ast.AST, file_path: Path) -> list[str]:
            dag_id = dag_info.get("dag_id")
            if dag_info.get("has_dag_object") and dag_id and not dag_id.startswith("dp_"):
                return [f"DAG in {file_path} has dag_id '{dag_id}' without dp_ prefix"]
            return []


    CHECKS = [DagIdPrefixCheck()]

Custom checks use the same enable/disable controls as built-in checks:

    [checks]
    dag_id_prefix = true

    dag-validator --enable-check dag_id_prefix
    dag-validator --disable-check dag_id_prefix

Core checks:
- `dag_id_present`: every DAG has a DAG ID.
- `owner_present`: every DAG has `owner` in `default_args`.
- `schedule_present`: every DAG has `schedule` or `schedule_interval`.
- `start_date_present`: every DAG has `start_date`.

Quality checks:

- `airflow_imports`: warns when a DAG file appears to miss Airflow imports.
- `owner_naming_convention`: warns when owner does not match the configured naming pattern.
- `dag_tags_present`: warns when a DAG does not define `tags`.

Alerting checks:

- `task_failure_callback`: warns when non-empty operator tasks miss `on_failure_callback`, unless the DAG or `default_args` already defines it.
To run the test suite and list the registered checks during development:

    uv --cache-dir .uv-cache run pytest tests -q
    uv --cache-dir .uv-cache run dag-validator --list-checks

The installed console command is:

    dag-validator