Badgerdoc is a human-in-the-loop tool designed for working with documents that have been analyzed by AI. It provides a platform for users to review, validate, and interact with the output of various AI tools, including OCR, table and chart extractions, and more.
- Clone the repository:
git clone <repository-url>
cd badgerdoc-2
- Configure environment variables:
cp .env_example .env
- Start all services:
make build_all
docker compose up --build
- Access the application:
- Web application: http://localhost:80/
- Temporal UI: http://localhost:8080/
- MinIO UI: http://localhost:9001/
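Once the containers are up, it can help to confirm that each endpoint answers before moving on. The following is a minimal sketch using only the Python standard library; the URLs come from the list above, while the helper names `is_reachable` and `check_services` are hypothetical and not part of Badgerdoc.

```python
import http.client
import urllib.error
import urllib.request

# URLs taken from the list above.
SERVICES = {
    "Web application": "http://localhost:80/",
    "Temporal UI": "http://localhost:8080/",
    "MinIO UI": "http://localhost:9001/",
}


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 403 or 404), so it is up.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, DNS failure, timeout, or a malformed response.
        return False


def check_services(services=SERVICES, probe=is_reachable):
    """Map each service name to True/False using the given probe function."""
    return {name: probe(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'up' if ok else 'not reachable'}")
```

A service that reports "not reachable" right after startup may simply still be initializing, so rerunning the check after a few seconds is reasonable.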
After the first run:
- Create a superuser:
docker compose exec web uv run python manage.py createsuperuser
- Generate a token for the superuser (the command below assumes the username is admin):
docker compose exec web uv run python manage.py drf_create_token admin
- Put the token in the `.env` file:
BADGERDOC_TOKEN=<token>
- Navigate to http://localhost:9001/, log in with `minioadmin`, and create a bucket named `badgerdoc` to enable document uploads.
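To illustrate how the token stored in `.env` might be used by a client script, here is a small sketch. It assumes the API relies on Django REST Framework's default `TokenAuthentication`, which expects an `Authorization: Token <key>` header; that is an inference from the `drf_create_token` command, not something this README states. The parser is deliberately simplified and does not cover every `.env` syntax, and both function names are hypothetical.

```python
from pathlib import Path


def read_env_value(env_text, key):
    """Return the value assigned to key in .env-style text, or None if absent."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop optional surrounding quotes.
            return value.strip().strip("'\"")
    return None


def token_auth_header(token):
    """Build the header used by DRF's default TokenAuthentication (assumed here)."""
    return {"Authorization": f"Token {token}"}


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    token = read_env_value(text, "BADGERDOC_TOKEN")
    if token:
        print("Token found; header keys:", list(token_auth_header(token)))
    else:
        print("BADGERDOC_TOKEN is not set in .env")
```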
MLX (Apple's machine learning framework for Apple silicon) is available on macOS for running VLM (Vision Language Model) inference locally. This project uses MLX-VLM to run OCR models such as DeepSeek-OCR-2 and PaddleOCR-VL.
Note: MinIO runs inside Docker and is referenced by the hostname `minio` in pre-signed URLs returned by the API. When using MLX locally, the host machine must be able to resolve that hostname. Add the following entry to `/etc/hosts`:
127.0.0.1 minio
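If pre-signed URLs fail to load on the host, a quick check of the hosts file can rule out a missing entry. The sketch below parses hosts-file text and reports whether `minio` is mapped to `127.0.0.1`; the function name `hosts_maps` is hypothetical, and the `/etc/hosts` path applies to macOS and Linux.

```python
def hosts_maps(hosts_text, hostname, address="127.0.0.1"):
    """Return True if hosts-file text maps hostname to address."""
    for raw_line in hosts_text.splitlines():
        # Everything after '#' is a comment in a hosts file.
        fields = raw_line.split("#", 1)[0].split()
        # First field is the address; the remaining fields are host names.
        if len(fields) >= 2 and fields[0] == address and hostname in fields[1:]:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/etc/hosts", encoding="utf-8") as fh:
            found = hosts_maps(fh.read(), "minio")
    except OSError:
        found = False
    print("minio entry present" if found else "minio entry missing from /etc/hosts")
```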
Install the MLX dependency group using uv:
uv sync --group mlx
Or install it along with dev dependencies:
uv sync --group dev --group mlx
After installation, start the MLX VLM servers using:
make start_mlx
This will start two VLM servers:
- Port 11434: DeepSeek-OCR-2-bf16
- Port 11435: PaddleOCR-VL-1.5-bf16
Stop the servers using Ctrl+C.
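To verify that both servers came up, one option is to probe the two ports listed above. This is a minimal sketch that assumes the servers listen on `127.0.0.1` (the README does not state the bind address); `port_is_open` is a hypothetical helper, and a successful TCP connection only shows that something is listening, not that the model has finished loading.

```python
import socket

# Ports and model names as listed above.
MLX_SERVERS = {11434: "DeepSeek-OCR-2-bf16", 11435: "PaddleOCR-VL-1.5-bf16"}


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port, model in MLX_SERVERS.items():
        state = "listening" if port_is_open(port) else "not listening"
        print(f"{model} (port {port}): {state}")
```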
Setting up Badgerdoc locally (see How to install Badgerdoc above) is a prerequisite for contributing. Once the application is running, follow the contribution guidelines in How to Contribute.
- Carefully read the How to Contribute documentation.
- Create a fork of the Badgerdoc GitHub repository.
- Make all your changes in your own fork.
- Create a Pull Request to the Badgerdoc repository targeting the `main` branch.
- Squash your changes into a single commit before submitting; PRs must contain exactly 1 commit.
- One of the core developers will review the PR, approve it, and merge it.