GSoC 2026 Module B — Week 3: Stage 2 LLM relevance classifier by manshusainishab · Pull Request #947 · OWASP/OpenCRE

manshusainishab · 2026-06-25T16:06:45Z

Summary

Adds Stage 2 of Module B (Noise/Relevance Filter): an LLM classifier that labels
each content chunk as KNOWLEDGE, NOISE, or UNCERTAIN under the
recall-first rule. Builds on the Week 1 schemas and Week 2 regex/sanitize stages.

This PR is self-contained (classifier + prompt + config + tests). Pipeline
wiring, the queue/DB model, and the CLI entry point come in later weeks.

What's added

- `application/utils/noise_filter/config_loader.py`: loads Module B settings from `CRE_NOISE_FILTER_*` environment variables into a typed `NoiseFilterConfig` (model, batch size, per-chunk char cap, confidence threshold), with defaults.
- `application/utils/noise_filter/prompts.py`: the recall-first system prompt and a few-shot block (5 KNOWLEDGE / 3 NOISE / 2 UNCERTAIN worked examples), plus a helper that renders a numbered batch of chunks into the user prompt.
- `application/utils/noise_filter/llm_classifier.py`: `LLMClassifier`, which classifies a list of `ChangeRecord`s and returns one `ClassifyResult` per record.
- `application/tests/noise_filter/llm_classifier_test.py`: 14 unit tests, fully mocked (no network calls).
- `.env.example`: documents the four new `CRE_NOISE_FILTER_*` variables.
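To make the batch-rendering helper in `prompts.py` more concrete, here is a minimal sketch of how a numbered batch might be rendered so that verdicts can be matched back by index. The names `Chunk` and `render_batch`, the zero-based numbering, and the exact layout are assumptions for illustration, not the code in this PR.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for the per-record fields sent to the model."""

    heading_path: Sequence[str]
    text: str


def render_batch(chunks: Sequence[Chunk], max_chars: int = 1500) -> str:
    """Render chunks as a numbered block; zero-based indices are an assumption."""
    parts: List[str] = []
    for index, chunk in enumerate(chunks):
        heading = " > ".join(chunk.heading_path) or "(no heading)"
        body = chunk.text[:max_chars]  # apply the per-chunk character cap
        parts.append(f"[{index}] heading: {heading}\n{body}")
    return "\n\n".join(parts)
```

Numbering each chunk explicitly lets the response refer to chunks by index rather than by echoing their text, which keeps the reply small and the mapping unambiguous.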

How the classifier works

- Sends each chunk's `heading_path` + `text` to a dedicated lightweight model via LiteLLM (default `gemini/gemini-2.5-flash-lite`).
- Processes records in batches (`CRE_NOISE_FILTER_BATCH_SIZE`, default 10), one request per batch, and maps results back to input order by index.
- Requests a strict JSON-schema response; if the provider doesn't support strict mode, falls back to JSON-object mode.
- Truncates each chunk to `CRE_NOISE_FILTER_MAX_CHARS` (default 1500) before sending.
- Retries on rate-limit/quota errors using the existing `CRE_LLM_MAX_RETRIES` / `CRE_LLM_RETRY_SLEEP_SECONDS` settings.
- Returns UNCERTAIN (confidence 0.0) for any unparseable, malformed, or invalid output, and marks a whole batch UNCERTAIN if the LLM call fails, so a bad response never aborts a run.
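The batching, index mapping, truncation, and UNCERTAIN fallback described above can be sketched roughly as follows. This is not the PR's `LLMClassifier`: the `Verdict` type, the `{"results": [...]}` response shape, and the `call_llm` callable are assumptions chosen to keep the example self-contained.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Sequence

VALID_LABELS = {"KNOWLEDGE", "NOISE", "UNCERTAIN"}


@dataclass(frozen=True)
class Verdict:
    """Hypothetical result type; the PR's ClassifyResult may differ."""

    label: str
    confidence: float


UNCERTAIN = Verdict("UNCERTAIN", 0.0)


def parse_batch(raw: str, size: int) -> List[Verdict]:
    """Map a JSON reply back to input order; anything invalid stays UNCERTAIN."""
    results = [UNCERTAIN] * size
    try:
        items = json.loads(raw)["results"]  # assumed response shape
    except (ValueError, KeyError, TypeError):
        return results
    if not isinstance(items, list):
        return results
    for item in items:
        if not isinstance(item, dict):
            continue
        index = item.get("index")
        label = item.get("label")
        confidence = item.get("confidence")
        if isinstance(index, bool) or not isinstance(index, int):
            continue
        if not 0 <= index < size:
            continue
        if not isinstance(label, str) or label not in VALID_LABELS:
            continue
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            continue
        if not 0.0 <= confidence <= 1.0:
            continue
        results[index] = Verdict(label, float(confidence))
    return results


def classify(
    texts: Sequence[str],
    call_llm: Callable[[List[str]], str],
    batch_size: int = 10,
    max_chars: int = 1500,
) -> List[Verdict]:
    """Classify batch by batch; a failed call marks the whole batch UNCERTAIN."""
    verdicts: List[Verdict] = []
    for start in range(0, len(texts), batch_size):
        batch = [text[:max_chars] for text in texts[start : start + batch_size]]
        try:
            raw = call_llm(batch)
        except Exception:  # deliberately broad in this sketch: never abort the run
            raw = ""
        verdicts.extend(parse_batch(raw, len(batch)))
    return verdicts
```

Pre-filling the result list with UNCERTAIN means a missing, duplicated, or out-of-range index can never shift later verdicts onto the wrong record.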

Configuration

| Variable | Default | Purpose |
| --- | --- | --- |
| `CRE_NOISE_FILTER_LLM_MODEL` | `gemini/gemini-2.5-flash-lite` | Classification model (LiteLLM string) |
| `CRE_NOISE_FILTER_BATCH_SIZE` | `10` | Chunks per LLM request |
| `CRE_NOISE_FILTER_MAX_CHARS` | `1500` | Per-chunk character cap before sending |
| `CRE_NOISE_FILTER_CONFIDENCE_THRESHOLD` | `0.8` | Minimum confidence to enqueue a KNOWLEDGE verdict |

The model is Gemini, so it authenticates with the existing GEMINI_API_KEY;
no new credential is required.
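For readers who want to see how these variables might map onto a typed config, here is a minimal sketch. The class name `NoiseFilterSettings`, the function `load_settings`, and the rule that empty values fall back to the default are assumptions; the actual `config_loader.py` may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class NoiseFilterSettings:
    """Illustrative stand-in for the PR's NoiseFilterConfig."""

    model: str = "gemini/gemini-2.5-flash-lite"
    batch_size: int = 10
    max_chars: int = 1500
    confidence_threshold: float = 0.8


def load_settings(env: Optional[Mapping[str, str]] = None) -> NoiseFilterSettings:
    """Read CRE_NOISE_FILTER_* variables, using the documented defaults if unset."""
    env = os.environ if env is None else env
    defaults = NoiseFilterSettings()

    def get(name, cast, default):
        # Treating an empty value as "unset" is an assumption of this sketch.
        raw = env.get(f"CRE_NOISE_FILTER_{name}", "").strip()
        return cast(raw) if raw else default

    return NoiseFilterSettings(
        model=get("LLM_MODEL", str, defaults.model),
        batch_size=get("BATCH_SIZE", int, defaults.batch_size),
        max_chars=get("MAX_CHARS", int, defaults.max_chars),
        confidence_threshold=get(
            "CONFIDENCE_THRESHOLD", float, defaults.confidence_threshold
        ),
    )
```

Passing the environment in as a mapping keeps the loader easy to test without patching `os.environ`.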

Testing

- 14 new unit tests covering prompt content, batch ordering and splitting, malformed/invalid/empty output handling, the JSON-schema fallback, rate-limit retry and exhaustion, and truncation.
- Full suite: 369 passing, 1 skipped, 0 failures.
- `black --check` clean across the repo.
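As an illustration of what a fully mocked retry test can look like, the sketch below pairs a small retry helper with two `unittest.mock` tests, one for eventual success and one for exhaustion. `RateLimited` and `call_with_retry` are hypothetical names, and treating the retry count as "additional attempts after the first" is an assumption; the PR's own helper and tests may differ.

```python
import time
from unittest import mock


class RateLimited(Exception):
    """Hypothetical stand-in for a provider's rate-limit or quota error."""


def call_with_retry(fn, max_retries=3, sleep_seconds=0.0, sleep=None):
    """Call fn(); on RateLimited, retry up to max_retries more times, then re-raise."""
    sleep = sleep or time.sleep
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise
            sleep(sleep_seconds)


def test_retry_then_success():
    # Two rate-limit errors, then a successful reply: no network involved.
    fn = mock.Mock(side_effect=[RateLimited(), RateLimited(), "ok"])
    sleeper = mock.Mock()
    result = call_with_retry(fn, max_retries=3, sleep_seconds=1.0, sleep=sleeper)
    assert result == "ok"
    assert fn.call_count == 3
    assert sleeper.call_count == 2


def test_retry_exhaustion():
    # Every call is rate limited, so the helper gives up after 1 + 2 attempts.
    fn = mock.Mock(side_effect=RateLimited())
    try:
        call_with_retry(fn, max_retries=2, sleep=mock.Mock())
    except RateLimited:
        pass
    else:
        raise AssertionError("expected RateLimited")
    assert fn.call_count == 3
```

Injecting the sleep function keeps the tests fast and lets them assert how many back-off pauses occurred.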

coderabbitai · 2026-06-25T16:07:01Z


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: ae4eea5a-e93b-4300-b537-4c11d1beb121

📥 Commits

Reviewing files that changed from the base of the PR and between c30a29c and aba6ce6.

📒 Files selected for processing (2)

application/tests/noise_filter/llm_classifier_test.py
application/utils/noise_filter/llm_classifier.py

Walkthrough

Adds Module B noise-filter configuration, prompt construction, and an LLM classifier that batches ChangeRecord inputs, retries rate-limited calls, truncates long text, and parses ordered verdicts. Includes unit tests for prompt content, batching, malformed output, fallback behavior, retries, and truncation.

Changes

Noise Filter Module B

| Layer / File(s) | Summary |
| --- | --- |
| Configuration surface: `.env.example`, `application/utils/noise_filter/config_loader.py` | Module B env defaults, typed config loading, and the new settings block are added together. |
| Prompt contract: `application/utils/noise_filter/prompts.py` | Defines the base system prompt, few-shot examples, JSON renderers, and user prompt builder for the classifier. |
| Batch classification flow: `application/utils/noise_filter/llm_classifier.py` | Adds batched LLM calls, strict-schema fallback, rate-limit retries, truncation, and result parsing into ordered verdicts. |
| Classifier test coverage: `application/tests/noise_filter/llm_classifier_test.py`, `application/tests/noise_filter/config_loader_test.py` | Adds tests for prompt text, ordering, batching, malformed output, response-format fallback, retries, truncation, and config loading/validation. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

Pa04rth

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.78%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: a Stage 2 LLM relevance classifier for Module B. |
| Description check | ✅ Passed | The description is detailed and directly matches the changeset, covering the classifier, prompts, config, tests, and env docs. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@application/utils/noise_filter/config_loader.py`:
- Around line 29-36: NoiseFilterConfig currently allows invalid values to be
constructed directly, so add invariant checks inside its dataclass
initialization path. Update NoiseFilterConfig to validate batch_size >= 1,
max_chars >= 1, and confidence_threshold between 0.0 and 1.0 in __post_init__,
so every construction route fails fast before llm_classifier uses these fields.
Keep the checks centralized in NoiseFilterConfig so direct callers and
load_config() both get the same validation behavior.
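A minimal sketch of the suggested `__post_init__` checks might look like the following. The field names `batch_size`, `max_chars`, and `confidence_threshold` come from the comment above; the `model` field name, the defaults shown, and the error messages are assumptions.

```python
from dataclasses import dataclass


@dataclass
class NoiseFilterConfig:
    """Sketch only: the real dataclass in config_loader.py may differ."""

    model: str = "gemini/gemini-2.5-flash-lite"
    batch_size: int = 10
    max_chars: int = 1500
    confidence_threshold: float = 0.8

    def __post_init__(self) -> None:
        # Runs on every construction path, including direct instantiation.
        if self.batch_size < 1:
            raise ValueError(f"batch_size must be >= 1, got {self.batch_size}")
        if self.max_chars < 1:
            raise ValueError(f"max_chars must be >= 1, got {self.max_chars}")
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError(
                "confidence_threshold must be between 0.0 and 1.0, "
                f"got {self.confidence_threshold}"
            )
```

Because the checks live in the dataclass itself, a loader function and any direct caller fail in the same way and at the same point.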

In `@application/utils/noise_filter/llm_classifier.py`:
- Around line 151-163: The fallback in llm_classifier.py is too broad: the
try/except around self._completion_with_retry in the strict_schema path retries
on every exception, even non-capability failures. Update the logic in the
classifier method that builds messages and calls _completion_with_retry so only
strict-schema unsupported-capability errors (for example the provider’s
BadRequestError for strict schema) trigger the json_object retry, and let all
other exceptions propagate without a second attempt. Keep the warning/logging
specific to the capability fallback path so the retry only happens when strict
schema is genuinely unsupported.
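The narrowed fallback could be sketched along these lines. `StrictSchemaUnsupported` is a hypothetical exception standing in for whatever capability error the provider raises, `complete` is an injected callable rather than the PR's `_completion_with_retry`, and the `response_format` dictionaries follow the OpenAI-style shape, which is an assumption about what the classifier sends.

```python
import logging

logger = logging.getLogger(__name__)


class StrictSchemaUnsupported(Exception):
    """Hypothetical error: the provider rejected strict JSON-schema mode."""


def complete_with_fallback(complete, messages, schema):
    """Try strict schema first; retry in JSON-object mode only if unsupported."""
    strict_format = {
        "type": "json_schema",
        "json_schema": {"name": "verdicts", "schema": schema, "strict": True},
    }
    try:
        return complete(messages, response_format=strict_format)
    except StrictSchemaUnsupported:
        # Only the capability error triggers a second attempt; network,
        # auth, and other failures propagate to the caller unchanged.
        logger.warning("Strict JSON schema unsupported; retrying with json_object")
        return complete(messages, response_format={"type": "json_object"})
```

Catching a single, specific exception type keeps the second request from masking unrelated failures or doubling the cost of a genuine outage.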

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 6ed01a44-6a87-42ec-9e1b-c7220c2dfc79

📥 Commits

Reviewing files that changed from the base of the PR and between 4485936 and 02d4bd4.

📒 Files selected for processing (5)

.env.example
application/tests/noise_filter/llm_classifier_test.py
application/utils/noise_filter/config_loader.py
application/utils/noise_filter/llm_classifier.py
application/utils/noise_filter/prompts.py

manshusainishab added 3 commits June 24, 2026 18:50

chore(module-b): add Stage 2 config loader + env vars

d91f0b0

feat(module-b): add Stage 2 recall-first prompt + few-shots

b80c6a1

feat(module-b): add Stage 2 LLM classifier + tests

02d4bd4

coderabbitai Bot reviewed Jun 25, 2026

Comment thread application/utils/noise_filter/config_loader.py

Comment thread application/utils/noise_filter/llm_classifier.py

manshusainishab added 2 commits June 25, 2026 21:55

fix(module-b): validate NoiseFilterConfig invariants in __post_init__

c30a29c

fix(module-b): fall back to json_object only when strict schema unsup…

aba6ce6

…ported

manshusainishab wants to merge 5 commits into OWASP:main from manshusainishab:module_b_w3
