
seo: notify IndexNow of new/changed pages on deploy, with a persisted sitemap snapshot#10

Merged: willwashburn merged 3 commits into main from claude/confident-cerf-9snwja on Jun 27, 2026

Conversation

@willwashburn willwashburn commented Jun 25, 2026

Member

What

After each production deploy, ping IndexNow (Bing, Yandex, Seznam, Naver, DuckDuckGo — not Google) with the pages that are new or changed in that deploy. Certainty comes from a committed snapshot of the last deploy's published URL set, so we never re-announce the whole site.

How it works

web/indexnow-state.json is a committed snapshot of every URL live as of the last deploy. On each production deploy the script (web/scripts/indexnow-submit.mjs):

  1. Reads the authoritative current set from the freshly-deployed sitemap.xml.
  2. New pages = current − snapshot → certain, no inference.
  3. Edited pages = URLs from this deploy's git range that already existed (content edits don't change the URL set, so the snapshot diff can't see them — the git diff fills that gap).
  4. Submits new ∪ changed, then commits the refreshed snapshot back to main with [skip ci] so the next deploy has a durable baseline.

Everything submitted is intersected with the live sitemap, so a 404 / dynamic / unpublished route can never be pinged.
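
A minimal sketch of the set arithmetic described above, assuming the URL sets are held as JavaScript Sets. The function name computeDelta, its parameters, and the example.com URLs are hypothetical and are not taken from indexnow-submit.mjs.

```javascript
// Hypothetical helper: names and shapes are illustrative only.
function computeDelta(current, snapshot, editedCandidates) {
  // New pages: in the live sitemap but absent from the last snapshot.
  const added = [...current].filter((url) => !snapshot.has(url));
  // Edited pages: touched in the git range and already known to the snapshot.
  const changed = [...editedCandidates].filter((url) => snapshot.has(url));
  // Removed pages are only reported, never submitted.
  const removed = [...snapshot].filter((url) => !current.has(url));
  // Intersect with the live sitemap so unpublished or 404 routes are dropped.
  const submit = [...new Set([...added, ...changed])].filter((url) => current.has(url));
  return { submit, removed };
}

const current = new Set([
  'https://example.com/',
  'https://example.com/docs',
  'https://example.com/blog/new',
]);
const snapshot = new Set([
  'https://example.com/',
  'https://example.com/docs',
  'https://example.com/old',
]);
const edited = new Set(['https://example.com/docs', 'https://example.com/draft']);

const delta = computeDelta(current, snapshot, edited);
console.log(delta.submit);  // ['https://example.com/blog/new', 'https://example.com/docs']
console.log(delta.removed); // ['https://example.com/old']
```

With an empty snapshot the same function returns every current URL in submit, which corresponds to the bootstrap behavior noted below.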

Behavior notes

  • First deploy bootstraps: the seed snapshot is empty, so the first run announces all current URLs once (well under IndexNow's 10k/request cap), then deltas only.
  • Best-effort: both the submit and the snapshot commit-back steps are continue-on-error — they can never fail an already-successful production deploy.
  • Removed URLs are logged, not auto-submitted (avoids pinging transiently-missing pages).
  • Google is unaffected — it relies on the existing sitemap.xml/robots, which are untouched.

Files

  • web/scripts/indexnow-submit.mjs — delta computation + submission.
  • web/indexnow-state.json — committed snapshot (seeded empty).
  • .github/workflows/deploy.yml: adds contents: write, the submit step, and the commit-back step.

Validation

Tested locally against a fake sitemap across five scenarios: bootstrap, steady-state (no-op), new page, removed page (logged only), and edited existing page.

Caveats to review

  1. The commit-back pushes to main from CI (contents: write + [skip ci]). If main is a protected branch requiring PRs/reviews, that push is rejected — the step no-ops and the snapshot just isn't persisted. If main is protected, we should switch persistence to an Actions cache or update the snapshot via PR instead.
  2. The IndexNow key is inline in deploy.yml (public by design — served at /<key>.txt). Set an INDEXNOW_KEY repo variable to override without a code change.

🤖 Generated with Claude Code


Generated by Claude Code


Summary by cubic

Notify IndexNow on every production deploy with only the URLs that are new or changed, using the live sitemap plus a persisted snapshot to avoid re-announcing the whole site. Adds batching and safer workflow settings; improves error handling and URL matching.

  • New Features

    • Adds web/scripts/indexnow-submit.mjs to compute new pages (sitemap vs. snapshot) and changed pages (git range), intersect with the live sitemap.xml, and submit only that set to IndexNow.
    • Persists the refreshed snapshot in web/indexnow-state.json via a best-effort commit to main with [skip ci], writing only after all batches are accepted.
    • Bootstraps once (empty snapshot submits all), then sends deltas; batches at 10k URLs; logs removed URLs; exits cleanly if INDEXNOW_KEY is unset; builds URLs to match the sitemap exactly (incl. homepage slash).
    • Updates workflow with contents: write, fetch-depth: 0, persist-credentials: false, a post-deploy submit step, a guarded commit-back to main using a scoped token, and adds the public key file at web/public/d15f21b935684761ad607fb06b70b3d5.txt.
  • Migration

    • If main is protected, the commit-back will be blocked; use an Actions cache or update the snapshot via PR instead.
    • The IndexNow key is fixed to the committed public file; do not override via repo vars. To change it, update both web/public/<key>.txt and the inline key together.

Written for commit de26c2d. Summary will update on new commits.


claude added 2 commits June 25, 2026 12:41
Add IndexNow (Bing/Yandex/Seznam/Naver/DuckDuckGo) integration so newly
deployed/updated pages get recrawled promptly. Not used by Google, so this
complements rather than replaces sitemap.xml.

- web/public/<key>.txt: IndexNow key file, served as a static asset.
- web/scripts/indexnow-submit.mjs: derives the URLs changed in a deploy from
  the commit-range diff, intersects them with the live sitemap (so we never
  submit 404s, dynamic, or unpublished routes), and POSTs only that delta —
  per IndexNow guidance, never the whole site.
- deploy.yml: post-deploy step (production only, continue-on-error) with
  fetch-depth: 0 so the diff range is available.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kj1T1KijwDqFgXjRjiHuCN

Replace the live-sitemap-only heuristic with a committed snapshot of the last
deploy's published URL set (web/indexnow-state.json). Each deploy now diffs the
freshly deployed sitemap against that snapshot for new pages (certain) and the
git range for content edits, submits only that union, then commits the refreshed
snapshot back to main with [skip ci] so the next deploy has a durable baseline.

- Bootstraps once (empty snapshot -> announce all current URLs), deltas after.
- Logs URLs dropped from the sitemap rather than auto-submitting deletions.
- Caps submissions at IndexNow's 10k per-request limit.
- deploy.yml: contents: write + commit-back step.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kj1T1KijwDqFgXjRjiHuCN
coderabbitai Bot commented Jun 25, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: f3a4b115-0ba5-42c2-b879-ae909cfb21a2

📥 Commits

Reviewing files that changed from the base of the PR and between c2363e6 and de26c2d.

📒 Files selected for processing (2)
  • .github/workflows/deploy.yml
  • web/scripts/indexnow-submit.mjs

📝 Walkthrough

The deploy workflow now runs an IndexNow submission script after Cloudflare deployment, then conditionally commits an updated web/indexnow-state.json back to main. The PR also adds the script, an initial state file, and a public verification file.

Changes

IndexNow deployment and submission flow

| Layer / File(s) | Summary |
| --- | --- |
| Script setup and support assets: web/scripts/indexnow-submit.mjs, web/indexnow-state.json, web/public/d15f21b935684761ad607fb06b70b3d5.txt | The script defines env-backed settings, parses git SHAs, maps changed repo paths to public routes, and adds the initial snapshot and public verification file. |
| URL selection and submit: web/scripts/indexnow-submit.mjs | The script compares the live sitemap with the stored snapshot, derives added and changed URLs from git diff output, logs removals, handles DRY_RUN, posts the IndexNow payload, and rewrites the snapshot after successful submission. |
| Deploy workflow wiring: .github/workflows/deploy.yml | The workflow grants write access, fetches full history, runs the submit script after deployment, and conditionally commits and pushes web/indexnow-state.json back to main. |

Sequence Diagram(s)

sequenceDiagram
  participant DeployWorkflow as Deploy workflow
  participant SubmitScript as web/scripts/indexnow-submit.mjs
  participant IndexNowAPI as api.indexnow.org/indexnow

  DeployWorkflow->>SubmitScript: run with before and after SHAs
  SubmitScript->>IndexNowAPI: POST URL batches
  IndexNowAPI-->>SubmitScript: 200 or 202 response
  SubmitScript->>SubmitScript: write updated indexnow-state.json

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I hopped through pages, swift and free,
Sent IndexNow a sitemap spree.
I saved my burrows, neat and bright,
Then nibbled carrots through the night. 🐇
Hop! The deploy is all set right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: IndexNow notifications for new or changed pages with a persisted snapshot. |
| Description check | ✅ Passed | The description is directly aligned with the PR and explains the IndexNow deploy flow and snapshot behavior. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an IndexNow integration to submit new and updated URLs to search engines upon deployment. It includes a state file (indexnow-state.json), a verification key file, and a script (indexnow-submit.mjs) that diffs the current sitemap against the previous snapshot and git history to identify modified pages. The review feedback highlights several key improvement opportunities: handling missing API keys gracefully with a zero exit code to avoid breaking CI on forks, wrapping network fetch requests in try...catch blocks to prevent unhandled promise rejections, and normalizing URLs to prevent mismatches caused by trailing slashes.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread web/scripts/indexnow-submit.mjs Outdated
process.exit(1);
}

if (!KEY) fail('INDEXNOW_KEY is not set — skipping. (set it as a repo variable)');


medium

If INDEXNOW_KEY is not set (for example, on fork PRs or non-production environments where secrets are not exposed), exiting with code 1 will cause the GitHub Actions step to show as failed (red), even if continue-on-error is set. Since this is an expected scenario for non-production runs, it is cleaner to log the message and exit with code 0 to keep the CI status green.

if (!KEY) {
  console.log('indexnow: INDEXNOW_KEY is not set — skipping. (set it as a repo variable)');
  process.exit(0);
}

Comment on lines +91 to +103
async function fetchSitemapUrls() {
  const res = await fetch(`${SITE_URL}/sitemap.xml`, {
    headers: { 'user-agent': 'agentrelay-indexnow/1.0' },
  });
  if (!res.ok) fail(`could not fetch sitemap.xml (${res.status})`);
  const xml = await res.text();
  const urls = new Set();
  for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
    urls.add(match[1].trim());
  }
  if (urls.size === 0) fail('sitemap.xml contained no <loc> entries');
  return urls;
}


medium

If the network request to fetch the sitemap fails (e.g., due to DNS issues or temporary network failure), fetch will throw an error, causing an unhandled promise rejection and a noisy stack trace. Wrapping the fetch call in a try...catch block allows us to handle network errors gracefully and print a clean error message.

async function fetchSitemapUrls() {
  try {
    const res = await fetch(SITE_URL + '/sitemap.xml', {
      headers: { 'user-agent': 'agentrelay-indexnow/1.0' },
    });
    if (!res.ok) fail('could not fetch sitemap.xml (' + res.status + ')');
    const xml = await res.text();
    const urls = new Set();
    for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
      urls.add(match[1].trim());
    }
    if (urls.size === 0) fail('sitemap.xml contained no <loc> entries');
    return urls;
  } catch (err) {
    fail('failed to fetch sitemap.xml: ' + err.message);
  }
}
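
The parsing half of this function can be exercised without any network access. The sketch below reuses the regex quoted above on an inline XML string; the helper name parseSitemapLocs and the sample sitemap are hypothetical, added only for illustration.

```javascript
// Hypothetical helper: extracts <loc> values with the regex quoted in the review.
function parseSitemapLocs(xml) {
  const urls = new Set();
  for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
    urls.add(match[1]); // the capture group already excludes surrounding whitespace
  }
  return urls;
}

// Sample input: one duplicate entry and one <loc> padded with whitespace.
const sampleXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>
    https://example.com/docs
  </loc></url>
  <url><loc>https://example.com/</loc></url>
</urlset>`;

const locs = [...parseSitemapLocs(sampleXml)];
console.log(locs); // ['https://example.com/', 'https://example.com/docs']
```

Note that a regex-based reader like this returns the raw text of each entry, so XML entities such as &amp; are not decoded. That is only a concern if the sitemap contains URLs with escaped characters.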

Comment on lines +136 to +138
const changedCandidates = new Set(
  files.flatMap(pathsForFile).filter((p) => p && p !== '__AGENTS__').map((p) => `${SITE_URL}${p}`),
);


medium

To prevent silent mismatches due to trailing slashes (e.g., if SITE_URL plus the path has a trailing slash but the sitemap URL does not, or vice versa), we should normalize the URLs when comparing them. This ensures that edited pages (such as the homepage) are correctly identified as changed and submitted to IndexNow.

const changedCandidates = new Set(
  files.flatMap(pathsForFile).filter((p) => p && p !== '__AGENTS__').map((p) => {
    const url = SITE_URL + p;
    const normalized = url.replace(/\/$/, '');
    const matched = [...current].find((u) => u.replace(/\/$/, '') === normalized);
    return matched || url;
  })
);
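
A variant of the same idea, sketched under stated assumptions: index the sitemap once in a Map keyed by its slash-insensitive form, instead of scanning the whole set for every candidate. Unlike the suggestion above, this version drops candidates that have no sitemap match rather than keeping them. The names stripSlash and matchToSitemap are hypothetical.

```javascript
// Hypothetical helpers; names are illustrative.
const stripSlash = (url) => url.replace(/\/$/, '');

function matchToSitemap(candidates, sitemapUrls) {
  // Index the sitemap once by its slash-insensitive form.
  const byNormalized = new Map();
  for (const url of sitemapUrls) byNormalized.set(stripSlash(url), url);
  // Keep only candidates that exist in the sitemap, in the sitemap's own spelling.
  const matched = new Set();
  for (const candidate of candidates) {
    const hit = byNormalized.get(stripSlash(candidate));
    if (hit) matched.add(hit);
  }
  return matched;
}

const sitemapUrls = new Set(['https://example.com/', 'https://example.com/docs']);
const candidateUrls = [
  'https://example.com',        // homepage without the trailing slash
  'https://example.com/docs/',  // page with an extra trailing slash
  'https://example.com/draft',  // not published, so it is dropped
];
const matchedUrls = [...matchToSitemap(candidateUrls, sitemapUrls)];
console.log(matchedUrls); // ['https://example.com/', 'https://example.com/docs']
```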

Comment thread web/scripts/indexnow-submit.mjs Outdated
Comment on lines +175 to +179
const res = await fetch(ENDPOINT, {
  method: 'POST',
  headers: { 'content-type': 'application/json; charset=utf-8' },
  body: JSON.stringify({ host: HOST, key: KEY, keyLocation: `${SITE_URL}/${KEY}.txt`, urlList }),
});


medium

Similarly, if the POST request to the IndexNow endpoint fails due to a network error, it will throw an unhandled promise rejection. Wrapping this call in a try...catch block ensures the script fails gracefully with a clear error message.

let res;
try {
  res = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'content-type': 'application/json; charset=utf-8' },
    body: JSON.stringify({ host: HOST, key: KEY, keyLocation: SITE_URL + '/' + KEY + '.txt', urlList }),
  });
} catch (err) {
  fail('failed to post to IndexNow endpoint: ' + err.message);
}
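
One possible way to make this step easier to test is to separate payload construction from the network call. The sketch below is an assumption, not the script's actual structure: buildPayload and postBatch are hypothetical names, the key is a placeholder, and deriving host from the site URL is a guess at how HOST is defined. The payload fields (host, key, keyLocation, urlList) are the ones shown in the snippet above.

```javascript
// Hypothetical helper: builds the JSON body from a site URL, a key, and a URL list.
function buildPayload(siteUrl, key, urlList) {
  const site = new URL(siteUrl);
  return {
    host: site.host, // assumption: HOST is the site's host
    key,
    keyLocation: new URL(`/${key}.txt`, site).href,
    urlList,
  };
}

// Hypothetical helper: returns a result object instead of throwing,
// so the caller decides how to report a failure.
async function postBatch(endpoint, payload, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(endpoint, {
      method: 'POST',
      headers: { 'content-type': 'application/json; charset=utf-8' },
      body: JSON.stringify(payload),
    });
    return { ok: res.ok, status: res.status };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

const payload = buildPayload('https://example.com', 'examplekey123', ['https://example.com/docs']);
console.log(payload.keyLocation); // https://example.com/examplekey123.txt
```

Passing fetchImpl as a parameter lets a test substitute a stub that rejects, which exercises the catch branch without touching the network.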

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c2363e6a63


Comment thread .github/workflows/deploy.yml Outdated
git config user.email "github-actions[bot]@users.noreply.github.com"
git add indexnow-state.json
git commit -m "chore(seo): update IndexNow sitemap snapshot [skip ci]"
git push origin HEAD:main


P1: Avoid pushing dispatched refs to main

When this workflow is started via the existing unqualified workflow_dispatch from any non-main ref, checkout runs that selected ref and this added git push origin HEAD:main can fast-forward main with the dispatched branch's commits plus the snapshot commit, not just persist indexnow-state.json; git push -h confirms the trailing arguments are the repository/refspec, so HEAD:main targets the checked-out HEAD at main. Guard this step to github.ref == 'refs/heads/main' or push only from a fresh main checkout.


// the live sitemap below narrows it to what's actually published.
if (file === 'web/lib/agents.ts') return ['__AGENTS__'];

return [];


P2: Track component edits that change sitemap pages

For deploys that edit page content through imported components or CSS modules, this fallback drops the file, so an existing URL is never submitted because the sitemap snapshot is unchanged. For example, web/app/page.tsx renders the homepage from web/components/home/* and shared layout components, but a change to web/components/home/Hero.tsx maps to no path here even though / changed; consider conservatively submitting affected/all sitemap URLs for shared UI files or tracing imports.
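
To illustrate why a component file maps to no URL, here is a rough sketch of a path-to-route mapping. It assumes Next.js App Router conventions (web/app/**/page.tsx) and is not the real pathsForFile; the function name routeForFile and the sample paths other than those quoted above are hypothetical.

```javascript
// Hypothetical mapping from a repo file path to a public route.
// Assumes Next.js App Router layout; the real pathsForFile may differ.
function routeForFile(file) {
  const match = file.match(/^web\/app\/(?:(.+)\/)?page\.(?:tsx|mdx)$/);
  if (!match) return null; // components, CSS, and libs map to no route
  const segments = (match[1] ?? '')
    .split('/')
    // Route groups like (marketing) do not appear in the URL.
    .filter((s) => s && !/^\(.*\)$/.test(s));
  // Dynamic segments such as [slug] cannot be resolved from the path alone.
  if (segments.some((s) => s.startsWith('['))) return null;
  return '/' + segments.join('/');
}

console.log(routeForFile('web/app/page.tsx'));             // '/'
console.log(routeForFile('web/app/docs/page.tsx'));        // '/docs'
console.log(routeForFile('web/components/home/Hero.tsx')); // null
```

A mapping of this shape only sees files that are themselves pages, so an edit to Hero.tsx yields no candidate even though the homepage content changed.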


@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/deploy.yml:
- Around line 14-15: The deploy workflow currently leaves a writable
GITHUB_TOKEN available to earlier steps because checkout persists credentials by
default, which allows build or deployment scripts to reuse push access; update
the checkout setup in the deploy job to disable persisted credentials, then
provide write auth only in the dedicated commit/push step that updates the
sitemap snapshot. Use the existing deploy job and the final push logic around
the IndexNow/sitemap update path to keep the token scoped to that last step
only.
- Around line 64-73: The IndexNow workflow currently allows INDEXNOW_KEY to be
overridden, but the deployed public key file is still fixed, so validation will
break when the env var changes. Update the deploy job around Notify IndexNow of
changed pages and web/scripts/indexnow-submit.mjs usage so the key value and the
published .txt file come from the same source: either keep a single fixed key in
the workflow, or generate/copy the matching public key file during build/deploy
before submitting IndexNow.

In `@web/scripts/indexnow-submit.mjs`:
- Around line 155-157: The submission flow in indexnow-submit.mjs is persisting
URLs that were never actually sent after urlList is capped to MAX_URLS. Update
the success-state write path in the main IndexNow submission logic so only the
successfully submitted subset is recorded, not the full current sitemap; make
sure the same fix is applied wherever the truncated list is handled later in the
file. Use the existing urlList slicing and the state persistence around
current/indexnow-state.json to keep only announced URLs marked as submitted.
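
One way to address the last finding is to send the full list in batches and persist the snapshot only when every batch is accepted, so no URL is recorded without being sent. This is a sketch under stated assumptions: toBatches, submitAll, and the submitBatch callback are hypothetical names, and the callback is synchronous here for brevity where a real one would be awaited.

```javascript
const MAX_URLS = 10000; // IndexNow per-request cap cited in this PR

// Split a list into consecutive chunks of at most `size` items.
function toBatches(urls, size = MAX_URLS) {
  const batches = [];
  for (let i = 0; i < urls.length; i += size) batches.push(urls.slice(i, i + size));
  return batches;
}

// submitBatch is a hypothetical callback that returns true when a batch is accepted.
function submitAll(urls, submitBatch, size = MAX_URLS) {
  for (const batch of toBatches(urls, size)) {
    // Stop at the first rejected batch and signal that the snapshot must not be written.
    if (!submitBatch(batch)) return { persist: false };
  }
  return { persist: true };
}

const manyUrls = Array.from({ length: 25 }, (_, i) => `https://example.com/p/${i}`);
console.log(toBatches(manyUrls, 10).map((b) => b.length)); // [10, 10, 5]
console.log(submitAll(manyUrls, () => true, 10));          // { persist: true }
```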

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8ae8b1a2-fbed-4515-9990-a85ff72c07e2

📥 Commits

Reviewing files that changed from the base of the PR and between 0c1d50a and c2363e6.

📒 Files selected for processing (4)
  • .github/workflows/deploy.yml
  • web/indexnow-state.json
  • web/public/d15f21b935684761ad607fb06b70b3d5.txt
  • web/scripts/indexnow-submit.mjs

Comment thread .github/workflows/deploy.yml
Comment thread .github/workflows/deploy.yml Outdated
Comment thread web/scripts/indexnow-submit.mjs Outdated
- exit 0 (not 1) when INDEXNOW_KEY is unset, so the best-effort step stays green
- wrap sitemap fetch and IndexNow POST in try/catch for clean failure messages
- build changed-URL candidates via new URL() so they match the sitemap exactly
  (incl. the homepage trailing slash)
- batch submissions at the 10k/request limit instead of truncating, and persist
  the snapshot only after every batch is accepted (no URL marked sent-but-unsent)
- fix the IndexNow key inline to match the committed public .txt (overriding it
  would point keyLocation at an unpublished file and fail validation)
- harden deploy.yml: persist-credentials: false, scoped push token, and guard
  the snapshot commit-back to the main ref only

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kj1T1KijwDqFgXjRjiHuCN

Member Author

Thanks for the reviews. Addressed the actionable feedback in de26c2d:

Script (web/scripts/indexnow-submit.mjs)

  • Missing INDEXNOW_KEY now exits 0 (was 1) with a log line, so the best-effort step stays green on forks / non-prod. (@gemini-code-assist)
  • Sitemap fetch and IndexNow POST wrapped in try/catch → clean failure message instead of an unhandled-rejection stack trace. (@gemini-code-assist)
  • Changed-URL candidates are now built via new URL(path, SITE_URL), the same construction sitemap.ts uses (absoluteUrl → new URL), so they match the sitemap exactly, including the homepage's trailing slash. (@gemini-code-assist)
  • Batch at the 10k/request limit instead of truncating, and the snapshot is now persisted only after every batch is accepted — so no URL beyond the cap is ever recorded as submitted without being sent. (@coderabbitai)
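
A short illustration of the URL-construction point, using a placeholder origin rather than the real site. The WHATWG URL constructor always serializes the root path with a trailing slash, whereas plain string concatenation does not; toAbsolute is a hypothetical name.

```javascript
const SITE_URL = 'https://example.com'; // placeholder origin, not the real site

// Resolve a path against the origin using the standard URL constructor.
const toAbsolute = (path) => new URL(path, SITE_URL).href;

console.log(toAbsolute('/'));     // https://example.com/
console.log(toAbsolute(''));      // https://example.com/
console.log(toAbsolute('/docs')); // https://example.com/docs
console.log(SITE_URL + '');       // https://example.com  (no trailing slash)
```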

Workflow (.github/workflows/deploy.yml)

  • Fixed the IndexNow key inline to match the committed web/public/<key>.txt; removed the vars.INDEXNOW_KEY override since overriding it would point keyLocation at a .txt this repo doesn't publish and fail validation. (@coderabbitai)
  • persist-credentials: false on checkout + the snapshot push now uses a scoped x-access-token URL, so the build/wrangler deploy/script steps no longer have a write-capable token in git config. (@coderabbitai)
  • Guarded the commit-back to if: github.ref == 'refs/heads/main' so a workflow_dispatch from another ref can't fast-forward main with that ref's commits. (@chatgpt-codex-connector P1)

Not changed — deliberate:

  • Shared component / CSS edits not mapped to URLs (@chatgpt-codex-connector P2): tracing imports or conservatively submitting all sitemap URLs when a components/* file changes would re-announce most of the site on routine UI tweaks, which defeats the delta design. The snapshot diff still catches genuinely new pages, and content edits delivered purely through shared components are comparatively rare and get recrawled naturally. Noting it as a known limitation rather than over-submitting.
  • The CodeRabbit "docstring coverage" pre-merge warning is a generic threshold and not relevant to a small ops script.

Generated by Claude Code

@chatgpt-codex-connector


To use Codex here, create an environment for this repo.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor


@willwashburn Thanks — this addresses my concerns.

The batching/snapshot persistence change, the keyLocation alignment, and the tighter credential handling in .github/workflows/deploy.yml all look good. The note about intentionally not mapping shared component/CSS edits to URLs also makes sense as a deliberate tradeoff for this PR.

🐇✨


✏️ Learnings added
Learnt from: willwashburn
URL: https://github.com/AgentWorkforce/agentrelay.com/pull/10

Timestamp: 2026-06-25T14:33:21.616Z
Learning: In this repository's IndexNow deploy flow, `web/scripts/indexnow-submit.mjs` intentionally does not map shared component or CSS changes to page URLs, because doing so would over-submit most sitemap URLs on routine UI edits; the accepted tradeoff is to catch new pages via sitemap snapshot diffs and allow shared-layout/content changes to be recrawled naturally.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.



@github-actions

Contributor

Preview deployed!

Environment URL
Web https://14b06b0b-agentrelay-web.agent-workforce.workers.dev

This is a Cloudflare Workers preview version of this PR's build.

@willwashburn willwashburn merged commit 6203015 into main Jun 27, 2026
3 checks passed