seo: notify IndexNow of new/changed pages on deploy, with a persisted sitemap snapshot #10
Conversation
Add IndexNow (Bing/Yandex/Seznam/Naver/DuckDuckGo) integration so newly deployed/updated pages get recrawled promptly. Not used by Google, so this complements rather than replaces sitemap.xml.
- web/public/<key>.txt: IndexNow key file, served as a static asset.
- web/scripts/indexnow-submit.mjs: derives the URLs changed in a deploy from the commit-range diff, intersects them with the live sitemap (so we never submit 404s, dynamic, or unpublished routes), and POSTs only that delta; per IndexNow guidance, never the whole site.
- deploy.yml: post-deploy step (production only, continue-on-error) with fetch-depth: 0 so the diff range is available.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kj1T1KijwDqFgXjRjiHuCN
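The intersection step described in this commit message can be sketched as follows. This is a minimal illustration, not the PR's script: the helper name `selectSubmittable`, its parameters, and the `https://example.com` origin are hypothetical, and it assumes the changed paths have already been derived from the commit-range diff.

```javascript
// Hypothetical helper: keep only changed URLs that the live sitemap publishes.
// `siteUrl`, `changedPaths`, and `sitemapUrls` are illustrative inputs.
function selectSubmittable(siteUrl, changedPaths, sitemapUrls) {
  const published = new Set(sitemapUrls);
  const selected = new Set();
  for (const path of changedPaths) {
    // Resolve the path against the origin so it is serialized like a sitemap entry.
    const url = new URL(path, siteUrl).href;
    if (published.has(url)) selected.add(url); // drops 404s and unpublished routes
  }
  return [...selected];
}

const submittable = selectSubmittable(
  'https://example.com',
  ['/docs/intro', '/drafts/unpublished'],
  ['https://example.com/', 'https://example.com/docs/intro'],
);
console.log(submittable); // only the /docs/intro URL survives
```

Filtering against the sitemap rather than trusting the diff keeps the submission limited to URLs that are actually published.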
Replace the live-sitemap-only heuristic with a committed snapshot of the last deploy's published URL set (web/indexnow-state.json). Each deploy now diffs the freshly deployed sitemap against that snapshot for new pages (certain) and the git range for content edits, submits only that union, then commits the refreshed snapshot back to main with [skip ci] so the next deploy has a durable baseline.
- Bootstraps once (empty snapshot -> announce all current URLs), deltas after.
- Logs URLs dropped from the sitemap rather than auto-submitting deletions.
- Caps submissions at IndexNow's 10k per-request limit.
- deploy.yml: contents: write + commit-back step.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kj1T1KijwDqFgXjRjiHuCN
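A rough sketch of the snapshot diff this commit describes, under the assumption that the snapshot, the fresh sitemap, and the git-derived edits are all available as arrays of absolute URLs. The function name `diffAgainstSnapshot` and its return shape are hypothetical and not taken from the PR.

```javascript
// Hypothetical sketch of the snapshot diff; not the PR's actual implementation.
function diffAgainstSnapshot(snapshotUrls, currentUrls, editedUrls) {
  const snapshot = new Set(snapshotUrls);
  const current = new Set(currentUrls);

  // An empty snapshot means first run: every current URL counts as new.
  const bootstrap = snapshot.size === 0;

  // New pages: in the fresh sitemap but absent from the last snapshot.
  const added = [...current].filter((u) => !snapshot.has(u));
  // Dropped pages are returned for logging only, never submitted.
  const dropped = [...snapshot].filter((u) => !current.has(u));
  // Edited pages only count if they are still published.
  const changed = editedUrls.filter((u) => current.has(u));

  // Union of new and changed, de-duplicated.
  const toSubmit = [...new Set([...added, ...changed])];
  return { bootstrap, toSubmit, dropped };
}
```

With a snapshot of `['/a', '/b']`, a sitemap of `['/b', '/c']`, and edits to `['/b', '/x']`, the sketch would submit `/c` and `/b`, and report `/a` as dropped.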
📝 Walkthrough
The deploy workflow now runs an IndexNow submission script after Cloudflare deployment, then conditionally commits an updated snapshot.
Changes
IndexNow deployment and submission flow
Sequence Diagram(s)
sequenceDiagram
participant DeployWorkflow as Deploy workflow
participant SubmitScript as web/scripts/indexnow-submit.mjs
participant IndexNowAPI as api.indexnow.org/indexnow
DeployWorkflow->>SubmitScript: run with before and after SHAs
SubmitScript->>IndexNowAPI: POST URL batches
IndexNowAPI-->>SubmitScript: 200 or 202 response
SubmitScript->>SubmitScript: write updated indexnow-state.json
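The POST and response handling in the diagram can be sketched roughly as below. The body fields (`host`, `key`, `keyLocation`, `urlList`) follow the script excerpt quoted in this review and the 200/202 check follows the diagram; the helper names `buildIndexNowBody` and `isAccepted`, the `example-key` value, and the `https://example.com` origin are hypothetical.

```javascript
// Hypothetical helpers illustrating the request body and the acceptance check.
// Assumes `siteUrl` has no trailing slash, e.g. 'https://example.com'.
function buildIndexNowBody(siteUrl, key, urlList) {
  const { host } = new URL(siteUrl);
  return JSON.stringify({
    host,
    key,
    keyLocation: `${siteUrl}/${key}.txt`, // where the public key file is served
    urlList,
  });
}

// The diagram treats both 200 and 202 as a successful submission.
function isAccepted(status) {
  return status === 200 || status === 202;
}

const body = buildIndexNowBody('https://example.com', 'example-key', [
  'https://example.com/docs/intro',
]);
console.log(body);
```

Keeping the acceptance check in one place makes it easier to decide, per batch, whether the snapshot may be written afterwards.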
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Code Review
This pull request introduces an IndexNow integration to submit new and updated URLs to search engines upon deployment. It includes a state file (indexnow-state.json), a verification key file, and a script (indexnow-submit.mjs) that diffs the current sitemap against the previous snapshot and git history to identify modified pages. The review feedback highlights several key improvement opportunities: handling missing API keys gracefully with a zero exit code to avoid breaking CI on forks, wrapping network fetch requests in try...catch blocks to prevent unhandled promise rejections, and normalizing URLs to prevent mismatches caused by trailing slashes.
  process.exit(1);
}

if (!KEY) fail('INDEXNOW_KEY is not set — skipping. (set it as a repo variable)');
If INDEXNOW_KEY is not set (for example, on fork PRs or non-production environments where secrets are not exposed), exiting with code 1 will cause the GitHub Actions step to show as failed (red), even if continue-on-error is set. Since this is an expected scenario for non-production runs, it is cleaner to log the message and exit with code 0 to keep the CI status green.
if (!KEY) {
  console.log('indexnow: INDEXNOW_KEY is not set — skipping. (set it as a repo variable)');
  process.exit(0);
}

async function fetchSitemapUrls() {
  const res = await fetch(`${SITE_URL}/sitemap.xml`, {
    headers: { 'user-agent': 'agentrelay-indexnow/1.0' },
  });
  if (!res.ok) fail(`could not fetch sitemap.xml (${res.status})`);
  const xml = await res.text();
  const urls = new Set();
  for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
    urls.add(match[1].trim());
  }
  if (urls.size === 0) fail('sitemap.xml contained no <loc> entries');
  return urls;
}
If the network request to fetch the sitemap fails (e.g., due to DNS issues or temporary network failure), fetch will throw an error, causing an unhandled promise rejection and a noisy stack trace. Wrapping the fetch call in a try...catch block allows us to handle network errors gracefully and print a clean error message.
async function fetchSitemapUrls() {
  try {
    const res = await fetch(SITE_URL + '/sitemap.xml', {
      headers: { 'user-agent': 'agentrelay-indexnow/1.0' },
    });
    if (!res.ok) fail('could not fetch sitemap.xml (' + res.status + ')');
    const xml = await res.text();
    const urls = new Set();
    for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
      urls.add(match[1].trim());
    }
    if (urls.size === 0) fail('sitemap.xml contained no <loc> entries');
    return urls;
  } catch (err) {
    fail('failed to fetch sitemap.xml: ' + err.message);
  }
}

const changedCandidates = new Set(
  files.flatMap(pathsForFile).filter((p) => p && p !== '__AGENTS__').map((p) => `${SITE_URL}${p}`),
);
To prevent silent mismatches due to trailing slashes (e.g., if SITE_URL plus the path has a trailing slash but the sitemap URL does not, or vice versa), we should normalize the URLs when comparing them. This ensures that edited pages (such as the homepage) are correctly identified as changed and submitted to IndexNow.
const changedCandidates = new Set(
  files.flatMap(pathsForFile).filter((p) => p && p !== '__AGENTS__').map((p) => {
    const url = SITE_URL + p;
    const normalized = url.replace(/\/$/, '');
    const matched = [...current].find((u) => u.replace(/\/$/, '') === normalized);
    return matched || url;
  })
);

const res = await fetch(ENDPOINT, {
  method: 'POST',
  headers: { 'content-type': 'application/json; charset=utf-8' },
  body: JSON.stringify({ host: HOST, key: KEY, keyLocation: `${SITE_URL}/${KEY}.txt`, urlList }),
});
Similarly, if the POST request to the IndexNow endpoint fails due to a network error, it will throw an unhandled promise rejection. Wrapping this call in a try...catch block ensures the script fails gracefully with a clear error message.
let res;
try {
  res = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'content-type': 'application/json; charset=utf-8' },
    body: JSON.stringify({ host: HOST, key: KEY, keyLocation: SITE_URL + '/' + KEY + '.txt', urlList }),
  });
} catch (err) {
  fail('failed to post to IndexNow endpoint: ' + err.message);
}
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c2363e6a63
git config user.email "github-actions[bot]@users.noreply.github.com"
git add indexnow-state.json
git commit -m "chore(seo): update IndexNow sitemap snapshot [skip ci]"
git push origin HEAD:main
Avoid pushing dispatched refs to main
When this workflow is started via the existing unqualified workflow_dispatch from any non-main ref, checkout runs that selected ref and this added git push origin HEAD:main can fast-forward main with the dispatched branch's commits plus the snapshot commit, not just persist indexnow-state.json; git push -h confirms the trailing arguments are the repository/refspec, so HEAD:main targets the checked-out HEAD at main. Guard this step to github.ref == 'refs/heads/main' or push only from a fresh main checkout.
  // the live sitemap below narrows it to what's actually published.
  if (file === 'web/lib/agents.ts') return ['__AGENTS__'];

  return [];
Track component edits that change sitemap pages
For deploys that edit page content through imported components or CSS modules, this fallback drops the file, so an existing URL is never submitted because the sitemap snapshot is unchanged. For example, web/app/page.tsx renders the homepage from web/components/home/* and shared layout components, but a change to web/components/home/Hero.tsx maps to no path here even though / changed; consider conservatively submitting affected/all sitemap URLs for shared UI files or tracing imports.
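One conservative way to act on this suggestion might look like the sketch below. It is only an illustration: the prefix list, the helper name `urlsForChangedFile`, and the `directUrlsForFile` callback are hypothetical, with the prefix loosely based on the `web/components/home/*` example in the comment. Import tracing would be more precise but is not shown.

```javascript
// Hypothetical rule set: files under these prefixes are treated as shared UI.
const SHARED_UI_PREFIXES = ['web/components/'];

// `directUrlsForFile` stands in for a file-to-URL mapping and returns absolute URLs.
function urlsForChangedFile(file, directUrlsForFile, sitemapUrls) {
  const direct = directUrlsForFile(file);
  if (direct.length > 0) return direct;

  // Conservative fallback: a shared component may affect any published page.
  if (SHARED_UI_PREFIXES.some((prefix) => file.startsWith(prefix))) {
    return [...sitemapUrls];
  }
  return [];
}

const affected = urlsForChangedFile(
  'web/components/home/Hero.tsx',
  () => [], // no direct mapping for component files in this sketch
  ['https://example.com/', 'https://example.com/docs'],
);
console.log(affected.length); // 2
```

The trade-off is that any shared-component edit re-announces every sitemap URL, which conflicts with the goal of submitting only a delta, so a narrower prefix-to-URL table may be preferable in practice.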
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/deploy.yml:
- Around line 14-15: The deploy workflow currently leaves a writable
GITHUB_TOKEN available to earlier steps because checkout persists credentials by
default, which allows build or deployment scripts to reuse push access; update
the checkout setup in the deploy job to disable persisted credentials, then
provide write auth only in the dedicated commit/push step that updates the
sitemap snapshot. Use the existing deploy job and the final push logic around
the IndexNow/sitemap update path to keep the token scoped to that last step
only.
- Around line 64-73: The IndexNow workflow currently allows INDEXNOW_KEY to be
overridden, but the deployed public key file is still fixed, so validation will
break when the env var changes. Update the deploy job around Notify IndexNow of
changed pages and web/scripts/indexnow-submit.mjs usage so the key value and the
published .txt file come from the same source: either keep a single fixed key in
the workflow, or generate/copy the matching public key file during build/deploy
before submitting IndexNow.
In `@web/scripts/indexnow-submit.mjs`:
- Around line 155-157: The submission flow in indexnow-submit.mjs is persisting
URLs that were never actually sent after urlList is capped to MAX_URLS. Update
the success-state write path in the main IndexNow submission logic so only the
successfully submitted subset is recorded, not the full current sitemap; make
sure the same fix is applied wherever the truncated list is handled later in the
file. Use the existing urlList slicing and the state persistence around
current/indexnow-state.json to keep only announced URLs marked as submitted.
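The last finding (persist only what was actually sent) could be approached along these lines. This is a sketch under assumptions: `nextSnapshot` and its parameters are hypothetical names, and all three inputs are assumed to be arrays of absolute URLs.

```javascript
// Hypothetical helper: compute the snapshot to persist after a capped submission.
function nextSnapshot(previousUrls, currentUrls, submittedUrls) {
  const previous = new Set(previousUrls);
  const submitted = new Set(submittedUrls);
  // Keep a current URL only if it was already known or was actually announced.
  // New URLs that fell outside the cap stay out, so the next deploy retries them.
  return currentUrls.filter((u) => previous.has(u) || submitted.has(u));
}

const persisted = nextSnapshot(
  ['https://example.com/'],
  ['https://example.com/', 'https://example.com/a', 'https://example.com/b'],
  ['https://example.com/a'], // '/b' was cut off by the cap
);
console.log(persisted); // '/b' is left out and will count as new next time
```

Note that this only protects new URLs; an edited URL that was truncated is already in the previous snapshot and would need separate tracking.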
📒 Files selected for processing (4)
.github/workflows/deploy.yml
web/indexnow-state.json
web/public/d15f21b935684761ad607fb06b70b3d5.txt
web/scripts/indexnow-submit.mjs
- exit 0 (not 1) when INDEXNOW_KEY is unset, so the best-effort step stays green
- wrap sitemap fetch and IndexNow POST in try/catch for clean failure messages
- build changed-URL candidates via new URL() so they match the sitemap exactly (incl. the homepage trailing slash)
- batch submissions at the 10k/request limit instead of truncating, and persist the snapshot only after every batch is accepted (no URL marked sent-but-unsent)
- fix the IndexNow key inline to match the committed public .txt (overriding it would point keyLocation at an unpublished file and fail validation)
- harden deploy.yml: persist-credentials: false, scoped push token, and guard the snapshot commit-back to the main ref only
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kj1T1KijwDqFgXjRjiHuCN
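The batching behavior listed in this commit can be sketched as below. The 10,000 per-request limit comes from the PR text; the names `toBatches` and `submitAll`, and the `postBatch` callback (assumed to resolve to an HTTP status code), are hypothetical.

```javascript
const MAX_URLS_PER_REQUEST = 10000; // per-request limit cited in this PR

// Split a URL list into consecutive chunks of at most `size` entries.
function toBatches(urls, size = MAX_URLS_PER_REQUEST) {
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

// `postBatch` is a hypothetical async function resolving to an HTTP status.
// Resolves to true only if every batch was accepted (200 or 202).
async function submitAll(urls, postBatch, size = MAX_URLS_PER_REQUEST) {
  for (const batch of toBatches(urls, size)) {
    const status = await postBatch(batch);
    if (status !== 200 && status !== 202) return false; // caller must not persist
  }
  return true;
}
```

A caller would write the snapshot only when `submitAll` resolves to `true`, which matches the "persist only after every batch is accepted" rule above.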
Thanks for the reviews. Addressed the actionable feedback in Script (
Workflow (
Not changed — deliberate:
Generated by Claude Code |
The batching/snapshot persistence change, the 🐇✨ ✏️ Learnings added
|
Preview deployed!
This is a Cloudflare Workers preview version of this PR's build. |
What
After each production deploy, ping IndexNow (Bing, Yandex, Seznam, Naver, DuckDuckGo — not Google) with the pages that are new or changed in that deploy. Certainty comes from a committed snapshot of the last deploy's published URL set, so we never re-announce the whole site.
How it works
web/indexnow-state.json is a committed snapshot of every URL live as of the last deploy. On each production deploy the script (web/scripts/indexnow-submit.mjs):
- sitemap.xml.
- new ∪ changed, then commits the refreshed snapshot back to main with [skip ci] so the next deploy has a durable baseline.
Everything submitted is intersected with the live sitemap, so a 404 / dynamic / unpublished route can never be pinged.
Behavior notes
- continue-on-error: they can never fail an already-successful production deploy.
- sitemap.xml/robots, which are untouched.
Files
- web/scripts/indexnow-submit.mjs: delta computation + submission.
- web/indexnow-state.json: committed snapshot (seeded empty).
- .github/workflows/deploy.yml: contents: write, submit step, and commit-back step.
Validation
Tested locally against a fake sitemap across five scenarios: bootstrap, steady-state (no-op), new page, removed page (logged only), and edited existing page.
Caveats to review
- main from CI (contents: write + [skip ci]). If main is a protected branch requiring PRs/reviews, that push is rejected: the step no-ops and the snapshot just isn't persisted. If main is protected, we should switch persistence to an Actions cache or update the snapshot via PR instead.
- deploy.yml (public by design, served at /<key>.txt). Set an INDEXNOW_KEY repo variable to override without a code change.
🤖 Generated with Claude Code
Summary by cubic
Notify IndexNow on every production deploy with only the URLs that are new or changed, using the live sitemap plus a persisted snapshot to avoid re-announcing the whole site. Adds batching and safer workflow settings; improves error handling and URL matching.
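To illustrate the URL-matching point in this summary, the sketch below contrasts plain string concatenation with resolving through `new URL()`. The origin, the sitemap entries, and the assumption that the homepage path is an empty string are all hypothetical; the helper name `toSitemapUrl` is not from the PR.

```javascript
// Hypothetical origin and sitemap entries, for illustration only.
const SITE = 'https://example.com';
const sitemap = new Set(['https://example.com/', 'https://example.com/docs']);

// Resolve a path against the origin; URL serialization adds the root slash.
function toSitemapUrl(path, site) {
  return new URL(path, site).href;
}

const concatenated = SITE + '';          // assumed homepage path: empty string
const resolved = toSitemapUrl('', SITE); // 'https://example.com/'

console.log(sitemap.has(concatenated)); // false
console.log(sitemap.has(resolved));     // true
```

Resolving every candidate the same way avoids a separate trailing-slash comparison against each sitemap entry.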
New Features
- web/scripts/indexnow-submit.mjs to compute new pages (sitemap vs. snapshot) and changed pages (git range), intersect with the live sitemap.xml, and submit only that set to IndexNow.
- web/indexnow-state.json via a best-effort commit to main with [skip ci], writing only after all batches are accepted.
- INDEXNOW_KEY is unset; builds URLs to match the sitemap exactly (incl. homepage slash).
- contents: write, fetch-depth: 0, persist-credentials: false, a post-deploy submit step, a guarded commit-back to main using a scoped token, and adds the public key file at web/public/d15f21b935684761ad607fb06b70b3d5.txt.
Migration
- main is protected, the commit-back will be blocked; use an Actions cache or update the snapshot via PR instead.
- web/public/<key>.txt and the inline key together.
Written for commit de26c2d. Summary will update on new commits.