Support Unicode identifiers in Handlebars expressions by daaain · Pull Request #114 · daaain/Handlebars

daaain · 2026-06-29T21:51:32Z

Handlebars allows variable, helper, partial and block names in any language, but the grammar's identifier character classes were limited to ASCII (a-zA-Z0-9), so non-Latin names lost highlighting.

Replace the ASCII ranges with Oniguruma Unicode property classes \p{L} (any letter) and \p{N} (any number) in the Handlebars-specific rules: block_helper, end_block, partial_and_var, attribute name/value, layout (!<) and else_token. HTML-structural rules (tag names, entities, generic attributes) keep their ASCII ranges per the HTML spec.
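The effect of the widening can be sketched outside the grammar with plain JavaScript regular expressions. The two patterns below are simplified stand-ins, not the grammar's actual rules, and JavaScript's regex engine is not Oniguruma; the sketch only assumes that `\p{L}` and `\p{N}` behave comparably once the `u` flag is set.

```javascript
// Simplified stand-ins for the identifier classes (assumptions for
// illustration; the real rules live in the grammar files and run on Oniguruma).
const asciiIdentifier = /^[a-zA-Z0-9]+$/;
const unicodeIdentifier = /^[\p{L}\p{N}]+$/u; // the `u` flag enables \p{...}

// Sample names: ASCII, Cyrillic, CJK, Arabic, Latin with a diacritic.
const samples = ['name', 'имя', '名前', 'اسم', 'prénom'];

// Report, for each name, whether each pattern accepts it as an identifier.
function classify(names) {
  return names.map((name) => ({
    name,
    ascii: asciiIdentifier.test(name),
    unicode: unicodeIdentifier.test(name),
  }));
}

console.table(classify(samples));
```

Only `name` passes the ASCII pattern, which mirrors why non-Latin names lost highlighting before this change. One caveat of the sketch: `\p{L}` and `\p{N}` do not include combining marks (`\p{M}`), so it assumes precomposed characters such as `é` (U+00E9).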

This supersedes PR #90, which only added Cyrillic to a subset of rules (and missed the closing-tag rule); the review on that PR asked for full-language support instead.

Closes #90. Adds test/unicode.test.js covering Cyrillic, CJK, Arabic and Latin-with-diacritics across variables, blocks, partials, hashes and else-if.

Summary by CodeRabbit

New Features
- Handlebars syntax highlighting now supports Unicode letters and digits in identifiers, helper names, partials, block tags, and attribute keys/values.
- Improved recognition of non-ASCII names in `else if` clauses, block endings, and the layout-style extends syntax (`{{!< ...}}`).
Tests
- Added automated coverage to verify correct tokenization/highlighting for Unicode identifiers across multiple scripts (including variables, helpers, parameters, partials, and attribute data).


coderabbitai · 2026-06-29T21:56:16Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 99c65b5e-c6d1-41f3-ac04-a012e6eb6a2f

📥 Commits

Reviewing files that changed from the base of the PR and between 84d8275 and 9920a4d.

📒 Files selected for processing (1)

test/unicode.test.js

🚧 Files skipped from review as they are similar to previous changes (1)

test/unicode.test.js

📝 Walkthrough


All three Handlebars grammar formats replace ASCII-only identifier classes with Unicode property escapes in matching rules for blocks, else clauses, extends syntax, attributes, and inline variables/partials. A new test file checks Unicode scoping across several script systems.

Changes

Unicode Identifier Support

| Layer / File(s) | Summary |
| --- | --- |
| Unicode regex widening across all grammar formats: `grammars/Handlebars.json`, `grammars/Handlebars.sublime-syntax`, `grammars/Handlebars.tmLanguage` | Seven regex patterns in each grammar file replace ASCII character classes with `\p{L}`/`\p{N}` Unicode property escapes for the `block_helper`, `else_token`, `end_block`, `extends`, `handlebars_attribute_name`, `handlebars_attribute_value`, and `partial_and_var` rules. The `.tmLanguage` file also adds `\b` boundaries around attribute name/value matches. |
| Unicode grammar test suite: `test/unicode.test.js` | New `node:test` coverage adds an `assertScope` helper and cases for Cyrillic, CJK, Arabic, and diacritic Latin identifiers across variable, block open/close, else-if, partial, `extends`, and hash attribute positions. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 Hop, hop—new letters join the line,
From Cyrillic stars to scripts that shine.
Blocks and helpers now read with grace,
Unicode dances through the grammar space.
A rabbit nods: “All tongues may play!”

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: widening Handlebars identifiers to support Unicode. |
| Linked Issues check | ✅ Passed | The PR fulfills `#90` by adding Unicode support for Handlebars names and related parsing across the highlighted rules. |
| Out of Scope Changes check | ✅ Passed | The changes stay focused on Unicode identifier support and matching tests, with no obvious unrelated additions. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

🧹 Nitpick comments (1)

test/unicode.test.js (1)

25-69: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a Unicode {{!< ...}} regression test.

The suite covers most widened Handlebars rules, but it never asserts the extends pattern that changed in all three grammar files. That leaves the {{!< макет}} path unprotected.

Suggested test

```diff
 test('non-ASCII hash key and value', async () => {
   const src = '{{foo имя=значение}}';
   await assertScope(src, 'имя', 'entity.other.attribute-name.handlebars');
   await assertScope(src, 'значение', 'entity.other.attribute-value.handlebars');
 });
+
+test('layout extends with a non-ASCII name', async () => {
+  await assertScope('{{!< макет}}', 'макет', 'support.class.handlebars');
+});
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/unicode.test.js` around lines 25 - 69, Add a regression test in the
unicode test suite for the Handlebars extends form handled by the grammar’s
`extends` pattern, since `{{!< ...}}` is not currently covered. Use
`assertScope` with a non-ASCII template name such as `{{!< макет}}` and verify
the relevant token scope on the Unicode name so the widened rule stays protected
across the grammar files.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 683be924-dd7f-4e12-9ef6-21e36dee496a

📥 Commits

Reviewing files that changed from the base of the PR and between adc200e and 84d8275.

📒 Files selected for processing (4)

grammars/Handlebars.json
grammars/Handlebars.sublime-syntax
grammars/Handlebars.tmLanguage
test/unicode.test.js

The extends rule was widened to \p{L}\p{N} alongside the other identifier rules but had no Unicode coverage; only the ASCII case in embedding.test.js guarded it. Add a test with a non-ASCII template name so the widened rule stays protected.
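Why the ASCII-only case was not enough can be shown with a small sketch. Both patterns below are hypothetical, simplified versions of the layout rule (the real patterns in the grammar files differ): one widened, one regressed back to ASCII. An ASCII input cannot tell them apart, while a non-ASCII template name can.

```javascript
// Hypothetical, simplified layout patterns; names and shapes are assumptions.
const widenedLayout = /\{\{!<\s*([\p{L}\p{N}]+)\s*\}\}/u;
const regressedLayout = /\{\{!<\s*([a-zA-Z0-9]+)\s*\}\}/;

// Return the captured template name, or null when the pattern does not match.
function layoutName(pattern, source) {
  const match = pattern.exec(source);
  return match ? match[1] : null;
}

const asciiCase = '{{!< layout}}';
const unicodeCase = '{{!< макет}}';

// The ASCII input passes with either pattern, so it would not catch a regression.
console.log(layoutName(widenedLayout, asciiCase));   // 'layout'
console.log(layoutName(regressedLayout, asciiCase)); // 'layout'

// Only the non-ASCII input distinguishes the widened rule from the regressed one.
console.log(layoutName(widenedLayout, unicodeCase));   // 'макет'
console.log(layoutName(regressedLayout, unicodeCase)); // null
```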

daaain mentioned this pull request Jun 29, 2026

Support Russian language #90

Closed

coderabbitai Bot reviewed Jun 29, 2026

View reviewed changes

daaain merged commit a5aa65d into master Jun 29, 2026
3 checks passed

daaain merged 2 commits into master from feature/unicode-identifiers
