Support Unicode identifiers in Handlebars expressions#114
Conversation
Handlebars allows variable, helper, partial and block names in any
language, but the grammar's identifier character classes were limited to
ASCII (a-zA-Z0-9), so non-Latin names lost highlighting.
Replace the ASCII ranges with Oniguruma Unicode property classes
\p{L} (any letter) and \p{N} (any number) in the Handlebars-specific
rules: block_helper, end_block, partial_and_var, attribute name/value,
layout (!<) and else_token. HTML-structural rules (tag names, entities,
generic attributes) keep their ASCII ranges per the HTML spec.
This supersedes PR #90, which only added Cyrillic to a subset of rules
(and missed the closing-tag rule); the maintainer's review on that PR
asked for full-language support instead.
Closes #90. Adds test/unicode.test.js covering Cyrillic, CJK, Arabic
and Latin-with-diacritics across variables, blocks, partials, hashes
and else-if.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAll three Handlebars grammar formats replace ASCII-only identifier classes with Unicode property escapes in matching rules for blocks, else clauses, extends syntax, attributes, and inline variables/partials. A new test file checks Unicode scoping across several script systems. ChangesUnicode Identifier Support
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/unicode.test.js (1)
25-69: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a Unicode
{{!< ...}}regression test.The suite covers most widened Handlebars rules, but it never asserts the
extendspattern that changed in all three grammar files. That leaves the{{!< макет}}path unprotected.Suggested test
test('non-ASCII hash key and value', async () => { const src = '{{foo имя=значение}}'; await assertScope(src, 'имя', 'entity.other.attribute-name.handlebars'); await assertScope(src, 'значение', 'entity.other.attribute-value.handlebars'); }); + +test('layout extends with a non-ASCII name', async () => { + await assertScope('{{!< макет}}', 'макет', 'support.class.handlebars'); +});🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/unicode.test.js` around lines 25 - 69, Add a regression test in the unicode test suite for the Handlebars extends form handled by the grammar’s `extends` pattern, since `{{!< ...}}` is not currently covered. Use `assertScope` with a non-ASCII template name such as `{{!< макет}}` and verify the relevant token scope on the Unicode name so the widened rule stays protected across the grammar files.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/unicode.test.js`:
- Around line 25-69: Add a regression test in the unicode test suite for the
Handlebars extends form handled by the grammar’s `extends` pattern, since `{{!<
...}}` is not currently covered. Use `assertScope` with a non-ASCII template
name such as `{{!< макет}}` and verify the relevant token scope on the Unicode
name so the widened rule stays protected across the grammar files.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 683be924-dd7f-4e12-9ef6-21e36dee496a
📒 Files selected for processing (4)
grammars/Handlebars.jsongrammars/Handlebars.sublime-syntaxgrammars/Handlebars.tmLanguagetest/unicode.test.js
The extends rule was widened to \p{L}\p{N} alongside the other
identifier rules but had no Unicode coverage; only the ASCII case in
embedding.test.js guarded it. Add a test with a non-ASCII template name
so the widened rule stays protected.
Handlebars allows variable, helper, partial and block names in any language, but the grammar's identifier character classes were limited to ASCII (a-zA-Z0-9), so non-Latin names lost highlighting.
Replace the ASCII ranges with Oniguruma Unicode property classes \p{L} (any letter) and \p{N} (any number) in the Handlebars-specific rules: block_helper, end_block, partial_and_var, attribute name/value, layout (!<) and else_token. HTML-structural rules (tag names, entities, generic attributes) keep their ASCII ranges per the HTML spec.
This supersedes PR #90, which only added Cyrillic to a subset of rules (and missed the closing-tag rule); the review on that PR asked for full-language support instead.
Closes #90. Adds test/unicode.test.js covering Cyrillic, CJK, Arabic and Latin-with-diacritics across variables, blocks, partials, hashes and else-if.
Summary by CodeRabbit
New Features
else if, block endings, and layout-styleextendssyntax.Tests