Allow Unicode letters in bundle variable reference paths#5532
Draft
Rada1302 wants to merge 1 commit into
Draft
Conversation
Extend ${...} segment matching to accept Unicode letters (not just ASCII)
across Go, Python, and JSON Schema validation, with a shared reference_vectors
contract to keep the three implementations in sync.
Collaborator
|
Commit: aee3e7e
23 interesting tests: 15 SKIP, 7 KNOWN, 1 flaky
Top 32 slowest tests (at least 2 minutes):
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Extend
${...}segment matching to accept Unicode letters (not just ASCII) across Go, Python, and JSON Schema validation, with a sharedreference_vectorscontract to keep the three implementations in sync.Changes
[a-zA-Z]…) to Unicode letter-aware matching (\p{L}in Go,[^\W\d_]in Python).dynvar.BaseVarDefin Go and reused it in bundle JSON Schema generation so schema validation matches runtime parsing.libs/dyn/dynvar/testdata/reference_vectors.jsonas a shared contract exercised by Go (ref_vectors_test.go) and Python (test_variable_reference_vectors.py) tests.bundle/schema/jsonschema.json.Why
Bundle variable names and references were previously limited to ASCII letters in
${...}paths, which blocked valid Unicode identifiers (e.g. CJK variable names like${var.变量}). The Go runtime matcher, Python pydabs transform layer, and JSON Schema validation each had their own copy of the regex, which made drift likely. Centralizing the segment definition inBaseVarDef/_base_var_defand locking behavior with shared vector tests keeps all three layers consistent.Emoji, symbols, and NFD-normalized characters (e.g.
e+ combining acute instead of precomposedé) remain rejected intentionally — only Unicode letters and digits (after the first character) are allowed in segments, preserving existing separator rules (-,_).Tests
go test ./libs/dyn/dynvar/...— vector-driven tests forNewRef,IsPureVariableReference, andPureReferenceToPathgo test ./bundle/schema/... -run TestSchemaValidate— new pass/fail schema fixtures for Unicode referencescd python && task test— Python vector parity tests against the samereference_vectors.json