feat(extraction): Elixir language support (.ex/.exs) #871
Open
allenwoods wants to merge 1 commit into
Adds Elixir to CodeGraph's tree-sitter extraction. The grammar
(tree-sitter-elixir.wasm) already ships in tree-sitter-wasms; this wires it
up and adds the extractor.
tree-sitter-elixir is metaprogramming-first: defmodule, def, defp, alias,
import, if, case — everything parses as the same `(call target:(identifier)
(arguments) (do_block)?)` shape. So extraction runs through the visitNode
hook and dispatches on the macro identifier rather than node types.
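The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the extractor in this PR: `SketchNode`, `classify`, `makeIdentifier` and `makeCall` are hypothetical names, and the node shape is a simplified stand-in for the real tree-sitter node API. The only things taken from the description are the macro names and the idea of keying on the call's target identifier.

```typescript
// Hypothetical, simplified node shape (assumption: the real tree-sitter
// node API is richer and is not reproduced here).
interface SketchNode {
  type: string; // e.g. "call" or "identifier"
  text: string; // source text of the node
  children: SketchNode[];
}

type Visibility = "public" | "private";
type MacroKind = "module" | "function" | "dependency" | "other";

interface Classification {
  kind: MacroKind;
  visibility?: Visibility;
}

// Definition macros listed in the description; the `p` variants are private.
const FUNCTION_MACROS = new Map<string, Visibility>([
  ["def", "public"],
  ["defp", "private"],
  ["defmacro", "public"],
  ["defmacrop", "private"],
  ["defguard", "public"],
  ["defguardp", "private"],
  ["defdelegate", "public"],
]);

const DEPENDENCY_MACROS = new Set<string>(["alias", "import", "require", "use"]);

// Every construct is a `call`, so the node type alone says nothing;
// the identifier in target position decides what the node means.
function classify(node: SketchNode): Classification {
  if (node.type !== "call") return { kind: "other" };
  const target = node.children[0];
  if (!target || target.type !== "identifier") return { kind: "other" };

  if (target.text === "defmodule") return { kind: "module" };
  const visibility = FUNCTION_MACROS.get(target.text);
  if (visibility) return { kind: "function", visibility };
  if (DEPENDENCY_MACROS.has(target.text)) return { kind: "dependency" };
  return { kind: "other" }; // if/case/... and ordinary calls land here
}

// Tiny builders so the sketch can be exercised without a parser.
function makeIdentifier(text: string): SketchNode {
  return { type: "identifier", text, children: [] };
}
function makeCall(name: string): SketchNode {
  return { type: "call", text: name, children: [makeIdentifier(name)] };
}

// classify(makeCall("defp")) yields { kind: "function", visibility: "private" }
```

Keeping the macro names in lookup tables means that adding another definition macro is a one-line change, which fits a grammar where the node type carries no information.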
Extracted:
- modules (defmodule, nested) and protocols (defprotocol -> interface)
- functions (def/defp/defmacro/defmacrop/defguard/defguardp/defdelegate) with
public/private visibility; multi-clause defs fold into one symbol (same
qualifiedName) so the resolver isn't left with duplicates
- dependencies (alias/import/require/use), including multi-alias
`alias A.{B, C}` expansion
- defimpl with an `implements` edge to the protocol
- defstruct/defexception as struct nodes
- call edges for qualified `Mod.fun` and local calls, descending through
control-flow special forms (if/case/with/for/...) while not recording them
as calls; module attributes (@doc/@spec/...) are skipped
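Two of the items above, multi-alias expansion and multi-clause folding, can be illustrated with a small sketch. It makes several assumptions: `expandAlias` and `foldClauses` are hypothetical helpers, the alias is handled as a plain string rather than as syntax-tree nodes, and the `qualifiedName` format used in the example (`MyApp.Math.fact`) is invented for illustration. The actual extractor may differ on all of these.

```typescript
// Expand `A.{B, C}` into one fully qualified name per alias.
// A plain target such as `Foo.Bar` comes back as a single entry.
function expandAlias(target: string): string[] {
  const trimmed = target.trim();
  const match = /^([A-Z][\w.]*)\.\{([^}]*)\}$/.exec(trimmed);
  if (!match) return [trimmed];
  const prefix = match[1] ?? "";
  const inner = match[2] ?? "";
  return inner
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => `${prefix}.${part}`);
}

// One entry per function clause as it appears in the source (hypothetical shape).
interface ClauseSketch {
  qualifiedName: string;
  visibility: "public" | "private";
  line: number;
}

// The folded symbol keeps every clause location under a single name.
interface FunctionSymbolSketch {
  qualifiedName: string;
  visibility: "public" | "private";
  clauseLines: number[];
}

// Fold clauses that share a qualifiedName into one symbol, so a later
// resolution step sees one node per function instead of one per clause.
function foldClauses(clauses: ClauseSketch[]): FunctionSymbolSketch[] {
  const byName = new Map<string, FunctionSymbolSketch>();
  for (const clause of clauses) {
    const existing = byName.get(clause.qualifiedName);
    if (existing) {
      existing.clauseLines.push(clause.line);
    } else {
      byName.set(clause.qualifiedName, {
        qualifiedName: clause.qualifiedName,
        visibility: clause.visibility,
        clauseLines: [clause.line],
      });
    }
  }
  return Array.from(byName.values());
}

// Example inputs: an alias with two members, and a two-clause function.
const expanded = expandAlias("A.{B, C}"); // ["A.B", "A.C"]
const folded = foldClauses([
  { qualifiedName: "MyApp.Math.fact", visibility: "public", line: 3 },
  { qualifiedName: "MyApp.Math.fact", visibility: "public", line: 4 },
  { qualifiedName: "MyApp.Math.helper", visibility: "private", line: 7 },
]); // two symbols; the first records lines 3 and 4
```

Keying the map on `qualifiedName` means the first clause decides the symbol's visibility and later clauses only add locations, which is one simple way to avoid duplicate nodes.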
Tested against real Phoenix/OTP source (1900+ LOC modules) — correct module
names, visibility, and hundreds of call edges. 16 new unit tests; full suite
(1506 tests) green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
+1 👌
What
Adds Elixir (`.ex`/`.exs`) to CodeGraph's tree-sitter extraction. The `tree-sitter-elixir.wasm` grammar already ships in `tree-sitter-wasms`, so this wires it into `WASM_GRAMMAR_FILES`/`EXTENSION_MAP` and adds an extractor.
Why a custom approach
tree-sitter-elixir is metaprogramming-first: there are almost no dedicated declaration node types. `defmodule`, `def`, `defp`, `alias`, `import`, `if`, `case`, … all parse as the same shape: `(call target:(identifier) (arguments) (do_block)?)`.
So extraction runs through the `visitNode` hook and dispatches on the macro identifier instead of node types (similar in spirit to Ruby's module handling and Pascal's custom visitor). `callTypes` is empty because the core's `extractCall` keys off a `function` field that Elixir lacks, so the hook records calls itself.
Extracted
- Modules (`defmodule`, nested) and protocols (`defprotocol` → `interface`)
- Functions: `def`/`defp`/`defmacro`/`defmacrop`/`defguard`/`defguardp`/`defdelegate`, with public/private visibility. Multi-clause definitions fold into a single symbol (same `qualifiedName`) so the resolver isn't left with duplicate nodes.
- Dependencies: `alias`/`import`/`require`/`use`, including multi-alias `alias A.{B, C}` expansion into one import each
- `defimpl` with an `implements` edge to the protocol
- `defstruct`/`defexception` as struct nodes
- Call edges for qualified `Mod.fun` and local calls, descending through control-flow special forms (`if`/`case`/`with`/`for`/pipes) while not recording those forms themselves as calls. Module attributes (`@doc`/`@spec`/…) are skipped in this pass.
Files
- `src/extraction/languages/elixir.ts`: new extractor
- `src/extraction/grammars.ts`: grammar + extension registration
- `src/extraction/languages/index.ts`: register in `EXTRACTORS`
- `src/types.ts`: add `elixir` to `Language`
- `__tests__/extraction.test.ts`: 16 tests
- `README.md` / `CHANGELOG.md`
Testing
- … `do:` shorthand, imports (incl. multi-alias), protocol/impl/struct/delegate, call edges
- `tsc --noEmit` clean
Notes
- … `tree-sitter-wasms`: no vendored `.wasm` added (so it's not in the `__dirname/wasm` special-case list in `loadGrammarsForLanguages`).
🤖 Generated with Claude Code