aboutsummaryrefslogtreecommitdiff
path: root/src/nvim/lua/treesitter.c
Commit message (Collapse)AuthorAge
* refactor(treesitter): always return valid range from parse() #32273Riley Bruins2025-02-02
| | | | | | | | | | | Problem: When running an initial parse, parse() returns an empty table rather than an actual range. In `languagetree.lua`, we manually check if a parse was incremental to determine the changed parse region. Solution: - Always return a range (in the C side) from parse(). - Simplify the language tree code a bit. - Logger no longer shows empty ranges on the initial parse.
* build(deps)!: bump tree-sitter to HEAD, wasmtime to v29.0.1 (#32200)Christian Clason2025-01-27
| | | | | Breaking change: `ts_node_child_containing_descendant()` was removed Breaking change: tree-sitter 0.25 (HEAD) required
* fix(treesitter): uv_dlclose after uv_dlerrorHorror Proton2025-01-14
|
* feat(treesitter): async parsingRiley Bruins2025-01-12
| | | | | | | | | | | | **Problem:** Parsing can be slow for large files, and it is a blocking operation which can be disruptive and annoying. **Solution:** Provide a function for asynchronous parsing, which accepts a callback to be run after parsing completes. Co-authored-by: Lewis Russell <lewis6991@gmail.com> Co-authored-by: Luuk van Baal <luukvbaal@gmail.com> Co-authored-by: VanaIgr <vanaigranov@gmail.com>
* fix: fix broken wasmtime builddundargoc2024-12-23
| | | | | | Regression from 2a7d0ed6145bf3f8b139c2694563f460f829813a, which removed header that is only needed if wasmtime support is enabled. Prevent this from happening again by wrapping the include in a `HAVE_WASMTIME` check.
* refactor: iwyu #31637Justin M. Keyes2024-12-23
| | | Result of `make iwyu` (after some "fixups").
* fix(treesitter): show proper node name error messagesRiley Bruins2024-11-13
| | | | | | | | | | | **Problem:** Currently node names with non-alphanumeric, non underscore/hyphen characters (only possible with anonymous nodes) are not given a proper error message. See tree-sitter issue 3892 for more details. **Solution:** Apply a different scanning logic to anonymous nodes to correctly identify the entire node name (i.e., up until the final double quote)
* fix(treesitter): correct condition in `__has_ancestor`Amaan Qureshi2024-10-27
|
* fix(treesitter): mark supertype nodes as namedRiley Bruins2024-10-12
| | | | | | | | | | | **Problem:** Tree-sitter 0.24.0 introduced a new symbol type to denote supertype nodes (`TSSymbolTypeSupertype`). Now, `language.inspect()` (and the query `omnifunc`) return supertype symbols, but with double quotes around them. **Solution:** Mark a symbol as "named" based on it *not* being an anonymous node, rather than checking that it is a regular node (which a supertype also is not).
* fix(treesitter): remove duplicate symbol names in language.inspect()Riley Bruins2024-10-11
| | | | | | | | | | | | | | | | | **Problems:** - `vim.treesitter.language.inspect()` returns duplicate symbol names, sometimes up to 6 of one kind in the case of `markdown` - The list-like `symbols` table can have holes and is thus not even a valid msgpack table anyway, mentioned in a test **Solution:** Return symbols as a map, rather than a list, where field names are the names of the symbol. The boolean value associated with the field encodes whether or not the symbol is named. Note that anonymous nodes are surrounded with double quotes (`"`) to prevent potential collisions with named counterparts that have the same identifier.
* feat(treesitter): introduce child_with_descendant()Riley Bruins2024-10-11
| | | | | | This commit also marks `child_containing_descendant()` as deprecated (per upstream's documentation), and uses `child_with_descendant()` in its place. Minimum required tree-sitter version will now be `0.24`.
* perf(treesitter): do not use tree cursors with a small lifetimeLewis Russell2024-10-03
| | | | | | | | | Problem: Tree cursors can only be efficient when they are re-used. Short-lived cursors are very slow. Solution: Reimplement functions that use short-lived cursors.
* feat(treesitter): add support for wasm parsersLewis Russell2024-08-26
| | | | | | | | | | | | | | | | | | | | Problem: Installing treesitter parser is hard (harder than climbing to heaven). Solution: Add optional support for wasm parsers with `wasmtime`. Notes: * Needs to be enabled by setting `ENABLE_WASMTIME` for tree-sitter and Neovim. Build with `make CMAKE_EXTRA_FLAGS=-DENABLE_WASMTIME=ON DEPS_CMAKE_FLAGS=-DENABLE_WASMTIME=ON` * Adds optional Rust (obviously) and C11 dependencies. * Wasmtime comes with a lot of features that can negatively affect Neovim performance due to library and symbol table size. Make sure to build with minimal features and full LTO. * To reduce re-compilation times, install `sccache` and build with `RUSTC_WRAPPER=<path/to/sccache> make ...`
* fixup: apply the change on more filesJames Tirta Halim2024-06-04
|
* perf(treesitter): use child_containing_descendant() in has-ancestor? (#28512)vanaigr2024-05-16
| | | | | | | | | Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time. This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested. Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n). For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.
* fix(treesitter): make tests for memoize more robustbfredl2024-04-29
| | | | | | | | | | | | Instead of painfully messing with timing to determine if queries were reparsed, we can simply keep a counter next to the call to ts_query_new Also memoization had a hidden dependency on the garbage collection of the the key, a hash value which never is kept around in memory. this was done intentionally as the hash does not capture all relevant state for the query (external included files) even if actual query objects still would be reachable in memory. To make the test fully deterministic in CI, we explicitly control GC.
* refactor(treesitter): language loadingLewis Russell2024-04-21
|
* refactor(treesitter): handle coverity warnings betterLewis Russell2024-03-20
|
* fix(treesitter): treecursor regressionLewis Russell2024-03-20
| | | | | | - Also address some coverity warnings Fixes #27942
* refactor(treesitter): reorder functionsLewis Russell2024-03-19
|
* refactor(treesitter): simplify argument checks for userdataLewis Russell2024-03-19
|
* refactor(treesitter): redesign query iteratingLewis Russell2024-03-19
| | | | | | | | | | | | | | | | Problem: `TSNode:_rawquery()` is complicated, has known issues and the Lua and C code is awkwardly coupled (see logic with `active`). Solution: - Add `TSQueryCursor` and `TSQueryMatch` bindings. - Replace `TSNode:_rawquery()` with `TSQueryCursor:next_capture()` and `TSQueryCursor:next_match()` - Do more stuff in Lua - API for `Query:iter_captures()` and `Query:iter_matches()` remains the same. - `treesitter.c` no longer contains any logic related to predicates. - Add `match_limit` option to `iter_matches()`. Default is still 256.
* refactor: use ml_get_buf_len() in API code (#27825)zeertzjq2024-03-12
|
* fix(treesitter): correctly handle query quantifiers (#24738)Thomas Vigouroux2024-02-16
| | | | | | | | | | | | | | | | | | | Query patterns can contain quantifiers (e.g. (foo)+ @bar), so a single capture can map to multiple nodes. The iter_matches API can not handle this situation because the match table incorrectly maps capture indices to a single node instead of to an array of nodes. The match table should be updated to map capture indices to an array of nodes. However, this is a massively breaking change, so must be done with a proper deprecation period. `iter_matches`, `add_predicate` and `add_directive` must opt-in to the correct behavior for backward compatibility. This is done with a new "all" option. This option will become the default and removed after the 0.10 release. Co-authored-by: Christian Clason <c.clason@uni-graz.at> Co-authored-by: MDeiml <matthias@deiml.net> Co-authored-by: Gregory Anders <greg@gpanders.com>
* refactor(treesitter): typing for Query, TSQuery, and TSQueryInfoJongwook Choi2024-02-08
| | | | | | | | | | | | | | | | | | | | - `TSQuery`: userdata object for parsed query. - `vim.treesitter.Query`: renamed from `Query`. - Add a new field `lang`. - `TSQueryInfo`: - Move to `vim/treesitter/_meta.lua`, because C code owns it. - Correct typing for `patterns`, should be a map from `integer` (pattern_id) to `(integer|string)[][]` (list of predicates or directives). - `vim.treesitter.QueryInfo` is added. - This currently has the same structure as `TSQueryInfo` (exported from C code). - Document the fields (see `TSQuery:inspect`). - Add typing for `vim._ts_parse_query()`.
* docs: enforce "treesitter" spelling #27110Jongwook Choi2024-01-28
| | | It's the "tree-sitter" project, but "treesitter" in our code and docs.
* fixup: raise TS min versionChristian Clason2024-01-25
|
* refactor: fix headers with IWYUdundargoc2023-11-28
|
* refactor: rename types.h to types_defs.hdundargoc2023-11-27
|
* build(IWYU): fix includes for undo_defs.hdundargoc2023-11-27
|
* refactor: move Arena and ArenaMem to memory_defs.h (#26240)zeertzjq2023-11-27
|
* build: remove PVSdundargoc2023-11-12
| | | | | | | We already have an extensive suite of static analysis tools we use, which causes a fair bit of redundancy as we get duplicate warnings. PVS is also prone to give false warnings which creates a lot of work to identify and disable.
* build(lint): remove unnecessary clint.py rulesdundargoc2023-10-23
| | | | | Uncrustify is the source of truth where possible. Remove any redundant checks from clint.py.
* refactor: the long goodbyedundargoc2023-10-09
| | | | | | long is 32 bits on windows, while it is 64 bits on other architectures. This makes the type suboptimal for a codebase meant to be cross-platform. Replace it with more appropriate integer types.
* build(iwyu): add a few more _defs.h mappings (#25435)zeertzjq2023-09-30
|
* refactor(map): enhanced implementation, Clean Code™, etc etcbfredl2023-09-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This involves two redesigns of the map.c implementations: 1. Change of macro style and code organization The old khash.h and map.c implementation used huge #define blocks with a lot of backslash line continuations. This instead uses the "implementation file" .c.h pattern. Such a file is meant to be included multiple times, with different macros set prior to inclusion as parameters. we already use this pattern e.g. for eval/typval_encode.c.h to implement different typval encoders reusing a similar structure. We can structure this code into two parts. one that only depends on key type and is enough to implement sets, and one which depends on both key and value to implement maps (as a wrapper around sets, with an added value[] array) 2. Separate the main hash buckets from the key / value arrays Change the hack buckets to only contain an index into separate key / value arrays This is a common pattern in modern, state of the art hashmap implementations. Even though this leads to one more allocated array, it is this often is a net reduction of memory consumption. Consider key+value consuming at least 12 bytes per pair. On average, we will have twice as many buckets per item. Thus old implementation: 2*12 = 24 bytes per item New implementation 1*12 + 2*4 = 20 bytes per item And the difference gets bigger with larger items. One might think we have pulled a fast one here, as wouldn't the average size of the new key/value arrays be 1.5 slots per items due to amortized grows? But remember, these arrays are fully dense, and thus the accessed memory, measured in _cache lines_, the unit which actually matters, will be the fully used memory but just rounded up to the nearest cache line boundary. This has some other interesting properties, such as an insert-only set/map will be fully ordered by insert only. Preserving this ordering in face of deletions is more tricky tho. As we currently don't use ordered maps, the "delete" operation maintains compactness of the item arrays in the simplest way by breaking the ordering. It would be possible to implement an order-preserving delete although at some cost, like allowing the items array to become non-dense until the next rehash. Finally, in face of these two major changes, all code used in khash.h has been integrated into map.c and friends. Given the heavy edits it makes no sense to "layer" the code into a vendored and a wrapper part. Rather, the layered cake follows the specialization depth: code shared for all maps, code specialized to a key type (and its equivalence relation), and finally code specialized to value+key type.
* fix(query_error): multiline bugLewis Russell2023-08-31
|
* feat(treesitter): improve query error messageAmaan Qureshi2023-08-31
|
* fix(treesitter.c): improve comments on fenv usageLewis Russell2023-08-30
|
* fix(treesitter): fix another TSNode:tree() double freebfredl2023-08-29
| | | | | | | Unfortunately the gc=false objects can refer to a dangling tree if the gc=true tree was freed first. This reuses the same tree object as the node itself is keeping alive via the uservalue of the node userdata. (wrapped in a table due to lua 5.1 restrictions)
* fix(treesitter): fix TSNode:tree() double free (#24796)nwounkn2023-08-29
| | | | | | | | | Problem: `push_tree`, every time its called for the same TSTree with `do_copy=false` argument, creates a new userdata for it. Each userdata, when garbage collected, frees the same TSTree C object. Solution: Add flag to userdata, which indicates, should C object, which userdata points to, be freed, when userdata is garbage collected.
* refactor(memline): distinguish mutating uses of ml_get_buf()bfredl2023-08-24
| | | | | | | | | | | | | | ml_get_buf() takes a third parameters to indicate whether the caller wants to mutate the memline data in place. However the vast majority of the call sites is using this function just to specify a buffer but without any mutation. This makes it harder to grep for the places which actually perform mutation. Solution: Remove the bool param from ml_get_buf(). it now works like ml_get() except for a non-current buffer. Add a new ml_get_buf_mut() function for the mutating use-case, which can be grepped along with the other ml_replace() etc functions which can modify the memline.
* fix(treesitter): logger memory leakLewis Russell2023-08-13
|
* build(deps): bump tree-sitter to HEAD - 0a1c4d846 (#24607)Christian Clason2023-08-08
| | | | adapt to breaking change in `ts_query_cursor_set_max_start_depth` https://github.com/tree-sitter/tree-sitter/pull/2278
* Merge pull request #15534 from bfredl/monomapbfredl2023-05-17
|\ | | | | refactor(map): avoid duplicated khash_t implementations for values and support sets
| * refactor(map): avoid duplicated khash_t types for valuesbfredl2023-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reduces the total number of khash_t instantiations from 22 to 8. Make the khash internal functions take the size of values as a runtime parameter. This is abstracted with typesafe Map containers which are still specialized for both key, value type. Introduce `Set(key)` type for when there is no value. Refactor shada.c to use Map/Set instead of khash directly. This requires `map_ref` operation to be more flexible. Return pointers to both key and value, plus an indicator for new_item. As a bonus, `map_key` is now redundant. Instead of Map(cstr_t, FileMarks), use a pointer map as the FileMarks struct is humongous. Make `event_strings` actually work like an intern pool instead of wtf it was doing before.
* | feat(treesitter): improved logging (#23638)Lewis Russell2023-05-17
|/ | | | | | - Add bindings to Treesitter ts_parser_set_logger and ts_parser_logger - Add logfile with path STDPATH('log')/treesitter.c - Rework existing LanguageTree loggin to use logfile - Begin implementing log levels for vim.g.__ts_debug
* fix(treesitter): reset cursor max_start_depthLewis Russell2023-05-11
|
* feat(treesitter): add support for setting query depthsLewis Russell2023-05-11
|
* refactor: uncrustifydundargoc2023-04-26
| | | | Notable changes: replace all infinite loops to `while(true)` and remove `int` from `unsigned int`.