path: root/utf8parse/src
Commit message (Author, Age)
* Switch parser to multi-byte processing (Christian Duerr, 2025-01-09)

  This patch overhauls the `Parser::advance` API to operate on byte slices instead of individual bytes, which allows for additional performance optimizations.

  VTE does not support C1 escapes, and C0 escapes always start with an escape character. This makes it possible to simplify processing if a byte stream is determined to not contain any escapes. The `memchr` crate provides a battle-tested implementation for SIMD-accelerated byte searches, which is why this implementation makes use of it.

  VTE also only supports UTF-8 characters in the ground state, which means that the new non-escape parsing path is able to rely completely on std's `str::from_utf8`, since `memchr` gives us the full length of the plain text character buffer. This allows us to completely remove `utf8parse` and all related code.

  We also make use of `memchr` in the synchronized escape handling in `ansi.rs`, since it relies heavily on scanning large amounts of text for the extension/termination escape sequences.
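  The fast path described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `split_at_escape` and `plain_text` are hypothetical names, and a plain `Iterator::position` search stands in for the SIMD-accelerated search that the `memchr` crate provides, so that the sketch needs only the standard library.

```rust
/// ESC (0x1B), the byte that starts every C0 escape sequence.
const ESC: u8 = 0x1B;

/// Splits `bytes` into the run before the first ESC and the remainder.
/// Hypothetical helper; the linear search is a stand-in for `memchr`.
fn split_at_escape(bytes: &[u8]) -> (&[u8], &[u8]) {
    let end = bytes.iter().position(|&b| b == ESC).unwrap_or(bytes.len());
    bytes.split_at(end)
}

/// Returns the longest valid UTF-8 prefix of the escape-free run, plus
/// everything that still needs to be processed (invalid bytes or escapes).
fn plain_text(bytes: &[u8]) -> (&str, &[u8]) {
    let (text, rest) = split_at_escape(bytes);
    match std::str::from_utf8(text) {
        // The whole run is valid UTF-8, so it can be handed over in one piece.
        Ok(s) => (s, rest),
        // Otherwise keep only the valid prefix; the caller must deal with
        // the invalid or incomplete bytes that follow it.
        Err(e) => {
            let valid = e.valid_up_to();
            let prefix = std::str::from_utf8(&text[..valid])
                .expect("prefix up to valid_up_to() is valid UTF-8");
            (prefix, &bytes[valid..])
        }
    }
}
```

  For the input `b"hi\x1b[0m"` this yields the text `"hi"` and leaves `b"\x1b[0m"` for the escape state machine. A real parser would additionally have to handle UTF-8 sequences split across two `advance` calls, which this sketch ignores.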
* Migrate `ansi` from `alacritty_terminal` (Anhad Singh, 2023-05-14)

  Signed-off-by: Andy-Python-Programmer <andypythonappdeveloper@gmail.com>
  Signed-off-by: Anhad Singh <andypythonappdeveloper@gmail.com>
  Co-authored-by: Nicholas Sim <nsim@posteo.net>
  Co-authored-by: Christian Duerr <contact@christianduerr.com>
* Add trivial derives to `utf8parser::Parser` (Ed Page, 2023-03-08)

  Much like `std::ops::Range`, we likely don't want this to be `Copy`, as that makes it too easy to get mixed up about which state you are using, but `Clone` should be explicit enough to be safe.

  `PartialOrd` / `Ord` were left off because there isn't really a user-facing ordering to these types.

  `Hash` was left off as the use cases for it aren't clear.
* Migrate to 2021 edition (Kirill Chibisov, 2022-01-16)
* Pass terminator to osc dispatcher (Christian Duerr, 2020-01-29)

  Even though the ST terminator is the only officially supported terminator, some applications still rely on BEL to work properly. Both have been supported historically, but there was no way for the terminal to tell which terminator was used.

  Since OSC escapes frequently offer the `?` parameter to query for the current format, some applications expect the response terminator to match the request terminator. To make it possible to support this, the osc_dispatcher is now informed when the BEL terminator was used.

  Since the C1 ST terminator was not yet supported for OSC escapes, support for it has also been added.
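  As an illustration of why the dispatcher needs this information, a terminal could mirror the request's terminator in its reply along these lines. This is a sketch under assumptions: `osc_reply` is a hypothetical helper rather than part of the crate, and the `bell_terminated` flag is assumed to be what the dispatcher receives.

```rust
/// Builds the reply to an OSC query, ending it with the same terminator
/// the application used in its request. Hypothetical helper.
fn osc_reply(payload: &str, bell_terminated: bool) -> String {
    // BEL is 0x07; the 7-bit form of ST is ESC followed by a backslash.
    let terminator = if bell_terminated { "\x07" } else { "\x1b\\" };
    // OSC is introduced by ESC followed by `]`.
    format!("\x1b]{}{}", payload, terminator)
}
```

  A query terminated with BEL would then be answered with `osc_reply(payload, true)`, and one terminated with ST with `osc_reply(payload, false)`.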
* Remove table generation (Christian Duerr, 2019-12-10)

  This completely removes the `codegen` project, which relied on outdated libraries to parse DSLs to build the utf8 and vte state tables, to make the library easier to maintain.

  The utf8 table could be completely removed in favor of a `match` statement, which also led to a performance improvement with the utf8 parser.

  The vte table did not benefit from `match` statements at all and instead had significantly worse performance with them. To replace the old code generation for vte, the `generate_state_changes` crate has been created instead, which uses the language's proc_macro feature to create a `const fn` which will generate the table at compile time.
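  The two techniques mentioned here, a `match` in place of a lookup table and a `const fn` that fills a table at compile time, can be illustrated with a toy example. Everything below is hypothetical and far simpler than the crate's real state tables: `utf8_len`, `build_table` and `UTF8_LEN` are names invented for this sketch, and no proc macro is involved.

```rust
/// Length of a UTF-8 sequence judged by its first byte, or 0 if the byte
/// cannot start a sequence. A `match` like this can replace a table.
const fn utf8_len(first: u8) -> u8 {
    match first {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        // Continuation bytes and bytes that never occur in UTF-8.
        _ => 0,
    }
}

/// Fills a 256-entry lookup table. Because this is a `const fn` used in a
/// `static` initializer, the loop runs at compile time.
const fn build_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = utf8_len(i as u8);
        i += 1;
    }
    table
}

/// The table is computed by the compiler and stored in the binary.
static UTF8_LEN: [u8; 256] = build_table();
```

  Whether a `match` or a table lookup is faster depends on the workload, which is consistent with the commit's observation that the two parsers behaved differently.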
* Update to Rust 2018 (Christian Duerr, 2019-11-23)

  This moves all crates in the workspace to the latest Rust standard and resolves various style and formatting issues.

  Fixes #32.
* Fix the utf8parse tests too. (Nathan Lilienthal, 2018-01-10)
* no_std (#9) (M Farkas-Dyck, 2017-11-18)
* adds UTF8parse test and associated UTF-8 test file (Liz Baillie, 2016-10-21)
* Move utf8 parsing into separate crate (Joe Wilm, 2016-09-17)