path: root/src/nvim/generators/gen_unicode_tables.lua
Commit message (Author, Age)
* refactor(multibyte): replace generated unicode tables with utf8proc (bfredl, 2024-08-31)
  This commit intentionally aims to preserve existing behavior as much as possible while replacing our build step, which converted Unicode data files into binary tables, with corresponding lookups in utf8proc. Actual improvements in behavior will be a followup.

  The only change in behavior is that the 'emoji' option will turn some more codepoints double-width. Nvim used the "Emoji" and "Emoji_Presentation" properties to define emojis, while utf8proc only exposes the Extended_Pictographic property from the emoji table. This is a superset of the previous emoji properties.

  As only codepoints above 0x1f000 are affected by the 'emoji' option, this means that the following codepoints are now treated as double-width, instead of single-width as in previous Nvim versions:

  U+1F000..U+1F003, U+1F005..U+1F02B, U+1F030..U+1F093, U+1F0A0..U+1F0AE, U+1F0B1..U+1F0BF, U+1F0C1..U+1F0CE, U+1F0D1..U+1F0F5, U+1F10D..U+1F10F, U+1F12F, U+1F16C..U+1F16F, U+1F1AD, U+1F322..U+1F323, U+1F394..U+1F395, U+1F398, U+1F39C..U+1F39D, U+1F3F1..U+1F3F2, U+1F3F6, U+1F4FE, U+1F546..U+1F548, U+1F54F, U+1F568..U+1F56E, U+1F571..U+1F572, U+1F57B..U+1F586, U+1F588..U+1F589, U+1F58E..U+1F58F, U+1F591..U+1F594, U+1F597..U+1F5A3, U+1F5A6..U+1F5A7, U+1F5A9..U+1F5B0, U+1F5B3..U+1F5BB, U+1F5BD..U+1F5C1, U+1F5C5..U+1F5D0, U+1F5D4..U+1F5DB, U+1F5DF..U+1F5E0, U+1F5E2, U+1F5E4..U+1F5E7, U+1F5E9..U+1F5EE, U+1F5F0..U+1F5F2, U+1F5F4..U+1F5F9, U+1F6C6..U+1F6CA, U+1F6D3..U+1F6D4, U+1F6E6..U+1F6E8, U+1F6EA, U+1F6F1..U+1F6F2, U+1F774..U+1F776, U+1F77B..U+1F77F, U+1F7D5..U+1F7D9, U+1F8B0..U+1F8B1, U+1FA00..U+1FA53, U+1FA60..U+1FA6D
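  The width rule described in this commit can be sketched roughly as follows. This is a simplified illustration, not Nvim's or utf8proc's code: `SAMPLE_PICTOGRAPHIC` is a hypothetical one-range sample (the Mahjong tiles starting at U+1F000), `cell_width` is an invented name, and the sketch ignores every other source of double width, such as East Asian wide characters.

```python
# Hypothetical sample standing in for the Extended_Pictographic lookup.
# The real property data comes from utf8proc; this is one range only.
SAMPLE_PICTOGRAPHIC = [(0x1F000, 0x1F02B)]


def cell_width(cp: int, emoji: bool = True) -> int:
    """Return a display width of 1 or 2 for code point `cp`.

    Simplified: only the 'emoji' rule from the commit message is modeled.
    """
    in_sample = any(lo <= cp <= hi for lo, hi in SAMPLE_PICTOGRAPHIC)
    # Only code points at or above 0x1F000 are affected by the option.
    if emoji and cp >= 0x1F000 and in_sample:
        return 2
    return 1


if __name__ == "__main__":
    print(cell_width(0x1F000))               # option on
    print(cell_width(0x1F000, emoji=False))  # option off
    print(cell_width(ord("a")))              # unaffected ASCII
```

  With the option off, the same code point falls back to single width, which matches the commit's statement that only the 'emoji' option triggers the change.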
* refactor!: use utf8proc full casefolding (dundargoc, 2024-08-07)
  According to `CaseFolding-15.1.0.txt`, full casefolding should be preferred over simple casefolding as it's considered to be more correct. Since utf8proc already provides full casefolding it makes sense to switch to it. This will also remove a lot of unnecessary build code.

  Temporary exceptions are made for two sets of characters:
  - `ß` will still be considered `ß` (instead of `ss`) as using full casefolding requires interfering with upstream spell files in some form.
  - `İ` will still be considered `İ` (instead of `i̇`) as using full casefolding requires making a value judgement on the "correct" behavior. There are two equally valid case-insensitive comparisons for this character according to Unicode. It is essentially up to the implementor to decide which conversion is correct. For this reason it might make sense to allow users to decide which conversion should be done as an added option to `casemap` in a future PR.
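  To make the two exceptions concrete, here is a small sketch using Python's built-in `str.casefold()`, which applies Unicode full case folding. It is only an illustration of the behavior the commit message describes; `KEEP` and `fold_with_exceptions` are hypothetical names, and nothing here reflects how Nvim or utf8proc implement it.

```python
# Characters the commit message says are temporarily left unfolded.
KEEP = {"\u00df", "\u0130"}  # ß and İ


def fold_with_exceptions(text: str) -> str:
    """Full-casefold `text`, leaving the characters in KEEP untouched."""
    return "".join(ch if ch in KEEP else ch.casefold() for ch in text)


if __name__ == "__main__":
    # Plain full case folding expands both characters.
    print("\u00df".casefold() == "ss")        # ß folds to "ss"
    print("\u0130".casefold() == "i\u0307")   # İ folds to i + combining dot
    # With the exceptions, both survive unchanged.
    print(fold_with_exceptions("Stra\u00dfe"))
    print(fold_with_exceptions("\u0130stanbul"))
```

  Folding character by character is enough here because full case folding does not depend on neighboring characters.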
* refactor: replace utf_convert with utf8proc conversion functions (dundargoc, 2024-06-28)
* build: enable lintlua for src/ dir #26395 (Justin M. Keyes, 2023-12-04)
  Problem:
  Not all Lua code is checked by stylua. Automating code-style is an important mechanism for reducing time spent on accidental (non-essential) complexity.

  Solution:
  - Enable lintlua for `src/` directory.

  followup to 517f0cc634b985057da5b95cf4ad659ee456a77e
* vim-patch:9.0.0666: spacing-combining characters handled as composing (#20501) (zeertzjq, 2022-10-07)
  Problem:  Spacing-combining characters handled as composing, causing text to take more space than expected.
  Solution: Handle characters marked with "Mc" not as composing. (closes vim/vim#11282)

  https://github.com/vim/vim/commit/7beaf6a720ddc7e2989c8831872bfb98ec78a65d
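  The distinction this patch relies on is the Unicode general category: "Mc" marks spacing combining characters, while "Mn" and "Me" mark nonspacing and enclosing ones. A minimal sketch with Python's standard `unicodedata` module follows; `is_composing` is a hypothetical helper that mirrors the rule as stated in the commit message, not the actual Vim or Nvim implementation.

```python
import unicodedata


def is_composing(ch: str) -> bool:
    """Treat nonspacing (Mn) and enclosing (Me) marks as composing.

    Spacing combining marks (Mc) occupy their own cell, so they are
    deliberately excluded, following the rule in the commit message.
    """
    return unicodedata.category(ch) in ("Mn", "Me")


if __name__ == "__main__":
    print(unicodedata.category("\u0301"), is_composing("\u0301"))  # combining acute accent
    print(unicodedata.category("\u0903"), is_composing("\u0903"))  # Devanagari sign visarga
```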
* vim-patch:8.2.1535: it is not possible to specify cell widths of characters (zeertzjq, 2022-08-08)
  Problem:  It is not possible to specify cell widths of characters.
  Solution: Add setcellwidths().

  https://github.com/vim/vim/commit/08aac3c6192f0103cb87e280270a32b50e653be1

  Co-Authored-By: delphinus <me@delphinus.dev>
* Fix luacheck errors for all Lua source files (Sameed Ali, 2019-07-04)
* generators: fix filename typo in help message (Jan Edmund Lazo, 2019-04-13)
* generators: separate source generators from scripts (Björn Linse, 2017-05-10)