diff options
author | bfredl <bjorn.linse@gmail.com> | 2024-08-08 10:42:08 +0200 |
---|---|---|
committer | bfredl <bjorn.linse@gmail.com> | 2024-08-30 11:49:09 +0200 |
commit | cfdf68a7acde16597fbd896674af68c42361102c (patch) | |
tree | 6113193fda7a7c0f94577a464e39964e74311583 /runtime | |
parent | 4353996d0fa8e5872a334d68196d8088391960cf (diff) | |
download | rneovim-cfdf68a7acde16597fbd896674af68c42361102c.tar.gz rneovim-cfdf68a7acde16597fbd896674af68c42361102c.tar.bz2 rneovim-cfdf68a7acde16597fbd896674af68c42361102c.zip |
feat(mbyte): support extended grapheme clusters including more emoji
Use the grapheme break algorithm from utf8proc to support grapheme
clusters from recent unicode versions.
Handle variant selector VS16 turning some codepoints into double-width
emoji. This means we need to use ptr2cells rather than char2cells when
possible.
Diffstat (limited to 'runtime')
-rw-r--r-- | runtime/doc/mbyte.txt | 6 | ||||
-rw-r--r-- | runtime/doc/news.txt | 6 | ||||
-rw-r--r-- | runtime/doc/options.txt | 9 | ||||
-rw-r--r-- | runtime/lua/vim/_meta/options.lua | 9 |
4 files changed, 24 insertions, 6 deletions
diff --git a/runtime/doc/mbyte.txt b/runtime/doc/mbyte.txt index a8c5670352..47fd4f3343 100644 --- a/runtime/doc/mbyte.txt +++ b/runtime/doc/mbyte.txt @@ -646,6 +646,12 @@ widespread as file format. A composing or combining character is used to change the meaning of the character before it. The combining characters are drawn on top of the preceding character. + +Nvim largely follows the definition of extended grapheme clusters in UAX#29 +in the Unicode standard, with some modifications: An ascii char will always +start a new cluster. In addition 'arabicshape' enables the combining of some +arabic letters, when they are shaped to be displayed together in a single cell. + Too big combined characters cannot be displayed, but they can still be inspected using the |g8| and |ga| commands described below. When editing text a composing character is mostly considered part of the diff --git a/runtime/doc/news.txt b/runtime/doc/news.txt index 80511ccb87..b7e1e0c84f 100644 --- a/runtime/doc/news.txt +++ b/runtime/doc/news.txt @@ -200,6 +200,12 @@ These existing features changed their behavior. top lines are calculated using screen line numbers which take virtual lines into account. +• The implementation of grapheme clusters (or combining chars |mbyte-combining|) + was upgraded to closely follow extended grapheme clusters as defined by UAX#29 + in the unicode standard. Noteworthily, this enables proper display of many + more emoji characters than before, including those encoded with multiple + emoji codepoints combined with ZWJ (zero width joiner) codepoints. + ============================================================================== REMOVED FEATURES *news-removed* diff --git a/runtime/doc/options.txt b/runtime/doc/options.txt index f44e0954a5..4945a1b46d 100644 --- a/runtime/doc/options.txt +++ b/runtime/doc/options.txt @@ -2217,9 +2217,12 @@ A jump table for the options with a short description can be found at |Q_op|. global When on all Unicode emoji characters are considered to be full width. This excludes "text emoji" characters, which are normally displayed as - single width. Unfortunately there is no good specification for this - and it has been determined on trial-and-error basis. Use the - |setcellwidths()| function to change the behavior. + single width. However, such "text emoji" are treated as full-width + emoji if they are followed by the U+FE0F variant selector. + + Unfortunately there is no good specification for this and it has been + determined on trial-and-error basis. Use the |setcellwidths()| + function to change the behavior. *'encoding'* *'enc'* 'encoding' 'enc' string (default "utf-8") diff --git a/runtime/lua/vim/_meta/options.lua b/runtime/lua/vim/_meta/options.lua index b4ac478b61..05c9b89d77 100644 --- a/runtime/lua/vim/_meta/options.lua +++ b/runtime/lua/vim/_meta/options.lua @@ -1829,9 +1829,12 @@ vim.go.ead = vim.go.eadirection --- When on all Unicode emoji characters are considered to be full width. --- This excludes "text emoji" characters, which are normally displayed as ---- single width. Unfortunately there is no good specification for this ---- and it has been determined on trial-and-error basis. Use the ---- `setcellwidths()` function to change the behavior. +--- single width. However, such "text emoji" are treated as full-width +--- emoji if they are followed by the U+FE0F variant selector. +--- +--- Unfortunately there is no good specification for this and it has been +--- determined on trial-and-error basis. Use the `setcellwidths()` +--- function to change the behavior. --- --- @type boolean vim.o.emoji = true |