author bfredl <bjorn.linse@gmail.com> 2024-08-08 10:42:08 +0200
committer bfredl <bjorn.linse@gmail.com> 2024-08-30 11:49:09 +0200
commit cfdf68a7acde16597fbd896674af68c42361102c
tree 6113193fda7a7c0f94577a464e39964e74311583 /runtime
parent 4353996d0fa8e5872a334d68196d8088391960cf
feat(mbyte): support extended grapheme clusters including more emoji
Use the grapheme break algorithm from utf8proc to support grapheme clusters from recent unicode versions. Handle variant selector VS16 turning some codepoints into double-width emoji. This means we need to use ptr2cells rather than char2cells when possible.
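To illustrate why a per-codepoint width lookup is not enough once VS16 is handled, here is a minimal Python sketch. It is not Nvim's implementation: `base_width`, `naive_width`, `cluster_width`, and the tiny `TEXT_EMOJI` set are hypothetical names and data chosen for illustration, and widths are approximated with Python's `unicodedata` tables rather than Nvim's own.

```python
import unicodedata

VS16 = "\uFE0F"  # U+FE0F, the selector the commit message calls VS16

# Hypothetical, deliberately tiny sample of "text emoji" (single width by default).
TEXT_EMOJI = {"\u263A", "\u2764", "\u2602"}


def base_width(ch: str) -> int:
    """Approximate cell width of one codepoint, with no knowledge of its neighbours."""
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters occupy no cell of their own
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def naive_width(cluster: str) -> int:
    """Sum of per-codepoint widths: the view a char-by-char lookup has."""
    return sum(base_width(ch) for ch in cluster)


def cluster_width(cluster: str) -> int:
    """Width of a cluster when the codepoint after the base can be inspected."""
    if not cluster:
        return 0
    width = base_width(cluster[0])
    # Assumed rule for this sketch: a text emoji directly followed by VS16 is double width.
    if cluster[0] in TEXT_EMOJI and cluster[1:2] == VS16:
        width = 2
    return width
```

For `"\u263A\uFE0F"` the per-codepoint sum sees a single-width symbol plus a zero-width selector, while the cluster-aware function can see the selector following the base and widen the cell. That is the kind of lookahead that needs the whole string (as with `ptr2cells`) rather than one codepoint at a time (as with `char2cells`).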
Diffstat (limited to 'runtime')
-rw-r--r-- runtime/doc/mbyte.txt 6
-rw-r--r-- runtime/doc/news.txt 6
-rw-r--r-- runtime/doc/options.txt 9
-rw-r--r-- runtime/lua/vim/_meta/options.lua 9
4 files changed, 24 insertions, 6 deletions
diff --git a/runtime/doc/mbyte.txt b/runtime/doc/mbyte.txt
index a8c5670352..47fd4f3343 100644
--- a/runtime/doc/mbyte.txt
+++ b/runtime/doc/mbyte.txt
@@ -646,6 +646,12 @@ widespread as file format.
A composing or combining character is used to change the meaning of the
character before it. The combining characters are drawn on top of the
preceding character.
+
+Nvim largely follows the definition of extended grapheme clusters in UAX#29
+in the Unicode standard, with some modifications: An ascii char will always
+start a new cluster. In addition 'arabicshape' enables the combining of some
+arabic letters, when they are shaped to be displayed together in a single cell.
+
Too big combined characters cannot be displayed, but they can still be
inspected using the |g8| and |ga| commands described below.
When editing text a composing character is mostly considered part of the
diff --git a/runtime/doc/news.txt b/runtime/doc/news.txt
index 80511ccb87..b7e1e0c84f 100644
--- a/runtime/doc/news.txt
+++ b/runtime/doc/news.txt
@@ -200,6 +200,12 @@ These existing features changed their behavior.
top lines are calculated using screen line numbers which take virtual lines
into account.
+• The implementation of grapheme clusters (or combining chars |mbyte-combining|)
+ was upgraded to closely follow extended grapheme clusters as defined by UAX#29
+ in the unicode standard. Noteworthily, this enables proper display of many
+ more emoji characters than before, including those encoded with multiple
+ emoji codepoints combined with ZWJ (zero width joiner) codepoints.
+
==============================================================================
REMOVED FEATURES *news-removed*
diff --git a/runtime/doc/options.txt b/runtime/doc/options.txt
index f44e0954a5..4945a1b46d 100644
--- a/runtime/doc/options.txt
+++ b/runtime/doc/options.txt
@@ -2217,9 +2217,12 @@ A jump table for the options with a short description can be found at |Q_op|.
global
When on all Unicode emoji characters are considered to be full width.
This excludes "text emoji" characters, which are normally displayed as
- single width. Unfortunately there is no good specification for this
- and it has been determined on trial-and-error basis. Use the
- |setcellwidths()| function to change the behavior.
+ single width. However, such "text emoji" are treated as full-width
+ emoji if they are followed by the U+FE0F variant selector.
+
+ Unfortunately there is no good specification for this and it has been
+ determined on trial-and-error basis. Use the |setcellwidths()|
+ function to change the behavior.
*'encoding'* *'enc'*
'encoding' 'enc' string (default "utf-8")
diff --git a/runtime/lua/vim/_meta/options.lua b/runtime/lua/vim/_meta/options.lua
index b4ac478b61..05c9b89d77 100644
--- a/runtime/lua/vim/_meta/options.lua
+++ b/runtime/lua/vim/_meta/options.lua
@@ -1829,9 +1829,12 @@ vim.go.ead = vim.go.eadirection
--- When on all Unicode emoji characters are considered to be full width.
--- This excludes "text emoji" characters, which are normally displayed as
---- single width. Unfortunately there is no good specification for this
---- and it has been determined on trial-and-error basis. Use the
---- `setcellwidths()` function to change the behavior.
+--- single width. However, such "text emoji" are treated as full-width
+--- emoji if they are followed by the U+FE0F variant selector.
+---
+--- Unfortunately there is no good specification for this and it has been
+--- determined on trial-and-error basis. Use the `setcellwidths()`
+--- function to change the behavior.
---
--- @type boolean
vim.o.emoji = true