Diffstat (limited to 'runtime/doc/mbyte.txt')
-rw-r--r-- runtime/doc/mbyte.txt | 88
1 files changed, 34 insertions, 54 deletions
diff --git a/runtime/doc/mbyte.txt b/runtime/doc/mbyte.txt
index c87ed317d4..3bdb682a31 100644
--- a/runtime/doc/mbyte.txt
+++ b/runtime/doc/mbyte.txt
@@ -70,29 +70,24 @@ See |mbyte-locale| for details.
ENCODING
-If your locale works properly, Vim will try to set the 'encoding' option
-accordingly. If this doesn't work you can overrule its value: >
+Nvim always uses UTF-8 internally. Thus the 'encoding' option is always set
+to "utf-8" and cannot be changed.
- :set encoding=utf-8
+All the text that is used inside Vim will be in UTF-8. Not only the text in
+the buffers, but also in registers, variables, etc.
-See |encoding-values| for a list of acceptable values.
-
-The result is that all the text that is used inside Vim will be in this
-encoding. Not only the text in the buffers, but also in registers, variables,
-etc. 'encoding' is read-only after startup because changing it would make the
-existing text invalid.
-
-You can edit files in another encoding than what 'encoding' is set to. Vim
+You can edit files in encodings other than UTF-8. Nvim
will convert the file when you read it and convert it back when you write it.
See 'fileencoding', 'fileencodings' and |++enc|.
DISPLAY AND FONTS
-If you are working in a terminal (emulator) you must make sure it accepts the
-same encoding as which Vim is working with.
+If you are working in a terminal (emulator) you must make sure it accepts
+UTF-8, the encoding which Vim is working with. Otherwise only ASCII can
+be displayed and edited correctly.
-For the GUI you must select fonts that work with the current 'encoding'. This
+For the GUI you must select fonts that work with UTF-8. This
is the difficult part. It depends on the system you are using, the locale and
a few other things. See the chapters on fonts: |mbyte-fonts-X11| for
X-Windows and |mbyte-fonts-MSwin| for MS-Windows.
@@ -216,10 +211,9 @@ You could make a small shell script for this.
==============================================================================
3. Encoding *mbyte-encoding*
-Vim uses the 'encoding' option to specify how characters are identified and
-encoded when they are used inside Vim. This applies to all the places where
-text is used, including buffers (files loaded into memory), registers and
-variables.
+In Nvim UTF-8 is always used internally to encode characters.
+This applies to all the places where text is used, including buffers (files
+loaded into memory), registers and variables.
*charset* *codeset*
Charset is another name for encoding. There are subtle differences, but these
@@ -240,7 +234,7 @@ matter what language is used. Thus you might see the right text even when the
encoding was set wrong.
*encoding-names*
-Vim can use many different character encodings. There are three major groups:
+Vim can edit files in different character encodings. There are three major groups:
1 8bit Single-byte encodings, 256 different characters. Mostly used
in USA and Europe. Example: ISO-8859-1 (Latin1). All
@@ -255,11 +249,10 @@ u Unicode Universal encoding, can replace all others. ISO 10646.
Millions of different characters. Example: UTF-8. The
relation between bytes and screen cells is complex.
-Other encodings cannot be used by Vim internally. But files in other
+Only UTF-8 is used by Vim internally. But files in other
encodings can be edited by using conversion, see 'fileencoding'.
-Note that all encodings must use ASCII for the characters up to 128.
-Supported 'encoding' values are: *encoding-values*
+Recognized 'fileencoding' values include: *encoding-values*
1 latin1 8-bit characters (ISO 8859-1, also used for cp1252)
1 iso-8859-n ISO_8859 variant (n = 2 to 15)
1 koi8-r Russian
@@ -311,11 +304,11 @@ u ucs-4 32 bit UCS-4 encoded Unicode (ISO/IEC 10646-1)
u ucs-4le like ucs-4, little endian
The {name} can be any encoding name that your system supports. It is passed
-to iconv() to convert between the encoding of the file and the current locale.
+to iconv() to convert between UTF-8 and the encoding of the file.
For MS-Windows "cp{number}" means using codepage {number}.
Examples: >
- :set encoding=8bit-cp1252
- :set encoding=2byte-cp932
+ :set fileencoding=8bit-cp1252
+ :set fileencoding=2byte-cp932
The MS-Windows codepage 1252 is very similar to latin1. For practical reasons
the same encoding is used and it's called latin1. 'isprint' can be used to
@@ -337,8 +330,7 @@ u ucs-2be same as ucs-2 (big endian)
u ucs-4be same as ucs-4 (big endian)
u utf-32 same as ucs-4
u utf-32le same as ucs-4le
- default stands for the default value of 'encoding', depends on the
- environment
+ default the encoding of the current locale.
For the UCS codes the byte order matters. This is tricky, use UTF-8 whenever
you can. The default is to use big-endian (most significant byte comes
@@ -363,13 +355,12 @@ or when conversion is not possible:
CONVERSION *charset-conversion*
Vim will automatically convert from one to another encoding in several places:
-- When reading a file and 'fileencoding' is different from 'encoding'
-- When writing a file and 'fileencoding' is different from 'encoding'
+- When reading a file and 'fileencoding' is different from "utf-8"
+- When writing a file and 'fileencoding' is different from "utf-8"
- When displaying messages and the encoding used for LC_MESSAGES differs from
- 'encoding' (requires a gettext version that supports this).
+ "utf-8" (requires a gettext version that supports this).
- When reading a Vim script where |:scriptencoding| is different from
- 'encoding'.
-- When reading or writing a |shada| file.
+ "utf-8".
Most of these require the |+iconv| feature. Conversion for reading and
writing files may also be specified with the 'charconvert' option.
@@ -408,11 +399,11 @@ Useful utilities for converting the charset:
*mbyte-conversion*
-When reading and writing files in an encoding different from 'encoding',
+When reading and writing files in an encoding different from "utf-8",
conversion needs to be done. These conversions are supported:
- All conversions between Latin-1 (ISO-8859-1), UTF-8, UCS-2 and UCS-4 are
handled internally.
-- For MS-Windows, when 'encoding' is a Unicode encoding, conversion from and
+- For MS-Windows, conversion from and
to any codepage should work.
- Conversion specified with 'charconvert'
- Conversion with the iconv library, if it is available.
@@ -468,8 +459,6 @@ and you will have a working UTF-8 terminal emulator. Try both >
with the demo text that comes with ucs-fonts.tar.gz in order to see
whether there are any problems with UTF-8 in your xterm.
-For Vim you may need to set 'encoding' to "utf-8".
-
==============================================================================
5. Fonts on X11 *mbyte-fonts-X11*
@@ -864,11 +853,11 @@ between two keyboard settings.
The value of the 'keymap' option specifies a keymap file to use. The name of
this file is one of these two:
- keymap/{keymap}_{encoding}.vim
+ keymap/{keymap}_utf-8.vim
keymap/{keymap}.vim
-Here {keymap} is the value of the 'keymap' option and {encoding} of the
-'encoding' option. The file name with the {encoding} included is tried first.
+Here {keymap} is the value of the 'keymap' option.
+The file name with "utf-8" included is tried first.
'runtimepath' is used to find these files. To see an overview of all
available keymap files, use this: >
@@ -950,7 +939,7 @@ this is unusual. But you can use various ways to specify the character: >
A <char-0141> octal value
x <Space> special key name
-The characters are assumed to be encoded for the current value of 'encoding'.
+The characters are assumed to be encoded in UTF-8.
It's possible to use ":scriptencoding" when all characters are given
literally. That doesn't work when using the <char-> construct, because the
conversion is done on the keymap file, not on the resulting character.
@@ -1170,21 +1159,13 @@ Useful commands:
message is truncated, use ":messages").
- "g8" shows the bytes used in a UTF-8 character, also the composing
characters, as hex numbers.
-- ":set encoding=utf-8 fileencodings=" forces using UTF-8 for all files. The
- default is to use the current locale for 'encoding' and set 'fileencodings'
- to automatically detect the encoding of a file.
+- ":set fileencodings=" forces using UTF-8 for all files. The
+ default is to automatically detect the encoding of a file.
STARTING VIM
-If your current locale is in an utf-8 encoding, Vim will automatically start
-in utf-8 mode.
-
-If you are using another locale: >
-
- set encoding=utf-8
-
-You might also want to select the font used for the menus. Unfortunately this
+You might want to select the font used for the menus. Unfortunately this
doesn't always work. See the system specific remarks below, and 'langmenu'.
@@ -1245,10 +1226,9 @@ not everybody is able to type a composing character.
These options are relevant for editing multi-byte files. Check the help in
options.txt for detailed information.
-'encoding' Encoding used for the keyboard and display. It is also the
- default encoding for files.
+'encoding' Internal text encoding, always "utf-8".
-'fileencoding' Encoding of a file. When it's different from 'encoding'
+'fileencoding' Encoding of a file. When it's different from "utf-8"
conversion is done when reading or writing the file.
'fileencodings' List of possible encodings of a file. When opening a file