History
Version    Changes
v14.18.0   Introduced base64url encoding.
v6.4.0     Introduced latin1 as an alias for binary.
v5.0.0     Removed the deprecated raw and raws encodings.
When converting between Buffers and strings, a character encoding may be
specified. If no character encoding is specified, UTF-8 will be used as the
default.
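const { Buffer } = require('node:buffer');
const buf = Buffer.from('hello world', 'utf8');
console.log(buf.toString('hex'));
// Prints: 68656c6c6f20776f726c64
console.log(buf.toString('base64'));
// Prints: aGVsbG8gd29ybGQ=
console.log(Buffer.from('fhqwhgads', 'utf8'));
// Prints: <Buffer 66 68 71 77 68 67 61 64 73>
console.log(Buffer.from('fhqwhgads', 'utf16le'));
// Prints: <Buffer 66 00 68 00 71 00 77 00 68 00 67 00 61 00 64 00 73 00>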
The character encodings currently supported by Node.js are the following:
'utf8': Multi-byte encoded Unicode characters. Many web pages and other
document formats use UTF-8. This is the default character encoding.
When decoding a Buffer into a string that does not exclusively contain
valid UTF-8 data, the Unicode replacement character U+FFFD � will be used
to represent those errors.
'utf16le': Multi-byte encoded Unicode characters. Unlike 'utf8', each
character in the string will be encoded using either 2 or 4 bytes.
Node.js only supports the little-endian variant of UTF-16.
'latin1': Latin-1 stands for ISO-8859-1. This character encoding only
supports the Unicode characters from U+0000 to U+00FF. Each character is
encoded using a single byte. Characters that do not fit into that range are
truncated and will be mapped to characters in that range.
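The behavior of the three character encodings above can be illustrated with a minimal sketch. The sample bytes and the variable names (invalidUtf8, decoded, euro) are chosen here for illustration only and do not come from the original text.

```javascript
const { Buffer } = require('node:buffer');

// 'utf8': the byte 0xff can never appear in valid UTF-8, so decoding
// replaces it with the Unicode replacement character U+FFFD.
const invalidUtf8 = Buffer.from([0x68, 0x69, 0xff]);
const decoded = invalidUtf8.toString('utf8');
console.log(decoded === 'hi\ufffd'); // true

// 'utf16le': 'a' needs 2 bytes, while U+1F600 needs 4 bytes
// because it is stored as a surrogate pair.
console.log(Buffer.byteLength('a', 'utf16le')); // 2
console.log(Buffer.byteLength('\u{1F600}', 'utf16le')); // 4

// 'latin1': U+20AC lies outside U+0000..U+00FF, so only its low byte
// (0xac) is kept when encoding.
const euro = Buffer.from('\u20ac', 'latin1');
console.log(euro); // <Buffer ac>
```

The last case shows why 'latin1' is lossy for text outside its range: decoding the byte 0xac again yields U+00AC, not the original U+20AC.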
Converting a Buffer into a string using one of the above is referred to as
decoding, and converting a string into a Buffer is referred to as encoding.
Node.js also supports the following binary-to-text encodings. For
binary-to-text encodings, the naming convention is reversed: converting a
Buffer into a string is typically referred to as encoding, and converting a
string into a Buffer as decoding.
'base64': Base64 encoding. When creating a Buffer from a string,
this encoding will also correctly accept "URL and Filename Safe Alphabet" as
specified in RFC 4648, Section 5. Whitespace characters such as spaces,
tabs, and new lines contained within the base64-encoded string are ignored.
'base64url': base64url encoding as specified in RFC 4648, Section 5.
When creating a Buffer from a string, this encoding will also correctly
accept regular base64-encoded strings. When encoding a Buffer to a string,
this encoding will omit padding.
'hex': Encode each byte as two hexadecimal characters. Data truncation
may occur when decoding strings that do not exclusively consist of an even
number of valid hexadecimal characters. See below for an example.
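The relationship between 'base64' and 'base64url' described above can be sketched as follows. The two sample bytes are chosen here (not taken from the original text) because they produce the characters that differ between the two alphabets; the variable names are illustrative.

```javascript
const { Buffer } = require('node:buffer');

const bytes = Buffer.from([0xfb, 0xff]);

// Encoding: 'base64' uses '+' and '/' with '=' padding,
// while 'base64url' uses '-' and '_' and omits the padding.
const b64 = bytes.toString('base64');
const b64url = bytes.toString('base64url');
console.log(b64); // +/8=
console.log(b64url); // -_8

// Decoding: each encoding also accepts the other's alphabet.
const fromUrlSafe = Buffer.from(b64url, 'base64');
const fromRegular = Buffer.from(b64, 'base64url');
console.log(fromUrlSafe.equals(bytes), fromRegular.equals(bytes)); // true true

// Whitespace inside a base64 string is ignored when decoding.
const fromWrapped = Buffer.from('+/\n8=', 'base64');
console.log(fromWrapped.equals(bytes)); // true
```

Because 'base64url' was introduced in v14.18.0, this sketch assumes a Node.js version at least that recent.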
The following legacy character encodings are also supported:
'ascii': For 7-bit ASCII data only. When encoding a string into a Buffer,
this is equivalent to using 'latin1'. When decoding a Buffer into a string,
using this encoding will additionally unset the highest bit of each byte
before decoding as 'latin1'.
Generally, there should be no reason to use this encoding, as 'utf8'
(or, if the data is known to always be ASCII-only, 'latin1') will be a
better choice when encoding or decoding ASCII-only text. It is only provided
for legacy compatibility.
'binary': Alias for 'latin1'. See binary strings for more background
on this topic. The name of this encoding can be very misleading, as all of the
encodings listed here convert between strings and binary data. For converting
between strings and Buffers, typically 'utf8' is the right choice.
'ucs2': Alias of 'utf16le'. UCS-2 used to refer to a variant of UTF-16
that did not support characters that had code points larger than U+FFFF.
In Node.js, these code points are always supported.
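A short sketch of the legacy encodings listed above may help make their behavior concrete. The sample character U+00E9 and the variable names are illustrative choices, not part of the original text.

```javascript
const { Buffer } = require('node:buffer');

// 'ascii': encoding behaves like 'latin1' ...
const encodedAscii = Buffer.from('\u00e9', 'ascii');
console.log(encodedAscii); // <Buffer e9>

// ... but decoding first clears the highest bit: 0xe9 & 0x7f === 0x69 ('i').
const asLatin1 = encodedAscii.toString('latin1');
const asAscii = encodedAscii.toString('ascii');
console.log(asLatin1 === '\u00e9', asAscii); // true i

// 'binary' is an alias for 'latin1'.
const viaBinary = Buffer.from('\u00e9', 'binary');
console.log(viaBinary.equals(Buffer.from('\u00e9', 'latin1'))); // true

// 'ucs2' is an alias of 'utf16le', including code points above U+FFFF.
const viaUcs2 = Buffer.from('\u{1F600}', 'ucs2');
console.log(viaUcs2.equals(Buffer.from('\u{1F600}', 'utf16le'))); // true
console.log(viaUcs2.length); // 4
```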
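Buffer.from('1ag', 'hex');
// <Buffer 1a>: decoding stops at the first non-hexadecimal character ('g').
Buffer.from('1a7g', 'hex');
// <Buffer 1a>: '7g' is not a valid hexadecimal pair, so decoding stops there.
Buffer.from('1634', 'hex');
// <Buffer 16 34>: all data is represented.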
Modern Web browsers follow the WHATWG Encoding Standard, which aliases
both 'latin1' and 'ISO-8859-1' to 'win-1252'. This means that while doing
something like http.get(), if the returned charset is one of those listed in
the WHATWG specification, it is possible that the server actually returned
'win-1252'-encoded data, and using 'latin1' encoding may incorrectly decode
the characters.
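The difference can be made visible with a small sketch that decodes the same bytes twice: once with the Buffer 'latin1' encoding and once with a WHATWG TextDecoder created from the 'latin1' label. This assumes a Node.js build with full ICU data, where TextDecoder supports that label; the try/catch covers builds without it. The sample bytes and variable names are illustrative.

```javascript
const { Buffer } = require('node:buffer');

// 0x80 is a control character in ISO-8859-1 but the euro sign in
// windows-1252; 0xe9 is U+00E9 in both.
const payload = Buffer.from([0x80, 0xe9]);

// Buffer's 'latin1' maps every byte directly to U+0000..U+00FF.
const nodeLatin1 = payload.toString('latin1');
console.log(nodeLatin1.codePointAt(0).toString(16)); // 80

// A WHATWG decoder treats the 'latin1' label as windows-1252, so under
// that mapping 0x80 is expected to decode to U+20AC instead.
let whatwgLatin1 = null;
try {
  const decoder = new TextDecoder('latin1');
  console.log(decoder.encoding);
  whatwgLatin1 = decoder.decode(payload);
  console.log(whatwgLatin1.codePointAt(0).toString(16));
} catch (err) {
  // Assumption: this label may be unavailable on builds without full ICU.
  console.log('TextDecoder does not support this label here:', err.message);
}
```

If a response is known to come from a server that labels windows-1252 data as latin1, decoding with TextDecoder rather than with the 'latin1' Buffer encoding is one way to follow the browser behavior.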