Text Character Set Encoding Query

Text Character Set Encoding Query-summary

Query text character set encodings online: enter text to view its encoding in different character sets, or enter encoded data to view the text it decodes to under different character sets.

Text Character Set Encoding Query-instructions

This online tool shows how the input text is encoded in different character sets. It supports nearly a hundred character sets, including US-ASCII, UTF-8, UTF-16, UTF-32, GBK, and GB18030.

Input Content : The content to query. Its format is determined by the input type.
Input Type : The format of the input content. Raw text, HEX string, Base64 string, and Binary string are supported.
Input Charset : When the input type is not raw text, the character set used to decode the input content into raw text.
Output Charset : The character sets in which to view the encoding of the input text.
Output Mode : Whether the output is produced in single mode or batch mode.
Output Type : The format used to display the encoded bytes. HEX, Base64, and Binary are supported. In single output mode, Decimal is also supported.
The query results include the encoding of the text in each selected character set. In single output mode, the Unicode code point of each character is also displayed.
Some character sets prepend a BOM (Byte Order Mark) to the encoded output to indicate whether the byte order is big-endian or little-endian. For example, UTF-16 prepends the code point U+FEFF, which appears as the bytes 0xFE 0xFF in big-endian order or 0xFF 0xFE in little-endian order. As a result, in single mode the BOM appears before the UTF-16 encoding of every character, and in batch mode it appears once before the entire UTF-16 encoding.
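
The options above can be approximated in a few lines of Python. This is only a minimal sketch, not the tool's actual implementation: the function names encode_text and decode_input are hypothetical, the space-separated layout of the Binary output is an assumption, and Python's codec names stand in for the tool's character set list.

```python
import base64
import codecs


def encode_text(text, charset, output_type="hex"):
    """Encode text in the given charset and format the resulting bytes."""
    data = text.encode(charset)
    if output_type == "hex":
        return data.hex()
    if output_type == "base64":
        return base64.b64encode(data).decode("ascii")
    if output_type == "binary":
        # Assumed layout: one 8-bit group per byte, separated by spaces.
        return " ".join(f"{byte:08b}" for byte in data)
    raise ValueError(f"unsupported output type: {output_type}")


def decode_input(content, input_type, charset):
    """Turn HEX, Base64, or Binary input back into raw text."""
    if input_type == "hex":
        data = bytes.fromhex(content)
    elif input_type == "base64":
        data = base64.b64decode(content)
    elif input_type == "binary":
        data = bytes(int(bits, 2) for bits in content.split())
    else:
        raise ValueError(f"unsupported input type: {input_type}")
    return data.decode(charset)


if __name__ == "__main__":
    sample = "\u4e2d"  # the Chinese character U+4E2D
    for charset in ("utf-8", "gbk", "utf-16-be", "utf-16"):
        print(charset, encode_text(sample, charset))
    # Python's "utf-16" codec prepends a BOM in the platform's byte order,
    # while "utf-16-be" and "utf-16-le" emit no BOM.
    print(sample.encode("utf-16").startswith(codecs.BOM_UTF16))
    print(decode_input("e4b8ad", "hex", "utf-8"))
```

Encoding each character separately with the "utf-16" codec reproduces the single-mode behaviour described above, where every character's encoding carries its own BOM.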
Text encoding instructions
1. Unicode is a standard that maps characters to numbers. The number assigned to each character is called its Unicode code point. How code points are stored as bytes is defined separately by the UTFs (Unicode Transformation Formats); the code point assignment itself does not specify a byte layout.
2. UTF-8 and UTF-16 are the two most popular Unicode encoding schemes, and UTF-8 is the most widely used worldwide. UTF-8 is a variable-length multi-byte encoding that represents each Unicode character using 1 to 4 bytes.
3. US-ASCII is a typical single-byte encoding scheme: each character is stored in one byte (8 bits), but only the lower 7 bits are used, which gives 128 characters.
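
The relationship between code points and encoded bytes can be checked directly in Python. The sketch below is illustrative only; the helper name describe and the sample characters are chosen here and are not part of the tool.

```python
def describe(ch):
    """Return the code point label and UTF-8 / UTF-16 byte counts of one character."""
    return (
        f"U+{ord(ch):04X}",
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-be")),  # no BOM, so only the character's own bytes
    )


def fits_in_ascii(ch):
    """True if the character can be encoded as US-ASCII (code points 0 to 127)."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # Sample characters from different code point ranges.
    for ch in ("A", "\u00e9", "\u4e2d", "\U0001F600"):
        print(describe(ch), fits_in_ascii(ch))
```

The same code point yields a different number of bytes in each encoding, which is why the code point and the byte encoding are reported separately.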