Emojis

May 24, 2021

tags: miscellaneous

What is an Emoji?

Emojis are part of the Unicode standard, a constantly updating list of all characters that computers are capable of handling.

Early computers commonly encoded text with ASCII, a standard first published in 1963 that maps each character to a 7-bit code. This allowed computers to recognize 128 different characters: the standard English alphabet (a-z, A-Z), digits (0-9), various symbols (&, %, ., =, etc.), and some control characters. Control characters are not displayed as text; they are instructions that tell a computer how to handle the text. For example, NUL (code 0) is widely used to mark the end of a string, as in C, while BS (code 8) represents a backspace.
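As a rough illustration of the 7-bit mapping, the Python sketch below prints the code and bit pattern for a few characters. The helper name is_ascii is made up for this example and is not part of any standard.

```python
# Each ASCII character has a code from 0 to 127, which fits in 7 bits.
for ch in ["A", "a", "0", "&"]:
    code = ord(ch)
    print(repr(ch), code, format(code, "07b"))

# Control characters have codes too, but no visible glyph.
NUL = "\x00"  # code 0, often used as a string terminator
BS = "\x08"   # code 8, backspace
print(ord(NUL), ord(BS))


def is_ascii(text):
    """Hypothetical helper: True if every character fits in 7 bits."""
    return all(ord(c) < 128 for c in text)


print(is_ascii("hello"))   # plain English text fits
print(is_ascii("\u00e9"))  # an accented e does not
```

Anything with a code of 128 or above, such as an accented letter, falls outside what plain ASCII can represent.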

However, as anyone who has studied computer architecture knows well, modern computers work with 8-bit chunks of memory. Such a chunk is now called a byte. (Historical side note: while a "byte" is now standardized to 8 bits, in the past the number of bits per byte was machine-dependent.) In the 1980s, ASCII was extended to 8 bits, although different people implemented this change in different ways. One common encoding was Windows-1252, developed by Microsoft and often loosely called "ANSI". Another popular encoding is ISO 8859-1, which was the default for early HTML pages.
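To see how two 8-bit extensions of ASCII can disagree, the sketch below decodes the same three bytes with Python's built-in "cp1252" (Windows-1252) and "latin-1" (ISO 8859-1) codecs. The byte values and the helper name decode_both are chosen only for illustration.

```python
# The same byte values decoded under two 8-bit extensions of ASCII.
RAW = bytes([0x41, 0xE9, 0x80])


def decode_both(data):
    """Hypothetical helper: decode bytes as Windows-1252 and as ISO 8859-1."""
    return data.decode("cp1252"), data.decode("latin-1")


windows_text, iso_text = decode_both(RAW)

# 0x41 and 0xE9 agree in both encodings ("A" and an accented e).
# 0x80 is the euro sign in Windows-1252 but a control character in ISO 8859-1.
print(repr(windows_text))
print(repr(iso_text))
```

The first two bytes decode identically, but the third does not, which is the kind of mismatch that garbles text sent between systems that assume different encodings.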

However, 8 bits allow only 256 characters, and the world's writing systems use far more than that. Chinese alone has more than ten thousand characters. Incompatible standards were popping up across the world, breaking transmission of text from one system to another. Something had to be done.

In 1987, Joe Becker, Lee Collins, and Mark Davis began the work that became Unicode, and the Unicode Consortium was incorporated in 1991 to carry it forward. This organization maintains the Unicode standard, a set of rules for how to encode and format text from every writing system in the world. To handle the immense number of characters while sacrificing as little space as possible, Unicode text is usually stored in a variable-width encoding, the most common being UTF-8. Thus, the basic Latin alphabet can still be encoded in a single byte (8 bits), but characters from other writing systems may take up to four bytes (32 bits) each. The Unicode standard has become an incredibly important part of the digital ecosystem, and UTF-8 is now the main encoding used across the web.
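A small Python sketch of the variable-width idea, assuming UTF-8 as the encoding: it reports how many bytes each sample character needs. The helper name utf8_length and the sample characters are chosen only for this example.

```python
def utf8_length(ch):
    """Hypothetical helper: number of bytes a character needs in UTF-8."""
    return len(ch.encode("utf-8"))


# A Latin letter, an accented letter, a Chinese character, and an emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F44B"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the character, its Unicode code point, and its UTF-8 bytes in hex.
    print(ch, f"U+{ord(ch):04X}", utf8_length(ch), encoded.hex())
```

The four samples take one, two, three, and four bytes respectively, so plain English text costs no more space than it did in ASCII.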

Which brings us to the 1990s, and to Japanese mobile internet providers. Japanese phones at the time did not use ASCII or Unicode. Many phones used a two-byte (16-bit) encoding for Japanese characters in which some codes were not assigned to any character. In 1999, NTT DoCoMo released i-mode, a mobile internet service that included a set of pictograms filling some of these unassigned codes. These first emojis were designed by Shigetaka Kurita.

After their success in i-mode, emojis spread to other mobile internet providers in Japan. To stay compatible with the Japanese carriers' character sets, the Unicode Consortium decided to add emojis to its own character set, which it did in Unicode 6.0 (2010). And, thus, emojis entered the global stage.

However, it took Apple adding emojis to its phones for the character set to gain the awareness and fame it deserved outside of Japan. Emojis were very soon adopted by other major phone companies for compatibility with Apple phones.

And the rest is history 👋.