ASCII Encoding Explained: Characters, Control Codes, and Extended ASCII

What Is ASCII?

ASCII (the American Standard Code for Information Interchange) is a 7-bit character encoding standard first published in 1963 and revised in 1967, when lowercase letters were added. It maps 128 character codes (0–127) to letters, digits, punctuation, and control characters.

When two early computer systems needed to exchange text, they needed to agree on what binary value represents each character. ASCII provided that agreement and became the universal standard for text in early computing.

The ASCII Table Structure

The 128 code points are divided into three groups:

Range	Type	Examples
0–31	Control characters	NUL, LF, CR, TAB, ESC
32–126	Printable characters	Space, A–Z, a–z, 0–9, punctuation
127	DEL	Delete (a control character; backspace is BS, code 8)
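
As a rough illustration of the ranges in the table, the sketch below sorts a code into one of the three groups. The function name classifyAscii and the returned labels are made up for this example and are not part of any standard API.

```javascript
// Classify a 7-bit ASCII code into the three groups from the table.
// classifyAscii is a hypothetical helper used only for illustration.
function classifyAscii(code) {
  if (!Number.isInteger(code) || code < 0 || code > 127) {
    throw new RangeError(`not a 7-bit ASCII code: ${code}`);
  }
  if (code <= 31) return "control";
  if (code === 127) return "DEL";
  return "printable"; // 32–126, including the space at 32
}

classifyAscii(10);  // "control" (LF)
classifyAscii(65);  // "printable" ("A")
classifyAscii(127); // "DEL"
```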

The printable characters are arranged so that:

Uppercase letters A–Z occupy codes 65–90
Lowercase letters a–z occupy codes 97–122
Digits 0–9 occupy codes 48–57

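There is a useful pattern: for letters, converting between uppercase and lowercase is just flipping bit 5, counting from bit 0 (add or subtract 32, or XOR with 0x20). The trick applies only to A–Z and a–z; on other characters it produces unrelated codes: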

"A".charCodeAt(0); // 65
"a".charCodeAt(0); // 97 = 65 + 32

// Toggle case using XOR
String.fromCharCode("A".charCodeAt(0) ^ 32); // "a"
String.fromCharCode("a".charCodeAt(0) ^ 32); // "A"
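
The XOR trick is only meaningful for letters, so in practice it probably wants a range check first. The following is a minimal sketch under that assumption; toggleAsciiCase is a hypothetical helper name, and it looks only at the first character of its argument.

```javascript
// Toggle the case of a single ASCII letter; leave everything else alone.
// toggleAsciiCase is a hypothetical helper used only for illustration.
function toggleAsciiCase(ch) {
  const code = ch.charCodeAt(0);
  const isUpper = code >= 65 && code <= 90;  // A–Z
  const isLower = code >= 97 && code <= 122; // a–z
  // Flip bit 5 (0x20) only when the character is a letter.
  return isUpper || isLower ? String.fromCharCode(code ^ 0x20) : ch;
}

toggleAsciiCase("A"); // "a"
toggleAsciiCase("z"); // "Z"
toggleAsciiCase("1"); // "1" (a bare XOR would turn 49 into 17, a control code)
```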

Digits 0–9 (codes 48–57) have the useful property that the numeric value equals code - 48:

"7".charCodeAt(0) - 48; // 7
"0".charCodeAt(0) - 48; // 0

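The same subtraction extends from one digit to a whole number. As a sketch only (parseAsciiDecimal is a made-up name, and for real code the built-in Number or parseInt is the better choice), a decimal string can be parsed by repeatedly multiplying by 10 and adding code - 48:

```javascript
// Parse a non-negative decimal string using only ASCII code arithmetic.
// parseAsciiDecimal is a hypothetical helper used only for illustration.
function parseAsciiDecimal(text) {
  if (text.length === 0) throw new Error("empty input");
  let value = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 48 || code > 57) {
      throw new Error(`not an ASCII digit at index ${i}`);
    }
    value = value * 10 + (code - 48); // shift left one decimal place, add digit
  }
  return value;
}

parseAsciiDecimal("1963"); // 1963
```
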
Control Characters

The first 32 ASCII codes are control characters. Originally intended to control teletype machines and serial terminals, most are now used only in specific contexts:

Code	Name	Char	Use
0	NUL	`\0`	String terminator in C
7	BEL	`\a`	Terminal bell sound
8	BS	`\b`	Backspace
9	HT	`\t`	Horizontal tab
10	LF	`\n`	Newline (Unix line endings)
13	CR	`\r`	Carriage return (part of Windows CRLF)
26	SUB	`^Z`	End-of-file in Windows (Ctrl+Z)
27	ESC	`\e`	Start of ANSI escape sequences (`\e` is a non-standard escape; `\x1b` is the portable spelling)
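
The `^Z` in the table is caret notation: a control code is written as `^` followed by the character 64 positions higher, which amounts to XOR with 0x40. A small sketch of that convention, useful for making invisible characters visible when debugging (showControls is a hypothetical name):

```javascript
// Render control characters in caret notation so they become visible.
// showControls is a hypothetical helper used only for illustration.
function showControls(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 32 || code === 127) {
      // XOR with 0x40 maps 0–31 to "@".."_" and 127 (DEL) to "?".
      out += "^" + String.fromCharCode(code ^ 0x40);
    } else {
      out += text[i];
    }
  }
  return out;
}

showControls("a\tb\r\n"); // "a^Ib^M^J"
showControls("\x1b[0m");  // "^[[0m"
```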

ASCII in Programming

C-style strings

C strings are arrays of bytes terminated by a NUL byte (\0). The length of a C string is the number of bytes before the first NUL. This is why NUL bytes in binary data can corrupt C string handling — the string appears to end prematurely.
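
The effect can be imitated outside C. The sketch below, in JavaScript for consistency with the earlier examples, measures a byte buffer the way a NUL-terminated reader would; cStringLength is a hypothetical name, and returning the full length when no NUL is present is a choice made for this example (real C code would read past the end of the buffer).

```javascript
// Length of the "C string" at the start of a byte buffer:
// the number of bytes before the first NUL (0x00).
// cStringLength is a hypothetical helper used only for illustration.
function cStringLength(bytes) {
  const nul = bytes.indexOf(0);
  // No NUL found: fall back to the buffer length for this sketch.
  return nul === -1 ? bytes.length : nul;
}

// "Hi", then a NUL, then "!!" as binary data.
const data = Uint8Array.from([0x48, 0x69, 0x00, 0x21, 0x21]);

cStringLength(data); // 2: the two bytes after the NUL are never seen
data.length;         // 5: the real size of the buffer
```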

Character arithmetic

Because ASCII is a dense, ordered mapping, many operations reduce to arithmetic:

// Check if a character is a digit
int isDigit(char c) { return c >= '0' && c <= '9'; }

// Parse a decimal digit (assumes isDigit(c) is true)
int digitValue(char c) { return c - '0'; }

// Convert lowercase to uppercase
char toUpper(char c) {
  if (c >= 'a' && c <= 'z') return c - 32;
  return c;
}

Extended ASCII: Codes 128–255

The original ASCII standard used 7 bits, leaving the 8th bit unused. As personal computers became widespread in the 1980s, vendors started using codes 128–255 for additional characters: accented Latin letters, box-drawing characters, and symbols.

The problem: everyone defined their own mapping. IBM's Code Page 437 (used in DOS) had box-drawing characters. ISO Latin-1 (ISO 8859-1) had Western European accented letters. These were incompatible with each other.

This proliferation of incompatible "extended ASCII" encodings was a major source of mojibake (garbled text).
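
To make the clash concrete, here is a small sketch that decodes the same four bytes under both code pages. The CP437 table is deliberately partial (only the three box-drawing bytes used in the example), and the names CP437_SUBSET, decodeCp437Subset, and decodeLatin1 are made up for this illustration. ISO 8859-1 needs no table because its 256 codes match the first 256 Unicode code points.

```javascript
// Partial CP437 table: only the box-drawing bytes used below.
const CP437_SUBSET = new Map([
  [0xc9, "\u2554"], // double-line top-left corner
  [0xcd, "\u2550"], // double-line horizontal bar
  [0xbb, "\u2557"], // double-line top-right corner
]);

// Decode bytes as CP437, treating bytes below 128 as plain ASCII and
// substituting U+FFFD for high bytes missing from the partial table.
function decodeCp437Subset(bytes) {
  return Array.from(bytes, (b) =>
    b < 128 ? String.fromCharCode(b) : CP437_SUBSET.get(b) || "\uFFFD"
  ).join("");
}

// Decode bytes as ISO 8859-1: each byte is its own Unicode code point.
function decodeLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

const boxTop = Uint8Array.from([0xc9, 0xcd, 0xcd, 0xbb]);

decodeCp437Subset(boxTop); // "╔══╗" (the intended DOS box border)
decodeLatin1(boxTop);      // "ÉÍÍ»" (mojibake from the wrong code page)
```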

Unicode: The Successor to ASCII

Unicode was designed to be a universal character set. Crucially, Unicode is a superset of ASCII: the first 128 Unicode code points are identical to ASCII. Code point U+0041 is 'A', code point U+000A is the newline character, and so on.

The most common Unicode encoding, UTF-8, has a further useful property: ASCII characters are encoded as a single byte identical to their ASCII value. This means any ASCII file is also a valid UTF-8 file.

// In a UTF-8 file, the letter A is stored as the single byte 0x41:
const a = "A"; // U+0041, one byte: 0x41
const code = 0x41; // 65, the same value as ASCII 'A'

new TextEncoder().encode("A"); // Uint8Array [65]
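
One practical consequence is that checking whether data is pure ASCII only requires confirming that every byte is below 0x80. A minimal sketch, assuming an environment where TextEncoder is available as a global (modern browsers and Node.js); isAsciiBytes is a hypothetical helper name:

```javascript
// True when every byte is in the 7-bit ASCII range (0x00–0x7F).
// isAsciiBytes is a hypothetical helper used only for illustration.
function isAsciiBytes(bytes) {
  return bytes.every((b) => b < 0x80);
}

const encoder = new TextEncoder(); // always encodes to UTF-8

const plain = encoder.encode("ASCII");     // Uint8Array [65, 83, 67, 73, 73]
const accented = encoder.encode("\u00e9"); // "é" becomes two bytes: [195, 169]

isAsciiBytes(plain);    // true: byte-for-byte identical to the ASCII codes
isAsciiBytes(accented); // false: non-ASCII characters use bytes >= 0x80
```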

For new code, always use UTF-8. Extended ASCII code pages should only be used when reading legacy files that require a specific code page.

Try It Online

The ASCII / String Converter on DevGizmo converts between ASCII codes and characters in both directions — enter a string to get decimal and hex codes, or enter codes to decode them back to text.