Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding.
Each base64 digit represents exactly 6 bits of data.
The particular set of 64 characters chosen to represent the 64 place-values for the base varies between implementations. The general strategy is to choose 64 characters that are both members of a subset common to most encodings, and also printable. This combination leaves the data unlikely to be modified in transit through information systems, such as email, that were traditionally not 8-bit clean. For example, MIME's Base64 implementation uses A
–Z
, a
–z
, and 0
–9
for the first 62 values. Other variations share this property but differ in the symbols chosen for the last two values; an example is UTF-7.
The earliest instances of this type of encoding were created for dialup communication between systems running the same OS — e.g., uuencode for UNIX, BinHex for the TRS-80 (later adapted for the Macintosh) — and could therefore make more assumptions about what characters were safe to use. For instance, uuencode uses uppercase letters, digits, and many punctuation characters, but no lowercase.
A quote from Thomas Hobbes' Leviathan (be aware of spaces between lines):
is represented as a byte sequence of 8-bit-padded ASCII characters encoded in MIME's Base64 scheme as follows:
In the above quote, the encoded value of Man is TWFu. Encoded in ASCII, the characters M, a, and n are stored as the bytes 77
, 97
, and 110
, which are the 8-bit binary values 01001101
, 01100001
, and 01101110
. These three values are joined together into a 24-bit string, producing 010011010110000101101110
. Groups of 6 bits (6 bits have a maximum of 26 = 64 different binary values) are converted into individual numbers from left to right (in this case, there are four numbers in a 24-bit string), which are then converted into their corresponding Base64 character values.