Character Sets

Unicode

Unicode, as defined by the Unicode Consortium, has become a universal standard. Its character repertoire is kept in step with ISO/IEC 10646, which describes the 'Universal Multiple-Octet Coded Character Set' (UCS). See: http://www.unicode.org

Unicode encodings / subsets / character sets:

In MIME-encoded content, the "Content-Type:" header is often followed by a "charset=" parameter, for example:

charset=UTF-8. This covers all the Unicode characters, encoding each one as a sequence of one to four bytes. ASCII characters take a single byte, which keeps common text compact and makes it possible to transfer a Unicode character to another computer reliably. UTF-8 stands for UCS Transformation Format, 8-bit. See:

charset=windows-1252. The standard Windows Roman encoding is 'code page 1252'. See: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
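As a rough illustration of how the charset label changes the bytes that are sent, the Python sketch below reads the charset parameter from a Content-Type value with the standard library's email.message.Message and then encodes the same text under both labels. The helper name charset_of and the sample header values are made up for this example.

```python
from email.message import Message


def charset_of(content_type, default="us-ascii"):
    """Return the lowercased charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


# Hypothetical header values, chosen only for illustration.
utf8_label = charset_of("text/html; charset=UTF-8")
cp1252_label = charset_of("text/html; charset=windows-1252")

text = "caf\u00e9 \u20ac"  # "cafe" with an acute e, a space, and the euro sign

# UTF-8: one byte per ASCII character, two for the accented e, three for the euro sign.
utf8_bytes = text.encode(utf8_label)
# windows-1252: exactly one byte per character.
cp1252_bytes = text.encode(cp1252_label)

print(len(utf8_bytes), len(cp1252_bytes))
```

The same six characters occupy a different number of bytes under each label, which is why a receiver has to know the charset before it can decode the content.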

ASCII is the first 128 Unicode characters (codes 0 to 127)

ISO Latin-1 (ISO 8859-1) is the first 256 Unicode characters (codes 0 to 255)
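These two correspondences can be checked with Python's built-in codecs. The sketch below is only a quick sanity check, and the variable names are invented for the example.

```python
# Every byte value 0..255, decoded as ISO Latin-1, yields the Unicode
# character with the same code point.
latin1_text = bytes(range(256)).decode("latin-1")
same_code_points = all(ord(ch) == i for i, ch in enumerate(latin1_text))

# ASCII covers only code points 0..127; byte 128 is rejected.
ascii_text = bytes(range(128)).decode("ascii")
try:
    bytes([128]).decode("ascii")
    ascii_accepts_128 = True
except UnicodeDecodeError:
    ascii_accepts_128 = False

# windows-1252 is not identical to Latin-1: byte 0x80 is the euro sign there.
cp1252_0x80 = bytes([0x80]).decode("cp1252")
```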

The following table contains the ISO Latin-1 character set, corresponding to the first 256 entries of the Unicode character repertoire, as supported in Microsoft® Internet Explorer 4.0 and later. Codes 128 to 159 are not listed. The table provides each character, its decimal code, its named entity reference for HTML, and a brief description. See also:
http://msdn.microsoft.com/workshop/author/dhtml/reference/charsets/charsets.asp

Character Decimal code Named entity Description
--- &#0; --- Unused
--- &#1; --- Unused
--- &#2; --- Unused
--- &#3; --- Unused
--- &#4; --- Unused
--- &#5; --- Unused
--- &#6; --- Unused
--- &#7; --- Unused
--- &#8; --- Unused
--- &#9; --- Horizontal tab
--- &#10; --- Line feed
--- &#11; --- Unused
--- &#12; --- Unused
--- &#13; --- Carriage Return
--- &#14; --- Unused
--- &#15; --- Unused
--- &#16; --- Unused
--- &#17; --- Unused
--- &#18; --- Unused
--- &#19; --- Unused
--- &#20; --- Unused
--- &#21; --- Unused
--- &#22; --- Unused
--- &#23; --- Unused
--- &#24; --- Unused
--- &#25; --- Unused
--- &#26; --- Unused
--- &#27; --- Unused
--- &#28; --- Unused
--- &#29; --- Unused
--- &#30; --- Unused
--- &#31; --- Unused
  &#32; --- Space
! &#33; --- Exclamation mark
" &#34; &quot; Quotation mark
# &#35; --- Number sign
$ &#36; --- Dollar sign
% &#37; --- Percent sign
& &#38; &amp; Ampersand
' &#39; --- Apostrophe
( &#40; --- Left parenthesis
) &#41; --- Right parenthesis
* &#42; --- Asterisk
+ &#43; --- Plus sign
, &#44; --- Comma
- &#45; --- Hyphen
. &#46; --- Period (fullstop)
/ &#47; --- Solidus (slash)
0 &#48; --- Digit 0
1 &#49; --- Digit 1
2 &#50; --- Digit 2
3 &#51; --- Digit 3
4 &#52; --- Digit 4
5 &#53; --- Digit 5
6 &#54; --- Digit 6
7 &#55; --- Digit 7
8 &#56; --- Digit 8
9 &#57; --- Digit 9
: &#58; --- Colon
; &#59; --- Semicolon
< &#60; &lt; Less than
= &#61; --- Equals sign
> &#62; &gt; Greater than
? &#63; --- Question mark
@ &#64; --- Commercial at
A &#65; --- Capital A
B &#66; --- Capital B
C &#67; --- Capital C
D &#68; --- Capital D
E &#69; --- Capital E
F &#70; --- Capital F
G &#71; --- Capital G
H &#72; --- Capital H
I &#73; --- Capital I
J &#74; --- Capital J
K &#75; --- Capital K
L &#76; --- Capital L
M &#77; --- Capital M
N &#78; --- Capital N
O &#79; --- Capital O
P &#80; --- Capital P
Q &#81; --- Capital Q
R &#82; --- Capital R
S &#83; --- Capital S
T &#84; --- Capital T
U &#85; --- Capital U
V &#86; --- Capital V
W &#87; --- Capital W
X &#88; --- Capital X
Y &#89; --- Capital Y
Z &#90; --- Capital Z
[ &#91; --- Left square bracket
\ &#92; --- Reverse solidus (backslash)
] &#93; --- Right square bracket
^ &#94; --- Caret
_ &#95; --- Horizontal bar (underscore)
` &#96; --- Grave accent
a &#97; --- Small a
b &#98; --- Small b
c &#99; --- Small c
d &#100; --- Small d
e &#101; --- Small e
f &#102; --- Small f
g &#103; --- Small g
h &#104; --- Small h
i &#105; --- Small i
j &#106; --- Small j
k &#107; --- Small k
l &#108; --- Small l
m &#109; --- Small m
n &#110; --- Small n
o &#111; --- Small o
p &#112; --- Small p
q &#113; --- Small q
r &#114; --- Small r
s &#115; --- Small s
t &#116; --- Small t
u &#117; --- Small u
v &#118; --- Small v
w &#119; --- Small w
x &#120; --- Small x
y &#121; --- Small y
z &#122; --- Small z
{ &#123; --- Left curly brace
| &#124; --- Vertical bar
} &#125; --- Right curly brace
~ &#126; --- Tilde
--- &#127; --- Unused
  &#160; &nbsp; Nonbreaking space (alternatively, wrap the text in <nobr>...</nobr>)
¡ &#161; &iexcl; Inverted exclamation
¢ &#162; &cent; Cent sign
£ &#163; &pound; Pound sterling
¤ &#164; &curren; General currency sign
¥ &#165; &yen; Yen sign
¦ &#166; &brvbar; or &brkbar; Broken vertical bar
§ &#167; &sect; Section sign
¨ &#168; &uml; or &die; Diæresis / Umlaut
© &#169; &copy; Copyright
ª &#170; &ordf; Feminine ordinal
« &#171; &laquo; Left angle quote, guillemet left
¬ &#172; &not; Not sign
­ &#173; &shy; Soft hyphen
® &#174; &reg; Registered trademark
¯ &#175; &macr; or &hibar; Macron accent
° &#176; &deg; Degree sign
± &#177; &plusmn; Plus or minus
² &#178; &sup2; Superscript two
³ &#179; &sup3; Superscript three
´ &#180; &acute; Acute accent
µ &#181; &micro; Micro sign
¶ &#182; &para; Paragraph sign
· &#183; &middot; Middle dot
¸ &#184; &cedil; Cedilla
¹ &#185; &sup1; Superscript one
º &#186; &ordm; Masculine ordinal
» &#187; &raquo; Right angle quote, guillemet right
¼ &#188; &frac14; Fraction one-fourth
½ &#189; &frac12; Fraction one-half
¾ &#190; &frac34; Fraction three-fourths
¿ &#191; &iquest; Inverted question mark
À &#192; &Agrave; Capital A, grave accent
Á &#193; &Aacute; Capital A, acute accent
 &#194; &Acirc; Capital A, circumflex
à &#195; &Atilde; Capital A, tilde
Ä &#196; &Auml; Capital A, diæresis / umlaut
Å &#197; &Aring; Capital A, ring
Æ &#198; &AElig; Capital AE ligature
Ç &#199; &Ccedil; Capital C, cedilla
È &#200; &Egrave; Capital E, grave accent
É &#201; &Eacute; Capital E, acute accent
Ê &#202; &Ecirc; Capital E, circumflex
Ë &#203; &Euml; Capital E, diæresis / umlaut
Ì &#204; &Igrave; Capital I, grave accent
Í &#205; &Iacute; Capital I, acute accent
Î &#206; &Icirc; Capital I, circumflex
Ï &#207; &Iuml; Capital I, diæresis / umlaut
Ð &#208; &ETH; Capital Eth, Icelandic
Ñ &#209; &Ntilde; Capital N, tilde
Ò &#210; &Ograve; Capital O, grave accent
Ó &#211; &Oacute; Capital O, acute accent
Ô &#212; &Ocirc; Capital O, circumflex
Õ &#213; &Otilde; Capital O, tilde
Ö &#214; &Ouml; Capital O, diæresis / umlaut
× &#215; &times; Multiply sign
Ø &#216; &Oslash; Capital O, slash
Ù &#217; &Ugrave; Capital U, grave accent
Ú &#218; &Uacute; Capital U, acute accent
Û &#219; &Ucirc; Capital U, circumflex
Ü &#220; &Uuml; Capital U, diæresis / umlaut
Ý &#221; &Yacute; Capital Y, acute accent
Þ &#222; &THORN; Capital Thorn, Icelandic
ß &#223; &szlig; Small sharp s, German sz
à &#224; &agrave; Small a, grave accent
á &#225; &aacute; Small a, acute accent
â &#226; &acirc; Small a, circumflex
ã &#227; &atilde; Small a, tilde
ä &#228; &auml; Small a, diæresis / umlaut
å &#229; &aring; Small a, ring
æ &#230; &aelig; Small ae ligature
ç &#231; &ccedil; Small c, cedilla
è &#232; &egrave; Small e, grave accent
é &#233; &eacute; Small e, acute accent
ê &#234; &ecirc; Small e, circumflex
ë &#235; &euml; Small e, diæresis / umlaut
ì &#236; &igrave; Small i, grave accent
í &#237; &iacute; Small i, acute accent
î &#238; &icirc; Small i, circumflex
ï &#239; &iuml; Small i, diæresis / umlaut
ð &#240; &eth; Small eth, Icelandic
ñ &#241; &ntilde; Small n, tilde
ò &#242; &ograve; Small o, grave accent
ó &#243; &oacute; Small o, acute accent
ô &#244; &ocirc; Small o, circumflex
õ &#245; &otilde; Small o, tilde
ö &#246; &ouml; Small o, diæresis / umlaut
÷ &#247; &divide; Division sign
ø &#248; &oslash; Small o, slash
ù &#249; &ugrave; Small u, grave accent
ú &#250; &uacute; Small u, acute accent
û &#251; &ucirc; Small u, circumflex
ü &#252; &uuml; Small u, diæresis / umlaut
ý &#253; &yacute; Small y, acute accent
þ &#254; &thorn; Small thorn, Icelandic
ÿ &#255; &yuml; Small y, diæresis / umlaut
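The decimal and named columns of the table can be reproduced programmatically. The following Python sketch uses the standard html and html.entities modules. The helper entity_forms is a hypothetical name, and its '---' placeholder simply mirrors the table's convention for characters without a named entity.

```python
import html
from html.entities import codepoint2name


def entity_forms(ch):
    """Return (decimal reference, named reference or '---') for one character."""
    code = ord(ch)
    name = codepoint2name.get(code)  # HTML 4 entity name, if one exists
    named = "&" + name + ";" if name else "---"
    return "&#" + str(code) + ";", named


row_eacute = entity_forms("\u00e9")  # small e, acute accent
row_hash = entity_forms("#")         # number sign has no named entity

# Going the other way: both reference styles decode to the same character.
decoded = html.unescape("&#233; &eacute; &copy;")
```

Numeric references work for any code point, while named references exist only for the characters that HTML defines names for, which is why many rows show '---' in the named column.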

Questions: