AT & T's Baudot to ASCII and UTF-8

Table of Contents

  1. Click here to go to BAUDOT's background.
  2. Click here to go to ASCII's code page.
  3. Click here to go to the IBM PC's original code page 437, followed shortly after by code page 850.
  4. Click here to go to Microsoft's new UTF-8 Notepad, released May 2019.

Electro-Magnetism

Image from www.solarschools.net

 

Image from www.elmhurst.edu

Introduction

How is electricity created? See pictures.

Magnets


Timeline
1600: First use of the Latin adjective "electricus", relating to the attractive properties of amber, was published by William Gilbert, a London physician.
 
The first usage of the English word "electricity" is ascribed to Sir Thomas Browne in his 1646 work, Pseudodoxia Epidemica.
1663: Click here for the earliest "electric machines", called friction machines, starting in 1663.
 
Click here for a basic tutorial on static electricity, atoms and ions.
1746: First use of the words positive and negative (plus or minus) for the words vitreous (glass) and resinous (gum) electricity by Benjamin Franklin. At the time European scientists had determined there were two kinds of electricity: vitreous, which was produced by rubbing glass with silk, and resinous, produced on resin rubbed with wool or fur.
 
Click here for the excellent article "BEN FRANKLIN SHOULD HAVE SAID ELECTRONS ARE POSITIVE?", and other misunderstandings on the subject of electricity that developed over the next 200 years.
1800: Click here for a timeline of the battery or cell, invented in 1800 by Alessandro Volta. ("Cell" is derived from the Greek "kalux": to conceal (shell) or cover.)
 
Click here for the relationships in the use of its language: How volts of difference in charge between two points on a closed circuit divided by ohms of resistance produces amps of current and watts of power (by using coulombs of excess electrons/protons to specify joules of energy per second).
1844: Click here for background to the electric telegraph and that famous message in Morse code, "What hath God wrought", transmitted over an electric circuit between Baltimore and Washington (a distance of 64 kilometres) in 1844.
 
Click here for a definition of magnetism and a simple experiment using a battery.
 
Click here for a Yahoo article on current, both DC and AC (and how it lights a light bulb).
 
Click here for an image and article contrasting the drift velocity (in wire) of free electrons under an applied electric field (in the order of millimetres per second), their actual speed (a million metres per second), and the signal's communication speed (i.e. the speed of light).
 
Click here for further background to electric fields (formed whenever voltages push or pull) and magnetic fields (formed as the coulombs of electrons actually move).
1886: Click here for the electromagnetic frequencies and the discovery of radio.
 
Click here for an easy to read article on how radio waves normally pass through walls (they're so big), and gamma waves pass through walls (they're so small), but light waves normally do not.
 
Click here for how the Voyager Spacecraft is then able to transmit over a distance greater than 10 billion kilometres.
 
Lastly, click here for NASA's article on space radiation (alpha particles, beta particles, gamma rays, etc.).

Power Grids, Coal and Renewables, Greenhouse Gases and Global Warming

US Power Grid
Click here for a modern day power grid transmitted from power plant to house.

Australia's electricity requirements in 2017
Click here for an article by www.world-nuclear.org.

Much of it is produced by the grid-connected NEM (National Electricity Market) in the east and south-east of the country. Secondary grids are in Western Australia. In 2015-2016, 44% came from black coal (mostly NSW and Qld) and 20% was from brown coal (mainly Victoria). Hence 64% of that year's needs came from coal. The long ramp up to peak capacity means that coal fired power stations frequently operate 24 hours a day, 7 days a week, so are ideal for providing baseload demand, while other more responsive sources of electricity meet peaks in demand. 20% came from natural gas, 6% from hydro, 5% from wind, 2.7% from solar, 2% from oil and 1% from Biomass (IEA data).

...


Total emissions worldwide 36 billion tonnes.
UK, France and Italy's emissions were 350 - 400 million tonnes each for the year.

Click here for a worldwide list of countries, sortable by each country's electricity production in GWh (Gigawatt hours) and by their individual sources.

Click here for a recent Nuclear Power article in the Australian, in February 2024.

Further update June 2024
Extract: Oil, gas and coal boom shatter decarbonisation myth
A climate graph which mixes fact with prophecy
Greg Sheridan, The Weekend Australian 15 Jun 2024

The Kyoto Protocol was adopted, in the serene and beguiling Japanese city of that name, in 1997, 27 years ago. Kyoto itself built on the 1992 UN Framework Convention on Climate Change.

The graph below from the IEA shows the rise of the use of gas, oil and coal, measured in exajoules (one joule, a measure of energy, to the power of 18; that is to say, lots of joules, one joule being the equivalent of 10^7 ergs). The left side of the graph’s curve, up to the peak in 2022, which has been maintained in 2023, describes things that have already happened. That part of the graph is indisputable fact. According to the IEA: “Global coal consumption reached an all-time high in 2022, and the world is heading towards a new record in 2023.”

The right side of the graph shows a steep decline in the use of coal, oil and gas. But that’s purely speculative. That’s more or less taking an end point of declared policy, the Paris targets, and plotting a line that gets there.

...

Advanced economies such as the US and the EU are using less coal but, says the IEA, “the growth in China and India, as well as Indonesia, Vietnam and The Philippines, will more than offset these decreases on a global level”. And the price of coal, at $US140 a tonne, is very healthy.

That’s a good thing because our top three export earners are coal, iron ore and gas. We couldn’t afford any fancy green measures, or Medicare, or the National Disability Insurance Scheme, or anything else, without the minerals industry.

According to the IEA, fossil fuels make up about 80 per cent of global energy, just a tick under their level 10 years ago.

The developed countries are reducing greenhouse gas emissions, but the developed countries are no longer the big story. China is the biggest greenhouse gas emitter by far. It accounts for more than 29 per cent of global emissions, more than the US and EU put together. The top 10 emitters are: China, the US, India, Russia, Brazil, Indonesia, Japan, Iran, Mexico and Saudi Arabia. Of those only two, the US and Japan, are rich, developed countries. Almost none of the others has binding targets or any commitment to when their emissions will even peak.

When like is genuinely compared with like, coal is cheaper than renewables. Because with renewables you have to take account of the fact that most of the time they don’t operate so you need vast extra capacity, sometimes there are wind droughts and long cloudy periods so you need vast back-up systems of gas or coal or something else, the transmission infrastructure is enormous and the costs huge, and after 25 years or so you’ve got to throw away all the renewable stuff and replace it.

Almost everywhere that introduces vast renewable energy, apart from hydro, sees big electricity price rises.

The Albanese government has got great mileage from a Climate Change and Energy Department projection that Australia will reach a 42 per cent reduction in greenhouse emissions by 2030, just 1 per cent shy of our target of 43 per cent. Yet a UN committee examining the issue doesn’t think even one G20 country will meet its target. The government is miles behind in the rollout of renewables. Electric vehicle sales are a small fraction of the forecast sales. But still we are, according to the "magic forecast", just 1 per cent off target.

Finally, click here for NASA's greenhouse gases climate warming issue. The planet has warmed by at least 1.1° Celsius since 1880, with the majority of that apparently occurring in the past 50 years.


Background to the Baudot (5-bit) Telegraph (and the teletype)

Introduction

Now, over to computer bits: what is a bit?
Every microscopic transistor in a computer is in one of 2 states: charged or not charged (i.e. it holds a one or a zero). When data is transmitted to another computer, whether by (1) single-ended signalling on one wire using its reference wire, (2) differential signalling over twisted pair cable (or high performing coaxial cable), (3) on/off light flashes over fibre optic cable using semiconductor lasers or (4) frequency variations over mobile wireless, each individual signal is ultimately either a one or a zero. Every 8 signals (or bits) form a byte, which has 2 to the 8th power variations, i.e. 256 different values, each able to represent a character. These 256 characters are sometimes called a character set and, in IBM / Microsoft parlance, a code page. Seven of the eight bits (which control the first 128 characters) have been standardized worldwide and are known as ASCII (American Standard Code for Information Interchange), put together in 1960-63 by a subcommittee of the American Standards Association. In the all-encompassing Unicode standard today, this set is referred to as Basic Latin.
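The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a small sketch; the function names distinct_values and to_bits are made up for the illustration.

```python
def distinct_values(bits: int) -> int:
    """Number of different patterns that `bits` one-or-zero signals can form."""
    return 2 ** bits


def to_bits(code: int, width: int = 8) -> str:
    """Render a character code as a fixed-width string of ones and zeros."""
    return format(code, f"0{width}b")


if __name__ == "__main__":
    print(distinct_values(5))   # five-unit telegraph codes
    print(distinct_values(7))   # seven-bit ASCII
    print(distinct_values(8))   # one eight-bit byte / code page
    print(to_bits(ord("A")))    # the eight bits that hold the letter A
```

The eighth (leftmost) bit is zero for every ASCII character, which is why the first 128 codes are the same in every code page discussed on this page.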

The ASCII Timeline, starting with Émile Baudot
In 1874 we find the Baudot Telegraph Code. It was transmitted through a five-key piano-like machine (click here for a picture) and was based on the sequencing of five contacts (or channels) that enabled 2 to the 5th power, i.e. 32, different characters. Émile Baudot built it in France as a multiplexed, multi-user system that allowed up to four machines to transmit almost simultaneously (via the use of time-slices) to printing machines synchronized to each message through a start signal. It printed its text on a thin paper strip or "ticker tape" (click here for that picture). The 32-character code (ITA1) provided for 26 upper-case letters. When numbers were required to be sent, a key would be pressed that acted somewhat like a NumLock key; a different keypress would switch this NumLock status off.
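The NumLock-like switching described above can be sketched as a decoder that remembers which of two tables is currently active. The code table below is deliberately tiny and hypothetical (it is not the real ITA1 or ITA2 assignment), and the two shift values are an assumption borrowed from the later ITA2 code.

```python
LTRS = 0b11111  # "switch to letters" code (assumed; this is the ITA2 value)
FIGS = 0b11011  # "switch to figures" code (assumed; this is the ITA2 value)

# Hypothetical three-entry tables, for illustration only.
LETTERS = {0b00001: "A", 0b00010: "B", 0b00011: "C"}
FIGURES = {0b00001: "1", 0b00010: "2", 0b00011: "3"}


def decode(codes):
    """Decode 5-bit codes, switching tables whenever a shift code arrives."""
    table = LETTERS          # assume the machine starts in letters mode
    out = []
    for code in codes:
        if code == LTRS:
            table = LETTERS
        elif code == FIGS:
            table = FIGURES
        else:
            out.append(table.get(code, "?"))  # "?" marks an unknown code
    return "".join(out)


if __name__ == "__main__":
    print(decode([0b00001, 0b00010, FIGS, 0b00001, 0b00010, LTRS, 0b00011]))
```

Because the same five-bit pattern means different things in each mode, 32 patterns can cover letters, numerals and punctuation, at the cost of sending an extra shift code whenever the mode changes.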

5-bit Teletype/teleprinter

Combining these codes with punched paper tape for higher-speed transmission, Donald Murray rearranged the Baudot codes in such a way as to reduce wear and tear on the tape perforating machine, and patented the result in the US in 1901. Interest was shown by Western Union, but with Morse code everywhere (it worked over a one-key paper tape system, not five) companies were loath to change across. However, work continued on this five-unit code by the Morkrum Company, later known as the Teletype Corporation, which established a start and a stop signal between each character. This enabled asynchronous communication over both telegraph wire and radio. Being asynchronous meant it didn't require an external clock signal coordinating the circuitry. It is both simpler and cheaper, and is today used everywhere over the World Wide Web. Click here for a practical account of how it works.
Summarized below

How do stop and start signals work?
The stop bit is essentially "enforced line idle time" between characters. That time is also used to absorb some clock mismatch between sender and receiver.

When idle, the receiver monitors the line looking for it to change to the non-idle state. When that happens, it starts a stopwatch. Since it is configured for the same bit rate as the transmitter, it knows when each data bit is sent, whether it differs from a previous bit or not. The middle of the first data bit is at 1½ bit times from the leading edge of the start bit. The second at 2½ bit times, etc.

After the last data bit is received, the receiver waits for the line to go back to the idle state, then waits for the next start bit again.
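The receiver's "stopwatch" can be sketched in Python by treating the line as a list of samples. This is a simplified model, not a description of any particular teleprinter: the idle level of 1, the start level of 0, the least-significant-bit-first order and the 16 samples per bit are all assumptions made for the example, and the names frame, to_samples and receive are invented here.

```python
IDLE = 1    # assumed idle line level (also the stop-bit level)
START = 0   # assumed start-bit level, i.e. the non-idle state


def frame(code, data_bits=5):
    """Sender side: one start bit, the data bits (LSB first), one stop bit."""
    bits = [(code >> i) & 1 for i in range(data_bits)]
    return [START] + bits + [IDLE]


def to_samples(bits, samples_per_bit=16):
    """Stretch each bit into several line samples."""
    return [b for b in bits for _ in range(samples_per_bit)]


def receive(samples, data_bits=5, samples_per_bit=16):
    """Receiver side: find each start edge, then sample mid-bit."""
    codes = []
    i = 0
    while i < len(samples):
        if samples[i] == IDLE:
            i += 1                      # line idle: keep watching
            continue
        # i is the leading edge of a start bit: "start the stopwatch".
        if i + int((data_bits + 0.5) * samples_per_bit) >= len(samples):
            break                       # incomplete character at the end
        code = 0
        for k in range(data_bits):
            # middle of data bit k is (k + 1.5) bit times after the edge
            middle = i + int((k + 1.5) * samples_per_bit)
            code |= samples[middle] << k
        i += (1 + data_bits) * samples_per_bit
        while i < len(samples) and samples[i] != IDLE:
            i += 1                      # wait for the line to return to idle
        codes.append(code)
    return codes


if __name__ == "__main__":
    line = [IDLE] * 5 + to_samples(frame(0b10110)) + to_samples(frame(0b00001))
    print(receive(line))
```

Sampling in the middle of each bit is what lets the stop bit absorb clock mismatch: if the sender in this sketch runs slightly slow (17 samples per bit against the receiver's 16), every sample still lands inside the correct bit, and the timing error is thrown away when the next start edge resets the stopwatch.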

First to adopt this was the Associated Press news cooperative in the US in 1914, then the Western Union Telegraph Company in 1923, followed by AT & T Corporation, which purchased the Morkrum Company in 1930. Murray's Baudot code (known as ITA2) thus became the Teletype (TTY) standard.

Punched Cards, the US Census, and IBM

Back in 1889, Herman Hollerith had patented a mechanism using electrical connections to increment a counter, recording information. The key idea was that a datum could be recorded by the presence or absence of a hole at a specific location on a card. For example, if a specific hole location indicates marital status, then a hole there can indicate married while not having a hole indicates single.

The use of Hollerith's electromechanical tabulators helped reduce the time required to process the US census from eight years for the 1880 census to six years for the 1890 census. In 1911, four corporations, including Hollerith's firm, were amalgamated to form a fifth company, the Computing-Tabulating-Recording Company (CTR).
Under the presidency of Thomas John Watson, CTR was renamed International Business Machines Corporation (IBM) in 1924.

The first computers

The Friden mechanical calculator (in Tokyo). The state of the art in precision scientific and engineering calculation was the ten-digit, electrically powered, mechanical calculator, such as the one above. The electronic computer word length of 36-bits was chosen, in part, to match its precision.
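As a rough check on the link between ten decimal digits and a 36-bit word, the number of bits needed for the largest ten-digit number can be computed directly. This is only a sketch of the arithmetic; the function name bits_needed is invented here, and the extra sign bit is an assumption about how such a word might be laid out.

```python
def bits_needed(decimal_digits: int) -> int:
    """Bits required to hold the largest unsigned number with that many digits."""
    largest = 10 ** decimal_digits - 1
    return largest.bit_length()


if __name__ == "__main__":
    magnitude = bits_needed(10)      # bits for 9,999,999,999
    with_sign = magnitude + 1        # plus one assumed sign bit
    print(magnitude, with_sign, with_sign <= 36)
```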

IBM and 6-bit Character Codes

Early computers, from the IBM 704 in 1954 onwards, replaced Baudot's 5-bit code with a 6-bit code (after 1956 called a "byte"). The IBM 704, weighing nearly 10 tonnes, came with a control console and screen having 36 data-input switches, one for each bit in a register. For human interaction with the computer, programs would initially be entered on punched cards rather than at the console, and human-readable output would be directed to an alphabetic printer.
Awkwardly, other manufacturers e.g. DEC, ICL, and even different IBM machines introduced variations when it came to coding upper-case letters, numerals, punctuation characters, and control characters.

Standardized 7-bit ASCII

Seven-bit ASCII was published in 1963 by the American Standards Association formed in 1961, with lower case added in 1968. It was heavily promoted by AT&T in the US, with significant input also coming from the US Army. Click here for the AT&T Teletype Corporation's Model 33 terminal, introduced in 1963 with its eight-hole punched tape reader and tape punch. It was one of the most popular terminals in the data-communications industry until the late 1970s, with over half a million Model 32s (which supported 5-bit Baudot) and 33s (supporting 7-bit ASCII) made by 1975.

DEC's PDP-10 family of computers accordingly based their 6-bit character code on ASCII character codes 32 to 95. They then based the 8-bit character code for the PDP-11 on 7-bit ASCII. It was a PDP-10 which Gary Kildall used to develop his CP/M operating system for microcomputers (with ASCII) in 1973-74. ASCII's support for both upper-case and lower-case characters meant that Wang's dedicated Word Processor CRT and printer, which used it, was "flavour of the month" in 1976. The success of WordStar, which ran initially on CP/M in 1979 and in 1982 was ported to run on the IBM PC (IBM being the market leader in both typewriters and computers), meant that 7-bit ASCII's future was secure.
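A 6-bit code built on ASCII codes 32 to 95 can be sketched as a simple offset: 64 consecutive ASCII codes fit exactly into the 64 values that six bits allow. The subtract-32 mapping below is an assumption made for the illustration, and the function names are invented here.

```python
def ascii_to_sixbit(ch: str) -> int:
    """Map an ASCII character in the range 32..95 to a 6-bit value 0..63."""
    code = ord(ch)
    if not 32 <= code <= 95:
        raise ValueError(f"{ch!r} is outside ASCII 32-95")
    return code - 32


def sixbit_to_ascii(value: int) -> str:
    """Map a 6-bit value 0..63 back to its ASCII character."""
    if not 0 <= value <= 63:
        raise ValueError("a 6-bit value must be between 0 and 63")
    return chr(value + 32)


if __name__ == "__main__":
    print([ascii_to_sixbit(c) for c in "PDP-10"])
    print("".join(sixbit_to_ascii(v) for v in [48, 36, 48, 13, 17, 16]))
```

The range 32 to 95 covers the space, punctuation, numerals and upper-case letters, so lower-case text cannot be represented; ascii_to_sixbit("a") raises an error in this sketch.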

8-bit Extended ASCII

But with the next 128 characters defined by the eighth bit — known as the extended set — a large number of proprietary variations (and problems) arose.

During the 1970s, DEC's computers, the various CP/M installations, IBM's PC-DOS in 1981, and then Apple's Macintosh in 1984 all had different extended sets. In 1987, IBM (working with Microsoft) launched PC-DOS 3.3 on the IBM PC and PS/2, having separate 8-bit code pages for numerous languages. In the same year a standardised list of 8-bit character sets was launched as a joint exercise by ISO and IEC in Geneva Switzerland, referred to as ISO/IEC 8859.

In other non-ASCII developments, since 1964 IBM had been providing 8-bit proprietary code pages known as EBCDIC that supported multilingual text on their mainframe and mid-range computers. Click here for a full list. They are still in use on IBM's 10,000 or so mainframes nearly 60 years later.

In 1969, the Japanese government published an industrial standard character set called JIS C 6220 which they later called JIS X 0201. In 1967-74 the Soviet authorities designed the KOI-8 character encoding, covering the Cyrillic alphabet. In 1976 in Taiwan, Chu Bong-Foo launched Cangjie for traditional Chinese characters by using multiple bytes. In 1984 in Beijing, Wang Yongmin released the MS Wubi input method on the IBM PC for the simplified Chinese character set used in mainland China. MS Pinyin came in 1996.

16-bit Universal Unicode and UTF-8

Finally we come to Unicode, a 16-bit cross-platform character set (65,536 code points) developed by Joe Becker at Xerox, working with Lee Collins and Mark Davis at Apple, and published in 1991. A second volume covering Han ideographs was published in June 1992.
Also in 1992 UTF-8 was developed by Ken Thompson and Rob Pike at Bell Laboratories as an efficient way of encoding Unicode, using one byte for the first 128 code points (seven-bit ASCII), and up to 4 bytes for the additional characters.
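The one-to-four-byte scheme can be sketched with a small hand-written encoder and checked against Python's built-in UTF-8 codec. The function name encode_code_point is invented for this example, and the sketch skips the validity checks (for instance, rejecting surrogate code points) that a real encoder performs.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (no validity checks)."""
    if cp < 0x80:                       # 7 bits: a single ASCII byte
        return bytes([cp])
    if cp < 0x800:                      # up to 11 bits: two bytes
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:                    # up to 16 bits: three bytes
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),    # up to 21 bits: four bytes
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    for ch in ["A", "é", "€", "😀"]:
        encoded = encode_code_point(ord(ch))
        # Compare with the built-in codec and show the byte count.
        print(ch, encoded.hex(" "), len(encoded), encoded == ch.encode("utf-8"))
```

Any file containing only seven-bit ASCII is therefore already valid UTF-8, byte for byte.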

By 2019, UTF-8 had become the dominant encoding on the World Wide Web (used in over 94% of websites). Back in 1996, additional "planes" were added to the Unicode codespace, making it seventeen planes (numbered 0 to 16) or 1,114,112 code points in total. As of 2019 it contained a repertoire of 137,994 characters covering 150 modern and historic scripts, as well as multiple symbol sets and emoji.
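The codespace total is simply seventeen planes of 65,536 code points each, which can be confirmed in Python. This is a small sketch; plane_of is a name invented for the example.

```python
import sys

PLANES = 17
CODE_POINTS_PER_PLANE = 2 ** 16   # 65,536, the size of the original 16-bit set


def plane_of(ch: str) -> int:
    """Plane number (0 to 16) that a character's code point falls in."""
    return ord(ch) >> 16


if __name__ == "__main__":
    print(PLANES * CODE_POINTS_PER_PLANE)   # size of the whole codespace
    print(hex(sys.maxunicode))              # highest code point Python accepts
    print(plane_of("A"), plane_of("😀"))
```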

Click here to go to Microsoft's new UTF-8 Notepad released May 2019.

It includes some notes on difficulties you may face with its � ANSI replacement character.

ANSI Format

Below is the ASCII-based character set referred to as the ANSI format, which closely follows the ISO-8859-1 character set. It came from the Multinational Character Set (MCS) that had been created in 1983 by DEC for use in their cross-platform VT220 terminal, and was launched with Windows in 1985, where it is known as the Windows-1252 code page.

Because of the non-displayable control characters that DEC had specified for character codes 128-159, the displayable characters below were initially regarded as not "cross-platform" safe. However, all modern browsers do display them correctly. They were added as part of Windows 3.1 in 1992, with the euro symbol, first presented to the public by the European Commission on 12 December 1996, added to Windows 98 as character code 128.

When selecting a character from this extended set (i.e. codes 128-255) in Windows, press the Alt key with the left hand, and simultaneously type a "0" then the number on the numeric key pad (not the top row numbers) with the right hand. When you release the Alt key, the character will be displayed.
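To see which character a given Alt+0 number should produce, the number can be treated as a single Windows-1252 byte and decoded. This sketch relies on Python's built-in "cp1252" codec; ansi_char is a name invented for the example, and returning None for codes the codec leaves undefined is a design choice made here.

```python
def ansi_char(code: int):
    """Character for a Windows-1252 code (0-255), or None if it is undefined."""
    try:
        return bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        return None   # code has no character assigned in Python's cp1252 table


if __name__ == "__main__":
    for code in (128, 169, 233, 129):
        print(f"Alt+0{code}: {ansi_char(code)!r}")
```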

Note too that characters with a value below 32, which are used for special control commands, do not require the "0" first. The character repertoire for these controls was taken from the character set of Wang word-processing machines, as explicitly admitted by Bill Gates in the interview of him and Paul Allen in the 2 October 1995 edition of Fortune Magazine: "... we were also fascinated by dedicated word processors from Wang, because we believed that general-purpose machines could do that just as well. That's why, when it came time to design the keyboard for the IBM PC, we put the funny Wang character set into the machine--you know, smiley faces and boxes and triangles and stuff. We were thinking we'd like to do a clone of Wang word-processing software someday."

0 NUL   1 ☺   2 ☻   3 ♥   4 ♦   5 ♣   6 ♠   7 BELL   8 BS
9 TAB   10 LF   11 VT   12 FF   13 CR   14 SO   15 SI   16 ►   17 ◄
18 ↕   19 ‼   20 ¶   21 §   22 ▬   23 ↨   24 ↑   25 ↓   26 →
27 ESC   28 ∟   29 ↔   30 ▲   31 ▼   32 (space)   33 !   34 "   35 #
36 $   37 %   38 &   39 '   40 (   41 )   42 *   43 +   44 ,
45 -   46 .   47 /   48 0   49 1   50 2   51 3   52 4   53 5
54 6   55 7   56 8   57 9   58 :   59 ;   60 <   61 =   62 >
63 ?   64 @   65 A   66 B   67 C   68 D   69 E   70 F   71 G
72 H   73 I   74 J   75 K   76 L   77 M   78 N   79 O   80 P
81 Q   82 R   83 S   84 T   85 U   86 V   87 W   88 X   89 Y
90 Z   91 [   92 \   93 ]   94 ^   95 _   96 `   97 a   98 b
99 c   100 d   101 e   102 f   103 g   104 h   105 i   106 j   107 k
108 l   109 m   110 n   111 o   112 p   113 q   114 r   115 s   116 t
117 u   118 v   119 w   120 x   121 y   122 z   123 {   124 |   125 }
126 ~   127 DEL   128 €   129 (not used)   130 ‚   131 ƒ   132 „   133 …   134 †
135 ‡   136 ˆ   137 ‰   138 Š   139 ‹   140 Œ   141 (not used)   142 Ž   143 (not used)
144 (not used)   145 ‘   146 ’   147 “   148 ”   149 •   150 –   151 —   152 ˜
153 ™   154 š   155 ›   156 œ   157 (not used)   158 ž   159 Ÿ   160 (no-break space)   161 ¡
162 ¢   163 £   164 ¤   165 ¥   166 ¦   167 §   168 ¨   169 ©   170 ª
171 «   172 ¬   173 (soft hyphen)   174 ®   175 ¯   176 °   177 ±   178 ²   179 ³
180 ´   181 µ   182 ¶   183 ·   184 ¸   185 ¹   186 º   187 »   188 ¼
189 ½   190 ¾   191 ¿   192 À   193 Á   194 Â   195 Ã   196 Ä   197 Å
198 Æ   199 Ç   200 È   201 É   202 Ê   203 Ë   204 Ì   205 Í   206 Î
207 Ï   208 Ð   209 Ñ   210 Ò   211 Ó   212 Ô   213 Õ   214 Ö   215 ×
216 Ø   217 Ù   218 Ú   219 Û   220 Ü   221 Ý   222 Þ   223 ß   224 à
225 á   226 â   227 ã   228 ä   229 å   230 æ   231 ç   232 è   233 é
234 ê   235 ë   236 ì   237 í   238 î   239 ï   240 ð   241 ñ   242 ò
243 ó   244 ô   245 õ   246 ö   247 ÷   248 ø   249 ù   250 ú   251 û
252 ü   253 ý   254 þ   255 ÿ

Official ASCII Names (Originally developed by BELL for telecommunications)


Control Characters
00: NULL
01: START OF HEADING
02: START OF TEXT
03: END OF TEXT
04: END OF TRANSMISSION
05: ENQUIRY
06: ACKNOWLEDGE
07: BELL
08: BACKSPACE
09: HORIZONTAL TABULATION
10: LINE FEED
11: VERTICAL TABULATION
12: FORM FEED
13: CARRIAGE RETURN
14: SHIFT OUT
15: SHIFT IN
16: DATA LINK ESCAPE
17: DEVICE CONTROL ONE
18: DEVICE CONTROL TWO
19: DEVICE CONTROL THREE
20: DEVICE CONTROL FOUR
21: NEGATIVE ACKNOWLEDGE
22: SYNCHRONOUS IDLE
23: END OF TRANSMISSION BLOCK
24: CANCEL
25: END OF MEDIUM
26: SUBSTITUTE
27: ESCAPE
28: FILE SEPARATOR
29: GROUP SEPARATOR
30: RECORD SEPARATOR
31: UNIT SEPARATOR

Printable Characters
32: SPACE
33: EXCLAMATION MARK
34: QUOTATION MARK
35: NUMBER SIGN
36: DOLLAR SIGN
37: PERCENT SIGN
38: AMPERSAND
39: APOSTROPHE
40: LEFT PARENTHESIS
41: RIGHT PARENTHESIS
42: ASTERISK
43: PLUS SIGN
44: COMMA
45: HYPHEN-MINUS
46: FULL STOP
47: SOLIDUS
48-57: INDIVIDUAL NUMBERS 0-9
58: COLON
59: SEMICOLON
60: LESS-THAN SIGN
61: EQUALS SIGN
62: GREATER-THAN SIGN
63: QUESTION MARK
64: COMMERCIAL AT
65-90: INDIVIDUAL LATIN LETTERS UPPER-CASE A-Z
91: LEFT SQUARE BRACKET
92: REVERSE SOLIDUS
93: RIGHT SQUARE BRACKET
94: CIRCUMFLEX ACCENT
95: LOW LINE
96: GRAVE ACCENT
97-122: INDIVIDUAL LATIN LETTERS LOWER-CASE a-z
123: LEFT CURLY BRACKET
124: VERTICAL LINE
125: RIGHT CURLY BRACKET
126: TILDE

Control Character
127: DELETE

In ISO-8859-1 the next 32 codes (128-159) are reserved for control characters (the range known as C1), so the Windows-1252 characters listed here are not cross-platform safe from a display perspective. While modern browsers are fine, different characters are displayed on the Apple Mac. See this excellent article for their Unicode equivalents.


char dec col/row oct hex  description
[€]  128  08/00  200  80  EURO SYMBOL
[]   129  08/01  201  81  (NOT USED)
[‚]  130  08/02  202  82  LOW 9 SINGLE QUOTE
[ƒ]  131  08/03  203  83  FLORIN SIGN
[„]  132  08/04  204  84  LOW 9 DOUBLE QUOTE
[…]  133  08/05  205  85  ELLIPSIS
[†]  134  08/06  206  86  DAGGER
[‡]  135  08/07  207  87  DOUBLE DAGGER
[ˆ]  136  08/08  210  88  CIRCUMFLEX
[‰]  137  08/09  211  89  PER MIL SIGN
[Š]  138  08/10  212  8A  CAPITAL LETTER S WITH CARON
[‹]  139  08/11  213  8B  LEFT SINGLE QUOTE BRACKET
[Œ]  140  08/12  214  8C  CAPITAL DIGRAPH OE
[]   141  08/13  215  8D  (NOT USED)
[Ž]  142  08/14  216  8E  CAPITAL LETTER Z WITH CARON
[]   143  08/15  217  8F  (NOT USED)
[]   144  09/00  220  90  (NOT USED)
[‘]  145  09/01  221  91  HIGH 6 SINGLE QUOTE
[’]  146  09/02  222  92  HIGH 9 SINGLE QUOTE
[“]  147  09/03  223  93  HIGH 6 DOUBLE QUOTE
[”]  148  09/04  224  94  HIGH 9 DOUBLE QUOTE
[•]  149  09/05  225  95  LARGE CENTERED DOT
[–]  150  09/06  226  96  EN DASH
[—]  151  09/07  227  97  EM DASH
[˜]  152  09/08  230  98  TILDE
[™]  153  09/09  231  99  TRADEMARK SIGN
[š]  154  09/10  232  9A  SMALL LETTER S WITH CARON
[›]  155  09/11  233  9B  RIGHT SINGLE QUOTE BRACKET
[œ]  156  09/12  234  9C  SMALL DIGRAPH OE
[]   157  09/13  235  9D  (NOT USED)
[ž]  158  09/14  236  9E  SMALL LETTER Z WITH CARON
[Ÿ]  159  09/15  237  9F  CAPITAL LETTER Y WITH DIAERESIS
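The difference in this range can be seen by decoding the same byte under both character sets. The sketch below uses Python's built-in "cp1252" and "latin-1" codecs; compare is a name invented for the example, and bytes that the cp1252 codec leaves undefined come back as the replacement character U+FFFD because of errors="replace".

```python
import unicodedata


def compare(code: int):
    """Decode one byte as Windows-1252 and as ISO-8859-1 (Latin-1)."""
    raw = bytes([code])
    return raw.decode("cp1252", errors="replace"), raw.decode("latin-1")


if __name__ == "__main__":
    windows, latin1 = compare(147)
    print(repr(windows), repr(latin1))
    # Unicode general category "Cc" means a control character.
    print(unicodedata.category(windows), unicodedata.category(latin1))
    # From 160 upwards the two character sets agree on every code.
    print(all(compare(c)[0] == compare(c)[1] for c in range(160, 256)))
```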

The remaining characters (codes 160-255) are cross-platform safe. In Unicode they are referred to as the Latin-1 Supplement.


char dec col/row oct hex  description
[ ]  160  10/00  240  A0  NO-BREAK SPACE
[¡]  161  10/01  241  A1  INVERTED EXCLAMATION MARK
[¢]  162  10/02  242  A2  CENT SIGN
[£]  163  10/03  243  A3  POUND SIGN
[¤]  164  10/04  244  A4  CURRENCY SIGN
[¥]  165  10/05  245  A5  YEN SIGN
[¦]  166  10/06  246  A6  BROKEN BAR
[§]  167  10/07  247  A7  PARAGRAPH SIGN
[¨]  168  10/08  250  A8  DIAERESIS
[©]  169  10/09  251  A9  COPYRIGHT SIGN
[ª]  170  10/10  252  AA  FEMININE ORDINAL INDICATOR
[«]  171  10/11  253  AB  LEFT ANGLE QUOTATION MARK
[¬]  172  10/12  254  AC  NOT SIGN
[ ]  173  10/13  255  AD  SOFT HYPHEN
[®]  174  10/14  256  AE  REGISTERED TRADE MARK SIGN
[¯]  175  10/15  257  AF  MACRON
[°]  176  11/00  260  B0  DEGREE SIGN, RING ABOVE
[±]  177  11/01  261  B1  PLUS-MINUS SIGN
[²]  178  11/02  262  B2  SUPERSCRIPT TWO
[³]  179  11/03  263  B3  SUPERSCRIPT THREE
[´]  180  11/04  264  B4  ACUTE ACCENT
[µ]  181  11/05  265  B5  MICRO SIGN
[¶]  182  11/06  266  B6  PILCROW SIGN
[·]  183  11/07  267  B7  MIDDLE DOT
[¸]  184  11/08  270  B8  CEDILLA
[¹]  185  11/09  271  B9  SUPERSCRIPT ONE
[º]  186  11/10  272  BA  MASCULINE ORDINAL INDICATOR
[»]  187  11/11  273  BB  RIGHT ANGLE QUOTATION MARK
[¼]  188  11/12  274  BC  VULGAR FRACTION ONE QUARTER
[½]  189  11/13  275  BD  VULGAR FRACTION ONE HALF
[¾]  190  11/14  276  BE  VULGAR FRACTION THREE QUARTERS
[¿]  191  11/15  277  BF  INVERTED QUESTION MARK
[À]  192  12/00  300  C0  CAPITAL LETTER A WITH GRAVE ACCENT
[Á]  193  12/01  301  C1  CAPITAL LETTER A WITH ACUTE ACCENT
[Â]  194  12/02  302  C2  CAPITAL LETTER A WITH CIRCUMFLEX ACCENT
[Ã]  195  12/03  303  C3  CAPITAL LETTER A WITH TILDE
[Ä]  196  12/04  304  C4  CAPITAL LETTER A WITH DIAERESIS
[Å]  197  12/05  305  C5  CAPITAL LETTER A WITH RING ABOVE
[Æ]  198  12/06  306  C6  CAPITAL DIPHTHONG A WITH E
[Ç]  199  12/07  307  C7  CAPITAL LETTER C WITH CEDILLA
[È]  200  12/08  310  C8  CAPITAL LETTER E WITH GRAVE ACCENT
[É]  201  12/09  311  C9  CAPITAL LETTER E WITH ACUTE ACCENT
[Ê]  202  12/10  312  CA  CAPITAL LETTER E WITH CIRCUMFLEX ACCENT
[Ë]  203  12/11  313  CB  CAPITAL LETTER E WITH DIAERESIS
[Ì]  204  12/12  314  CC  CAPITAL LETTER I WITH GRAVE ACCENT
[Í]  205  12/13  315  CD  CAPITAL LETTER I WITH ACUTE ACCENT
[Î]  206  12/14  316  CE  CAPITAL LETTER I WITH CIRCUMFLEX ACCENT
[Ï]  207  12/15  317  CF  CAPITAL LETTER I WITH DIAERESIS
[Ð]  208  13/00  320  D0  CAPITAL ICELANDIC LETTER ETH
[Ñ]  209  13/01  321  D1  CAPITAL LETTER N WITH TILDE
[Ò]  210  13/02  322  D2  CAPITAL LETTER O WITH GRAVE ACCENT
[Ó]  211  13/03  323  D3  CAPITAL LETTER O WITH ACUTE ACCENT
[Ô]  212  13/04  324  D4  CAPITAL LETTER O WITH CIRCUMFLEX ACCENT
[Õ]  213  13/05  325  D5  CAPITAL LETTER O WITH TILDE
[Ö]  214  13/06  326  D6  CAPITAL LETTER O WITH DIAERESIS
[×]  215  13/07  327  D7  MULTIPLICATION SIGN
[Ø]  216  13/08  330  D8  CAPITAL LETTER O WITH OBLIQUE STROKE
[Ù]  217  13/09  331  D9  CAPITAL LETTER U WITH GRAVE ACCENT
[Ú]  218  13/10  332  DA  CAPITAL LETTER U WITH ACUTE ACCENT
[Û]  219  13/11  333  DB  CAPITAL LETTER U WITH CIRCUMFLEX ACCENT
[Ü]  220  13/12  334  DC  CAPITAL LETTER U WITH DIAERESIS
[Ý]  221  13/13  335  DD  CAPITAL LETTER Y WITH ACUTE ACCENT
[Þ]  222  13/14  336  DE  CAPITAL ICELANDIC LETTER THORN
[ß]  223  13/15  337  DF  SMALL GERMAN LETTER SHARP s
[à]  224  14/00  340  E0  SMALL LETTER a WITH GRAVE ACCENT
[á]  225  14/01  341  E1  SMALL LETTER a WITH ACUTE ACCENT
[â]  226  14/02  342  E2  SMALL LETTER a WITH CIRCUMFLEX ACCENT
[ã]  227  14/03  343  E3  SMALL LETTER a WITH TILDE
[ä]  228  14/04  344  E4  SMALL LETTER a WITH DIAERESIS
[å]  229  14/05  345  E5  SMALL LETTER a WITH RING ABOVE
[æ]  230  14/06  346  E6  SMALL DIPHTHONG a WITH e
[ç]  231  14/07  347  E7  SMALL LETTER c WITH CEDILLA
[è]  232  14/08  350  E8  SMALL LETTER e WITH GRAVE ACCENT
[é]  233  14/09  351  E9  SMALL LETTER e WITH ACUTE ACCENT
[ê]  234  14/10  352  EA  SMALL LETTER e WITH CIRCUMFLEX ACCENT
[ë]  235  14/11  353  EB  SMALL LETTER e WITH DIAERESIS
[ì]  236  14/12  354  EC  SMALL LETTER i WITH GRAVE ACCENT
[í]  237  14/13  355  ED  SMALL LETTER i WITH ACUTE ACCENT
[î]  238  14/14  356  EE  SMALL LETTER i WITH CIRCUMFLEX ACCENT
[ï]  239  14/15  357  EF  SMALL LETTER i WITH DIAERESIS
[ð]  240  15/00  360  F0  SMALL ICELANDIC LETTER eth
[ñ]  241  15/01  361  F1  SMALL LETTER n WITH TILDE
[ò]  242  15/02  362  F2  SMALL LETTER o WITH GRAVE ACCENT
[ó]  243  15/03  363  F3  SMALL LETTER o WITH ACUTE ACCENT
[ô]  244  15/04  364  F4  SMALL LETTER o WITH CIRCUMFLEX ACCENT
[õ]  245  15/05  365  F5  SMALL LETTER o WITH TILDE
[ö]  246  15/06  366  F6  SMALL LETTER o WITH DIAERESIS
[÷]  247  15/07  367  F7  DIVISION SIGN
[ø]  248  15/08  370  F8  SMALL LETTER o WITH OBLIQUE STROKE
[ù]  249  15/09  371  F9  SMALL LETTER u WITH GRAVE ACCENT
[ú]  250  15/10  372  FA  SMALL LETTER u WITH ACUTE ACCENT
[û]  251  15/11  373  FB  SMALL LETTER u WITH CIRCUMFLEX ACCENT
[ü]  252  15/12  374  FC  SMALL LETTER u WITH DIAERESIS
[ý]  253  15/13  375  FD  SMALL LETTER y WITH ACUTE ACCENT
[þ]  254  15/14  376  FE  SMALL ICELANDIC LETTER THORN
[ÿ]  255  15/15  377  FF  SMALL LETTER y WITH DIAERESIS

The following characters make up the extended set of the IBM PC or MS-DOS code page 437, often abbreviated to CP437 and also known as OEM Extended ASCII — called OEM as it was built into the original equipment, the video card ROM manufactured with the IBM PC.

To select a character from the OEM extended set, press the Alt key with the left hand, and simultaneously type the number on the numeric key pad (not the top row numbers) with the right hand. When you release the Alt key, the character will be displayed.
If there is no ANSI (single byte) equivalent to the character being displayed, and you are using Notepad to save the file, be sure to then select Unicode / UTF-8 encoding, and not ANSI. The OEM codes will then be converted by Notepad to their Unicode / UTF-8 equivalent.
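The conversion described above can be sketched in Python: decode the OEM code as code page 437, then look at the resulting Unicode code point, its UTF-8 bytes, and whether an ANSI (Windows-1252) byte exists for it. This is an illustration using Python's built-in "cp437" and "cp1252" codecs, not a description of what Notepad does internally; the function names are invented here.

```python
def oem_to_unicode(code: int):
    """Return (character, code point label, UTF-8 bytes) for a CP437 code."""
    ch = bytes([code]).decode("cp437")
    return ch, f"U+{ord(ch):04X}", ch.encode("utf-8")


def has_ansi_equivalent(ch: str) -> bool:
    """True if the character can be stored as a single Windows-1252 byte."""
    try:
        ch.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for code in (128, 158, 176):
        ch, label, utf8 = oem_to_unicode(code)
        print(code, ch, label, utf8.hex(" "), has_ansi_equivalent(ch))
```

Characters such as the box-drawing and shading symbols have no single-byte ANSI equivalent, which is why saving them with ANSI encoding loses them while UTF-8 preserves them.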
The following list includes the hex value of each character's Unicode (2-byte) equivalent:


128= Ç U+00C7 : LATIN CAPITAL LETTER C WITH CEDILLA 
129= ü U+00FC : LATIN SMALL LETTER U WITH DIAERESIS
130= é U+00E9 : LATIN SMALL LETTER E WITH ACUTE
131= â U+00E2 : LATIN SMALL LETTER A WITH CIRCUMFLEX
132= ä U+00E4 : LATIN SMALL LETTER A WITH DIAERESIS
133= à U+00E0 : LATIN SMALL LETTER A WITH GRAVE
134= å U+00E5 : LATIN SMALL LETTER A WITH RING ABOVE
135= ç U+00E7 : LATIN SMALL LETTER C WITH CEDILLA
136= ê U+00EA : LATIN SMALL LETTER E WITH CIRCUMFLEX
137= ë U+00EB : LATIN SMALL LETTER E WITH DIAERESIS
138= è U+00E8 : LATIN SMALL LETTER E WITH GRAVE
139= ï U+00EF : LATIN SMALL LETTER I WITH DIAERESIS
140= î U+00EE : LATIN SMALL LETTER I WITH CIRCUMFLEX
141= ì U+00EC : LATIN SMALL LETTER I WITH GRAVE
142= Ä U+00C4 : LATIN CAPITAL LETTER A WITH DIAERESIS
143= Å U+00C5 : LATIN CAPITAL LETTER A WITH RING ABOVE
144= É U+00C9 : LATIN CAPITAL LETTER E WITH ACUTE
145= æ U+00E6 : LATIN SMALL LETTER AE
146= Æ U+00C6 : LATIN CAPITAL LETTER AE
147= ô U+00F4 : LATIN SMALL LETTER O WITH CIRCUMFLEX
148= ö U+00F6 : LATIN SMALL LETTER O WITH DIAERESIS
149= ò U+00F2 : LATIN SMALL LETTER O WITH GRAVE
150= û U+00FB : LATIN SMALL LETTER U WITH CIRCUMFLEX
151= ù U+00F9 : LATIN SMALL LETTER U WITH GRAVE
152= ÿ U+00FF : LATIN SMALL LETTER Y WITH DIAERESIS
153= Ö U+00D6 : LATIN CAPITAL LETTER O WITH DIAERESIS
154= Ü U+00DC : LATIN CAPITAL LETTER U WITH DIAERESIS
155= ¢ U+00A2 : CENT SIGN
156= £ U+00A3 : POUND SIGN
157= ¥ U+00A5 : YEN SIGN
158= ₧ U+20A7 : PESETA SIGN
159= ƒ U+0192 : LATIN SMALL LETTER F WITH HOOK
160= á U+00E1 : LATIN SMALL LETTER A WITH ACUTE
161= í U+00ED : LATIN SMALL LETTER I WITH ACUTE
162= ó U+00F3 : LATIN SMALL LETTER O WITH ACUTE
163= ú U+00FA : LATIN SMALL LETTER U WITH ACUTE
164= ñ U+00F1 : LATIN SMALL LETTER N WITH TILDE
165= Ñ U+00D1 : LATIN CAPITAL LETTER N WITH TILDE
166= ª U+00AA : FEMININE ORDINAL INDICATOR
167= º U+00BA : MASCULINE ORDINAL INDICATOR
168= ¿ U+00BF : INVERTED QUESTION MARK
169= ⌐ U+2310 : REVERSED NOT SIGN
170= ¬ U+00AC : NOT SIGN
171= ½ U+00BD : VULGAR FRACTION ONE HALF
172= ¼ U+00BC : VULGAR FRACTION ONE QUARTER
173= ¡ U+00A1 : INVERTED EXCLAMATION MARK
174= « U+00AB : LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
175= » U+00BB : RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
176= ░ U+2591 : LIGHT SHADE
177= ▒ U+2592 : MEDIUM SHADE
178= ▓ U+2593 : DARK SHADE
179= │ U+2502 : BOX DRAWINGS LIGHT VERTICAL
180= ┤ U+2524 : BOX DRAWINGS LIGHT VERTICAL AND LEFT
181= ╡ U+2561 : BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
182= ╢ U+2562 : BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
183= ╖ U+2556 : BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
184= ╕ U+2555 : BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
185= ╣ U+2563 : BOX DRAWINGS DOUBLE VERTICAL AND LEFT
186= ║ U+2551 : BOX DRAWINGS DOUBLE VERTICAL
187= ╗ U+2557 : BOX DRAWINGS DOUBLE DOWN AND LEFT
188= ╝ U+255D : BOX DRAWINGS DOUBLE UP AND LEFT
189= ╜ U+255C : BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
190= ╛ U+255B : BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
191= ┐ U+2510 : BOX DRAWINGS LIGHT DOWN AND LEFT
192= └ U+2514 : BOX DRAWINGS LIGHT UP AND RIGHT
193= ┴ U+2534 : BOX DRAWINGS LIGHT UP AND HORIZONTAL
194= ┬ U+252C : BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
195= ├ U+251C : BOX DRAWINGS LIGHT VERTICAL AND RIGHT
196= ─ U+2500 : BOX DRAWINGS LIGHT HORIZONTAL
197= ┼ U+253C : BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
198= ╞ U+255E : BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
199= ╟ U+255F : BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
200= ╚ U+255A : BOX DRAWINGS DOUBLE UP AND RIGHT
201= ╔ U+2554 : BOX DRAWINGS DOUBLE DOWN AND RIGHT
202= ╩ U+2569 : BOX DRAWINGS DOUBLE UP AND HORIZONTAL
203= ╦ U+2566 : BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
204= ╠ U+2560 : BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
205= ═ U+2550 : BOX DRAWINGS DOUBLE HORIZONTAL
206= ╬ U+256C : BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
207= ╧ U+2567 : BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
208= ╨ U+2568 : BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
209= ╤ U+2564 : BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
210= ╥ U+2565 : BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
211= ╙ U+2559 : BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
212= ╘ U+2558 : BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
213= ╒ U+2552 : BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
214= ╓ U+2553 : BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
215= ╫ U+256B : BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
216= ╪ U+256A : BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
217= ┘ U+2518 : BOX DRAWINGS LIGHT UP AND LEFT
218= ┌ U+250C : BOX DRAWINGS LIGHT DOWN AND RIGHT
219= █ U+2588 : FULL BLOCK
220= ▄ U+2584 : LOWER HALF BLOCK
221= ▌ U+258C : LEFT HALF BLOCK
222= ▐ U+2590 : RIGHT HALF BLOCK
223= ▀ U+2580 : UPPER HALF BLOCK
224= α U+03B1 : GREEK SMALL LETTER ALPHA
225= ß U+00DF : LATIN SMALL LETTER SHARP S
226= Γ U+0393 : GREEK CAPITAL LETTER GAMMA
227= π U+03C0 : GREEK SMALL LETTER PI
228= Σ U+03A3 : GREEK CAPITAL LETTER SIGMA
229= σ U+03C3 : GREEK SMALL LETTER SIGMA
230= µ U+00B5 : MICRO SIGN
231= τ U+03C4 : GREEK SMALL LETTER TAU
232= Φ U+03A6 : GREEK CAPITAL LETTER PHI
233= Θ U+0398 : GREEK CAPITAL LETTER THETA
234= Ω U+03A9 : GREEK CAPITAL LETTER OMEGA
235= δ U+03B4 : GREEK SMALL LETTER DELTA
236= ∞ U+221E : INFINITY
237= φ U+03C6 : GREEK SMALL LETTER PHI
238= ε U+03B5 : GREEK SMALL LETTER EPSILON
239= ∩ U+2229 : INTERSECTION
240= ≡ U+2261 : IDENTICAL TO
241= ± U+00B1 : PLUS-MINUS SIGN
242= ≥ U+2265 : GREATER-THAN OR EQUAL TO
243= ≤ U+2264 : LESS-THAN OR EQUAL TO
244= ⌠ U+2320 : TOP HALF INTEGRAL
245= ⌡ U+2321 : BOTTOM HALF INTEGRAL
246= ÷ U+00F7 : DIVISION SIGN
247= ≈ U+2248 : ALMOST EQUAL TO
248= ° U+00B0 : DEGREE SIGN
249= ∙ U+2219 : BULLET OPERATOR
250= · U+00B7 : MIDDLE DOT
251= √ U+221A : SQUARE ROOT
252= ⁿ U+207F : SUPERSCRIPT LATIN SMALL LETTER N
253= ² U+00B2 : SUPERSCRIPT TWO
254= ■ U+25A0 : BLACK SQUARE
255=   U+00A0 : NO-BREAK SPACE
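A list in the format shown above can be regenerated, or checked, with a few lines of Python. This is only a sketch: it relies on Python's built-in `cp437` codec and the standard `unicodedata` module, and the function name `cp437_row` is an invented helper, not part of any library.

```python
import unicodedata


def cp437_row(code):
    """Format one code page 437 code (128-255) in the style of the list above."""
    ch = bytes([code]).decode("cp437")   # OEM byte -> Unicode character
    name = unicodedata.name(ch)          # official Unicode character name
    return f"{code}= {ch} U+{ord(ch):04X} : {name}"


# Build all 128 rows for the upper half of the code page.
rows = [cp437_row(code) for code in range(128, 256)]

for row in rows:
    print(row)
```

Comparing the printed rows against the list is a quick way to spot a mistyped code point or name.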

 

In place of Code Page 437, many Western European and English-speaking countries, including Australia, use OEM Code Page 850, also known as DOS Latin 1.
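To see where the two code pages differ, the same byte can be decoded under each one. The sketch below assumes Python's built-in `cp437` and `cp850` codecs; `differing_codes` is a hypothetical helper name chosen for this example.

```python
def differing_codes(page_a="cp437", page_b="cp850"):
    """Return (code, char_in_a, char_in_b) for every code 128-255 that differs."""
    diffs = []
    for code in range(128, 256):
        raw = bytes([code])
        ch_a = raw.decode(page_a)
        ch_b = raw.decode(page_b)
        if ch_a != ch_b:
            diffs.append((code, ch_a, ch_b))
    return diffs


diffs = differing_codes()
for code, ch_437, ch_850 in diffs:
    print(f"{code}: CP437 {ch_437}  CP850 {ch_850}")
```

For example, code 155 is the cent sign in Code Page 437 but a small letter o with stroke in Code Page 850, and code 181 changes from a box-drawing character to a capital A with acute, while code 201 is the same double-line corner in both.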


Microsoft Notepad

Click here for the Wikipedia article on Notepad.

Notepad's New Rules since 2019

Notepad's New UTF-8 Rules since May 2019

** End of article
