ASCII: Difference between revisions

From Rosetta Code

Revision as of 15:41, 28 October 2021

ASCII stands for American Standard Code for Information Interchange. It was first created in 1963 and is the basis for standardized character encodings such as Unicode that almost all computers follow today. The original ASCII standard defines 128 codes (7-bit values), each of which represents a different character, such as a letter, a digit, or a punctuation mark.

Control Codes

Control codes make up the first 32 ASCII characters (codes 0 through 31). With a few exceptions, these do not have a corresponding key on your keyboard. They are used to pass information to a program, such as where a new line begins or where a file ends. What these characters actually do depends on the program itself, but the ASCII standard intends for each code to mean the same thing regardless of which program is using it. This list is not (currently) exhaustive, but showcases a few control codes in common use today, as well as a few historic ones that are rarely used now.

  • 0: Null (NUL). This is probably one of the most important codes of all. It marks the end of a text string or of various other data fields. Without it, a typical "putS" (Print String) routine would keep reading past the end of the string and print garbage or eventually crash. Computers have no native concept of where a data range ends, so routines often rely on NUL to know when to stop reading. (Some languages store the string's length as metadata before the string itself, while others use a null terminator.)


  • 7: Bell (BEL). The computer makes a beeping sound when reading this control code.


  • 8: Backspace (BS). This will delete the character placed before the cursor.


  • 9: Horizontal Tab (HT). This is your Tab key.


  • 10: Line Feed (LF). This causes the text cursor to move down to the next line, leaving its horizontal position unchanged. The phrase "line feed" comes from typewriters, where turning the knob would feed more paper through the carriage. In C, \n is ASCII 10 on its own, and Unix-like systems use that single byte as their "new line". ASCII 13 followed by ASCII 10 (CR LF) is the two-byte line ending used by Windows and by many network protocols.


  • 13: Carriage Return (CR). This causes the text cursor to go back to the far-left side of the screen. (In the early days of ASCII, computers weren't designed for languages other than English, so this assumes you are writing left to right.) The term "carriage return" comes from typewriters, where pressing the "return" key would make the carriage (the moving assembly that held the paper) slide back to the start of the line.


  • 27: Escape (ESC). This is the Escape key.
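
The codes listed above can be checked with a short Python sketch. The helper name read_c_string is hypothetical and not part of any library; it only imitates how a C-style routine scans a buffer until it reaches NUL.

```python
# Decimal ASCII values of the control codes listed above.
NUL, BEL, BS, HT, LF, CR, ESC = 0, 7, 8, 9, 10, 13, 27


def read_c_string(buffer: bytes) -> str:
    """Return the text before the first NUL byte (hypothetical helper)."""
    end = buffer.find(bytes([NUL]))
    if end == -1:
        end = len(buffer)  # no terminator found: read to the end of the buffer
    return buffer[:end].decode("ascii")


# Python's escape sequences map to the same code values.
assert ord("\n") == LF and ord("\r") == CR and ord("\t") == HT
assert ord("\a") == BEL and ord("\b") == BS and ord("\x1b") == ESC

print(read_c_string(b"Hello\x00garbage"))  # prints: Hello
```

Everything after the NUL byte is ignored, which is the behavior a null-terminated string routine depends on.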

Numbers

The digits 0 through 9 are mapped to hexadecimal values 0x30 to 0x39 respectively. This allows for easy conversion from actual numeric data (which every computer stores internally as binary) to its ASCII equivalent. A number stored in "unpacked" Binary Coded Decimal (where each digit 0-9 gets its own byte, with the top 4 bits of the byte being zero) can be converted to ASCII simply by adding 0x30 to each byte.
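
As a rough illustration of the 0x30 offset, the following Python sketch converts between unpacked BCD and ASCII digits. The function names bcd_to_ascii and ascii_to_bcd are made up for this example.

```python
def bcd_to_ascii(digits: bytes) -> str:
    """Convert unpacked BCD (one digit 0-9 per byte) to ASCII text."""
    if any(d > 9 for d in digits):
        raise ValueError("each byte must hold a single digit 0-9")
    return bytes(d + 0x30 for d in digits).decode("ascii")


def ascii_to_bcd(text: str) -> bytes:
    """Inverse conversion: subtract 0x30 from each ASCII digit."""
    if not all("0" <= c <= "9" for c in text):
        raise ValueError("text must contain only the digits 0-9")
    return bytes(ord(c) - 0x30 for c in text)


print(bcd_to_ascii(bytes([1, 9, 6, 3])))  # prints: 1963
```

Because the low 4 bits of each ASCII digit equal the digit's value, the same conversion could also be written with a bitwise OR (d | 0x30) and a mask (code & 0x0F).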

Letters

The upper-case letter A is equal to 0x41, and the upper-case letter Z equals 0x5A. The lower-case letters are all 32 positions after the upper-case ones, which means that adding 0x20 to any upper-case letter of the alphabet will return its lower-case form. As you can see, ASCII was designed to make conversions easy. Strangely enough, this 0x20 conversion factor also applies to the brackets [ ], which become curly braces { }, and to the backslash \, which becomes the vertical bar |.
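
A minimal Python sketch of the 0x20 offset follows. The helper names to_lower and to_upper are illustrative only; real code would normally use the language's built-in case functions.

```python
def to_lower(ch: str) -> str:
    """Add 0x20 to an upper-case ASCII letter; leave anything else alone."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 0x20)
    return ch


def to_upper(ch: str) -> str:
    """Subtract 0x20 from a lower-case ASCII letter; leave anything else alone."""
    if "a" <= ch <= "z":
        return chr(ord(ch) - 0x20)
    return ch


assert to_lower("A") == "a" and to_upper("z") == "Z"

# The same offset links the brackets and backslash to the braces and bar.
assert chr(ord("[") + 0x20) == "{"
assert chr(ord("]") + 0x20) == "}"
assert chr(ord("\\") + 0x20) == "|"

print("".join(to_lower(c) for c in "ASCII"))  # prints: ascii
```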

Punctuation

Unfortunately, this is where ASCII no longer "lines up" with modern keyboards, so to speak. The symbols above the number keys (!@#$%^&*()) are a bit strange. Some of them are 16 positions apart from their "no-Shift" counterparts on the modern keyboard layout, and others are not. As the need for certain characters grew, the keyboard layout most likely became less ASCII-friendly over time.

The following ASCII values can be toggled by flipping bit 4 (the bit with value 0x10, counting from bit 0). Each pair of characters occupies the same key on a standard US keyboard:

  • 1 <-> !
  • 3 <-> #
  • 4 <-> $
  • 5 <-> %
  • , <-> < (Comma and Less Than Sign)
  • . <-> > (Period and Greater Than Sign)
  • / <-> ?

Other punctuation keys no longer follow this "bit 4 rule", although they most likely did on some keyboards in the late 20th century.
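
The "bit 4 rule" can be checked with a short Python sketch. The function name flip_bit_4 and the PAIRS list are assumptions made for this example; the pairs themselves are the ones listed above.

```python
def flip_bit_4(ch: str) -> str:
    """Toggle bit 4 (value 0x10) of a character's ASCII code."""
    return chr(ord(ch) ^ 0x10)


# Unshifted/shifted pairs from the list above (standard US keyboard).
PAIRS = [("1", "!"), ("3", "#"), ("4", "$"), ("5", "%"),
         (",", "<"), (".", ">"), ("/", "?")]

for unshifted, shifted in PAIRS:
    # XOR is its own inverse, so the toggle works in both directions.
    assert flip_bit_4(unshifted) == shifted
    assert flip_bit_4(shifted) == unshifted

# A key that breaks the rule: 2 shares a key with @ on a US keyboard,
# but flipping bit 4 of "2" gives a double quote instead.
assert flip_bit_4("2") == '"'
```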

Citations

See Also