Half-width kana
From Wikipedia, the free encyclopedia
Half-width kana (半角カナ) are katakana characters displayed at half the width of the usual full-width forms. The term refers to the katakana portion of the character set specified by JIS X 0201.
Although the official name is JIS X 0201 katakana, "half-width kana" is the commonly known name, and it is the term used in this article.
History
ASCII is defined as a 7-bit character set and has room for 128 characters. However, since this standard was designed for the United States, it does not contain characters and symbols (for example, the ¥ yen currency symbol) needed for representation of Japanese.
JIS X 0201 was developed in 1969. Computers at that time simply did not have the computational power and memory necessary to process the thousands of kanji (Chinese-derived characters) that exist in written Japanese. As a simplification, kanji were therefore always written out in katakana instead.
Half-width kana were developed as "...the first Japanese characters encoded on computers because they are used for Japanese telegrams. As single-byte characters..."
To make katakana fit into the space available, some compromises were made: the diacritical marks dakuten and handakuten are encoded as separate characters that follow the base character, instead of being part of it. This led to the so-called "half-width kana". These compromises still cause problems for computer programs today, and the characters are frequently considered visually unattractive.
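Because the voiced and semi-voiced marks are separate characters, a single syllable such as ガ takes two half-width characters. As a minimal sketch, assuming the text has already been converted to Unicode (where the half-width forms have their own code points, a detail not covered in this article), Python's standard `unicodedata.normalize` with the NFKC form folds each pair back into one full-width character. The helper name `to_fullwidth` is hypothetical.

```python
import unicodedata

def to_fullwidth(text: str) -> str:
    """Fold half-width kana and their separate marks into full-width forms.

    NFKC is a general compatibility normalization, so it also rewrites
    other compatibility characters (for example full-width Latin letters).
    """
    return unicodedata.normalize("NFKC", text)

half = "\uff76\uff9e"        # half-width KA followed by a separate dakuten
full = to_fullwidth(half)

print(len(half), len(full))  # the pair collapses into a single code point
print(full == "\u30ac")      # U+30AC is full-width GA
```

Since NFKC changes more than kana, a program that must leave other characters untouched would need a narrower mapping than this sketch.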
Half-width table
| Leading 4 bits ↓ \ Trailing 4 bits → | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | | | | | | | | | | | | | | | |
| 1 | | | | | | | | | | | | | | | | |
| 2 | | | | | | | | | | | | | | | | |
| 3 | | | | | | | | | | | | | | | | |
| 4 | | | | | | | | | | | | | | | | |
| 5 | | | | | | | | | | | | | | | | |
| 6 | | | | | | | | | | | | | | | | |
| 7 | | | | | | | | | | | | | | | | |
| 8 | | | | | | | | | | | | | | | | |
| 9 | | | | | | | | | | | | | | | | |
| a | | 。 | 「 | 」 | 、 | ・ | ヲ | ァ | ィ | ゥ | ェ | ォ | ャ | ュ | ョ | ッ |
| b | ー | ア | イ | ウ | エ | オ | カ | キ | ク | ケ | コ | サ | シ | ス | セ | ソ |
| c | タ | チ | ツ | テ | ト | ナ | ニ | ヌ | ネ | ノ | ハ | ヒ | フ | ヘ | ホ | マ |
| d | ミ | ム | メ | モ | ヤ | ユ | ヨ | ラ | リ | ル | レ | ロ | ワ | ン | ゙ | ゚ |
| e | | | | | | | | | | | | | | | | |
| f | | | | | | | | | | | | | | | | |
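The table can be read as a lookup from a single byte to a character: the leading 4 bits select the row and the trailing 4 bits select the column. The following is a minimal sketch, not part of the standard itself. It assumes the katakana area runs from 0xA1 to 0xDF and that Unicode places the corresponding half-width forms contiguously from U+FF61 to U+FF9F. The function names are hypothetical.

```python
KANA_FIRST, KANA_LAST = 0xA1, 0xDF   # assumed extent of the katakana area
UNICODE_OFFSET = 0xFF61 - 0xA1       # assumption: Unicode half-width block layout

def jisx0201_kana_to_unicode(byte: int) -> str:
    """Return the Unicode half-width character for one JIS X 0201 kana byte."""
    if not KANA_FIRST <= byte <= KANA_LAST:
        raise ValueError(f"0x{byte:02x} is outside the katakana area")
    return chr(byte + UNICODE_OFFSET)

def table_cell(byte: int) -> tuple:
    """Split a byte into (row, column) labels as used in the table."""
    return format(byte >> 4, "x"), format(byte & 0x0F, "x")

# Byte 0xB1 sits in row "b", column "1" of the table.
print(table_cell(0xB1), jisx0201_kana_to_unicode(0xB1))
```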
Half-width kana on the Internet
E-mail
Since the SMTP and NNTP protocols (used to deliver e-mail and Usenet, respectively) were formerly able to transmit only 7-bit data, the convention was to use ISO-2022-JP for sending e-mail in Japanese.
ISO-2022-JP does not contain half-width kana, so they cannot properly be included in a message. If half-width kana are nevertheless included by accident, they can become garbled during transmission.
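A program can check for this before sending. The sketch below assumes that Python's built-in `iso2022_jp` codec follows the standard character repertoire and therefore rejects half-width kana. The helper name `fits_iso2022jp` is hypothetical.

```python
def fits_iso2022jp(text: str) -> bool:
    """Return True if Python's iso2022_jp codec can encode every character."""
    try:
        text.encode("iso2022_jp")
    except UnicodeEncodeError:
        return False
    return True

print(fits_iso2022jp("\u30ab\u30ca"))  # full-width katakana KA NA
print(fits_iso2022jp("\uff76\uff85"))  # half-width katakana KA NA
```

A mail client could use a check like this to convert half-width kana to full-width forms, or to pick a different encoding, before the message leaves the machine.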
This is no longer such a problem since most e-mail servers today use ESMTP, and hence 8-bit characters are acceptable. Alternatively, an encoding system such as Base64 can be used and specified in the message using MIME.
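The Base64 route can be illustrated with the standard library. This is a minimal sketch, assuming the message body is encoded as Shift JIS before wrapping. In a real message the MIME headers `Content-Type` (with a `charset` parameter) and `Content-Transfer-Encoding: base64` would tell the receiver how to undo both steps.

```python
import base64

text = "\uff76\uff85"                 # half-width katakana KA NA
raw = text.encode("shift_jis")        # one 8-bit byte per half-width kana
wrapped = base64.b64encode(raw)       # ASCII-only representation

print(raw)
print(wrapped)
print(all(b < 0x80 for b in wrapped))                   # safe on a 7-bit channel
print(base64.b64decode(wrapped).decode("shift_jis") == text)
```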
Web pages
The problems that exist in e-mail do not exist with Web pages, since HTTP accepts 8-bit characters.
A problem that does exist is that computer programs have difficulty deciding whether to treat the bytes of a page as Shift JIS, EUC-JP, or UTF-8. The character encoding should therefore be specified in an HTTP response header or a meta tag.
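The ambiguity is easy to reproduce. In the sketch below, which relies only on Python's built-in `shift_jis` and `euc_jp` codecs, the same two bytes are valid in both encodings but decode to different text, so a program cannot tell from the bytes alone which reading was intended.

```python
text = "\uff71\uff72"               # half-width katakana A, I
data = text.encode("shift_jis")     # one byte per half-width kana

as_shift_jis = data.decode("shift_jis")
as_euc_jp = data.decode("euc_jp")   # same bytes read as one two-byte character

print(data)
print(as_shift_jis == text)
print(len(as_shift_jis), len(as_euc_jp))
```

Declaring the encoding explicitly, for example with `Content-Type: text/html; charset=Shift_JIS`, removes the need to guess.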
Misunderstanding of JIS X 0201
Strictly speaking, JIS X 0201 katakana is not half-width katakana. The standard does not define character width. It defines only the code representation of katakana characters. The term "half-width" is a holdover from old devices that displayed single-byte characters at half the width of double-byte ones. In the JIS X 0201 standard itself, the katakana characters in the code chart are printed at normal width, not half-width.
However, the misunderstanding that the standard defines "half-width" characters is widespread. People who know the standard will often say "so-called half-width kana."
References
- Lunde, Ken. CJKV Information Processing. 1st ed. O'Reilly, 1999. pp. 144-145.

