Quotation mark glyphs

From Wikipedia, the free encyclopedia

Different typefaces, character encodings and computer languages use various encodings and glyphs for quotation marks. This article lists some of these glyphs along with their Unicode code points and HTML entities. The Unicode standard defines two general character categories for paired quotation marks, “Pi” (initial quote punctuation) and “Pf” (final quote punctuation). Not every quotation mark falls into them: the low-9 marks (U+201A, U+201E) are classed as “Ps” (open punctuation), and the ASCII marks U+0022 and U+0027 as “Po” (other punctuation).
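The general category assigned to each mark can be checked directly. The following is a minimal sketch using Python's standard unicodedata module; the helper name quote_categories and the choice of sample characters are illustrative assumptions, not part of any standard.

```python
import unicodedata

# A handful of quotation marks discussed in this article (illustrative selection).
QUOTES = ['"', "'", "\u2018", "\u2019", "\u201a",
          "\u201c", "\u201d", "\u201e", "\u00ab", "\u00bb"]

def quote_categories(chars):
    """Map each character's code point label to its Unicode general category."""
    return {f"U+{ord(c):04X}": unicodedata.category(c) for c in chars}

if __name__ == "__main__":
    for code_point, category in quote_categories(QUOTES).items():
        print(code_point, category)
```

Running it lists a two-letter category such as Pi, Pf, Ps or Po next to each code point.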

“Ambidextrous” quotation marks were introduced on typewriters to reduce the number of keys on the keyboard, and were inherited by computer keyboards and character sets. However, modern word processors have started to convert text to use curved quotes (see below). Some computer systems designed in the past had character sets with proper opening and closing quotes, with a few systems even making a distinction between apostrophes indicating omission (e.g. couldn’t) and apostrophes indicating possession (e.g. Dave’s car)[citation needed]. However, the ASCII character set, which has been used on a wide variety of computers since the 1960s, only made three quotation marks available: ", ', and the dubious backquote ` (also referred to as backtick or letterless grave accent). The Unicode standard includes typographic and a variety of international quotation marks.

Sample  Unicode (decimal)  HTML and XML                      Description
'O'     U+0027 (39)        &apos; in XML, but usually &#39;  Apostrophe (single quote)
"O"     U+0022 (34)        &quot;, but usually &#34;         Straight quotation mark (double quote)

&apos; is not part of the HTML 4 specification.

Many systems, like the personal computers of the 1980s and early ’90s, actually drew straight quotes like curved closing quotes on-screen and in printouts, so text would appear like this (approximately):

”Good morning, Dave,” said HAL.
’Good morning, Dave,’ said HAL.

The grave accent (`, U+0060) could then be used to supply single quote marks. The typesetting application TeX still uses this convention for input files. This use resulted in fonts with an open quote glyph at the grave accent position. This gives a proper appearance at the cost of semantic correctness. Nothing similar was available for the double-quote, so many people resorted to using sets of two single quotes for punctuation, which would look like the following:

``Good morning, Dave,'' said HAL. ⇒ ‘‘Good morning, Dave,’’ said HAL.
`Good morning, Dave,' said HAL. ⇒ ‘Good morning, Dave,’ said HAL.
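The mapping shown above can be sketched as a simple string substitution. This is only an illustration of the input convention, not how TeX itself processes text; the function name tex_quotes_to_unicode is hypothetical, and every remaining straight apostrophe is assumed to be a closing single quote.

```python
def tex_quotes_to_unicode(text: str) -> str:
    """Convert grave-accent/apostrophe quoting to Unicode curved quotes.

    Doubled marks are replaced first so that `` and '' are not
    consumed as two single quotes.
    """
    return (text.replace("``", "\u201c")   # left double quotation mark
                .replace("''", "\u201d")   # right double quotation mark
                .replace("`", "\u2018")    # left single quotation mark
                .replace("'", "\u2019"))   # right single quotation mark / apostrophe

if __name__ == "__main__":
    print(tex_quotes_to_unicode("``Good morning, Dave,'' said HAL."))
    print(tex_quotes_to_unicode("`Good morning, Dave,' said HAL."))
```

Note that this sketch yields true double quotes for the doubled input, whereas a font that merely draws the grave accent as an open quote would show two single quotes side by side.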

However, the appearance of these characters has varied greatly from font to font. On systems that draw straight quotes and grave accents as distinct shapes, the result looks poor. Unicode specifies that the glyphs for U+0027 and U+0022 should be vertical rather than angled, so if such tricks are used with a font that follows the rules, as most do today, the result looks rather messy. Unicode also provides dedicated code points for proper curved quotes:

“Good morning, Dave,” said HAL.
‘Good morning, Dave,’ said HAL.

English curved quotes, also called “book quotes” or “curly quotes”, resemble small figures six and nine raised above the baseline (like 6...9 and 66...99), but solid, i.e., with the counters filled in. In many typefaces, the shapes are the same as those of an inverted (upside-down) comma and a normal comma. They are preferred in formal writing and printed typography. In e-mail and on Usenet they can be used only by declaring a MIME character set outside the ISO-8859 series, such as a Unicode encoding or one of the Windows-125x series. In most cases (the exceptions being when UTF-7 is used or when the 8BITMIME extension is present) this also requires a content-transfer encoding. While this is not a problem for modern mail clients, using smart quotes in this way slightly increases the size of the message and makes the raw source harder to read. For these reasons, some consider it bad practice (in much the same way that some consider HTML e-mail a bad thing). A few mail clients send curved quotes using the windows-1252 codes but label the text as ISO-8859-1, which causes problems for decoders that do not make the dubious assumption that C1 control codes in ISO-8859-1 text were meant to be windows-1252 printable characters.
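The mislabelling problem can be reproduced in a few lines. The sketch below uses Python's built-in cp1252 and latin-1 codecs; the sample bytes and the helper name decode_both are assumptions made for illustration.

```python
import unicodedata

# Bytes as a windows-1252 mail client might send them: 0x93 and 0x94 are
# the curved double quotes in windows-1252, but C1 controls in ISO-8859-1.
RAW = b"\x93Good morning, Dave,\x94 said HAL."

def decode_both(raw: bytes):
    """Decode the same bytes under both labels and return the two results."""
    return raw.decode("cp1252"), raw.decode("latin-1")

if __name__ == "__main__":
    as_cp1252, as_latin1 = decode_both(RAW)
    print(repr(as_cp1252))                        # curved quotes
    print(repr(as_latin1))                        # control characters instead
    print(unicodedata.category(as_latin1[0]))     # general category of the first char
```

A decoder that honours the ISO-8859-1 label ends up with invisible control characters where the sender intended quotation marks.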

Curved and straight quotes are also sometimes referred to as “smart quotes” and "dumb quotes" respectively; these names are in reference to the name of a function (found in word processors like Microsoft Word) that automatically converts straight quotes typed by the user into curved quotes. This function was developed for systems which lack separate open- and close-quote keyboard keys, such as Microsoft Windows. (In contrast, Apple Macintosh users can type open and close single and double quotes directly using the Option and [ ] { } keys.) A quote followed by a letter generally converts to an "open quote", whereas a quote with a letter or period (full stop) preceding it and a space after it converts to a "close quote". This function is usually referred to as "educating quotes".
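The conversion rule described above can be approximated as follows. This is a deliberately simplified sketch, not the algorithm used by any particular word processor; the function name educate_quotes and the exact opening-context rule (start of text, whitespace, an opening bracket, or another opening quote) are assumptions.

```python
OPEN = {'"': "\u201c", "'": "\u2018"}
CLOSE = {'"': "\u201d", "'": "\u2019"}
OPENING_CONTEXT = "([{\u201c\u2018"  # characters after which a quote opens

def educate_quotes(text: str) -> str:
    """Replace straight quotes with curved ones using a one-character lookbehind."""
    out = []
    for ch in text:
        if ch in OPEN:
            prev = out[-1] if out else ""
            opening = prev == "" or prev.isspace() or prev in OPENING_CONTEXT
            out.append(OPEN[ch] if opening else CLOSE[ch])
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    print(educate_quotes('"Good morning, Dave," said HAL.'))
    print(educate_quotes("Dave couldn't open the pod bay doors."))
```

A heuristic this simple gets some cases wrong: a leading apostrophe of omission, as in '90s, is turned into an opening quote rather than an apostrophe.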

Samples  Unicode (decimal)             HTML             Description
‘O’      U+2018 (8216), U+2019 (8217)  &lsquo; &rsquo;  Single quotes (left and right)
“O”      U+201C (8220), U+201D (8221)  &ldquo; &rdquo;  Double quotes (left and right)

Variants of the single and double quotes above are:

‛ – U+201B (HTML: &#8219;) – single high-reversed-9, or single reversed comma, quotation mark (This is sometimes used to show dropped sounds at the end of words, such as goin‛ instead of using goin‘, goin’, goin`, or goin')
‟ – U+201F (HTML: &#8223;) – double high-reversed-9, or double reversed comma, quotation mark

Supporting curved quotes has been a problem in information technology, primarily because the widely used ASCII character set did not include a representation for them (as discussed above).

Word processors have traditionally offered curved quotes to users, because in printed documents curved quotes are preferred to straight ones. Before Unicode was widely accepted and supported, this meant representing the curved quotes in whatever 8-bit encoding the software and underlying operating system were using — but the character sets for Windows and Macintosh used two different pairs of values for curved quotes, and ISO 8859-1 (typically the default character set for the Unices and, until recently, Linux) has no curved quotes, making cross-platform compatibility a nightmare.
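The incompatibility is easy to see by encoding the same characters under each legacy character set. The sketch below relies on Python's built-in cp1252, mac_roman and latin-1 codecs; the helper name encode_or_none is hypothetical.

```python
def encode_or_none(text: str, codec: str):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None

if __name__ == "__main__":
    curly = "\u201c\u201d\u2018\u2019"  # left/right double, left/right single
    for codec in ("cp1252", "mac_roman", "latin-1"):
        print(codec, encode_or_none(curly, codec))
```

The Windows and Macintosh codecs each produce four bytes, but different ones, while ISO 8859-1 cannot encode the characters at all.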

Compounding the problem is the “smart quotes” feature mentioned above, which some word processors (including Microsoft Word and OpenOffice.org) use by default. With this feature turned on, users may not have realised that the ASCII-compatible straight quotes they were typing on their keyboards ended up as something entirely different.

Unicode support has since become the norm for operating systems. As a result, transferring content containing curved quotes (or any other non-ASCII characters) from a word processor to another application or platform is often less troublesome, provided all steps in the process (including the clipboard, if applicable) are Unicode-aware. However, many applications still use the older character sets, or output data in them, so problems still occur.

There are other considerations for including curved quotes in the widely used markup languages HTML, XML, and SGML. If the encoding of the document supports direct representation of the characters, they can be used, but doing so can result in difficulties if the document needs to be edited by someone who is using an editor that cannot support the encoding. For example, many simple text editors only handle a few encodings or assume that the encoding of any file opened is a platform default, so the quote characters may appear as "garbage". HTML includes a set of entities for curved quotes: &lsquo; (left single), &rsquo; (right single), &sbquo; (low 9 single), &ldquo; (left double), &rdquo; (right double), and &bdquo; (low 9 double). XML does not define these by default, but specifications based on it can do so, and XHTML does. In addition, while the HTML 4, XHTML and XML specifications allow specifying numeric character references in either hexadecimal or decimal, SGML and older versions of HTML (and many old implementations) only support decimal references. Thus, to represent curly quotes in XML and SGML, it is safest to use the decimal numeric character references. That is, to represent the double curly quotes use &#8220; and &#8221;, and to represent single curly quotes use &#8216; and &#8217;. In HTML, it is also safest to use decimal numeric character references, although the named entity references (&ldquo;, etc.) can be processed by most web browsers (with Netscape 4 being a notable exception).
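The relationship between the characters, their decimal references and the HTML named entities can be sketched with Python's standard html package. The helper names to_decimal_refs and named_ref are hypothetical; codepoint2name covers the HTML 4 entity set.

```python
import html
from html.entities import codepoint2name

def to_decimal_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in text)

def named_ref(ch: str) -> str:
    """Return the HTML 4 named entity for a character, or a decimal reference if none exists."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else f"&#{ord(ch)};"

if __name__ == "__main__":
    sample = "\u201cGood morning, Dave,\u201d said HAL."
    encoded = to_decimal_refs(sample)
    print(encoded)
    print(html.unescape(encoded) == sample)   # round trip back to the characters
    print([named_ref(c) for c in "\u2018\u2019\u201a\u201c\u201d\u201e"])
```

The decimal form stays pure ASCII, which is why it survives editors and tools that do not understand the document's encoding.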

There has been some argument in recent years about the appropriateness of book quotes, since they are perceived by some as distracting. Editors who are against book quotes generally argue for ASCII-style straight quotes.

View  Unicode (decimal)  HTML      Unicode name                                Description
«     U+00AB (171)       &laquo;   LEFT-POINTING DOUBLE ANGLE QUOTATION MARK   Double angle quote (chevron, guillemet, duck-foot quote), left
»     U+00BB (187)       &raquo;   RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK  Double angle quote, right
‹     U+2039 (8249)      &lsaquo;  SINGLE LEFT-POINTING ANGLE QUOTATION MARK   Single angle quote, left
›     U+203A (8250)      &rsaquo;  SINGLE RIGHT-POINTING ANGLE QUOTATION MARK  Single angle quote, right
“     U+201C (8220)      &ldquo;   LEFT DOUBLE QUOTATION MARK                  Double curved quote, or “curly quote”, left
”     U+201D (8221)      &rdquo;   RIGHT DOUBLE QUOTATION MARK                 Double curved quote, right
„     U+201E (8222)      &bdquo;   DOUBLE LOW-9 QUOTATION MARK                 Low double curved quote, left
‘     U+2018 (8216)      &lsquo;   LEFT SINGLE QUOTATION MARK                  Single curved quote, left
’     U+2019 (8217)      &rsquo;   RIGHT SINGLE QUOTATION MARK                 Single curved quote, right
‚     U+201A (8218)      &sbquo;   SINGLE LOW-9 QUOTATION MARK                 Low single curved quote, left
"     U+0022 (34)        &quot;    QUOTATION MARK                              Typewriter (“programmer’s”) quote, ambidextrous