C trigraph

From Wikipedia, the free encyclopedia

In the C family of programming languages, a trigraph is a sequence of three characters, the first two of which are both question marks, that represents a single character.

Trigraphs exist because the basic character set of C is a subset of the ASCII character set, but nine of its characters lie outside the smaller ISO 646 invariant character set. The ISO 646 character set is largely equivalent to ASCII, except that certain punctuation characters present in ASCII may be replaced by "national characters". In other words, national variants are free to reassign those code positions to additional alphabetic symbols needed in their language. This poses a problem for C programming, since the replaced punctuation characters are heavily used in C. The ANSI C committee invented trigraphs so that programs could be written using any version of the ISO 646 character set. Non-ASCII ISO 646 character sets are not much used today; trigraphs nevertheless remained in the C standard until they were removed in C23.

Trigraphs may also be useful with some EBCDIC code pages that lack characters such as { and }.

Trigraphs are rarely used outside compiler test suites. Many compilers either have an option to turn recognition of trigraphs off, or disable trigraphs by default and require an option to turn them on. Some can issue warnings when they encounter trigraphs in source files. Borland supplied a separate program, the trigraph preprocessor, to be used only when trigraph processing is desired.

Processing of trigraphs may be considered a performance burden on compilers as every character in every source file has to be checked to see if it introduces a trigraph. However, the source characters have to be examined individually anyway, so the additional overhead is small.

The C preprocessor replaces all occurrences of the following nine trigraph sequences with their single-character equivalents before any other processing, including the recognition of string literals and comments.

    Trigraph     Equivalent
    ========     ==========
      ??=            #
      ??/            \
      ??'            ^
      ??(            [
      ??)            ]
      ??!            |
      ??<            {
      ??>            }
      ??-            ~

Note that ??? is not a trigraph sequence.

Note also that the problematic characters are nevertheless required to exist within the implementation, in both the source and execution character sets.
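The replacement described by the table can be sketched as a single left-to-right pass over the source text. The following is a minimal illustration under stated assumptions, not the code of any actual compiler: the function names trigraph_equivalent and replace_trigraphs are hypothetical, and the caller is assumed to supply an output buffer at least as large as the input.

```c
#include <stddef.h>

/* Map the third character of a candidate trigraph to its replacement,
   or return 0 if that character does not complete a trigraph. */
static char trigraph_equivalent(char third)
{
    switch (third) {
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '(':  return '[';
    case ')':  return ']';
    case '!':  return '|';
    case '<':  return '{';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: copy src to dst, replacing each trigraph with its
   single-character equivalent. dst must hold at least as many bytes as
   src (terminator included), since the output never grows.
   Returns the length of the output. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t out = 0;

    while (*src != '\0') {
        char eq = 0;

        /* Two question marks only introduce a trigraph when the third
           character is one of the nine listed in the table. */
        if (src[0] == '?' && src[1] == '?')
            eq = trigraph_equivalent(src[2]);

        if (eq != 0) {
            dst[out++] = eq;
            src += 3;
        } else {
            dst[out++] = *src++;   /* ordinary character, copied as is */
        }
    }
    dst[out] = '\0';
    return out;
}
```

With this sketch, the input ??=define yields #define, the input ??? passes through unchanged, and the input ???= yields ?#, because scanning resumes one character after a failed match.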

The ??/ trigraph can be used to introduce an escaped newline for line splicing; this must be taken into account for correct and efficient handling of trigraphs within the preprocessor. It can also cause surprises, particularly within comments. For example:

 // Will the next line be executed????????????????/
 a++;

which is a single logical comment line, so that a++; is never executed, and

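/??/
* A comment *??/
/

which is a correctly formed block comment: after trigraph replacement and line splicing it reads /* A comment */. The continuation lines must begin in the first column, since any leading whitespace would end up between the / and the *.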
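The interaction between the backslash trigraph and line splicing can be sketched as follows. This is an illustration only, assuming a hypothetical function splice_lines that handles just the backslash trigraph (the other eight play no part in splicing) and a caller-supplied output buffer at least as large as the input.

```c
#include <stddef.h>

/* Hypothetical helper: copy src to dst, treating the backslash trigraph
   as a backslash and deleting every backslash that is immediately
   followed by a newline, together with that newline.
   Returns the length of the output. */
size_t splice_lines(const char *src, char *dst)
{
    size_t out = 0;

    while (*src != '\0') {
        size_t width = 0;   /* input characters that spell a backslash */

        if (src[0] == '\\')
            width = 1;
        else if (src[0] == '?' && src[1] == '?' && src[2] == '/')
            width = 3;

        if (width != 0 && src[width] == '\n') {
            src += width + 1;        /* splice: drop backslash and newline */
        } else if (width == 3) {
            dst[out++] = '\\';       /* trigraph elsewhere on the line */
            src += 3;
        } else {
            dst[out++] = *src++;     /* ordinary character */
        }
    }
    dst[out] = '\0';
    return out;
}
```

Applied to the three lines of the block comment example, with no leading whitespace, the sketch produces /* A comment */ on a single line. Applied to a // comment that ends in the backslash trigraph, it joins the following line to the comment.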

An example of a C program that uses all the defined trigraphs:

??=include <stdio.h>                         /* #          */

int main(void)
??<                                          /* {          */
        char n??(5??);                       /* [ and ]    */

        n??(4??) = '0' - (??-0 ??' 1 ??! 2); /* ~, ^ and | */
        printf("%c??/n", n??(4??));          /* ??/ = \    */
        return 0;
??>                                          /* }          */
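For comparison, the computation performed by the program above can be written with ordinary punctuation. The sketch below wraps it in a hypothetical function, example_value, so that the result can be inspected; the arithmetic in the comment assumes a two's-complement representation of int, which is what essentially all current implementations use.

```c
/* Hypothetical helper: returns the character that the example program
   stores in n[4] and then prints. */
char example_value(void)
{
    char n[5];

    /* ~0 is -1; -1 ^ 1 is -2; -2 | 2 is -2 (^ binds tighter than |);
       '0' - (-2) is the character two places after '0'. */
    n[4] = '0' - (~0 ^ 1 | 2);
    return n[4];
}
```

Under that assumption the function returns '2', so the example program prints 2 followed by a newline.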

A programmer may want to place two question marks together without having the compiler treat them as the start of a trigraph. The C grammar does not permit two consecutive ? tokens, so the only places in a C file where two question marks can appear in a row are multi-character constants, string literals, and comments. To safely place two consecutive question marks within a string literal, string concatenation "...?""?..." or the escape sequence \? can be used.
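Both ways of keeping two question marks apart in the source can be sketched as below. Besides the concatenation mentioned above, the sketch uses \?, the escape sequence that standard C provides for a question mark. The function names are hypothetical, and the code is written so that it behaves the same whether or not the compiler processes trigraphs.

```c
#include <string.h>

/* Adjacent string literals are joined only after trigraph replacement,
   so the two question marks never sit side by side in the source. */
const char *via_concatenation(void)
{
    return "?" "?=";
}

/* \? is an escape sequence for a question mark; the backslash keeps
   the two question marks apart in the source. */
const char *via_escape(void)
{
    return "?\?=";
}

/* Returns 1 if text consists of exactly two question marks followed by
   an equals sign, and 0 otherwise. */
int is_two_marks_and_equals(const char *text)
{
    return strlen(text) == 3
        && text[0] == '?' && text[1] == '?' && text[2] == '=';
}
```

Both functions return the same three-character string, whereas writing the three characters directly in one literal would produce the single character # on a compiler with trigraph processing enabled.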

In 1994 a normative amendment to the C standard, included in C99, supplied so-called digraphs as more readable alternatives to trigraphs. They are:

    Digraph     Equivalent
    =======     ==========
      <:             [
      :>             ]
      <%             {
      %>             }
      %:             #
      %:%:           ##

Unlike trigraphs, digraphs are not substituted within quoted strings, character constants, or comments.
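A short sketch of digraphs in use follows. It assumes a compiler that accepts digraphs (any implementation conforming to the 1994 amendment or later); the names ELEMENT_COUNT, sum_with_digraphs and digraph_text_length are hypothetical. The include line is left in its ordinary spelling, which is permitted because digraphs and the usual punctuators are interchangeable tokens.

```c
#include <string.h>   /* ordinary spelling, for strlen */

%:define ELEMENT_COUNT 3   /* %: is the digraph for # */

/* Hypothetical helper: sum ELEMENT_COUNT values, with brackets and
   braces spelled as digraphs. */
int sum_with_digraphs(const int values<: :>)
<%
    int total = 0;
    for (int i = 0; i < ELEMENT_COUNT; i++)
        total += values<:i:>;
    return total;
%>

/* Inside a string literal the characters < and : are not a digraph,
   so the literal below holds two characters, not one. */
size_t digraph_text_length(void)
<%
    return strlen("<:");
%>
```

Because digraphs are alternative spellings of tokens rather than character replacements, the string literal in digraph_text_length is left untouched and the function returns 2.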
