Lex programming tool

From Wikipedia, the free encyclopedia

The correct title of this article is lex programming tool. The initial letter is shown capitalized due to technical restrictions.

lex is a program that generates lexical analyzers ("scanners" or "lexers"). Lex is commonly used with the yacc parser generator. Lex, originally written by Eric Schmidt and Mike Lesk, is the standard lexical analyzer generator on Unix systems, and is included in the POSIX standard. Lex reads an input stream specifying the lexical analyzer and outputs source code implementing the lexer in the C programming language.

Though traditionally proprietary software, versions of Lex based on the original AT&T code are available as open source, as part of systems such as OpenSolaris and Plan 9 from Bell Labs. Another popular open source version of Lex is Flex, the "fast lexical analyzer".

Contents

The structure of a lex file is intentionally similar to that of a yacc file; files are divided up into three sections, separated by lines that contain only two percent signs, as follows:

Definition section
%%
Rules section
%%
C code section
  • The definition section is the place to define macros and to import header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
  • The rules section is the most important section; it associates patterns with C statements. Patterns are simply regular expressions. When the lexer sees some text in the input matching a given pattern, it executes the associated C code. This is the basis of how lex operates.
  • The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements presumably contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file and link it in at compile time.

The following is an example lex file for the flex version of lex. It recognizes strings of numbers (integers) in the input, and simply prints them out.

/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include 
%}

/* This tells flex to read only one input file */
%option noyywrap

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.       {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}

If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input:

abc123z.!&*2ghj6

the program will print:

Saw an integer: 123
Saw an integer: 2
Saw an integer: 6

Lex and Yacc (a parser generator) are commonly used together. Yacc uses a formal grammar to parse an input stream, something which Lex cannot do using simple regular expressions (Lex is limited to simple finite state automata). However, Yacc cannot read from a simple input stream - it requires a series of tokens. Lex is often used to provide Yacc with these tokens.

The make utility can be used to maintain programs that involve lex. make assumes that a file that has an extension of .l is a lex source file. It knows how such a file must be processed to create an object file.

Suppose that a list of dependencies in a makefile contains a filename x.o, and there exists a file x.l. If x.l was modified later than the file x.o (or if x.o does not exist), then make will cause lex to be run on x.l, and then cause the object file x.o to be created from the resulting lex.yy.c. The make internal macro LFLAGS can be used to specify lex options to be invoked automatically by make.

Eg: Let us assume we have a file ex.l and we want an executable EXE as end result. The usual way to go about it is to follow the following steps -:

$>lex ex.l

You get a file named lex.yy.c

$>gcc -o EXE lex.yy.c -ll (or -lfl)

This way we get the file EXE.

Using make, in the makefile we write -:

EXE : ex.o

gcc -o EXE ex.o -ll

The is needed before the command description by make. That's it! Every time just run make -:

$> make

and voila you have got the result you needed. It doesn't look like much saving here, but if there are 10 files you need to compile then this method is really useful.

Advanced Search
Included Web Search Engines


Safe Search

close

Top Matching Results

Occasionally Search.com will highlight specialized results that are based on the context of your query. Examples of specialized results include specific links to news, images, or video.

Top Matching Results may highlight information from other Search.com pages, content from the CNET Network of sites, or third party content. The listings are based purely on relevance. Search.com does not receive payment for listings in this section but our partners that provide this data may get paid for listing these products.

Sponsored Links

This section contains paid listings which have been purchased by companies that want to have their sites appear for specific search terms and related content. These listings are administered, sorted and maintained by a third party and are not endorsed by Search.com.

Search Results

Search.com sends your search query to several search engines at one time and integrates the results into one list which has been sorted by relevance using Search.com's proprietary algorithm. You can customize the list of search engines included in your metasearch from the preferences.

The search engines that are used in your metasearch may allow companies to pay to have their Web sites included within the results. To view the Paid Inclusion policy for a specific search engine, please visit their Web site. Search.com does not accept payment or share revenue with any search engine partner for listings in this section.