Step One: Lexical Analysis
CMSC 331. Some material © 1998 by Addison Wesley Longman, Inc.


Concepts
– Overview of syntax and semantics
– Step one: lexical analysis
  – Lexical scanning
  – Regular expressions
  – DFAs and FSMs
  – Lex

Lexical analysis in perspective
– A LEXICAL ANALYZER transforms a character stream into a token stream. It is also called a scanner, lexer, or linear analysis.
– In the standard pipeline, the source program feeds the lexical analyzer, which hands tokens to the parser on request ("get next token"); both consult the symbol table. This is an overview of the standard process of turning a text file into an executable program.
– The LEXICAL ANALYZER: scans the input; removes whitespace and newlines; identifies tokens; creates the symbol table and inserts tokens into it; generates errors; sends tokens to the parser.
– The PARSER: performs syntax analysis, with actions dictated by token order; updates symbol table entries; creates an abstract representation of the source; generates errors.

Where we are
– Source text: Total = price + tax;
– The lexical analyzer produces the token stream: id(Total) = id(price) + id(tax) ;
– The parser then builds the parse tree: assignment → id "=" Expr ";", where Expr → id "+" id.

Basic lexical analysis terms
– Token: a classification for a common set of strings. Examples: identifier, number, operator, open paren, etc.
– Pattern: the rules which characterize the set of strings for a token. Recall file and OS wildcards (*.java).
– Lexeme: the actual sequence of characters that matches a pattern and is classified by a token. Identifiers: x, count, name, etc. Integers: -12, 101, 0, ...
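To make these terms concrete, here is a minimal sketch (not from the slides) of the token stream a scanner might emit for Total = price + tax;. The enum values, names, and struct layout are illustrative assumptions, not part of any particular scanner.

#include <stdio.h>

/* Hypothetical token kinds and token record; a real scanner would also
   carry source positions and symbol-table references. */
enum TokenKind { TOK_ID, TOK_ASSIGN, TOK_PLUS, TOK_SEMI };

struct Token {
    enum TokenKind kind;
    const char *lexeme;   /* the matched characters */
};

int main(void)
{
    /* Token stream a scanner would emit for:  Total = price + tax; */
    struct Token stream[] = {
        { TOK_ID,     "Total" },
        { TOK_ASSIGN, "="     },
        { TOK_ID,     "price" },
        { TOK_PLUS,   "+"     },
        { TOK_ID,     "tax"   },
        { TOK_SEMI,   ";"     },
    };
    const char *names[] = { "ID", "ASSIGN", "PLUS", "SEMI" };
    for (size_t i = 0; i < sizeof stream / sizeof stream[0]; i++)
        printf("%-6s  lexeme \"%s\"\n", names[stream[i].kind], stream[i].lexeme);
    return 0;
}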

Examples of token, lexeme, and pattern
For the statement: if (price + gst – rebate <= 10.00) gift := false

  Token        Lexeme    Informal description of pattern
  if           if        the keyword if
  lparen       (         (
  identifier   price     string of letters and digits starting with a letter
  operator     +         +
  identifier   gst       string of letters and digits starting with a letter
  operator     –         –
  identifier   rebate    string of letters and digits starting with a letter
  operator     <=        less than or equal to
  constant     10.00     any numeric constant
  rparen       )         )
  identifier   gift      string of letters and digits starting with a letter
  operator     :=        assignment symbol
  identifier   false     string of letters and digits starting with a letter

Regular expressions (REs)
– Scanners are based on regular expressions that define simple patterns.
– Regular expressions are simpler and less expressive than BNF.
– Example:
  letter: a | b | c | ... | z | A | B | C | ... | Z
  digit: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  identifier: letter (letter | digit)*
– The basic operations are (1) set union, (2) concatenation, and (3) Kleene closure.
– Plus: parentheses and the ability to name patterns.
– No recursion!

Regular expressions (REs): example
identifier: letter (letter | digit)*
– Concatenation: one pattern followed by another, as in letter (letter | digit)*.
– Set union: one pattern or another, as in (letter | digit).
– Kleene closure: zero or more repetitions of a pattern, as in (letter | digit)*.

Regular expression example revisited
letter: a | b | c | ... | z | A | B | C | ... | Z
digit: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
identifier: letter (letter | digit)*
Q: why is this a regular expression?
– Because it uses only the operations of union, concatenation, and Kleene closure.
– Being able to name patterns is just syntactic sugar.
– Using parentheses to group things is also just syntactic sugar, provided we specify the precedence and associativity of the operators (i.e., |, *, and "concat").
Regular expressions are extremely useful in many applications; mastering them will serve you well.

Another common operator: +
– The + operator is commonly used to mean "one or more repetitions" of a pattern.
– For example, letter+ means one or more letters.
– We can always do without it: letter+ is equivalent to letter letter*.
– So the + operator is just syntactic sugar.
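As a side note (not from the slides), the identifier pattern letter (letter | digit)* can be tried directly with the POSIX regex library; the sample strings below are arbitrary.

#include <regex.h>
#include <stdio.h>

int main(void)
{
    regex_t re;
    /* identifier: letter (letter | digit)*, anchored so the whole string must match */
    if (regcomp(&re, "^[a-zA-Z][a-zA-Z0-9]*$", REG_EXTENDED | REG_NOSUB) != 0)
        return 1;
    const char *samples[] = { "count", "x9", "9lives", "tax_rate" };
    for (int i = 0; i < 4; i++)
        printf("%-8s %s\n", samples[i],
               regexec(&re, samples[i], 0, NULL, 0) == 0
                   ? "is an identifier" : "is not an identifier");
    regfree(&re);
    return 0;
}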

Precedence of operators
In interpreting a regular expression:
– Parentheses scope sub-expressions.
– * and + have the highest precedence.
– Concatenation comes next.
– | is lowest.
– All the operators are left associative.
Example: (A) | ((B)* (C)) is equivalent to A | B*C.
– What strings does this generate or match? Either an A, or any number of Bs followed by a C.

Epsilon
– Sometimes we'd like a token that represents nothing.
– This makes regular expression matching more complex, but it can be useful.
– We use the lower-case Greek letter epsilon, ε, for this special token.
Example:
  digit: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  sign: + | – | ε
  int: sign digit+

Notational shorthand for regular expressions
– One or more instances
  – L+ = L L*
  – L* = L+ | ε
  – Examples: digits: digit digit*, or equivalently digits: digit+
– Zero or one instance (more syntactic sugar)
  – L? = L | ε
  – Example: optional fraction: . digits | ε, or equivalently (. digits)?
– Character classes
  – [abc] = a | b | c
  – [a-z] = a | b | c | ... | z

Regular grammars and regular expressions
– They are equivalent:
  – Every regular expression can be expressed by a regular grammar.
  – Every regular grammar can be expressed by a regular expression.
– Example: an identifier must begin with a letter and can be followed by an arbitrary number of letters and digits.
  Regular expression:
    ID: LETTER (LETTER | DIGIT)*
  Regular grammar:
    ID → LETTER ID_REST
    ID_REST → LETTER ID_REST
    ID_REST → DIGIT ID_REST
    ID_REST → EMPTY

Formal definition of tokens
– A set of tokens is a set of strings over an alphabet, e.g. {read, write, +, -, *, /, :=, 1, 2, ..., 10, ..., 3.45e-3, ...}.
– A set of tokens is a regular set that can be defined by a regular expression.
– For every regular set there is a finite automaton (FA) that can recognize it.
  – Also known as a deterministic finite state machine (FSM).
  – i.e., it can determine whether a string belongs to the set or not.
  – Scanners extract tokens from source code in the same way FAs determine membership.

FSM = FA
– Finite state machine and finite automaton are different names for the same concept.
– The basic concept is important and useful in almost every aspect of computer science.
– It provides an abstract way to describe a process that:
  – Has a finite set of states it can be in.
  – Gets a sequence of inputs.
  – Each input causes the process to go from its current state to a new state (which might be the same!).
  – If, after the input ends, the process is in one of a set of accepting states, the input is accepted by the FA.
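A small sketch (not from the slides) of how the regular grammar above maps onto code: each nonterminal becomes a function, and the EMPTY production corresponds to accepting the end of the string. Function names are illustrative.

#include <ctype.h>
#include <stdio.h>

/* ID_REST -> LETTER ID_REST | DIGIT ID_REST | EMPTY */
static int id_rest(const char *s)
{
    if (*s == '\0')
        return 1;                              /* EMPTY */
    if (isalpha((unsigned char)*s) || isdigit((unsigned char)*s))
        return id_rest(s + 1);                 /* LETTER ID_REST | DIGIT ID_REST */
    return 0;
}

/* ID -> LETTER ID_REST */
static int is_identifier(const char *s)
{
    return isalpha((unsigned char)*s) && id_rest(s + 1);
}

int main(void)
{
    const char *tests[] = { "price", "a1b2", "2fast", "" };
    for (int i = 0; i < 4; i++)
        printf("\"%s\" -> %s\n", tests[i], is_identifier(tests[i]) ? "ID" : "not an ID");
    return 0;
}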

Example
This example shows a FA that determines whether a binary number has an odd or even number of 0's, where S1 is an accepting state. (Transition diagram omitted.)

Deterministic finite automaton (DFA)
– In a DFA there is only one choice for a given input in every state.
– There are no states with two arcs that match the same input and lead to different states.
– Is this a DFA?
– If there is an input symbol that matches no arc for the current state, the input is not accepted.
– This FA accepts only binary numbers that are multiples of three; S0 is both the start state and an accepting state. (Transition diagram omitted.)

REs can be represented as DFAs
Regular expression for a simple identifier:
  letter: a | b | c | ... | z | A | B | C | ... | Z
  digit: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  identifier: letter (letter | digit)*
(Transition diagram omitted: a "letter" arc from the start state into an accepting state, which loops on "letter" and "digit".)
– An incoming arrow identifies the single start state.
– A * marks a possible final (accepting) state; an alternate notation uses a double circle for an accepting state.
– State transitions are enabled by input: arcs represent transitions and are labeled with the required input.
– Is this a DFA?

Token definition example
Numeric literals in Pascal, e.g. 1, 123, 3.1415, 10e-3, 3.14e4.
Definition of the token unsignedNum:
  DIG: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  unsignedInt: DIG DIG*
  unsignedNum: unsignedInt (( . unsignedInt) | ε) ((e (+ | – | ε) unsignedInt) | ε)
Note:
– Recursion is restricted to the leftmost or rightmost position on the LHS.
– Parentheses are used to avoid ambiguity.
– It's always possible to rewrite by removing epsilons (ε).
(Transition diagram omitted; accepting states are marked with a *.)
– FAs with epsilon transitions are nondeterministic (NFAs).
– NFAs are harder to implement; they require backtracking.
– Every NFA can be rewritten as an equivalent DFA (which may get larger, though).
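As an illustration (not from the slides), the multiple-of-three FA can be simulated directly: the current state is the value of the bits read so far, modulo 3, and S0 (remainder 0) is both the start state and the accepting state.

#include <stdio.h>

/* Simulate the DFA: state = (value of the bits read so far) mod 3. */
static int multiple_of_three(const char *bits)
{
    int state = 0;                                    /* start in S0 */
    for (; *bits; bits++) {
        if (*bits != '0' && *bits != '1')
            return 0;                                 /* no matching arc: reject */
        state = (state * 2 + (*bits - '0')) % 3;      /* follow the labeled arc */
    }
    return state == 0;                                /* accept iff we end in S0 */
}

int main(void)
{
    const char *tests[] = { "0", "11", "110", "111", "1001" };
    for (int i = 0; i < 5; i++)
        printf("%s -> %s\n", tests[i], multiple_of_three(tests[i]) ? "accept" : "reject");
    return 0;
}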

Simple problem
Write a C program which reads in a character string, consisting of a's and b's, one character at a time. If the string contains a double aa, print "string accepted"; else print "string rejected".
An abstract solution can be expressed as a DFA:
– State 1 is the start state: on a it goes to state 2, on b it stays in state 1.
– State 2 goes to state 3 on a and back to state 1 on b.
– State 3 is the accepting state: it stays in state 3 on a or b.
The state transitions of a DFA can be encoded as a table which specifies the new state for a given current state and input:

            current state
  input      1    2    3
    a        2    3    3
    b        1    1    3

One approach in C

#include <stdio.h>

int main(void)
{
    enum State {S1, S2, S3};
    enum State currentState = S1;
    int c = getchar();
    while (c != EOF) {
        switch (currentState) {
        case S1:
            if (c == 'a') currentState = S2;
            if (c == 'b') currentState = S1;
            break;
        case S2:
            if (c == 'a') currentState = S3;
            if (c == 'b') currentState = S1;
            break;
        case S3:
            break;
        }
        c = getchar();
    }
    if (currentState == S3)
        printf("string accepted\n");
    else
        printf("string rejected\n");
    return 0;
}

Using a table simplifies the program

#include <stdio.h>

int main(void)
{
    enum State {S1, S2, S3};
    enum Label {A, B};
    enum State currentState = S1;
    /* table[state][input]: next state for each (current state, input) pair */
    enum State table[3][2] = {{S2, S1}, {S3, S1}, {S3, S3}};
    int label;
    int c = getchar();
    while (c != EOF) {
        if (c == 'a') label = A;
        if (c == 'b') label = B;
        currentState = table[currentState][label];
        c = getchar();
    }
    if (currentState == S3)
        printf("string accepted\n");
    else
        printf("string rejected\n");
    return 0;
}

Scanner generators
– E.g. lex and flex.
– These programs take a table (regular expressions with actions) as their input and return a program, i.e. a scanner, that can extract tokens from a stream of characters.
– A very useful programming utility, especially when coupled with a parser generator (e.g., yacc).
– Standard in Unix.

Lex
– A lexical analyzer generator: it writes a lexical analyzer.
– Assumption: each token matches a regular expression.
– Needs: a set of regular expressions and, for each expression, an action.
– Produces: a C program.
– Automatically handles many tricky problems.
– flex is the GNU version of the venerable Unix tool lex, and produces highly optimized code.

Lex example
Build the scanner from its specification foo.l, inspect the input, and run it:

  % flex -ofoolex.c foo.l
  % cc -ofoolex foolex.c -lfl
  % more input
  begin
  if size + 10 then size * -3.1415
  end
  % foolex < input
  Keyword: begin
  Keyword: if
  Identifier: size
  Operator: +
  Integer: 10 (10)
  Keyword: then
  Identifier: size
  Operator: *
  Operator: -
  Float: 3.1415 (3.1415)
  Keyword: end
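Either hand-written C version can be exercised from the shell (the file name aa.c is an assumption; printf is used instead of echo so no trailing newline reaches the scanner):

  $ cc -o aa aa.c
  $ printf abab | ./aa
  string rejected
  $ printf baab | ./aa
  string accepted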

Examples
The examples that follow can be accessed on gl; see /afs/umbc.edu/users/p/a/park/pub/331/lex

  % ls -l /afs/umbc.edu/users/p/a/park/pub/331/lex
  total 8
  drwxr-xr-x 2 park faculty 2048 Sep 27 13:31 aa
  drwxr-xr-x 2 park faculty 2048 Sep 27 13:32 defs
  drwxr-xr-x 2 park faculty 2048 Sep 27 11:35 footranscanner
  drwxr-xr-x 2 park faculty 2048 Sep 27 11:34 simplescanner

A Lex program
A lex specification has three sections, separated by %%:

  definitions
  %%
  rules
  %%
  subroutines

Simplest example
No definitions, one rule, and a minimal wrapper; it simply echoes its input.

  %%
  .|\n    ECHO;
  %%
  main() { yylex(); }

Strings containing aa

  %%
  (a|b)*aa(a|b)*   {printf("Accept %s\n", yytext);}
  [ab]+            {printf("Reject %s\n", yytext);}
  .|\n             ECHO;
  %%
  main() {yylex();}

Rules
– Each rule has a pattern and an action.
– Patterns are regular expressions.
– Only one action is performed:
  – The action corresponding to the pattern matched is performed.
  – If several patterns match the input, the one matching the longest sequence is chosen.
  – Among rules whose patterns match the same number of characters, the rule given first is preferred.

Definitions
– The definitions block allows you to name an RE.
– If the name appears in curly braces in a rule, the RE is substituted.

  DIG   [0-9]
  ID    [a-z][a-z0-9]*
  %%
  {DIG}+             printf("Integer\n");
  {DIG}+"."{DIG}*    printf("Float\n");
  {ID}               printf("Identifier\n");
  [ \t\n]            /* skip whitespace */
  .                  printf("Huh?\n");
  %%
  main(){yylex();}

A second version also prints the matched text via yytext:

  DIG   [0-9]
  %%
  {DIG}+             printf("int: %s\n", yytext);
  {DIG}+"."{DIG}*    printf("float: %s\n", yytext);
  .|\n               /* skip anything else */
  %%
  main(){yylex();}
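For instance, the "strings containing aa" specification can be built and run like this (the file name aa.l is an assumption; flex writes its scanner to lex.yy.c by default, and -lfl supplies the default yywrap):

  $ flex aa.l
  $ cc lex.yy.c -o aa -lfl
  $ printf baab | ./aa
  Accept baab
  $ printf abab | ./aa
  Reject abab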

  /* scanner for a toy Pascal-like language */
  %{
  #include <math.h>     /* needed for the call to atof() */
  #include <stdlib.h>   /* atoi(), atof() */
  %}
  DIG   [0-9]
  ID    [a-z][a-z0-9]*
  %%
  {DIG}+            printf("Integer: %s (%d)\n", yytext, atoi(yytext));
  {DIG}+"."{DIG}*   printf("Float: %s (%g)\n", yytext, atof(yytext));
  if|then|begin|end printf("Keyword: %s\n", yytext);
  {ID}              printf("Identifier: %s\n", yytext);
  "+"|"-"|"*"|"/"   printf("Operator: %s\n", yytext);
  "{"[^}\n]*"}"     /* skip one-line comments */
  [ \t\n]           /* skip whitespace */
  .                 printf("Unrecognized: %s\n", yytext);
  %%
  main(){yylex();}

Flex's RE syntax

  x            the character 'x'
  .            any character except newline
  [xyz]        character class; matches an 'x', a 'y', or a 'z'
  [abj-oZ]     character class with a range; matches 'a', 'b', any letter from 'j' through 'o', or 'Z'
  [^A-Z]       negated character class, i.e., any character except those in the class (here, any character except an uppercase letter)
  [^A-Z\n]     any character EXCEPT an uppercase letter or a newline
  r*           zero or more r's, where r is any regular expression
  r+           one or more r's
  r?           zero or one r (i.e., an optional r)
  {name}       expansion of the "name" definition
  "[xy]\"foo"  the literal string [xy]"foo (note the escaped ")
  \x           if x is 'a', 'b', 'f', 'n', 'r', 't', or 'v', the ANSI-C interpretation of \x; otherwise a literal 'x' (used to escape operators)
  rs           RE r followed by RE s (concatenation)
  r|s          either an r or an s
  <<EOF>>      end of file
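A quick run (hypothetical, using the foolex binary built earlier and the specification as reconstructed above) shows longest match and the comment rule in action: 3.14 is reported as a single Float rather than an Integer, a dot, and another Integer, and {note} is silently skipped.

  $ echo "3.14 x2 {note} @" | ./foolex
  Float: 3.14 (3.14)
  Identifier: x2
  Unrecognized: @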

