Context-free Grammars (CFGs) - Knox College

2y ago
7 Views
3 Downloads
473.89 KB
38 Pages
Last View : 1m ago
Last Download : 3m ago
Upload by : Wade Mabry
Transcription

Context-free grammars (CFGs)10/8/21(Using slides adapted from the book)

A Little English An article can be the word a or the:A aA the A noun can be the word dog, cat or rat:N dogN catN ratA noun phrase is an article followed by a noun:P AN

A Little English An verb can be the word loves, hates or eats:V lovesV hatesV eatsA sentence can be a noun phrase, followed by a verb, followed byanother noun phrase:S PVP

The Little English Grammar Taken all together, a grammar G1 for a small subset ofunpunctuated English:S PVPP ANV lovesV hatesV eatsA aA theN dogN catN rat Each production says how to modify strings by substitution x y says, substring x may be replaced by y

S PVPP ANV lovesV hatesV eatsA aA theN dogN catN rat Often there is more than one place in a string where a production could beapplied For example, P loves P: P loves P AN loves P P loves P P loves AN The derivations on the previous slide chose the leftmost substitution at everystep, but that is not a requirement The language defined by a grammar is the set of lowercase strings that have atleast one derivation from the start symbol S

S PVPP ANV loves hates eatsA a theN dog cat rat Often, a grammar contains more than one production with the same left-handside Those productions can be written in a compressed form The grammar is not changed by this This example still has ten productions

Complaint letter generatorhttps://www.pakin.org/complaint/

4-Tuple Definition A grammar G is a 4-tuple G (V, Σ, S, P), where: V is an alphabet, the nonterminal alphabetΣ is another alphabet, the terminal alphabet, disjoint from VS V is the start symbolP is a finite set of productions, each of the formx y, where x and y are strings over Σ V andx ε

NFA to Grammar To show that there is a grammar for every regular language, we will show how toconvert any NFA into an equivalent grammar That is, given an NFA M, construct a grammar G with L(M) L(G) First, an example

Example: The grammar we will construct generates L(M)In fact, its derivations will mimic what M doesFor each state, our grammar will have a nonterminal symbol (S, R and T)The start state will be the grammar's start symbolThe grammar will have one production for each transition of the NFA,and one for each accepting state

Example: For each possible transition Y δ(X,z) in the NFA, our grammar has a productionX zY That gives us these four to start with:

Example: In addition, for each accepting state in the NFA, our grammar has an εproduction That adds one more:

Example: The complete grammar has one production for each transition, and one for eachaccepting state:S aSS bRR cRR TT ε

S aSS bRR cRR TT ε Compare the behavior of M as it accepts abc with thebehavior of G as it generates abc: Every time the NFA reads a symbol, the grammar generatesthat symbol In general, M can be in state Y after reading string x if andonly if G can derive the string xY

Theorem 10.4Every regular language is generated by some grammar. Proof is by construction; let M (Q, Σ, δ, S, F) be any NFA Construct G (Q, Σ, S, P) Q, Σ, and S are the same as for M P is constructed from δ and F: Wherever M has Y δ(X,z), P contains X zY And for each X F, P contains X ε Now G has X zY whenever By induction we can extend this to any string z Σ*:G has X * zY whenever And by construction, G has Y ε whenever M has Y F So for all strings z Σ*, δ*(S,z) contains at least one element of F if and only if S * z L(M) L(G)

Recall: {anbn} Is Not Regular1. Proof is by contradiction using the pumping lemma for regularlanguages. Assume that L {anbn} is regular, so the pumping lemmaholds for L. Let k be as given by the pumping lemma.2. Choose x, y, and z as follows:x aky bkz εNow xyz akbk L and y k as required.3Let u, v, and w be as given by the pumping lemma, so that uvw y, v 0, and for all i 0, xuviwz L.4Choose i 2. Since v contains at least one b and nothing but bs, uv2whas more bs than uvw. So xuv2wz has more bs than as, and so xuv2wz L.5By contradiction, L {anbn} is not regular.

Examples We've proved that these languages are not regular, yet theyhave grammars {anbn} {xxR x {a,b}*} {anbjan n 0, j 1} Although not right-linear, these grammars still follow arather restricted form

Context-Free Grammars A context-free grammar (CFG) is one in which everyproduction has a single nonterminal symbol on the left-handside A production like R y is permitted; It says that R can be replaced withy, regardless of the context of symbols around R in the string One like uRz uyz is not permitted. That would be context-sensitive: itsays that R can be replaced with y only in a specific context

Context-Free Languages A context-free language (CFL) is one that is L(G) for some CFG G Every regular language is a CFL Every regular language has a right-linear grammar Every right-linear grammar is a CFG But not every CFL is regular {anbn} {xxR x {a,b}*} {anbjan n 0, j 1}

Language Classes So Far

Writing CFGs Programming: A program is a finite, structured, mechanical thing that specifies apotentially infinite collection of runtime behaviors You have to imagine how the code you are crafting will unfold when itexecutes Writing grammars: A grammar is a finite, structured, mechanical thing that specifies apotentially infinite language You have to imagine how the productions you are crafting will unfold in thederivations of terminal strings Programming and grammar-writing use some of the same mentalmuscles Here follow some techniques and examples

Regular Languages If the language is regular, we already have a technique for constructing a CFG Start with an NFA Convert to a right-linear grammar using the construction from chapter 10

ExampleL {x {0,1}* the number of 0s in x is divisible by 3}S 1S 0T εT 1T 0UU 1U 0S

ExampleL {x {0,1}* the number of 0s in x is divisible by 3} The conversion from NFA to grammar always works But it does not always produce a pretty grammar It may be possible to design a smaller or otherwise more readable CFGmanually:S 1S 0T εT 1T 0UU 1U 0SS T0T0T0S TT 1T ε

Balanced Pairs CFLs often seem to involve balanced pairs {anbn}: every a paired with b on the other side {xxR x {a,b}*}: each symbol in x paired with its mirror image in xR {anbjan n 0, j 1}: each a on the left paired with one on the right To get matching pairs, use a recursive production of the form R xRy This generates any number of xs, each of which is matched with a y on the otherside

Examples We've seen these before: {anbn} {xxR x {a,b}*} {anbjan n 0, j 1}S aSb εS aSa bSb εS aSa RR bR b Notice that they all use the R xRy trick

Examples {anb3n} Each a on the left can be paired with three bs on the right That givesS aSbbb ε {xy x {a,b}*, y {c,d}*, and x y } Each symbol on the left (either a or b) can be paired with one on the right(either c or d) That givesS XSY εX a bY c d

Concatenations A divide-and-conquer approach is often helpful For example, L {anbncmdm} We can make grammars for {anbn} and {cmdm}:S1 aS1b εS2 cS2d ε Now every string in L consists of a string from the first followed by a stringfrom the second So combine the two grammars and add a new start symbol:S S1S2S1 aS1b εS2 cS2d ε

Concatenations, In General Sometimes a CFL L can be thought of as the concatenationof two languages L1 and L2 That is, L L1L2 {xy x L1 and y L2} Then you can write a CFG for L by combining separate CFGsfor L1 and L2 Be careful to keep the two sets of nonterminals separate, so nononterminal is used in both In particular, use two separate start symbols S1 and S2 The grammar for L consists of all the productions from thetwo sub-grammars, plus a new start symbol S with theproduction S S1S2

Unions, In General Sometimes a CFL L can be thought of as the union of twolanguages L L1 L2 Then you can write a CFG for L by combining separate CFGsfor L1 and L2 Be careful to keep the two sets of nonterminals separate, so nononterminal is used in both In particular, use two separate start symbols S1 and S2 The grammar for L consists of all the productions from thetwo sub-grammars, plus a new start symbol S with theproduction S S1 S2

ExampleL {z {a,b}* z xxR for some x, or z is odd} This can be thought of as a union: L L1 L2 L1 {xxR x {a,b}*} L2 {z {a,b}* z is odd} So a grammar for L isS1 aS1a bS1b εS2 XXS2 XX a bS S1 S2S1 aS1a bS1b εS2 XXS2 XX a b

ExampleL {anbm n m} This can be thought of as a union: L {anbm n m} {anbm n m} Each of those two parts can be thought of as aconcatenation: L1 {anbn}L2 {bi i 0}L3 {ai i 0}L L1L2 L3L1 The resulting grammar:S S1S2 S3S1S1 aS1b εS2 bS2 bS3 aS3 a

BNF John Backus and Peter Naur A way to use grammars to define the syntax of programminglanguages (Algol), 1959-1963 BNF: Backus-Naur Form A BNF grammar is a CFG, with notational changes: Nonterminals are written as words enclosed in angle brackets: exp instead of E Productions use :: instead of The empty string is empty instead of ε CFGs (due to Chomsky) came a few years earlier, but BNFwas developed independently

Example exp :: exp - exp exp * exp exp exp exp exp ( exp ) a b c This BNF generates a little language of expressions: a b (a-(b*c))

Example stmt :: exp-stmt while-stmt compound-stmt . exp-stmt :: exp ; while-stmt :: while ( exp ) stmt compound-stmt :: { stmt-list } stmt-list :: stmt stmt-list empty This BNF generates C-like statements, like while (a b) {c c * a;a a a;} This is just a toy example; the BNF grammar for a full language mayinclude hundreds of productions

Formal vs. Programming Languages A formal language is just a set of strings: DFAs, NFAs, grammars, and regular expressions define these sets in a purelysyntactic way They do not ascribe meaning to the strings Programming languages are more than that: Syntax, as with formal languages Plus semantics: what the program means, what it is supposed to do The BNF grammar specifies not only syntax, but a bit of semantics as well

Parse Trees We've treated productions as rules for building strings Now think of them as rules for building trees: Start with S at the root Add children to the nodes, always following the rules of the grammar: R x says that the symbols in x may be added as children of the nonterminalsymbol R Stop only when all the leaves are terminal symbols The result is a parse tree

Example exp :: exp - exp exp * exp exp exp exp exp ( exp ) a b c exp exp * exp exp - exp * exp a- exp * exp a-b* exp a-b*c

Context-Free Grammars A context-free grammar (CFG) is one in which every production has a single nonterminal symbol on the left-hand side A production like R yis permitted; It says that Rcan be replaced with y, regardless of the context of symbols around Rin the string One like uRz uyzis not permitted. That would be context .

Related Documents:

4 Lecture 10: NDPDAs/CFGs/PDAs 19 Closure Properties of CFLs If A and B are context free languages then: AR is a context-free language TRUE A* is a context-free language TRUE A is a context-free language (complement)? A B is a context-free language ? A B is a context-free language ? Lecture 10: NDPDAs/CFGs/PDAs 20 CFLsClosed Under Union Given two CFLs A and B is A B a

Context-Free Languages A language class larger than the class of regular languages Supports natural, recursive notation called “context- free grammar” Applications: Parsetreescompilers Context-Parse trees, compilers XML Regular (FA/RE) free (PDA/CFG) 3

Probabilistic Context‐Free Grammars Raphael Hoffmann 590AI, Winter 2009. Outline PCFGs: Inference and Learning Parsing English Discriminative CFGs Grammar Induction.

Context-Free Grammars another grammar with precedence and associativity – and – are left-associative operators in mathematics –* and / have higher precedence than and – Grammar G 1 Source: Tucker & Noonan (2007) Context-Free Grammars parse tree for 4**2**3 5 * 6 7 Source: Tucker & Noonan (2007) Context-Free Grammars

Non-Context Free Languages Portions c 2000 Rance Cleaveland c 2004 James Riely Automata Theory and Formal Grammars: Lecture 7 – p.1/48 - Non-Context Free Languages Last Time Context-free grammars and languages Closure properties of CFLs Relating regular languages and CFLs Today

formal grammars and the devices that are used to carry them out by briefly comparing Context-Free Phrase Structure Grammars with the framework object of our studying, Categorial Grammars (CG)1. Phrase Structure Grammars. In formal language theory a language is defined as a set of st

q Closure properties for CFL CFL Review of CFL. Human-aware Robotics 3 FA (DFA & NFA) express Regular Expressions (RE) Context free langauges o Context free grammars q Design CFGs q Ambiguities q Chomsky normal form o Pushdown automata q Definition of PDA q

Hardware Design Description Introduction The PCB scope is the result of a challenge I set for myself – to build a practically usable oscilloscope with a minimum amount of components and for minimum cost. The practical benefit is of course that this is an instrument that I hope will be interesting to many teachers, students and hobbyists looking for an affordable, simple tool for their .