Complexity Theory Formal Languages & Automata Theory

3y ago
55 Views
6 Downloads
1.24 MB
190 Pages
Last View : 28d ago
Last Download : 3m ago
Upload by : Wade Mabry
Transcription

Complexity TheoryFormal Languages &Automata TheoryCharles E. HughesCOT6410 – Spring 2021 Notes

Regular LanguagesI Hope This is Mostly ReviewRead Sipser or Aho, Motwani, andUllman if not old stuff for you

Finite-State Automata A Finite-State Automaton (FSA) has only oneform of memory, its current state. As any automaton has a predetermined finitenumber of states, this class of machines is quitelimited, but still very useful. There are two classes: Deterministic Finite-StateAutomata (DFAs) and Non-Deterministic FiniteState Automata (NFAs) We focus on DFAs for now.1/26/21UCF @ CS3

Concrete Model of FSAA (Q,Σ,δ,q0,F): Deterministic Final State Automaton (DFA)L L(A) is a finite-state (regular) language over finite alphabet SEach xi is a character in Sw x1 x2 xn is a string to be tested for membership in Lx1x2x3 Xn-1 xnq0 Blue arrow above represents read head that starts on left. q0 Q (finite state set) is initial state of machine. Only action at each step is to change state based oncharacter being read and current state. State change isdetermined by a transition function d: Q S Q. Once state is changed, read head moves right. Machine stops when head passes last input character. Machine accepts a string as a member of L if it ends up ina state from Final State set F Q.1/26/21UCF @ CS4

Deterministic Finite-StateAutomata (DFA) A deterministic finite-state automaton (DFA) A is definedby a 5-tupleA (Q,Σ,δ,q0,F), where– Q is a finite set of symbols called the states of A– Σ is a finite set of symbols called the alphabet of A– δ is a function from Q Σ into Q (δ: Q Σ Q) calledthe transition function of A– q0 Q is a unique element of Q called the start state– F is a subset of Q (F Q) called the final states (canbe empty)1/26/21UCF @ CS5

DFA Transitions Given a DFA, A (Q,Σ,δ,q0,F), we can definition the reflexivetransitive closure of δ, δ*:Q Σ* Q, by– δ*(q,l) q where l is the string of length 0 Some use rather than l as symbol for string of length zero– δ*(q,ax) δ*(δ(q,a),x), where a Σ and x Σ*– Note that this meansδ*(q,a) δ(q,a), where a Σ as a al – Also, if δ*(q,x) p and δ*(p,y) r then δ*(q,xy) rWe also define the transitive closure of δ, δ , by– δ (q,w) δ*(q,w) when w 0 or, equivalently, w Σ The function δ* describes every step of computation by theautomaton starting in some state until it runs out of characters toread1/26/21UCF @ CS6

Regular Languages and DFAs Given a DFA, A (Q,Σ,δ,q0,F), we can define thelanguage accepted by A as those strings that cause it toend up in a final state once it has consumed the entirestring Formally, the language accepted by A is– { w δ*(q0,w) F } We generally refer to this language as L(A) We define the notion of a Regular Language by sayingthat a language is Regular if and only if it is accepted(recognized) by some DFA1/26/21UCF @ CS7

State Diagram A finite-state automaton can be described by astate diagram, where– Each state is represented by a node labelled with thatstate, e.g., q– The start state has an arc entering it with no source,e.g.,q0– Each transition δ(q,a) s is represented by a directedarc from node q to node s that is labelled with theletter a, e.g., q a s– Each final state has an extra circle around its node,e.g.,f1/26/21UCF @ CS8

Sample DFAs # 1, 20A01EO1A ( {E,O}, {0,1}, d, E, {O}), where d is defined by above diagram.L(A) { w w is a binary string of odd parity }0001,10SA’C 11NC 00,11X01,10A’ ( {C,NC,X}, {00,01,10,11}, d’, C, {NC}), where d’ is defined by abovediagram.L(A’) { w w is a pair of binary strings where the bottom string is the 2’scomplement of the top one, both read least (lsb) to most significant bit (msb) }1/26/21UCF @ CS9

Sample DFA # 3A”A” ( {0,1,2,3,4}, {0,1}, d”, 0, {2,3}), where d” is defined by abovediagram. L(A”) { w w is a binary string that, read left to right (msb tolsb), when interpreted as a decimal number divided by 5, has aremainder of 2 or 3 }1/26/21 UCF CS10

Sample DFA # 4A”’LRNLERLSRWL RA”’ ( {N,E,W,S}, {R,L}, d”’, N, {N}), where d”’ is defined by above diagram.L(A”’) { w w is a set of commands passed to a sentinel that starts facingNorth and changes directions R(ight)/clockwise or L(eft)/counterclockwisebased on the corresponding input character. w must eventually lead thesentinel back to facing North }1/26/21 UCF EECS11

State Transition Table A finite-state automaton can be described by a statetransition table with Q rows and Σ columns Rows are labelled with state names and columns withinput letters The start state has some indicator, e.g., a greater thansign ( q) and each final state has some indicator, e.g.,an underscore (f) The entry in row q, column a, contains δ(q,a) In general we will use state diagrams, but transitiontables are useful in some cases (state minimization)1/26/21UCF @ CS12

Sample DFA # 4Accept %5A’’’ ( {0%5,1%5,2%5,3%5,4%5}, {0,1}, d’’’, 0, {3%5}), where d’’’ is definedby above diagram.L(A’’) { w w is a binary string of length at least 1 being read left to right(msb to lsb) that, when interpreted as a decimal number divided by 5, has aremainder of 3 }Really, this is better done as a state diagram similar to what you saw earlierbut have put this up so you can see the pattern.1/26/21UCF @ CS13

Sample DFA # 0@Aa0Aa0@A0@a0@Aa0@@# % &@A@a@0@@Aa@A0@A@a0@a@0@Aa0@Aa@A0@a0@Aa0@This checks a string to see if it’s a legal password. In our case, a legalpassword must contain at least one of each of the following: lower case letter,upper case letter, number, and special character from the following set{!@# % &}. No other characters are allowed1/26/21UCF @ CS14

FSAs and Applications A synchronous sequential circuit has– Binary input lines (input admitted at clock tick)– Binary output lines (simple case is one line) 1 accepts; 0 rejects input– Internal flip flops (memory) that define state (n flip flops 2n states)– Simple combinatorial circuits (and, or, not) that combine current stateand input to alter state– Simple combinatorial circuits (and, or, not) that use state to determineoutput Think about FSA to recognize the string PAPAPATappearing somewhere in a corpus of text, say with asubstring PAPAPAPATRICK Comments about GREP and Lexical Analysis1/26/21UCF @ CS15

Complement of Regular Sets Let A (Q,Σ,δ,q0,F) and let L L(A) thenw L(A) iff δ*(q0,w) F iff δ*(q0,w) Q-F Simply create new automatonAC (Q,Σ,δ,q0,Q-F) L(AC) { w δ*(q0,w) Q-F } { w δ*(q0,w) F } { w w L(A) } Choosing the right representation can make a very bigdifference in how easy or hard it is to prove someproperty is true1/26/21UCF @ CS16

Parallelizing DFAs Regular sets can be shown closed under many binary operationsusing the notion of parallel machine simulationLet A1 (Q1,Σ,δ1,q0,F1) and A2 (Q2,Σ,δ2,s0,F2) whereQ1 Q2 ØB (Q1 Q2,Σ,δ3, q0,s0 ,F3) whereδ3( q,s ,a) δ1(q,a), δ2(s,a) , qÎQ1, sÎQ2, aÎΣUnion is F3 F1 Q2 Q1 F2 Intersection is F3 F1 F2 – Can also do by combining union and complement Difference is F3 F1 (Q2 – F2)– Can also do by combining intersection and complement Exclusive Or is F3 F1 (Q2-F2) (Q1-F1) F21/26/21UCF @ CS17

Reversal of L If x is a string over Σ and x a1 a2 an,then xR (x reversed) an a2 a1 If L is some language, thenLR { xR x L } Trying to show if L is Regular that LR is alsoRegular, using DFAs is problematic Could change start state to final, all final to startstates and reverse all arcsthat is, if δ(q,a) p then δR(p,a) q, but then theautomaton is no longer deterministic1/26/21UCF @ CS18

Non-determinism NFA A non-deterministic finite-state automaton (NFA) A isdefined by a 5-tupleA (Q,Σ,δ,q0,F), where– Q is a finite set of symbols called the states of A– Σ is a finite set of symbols called the alphabet of A– δ is a function from Q Σe into P(Q) 2Q ;Note: Σe (Σ {l})(δ: Q Σe P(Q)) called the transition function of A;by definition q δ(q,l)– q0 Q is a unique element of Q called the start state– F is a subset of Q (F Q) called the final states1/26/21UCF @ CS19

Comments on NFAs A state/input (called a discriminant) can lead nowhere,one place or many places in an NFA; moreover, an NFAcan jump between states even without reading any inputsymbol For simplicity, we often extend the definition of δ: Q Σeto a variant that handles sets of states, where δ: P(Q) Σe is defined asδ(S,a) q S δ(q,a), where a Σeif S Ø, q S δ(q,a) Ø1/26/21UCF @ CS20

NFA Transitions Given an NFA, A (Q,Σ,δ,q0,F), we can define thereflexive transitive closure of δ, δ*: P(Q) Σ* P(Q), by– l-Closure(S) { t t δ*(S,l)}, S P(Q) – extended δ– δ*(S,l) l-Closure(S)– δ*(S,ax) δ*(l-Closure(δ(S,a)),x), where a Σ and x Σ* Note that δ*(S,ax) q S p l-Closure(δ(q,a)) δ*(p,x), where a Σ and x Σ* We also define the transitive closure of δ, δ , by– δ (S,w) δ*(S,w) when w 0 or, equivalently, w Σ The function δ* describes every “possible” step ofcomputation by the non-deterministic automaton startingin some state until it runs out of characters to read1/26/21UCF @ CS21

NFA Languages Given an NFA, A (Q,Σ,δ,q0,F), we can define thelanguage accepted by A as those strings that allow it toend up in a final state once it has consumed the entirestring – here we just mean that there is some acceptingpath Formally, the language accepted by A is– { w (δ*(l-Closure({q0}),w) F) Ø } Notice that we accept if there is any set of choices oftransitions that lead to a final state1/26/21UCF @ CS22

Finite-State Diagram A non-deterministic finite-state automaton canbe described by a finite-state diagram, except– We now can have transitions labeled with l– The same letter can appear on multiple arcs from astate q to multiple distinct destination states1/26/21UCF @ CS23

Equivalence of DFA and NFA Clearly every DFA is an NFA except thatδ(q,a) s becomes δ(q,a) {s}, so anylanguage accepted by a DFA can beaccepted by an NFA. The challenge is to show every languageaccepted by an NFA is accepted by anequivalent DFA. That is, if A is an NFA,then we can construct a DFA A’, such thatL(A’) L(A).1/26/21UCF @ CS24

Constructing DFA from NFA Let A (Q,Σ,δ,q0,F) be an arbitrary NFA Let S be an arbitrary subset of Q.– Construct the sequence seq(S) to be a sequence that containsall elements of S in lexicographical order, using angle bracketsto indicate a sequence not a set. That is, if S {q1, q3, q2} thenseq(S) q1,q2,q3 . If S Ø then seq(S) Our goal is to create a DFA, A’, whose state set containsseq(S), whenever there is some w such that S δ*(q0,w) To make our life easier, we will act as if the states of A’are ordered sets, knowing that we really are talkingabout corresponding sequences1/26/21UCF @ CS25

l-Closure Define the l-Closure of a state q as the set of states one can arriveat from q, without reading any additional input.Formally l-Closure(q) { t t δ*(q,l) }We can extend this to S P(Q) byl-Closure(S) { t t δ*(q,l), q S } { t t l-Closure(q),q C}UCF @ CS{C}D{ D, E }E{E}26

DFA from F @ CS27

Details of DFA Let A (Q,Σ,δ,q0,F) be an arbitrary NFA In an abstract sense,A’ ( P(Q) , Σ, δ’, l-Closure({q0}) , F’),where P(Q) is the power set of Q, but we really don’tneed so many states (2 Q ) and we can iterativelydetermine those needed by starting at l-Closure({q0})and keeping only states reachable from here Define δ’( S ,a) l-Closure(δ(S,a)) q S l-Closure(δ(q,a)) , where a Σ, S P(Q) F’ { S P(Q) (S F) Ø }1/26/21UCF @ CS28

Regular Languages and NFAs Showing that every DFA can be simulated by an NFAthat accepts the same language and every NFA can besimulated by a DFA that accepts the same languageproves the following A language is Regular if and only if it is accepted(recognized) by some NFA We now have two equivalent classes of recognizers forRegular Languages1/26/21UCF @ CS29

Simple Exercise:Convert from NFA to DFAlaAAaBlCaDl1/26/21UCF @ CS30

Regular ExpressionsRegular Sets

Regular Expressions Primitive:– Φ– λ– adenotes {}denotes {λ}where a is in Σ denotes {a} Closure:– If R and S are regular expressions then so are R ・ S, R S andR*, where R ・ S denotes RS { xy x is in R and y is in S } R S denotes RÈS { x x is in R or x is in S } R* denotes R* Parentheses are used as needed1/26/21UCF @ CS32

Lexical Analysis Consider distinguishing variable names from keywordslike– IF– INT– [a-zA-Z]([a-zA-Z0-9 ])*return(IFSY);return(INT);return(IDENT); Equivalent to a b z, etc. This really screams for non-determinism– With added constraints of finding longest/first match Non-deterministic automata typically have fewer states However, non-deterministic FSA (NFA) interpretation isnot as fast as deterministic1/26/21UCF @ CS33

Regular Sets Regular Languages Show every regular expression denotes alanguage recognized by a finite-stateautomaton (can do deterministic or nondeterministic) Show every Finite-State Automatarecognizes a language denoted by aregular expression1/26/21UCF @ CS34

Every Regular Set is aRegular Language Primitive:– Φ– λ– adenotes { }denotes {λ}where a is in Σ denotes {a}λaa Closure: (Assume that R’s and S’s states do not overlap)– R・S– R S– R*1/26/21start with machine for R, add l transitions fromevery final state of R’s recognizer to start state of S,making final state of S final states of new machinecreate new start state and add l transitions from newstate to start states of each of R and S, making unionof R’s and S’s final states the new final statesadd l transitions from each original final state of R back to itsstart state; keeping original start and making it only final stateUCF @ CS35

Every Regular Language is aRegular Set Using Rijk This is a challenge that can be addressed in multiple ways.but I like to start with the Rijk approach. Here’s how it works. Let A (Q,Σ,δ,q1,F) be a DFA, where Q {q1,q2, , qn} Rijk {w δ*(qi,w) qj, and no intermediate state visitedbetween qi and qj, while reading w, has index k Basis: k 0, Rij0 { a δ(qi,a) qj } sets are either Φ, λ, orelements of Σ, or λ elements of Σ, and so are regular sets Inductive hypothesis: Assume Rijm are regular sets for0 m k, 1 i,j n Inductive step: k 1, Rijk 1 (Rijk Rik 1k ・ ( Rk 1k 1k )* ・ Rk 1jk) L(A) qf F R1fn1/26/21UCF @ CS36

Convert to RE0q11q201/26/2111q30, 1UCF @ CS37

011q2q11q300, 1 R110 lR210 0R310 fR120 0R220 l 1R320 1R130 fR230 0 1R330 l 1 R111 lR211 0R311 fR121 0R221 l 1 00R321 1R131 fR231 0 1R331 l 1 R112 l 0(1 00)*0R212 (1 00)*0R312 1(1 00)*0R122 0(1 00)*R222 (1 00)*R322 1(1 00)*R132 0(1 00)*(0 1)R232 (1 00)*(0 1)R332 l 1 1(1 00)*(0 1) L R123 0(1 00)* 0(1 00)*(0 1) (1 1(1 00)*(0 1))* 1(1 00)*THIS IS GREAT WAY TO GET FORMAL PROOF1/26/21UCF @ CS38

State Ripping Concept This is like the generalized automata approach you might see inSipser and other places but with fewer arcs than text. It gets some ofits motivation from Rijk approach as well.Add a new start state and add a l–transition to existing start stateAdd a new final state qf and insert l–transitions from all existing finalstates to the new one; make the old final states non-finalExcluding start and final states, successively pick states to removeFor each state to be removed, change the arcs of every pair ofexternally entering and exiting arcs to reflect the regular expressionthat describes all strings that could result is such a double transition;be sure to account for loops in the state being removed. Also, or ( )together expressions that have the same start and end nodesWhen have just start and final, the regular expression that leadsfrom start to final denotes the associated regular set1/26/21UCF @ CS39

State Ripping Details Let B be the node to be removedLet e1 be the regular expression on the arc from some node A to somenode B (A B); e2 be the expression from B back to B (or l if there is norecursive arc); e3 be the expression on the arc from B to some other nodeC (C B but C could be A); e4 be the expression from A to Ce2Ae1Be3Ce4 Note that this just says, what if I allowed the path from A to C to includetransitions through B, then what is new regular expression? The form isexactly what we saw in Rijk.1/26/21UCF @ CS40

State Ripping Detailse2Ae1Be3Ce4 Erase the existing arcs from A to B and A to C, adding a new arc from A toC labelled with the expressione4 e1 e2* e3Note that all other arcs associated with A and C are untouched.e2ABCe4 e1 e2* e31/26/21UCF @ CS41

State Ripping Detailse2ABCe4 e1 e2* e3 Do this for all nodes that have edges to B until B has no more entering.edges; at this point remove B and any edges it has to other nodes and itselfIterate until all but the start and final nodes remain.The expression from start to final describes the regular set that is equivalentto the regular language accepted by the original automaton.Note: Your choices of the order of removal make a big difference in howhard or easy this is.1/26/21UCF @ CS42

Use Ripping; Rip q30q0l1q1q20lq30 1qf1 (0 1)1 q1q201/26/211l0q01lqfUCF @ CS43

Use Ripping; Rip q10q01 (0 1)1 q1lq20lqf1 (0 1)1 00q00q2lqf1/26/21UCF @ CS44

Use Ripping; Rip q21 (0 1)1 000q0q2lqf0 (1 (0 1)1 00)*q0qfL 0 (1 (0 1)1 00)*1/26/21UCF @ CS45

Regular Equations (Arden) Assume that R, Q and P are sets such that Pdoes not contain the string of length zero, and Ris defined by R Q RP We wish to show that R QP* This is called “Arden’s Theorem” (Google it!!)1/26/21UCF @ CS46

Show QP* is a Solution We first show that QP* is contained in R. Bydefinition, R Q RP. To see if QP* is a solution, we insert it as thevalue of R in Q RP and see if the equationbalances. R Q QP*P Q(λ P*P) Q(λ P ) QP* Hence QP* is a solution, but not necessarily theonly solution.1/26/21UCF @ CS47

Uniqueness of Solution To prove uniqueness, we show that R is contained in QP*.By definition, R Q RP Q (Q RP)P Q QP RP2 Q QP (Q RP)P2 Q QP QP2 RP3. Q(λ P P2 . Pi) RPi 1, for all i 0Choose any w in R, where w k. Then, from above,R Q(λ P P2 . Pk) RPk 1but, since P does not contain the string of length zero, w is not inRPk 1. But then w is inQ(λ P P2 . Pk) and hence w is in QP*.1/26/21UCF @ CS48

Reg. Eq. Process Let A (Q,Σ,δ,q1,F) be a DFA For each pair of states, A,B in Q, where for some input‘a’, δ(B,a) A, include the term Ba in the right-side of theequation for A, that is, A BaThis just says that any solution for A must include thesolution for B followed by an ‘a’. If A is the start state, then include λ as one of the termsas well, that is A λ This just says that any solution for A must include λsince A is the start state.1/26/21UCF @ CS49

Example We use the above to solve simultaneous regular equations.For example, we can associate regular expressions withfinite-state automata as follows Hence, For A, Q l B1; P 0A QP* (l B1)0* B10* 0* B B10*1 B0 0*1For B, Q 0*1; P B10*1 B0 B(10*1 0) and therefore B 0*1(10*1 0)* Note: This technique fails if there are self lambda transitions.1/26/21UCF @ CS50

Using Regular Equations0A11B01C0, 1A l B0B A0 C1 B1C B(0 1) C1; C B(0 1)1*B 0 B00 B(0 1)1 B1B 0 B (00 (0 1) 1 1); B 0(00 (0 1)1 1)* 0 (1 (0 1)1 00)*This is same form as with state ripping. It won’t always be so.1/26/21UCF @ CS51

Use Reg. Eq. to So

Formal Languages & Automata Theory Charles E. Hughes COT6410 –Spring 2021 Notes. Regular Languages I Hope This is Mostly Review Read Sipseror Aho, Motwani,and Ullman if not old stuff for you. Finite-State Automata A Finite-State Automaton (FSA

Related Documents:

Advanced Automata Theory is a lecture which will rst review the basics of formal languages and automata theory and then give insight into speci c topics from wider area of automata theory. In computer science, automata are an important tool for many theoretical results and various types of automata

CIT 342 Formal Languages and Automata Theory Introduction CIT 342 – Formal Languages and Automata Theory is a two (2) credit unit course of 16 units. The course will cover the important formal languages in the Chomsky hierarchy -- the regu

Automata and Formal Languages II Tree Automata Peter Lammich SS 2015 1/161. Overview by Lecture Apr 14: Slide 3 Apr 21: Slide 2 Apr 28: Slide 4 May 5: Slide 50 . Finite tree automata: Basic theory (TATA Ch. 1) Pumping Lemma, Clo

FORMAL LANGUAGES AND AUTOMATA THEORY, H S Behera , Janmenjoy Nayak , Hadibandhu Pattnayak , Vikash Publishing, New Delhi. 3. Anand Sharma, “Theory of Automata and Formal Languages”, Laxmi Publisher. Formal language The alphabet of a formal language is the set of s

(Recognizable languages) Are certain automata . closed . under union, intersection, or complementation of formal languages? (Closure properties) How much is a type of automata expressive in terms of recognizing class of formal languages? And, their relative expressive power? (Language Hie

Automata Theory, Languages, and Computation S ECO IN O EDITION Pearson Educatic ULBI Hil Darmstadtl III 16356298 River, N.J. 07458. Table of Contents . 1.1.1 Introduction to Finite Automata 2 1.1.2 Structural Representations 4 1.1.3 Automata and Complexity 5 1.2 Introduction to Formal Proof 5 1.2.1 Deductive Proofs 6 1.2.2 Reduction to .

Minimisation of automata. Contributes to the following learning outcome: Explain and manipulate the di . concepts in automata theory and formal lang ; Understand the power and the limitations of regular lang and context-free lang ; Prove properties of languages , grammars and automata with rigorou

Four years later, the international standard ISO 14001:1996 was adopted. After eight years, the standard was updated with ISO 14001:2004 and re-viewed in 2015 [ISO 2015; Şahin 2014]. The number .