
AUTOMATA THEORY in COMPILER DESIGN

Om Prakash Jha¹, 1st MCA, School of Computer Science and Engineering, Bharathiar University, Coimbatore. Om.prakash.mck@gmail.com

R. Balu², Ph.D. Research Scholar and Guest Lecturer, School of Computer Science and Engineering, Bharathiar University, Coimbatore. rvkbalu@yahoo.co.in

1.0 INTRODUCTION: In theoretical computer science, automata theory is the study of abstract machines and the problems they are able to solve [1]. These abstract machines are called automata. A discrete automaton is a mathematical model for a finite state machine. A finite state machine is a machine that takes a symbol as input and "jumps", or transitions, from one state to another according to a transition function, which can be expressed as a table. This transition function tells the automaton which state to go to next, given a current state and a current symbol [2]. Automata theory is closely related to formal language theory, and automata are often classified by the class of formal languages they are able to recognize. Automata play a role in compiler design and parsing.

2.0 AUTOMATA THEORY: An automaton is supposed to run on some given sequence or string of inputs in discrete time steps. At each time step, the automaton gets one input, which is picked from a set of symbols or letters called an alphabet. The input as a whole is a finite sequence of symbols, which is called a word. An automaton contains a finite set of states. At each instant of a run, the automaton is "in" one of its states. At each time step, when the automaton reads a symbol, it "jumps" or "transits" to the next state depending on its current state and the symbol read. This function over the current state and the input symbol is called the transition function. The automaton reads the input word one symbol after another, in sequence, and transits from state to state according to the transition function, until the word has been read completely.

Once the input word has been read, the automaton is said to have stopped, and the state at which it has stopped is called the final state. Depending on the final state, the automaton is said to either accept or reject the input word. A subset of the states of the automaton is defined as the set of accepting states. If the final state is an accepting state, the automaton accepts the word. Otherwise, the word is rejected. The set of all words accepted by an automaton is called the language recognized by the automaton.

2.1 FORMAL DESCRIPTION OF AUTOMATA THEORY [2]: A finite state automaton or, simply, an automaton M consists of five parts:
1: A finite set (alphabet) A of inputs.
2: A finite set S of (internal) states.
3: A subset Y of S (whose elements are called accepting or "yes" states).
4: An initial state s0 in S.
5: A next-state function F from S × A into S.
Such an automaton M is denoted by M = (A, S, Y, s0, F) when we want to indicate its five parts.
IN OTHER WORDS: A finite state automaton can also be described by a collection of functions, one for each input symbol. The next-state function F: S × A → S determines, for each a in A, a function f_a: S → S given by f_a(s) = F(s, a).
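The five-part definition can be made concrete with a short Python sketch. The toy automaton used here (alphabet {a, b}, accepting words with an even number of b's) is an illustrative assumption and does not come from the cited text; the names Automaton, f and accepts are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass
class Automaton:
    """M = (A, S, Y, s0, F), following the five-part definition."""
    A: FrozenSet[str]                # finite input alphabet
    S: FrozenSet[str]                # finite set of internal states
    Y: FrozenSet[str]                # accepting ("yes") states, a subset of S
    s0: str                          # initial state
    F: Dict[Tuple[str, str], str]    # next-state function F: S x A -> S

    def f(self, a: str) -> Callable[[str], str]:
        """Return the function f_a: S -> S with f_a(s) = F(s, a)."""
        return lambda s: self.F[(s, a)]

    def accepts(self, word: str) -> bool:
        """Apply f_a for each symbol a of the word, starting from s0."""
        state = self.s0
        for a in word:
            state = self.f(a)(state)
        return state in self.Y


# Hypothetical toy automaton: words over {a, b} with an even number of b's.
EVEN_B = Automaton(
    A=frozenset("ab"),
    S=frozenset({"even", "odd"}),
    Y=frozenset({"even"}),
    s0="even",
    F={
        ("even", "a"): "even", ("even", "b"): "odd",
        ("odd", "a"): "odd", ("odd", "b"): "even",
    },
)
```

In this sketch a symbol outside the alphabet raises a KeyError, because F is defined only on S × A.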

In this view, each input a may be regarded as causing a change in the state of the automaton M.

2.2 VARIATION IN DEFINITION OF AUTOMATA: Automata are defined in order to study useful machines under a mathematical formalism, so the definition of an automaton is open to variation according to the real-world machine that we want to model. Many variations of automata have been studied. The most standard variant, described above, is called the deterministic finite automaton. The following are some popular variations in the definition of the different components of automata.

INPUT:
FINITE INPUT: An automaton that accepts only finite sequences of symbols. The introductory definition above covers only finite words.
INFINITE INPUT: An automaton that accepts infinite words (ω-words). Such automata are called ω-automata.
TREE WORD INPUT: The input may be a tree of symbols instead of a sequence of symbols. In this case, after reading each symbol, the automaton reads all the successor symbols in the input tree. The automaton is said to make one copy of itself for each successor, and each such copy starts running on one of the successor symbols according to the transition relation of the automaton. Such an automaton is called a tree automaton.

STATES:
FINITE STATES: An automaton that contains only a finite number of states. The introductory definition above describes automata with a finite number of states.
INFINITE STATES: An automaton that may not have a finite number of states, or even a countable number of states.
STACK MEMORY: An automaton may also contain some extra memory in the form of a stack, onto which symbols can be pushed and from which they can be popped. This kind of automaton is called a pushdown automaton.

3.0 DETERMINISTIC FINITE-STATE MACHINE: In a deterministic system, every action, or cause, produces a reaction, or effect, and every reaction, in turn, becomes the cause of subsequent reactions. The totality of these cascading events can theoretically show exactly how the system will exist at any moment in time. By contrast, a non-deterministic system is one where, knowing the state of a model and the stimulus, we cannot precisely know the next state and output. At Nvidia, the term non-determinism is used for situations where the C model cannot predict the DUT's behavior.

Fig: 1 Finite State Machine
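A deterministic finite automaton that accepts the binary numbers that are multiples of 3 can be sketched in Python as a transition table. The figure's edges are not reproduced in the text, so the transitions below are an assumption: they are the standard construction in which state Sk means that the bits read so far form a number congruent to k modulo 3. The names TRANSITIONS and is_multiple_of_3 are hypothetical.

```python
# Reading bit b in state Sk moves to S((2*k + b) mod 3), because appending
# a bit to a binary number doubles its value and adds the bit.
TRANSITIONS = {
    ("S0", "0"): "S0", ("S0", "1"): "S1",
    ("S1", "0"): "S2", ("S1", "1"): "S0",
    ("S2", "0"): "S1", ("S2", "1"): "S2",
}
START = "S0"        # S0 is the start state ...
ACCEPTING = {"S0"}  # ... and also the only accept state


def is_multiple_of_3(bits: str) -> bool:
    """Run the DFA over a string of '0'/'1' characters."""
    state = START
    for bit in bits:
        state = TRANSITIONS[(state, bit)]
    return state in ACCEPTING
```

For example, is_multiple_of_3("110") is true (110 in binary is 6), while is_multiple_of_3("111") is false (7). Because the start state is accepting, the empty string is accepted as well.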

Fig. 1 shows an example of a deterministic finite automaton that accepts only binary numbers that are multiples of 3. The state S0 is both the start state and an accept state. In the theory of computation, a deterministic finite state machine, also known as a deterministic finite automaton (DFA), is a finite state machine accepting strings of symbols. The list of symbols used by a DFA is called its alphabet. For each state, there is exactly one transition arrow leading out to a next state for each symbol in the alphabet. This contrasts with a non-deterministic finite automaton (NFA), in which a state may have more than one transition for the same input symbol. Every DFA has a start state (denoted graphically by an arrow coming in from nowhere), where computation begins, and a set of accept states (denoted graphically by double circles), which help define when a computation is successful. DFAs recognize exactly the set of regular languages, which are, among other things, useful for lexical analysis and pattern matching.

4.0 FINITE AUTOMATA IN LEXICAL ANALYSIS [3]: A recognizer program, such as a lexical analyzer, reads a string as input and outputs yes if the string is a sentence in a language, and no if it is not. A lexical analyzer has to do more than say yes or no to be useful, so an extra layer is usually added around the recognizer itself. When a certain string is recognized, the second layer performs an action associated with that string. A lexical-analyzer generator takes an input file comprising regular expressions and associated actions (code). It then builds a recognizer program that executes the code in an action when the corresponding string is recognized. The generator builds the recognizer component of the analyzer by translating the regular expressions that represent the lexemes into a finite automaton or finite state machine. Fig. 2 shows the transition diagram for a state machine that recognizes the four strings he, she, his, and hers.
Fig: 2 Transition diagram for a state machine

The circles are individual states, each marked with a state number, an arbitrary number that identifies the state. State 0 is the start state, and the machine is initially in this state. The lines connecting the states represent the transitions. These lines are called edges, and the label on an edge represents the character that causes the transition from one state to another (in the direction of the arrow). From the start state, reading an h from the input causes a transition to state 1; from state 1, an e gets the machine to state 3, and an i causes a transition to state 5; and so on. A transition from state N to state M on the character c is often represented with the notation next(N, c) = M. This function is called the move function.
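The move function can be sketched in Python as a lookup table. Only next(0, h) = 1, next(1, e) = 3 and next(1, i) = 5 are taken from the text; the remaining state numbers, the accepting states and the names NEXT, move and recognize are assumptions made for illustration, since the diagram itself is not reproduced here.

```python
ERROR = -1  # implicit error state reached over any unmarked edge

# next(N, c) = M, stored as {(N, c): M}.
# State numbers other than 0, 1, 3 and 5 are assumed.
NEXT = {
    (0, "h"): 1, (0, "s"): 2,
    (1, "e"): 3, (1, "i"): 5,
    (2, "h"): 4,
    (3, "r"): 6,
    (4, "e"): 7,
    (5, "s"): 8,
    (6, "s"): 9,
}

# Accepting states and the string each one recognizes (assumed numbering).
ACCEPTING = {3: "he", 7: "she", 8: "his", 9: "hers"}


def move(state: int, char: str) -> int:
    """Return next(state, char), or ERROR when no edge is marked with char."""
    return NEXT.get((state, char), ERROR)


def recognize(text: str):
    """Return the recognized string, or None if the input is rejected."""
    state = 0  # start state
    for char in text:
        state = move(state, char)
        if state == ERROR:
            return None
    return ACCEPTING.get(state)
```

Under these assumptions, recognize("hers") returns "hers", while recognize("her") returns None because the state reached after r is not accepting.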

Fig: 3 Lexical Analyzer [4]

The states with double circles are called accepting states. Entering an accepting state signifies recognition of a particular input string, and there is usually some sort of action associated with the accepting state (in lexical-analyzer applications, a token is returned). Unmarked edges (for example, there are no outgoing edges marked with an i, r, or e from state 0) are all implied transitions to a special implicit error state. State machines such as the foregoing can be modeled with two data structures: a single variable holding the current state number and a two-dimensional array for computing the next state. One axis of the array is indexed by the input character, the other by the current state, and the array holds the next state. The next state is thus determined from the current state and the input character. In Table 1, two arrays are used: one to hold the state transitions and another to tell whether a state is accepting or not.

Table: 1 Representing the state machine

5.0 STATE-MACHINE-DRIVEN LEXICAL ANALYZERS: This section demonstrates how state machines are used for lexical analysis by looking, at a high level, at the method used by a generated lexical analyzer. I will describe a simple table-driven analyzer that recognizes decimal and floating-point constants. The following regular expressions describe these constants:

    [0-9]+                                              return ICON;
    ([0-9]+|[0-9]*\.[0-9]+|[0-9]+\.[0-9]*)(e[0-9]+)?    return FCON;

The code to the right of each regular expression is executed by the lexical analyzer when an input string that matches that expression is recognized. The first expression recognizes a simple sequence of one or more digits. The second expression recognizes a floating-point constant. The (e[0-9]+)? at the end of the second regular expression is the optional engineering notation at the end of the number. I have simplified by not allowing the usual + or - to follow the e, and only a lower-case e is recognized. The lexical analyzer uses a state-machine approach to recognize regular expressions, and a DFA that recognizes the previous expressions is shown in Fig. 4. The same machine is represented as an array in Table 2. The next state is computed using that array with:

    next_state = array[current_state][input]
Fig: 4 Regular Expression

Current state    Lookahead character       Accepting action
                 .       0-9     e
0                3       1       -         -
1                2       1       5         return ICON;
2                -       2       5         return FCON;
3                -       2       -         -
4                -       4       -         return FCON;
5                -       4       -         -

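Table: 2 Regular Expression Table

The following example shows how the machine works on the input 1.2e4. The lexical analyzer starts in state 0. The 1 causes a transition to state 1, and the input is advanced. Since state 1 is a potential accepting state, the current input position and state number are remembered. The dot then leads to state 2, and the following 2 leaves the machine in state 2; state 2 is also accepting, so the remembered position and state are overwritten each time. The e now leads to state 5, which is not an accepting state, so no other action is performed. The final 4 causes a transition to state 4, which is an accepting state, so the remembered position is updated again. There is no legal transition out of state 4 on end of input, so the machine enters the failure state. Here, the action associated with the most recently seen accepting state (4) is performed, and the machine returns FCON. The next time the subroutine is called, it returns zero immediately, because the lookahead character is end of input.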
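The walk-through of the input 1.2e4 can be reproduced with a small table-driven sketch in Python. This is an illustration under stated assumptions rather than the code a generator would emit: the transition table is derived from the two regular expressions for ICON and FCON, and the names NEXT_STATE, ACCEPT, column and next_token are hypothetical. Where the subroutine described in the text returns zero at end of input, this sketch returns None.

```python
FAIL = -1                # no legal transition
DOT, DIGIT, E = 0, 1, 2  # column indices of the lookahead character classes

# NEXT_STATE[current_state][column] gives the next state.
NEXT_STATE = [
    # '.'   0-9   'e'
    [3,     1,    FAIL],  # state 0 (start)
    [2,     1,    5],     # state 1: digits only so far
    [FAIL,  2,    5],     # state 2: digits with a decimal point
    [FAIL,  2,    FAIL],  # state 3: leading '.', a digit is required
    [FAIL,  4,    FAIL],  # state 4: exponent digits
    [FAIL,  4,    FAIL],  # state 5: 'e' seen, a digit is required
]

# Accepting action per state; None marks a non-accepting state.
ACCEPT = [None, "ICON", "FCON", None, "FCON", None]


def column(char):
    """Map a lookahead character to its table column, or None."""
    if char == ".":
        return DOT
    if "0" <= char <= "9":
        return DIGIT
    if char == "e":
        return E
    return None


def next_token(text, pos=0):
    """Return (token, lexeme, end) for the longest match at pos, else None."""
    state = 0
    last_accept = None  # most recently seen accepting state and position
    i = pos
    while i < len(text):
        col = column(text[i])
        if col is None or NEXT_STATE[state][col] == FAIL:
            break  # failure: fall back to the remembered accepting state
        state = NEXT_STATE[state][col]
        i += 1
        if ACCEPT[state] is not None:
            last_accept = (ACCEPT[state], i)
    if last_accept is None:
        return None
    token, end = last_accept
    return token, text[pos:end], end
```

Calling next_token("1.2e4") visits states 1, 2, 2, 5 and 4 and returns ("FCON", "1.2e4", 5). A second call, next_token("1.2e4", 5), returns None at once because no input remains. Remembering the last accepting position is what lets an input such as 1e fall back to the token ICON with lexeme 1.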

6.0 CONCLUSION: Automata theory is very beneficial in compiler design, because a compiler is a complex program.

REFERENCES
1. Hari Kishan, Discrete Mathematics, Pragati Prakashan, 2008.
2. Seymour Lipschutz, Discrete Mathematics, Tata McGraw-Hill, 2008.
3. Allen I. Holub, Compiler Design in C, Prentice-Hall of India, 2005.
4. Automata and Compiler Design Notes, http://www.mediafire.com/?r4yq2danoab4og6

Mr. Om Prakash Jha is pursuing a Master of Computer Science at Bharathiar University, Coimbatore, in 2010 and received a B.Sc. in Mathematics from Tilka Manjhi Bhagalpur University, Bihar, in 2006.

Mr. R. Balu received his Master of Computer Science from Bharathidasan University, Trichy, in 2005 and his M.Phil. in Computer Science from Bharathiar University, Coimbatore, in 2008. Currently, he is working as a faculty member in the School of Computer Science and Engineering, Bharathiar University, Coimbatore. He is pursuing his Ph.D., with research in image mining. His current research interests are in the fields of Networks, Multimedia and Data Mining. He is a member of IEEE, CSS, IET and IAENG.
