
Alphabets, Strings and Languages

Languages :
A general definition of language must cover a variety of distinct categories: natural languages, programming languages, mathematical languages, etc. The notion of natural languages like English, Hindi, etc. is familiar to us. Informally, a language can be defined as a system suitable for the expression of certain ideas, facts, or concepts, which includes a set of symbols and rules to manipulate them. The languages we consider in our discussion are abstractions of natural languages. That is, our focus here is on formal languages, which need precise and formal definitions. Programming languages belong to this category. We start with some basic concepts and definitions required in this regard.

Symbols :
Symbols are indivisible objects or entities that cannot be defined. That is, symbols are the atoms of the world of languages. A symbol is any single object such as a, 0, 1, #, begin, or do. Usually, only characters from a typical keyboard are used as symbols.

Quiz

1. Prove that $(xy)^R = y^R x^R$ for all strings $x$ and $y$.

2. Show that $|x^k| = k|x|$ for every string $x$ and every integer $k \ge 0$.

3. Consider the language $L = \{01, 11, 011\}$. Which of the following strings are in $L^*$: 010101, 0001, 110, 010111101, 0111111110, 11010111111101, 110111110011, 11101101?

4. Let $L_1 = \{00, 11\}$ and $L_2 = \{\varepsilon, 0, 01\}$.
a) List the strings in the set $L_1 L_2$.
b) List the strings of the set $L_2^*$ of length three or less.
c) How many strings of length 5 are there in $L_1^*$?
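Answers to questions 3 and 4 can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: languages are modelled as finite sets of Python strings, the empty string is written `""`, and the function names `concat`, `in_star`, and `star_up_to` are hypothetical helpers chosen for illustration. It assumes the first element of L2 in question 4 is the empty string.

```python
def concat(l1, l2):
    """Concatenation L1L2 = { xy : x in L1, y in L2 }."""
    return {x + y for x in l1 for y in l2}


def in_star(w, lang):
    """Decide whether w is in L* by dynamic programming over prefixes of w."""
    # reachable[i] is True when w[:i] is a concatenation of strings from lang.
    reachable = [False] * (len(w) + 1)
    reachable[0] = True  # the empty string always belongs to L*
    for i in range(len(w)):
        if reachable[i]:
            for s in lang:
                # Skip the empty string: it never advances the position.
                if s and w.startswith(s, i):
                    reachable[i + len(s)] = True
    return reachable[len(w)]


def star_up_to(lang, max_len):
    """All strings of L* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend every newly found string by one more factor from lang.
        frontier = {x + s for x in frontier for s in lang
                    if s and len(x + s) <= max_len} - result
        result |= frontier
    return result
```

For example, `in_star("010101", {"01", "11", "011"})` tests the first candidate string of question 3, and `star_up_to({"", "0", "01"}, 3)` enumerates the set asked for in question 4 b).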

Automata and Grammars


Automata
An automaton is an abstract computing device (or machine). There are different varieties of such abstract machines (also called models of computation), which can be defined mathematically. Some of them are as powerful in principle as today's real computers, while the simpler ones are less powerful. (Some models are considered even more powerful than any real computer, as they have infinite memory and are not subject to the physical memory constraints of real computers.) Studying the simpler machines is still worthwhile, as it makes it easier to introduce some of the formalisms used in the theory.

Every automaton has some essential features in common with real computers. It has a mechanism for reading input. The input is assumed to be a sequence of symbols over a given alphabet and is placed on an input tape (or written on an input file). The simpler automata can only read the input one symbol at a time from left to right and cannot change it. More powerful versions can both read (from left to right or right to left) and change the input. The automaton can produce output of some form. If the output in response to an input string is binary (say, accept or reject), then it is called an accepter. If it produces an output sequence in response to an input sequence, then it is called a transducer (or automaton with output). The automaton may have a temporary storage, consisting of an unlimited number of cells, each capable of holding a symbol from an alphabet (which may be different from the input alphabet). The automaton can both read and change the contents of the storage cells in the temporary storage. The accessing capability of this storage varies depending on the type of the storage.
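The simplest kind of accepter described above (read-only input, scanned once from left to right, no temporary storage) can be sketched in a few lines of Python. This is an illustrative sketch, not a definition from the notes: the function name `run_accepter`, the dictionary encoding of the transition table, and the example machine are all assumptions made for the example.

```python
def run_accepter(transitions, start, accepting, tape):
    """Read the tape one symbol at a time, left to right, without changing it.

    transitions maps (state, symbol) to the next state; the result is the
    binary output of an accepter: True for accept, False for reject.
    """
    state = start
    for symbol in tape:
        state = transitions[(state, symbol)]
    return state in accepting


# Hypothetical example: accept binary strings containing an even number of 1s.
even_ones = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
print(run_accepter(even_ones, "even", {"even"}, "1001"))
```

The machine keeps only its current state between symbols, which is the sense in which the simpler automata have no storage beyond their finite control.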

Finite Automata
Finite automata (singular: automaton) are a particularly simple, but useful, model of computation. They were initially proposed as a simple model for the behaviour of neurons. The concept of a finite automaton appears to have arisen in the 1943 paper "A logical calculus of the ideas immanent in nervous activity", by Warren McCulloch and Walter Pitts. In 1951 Kleene introduced regular expressions to describe the behaviour of finite automata. He also proved the important theorem that regular expressions exactly capture the behaviours of finite automata. In 1959, Dana Scott and Michael Rabin introduced nondeterministic automata and showed the surprising theorem that they are equivalent to deterministic automata. We will study these fundamental results. Since those early years, the study of automata has continued to grow, showing that they are indeed a fundamental idea in computing.

Nondeterministic Finite Automata (NFA)


Nondeterminism is an important abstraction in computer science. Its importance is seen in the design of algorithms: for example, there are many problems with efficient nondeterministic solutions but no known efficient deterministic solutions (travelling salesman, Hamiltonian cycle, clique, etc.). The behaviour of a process in a distributed system is also a good example of a nondeterministic situation, because the behaviour of a process might depend on messages from other processes that arrive at arbitrary times with arbitrary contents. It is often easier to construct and comprehend an NFA than a DFA for a given regular language. The concept of an NFA can also be used in proving many theorems and results. Hence, it plays an important role in this subject. In the context of FA, nondeterminism can be incorporated naturally. That is, an NFA is defined in the same way as a DFA but with the following two exceptions:

i) multiple next states;
ii) $\varepsilon$-transitions.
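Both exceptions (several possible next states, and transitions that consume no input, usually called ε-transitions) can be handled by tracking the set of all states the NFA could currently be in. The following Python sketch illustrates this under stated assumptions: the transition relation is a dictionary from (state, symbol) to a set of states, the empty string `""` stands for ε, and the names `epsilon_closure` and `nfa_accepts` are hypothetical.

```python
def epsilon_closure(states, delta):
    """All states reachable from `states` using only epsilon-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_accepts(delta, start, accepting, word):
    """Simulate an NFA by following every possible next state at once."""
    current = epsilon_closure({start}, delta)
    for symbol in word:
        moved = set()
        for q in current:
            # Missing entries mean "no move", i.e. that branch dies.
            moved |= delta.get((q, symbol), set())
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)


# Hypothetical example: strings over {0, 1} that end in 01.
delta = {
    ("s", ""): {"q0"},            # an epsilon-transition
    ("q0", "0"): {"q0", "q1"},    # multiple next states
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
print(nfa_accepts(delta, "s", {"q2"}, "1101"))
```

The input is accepted if at least one of the tracked states is accepting after the last symbol, which matches the usual reading that an NFA accepts when some sequence of choices leads to acceptance.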

Regular Expressions (RE)


REs: Formal Definition
We construct REs from primitive constituents (basic elements) by repeatedly applying certain recursive rules as given below.

Definition : Let $\Sigma$ be an alphabet. The regular expressions over $\Sigma$ are defined recursively as follows.

Basis :
i) $\emptyset$ is a RE.
ii) $\varepsilon$ is a RE.
iii) For every $a \in \Sigma$, $a$ is a RE.

These are called primitive regular expressions, i.e. the primitive constituents.

Regular Grammars
A grammar is right-linear if each production has one of the following three forms:

$A \to cB$, $A \to c$, $A \to \varepsilon$

where $A$ and $B$ are nonterminals (with $A = B$ allowed) and $c$ is a terminal symbol. A grammar G is left-linear if each production has one of the following three forms:

$A \to Bc$, $A \to c$, $A \to \varepsilon$

A right-linear or left-linear grammar is called a regular grammar.
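The two definitions translate directly into a check on the shape of each production. The Python sketch below is illustrative only: it assumes a production is encoded as a pair (head, body) with the body a tuple of symbols, that c denotes a single terminal symbol, and that the empty tuple stands for an ε-production. The function names are hypothetical.

```python
def is_right_linear(productions, nonterminals):
    """True if every production is A -> cB, A -> c, or A -> epsilon."""
    for head, body in productions:
        if head not in nonterminals:
            return False
        if len(body) == 0:                                  # A -> epsilon
            continue
        if len(body) == 1 and body[0] not in nonterminals:  # A -> c
            continue
        if (len(body) == 2 and body[0] not in nonterminals
                and body[1] in nonterminals):               # A -> cB
            continue
        return False
    return True


def is_left_linear(productions, nonterminals):
    """True if every production is A -> Bc, A -> c, or A -> epsilon."""
    # Reversing each body turns the left-linear forms into right-linear ones.
    mirrored = [(head, tuple(reversed(body))) for head, body in productions]
    return is_right_linear(mirrored, nonterminals)


def is_regular(productions, nonterminals):
    """A grammar is regular if it is right-linear or left-linear."""
    return (is_right_linear(productions, nonterminals)
            or is_left_linear(productions, nonterminals))
```

Note that the test is applied to the grammar as a whole: a grammar that mixes forms such as A → cB and A → Bc is neither right-linear nor left-linear, so `is_regular` rejects it.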

Minimization of Deterministic Finite Automata (DFA)


For any regular language L it may be possible to design different DFAs to accept L. Given two DFAs accepting the same language L, it is natural to ask which one is simpler. In this case, obviously, the one with fewer states is the simpler of the two. So, given a DFA accepting a language, we might wonder whether the DFA could be simplified further, i.e. can we reduce the number of states while accepting the same language? Consider the following DFA:

Figure 1
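One common way to answer the question of whether a DFA has redundant states is partition refinement: start by separating accepting from non-accepting states, then keep splitting groups whose members disagree on which group they move to under some input symbol. The Python sketch below uses this standard technique, which is not necessarily the method these notes go on to present. It assumes a total transition function given as a dictionary and that every state is reachable from the start state; the name `minimal_state_count` is hypothetical.

```python
def minimal_state_count(states, alphabet, delta, accepting):
    """Number of states of the smallest DFA equivalent to the given one."""
    # Initial partition: accepting versus non-accepting states.
    block_of = {q: int(q in accepting) for q in states}
    while True:
        ids = {}
        new_block = {}
        for q in states:
            # States stay together only if they share a block and move to
            # the same block under every input symbol.
            signature = (block_of[q],) + tuple(
                block_of[delta[(q, a)]] for a in alphabet)
            new_block[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block_of.values())):
            return len(ids)  # no block was split: the partition is stable
        block_of = new_block


# Hypothetical example: a three-state DFA for binary strings ending in 0,
# in which states "A" and "C" behave identically.
states = ["A", "B", "C"]
delta = {
    ("A", "0"): "B", ("A", "1"): "A",
    ("B", "0"): "B", ("B", "1"): "C",
    ("C", "0"): "B", ("C", "1"): "C",
}
print(minimal_state_count(states, ["0", "1"], delta, {"B"}))
```

Each final block corresponds to one state of the reduced automaton, so a result smaller than the original number of states means the DFA can be simplified.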

Pushdown Automata (PDA)


Regular languages can be characterized as the languages accepted by finite automata. Similarly, we can characterize the context-free languages as the languages accepted by a class of machines called "Pushdown Automata" (PDA). A pushdown automaton is an extension of the NFA. It is observed that FA have limited capability (in the sense that the class of languages accepted or characterized by them is small). This is due to the "finite memory" (number of states) and "no external memory" involved with them. A PDA is simply an NFA augmented with an "external stack memory". The addition of a stack provides the PDA with a last-in, first-out memory management capability. This "stack" or "pushdown store" can be used to record a potentially unbounded amount of information. It is due to this memory management capability, with the help of the stack, that a PDA can overcome the memory limitations that prevent an FA from accepting many interesting languages. Although a PDA can store an unbounded amount of information on the stack, its access to the information on the stack is limited. It can push an element onto the top of the stack and pop an element off the top of the stack. To read down into the stack, the top elements must be popped off and are lost. Due to this limited access to the information on the stack, a PDA still has some limitations and cannot accept some other interesting languages.
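To make the role of the stack concrete, here is a small Python sketch of a stack-based recognizer. The example language, strings of n a's followed by n b's, is chosen here as a familiar illustration and is an assumption, not necessarily the example the notes had in mind. The recognizer is a deterministic special case of a PDA, and the function name `accepts_anbn` is hypothetical.

```python
def accepts_anbn(word):
    """Accept strings of the form a...ab...b with equally many a's and b's."""
    stack = []       # the pushdown store: access is at the top only
    state = "push"   # finite control: "push" while reading a's, then "pop"
    for symbol in word:
        if state == "push" and symbol == "a":
            stack.append("A")   # remember one more a
        elif symbol == "b" and stack:
            state = "pop"       # after the first b, no further a is allowed
            stack.pop()         # match this b against a remembered a
        else:
            return False        # an a after a b, or a b with nothing to match
    return not stack            # accept when every a has been matched


print(accepts_anbn("aaabbb"))
```

The count of a's is unbounded, so no fixed number of states could record it, while the stack can. The recognizer only ever touches the top of the stack, which illustrates the limited access described above.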

Equivalence of PDAs and CFGs


We will now show that pushdown automata and context-free grammars are equivalent in expressive power, that is, the languages accepted by PDAs are exactly the context-free languages. To show this, we have to prove each of the following:

i) Given any arbitrary CFG G there exists some PDA M that accepts exactly the language generated by G.
ii) Given any arbitrary PDA M there exists a CFG G that generates exactly the language accepted by M.

(i) CFG to PDA


We will first prove the first part, i.e. we want to show how to convert a given CFG to an equivalent PDA. Let the given CFG be $G$. Without loss of generality we can assume that $G$ is in Greibach Normal Form, i.e. all productions of $G$ are of the form $A \to cB_1B_2 \cdots B_k$, where $c$ is a terminal symbol, $B_1, \dots, B_k$ are nonterminals, and $k \ge 0$.
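The usual construction for a grammar in Greibach Normal Form yields a one-state PDA whose stack holds nonterminals: the stack starts with the start symbol, and a production A → cB1...Bk becomes the move "on input c with A on top, pop A and push B1...Bk (B1 on top)"; the input is accepted when the stack is empty at the end. The Python sketch below simulates that PDA directly from the productions. It is a sketch under these assumptions rather than the notes' own construction, and the encoding of productions and the name `gnf_pda_accepts` are hypothetical.

```python
def gnf_pda_accepts(productions, start, word):
    """Simulate the one-state PDA obtained from a GNF grammar.

    productions maps a nonterminal A to a list of pairs (c, (B1, ..., Bk)),
    one pair per production A -> c B1 ... Bk. Acceptance is by empty stack.
    """
    configs = {(start,)}  # set of possible stacks, top of stack first
    for i, symbol in enumerate(word):
        remaining = len(word) - i - 1
        next_configs = set()
        for stack in configs:
            if not stack:
                continue  # an empty stack allows no further move
            top, rest = stack[0], stack[1:]
            for c, pushed in productions.get(top, []):
                if c == symbol:
                    new_stack = tuple(pushed) + rest
                    # Every stack symbol consumes at least one input symbol,
                    # so taller stacks can never be emptied in time.
                    if len(new_stack) <= remaining:
                        next_configs.add(new_stack)
        configs = next_configs
    return () in configs


# Hypothetical GNF grammar: S -> aSB | aB, B -> b (n a's then n b's, n >= 1).
grammar = {
    "S": [("a", ("S", "B")), ("a", ("B",))],
    "B": [("b", ())],
}
print(gnf_pda_accepts(grammar, "S", "aabb"))
```

Tracking a set of stacks mirrors the nondeterminism of the PDA: several productions may apply to the same nonterminal and input symbol, and the word is accepted if any choice empties the stack.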

(ii) PDA to CFG


We now want to show that for every PDA M that accepts by empty stack, there is a CFG G such that L(G) = N(M). We first see whether the "reverse of the construction" that was used in part (i) can be used here to construct an equivalent CFG from any PDA M. It can be shown that this reverse construction works only for single-state PDAs.

That is, for every one-state PDA M there is a CFG G such that L(G) = N(M). For every move of the PDA M we introduce a corresponding production in the grammar G, where N = T.

We can now apply the proof in part (i) in the reverse direction to show that L(G) = N(M).
