Application of Improved Forecasting Analyzer in Grammar Analysis

Kuihe Yang, Lingling Zhao

College of Information, Hebei University of Science and Technology
Shijiazhuang 050018, China
ykh@hebust.edu.cn
Abstract: LL(1) grammars are simple and easy to recognize, and they are widely used, but they can express only a limited range of language constructs and so cannot meet practical requirements in most circumstances. LL(k) grammars can express far more, but an LL(k) analyzing program is much more complex and is therefore not widely used in practical syntax analysis programs. The improved LL(2) analyzing method greatly simplifies the original syntax analyzing program. The method looks one more symbol ahead only for a part of the productions (regulation formulas), which enhances the practicability of the LL(2) analyzing program. The improved forecasting analyzer is therefore simpler and more efficient. If a grammar is LL(2) but not LL(1), the improved forecasting analysis method may be adopted to get better results.

Keywords: Grammar; Sign cluster; Regulation formula

I. INTRODUCTION
A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language). Different programming language paradigms call for different compiler construction techniques. In the syntax analysis phase, the compiler reads the sequence of tokens and determines the syntactic structure of the program; the result of parsing is represented as a parse tree or a syntax tree.
As a top-down parsing method, LL(1) analysis is simple, and whether a grammar is LL(1) can be determined easily. The LL(1) algorithm is also easy to implement, so LL(1) grammars are widely used. However, LL(1) grammars can express only a limited range of language constructs and therefore cannot meet practical requirements in most circumstances. LL(k) grammars can express much more, and LL(1) is only a special case of LL(k) [1], where k is the number of symbols looked ahead. However, for LL(k) with k > 1, the number of columns of the analysis table grows exponentially with k, and an LL(k) analyzing program is more complicated than an LL(1) one. LL(k) grammars are therefore not widely used in practical syntax analysis programs. Taking LL(2) grammars as an example, this paper introduces an ameliorated LL(2) analyzing method that greatly simplifies the LL(2) analyzing program.

II. A BRIEF INTRODUCTION TO THE LL FORECAST AND ANALYSIS METHOD

A. LL(1) grammar and forecast analysis table
Let $G = (V_N, V_T, P, S)$ be a context-free grammar, where $V_N$ is the set of nonterminals (not-end symbols), $V_T$ is the set of terminals (end symbols), $P$ is the set of productions (regulation formulas), and $S$ is the start symbol. Let # be the end marker of input strings. Then we have the following definitions:

$$\mathrm{First}(\alpha) = \{a \mid \alpha \Rightarrow^{*} a\beta,\ a \in V_T,\ \alpha, \beta \in V^{*}\}$$

$$\mathrm{Follow}(A) = \{a \mid S \Rightarrow^{*} \cdots Aa \cdots,\ a \in V_T,\ A \in V_N\}$$

If $S \Rightarrow^{*} \cdots A$, then we stipulate that $\# \in \mathrm{Follow}(A)$.

Let $A \to \alpha$ be a production of grammar G(S). If $\alpha \not\Rightarrow^{*} \varepsilon$, then $\mathrm{Select}(A \to \alpha) = \mathrm{First}(\alpha)$; if $\alpha \Rightarrow^{*} \varepsilon$, then $\mathrm{Select}(A \to \alpha) = \mathrm{First}(\alpha) \cup \mathrm{Follow}(A)$.
A context-free grammar is an LL(1) grammar if and only if, for every nonterminal A, any two different productions $A \to \alpha$ and $A \to \beta$ satisfy [2]:

$$\mathrm{Select}(A \to \alpha) \cap \mathrm{Select}(A \to \beta) = \Phi,$$

where $\Phi$ is the empty set.
Suppose that there is a grammar G(S):

1) S → eAbB
2) A → b
3) A → aA
4) B → c
5) B → dB

According to the above definition, G(S) is clearly an LL(1) grammar. The forecast and analysis table (predictive parsing table) of G(S) is constructed as follows: for every terminal a, if $a \in \mathrm{Select}(A \to \alpha)$, then add the production $A \to \alpha$ to the table entry M[A, a] [3]. Table 1 shows the forecast and analysis table of G(S). The numbers in Table 1 are the serial numbers of the productions of the grammar.
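The construction just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `PRODUCTIONS`, `first_of`, `compute_first`, `compute_follow`, `build_table` and `TABLE` are hypothetical, uppercase letters are taken to be nonterminals, and the empty string `""` stands for ε.

```python
# Grammar G(S) from the text; production numbers follow the paper's list.
PRODUCTIONS = {
    1: ("S", "eAbB"),
    2: ("A", "b"),
    3: ("A", "aA"),
    4: ("B", "c"),
    5: ("B", "dB"),
}
START, END = "S", "#"
NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS.values()}


def first_of(string, first):
    """First set of a symbol string; '' in the result means it derives epsilon."""
    result = set()
    for sym in string:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        result |= sym_first - {""}
        if "" not in sym_first:
            return result
    result.add("")
    return result


def compute_first():
    """Fixpoint iteration for First(A) of every nonterminal A."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODUCTIONS.values():
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def compute_follow(first):
    """Fixpoint iteration for Follow(A); the start symbol is followed by '#'."""
    follow = {n: set() for n in NONTERMINALS}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODUCTIONS.values():
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                tail = first_of(rhs[i + 1:], first)
                new = tail - {""}
                if "" in tail:
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def build_table():
    """M[A, a] gets production A -> alpha for every a in Select(A -> alpha)."""
    first = compute_first()
    follow = compute_follow(first)
    table = {}
    for number, (lhs, rhs) in PRODUCTIONS.items():
        select = first_of(rhs, first)
        if "" in select:
            select = (select - {""}) | follow[lhs]
        for terminal in select:
            table.setdefault((lhs, terminal), []).append(number)
    return table


TABLE = build_table()
```

Every entry of `TABLE` holds exactly one production number for G(S), which is the LL(1) condition stated above; an entry with two or more numbers would indicate a conflict.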
This work is supported by the Foundation of Hebei University of Science and Technology.
978-1-4244-2108-4/08/$25.00 © 2008 IEEE 1

TABLE 1  FORECAST AND ANALYSIS TABLE OF G(S)

      a    b    c    d    e
S                         1
A     3    2
B               4    5

Logically, the LL forecasting analyzer consists of a total control program, a forecasting analysis table and an analysis stack (shed), as Figure 1 shows.

[Figure: the input symbol string a1 a2 ... ai ... an #, the analysis stack holding # x1 ... xj, the total control program with its output, and the analysis table.]
Figure 1. LL forecasting analyzer

B. LL(k) grammar and forecast analysis table
The preceding definitions may be extended to k lookahead symbols, which gives the definition of an LL(k) grammar. Let $\alpha$, $\lambda$, $\eta$ denote symbol strings (sign clusters), and let $P, Q, Q_1, Q_2, \dots, Q_n$ denote sets of symbol strings. Define:

$$\alpha \cdot Q = \{\alpha\beta \mid \beta \in Q\}$$

$$P \cdot Q = \bigcup_{\alpha \in P} \alpha \cdot Q$$

$$H[k](Q) = \{\alpha \mid \alpha\beta \in Q,\ |\alpha| = k,\ |\beta| \ge 0,\ \text{or}\ \alpha \in Q,\ |\alpha| < k\}$$

$$\bigcup\!\bigcap(Q_1, Q_2, \dots, Q_n) = \bigcup_{i=1}^{n-1} \bigcup_{j=i+1}^{n} (Q_i \cap Q_j)$$

Each $Q_i$ ($i = 1, 2, \dots, n$) is called a root set of $\bigcup\!\bigcap(Q_1, Q_2, \dots, Q_n)$. If $\bigcup\!\bigcap(Q_1, Q_2, \dots, Q_n) = \Phi$, then $Q_1, Q_2, \dots, Q_n$ are pairwise disjoint.
For $k \ge 1$, define:

$$\mathrm{First}[k](\alpha) = \mathrm{First1}[k](\alpha) \cup \mathrm{First2}[k](\alpha)$$

$$\mathrm{First1}[k](\alpha) = \{\lambda \mid \alpha \Rightarrow^{*} \lambda\eta,\ \lambda \in V_T^{+},\ |\lambda| = k\}$$

$$\mathrm{First2}[k](\alpha) = \{\lambda \mid \alpha \Rightarrow^{*} \lambda,\ \lambda \in V_T^{*},\ |\lambda| < k\}$$

First[k]($\alpha$) is called the k-prefix set (first part [k] set) of the string $\alpha$. First[1]($\alpha$) is the 1-prefix set of $\alpha$; its elements are symbols in $V_T$ or $\varepsilon$. First[2]($\alpha$) is the 2-prefix set of $\alpha$. In general, the elements of First[k]($\alpha$) are strings over $V_T$ of length at most k, or $\varepsilon$.
For any nonterminal A in $V_N$, let

$$\mathrm{Follow}[k](A) = \{\lambda \mid S \Rightarrow^{*} \alpha A \lambda \eta,\ \lambda \in V_T^{+},\ |\lambda| = k\} \cup \{\lambda \mid S \Rightarrow^{*} \alpha A \lambda_1,\ \lambda_1 \in V_T^{*},\ |\lambda_1| < k,\ \lambda = \lambda_1 \#^{n},\ n = k - |\lambda_1|\}$$

Follow[k](A) is called the k-follow set (subsequence [k] collection) of the nonterminal A. Its elements are strings of length k composed of symbols in $V_T$ and the symbol #.
A context-free grammar is an LL(k) grammar if and only if, for every nonterminal A and any two different productions $A \to \alpha$ and $A \to \beta$, the following holds:

$$\bigcup\!\bigcap\bigl(\mathrm{First1}[k](\alpha),\ \mathrm{First1}[k](\beta),\ H[k](\mathrm{First2}[k](\alpha) \cdot \mathrm{Follow}[k](A)),\ H[k](\mathrm{First2}[k](\beta) \cdot \mathrm{Follow}[k](A))\bigr) = \Phi$$

The left-hand side of this expression is the LL(k) conflict set (clash collection).
Suppose that there is a context-free grammar G[P]:

1) P → eDeS
2) D → d;D
3) D → d
4) S → s;S
5) S → s

According to the above definition, it can be proved that G[P] is an LL(2) grammar but not an LL(1) grammar. The LL(2) analysis table is constructed in the same way as the LL(1) table. Table 2 shows the analysis table of G[P].

TABLE 2  LL(2) FORECAST AND ANALYSIS TABLE OF G[P]

Nonterminal    Terminal pair (two lookahead symbols)
               ed    d;    de    s;    s#
P              1
D                    2     3
S                                4     5

III. THE AMELIORATED LL(2) FORECAST AND ANALYSIS TABLE

A. The ameliorated LL(2) grammar
As can be seen from Table 2, the LL(2) analysis table is much more complicated than the LL(1) analysis table, so the use of LL(2) analysis programs is limited in practice. For this reason, the LL(2) analysis method is improved in reference [4].
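The statement that G[P] is LL(2) but not LL(1), and the entries of Table 2, can be checked mechanically. The following Python sketch computes, for each production A → α, the k-truncated set H[k](First[k](α)·Follow[k](A)) by fixpoint iteration. It is an illustration only: the helper names (`concat_k`, `first_k_of`, `lookahead_sets`, `conflicts`) and the auxiliary start symbol `Z` with the rule Z → P #^k, used to pad Follow strings with #, are assumptions introduced here and do not appear in the paper.

```python
GRAMMAR = {
    1: ("P", "eDeS"),
    2: ("D", "d;D"),
    3: ("D", "d"),
    4: ("S", "s;S"),
    5: ("S", "s"),
}
NONTERMINALS = {"P", "D", "S"}


def concat_k(left, right, k):
    """k-truncated concatenation of two sets of terminal strings."""
    return {(x + y)[:k] for x in left for y in right}


def first_k_of(string, first, k):
    """First[k] of a symbol string, given First[k] of each nonterminal."""
    result = {""}
    for sym in string:
        result = concat_k(result, first[sym] if sym in first else {sym}, k)
    return result


def lookahead_sets(k):
    """Map each production number to H[k](First[k](alpha) . Follow[k](A))."""
    # Hypothetical start rule Z -> P #^k pads every Follow string to length k.
    rules = dict(GRAMMAR)
    rules[0] = ("Z", "P" + "#" * k)
    first = {n: set() for n in NONTERMINALS | {"Z"}}
    follow = {n: set() for n in NONTERMINALS | {"Z"}}
    follow["Z"] = {""}
    changed = True
    while changed:  # fixpoint for First[k]
        changed = False
        for lhs, rhs in rules.values():
            new = first_k_of(rhs, first, k)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    changed = True
    while changed:  # fixpoint for Follow[k]
        changed = False
        for lhs, rhs in rules.values():
            for i, sym in enumerate(rhs):
                if sym in first:
                    tail = first_k_of(rhs[i + 1:], first, k)
                    new = concat_k(tail, follow[lhs], k)
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return {n: concat_k(first_k_of(rhs, first, k), follow[lhs], k)
            for n, (lhs, rhs) in GRAMMAR.items()}


def conflicts(k):
    """Pairs of productions with the same left part and overlapping lookahead."""
    sets = lookahead_sets(k)
    return [(i, j) for i in GRAMMAR for j in GRAMMAR
            if i < j and GRAMMAR[i][0] == GRAMMAR[j][0] and sets[i] & sets[j]]
```

With k = 1 the productions of D clash on d and those of S clash on s, while with k = 2 every production has its own lookahead string, which corresponds to the five columns of Table 2.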

Suppose that a context-free grammar G has an LL(1) conflict set that is not empty, but an LL(2) conflict set that is empty. Let $S_A$ denote the LL(1) conflict set of the productions whose left part is A; by the definition of the conflict set, $\varepsilon \notin S_A$. Let Q be a set of symbol strings and a a symbol string; a_Q denotes the subset of Q consisting of those elements whose prefix is a. If, for any two productions $A \to \alpha$ and $A \to \beta$ whose left part is A,

$$\bigcup\!\bigcap\Bigl(\mathrm{First1}[1](\alpha) - S_A,\ \mathrm{First1}[1](\beta) - S_A,\ H[1](\mathrm{First2}[1](\alpha) \cdot \mathrm{Follow}[1](A)) - S_A,\ H[1](\mathrm{First2}[1](\beta) \cdot \mathrm{Follow}[1](A)) - S_A,\ \bigcup_{a \in S_A} a\_\mathrm{First1}[2](\alpha),\ \bigcup_{a \in S_A} a\_\mathrm{First1}[2](\beta),\ \bigcup_{a \in S_A} a\_H[2](\mathrm{First2}[2](\alpha) \cdot \mathrm{Follow}[2](A)),\ \bigcup_{a \in S_A} a\_H[2](\mathrm{First2}[2](\beta) \cdot \mathrm{Follow}[2](A))\Bigr) = \Phi$$

then the context-free grammar G is called an ameliorated LL(2) grammar. The left-hand side of the above expression is the conflict set expression of the ameliorated LL(2) grammar [4]. It can be proved that G[P] satisfies this condition.

B. The construction of the ameliorated forecast and analysis table
The ameliorated forecast and analysis table may be constructed according to the expression of the ameliorated LL(2) conflict set. The method is as follows.
First, construct the LL(1) analysis table in the usual way. If some cell contains more than one production number, process it further. Let $A \to \alpha_i$ ($i = 1, 2, \dots, n$) be the productions whose left part is the nonterminal A. For every terminal $a \in S_A$, construct $a\_\mathrm{First1}[2](\alpha_i)$ and $a\_H[2](\mathrm{First2}[2](\alpha_i) \cdot \mathrm{Follow}[2](A))$. Over all $\alpha_i$, $i = 1, 2, \dots, n$, let the union of these sets be $\{ab_1, ab_2, \dots, ab_s\}$. Split the cell in row A and column a into s sub-cells and label them $b_i$, $i = 1, 2, \dots, s$. If $ab_i \in a\_\mathrm{First1}[2](\alpha_j)$ or $ab_i \in a\_H[2](\mathrm{First2}[2](\alpha_j) \cdot \mathrm{Follow}[2](A))$, add the serial number of the production $A \to \alpha_j$ to the sub-cell labeled $b_i$. This yields the ameliorated LL(2) forecast and analysis table. Table 3 shows the ameliorated forecast and analysis table of G[P].

TABLE 3  IMPROVED LL(2) FORECAST AND ANALYSIS TABLE OF GRAMMAR G[P]

Nonterminal    Terminal (first lookahead symbol)
               e       d                  s
P              1
D                      ; : 2    e : 3
S                                         ; : 4    # : 5

In the cells for D and S, the symbol before each colon is the second lookahead symbol that labels the sub-cell.

According to Table 3, analysis with the ameliorated LL(2) forecast and analysis table mainly follows the LL(1) method. For example, when the nonterminal P is on top of the stack, look one symbol ahead. If it is not the terminal e, report an error. If it is the terminal e, select production 1. For an ameliorated LL(2) grammar, looking one symbol ahead is enough in most circumstances, but in some circumstances a second symbol must be examined. For example, when the nonterminal D is on top of the stack, look one symbol ahead. If it is not the terminal d, report an error. If it is the terminal d, look one more symbol ahead. If that symbol is the terminal ;, select production 2. If it is the terminal e, select production 3. If it is any other terminal, report an error.

IV. APPLICATION EXAMPLE
An LL analyzing program is composed of the forecast and analysis program, a first-in, last-out stack, and the forecast and analysis table. We now use the ameliorated LL(2) forecast and analysis table to decide whether the symbol string edes is a sentence of grammar G[P]. Table 4 shows the analysis process. The symbol # is the end marker appended to the sentence. When a production is applied, the top element of the stack is replaced by the right part of the production in reverse order.

TABLE 4  THE ANALYZING PROCESS OF SYMBOL STRING edes#

Step  Analysis stack  Remaining input  Production used  Remark
1     #P              edes#            1  P → eDeS      look one symbol ahead
2     #SeDe           edes#                             e matches
3     #SeD            des#             3  D → d         look two symbols ahead
4     #Sed            des#                              d matches
5     #Se             es#                               e matches
6     #S              s#               5  S → s         look one symbol ahead
7     #s              s#                                s matches
8     #               #                                 accept

As can be seen from Table 4, the symbol string edes is accepted, so edes is a sentence of grammar G[P].
On the whole, when a sentence is analyzed with the ameliorated LL(2) grammar, looking one symbol ahead is usually enough to determine the production used in the derivation. Grammar analysis with the ameliorated LL(2) analysis table is therefore much more efficient than with the LL(2) table.
Although reference [4] indicates that an LL(k) analysis table may be constructed by the same method, LL(k) analysis involves some complications. First, the forecast and analysis table becomes much bigger, because the number of columns grows exponentially with k. Second, the forecast and analysis table cannot express the full power of LL(k) analysis, because not every Follow string occurs in every context. Therefore, although the above method may be used for LL(k) grammars with k > 2, the use of LL(k) analysis programs with k > 2 is limited in practice. LL(k) grammars with k ≥ 2 are not widely used in current compilers.
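The table-driven analysis of the application example can be traced with a short Python sketch. It is a minimal illustration assuming the nested-dictionary encoding of the ameliorated table shown below; the names `TABLE`, `PRODUCTIONS` and `parse` are hypothetical. A plain integer entry means that one lookahead symbol decides the production, and a nested dictionary is a split cell indexed by the second lookahead symbol.

```python
PRODUCTIONS = {
    1: ("P", "eDeS"),
    2: ("D", "d;D"),
    3: ("D", "d"),
    4: ("S", "s;S"),
    5: ("S", "s"),
}

# Ameliorated LL(2) table of G[P]: TABLE[nonterminal][first lookahead symbol].
TABLE = {
    "P": {"e": 1},
    "D": {"d": {";": 2, "e": 3}},
    "S": {"s": {";": 4, "#": 5}},
}


def parse(text):
    """Return the production numbers used, or None on a syntax error."""
    tokens = text + "#"          # append the end marker
    stack = ["#", "P"]           # bottom marker, then the start symbol
    pos = 0
    used = []
    while True:
        top = stack.pop()
        look = tokens[pos]
        if top == "#":
            return used if look == "#" else None
        if top not in TABLE:     # terminal on top: it must match the input
            if top != look:
                return None
            pos += 1
            continue
        entry = TABLE[top].get(look)
        if isinstance(entry, dict):   # split cell: consult the second symbol
            entry = entry.get(tokens[pos + 1])
        if entry is None:
            return None
        used.append(entry)
        # Push the right part in reverse so its first symbol ends up on top.
        stack.extend(reversed(PRODUCTIONS[entry][1]))
```

For the input `edes`, `parse` applies productions 1, 3 and 5 in that order, the same productions that appear in the analysis process of the application example; a string such as `ede` is rejected.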
After the syntax analysis phase comes the semantic analysis phase, in which the compiler checks the meaning of the constructs. Most of the semantic analysis is done by the parser; some may be done by the lexer or the code generator. The parser together with the part of the code generator that performs semantic analysis is called the constrainer. Type checking and uniqueness checking are also done in the semantic analysis phase.
V. CONCLUSIONS
As can be seen from the construction and analysis methods of the LL(2) forecast and analysis table, the ameliorated LL(2) forecast and analysis table is generally much smaller than the full LL(2) table. The improved LL(2) analysis is therefore simpler and more efficient. If a grammar is LL(2) but not LL(1), the improved LL(2) analysis method may be adopted to obtain higher efficiency. For LL(k) grammars with k ≥ 2, however, other grammar analysis methods should be adopted, because constructing the LL(k) analysis table is complicated.
ACKNOWLEDGMENT
This work is supported by the Foundation of Hebei
University of Science and Technology.
REFERENCES
[1] Jin Chengzhi. "Construct principle and construct technology of translate and edit programme". Beijing: Higher Education Press, 2000.
[2] Jiang Liyuang, Kang Muning. "The principle of translate and edit". Xian: Northwest Industry University Press, 2000.
[3] Chen Huowang, Zhao Chunling, Tan Qingping. "The design language of program principle of translate and edit". Beijing: National Defense Industry Press, 2000.
[4] Lv Yingzhi, Feng Jianhua. "LL(2) grammar analyzing method base – LL(1) grammar". The Transaction of Tsinghua University, 1997, 37(1): 102-105.
[5] Zhang Be, Huang Tao, Fu Yuanbin. "Design and Implementation of compiler of object description language". The Transaction of Software, 1998, 9(7): 525-531.
[6] Terrence W P, Marvin V Z. "Programming Languages: Design and Implementation". Prentice Hall, 1996.
[7] Gong Tianfu. "Program Design Language and Compiler". Publishing House of Electronics Industry, 2003.
[8] M. J. C. Gordon. "Programming Language Theory and its Implementation". Prentice Hall, 1988.
[9] Hu Lunjun, Xu Lanfang, Luo Ting. "Compiler Principle". Beijing: Publishing House of Electronics Industry, 2005.