Quiz
Subject: kuis pretest1_Kelas_Nama_NPM (Kelas = class, Nama = name, NPM = student ID number)
4IA09:
https://drive.google.com/drive/folders/1W1l-2wgwRTE9puRhhQREFnlqXn3rztMe?usp=sharing
4IA11:
https://drive.google.com/drive/folders/1d2kk81A9XLBJnQfr88kUlmEK5-RQaUO_?usp=sharing
4IA07:
https://drive.google.com/drive/folders/1EkI9wENGGY1uFCr
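Source program with macros
→ Preprocessor
→ Compiler
→ Target assembly program
→ Assembler
→ Linker
Try g++ with the -v, -E, and -S flags on linprog: -v prints the commands run at each stage, -E stops after preprocessing, and -S stops after compilation and leaves the assembly program.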
Compiler Front- and Back-end Pass
Front end (analysis):
  Source program (character stream)
  → Scanner (lexical analysis)
  → Tokens
  → Parser (syntax analysis)
  → Parse tree
  → Semantic Analysis and Intermediate Code Generation
  → Abstract syntax tree or other intermediate form
Back end (synthesis):
  Abstract syntax tree or other intermediate form
  → Machine-Independent Code Improvement
  → Modified intermediate form
  → Target Code Generation
  → Assembly or object code
  → Machine-Specific Code Improvement
  → Modified assembly or object code
Example of the Compilation Process
The Scanning Process (Lexical Analysis)
Main goal: recognize words (tokens)
How? By recognizing patterns
Example: an identifier is a sequence of letters or digits that
begins with a letter
Lexical patterns form a regular language
Regular languages can be described using
regular expressions (REs)
Can an RE recognizer be automated?
Yes!
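As a minimal sketch of the identifier pattern above (a letter followed by letters or digits), the check can be written by hand in C. The function name is_identifier is an assumption made for illustration, and "letter" is taken to mean whatever isalpha accepts in the default C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s is a letter followed by
 * zero or more letters or digits, and 0 otherwise. */
int is_identifier(const char *s)
{
    assert(s != NULL);
    if (!isalpha((unsigned char)s[0]))
        return 0;                   /* must begin with a letter; also rejects "" */
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!isalnum((unsigned char)s[i]))
            return 0;               /* only letters and digits may follow */
    }
    return 1;
}
```

For instance, is_identifier("x1") returns 1, while is_identifier("2be") returns 0 because the string starts with a digit.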
The scanning process
Goal: automate the process
Idea:
Start with an RE or a regular definition (RD)
Build a DFA
How?
We can build a non-deterministic finite automaton
(Thompson's construction)
Convert that to a deterministic one
(Subset construction)
Minimize the DFA
(Hopcroft's algorithm)
Implement it
Existing scanner generators: flex, lex
The Scanning Process
Definition: Regular expressions (over an alphabet Σ)
ε is an RE denoting {ε}
If a ∈ Σ, then a is an RE denoting {a}
If r and s are REs, then
(r) is an RE denoting L(r)
r|s is an RE denoting L(r) ∪ L(s)
rs is an RE denoting L(r)L(s)
r* is an RE denoting (L(r))*
Regular Definitions
A regular expression that describes digits is:
0|1|2|3|4|5|6|7|8|9
For convenience, we'd like to give it a name and then
use the name in building more complex regular
expressions:
digit → 0|1|2|3|4|5|6|7|8|9
This is called a regular definition.
Example
Integer → 0 | ((1|2|3|...|9) digit*)
letter → a|...|z|A|...|Z
ident → letter (letter | digit)*
Token_if → if
Token_Then → (t|T)(h|H)(e|E)(n|N)
Example string matched by Token_Then: tHeN
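One way to read the Integer and Token_Then definitions is as small predicates that reuse a named building block, much as a regular definition reuses digit. The sketch below is only an illustration under that reading; the names is_digit, matches_integer, and matches_then are assumptions, not part of any tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* digit -> 0|1|...|9 */
static int is_digit(char c) { return c >= '0' && c <= '9'; }

/* Integer -> 0 | ((1|...|9) digit*) */
int matches_integer(const char *s)
{
    assert(s != NULL);
    if (s[0] == '0')
        return s[1] == '\0';        /* "0" alone; leading zeros are rejected */
    if (!is_digit(s[0]))
        return 0;                   /* must start with 1..9; also rejects "" */
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!is_digit(s[i]))
            return 0;
    }
    return 1;
}

/* Token_Then -> (t|T)(h|H)(e|E)(n|N) */
int matches_then(const char *s)
{
    static const char word[] = "then";
    assert(s != NULL);
    for (size_t i = 0; i < 4; i++) {
        /* a premature '\0' fails this comparison, so we never read past the end */
        if (tolower((unsigned char)s[i]) != word[i])
            return 0;
    }
    return s[4] == '\0';            /* nothing may follow the four letters */
}
```

Under these definitions "12" is an Integer but "012" is not, and "tHeN" matches Token_Then.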
digit → 0|1|2|3|4|5|6|7|8|9
letter → a|...|z|A|...|Z
bEtE2
bE2tE
What’s next
Given an input string, we need a “machine” that has
a regular expression hard-coded in it and can tell
whether the input string matches the pattern
described by the regular expression or not.
The scanning process
Definition: Deterministic Finite Automaton
a five-tuple (Σ, S, δ, s0, F) where
Σ is the alphabet
S is the set of states
δ is the transition function (S × Σ → S)
s0 is the starting state
F is the set of final states (F ⊆ S)
Notation:
Use a transition diagram to describe a DFA
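The five-tuple maps fairly directly onto a table-driven recognizer. The following sketch encodes a DFA for the language 0 | (1..9)(0..9)*, chosen here only as an example; the state names, the character classes standing in for Σ, and the function integer_dfa_accepts are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Character classes stand in for the alphabet: '0', '1'..'9', anything else. */
enum { C_ZERO, C_NONZERO, C_OTHER, NUM_CLASSES };

/* S = {START, ZERO, DIGITS, DEAD}; s0 = START; F = {ZERO, DIGITS}. */
enum { START, ZERO, DIGITS, DEAD, NUM_STATES };

/* Transition function: one row per state, one column per character class. */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* START  */ { ZERO,   DIGITS, DEAD },
    /* ZERO   */ { DEAD,   DEAD,   DEAD },
    /* DIGITS */ { DIGITS, DIGITS, DEAD },
    /* DEAD   */ { DEAD,   DEAD,   DEAD },
};

static const int is_final[NUM_STATES] = { 0, 1, 1, 0 };

static int char_class(char c)
{
    if (c == '0') return C_ZERO;
    if (c >= '1' && c <= '9') return C_NONZERO;
    return C_OTHER;
}

/* Runs the DFA over the whole input and reports whether it ends in F. */
int integer_dfa_accepts(const char *input)
{
    assert(input != NULL);
    int state = START;
    for (size_t i = 0; input[i] != '\0'; i++)
        state = transition[state][char_class(input[i])];
    return is_final[state];
}
```

The explicit DEAD state keeps the transition function total, so every (state, symbol) pair has exactly one successor, as the definition requires.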
The scanning process
Main goal: recognize words/tokens
Snapshot:
At any point in time, the scanner has read some input and is
on the way to identifying what kind of token has been read
(e.g. identifier, operator, integer literal, etc.)
Once the scanner identifies a token, it sends it off to the
parser and starts over with the next word.
Some tokens need additional data to be carried along
with them
For example, an identifier token needs to have the
identifier itself attached to it.
Alternatively, the scanner generates a file of tokens which is
then input to the parser.
The scanning process
A simple hand-written scanner would look a bit like this:
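…
nextchar = getNextChar();
switch (nextchar) {
case '(':
    return LPAREN;                  /* return LPAREN token */
case '0':
case '1':
...
case '9':
    value = nextchar - '0';         /* value: illustrative local variable */
    nextchar = getNextChar();
    while (isdigit(nextchar)) {
        /* concatenate the digits to build the integer value */
        value = value * 10 + (nextchar - '0');
        nextchar = getNextChar();
    }
    putBack(nextchar);              /* the first non-digit belongs to the next token */
    tokenValue = value;             /* attach the integer value to the INTEGER token
                                       (tokenValue is an illustrative name) */
    return INTEGER;
...
}
…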
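A self-contained variant of the sketch above might look like the following. It reads from a string cursor instead of getNextChar/putBack, and it returns a Token struct so that the attached data (integer value or identifier text) travels with the token. The type and function names are assumptions made for illustration, and integer overflow is deliberately ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_EOF, TOK_LPAREN, TOK_INTEGER, TOK_IDENT, TOK_ERROR } TokenKind;

typedef struct {
    TokenKind kind;
    int value;          /* attached data for TOK_INTEGER */
    char lexeme[32];    /* attached data for TOK_IDENT */
} Token;

/* Reads one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    Token tok = { TOK_EOF, 0, "" };

    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;                                    /* skip whitespace */

    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (*p == '(') {
        tok.kind = TOK_LPAREN;
        p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_INTEGER;
        while (isdigit((unsigned char)*p))      /* stops at the first non-digit, */
            tok.value = tok.value * 10 + (*p++ - '0');  /* which is left unread */
    } else if (isalpha((unsigned char)*p)) {
        size_t n = 0;
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p)) {
            if (n < sizeof tok.lexeme - 1)
                tok.lexeme[n++] = *p;           /* overly long names are truncated */
            p++;
        }
        tok.lexeme[n] = '\0';
    } else {
        tok.kind = TOK_ERROR;
        p++;                                    /* skip the unrecognized character */
    }
    *cursor = p;
    return tok;
}
```

Leaving the cursor on the first character that does not belong to the token plays the same role as putBack in the sketch: that character is seen again when the next token is requested.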
The scanning process
Not always as simple as it seems
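Example from old versions of FORTRAN, where blanks are not significant:
DO 5 I=1,10 (a DO loop ending at label 5, with I running from 1 to 10)
vs.
DO 5 I=1.10 (an assignment of 1.10 to the variable DO5I)
The scanner cannot tell whether DO is a keyword until it reaches the comma or the period.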
Instead of writing a scanner by hand, we can
automate the process.
Specify what needs to be recognized and what to do when
something is recognized.
Have a scanner generator create the scanner based on our
specification.
Hand-written vs. automated scanner
The scanning process
Goal: automate the process
Idea:
Start with an RE
Build a DFA
How?
We can build a non-deterministic finite automaton (NFA)
(Thompson's construction)
Convert that to a deterministic one (DFA)
(Subset construction)
Minimize the DFA
(Hopcroft's algorithm)
Implement it
Existing scanner generators: lex, flex, etc.
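To make the NFA-to-DFA step more concrete, here is a small sketch of the subset construction. It assumes a hand-written four-state NFA for (a|b)*abb (not produced by Thompson's construction, and without ε-moves, although the closure step is kept for generality). Sets of NFA states are stored as bitmasks; all names are illustrative.

```c
#include <assert.h>

enum { NFA_STATES = 4, SYMBOLS = 2, MAX_DFA = 16 };  /* at most 2^4 subsets */

/* Assumed NFA for (a|b)*abb; each entry is a bitmask of target states. */
static const unsigned nfa_delta[NFA_STATES][SYMBOLS] = {
    /* state 0 */ { (1u << 0) | (1u << 1), 1u << 0 },  /* a: {0,1}  b: {0} */
    /* state 1 */ { 0,                     1u << 2 },  /* b: {2} */
    /* state 2 */ { 0,                     1u << 3 },  /* b: {3} */
    /* state 3 */ { 0,                     0       },
};
static const unsigned nfa_eps[NFA_STATES] = { 0, 0, 0, 0 };  /* no epsilon moves */
static const unsigned nfa_accept = 1u << 3;

typedef struct {
    int count;                      /* number of DFA states found so far */
    unsigned sets[MAX_DFA];         /* NFA-state subset behind each DFA state */
    int delta[MAX_DFA][SYMBOLS];    /* DFA transition table */
    int accepting[MAX_DFA];
} Dfa;

/* Adds every state reachable through epsilon moves until nothing changes. */
static unsigned eps_closure(unsigned set)
{
    unsigned prev;
    do {
        prev = set;
        for (int s = 0; s < NFA_STATES; s++)
            if (set & (1u << s))
                set |= nfa_eps[s];
    } while (set != prev);
    return set;
}

/* Union of the targets of all states in set on symbol sym. */
static unsigned nfa_move(unsigned set, int sym)
{
    unsigned out = 0;
    for (int s = 0; s < NFA_STATES; s++)
        if (set & (1u << s))
            out |= nfa_delta[s][sym];
    return out;
}

/* Returns the DFA state for this subset, creating it if it is new. */
static int find_or_add(Dfa *d, unsigned set)
{
    for (int i = 0; i < d->count; i++)
        if (d->sets[i] == set)
            return i;
    assert(d->count < MAX_DFA);
    d->sets[d->count] = set;
    d->accepting[d->count] = (set & nfa_accept) != 0;
    return d->count++;
}

void subset_construction(Dfa *d)
{
    d->count = 0;
    find_or_add(d, eps_closure(1u << 0));       /* DFA start = closure of NFA start */
    for (int i = 0; i < d->count; i++)          /* d->count grows as subsets appear */
        for (int sym = 0; sym < SYMBOLS; sym++)
            d->delta[i][sym] =
                find_or_add(d, eps_closure(nfa_move(d->sets[i], sym)));
}

int dfa_accepts(const Dfa *d, const char *input)
{
    int state = 0;
    for (; *input != '\0'; input++) {
        if (*input != 'a' && *input != 'b')
            return 0;                           /* symbol outside the alphabet */
        state = d->delta[state][*input - 'a'];
    }
    return d->accepting[state];
}
```

For this NFA the construction reaches four subsets, {0}, {0,1}, {0,2}, and {0,3}, so the resulting DFA has four states. Minimization (Hopcroft's algorithm) is not shown.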
Scanner generator: Lex
Lex source is a table of
regular expressions and
corresponding program fragments
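%{
#include <stdio.h>
%}
digit   [0-9]
letter  [a-zA-Z]
%%
{letter}({letter}|{digit})*   printf("id: %s\n", yytext);
\n                            printf("new line\n");
%%
/* yywrap returning 1 tells the scanner to stop at end of input */
int yywrap(void) { return 1; }
int main(void) {
    yylex();
    return 0;
}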
Lex Source
Lex source is separated into three sections by %% delimiters
The general format of Lex source is
{definitions}
%% (required)
{transition rules}
%% (optional)
{user subroutines}
The absolute minimum Lex program is thus
%%
(no definitions, no rules), which copies the input to the output unchanged.
Example for a language called “Tiny”
Assignment
Read these slides
Form groups of no more than 5 for the next assignment