A) Lexical Analysis: In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function which performs lexical analysis is called a lexical analyzer, lexer or scanner. A lexer often exists as a single function which is called by a parser or another function.
B) Syntax Analysis: Unlike other aspects of the compiler, the syntax analysis parts are not very separable, since they are mixed up with calls to all other parts, such as semantic analysis. However, the method used is that commonly known as recursive descent. This will not be treated in great detail here; consult any book on compiler theory for details. The method depends on writing a separate parsing procedure for each kind of syntactic structure, such as if statement, assignment statement, expression and so on, and each of these is only responsible for analysing its own kind of structure. If any structure contains another structure then the parsing procedure can call the procedure for this contained structure. As an example, consider the procedure ifstatement. Eliminating all but the syntax analysis parts leaves:

    procedure ifstatement;
    begin
      expression;
      if sy = thensy then insymbol else error(52);
      statement;
      if sy = elsesy then
        begin insymbol; statement end
    end;
2. What is RISC and how is it different from CISC?
Ans 2: CISC: A Complex Instruction Set Computer (CISC) supplies a large number of complex instructions at the assembly language level. Assembly language is a low-level computer programming language in which each statement corresponds to a single machine instruction. CISC instructions facilitate the extensive manipulation of low-level computational elements and events such as memory, binary arithmetic, and addressing. The goal of the CISC architectural philosophy is to make microprocessors easy and flexible to program and to provide for more efficient memory use. The CISC philosophy was unquestioned during the 1970s, when the early computing machines such as the popular Digital Equipment Corporation PDP-11 family of minicomputers were being programmed in assembly language and memory was slow and expensive. CISC machines merely used the then-available technologies to optimize computer performance. Their advantages included the following: (1) A new processor design could incorporate the instruction set of its predecessor as a subset of an ever-growing language; no need to reinvent the wheel, code-wise, with each design cycle. (2) Fewer instructions were needed to implement a particular computing task, which led to lower memory use for program storage and fewer time-consuming instruction fetches from memory. (3) Simpler compilers sufficed, as complex CISC instructions could be written that closely resembled the instructions of high-level languages. In effect, CISC made a computer's assembly language more like a high-level language to begin with, leaving the compiler less to do. The terms CISC and RISC (Reduced Instruction Set Computer) were coined at this time to reflect the widening split in computer-architectural philosophy.
RISC: The Reduced Instruction Set Computer, or RISC, is a microprocessor CPU design philosophy that favors a simpler set of instructions that all take about the same amount of time to execute. The most common RISC microprocessors are AVR, PIC, ARM, DEC Alpha, PA-RISC, SPARC, MIPS and IBM's PowerPC.
RISC characteristics:
- Small number of machine instructions (less than 150)
- Small number of addressing modes (less than 4)
- Small number of instruction formats (less than 4)
- Instructions of the same length: 32 bits (or 16 bits)
- Single-cycle execution
- Load/Store architecture
- Large number of GPRs (General Purpose Registers): more than 32
- Hardwired control
- Support for HLL (High Level Language)

RISC vs CISC:

    CISC                                         RISC
    Emphasis on hardware                         Emphasis on software
    Includes multi-clock complex instructions    Single-clock, reduced instructions only
    Memory-to-memory: LOAD and STORE             Register-to-register: LOAD and STORE
      incorporated in instructions                 are independent instructions
    Small code sizes, high cycles per second     Low cycles per second, large code sizes
    Transistors used for storing complex         Spends more transistors on memory
      instructions                                 registers
3. Explain the following with respect to the design specifications of an Assembler: A) Data Structures B) Pass-1 and Pass-2 Assembler flowchart.
Ans 3: Data Structures: The second step in our design procedure is to establish the databases that we have to work with.
Pass 1 Data Structures:
1. Input source program.
2. A Location Counter (LC), used to keep track of each instruction's location.
3. A table, the Machine-Operation Table (MOT), that indicates the symbolic mnemonic for each instruction and its length (two, four, or six bytes).
4. A table, the Pseudo-Operation Table (POT), that indicates the symbolic mnemonic and action to be taken for each pseudo-op in pass 1.
5. A table, the Symbol Table (ST), that is used to store each label and its corresponding value.
6. A table, the Literal Table (LT), that is used to store each literal encountered and its corresponding assigned location.
7. A copy of the input to be used by pass 2.

Pass 2 Data Structures:
1. Copy of source program input to pass 1.
2. Location Counter (LC).
3. A table, the Machine-Operation Table (MOT), that indicates for each instruction: symbolic mnemonic, length (two, four, or six bytes), binary machine opcode, and format of the instruction.
4. A table, the Pseudo-Operation Table (POT), that indicates the symbolic mnemonic and action to be taken for each pseudo-op in pass 2.
5. A table, the Symbol Table (ST), prepared by pass 1, containing each label and its corresponding value.
6. A table, the Base Table (BT), that indicates which registers are currently specified as base registers by USING pseudo-ops and what the specified contents of these registers are.
7. A work space, INST, that is used to hold each instruction as its various parts are being assembled together.
8. A work space, PRINT LINE, used to produce a printed listing.
9. A work space, PUNCH CARD, used prior to actual outputting for converting the assembled instructions into the format needed by the loader.
10. An output deck of assembled instructions in the format needed by the loader.
Format of Data Structures
The third step in our design procedure is to specify the format and content of each of the data structures. Pass 2 requires a Machine-Operation Table (MOT) containing the name, length, binary code and format; pass 1 requires only name and length. Instead of using two different tables, we construct a single MOT. The Machine-Operation Table (MOT) and Pseudo-Operation Table (POT) are examples of fixed tables: the contents of these tables are not filled in or altered during the assembly process. The following figure depicts the format of the Machine-Operation Table (MOT), six bytes per entry:

    Mnemonic opcode   Binary opcode   Instruction length   Instruction format   Not used here
    (4 bytes,         (1 byte,        (2 bits,             (3 bits,             (3 bits)
    characters)       hexadecimal)    binary)              binary)
    "Abbb"            5A              10                   001
    "AHbb"            4A              10                   001
    "ALbb"            5E              10                   001
    "ALRb"            1E              01                   000
    ...               ...             ...                  ...

    ('b' represents "blank")
The primary function performed by the analysis phase is the building of the symbol table. For this purpose it must determine the addresses with which the symbol names used in a program are associated. It is possible to determine some addresses directly, e.g. the address of the first instruction in the program; others, however, must be inferred.
4. Define the following: A) Parsing B) Scanning C) Token.
Ans 4:
Parsing:
Parsing transforms input text or a string into a data structure, usually a tree, which is suitable for later processing and which captures the implied hierarchy of the input. Lexical analysis creates tokens from a sequence of input characters, and it is these tokens that are processed by a parser to build a data structure such as a parse tree or an abstract syntax tree. Conceptually, the parser accepts a sequence of tokens and produces a parse tree. In practice this might not occur:
1. The source program might have errors. Shamefully, we will do very little error handling.
2. Real compilers produce (abstract) syntax trees, not parse trees (concrete syntax trees). We don't do this, for the pedagogical reasons given previously.
There are three classes of grammar-based parsers:
1. Universal
2. Top-down
3. Bottom-up
The universal parsers are not used in practice as they are inefficient; we will not discuss them.
Scanning and Token: There are three phases of analysis, with the output of one phase the input of the next. Each of these phases changes the representation of the program being compiled. The phases are called lexical analysis or scanning, which transforms the program from a string of characters to a string of tokens; syntax analysis or parsing, which transforms the program into some kind of syntax tree; and semantic analysis, which decorates the tree with semantic information. The character stream input is grouped into meaningful units called lexemes, which are then mapped into tokens, the latter constituting the output of the lexical analyzer. For example, any one of the following C statements

    x3 = y + 3;
    x3 = y + 3 ;
    x3 =y+ 3 ;

but not

    x 3 = y + 3;

would be grouped into the lexemes x3, =, y, +, 3, and ;. A token is a <token-name, attribute-value> pair. The hierarchical decomposition of the above statement is given in figure 10.
A token is a <token-name, attribute-value> pair. For example:
1. The lexeme x3 would be mapped to a token such as <id,1>. The name id is short for identifier. The value 1 is the index of the entry for x3 in the symbol table produced by the compiler. This table is used to gather information about the identifiers and to pass this information to subsequent phases.
2. The lexeme = would be mapped to the token <=>. In reality it is probably mapped to a pair whose second component is ignored. The point is that there are many different identifiers, so we need the second component, but there is only one assignment symbol.
3. The lexeme y is mapped to the token <id,2>.
4. The lexeme + is mapped to the token <+>.
5. The number 3 is mapped to <number, something>, but what is the something? On the one hand there is only one 3, so we could just use the token <number,3>. However, there can be a difference between how this should be printed (e.g. in an error message produced by subsequent phases) and how it should be stored (fixed vs. float vs. double). Perhaps the token should point to the symbol table where an entry for "this kind of 3" is stored. Another possibility is to have a separate numbers table.
6. The lexeme ; is mapped to the token <;>.
Note: non-significant blanks are normally removed during scanning. In C, most blanks are non-significant. That does not mean the blanks are unnecessary; consider int x; versus intx;. Note that we can define identifiers, numbers, and the various symbols and punctuation without using recursion (compare with parsing below). Parsing involves a further grouping in which tokens are grouped into grammatical phrases, which are often represented in a parse tree.
5. Describe the process of Bootstrapping in the context of Linkers.
Ans 5: Bootstrapping: In computing, bootstrapping refers to a process where a simple system activates another, more complicated system that serves the same purpose. It is a solution to the chicken-and-egg problem of starting a certain system without the system already functioning. The term is most often applied to the process of starting up a computer, in which a mechanism is needed to execute the software program that is responsible for executing software programs (the operating system).
Bootstrap loading: The discussions of loading up to this point have all presumed that there's already an operating system or at least a program loader resident in the computer to load the program of interest. The chain of programs being loaded by other programs has to start somewhere, so the obvious question is how the first program is loaded into the computer. Many Unix systems use a similar bootstrap process to get user-mode programs running. The kernel creates a process, then stuffs a tiny little program, only a few dozen bytes long, into that process. The tiny program executes a system call that runs /etc/init, the user-mode initialization program that in turn runs configuration files and starts the daemons and login programs that a running system needs.
Software Bootstrapping and Compiler Bootstrapping: Bootstrapping can also refer to the development of successively more complex, faster programming environments. The simplest environment will be, perhaps, a very basic text editor (e.g. ed) and an assembler program. Using these tools, one can write a more complex text editor, and a simple compiler for a higher-level language, and so on, until one can have a graphical IDE and an extremely high-level programming language.
Compiler Bootstrapping: In compiler design, a bootstrap or bootstrapping compiler is a compiler that is written in the target language, or a subset of the language, that it compiles.
Examples include GCC, GHC, OCaml, BASIC, PL/I and, more recently, the Mono C# compiler.
6. Describe the procedure for design of a Linker.
Ans 6: Design of a linker: Relocation and linking requirements in segmented addressing. The relocation requirements of a program are influenced by the addressing structure of the computer system on which it is to execute. Use of the segmented addressing structure reduces the relocation requirements of a program.
A Linker for MS-DOS. Example: Consider a program written in the assembly language of the Intel 8088. The ASSUME statement declares the segment registers CS and DS to be available for memory addressing. Hence all memory addressing is performed by using suitable displacements from their contents. The translation-time address of A is 0196. In statement 12, a reference to A is assembled as a displacement of 196 from the contents of the CS register. This avoids the use of an absolute address, hence the instruction is not address sensitive. Now no relocation is needed if segment SAMPLE is to be loaded with address 2000 by a calling program (or by the OS). The effective operand address would be calculated as <CS> + 0196, which is the correct address 2196. A similar situation exists with the reference to B in statement 17. The reference to B is assembled as a displacement of 0002 from the contents of the DS register. Since the DS register would be loaded with the execution-time address of DATA_HERE, the reference to B would be automatically relocated to the correct address.
Though use of segment registers reduces the relocation requirements, it does not completely eliminate the need for relocation. Consider statement 14:

    MOV AX, DATA_HERE

which loads the segment base of DATA_HERE into the AX register preparatory to its transfer into the DS register. Since the assembler knows DATA_HERE to be a segment, it makes provision to load the higher-order 16 bits of the address of DATA_HERE into the AX register. However, it does not know the link-time address of DATA_HERE, hence it assembles the MOV instruction in the immediate operand format and puts zeroes in the operand field. It also makes an entry for this instruction in RELOCTAB so that the linker will put the appropriate address in the operand field. Inter-segment calls and jumps are handled in a similar way. Relocation is somewhat more involved in the case of intra-segment jumps assembled in the FAR format. For example, consider the following program:

    FAR_LAB EQU THIS FAR    ; FAR_LAB is a FAR label
            JMP FAR_LAB     ; A FAR jump

Here the displacement and the segment base of FAR_LAB are to be put in the JMP instruction itself. The assembler puts the displacement of FAR_LAB in the first two operand bytes of the instruction, and makes a RELOCTAB entry for the third and fourth operand bytes, which are to hold the segment base address. A statement like ADDR_A DW OFFSET A (which is an 'address constant') does not need any relocation, since the assembler can itself put the required offset in the bytes. In summary, the only RELOCTAB entries that must exist for a program using segmented memory addressing are for the bytes that contain a segment base address. For linking, however, both the segment base address and the offset of the external symbol must be computed by the linker. Hence there is no reduction in the linking requirements.
July 2011, Master of Computer Application (MCA) - Semester 3, MC0073 - System Programming (4 credits), Assignment Set
1. Discuss the various Addressing modes for CISC.
Ans 1: Addressing Modes of CISC: The Motorola 68000 addressing modes: Register to Register, Register to Memory, Memory to Register, and Memory to Memory. The 68000 supports a wide variety of addressing modes.
- Immediate mode: the operand immediately follows the instruction.
- Absolute address: the address (in either the "short" 16-bit form or "long" 32-bit form) of the operand immediately follows the instruction.
- Program Counter relative with displacement: a displacement value is added to the program counter to calculate the operand's address. The displacement can be positive or negative.
- Program Counter relative with index and displacement: the instruction contains both the identity of an "index register" and a trailing displacement value. The contents of the index register, the displacement value, and the program counter are added together to get the final address.
- Register direct: the operand is contained in an address or data register.
- Address register indirect: an address register contains the address of the operand.
- Address register indirect with predecrement or postincrement: an address register contains the address of the operand in memory. With the predecrement option set, a predetermined value is subtracted from the register before the (new) address is used. With the postincrement option set, a predetermined value is added to the register after the operation completes.
- Address register indirect with displacement: a displacement value is added to the register's contents to calculate the operand's address. The displacement can be positive or negative.
- Address register relative with index and displacement: the instruction contains both the identity of an "index register" and a trailing displacement value. The contents of the index register, the displacement value, and the specified address register are added together to get the final address.
2. Write about Deterministic and Non-deterministic Finite Automata with suitable numerical examples.
Ans 2: Deterministic Finite Automaton (DFA): A deterministic finite automaton (DFA) is a 5-tuple (S, Σ, T, s, A):
- an alphabet (Σ)
- a set of states (S)
- a transition function (T : S × Σ → S)
- a start state (s ∈ S)
- a set of accept states (A ⊆ S)
The machine starts in the start state and reads in a string of symbols from its alphabet. It uses the transition function T to determine the next state, using the current state and the symbol just read. If, when it has finished reading, it is in an accepting state, it is said to accept the string; otherwise it is said to reject the string. The set of strings it accepts forms a language, which is the language the DFA recognizes.
Non-deterministic Finite Automaton (NFA): A non-deterministic finite automaton (NFA) is a 5-tuple (S, Σ, T, s, A):
- an alphabet (Σ)
- a set of states (S)
- a transition function (T : S × (Σ ∪ {ε}) → P(S)), where P(S) is the power set of S and ε is the empty string
- a start state (s ∈ S)
- a set of accept states (A ⊆ S)
The machine starts in the start state and reads in a string of symbols from its alphabet. If, when it has finished reading, it is in an accepting state, it is said to accept the string; otherwise it is said to reject the string. The set of strings it accepts forms a language, which is the language the NFA recognizes.

3. Write a short note on: A) C Preprocessor for GCC version 2 B) Conditional Assembly.
Ans 3: The C Preprocessor for GCC version 2: The C preprocessor is a macro processor that is used automatically by the C compiler to transform your program before actual compilation. It is called a macro processor because it allows you to define macros, which are brief abbreviations for longer constructs. The C preprocessor provides four separate facilities that you can use as you see fit:
- Inclusion of header files. These are files of declarations that can be substituted into your program.
- Macro expansion. You can define macros, which are abbreviations for arbitrary fragments of C code, and then the C preprocessor will replace the macros with their definitions throughout the program.
- Conditional compilation. Using special preprocessing directives, you can include or exclude parts of the program according to various conditions.
- Line control. If you use a program to combine or rearrange source files into an intermediate file which is then compiled, you can use line control to inform the compiler of where each source line originally came from.
ANSI Standard C requires the rejection of many harmless constructs commonly used by today's C programs. Such incompatibility would be inconvenient for users, so the GNU C preprocessor is configured to accept these constructs by default.
Strictly speaking, to get ANSI Standard C you must use the options `-trigraphs', `-undef' and `-pedantic', but in practice the consequences of having strict ANSI Standard C make it undesirable to do this.
Conditional Assembly: means that some sections of the program may be optional, either included or not in the final program, depending upon specified conditions. A reasonable use of conditional assembly would be to combine two versions of a program: one that prints debugging information during test executions for the developer, and another version for production operation that displays only results of interest for the average user. A typical fragment would assemble the instructions to print the AX register only if Debug is true. Note that true is any non-zero value.
Here is a conditional statement in C programming; the following tests the expression `BUFSIZE > 1024', where `BUFSIZE' must be a macro:

    #if BUFSIZE > 1024
    printf ("Large buffers!\n");
    #endif /* BUFSIZE is large */
4. Write about the different Phases of compilation.
Ans 4: Phases of a Compiler: A compiler takes as input a source program and produces as output an equivalent sequence of machine instructions. This process is so complex that it is not reasonable, either from a logical point of view or from an implementation point of view, to consider the compilation process as occurring in one single step. For this reason, it is customary to partition the compilation process into a series of subprocesses called phases, as shown in Fig 1.2. A phase is a logically cohesive operation that takes as input one representation of the source program and produces as output another representation. The syntax analyzer groups tokens together into syntactic structures. For example, the three tokens representing A + B might be grouped into a syntactic structure called an expression. Expressions might further be combined to form statements. Often the syntactic structure can be regarded as a tree whose leaves are the tokens. The interior nodes of the tree represent strings of tokens that logically belong together. Code optimization is an optional phase designed to improve the intermediate code so that the ultimate object program runs faster and/or takes less space. Its output is another intermediate-code program that does the same job as the original, but perhaps in a way that saves time and/or space. The final phase, code generation, produces the object code by deciding on the memory locations for data, selecting code to access each datum, and selecting the registers in which each computation is to be done. Designing a code generator that produces truly efficient object programs is one of the most difficult parts of compiler design, both practically and theoretically. The table-management, or bookkeeping, portion of the compiler keeps track of the names used by the program and records essential information about each, such as its type (integer, real, etc.).
The data structure used to record this information is called a symbol table. The error handler is invoked when a flaw in the source program is detected. It must warn the programmer by issuing a diagnostic, and adjust the information being passed from phase to phase so that each phase can proceed. It is desirable that compilation be completed on flawed programs, at least through the syntax-analysis phase, so that as many errors as possible can be detected in one compilation. Both the table-management and error-handling routines interact with all phases of the compiler.
5. What is a MACRO? Discuss its uses.
Ans 5: Macro Definition and Expansion. Definition:
A macro name is an abbreviation which stands for some related lines of code. Macros are useful for the following purposes:
- To simplify and reduce the amount of repetitive coding
- To reduce errors caused by repetitive coding
- To make an assembly program more readable
A macro consists of a name, a set of formal parameters and a body of code. The use of the macro name with a set of actual parameters is replaced by some code generated by its body. This is called macro expansion. Macros allow a programmer to define pseudo-operations, typically operations that are generally desirable, are not implemented as part of the processor instruction set, and can be implemented as a sequence of instructions. Each use of a macro generates new program instructions; the macro has the effect of automating the writing of the program.
For instance,

    #define max(a, b) a>b ? a : b

defines the macro max, taking two arguments a and b. This macro may be called like any C function, using identical syntax. Therefore, after preprocessing,

    z = max(x, y);

becomes

    z = x>y ? x : y;

While this use of macros is very important for C, for instance to define type-safe generic data types or debugging tools, it is also slow, rather inefficient, and may lead to a number of pitfalls.
6. What is a compiler? Explain the compilation process.
Ans 6: Compiler: A compiler is a computer program (or set of programs) that translates text written in a computer language (the source language) into another computer language (the target language). The original sequence is usually called the source code and the output called object code. Commonly the output has a form suitable for processing by other programs (e.g., a linker), but it may be a human-readable text file.
Compiler Backend: While there are applications where only the compiler frontend is necessary, such as static language verification tools, a real compiler hands the intermediate representation generated by the frontend to the backend, which produces a functionally equivalent program in the output language. This is done in multiple steps:
1. Optimization - the intermediate language representation is transformed into functionally equivalent but faster (or smaller) forms.
2. Code Generation - the transformed intermediate language is translated into the output language, usually the native machine language of the system. This involves resource and storage decisions, such as deciding which variables to fit into registers and memory, and the selection and scheduling of appropriate machine instructions.
The compiler frontend consists of multiple phases in itself, each informed by formal language theory:
1. Scanning - breaking the source code text into small pieces, tokens - sometimes called 'terminals' - each representing a single piece of the language, for instance a keyword, identifier or symbol name. The token language is typically a regular language, so a finite state automaton constructed from a regular expression can be used to recognize it.
2. Parsing - identifying syntactic structures - so-called 'non-terminals' - constructed from one or more tokens and non-terminals, representing complicated language elements, for instance assignments, conditions and loops.
This is typically done with a parser for a context-free grammar, often an LL parser or LR parser from a parser generator.
3. Intermediate Language Generation - an equivalent to the original program is created in a special-purpose intermediate language.