1. What are three
reasons why syntax analyzers are based on grammars?
First, BNF descriptions of the syntax of programs are clear and concise. Second, the BNF description can be used as the direct basis for the syntax analyzer. Third, implementations based on BNF are relatively easy to maintain because of their modularity.
2. Explain the three
reasons why lexical analysis is separated from syntax analysis.
1. Simplicity (Techniques for lexical analysis are less complex than those required for syntax analysis, so the lexical-analysis process can be simpler if it is separate. Also, removing the low-level details of lexical analysis from the syntax analyzer makes the syntax analyzer both smaller and less complex.)
2. Efficiency (Because lexical analysis requires a significant portion of total compilation time, it pays to optimize the lexical analyzer, whereas it is not fruitful to optimize the syntax analyzer. Separation facilitates this selective optimization.)
3. Portability (Because the lexical analyzer reads input program files and often includes buffering of that input, it is somewhat platform dependent. However, the syntax analyzer can be platform independent. It is always good to isolate machine-dependent parts of any software system.)
3. Define lexeme and
token
A lexeme is one of the logical groupings into which the lexical analyzer collects characters, and a token is the internal code for a category of these groupings.
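For example (the statement and the token names are illustrative, in the style of common lexer examples), the statement result = oldsum - value / 100; breaks into these lexeme/token pairs:

result   IDENT
=        ASSIGN_OP
oldsum   IDENT
-        SUB_OP
value    IDENT
/        DIV_OP
100      INT_LIT
;        SEMICOLON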
4. What are the
primary tasks of a lexical analyzer?
The lexical-analysis process includes skipping comments and white space outside lexemes, as they are not relevant to the meaning of the program. The lexical analyzer also inserts lexemes for user-defined names into the symbol table, which is used by later phases of the compiler. Finally, the lexical analyzer detects syntactic errors in tokens, such as ill-formed floating-point literals, and reports such errors to the user.
5.
Describe briefly the three approaches to building a lexical analyzer.
– Write a formal description of the token patterns of the language using a descriptive language related to regular expressions.
– Design a state transition diagram that describes the token patterns of the language and write a program that implements the diagram.
– Design a state transition diagram that describes the token patterns of the language and hand-construct a table-driven implementation of the state diagram.
6.
What is a state transition diagram?
A state transition diagram, or just state diagram, is a directed graph. The nodes of a state diagram are labeled with state names. The arcs are labeled with the input characters that cause the transitions among the states. An arc may also include actions the lexical analyzer must perform when the transition is taken.
State diagrams of the form used for lexical analyzers are representations of a class of mathematical machines called finite automata. Finite automata can be designed to recognize members of a class of languages called regular languages. Regular grammars are generative devices for regular languages. The tokens of a programming language are a regular language, and a lexical analyzer is a finite automaton.
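As a concrete sketch (not from the text; the states, helper names, and the recognized pattern are assumptions), a state diagram for names, LETTER followed by any mix of LETTERs and DIGITs, can be implemented directly in C:

#include <ctype.h>
#include <stdio.h>

enum state { START, IN_NAME };

/* Walks the state diagram over the string s; returns 1 if s is a name. */
int recognizeName(const char *s) {
    enum state st = START;
    for (; *s != '\0'; s++) {
        if (st == START) {
            if (isalpha((unsigned char)*s)) st = IN_NAME;  /* LETTER arc */
            else return 0;                                 /* no transition */
        } else {                                           /* IN_NAME */
            if (!isalpha((unsigned char)*s) && !isdigit((unsigned char)*s))
                return 0;                  /* no LETTER or DIGIT arc */
        }
    }
    return st == IN_NAME;                  /* IN_NAME is the accepting state */
}

int main(void) {
    printf("%d %d\n", recognizeName("sum2"), recognizeName("2sum"));  /* 1 0 */
    return 0;
}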
7. Why are character
classes used, rather than individual characters, for the letter and digit transitions of a state diagram for a lexical analyzer?
Observe that there are 52 different characters that can begin a
name, which would require 52 transitions from the transition diagram's initial
state. However, a lexical analyzer is interested only in determining that it is
a name and is not concerned with which specific name it happens to be.
Therefore, we define a character class named LETTER for all 52 letters and use
a single transition on the first letter of any name.
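A minimal sketch of the character-class idea in C (the class names and codes are illustrative assumptions):

#include <ctype.h>

#define LETTER  0
#define DIGIT   1
#define UNKNOWN 99

/* Maps a character to its class, so the state diagram needs a single
   LETTER transition instead of 52 separate letter transitions. */
int charClass(int ch) {
    if (isalpha(ch)) return LETTER;
    if (isdigit(ch)) return DIGIT;
    return UNKNOWN;
}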
8. What are the two
distinct goals of syntax analysis?
1. The syntax analyzer must check the input program to determine whether it is syntactically correct. When an error is found, the analyzer must produce a diagnostic message and recover. (Recovery means getting back to a normal state and continuing the analysis of the input program.)
2. It must produce a complete parse tree, or at least trace the structure of the complete parse tree, for syntactically correct input. The parse tree is used as the basis for translation.
9. Describe the
difference between top-down and bottom-up parser.
In a top-down parser, the tree is built from the root downward to the leaves. In a bottom-up parser, the parse tree is built from the leaves upward to the root.
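A small worked example (the grammar is chosen for illustration): with the grammar S → aS | b and the sentence aab,

Top-down (root to leaves):  S => aS => aaS => aab
Bottom-up (leaves to root): aab => aaS => aS => S

Each top-down step expands a nonterminal; each bottom-up step reduces an RHS back to its LHS.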
10. Describe the parsing problem for a Top-Down Parser
The general form of a left sentential form is xAy, where, by our notational conventions, x is a string of terminal symbols, A is a nonterminal, and y is a mixed string. Because x contains only terminals, A is the leftmost nonterminal in the sentential form, so it is the one that must be expanded to get the next sentential form in a leftmost derivation.
Determining the next sentential form is a matter of choosing the correct grammar rule that has A as its LHS. For example, if the current sentential form is xAy and the A-rules are A → bB, A → cBb, and A → a, a top-down parser must choose among these three rules to get the next sentential form, which could be xbBy, xcBby, or xay. This is the parsing decision problem for top-down parsers.
11. Describe the parsing
problem for a bottom-up parser
A bottom-up parser identifies and processes the low-level details of the structure of the text before the higher levels, leaving the overall structure for last. Its parsing problem is to find, in the current right sentential form, the correct RHS and the position at which to reduce it to its LHS nonterminal, so that the result is the previous right sentential form in a rightmost derivation; that is, it must find the handle (see Question 21).
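For example (grammar chosen for illustration), with the rules E → E + T | T and T → id, the right sentential form E + id has the handle id: reducing it by T → id yields E + T, the previous right sentential form in the rightmost derivation E => E + T => E + id.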
12. Explain why
compilers use parsing algorithms that work on only a subset of all grammars.
Because parsing algorithms that work for any unambiguous grammar are complex and inefficient; their complexity is O(n^3), which makes them too slow for practical tasks such as syntax analysis in a compiler. In such situations, computer scientists look for faster algorithms, even if they are less general. In the case of parsing, faster algorithms have been found that work on only a subset of the set of all possible grammars.
13. Why are named
constants used, rather than numbers, for token codes?
Because named constants can be used to simplify the reading of the lexical and syntax analyzers.
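A sketch of what this looks like in C (the particular names and values are illustrative assumptions):

#define INT_LIT   10
#define IDENT     11
#define ASSIGN_OP 20
#define ADD_OP    21
#define SUB_OP    22

/* A test such as nextToken == SUB_OP reads far better than nextToken == 22. */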
14. Describe how a
recursive-descent parsing subprogram is written for a rule with a single RHS
Consider the rule <expr> → <term> {(+ | -) <term>}, which has a single RHS. Its subprogram first calls the function that parses terms (term). Then it continues to call that function as long as it finds ADD_OP or SUB_OP tokens (which it passes over by calling lex). In general, a recursive-descent subprogram for a rule with a single RHS is relatively simple: for each terminal symbol in the RHS, that terminal symbol is compared with nextToken. If they do not match, it is a syntax error; if they match, the lexical analyzer is called to get the next input token. For each nonterminal, the parsing subprogram for that nonterminal is called.
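A sketch of such a subprogram in C, following the description above (the helpers lex and term, the global nextToken, and the token codes are assumptions):

extern int nextToken;   /* current token code, set by lex() */
void lex(void);         /* lexical analyzer: gets the next token */
void term(void);        /* parsing subprogram for <term> */

#define ADD_OP 21
#define SUB_OP 22

/* Parses <expr> -> <term> {(+ | -) <term>} */
void expr(void) {
    term();                               /* nonterminal: call its subprogram */
    while (nextToken == ADD_OP || nextToken == SUB_OP) {
        lex();                            /* pass over the operator */
        term();
    }
}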
15. Explain the two
grammar characteristics that prohibit them from being used as the basis for a
top-down parser.
Direct or indirect left recursion: a recursive-descent parser subprogram for A immediately calls itself to parse the first symbol in its RHS. That activation of the A parser subprogram then immediately calls itself again, and again, and so forth. It is easy to see that this leads nowhere (stack overflow), as the sketch below shows. The second characteristic is failing the pairwise disjointness test (see Question 17): if two RHSs of a nonterminal can begin with the same token, the parser cannot choose between them on the basis of the next input token.
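A sketch of the failure in C (the rule and names are hypothetical):

/* Naive recursive-descent subprogram for the left-recursive rule
   <expr> -> <expr> + <term>: it calls itself before consuming any input. */
void expr(void) {
    expr();   /* parse the leftmost <expr> first -- infinite recursion */
    /* ... match '+' and call term() -- never reached */
}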
16. What is the FIRST
set for a given grammar and sentential form?
FIRST(α) = {a | α =>* aβ} (if α =>* ε, ε is in FIRST(α)), in which =>* means zero or more derivation steps.
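For example (grammar chosen for illustration), given the rules A → aB | bc | ε: FIRST(aB) = {a}, FIRST(bc) = {b}, and FIRST(ε) = {ε}, so FIRST(A) = {a, b, ε}.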
17.
Describe the pairwise disjointness test.
The pairwise disjointness test checks whether a parsing subprogram can always determine which RHS is being parsed on the basis of the next token of input: for each nonterminal A that has more than one RHS, the FIRST sets of every pair of its RHSs must be disjoint.
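For example (in the style of the usual textbook illustration): the rules A → aB | bAb | Bb with B → cB | d pass the test, because FIRST(aB) = {a}, FIRST(bAb) = {b}, and FIRST(Bb) = {c, d} are pairwise disjoint. The rules A → aB | BAb with B → aB | b fail it, because FIRST(aB) = {a} and FIRST(BAb) = {a, b} both contain a.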
18. What is left factoring?
Left factoring is a grammar transformation that removes the common prefix from a nonterminal's RHSs. For example, the rules <variable> → identifier and <variable> → identifier [<expression>] state that a <variable> is either an identifier or an identifier followed by an expression in brackets (a subscript). These rules clearly do not pass the pairwise disjointness test, because both RHSs begin with the same terminal, identifier. Left factoring rewrites them as <variable> → identifier <new> and <new> → ε | [<expression>], which do pass the test.
19. What is a phrase of a
sentential form?
A phrase is a subsequence of a sentential form that is eventually reduced to a single nonterminal.
20. What is a simple phrases of a sentential form?
A simple phrase is a phrase that takes a single derivation step from its root nonterminal node.
21. What is the handle
of a sentential form?
A handle of a right sentential form γ is a production A → β and a position in γ where β may be found and replaced by A to produce the previous right sentential form in a rightmost derivation of γ.
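A worked illustration tying Questions 19 through 21 together (the common expression grammar is an assumption): with E → E + T | T, T → T * F | F, F → (E) | id, the right sentential form E + T * id has three phrases: E + T * id, T * id, and id. Its only simple phrase is id (one derivation step, F → id), and its handle is id, the leftmost simple phrase; reducing it gives the previous right sentential form E + T * F.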
22. What is the
mathematical machine on which both top-down and bottom-up parsers are based?
Both top-down and bottom-up parsers are based on the PDA (pushdown automaton), because PDAs recognize context-free languages.
23. Describe three
advantages of LR parsers.
1. They can be built for all programming languages.
2. They can detect syntax errors as soon as it is possible in a left-to-right scan.
3. The LR class of grammars is a proper superset of the class parsable by LL parsers.
25. Describe the
purpose of the ACTION table of an LR parser
The ACTION part specifies what the parser should do, given the
state symbol on
top of the parse stack and the next token of input.
26. Describe the purpose
of the GOTO table of an LR parser
The rows of the GOTO part of the LR parsing table have state
symbols as labels. This part of the table has nonterminals as column labels.
The values in the GOTO part of the table indicate which state symbol should be
pushed onto the parse stack after a reduction has been completed, which means
the handle has been removed from the parse stack and the new nonterminal has
been pushed onto the parse stack.
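A short trace (using the expression grammar E → E + T | T, T → T * F | F, F → (E) | id and state numbers from its widely published LR table; the specific numbers are illustrative) shows ACTION and GOTO working together on the input id + id:

Stack              Input        Action
0                  id + id $    Shift 5
0 id 5             + id $       Reduce by F → id; push GOTO[0, F] = 3
0 F 3              + id $       Reduce by T → F;  push GOTO[0, T] = 2
0 T 2              + id $       Reduce by E → T;  push GOTO[0, E] = 1
0 E 1              + id $       Shift 6
0 E 1 + 6          id $         Shift 5
0 E 1 + 6 id 5     $            Reduce by F → id; push GOTO[6, F] = 3
0 E 1 + 6 F 3      $            Reduce by T → F;  push GOTO[6, T] = 9
0 E 1 + 6 T 9      $            Reduce by E → E + T; push GOTO[0, E] = 1
0 E 1              $            Accept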
27. Is left recursion
a problem for LR parsers?
No. Many left-recursive grammars are LR, but none are LL.