What is Syntax analysis?

Syntax analysis is the second phase of the compilation process and comes after lexical
analysis.
It analyses the syntactic structure of the given input and checks whether the input follows the
syntax of the programming language in which it was written. Syntax analysis is also
known as parsing, and its output is a parse tree (or syntax tree).

The parse tree is built with the help of the pre-defined grammar of the language. The
syntax analyzer checks whether a given program satisfies the rules of a context-free
grammar.
If it does, the parser creates the parse tree of that source program. Otherwise, it
reports error messages.

Why do you need Syntax Analyzer?

 Checks whether the code is grammatically valid
 Helps you apply the grammar rules to the code
 Makes sure that each opening brace has a corresponding closing brace
 Makes sure that each declaration has a type and that the type exists
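
The brace check mentioned above can be sketched with a stack. This is only a minimal illustration, not how a real parser does it: the function name `braces_balanced` is hypothetical, and the sketch looks at raw characters, so brackets inside string literals or comments would be counted too.

```python
def braces_balanced(source: str) -> bool:
    """Return True if every opening bracket has a matching closing bracket."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []  # opening brackets seen so far and not yet closed
    for ch in source:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            # A closing bracket must match the most recent opening bracket.
            if not stack or stack.pop() != pairs[ch]:
                return False
    # Anything left on the stack was opened but never closed.
    return not stack


print(braces_balanced("int main() { return 0; }"))
print(braces_balanced("int main() { return 0; "))
```

A grammar-driven parser gets the same guarantee from its productions, since a rule such as "block is { statements }" cannot be completed without the closing brace.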

Important Syntax Analyzer Terminology

Important terms used in the syntax analysis process:

 Sentence: A sentence is a string of characters over some alphabet.
 Lexeme: A lexeme is the lowest-level syntactic unit of a language (e.g., total, start).
 Token: A token is a category of lexemes.
 Keywords and reserved words: A keyword is a word used as a fixed part of
the syntax of a statement. A reserved word is a keyword that you can't use as a variable name
or other identifier.
 Noise words: Optional words inserted in a statement to improve
the readability of the sentence.
 Comments: An important part of the documentation. They are usually delimited by /*
*/ or introduced by //.
 Blanks (spaces)
 Delimiters: A delimiter is a syntactic element that marks the start or end of some syntactic
unit, such as a statement or expression: "begin"..."end", or {}.
 Character set: ASCII, Unicode
 Identifiers: Names chosen by the programmer. Restrictions on their length can reduce the
readability of the sentence.
 Operator symbols: + and - perform two basic arithmetic operations.
 Syntactic elements of the Language
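
To make the difference between a lexeme and a token concrete, here is a small tokenizer sketch. The token categories, the keyword list, and the names `TOKEN_SPEC` and `tokenize` are assumptions chosen for illustration; they do not describe any particular language.

```python
import re

# Each entry pairs a token category with a pattern for its lexemes.
# SKIP comes first so that "//" is treated as a comment, not as two "/" operators.
TOKEN_SPEC = [
    ("SKIP", r"\s+|//[^\n]*|/\*.*?\*/"),           # blanks and comments
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),  # hypothetical reserved words
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER", r"\d+"),
    ("OPERATOR", r"[+\-*/=]"),
    ("DELIMITER", r"[;{}()]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(text):
    """Return a list of (token category, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("total = start + 1; // running sum"):
    print(token)
```

Here `total` and `start` are two different lexemes that belong to the same token category, IDENTIFIER, while the comment and the blanks are dropped before the parser ever sees them.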

Why do we need Parsing?

A parser also checks that the input string is well-formed and, if it is not, rejects it.

The following are important tasks performed by the parser:


 Detect all types of syntax errors
 Find the position at which an error has occurred
 Give a clear and accurate description of the error
 Recover from an error to continue and find further errors in the code
 Not affect the compilation of "correct" programs
 Reject invalid texts by reporting syntax errors

Parsing Techniques

Parsing techniques are divided into two groups:

 Top-Down Parsing
 Bottom-Up Parsing

Top-Down Parsing:

In top-down parsing, construction of the parse tree starts at the root and proceeds
towards the leaves.

Two types of top-down parsing are:

1. Predictive Parsing:
A predictive parser can predict which production should be used to expand a nonterminal
by looking at the upcoming input. It uses a look-ahead pointer, which points to the next input symbols.
Backtracking is not needed with this parsing technique. An LL(1) parser is a common example.

2. Recursive Descent Parsing:
This parsing technique recursively parses the input to build a parse tree. It consists of several
small functions, one for each nonterminal in the grammar.
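
The idea of one function per nonterminal can be sketched as follows. The grammar is an assumption made for this example (expr, term, and factor for simple arithmetic), as are the names `Parser` and `parse`. The result is a simplified syntax tree built from nested tuples rather than a full parse tree, and the tokenizer silently skips characters it does not recognise.

```python
import re

# Hypothetical grammar used only for this sketch:
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'


def tokenize(text):
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Look ahead one token without consuming it."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        tok = self.advance()
        if tok is not None and tok.isdigit():
            return ("num", int(tok))
        if tok == "(":
            node = self.expr()
            closing = self.advance()
            if closing != ")":
                raise SyntaxError(f"expected ')', found {closing!r}")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * 3"))
```

Each function decides what to do from the single look-ahead token returned by `peek`, so this sketch never backtracks, which is the property that makes it predictive.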

Bottom-Up Parsing:

In bottom-up parsing, construction of the parse tree starts at the leaves and
proceeds towards the root. It is also called shift-reduce parsing. Parsers of this type
are usually generated with the help of software tools.
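
The shift and reduce steps can be illustrated with a deliberately naive sketch. It assumes a tiny hypothetical grammar (E -> E + n | n) and reduces greedily whenever the top of the stack matches the right-hand side of a rule. Generated parsers instead consult tables to choose between shifting and reducing, so this shows the mechanics only.

```python
# Hypothetical grammar for this sketch: E -> E + n | n
# Longer right-hand sides are listed first so they are tried first.
RULES = [
    ("E", ["E", "+", "n"]),
    ("E", ["n"]),
]


def shift_reduce(tokens):
    """Return (accepted, trace) for a list of terminal symbols."""
    stack = []
    trace = []
    remaining = list(tokens)
    while True:
        for lhs, rhs in RULES:
            # Reduce: replace a matching right-hand side on top of the stack.
            if stack[-len(rhs):] == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:
            # No rule applied: shift the next input symbol, or stop at end of input.
            if not remaining:
                break
            stack.append(remaining.pop(0))
            trace.append(f"shift {stack[-1]}")
    # The input is accepted if everything was reduced to the start symbol.
    return stack == ["E"], trace


accepted, steps = shift_reduce(["n", "+", "n"])
for step in steps:
    print(step)
print("accepted:", accepted)
```

The trace shows the tree being built from the leaves upward: each `n` is first reduced to `E`, and only then is the sum reduced to the root.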

Error-Recovery Methods

Common errors that occur in parsing:

 Lexical: a misspelled identifier name
 Syntactical: unbalanced parentheses or a missing semicolon
 Semantical: an incompatible value assignment
 Logical: an infinite loop or unreachable code

A program can contain errors of these types at various stages of the compilation process.
A parser should be able to detect and report any error found in the program. Whenever an
error occurs, the parser should be able to handle it and carry on parsing the remaining
input.

There are five common error-recovery methods that can be implemented in the parser:

Statement mode recovery

 When the parser encounters an error, it takes corrective steps
that allow the rest of the input to be parsed.
 For example, inserting a missing semicolon is a statement mode recovery
step. However, the parser designer needs to be careful when making such changes, as
one wrong correction may lead to an infinite loop.

Panic-Mode recovery

 When the parser encounters an error, this mode ignores the rest of the
statement and does not process the input between the error and a delimiter, such as a semicolon.
This is a simple error-recovery method.
 In this type of recovery, the parser discards input symbols one at a time until one of a
designated set of synchronizing tokens is found. The synchronizing tokens
are usually delimiters, such as the semicolon.
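
Panic-mode recovery can be sketched as follows. The statement form (NAME = NUMBER ;) and the function name `parse_statements` are assumptions made for this example, and the semicolon is the only synchronizing token.

```python
def parse_statements(tokens):
    """Parse statements of the hypothetical form: NAME '=' NUMBER ';'."""
    statements, errors = [], []
    pos = 0
    while pos < len(tokens):
        window = tokens[pos:pos + 4]
        if (len(window) == 4 and window[0].isidentifier() and window[1] == "="
                and window[2].isdigit() and window[3] == ";"):
            statements.append((window[0], int(window[2])))
            pos += 4
            continue
        errors.append(f"syntax error near token {pos}: {tokens[pos]!r}")
        # Panic mode: discard tokens until the synchronizing token ';' is found.
        while pos < len(tokens) and tokens[pos] != ";":
            pos += 1
        pos += 1  # step past the ';' and resume with the next statement
    return statements, errors


good, bad = parse_statements("x = 1 ; y = = 2 ; z = 3 ;".split())
print(good)
print(bad)
```

The malformed second statement is reported once and skipped as a whole, and parsing resumes cleanly at `z = 3 ;`. The cost of this simplicity is that any further errors inside the skipped tokens go unreported.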

Phrase-Level Recovery:

 The compiler corrects the program by inserting or deleting tokens, which allows it to
continue parsing from where it was. The correction is performed on the remaining input: the parser
can replace a prefix of the remaining input with some string that lets it
continue.

Error Productions

 Error-production recovery augments the grammar of the language with productions that generate
common erroneous constructs. When the parser uses such a production, it can issue an error
diagnostic for that construct.

Global Correction:

 The compiler should make as few changes as possible while processing an
incorrect input string. Given an incorrect input string a and a grammar c, the algorithm
searches for a parse tree for a related string b such that the number of token insertions, deletions, and
modifications needed to transform a into b is as small as possible.
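
The cost that global correction tries to minimise can be sketched as an edit distance over tokens. The function name `token_edit_distance` is hypothetical, and the sketch covers only the cost measure: it compares a with one given candidate b and does not attempt the harder search over all strings that the grammar accepts.

```python
def token_edit_distance(a, b):
    """Minimum number of token insertions, deletions, and modifications
    needed to transform token list a into token list b."""
    # prev[j] holds the distance between the tokens of a seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(
                prev[j] + 1,         # delete tok_a
                curr[j - 1] + 1,     # insert tok_b
                prev[j - 1] + cost,  # keep or modify the token
            ))
        prev = curr
    return prev[-1]


incorrect = "x = 1 y = 2 ;".split()    # a: missing semicolon after 1
corrected = "x = 1 ; y = 2 ;".split()  # b: one candidate correction
print(token_edit_distance(incorrect, corrected))
```

Among all correct candidates b, global correction would prefer the one with the smallest such distance, which is why it is usually described as too expensive to apply in full.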
