Questions and Answers
What is the purpose of the function scanner() in the provided code?
- To scan and identify symbols in the input stream (correct)
- To generate output from the input stream
- To handle syntax errors in the input
- To parse input files
Which case in the scanner handles printing commands?
- case 'i':
- case 'f':
- case 'p': (correct)
- case '=':
What action does the scanner take when it encounters a whitespace character?
- It skips the character and continues scanning (correct)
- It returns an error
- It adds the whitespace to the symbol table
- It stops the scanning process
What will happen if the scanner reads an unexpected character?
In the given code snippet, what does scanDigits() likely do?
Which data type is used for symbol objects in the scanner?
What does the malloc function do in the context of the scanner?
What type of symbol does the scanner associate with a variable name read as a lowercase letter?
What is the purpose of the 'yywrap' function in the provided program?
What will the lexer output if the input is '12345'?
Which symbol is NOT accounted for in the lexer according to the requirements given?
How does the macro 'YY_INPUT' contribute to the lexer functionality?
What would be an appropriate modification to handle floating point numbers?
Which of the following represents the correct regular expression for an identifier in this lexer?
What is the main purpose of the lexical analyzer described in the content?
Which of the following actions should be performed for tokens categorized as comments?
What will the function scanDigits() return if the input character stream contains a valid integer?
What is the purpose of the line 'symbolPtr= (Symbol*) malloc(sizeof(Symbol));' in the scanDigits() function?
Which of the following tools was developed as a faster lexical analyzer compared to lex?
In the lex/flex example program, what is the purpose of the line 'printf("%c",yytext);'?
When is 'yywrap()' called in a lex/flex program?
What is the primary role of the 'flex' tool in programming?
What type of symbol does the function assign when a decimal point is detected in scanDigits()?
What is the expected behavior of the example lex/flex program when run?
Which regular expression matches a string that starts with 'begin'?
What does the regular expression '[^0-9]' represent?
In the context of regular expressions, what does '*' signify?
What is the purpose of the '\' character in a regular expression?
Which regular expression will match an optional '+' followed by one or more digits?
In the given context, how would you interpret the expression 'A{1,3}'?
Which command sequence correctly compiles a lex program named 'ex4_echoer.lex'?
How does the '$' symbol function in a regex?
What does the line 'yylex();' accomplish in the program?
What is the role of the function yywrap() in the lex/flex program?
What happens if yywrap() returns a value of 1?
What does the regular expression '.*' mean in this context?
In the context of the lex/flex program, what is the outcome of typing 'quit'?
Which command compiles the lex file into a C source file?
What is the significance of the line 'printf("\n");' in the given program?
If you wanted to count characters and newlines, which task would be appropriate?
Flashcards
Tokenization
The process of converting a stream of characters (source code) into a sequence of tokens, which represent meaningful units of the program.
Tokenizer
A program that performs tokenization on a source code file.
Token
An individual unit of meaning in a program, such as keywords, identifiers, operators, and literals.
Input Character Stream
Symbol
Symbol Type
Symbol Value
End Symbol
EOF (End of File)
Regular Expression
Lexer (Lexical Analyzer)
yywrap()
yylex()
yytext
Lex Rules
Action Code
quoted strings
identifiers
reserved words
integers
lexer
operators
Period (.)
Vertical Bar (|)
Square Brackets ([...])
Negated Character Class ([^...])
Parentheses ()
Asterisk (*)
Plus Sign (+)
Question Mark (?)
String
Lexical Analyzer
Language Grammar
Flex
Multi-Component program
Component
Study Notes
Course Information
- Course: CSC 448 Compiler Design
- Lecturer: Joseph Phillips
- University: DePaul University
- Date: 2018 April 9
Reading Material
- Book Title: "Crafting a Compiler"
- Authors: Charles Fischer, Ron Cytron, Richard LeBlanc Jr.
- Publication Year: 2010
- Chapter 3: Scanning - Theory and Practice
Topics
- Scanning (Practice)
- Flex
Compiler Structure
- Program → Compiler → Symbol Table → Executable
- Scanner → Parser → Type checker → Translator → Optimizer → Code generator
Hand-coded Tokenizer (Example)
- The code snippet shows a hand-coded tokenizer that splits the input characters into tokens.
- It uses different cases to identify and categorize tokens: assignment, addition, subtraction, print, integer declaration, float declaration, identifier, and so on.
- Whitespace in the input stream is skipped before each token is scanned.
- It handles integers and identifiers.
- The code includes error handling.
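The original snippet is not reproduced in these notes, so the following is only a minimal sketch of what such a hand-coded scanner might look like. The function names scanner() and scanDigits() and the Symbol type come from the quiz questions above; the enum values, the struct layout, and the choice to read from an in-memory string rather than a file are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token kinds, modelled loosely on the cases the notes list. */
typedef enum {
    SYM_ASSIGN, SYM_PLUS, SYM_MINUS, SYM_PRINT,
    SYM_INT_DECL, SYM_FLOAT_DECL, SYM_ID,
    SYM_INT_NUM, SYM_FLOAT_NUM, SYM_END, SYM_ERROR
} SymbolType;

/* Hypothetical symbol record: a kind plus the matched text. */
typedef struct {
    SymbolType type;
    char text[32];   /* lexeme, truncated if longer than 31 characters */
} Symbol;

/* Read a run of digits, optionally followed by '.' and more digits. */
static Symbol *scanDigits(const char **pp)
{
    Symbol *symbolPtr = (Symbol *)malloc(sizeof(Symbol));
    size_t n = 0;

    if (symbolPtr == NULL)
        return NULL;
    symbolPtr->type = SYM_INT_NUM;
    while (isdigit((unsigned char)**pp)) {
        if (n < sizeof symbolPtr->text - 1)
            symbolPtr->text[n++] = **pp;
        (*pp)++;
    }
    if (**pp == '.') {               /* a decimal point makes it a float */
        symbolPtr->type = SYM_FLOAT_NUM;
        if (n < sizeof symbolPtr->text - 1)
            symbolPtr->text[n++] = '.';
        (*pp)++;
        while (isdigit((unsigned char)**pp)) {
            if (n < sizeof symbolPtr->text - 1)
                symbolPtr->text[n++] = **pp;
            (*pp)++;
        }
    }
    symbolPtr->text[n] = '\0';
    return symbolPtr;
}

/* Return the next symbol from *pp and advance *pp past it.
   The caller owns the returned Symbol and must free() it. */
Symbol *scanner(const char **pp)
{
    Symbol *symbolPtr;
    int c;

    while (isspace((unsigned char)**pp))   /* skip whitespace */
        (*pp)++;
    if (isdigit((unsigned char)**pp))
        return scanDigits(pp);

    symbolPtr = (Symbol *)malloc(sizeof(Symbol));
    if (symbolPtr == NULL)
        return NULL;
    c = (unsigned char)**pp;
    symbolPtr->text[0] = (char)c;
    symbolPtr->text[1] = '\0';
    if (c == '\0') {                       /* end of input */
        symbolPtr->type = SYM_END;
        return symbolPtr;
    }
    (*pp)++;
    switch (c) {
    case '=': symbolPtr->type = SYM_ASSIGN;     break;
    case '+': symbolPtr->type = SYM_PLUS;       break;
    case '-': symbolPtr->type = SYM_MINUS;      break;
    case 'p': symbolPtr->type = SYM_PRINT;      break;
    case 'i': symbolPtr->type = SYM_INT_DECL;   break;
    case 'f': symbolPtr->type = SYM_FLOAT_DECL; break;
    default:  /* any other lowercase letter is a variable name */
        symbolPtr->type = islower(c) ? SYM_ID : SYM_ERROR;
        break;
    }
    return symbolPtr;
}
```

A caller would typically set a `const char *` to the start of the input and call scanner() in a loop until it returns a symbol of type SYM_END, freeing each symbol after use.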
Flex (Alternative Tokenizer)
- Flex is a tool for generating lexical analyzers (scanners) in C.
- It allows specifying regular expressions to define tokens.
- Flex automatically generates C code for the scanner.
- Compared with a hand-coded tokenizer, a Flex specification is easier to change, and the generated scanner generally performs well.
- Flex's history:
- Lex, the original lexical analyzer generator, was created in the 1970s by Mike Lesk and Eric Schmidt.
- Flex (fast lexical analyzer generator) was created in 1987 by Vern Paxson.
First Lex/Flex Example (Basic Echoer)
- Shows how to compile and run a simple lex/flex program that echoes its input to its output.
First Lex/Flex Program (Variations & Examples)
- Variations on the first lex/flex program demonstrate the structure of the code.
- Example input and output text illustrate the various commands and compilation steps.
Lex/Flex Rules
- Regular expressions:
- Period (.) matches any single character except a newline.
- Bracket expressions [ ] match any one of the characters inside the brackets.
- Ranges such as 0-9 can be used inside brackets.
- Negated bracket expressions [^ ] match any character not listed in the brackets.
- Repetition:
- * matches zero or more occurrences of the preceding element.
- + matches one or more occurrences of the preceding element.
- ? matches zero or one occurrence of the preceding element.
- {} matches a specific number (or range) of occurrences of the preceding element, as in A{1,3}.
- / matches the preceding expression only if it is followed by the expression after the slash (trailing context).
- Anchoring:
- ^ matches the beginning of a line.
- $ matches the end of a line.
- Grouping and escaping:
- Parentheses ( ) group expressions.
- The \ character escapes special characters and introduces escape sequences such as \n.
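To try out most of the operators listed above without running flex, one option is the POSIX regex library in C. This is only an approximation: flex's pattern syntax is close to, but not the same as, POSIX extended regular expressions. For example, the trailing-context operator / has no POSIX equivalent and is not shown, and a POSIX period can match a newline unless REG_NEWLINE is set. The helper name matches() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regular expression `pattern` matches
   somewhere in `text`, 0 if it does not, and -1 if the pattern is invalid. */
int matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For example, matches("^\\+?[0-9]+$", "+42") tests for an optional plus sign followed by one or more digits, and matches("^A{1,3}$", "AA") tests for one to three occurrences of A. The doubled backslash is C string escaping; the pattern the regex library sees contains a single backslash.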
Counting Characters, Newlines And Vowels
- Specific lex/flex programs are provided to show how to count characters and lines.
- Further examples count vowel and non-vowel letters.
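The lex/flex programs themselves are not reproduced in these notes. As a rough plain-C analogue of what such counting rules compute, the sketch below walks an in-memory string and keeps one counter per category, much as a lex/flex program would presumably increment a counter in the action of each rule. The Counts struct and the function name countText() are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tally of the categories named in the notes. */
typedef struct {
    size_t chars;      /* every character, including newlines */
    size_t newlines;
    size_t vowels;     /* a, e, i, o, u in either case */
    size_t nonvowels;  /* all other letters */
} Counts;

/* Count characters, newlines, vowels, and non-vowel letters in text. */
Counts countText(const char *text)
{
    Counts c = {0, 0, 0, 0};

    for (; *text != '\0'; text++) {
        unsigned char ch = (unsigned char)*text;

        c.chars++;
        if (ch == '\n') {
            c.newlines++;
        } else if (isalpha(ch)) {
            if (strchr("aeiouAEIOU", ch) != NULL)
                c.vowels++;
            else
                c.nonvowels++;
        }
    }
    return c;
}
```

Calling countText("Hello\nflex\n") yields 11 characters, 2 newlines, 3 vowels, and 6 non-vowel letters.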
Lex-Flex Functions and Variables
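- yyin: a FILE pointer to the input; it is used like stdin.
- yywrap(): called at the end of the input. It should return 0 if a new file has been set up to read, or 1 if there is nothing more to read.
- yytext: holds the text of the most recently matched lexeme.
- getchar/strdup: getchar() reads a single character from standard input; strdup() creates a copy of a string (for example, to keep the contents of yytext after the next match overwrites it).
- YY_INPUT: a macro the scanner uses to read input efficiently, one buffer-full of characters at a time. It can be redefined to change where input comes from.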
Input Control
- The input() function (called yyinput() in C++) lets an action read characters directly from the input.
- This gives a way to control input, for example to skip over comments.
Nested Comments
- Example lex rules show how to read nested C-style comments correctly.
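The lex rules are not reproduced in these notes. The usual idea behind handling nested comments is a depth counter: each opener increments it, each closer decrements it, and the comment ends when the counter returns to zero. The sketch below applies that idea to an in-memory string rather than reading characters with input() inside a lex action; the function name skipNestedComment() is hypothetical.

```c
#include <stddef.h>

/* Given p pointing just past a comment opener (slash, star), return a
   pointer just past the matching closer (star, slash), treating comments
   as nestable. Returns NULL if the input ends before the comment closes. */
const char *skipNestedComment(const char *p)
{
    int depth = 1;   /* the caller has already consumed one opener */

    while (*p != '\0') {
        if (p[0] == '/' && p[1] == '*') {          /* nested opener */
            depth++;
            p += 2;
        } else if (p[0] == '*' && p[1] == '/') {   /* closer */
            depth--;
            p += 2;
            if (depth == 0)
                return p;
        } else {
            p++;
        }
    }
    return NULL;     /* unterminated comment */
}
```

A regular expression alone cannot track arbitrary nesting depth, which is why a counter maintained in action code is needed for this case.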
Description
Explore the principles of scanning as discussed in Chapter 3 of 'Crafting a Compiler.' This quiz covers the theoretical and practical aspects of tokenization, focusing on hand-coded tokenizers and the use of Flex as an alternative. Test your understanding of compiler structures and the importance of scanning in the compilation process.