BNF Notation: Dive Deeper Into Python's Grammar – Real Python PDF
Summary
This document provides a tutorial on Backus-Naur Form (BNF) notation, specifically focusing on Python's BNF variation. It explains the fundamentals of BNF, including terminals, nonterminals, and rules, and demonstrates how to read and interpret BNF grammar rules within the Python documentation. The tutorial is intended for individuals with a basic understanding of Python and programming languages, aiming to enhance their understanding of Python's syntax through practical examples.
Full Transcript
27/11/2024, 11:10 BNF Notation: Dive Deeper Into Python's Grammar – Real Python
https://realpython.com/python-bnf-notation/

In this tutorial, you’ll:

- Learn what BNF notation is and what it’s used for
- Explore the characteristics of Python’s BNF variation
- Learn how to read the BNF notation in the Python documentation
- Explore some best practices for reading Python’s BNF notation

To get the most out of this tutorial, you should be familiar with Python syntax, including keywords, operators, and some common constructs like expressions, conditional statements, and loops.

Getting to Know Backus-Naur Form Notation (BNF)

The Backus–Naur form or Backus normal form (BNF) is a metasyntax notation for context-free grammars. Computer scientists often use this notation to describe the syntax of programming languages because it allows them to write a detailed description of a language’s grammar.

The BNF notation consists of three core pieces:

Component      Description                                                   Examples
Terminals      Strings that must exactly match specific items in the input.  "def", "return", ":"
Nonterminals   Symbols that will be replaced by a concrete value. They may   <letter>, <digit>
               also be called simply syntactic variables.
Rules          Conventions of terminals and nonterminals that define how     <letter> ::= "a"
               these elements relate.

By combining terminals and nonterminals, you can create BNF rules, which can get as detailed as you need. Nonterminals must have their own defining rules. In a piece of grammar, you’ll have a root rule and potentially many secondary rules that define the required nonterminals. This way, you may end up with a hierarchy of rules.

BNF rules are the core components of a BNF grammar. So, a grammar is a set of BNF rules, which are also called production rules.
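To make terminals, nonterminals, and rules more concrete, here’s a minimal sketch in Python. The toy grammar, the GRAMMAR dictionary, and the expand() helper are hypothetical names that aren’t part of the tutorial. The sketch stores each rule as a list of alternatives and treats any symbol without a rule as a terminal. It only works for grammars without recursion, because it lists every string that a symbol can produce:

```python
from itertools import product

# Hypothetical toy grammar (an assumption for illustration only):
#   <greeting> ::= <word> "!"
#   <word>     ::= "hi" | "hello"
GRAMMAR = {
    "<greeting>": [["<word>", "!"]],
    "<word>": [["hi"], ["hello"]],
}


def expand(symbol):
    """Return every string that the given symbol can produce."""
    if symbol not in GRAMMAR:
        # A symbol without a rule is a terminal: it matches itself.
        return [symbol]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each item of the alternative, then combine the pieces.
        parts = [expand(item) for item in alternative]
        results.extend("".join(combo) for combo in product(*parts))
    return results


print(expand("<greeting>"))  # ['hi!', 'hello!']
```

Here, <greeting> plays the role of the root rule, and <word> is a secondary rule that the root rule depends on.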
In practice, you can build a set of BNF rules to specify the grammar of a language. Here, language refers to a set of strings that are valid according to the rules defined in the corresponding grammar. BNF is mainly used for programming languages.

For example, the Python syntax has a grammar that’s defined as a set of BNF rules, and these rules are used to validate the syntax of any piece of Python code. If the code doesn’t fulfill the rules, then you’ll get a SyntaxError.

You’ll find many variations of the original BNF notation out there. Some of the most relevant include the extended Backus–Naur form (EBNF) and augmented Backus–Naur form (ABNF).

In the following sections, you’ll learn the basics of creating BNF rules. Note that you’ll use a variation of BNF that matches the requirements of the BNF Playground site, which you’ll use for testing your rules.

BNF Rules and Their Components

As you already learned, by combining terminals and nonterminals, you can create BNF rules. These rules typically follow the syntax below:

    <symbol> ::= expression

In the BNF rule syntax, you have the following parts:

- <symbol> is a nonterminal variable, which is often enclosed in angle brackets (<>).
- ::= means that the nonterminal on the left will be replaced with the expression on the right.
- expression consists of a series of terminals, nonterminals, and other symbols that define a specific piece of grammar.

When building BNF rules, you can use a variety of symbols with specific meanings.
For example, if you’re going to use the BNF Playground site to compile and test your rules, then you’ll find yourself using some of the following symbols:

Symbol   Meaning
""       Encloses a terminal symbol
<>       Indicates a nonterminal symbol
()       Indicates a group of valid options
+        Specifies one or more of the previous element
*        Specifies zero or more of the previous element
?        Specifies zero or one occurrence of the previous element
|        Indicates that you can select one of the options
[x-z]    Indicates letter or digit intervals

Once you know how to write a BNF rule and what symbols to use, you can start creating your own rules. Note that the BNF Playground has several additional symbols and syntactical constructs that you can use in your rules. For a complete reference, click the Grammar Help section at the top of the page.

Now, it’s time to start playing with a couple of custom BNF rules. To kick things off, you’ll start with a generic example.

A Generic Example: Grammar for a Full Name

Say that you need to create a context-free grammar to define how a user should input a person’s full name. In this case, the full name will have three components:

1. First name
2. Middle name
3. Family name

Between each component, you need to place exactly one space. You should also treat the middle name as optional. Here’s how you can define this rule:

    <full_name> ::= <first_name> " " (<middle_name> " ")? <family_name>

The left-hand part of your BNF rule is a nonterminal variable that identifies the person’s full name. The ::= symbol denotes that <full_name> will be replaced with the right-hand part of the rule.

The right-hand part of the rule has several components. First, you have the first name, which you define using the <first_name> nonterminal.
Next, you need a space to separate the first name from the following component. To define this space, you use a terminal, which consists of a space character between quotes.

After the first name, you can accept a middle name, and after that, you need another space. So, you open parentheses to group these two elements. Then you create <middle_name> and the " " terminal. Both are optional, so you use a question mark (?) after the group to denote that condition.

Finally, you need the family name. To define this component, you use another nonterminal, <family_name>.

That’s it! You’ve built your first BNF rule. However, you still don’t have a working grammar. You only have a root rule.

To complete the grammar, you need to define rules for <first_name>, <middle_name>, and <family_name>. To do this, you need to meet some requirements:

- Each name component will accept only letters.
- Each name component will start with a capital letter and continue with lowercase letters.

In this case, you can start by defining two rules, one for uppercase letters and one for lowercase letters:

    <full_name> ::= <first_name> " " (<middle_name> " ")? <family_name>
    <uppercase_letter> ::= [A-Z]
    <lowercase_letter> ::= [a-z]

In the last two lines of this grammar snippet, you create two pretty similar rules. The first rule accepts all the ASCII letters from uppercase A to Z. The second rule accepts all the lowercase letters. In this example, you don’t support accents or other non-ASCII letters.

With these rules in place, you can build the rest of your rules. To kick things off, go ahead and add the <first_name> rule:

    <full_name> ::= <first_name> " " (<middle_name> " ")? <family_name>
    <uppercase_letter> ::= [A-Z]
    <lowercase_letter> ::= [a-z]
    <first_name> ::= <uppercase_letter> <lowercase_letter>*

To define the <first_name> rule, you start with the <uppercase_letter> nonterminal to express that the first letter of the name must be an uppercase letter. Then, you continue with the <lowercase_letter> nonterminal followed by an asterisk (*). This asterisk means that the first name will accept zero or more lowercase letters after the initial uppercase letter.

You can follow this same pattern to build the <middle_name> and <family_name> rules.
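Because the full name grammar has no recursion, you can also approximate it with a regular expression and experiment in Python. The following is only a sketch, not how the BNF Playground works internally. The NAME and FULL_NAME constants and the is_full_name() helper are hypothetical names introduced for illustration:

```python
import re

# One name component: an uppercase ASCII letter followed by
# zero or more lowercase ASCII letters.
NAME = r"[A-Z][a-z]*"

# First name, a space, an optional "middle name plus space" group,
# and finally the family name.
FULL_NAME = re.compile(rf"{NAME} ({NAME} )?{NAME}")


def is_full_name(text):
    """Check whether the whole string fits the full name pattern."""
    return FULL_NAME.fullmatch(text) is not None


print(is_full_name("John Smith"))       # True
print(is_full_name("John Paul Smith"))  # True
print(is_full_name("john smith"))       # False
```

Note how the optional group in the pattern mirrors the parentheses and question mark in the root rule, while the character classes mirror the uppercase and lowercase letter rules.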
Would you like to give it a try?

You can check if your full name grammar works using the BNF Playground site. Once you navigate to the BNF Playground site, you can paste your grammar rules in the text input area. Then press the COMPILE BNF button.

If everything is okay with your BNF rules, then you can enter a full name in the Test a string here! input field. Once you’ve entered a person’s full name, the field will turn green if the input string fulfills the rules.

A Programming-Related Example: Identifiers

In the previous section, you learned how to create a BNF grammar that defines how your users must provide a person’s name. This is a generic example that may or may not relate to programming. In this section, you’ll get more technical by writing a short set of BNF rules to validate an identifier in a hypothetical programming language.

An identifier can be the name of a variable, function, class, or other object. In your example, you’ll write a set of rules to check whether a given string meets the following requirements:

- The first character is an uppercase or lowercase letter or an underscore.
- The rest of the characters can be uppercase or lowercase letters, digits, or underscores.

Here’s the root rule for your identifier:

    <identifier> ::= <char> (<char> | <digit>)*

In this rule, you have the <identifier> nonterminal variable, which defines the root. On the right-hand side, you first have the <char> nonterminal. The rest of the identifier is grouped inside parentheses. The asterisk after the group says that elements from the group can appear zero or more times. Each such element is either a character or a digit.

Now, you need to define the <char> and <digit> nonterminals with their own dedicated rules.
They’ll look like the code below:

    <identifier> ::= <char> (<char> | <digit>)*
    <char> ::= [A-Z] | [a-z] | "_"
    <digit> ::= [0-9]

The <char> rule accepts one ASCII letter in either lowercase or uppercase. Alternatively, it can accept an underscore. Finally, the <digit> rule accepts a digit from 0 to 9. Now, your set of rules is complete. Go ahead and give it a try on the BNF Playground site.

For you as a programmer, reading BNF rules can be a pretty useful skill. For example, you’ll often find that the official documentation of many programming languages includes the BNF grammar of the languages, in whole or in part. So, being able to read BNF allows you to better understand the language syntax and intricacies.

From this point on, you’ll learn how to read Python’s BNF variation, which you’ll find in several parts of the language documentation.

Understanding Python’s BNF Variation

Python uses a custom variation of the BNF notation to define the language’s grammar. In many parts of the Python documentation, you’ll find portions of BNF grammar. These snippets can help you better understand any syntactic construct that you’re studying.

Python’s BNF variation uses the following style:

Symbol   Meaning
name     Holds the name of a rule or nonterminal
::=      Means expand into
|        Separates alternatives
*        Accepts zero or more repetitions of the preceding item
+        Accepts one or more repetitions of the preceding item
[]       Accepts zero or one occurrence, which means that the enclosed item is optional
()       Groups options
""       Defines literal strings
space    Is only meaningful to separate tokens

These symbols define Python’s BNF variation. One notable difference from what regular BNF rules look like is that Python doesn’t use angle brackets (<>) to enclose nonterminal symbols.
It only uses the nonterminal identifier or name. Arguably, this makes rules cleaner and more readable.

Also note that the square brackets ([]) have a different meaning for Python. Up to this point, you’ve used them to enclose sets of characters like [a-z]. In Python, these brackets mean that the enclosed element is optional. To define something like [a-z] in Python’s BNF variation, you’ll use "a"..."z" instead.

You’ll find many BNF snippets in the Python documentation. Learning how to navigate and read them is quite a useful skill for you as a Python developer. So, in the following sections, you’ll explore a few examples of BNF rules from the Python documentation, and you’ll learn how to read them.

Reading BNF Rules From Python’s Documentation: Examples

Now that you know the basics of reading the BNF notation and you’ve learned the characteristics of Python’s BNF variation, it’s time for you to start reading some BNF grammar from the Python documentation. This way, you’ll build the required skills to take advantage of this notation to learn more about Python and its syntax.

The pass and return Statements

To kick things off, you’ll start with the pass statement, which is a simple statement that allows you to do nothing in Python. The BNF notation for this statement is like the following:

    pass_stmt ::= "pass"

Here, you have the name of the rule, pass_stmt. Then you have the ::= symbol to indicate that the rule expands to "pass", which is a terminal symbol. This means that this statement consists of the pass keyword on its own. There are no additional syntactical components. So, you end up knowing the syntax for the pass statement:

    pass

The BNF rule for the pass statement is one of the simplest rules that you’ll find in the documentation. It only contains a terminal that defines the syntax straightforwardly.
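If you want to confirm what a rule allows, then you can ask Python’s own parser. The sketch below relies on ast.parse() from the standard library, which raises a SyntaxError when the source code doesn’t fit the grammar. The is_valid_python() helper is a hypothetical name used only for this illustration:

```python
import ast


def is_valid_python(source):
    """Return True if Python's parser accepts the source code."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# The pass_stmt rule consists of the "pass" terminal alone,
# so anything after the keyword breaks the rule.
print(is_valid_python("pass"))     # True
print(is_valid_python("pass 42"))  # False
```

You can reuse a helper like this one to probe the other rules that you’ll read in the documentation.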
Another common statement that you’ll often use in your day-to-day coding is return. This statement is a bit more complex than pass. Here’s the BNF rule for return from the documentation:

    return_stmt ::= "return" [expression_list]

In this case, you have the rule’s name, return_stmt, and the ::= as usual. Then, you have a terminal symbol consisting of the word return. The second component of this rule is an optional list of expressions, expression_list. You know that this second component is optional because it’s enclosed in square brackets.

Having an optional list of expressions after the word return is consistent with the fact that Python allows return statements without an explicit return value. In this case, the language automatically returns None, which is Python’s null value:

    >>> def func():
    ...     return
    ...
    >>> print(func())
    None

This toy function uses a bare return without providing an explicit return value. In this case, Python automatically returns None for you.

Now, if you click the expression_list variable on the documentation, then you’ll land on the rule below:

    expression_list ::= expression ("," expression)* [","]

Again, you have the rule’s name and the ::= symbol. Then, you have a required nonterminal variable, expression. This nonterminal symbol has its own definition rule, which you can access by clicking on the symbol itself. Up to this point, you have the syntax of a return statement with a single return value:

    >>> def func():
    ...     return "Hello!"
    ...
    >>> func()
    'Hello!'

In this example, you use the "Hello!" string as the return value of your function. Note that the return value can be any Python object or expression.

The rule continues by opening parentheses.
Remember that BNF uses parentheses to group objects. In this case, you have a terminal consisting of a comma (","), and then you have the expression symbol again. The asterisk after the closing parentheses indicates that this construct can appear zero or more times.

This part of the rule describes those return statements with multiple return values:

    >>> def func():
    ...     return "Hello!", "Pythonista!"
    ...
    >>> func()
    ('Hello!', 'Pythonista!')

Now, your function returns two values. To do this, you provide a comma-separated series of values. When you call the function, you get a tuple of values.

The final part of the rule is [","]. This tells you that the list of expressions can include an optional trailing comma. This comma may cause tricky results:

    >>> def func():
    ...     return "Hello!",
    ...
    >>> func()
    ('Hello!',)

In this example, you use the trailing comma after a single return value. As a result, your function returns a tuple with a single item. However, note that the comma doesn’t cause any effect if you already have multiple comma-separated values:

    >>> def func():
    ...     return "Hello!", "Pythonista!",
    ...
    >>> func()
    ('Hello!', 'Pythonista!')

In this example, you add a trailing comma to a return statement with multiple return values. Again, you get a tuple of values when you call the function.

Assignment Expressions

Another interesting BNF snippet that you can find in the Python documentation is the one that defines the syntax of assignment expressions, which you build with the walrus operator.
Here’s the root BNF rule for this type of expression:

    assignment_expression ::= [identifier ":="] expression

The right-hand part of this rule starts with an optional component that includes a nonterminal called identifier and a terminal consisting of the ":=" symbol. This symbol is the walrus operator itself. Then, you have a required expression.

Note: At first glance, it may be weird that the assignment part is optional, as the whole point of an assignment expression is the assignment itself. However, making this part optional greatly simplifies many of the grammar rules because an assignment expression is allowed almost everywhere a plain expression is. You’ll see an example of this simplification in the following section.

This matches the syntax of an assignment expression with the walrus operator:

    identifier := expression

Note that in an assignment expression, the assignment part is optional. You’ll get the same value out of evaluating the expression whether you perform the assignment or not.

Here’s a working example of an assignment expression:

    >>> (length := len([1, 2, 3]))
    3
    >>> length
    3

In this example, you create an assignment expression that assigns the number of items in a list to the length variable.

Note that you’ve enclosed the expression in parentheses. Otherwise, it’ll fail with a SyntaxError exception. Check out the Walrus Operator Syntax section from The Walrus Operator: Python’s Assignment Expressions to figure out why you need the parentheses.

Conditional Statements

Now that you’ve learned how to read the BNF rules for simple expressions, you can jump into compound statements. Conditional statements are pretty common in any piece of Python code.
The Python documentation provides the BNF rule for this type of statement:

    if_stmt ::= "if" assignment_expression ":" suite
                ("elif" assignment_expression ":" suite)*
                ["else" ":" suite]

When you start reading this rule, you immediately find the "if" terminal symbol, which you must use to start any conditional statement. Then, you find the assignment_expression nonterminal, which you already studied in the previous section.

Note: The if_stmt rule uses the assignment_expression nonterminal to define the condition. This allows you to use either an assignment expression or a plain expression in the condition. Remember that the assignment part is optional in assignment_expression.

Next, you have the ":" terminal. This is the colon that you need to use at the end of a compound statement’s header. This colon denotes that the statement’s header is complete. Finally, you have a required nonterminal called suite, which is a set of indented statements.

Following this first part of the rule, you end up with the following Python syntax:

    if assignment_expression:
        suite

This is a bare-bones if statement. It starts with the if keyword. Then, you have an expression that Python evaluates for truth value. Finally, you have a colon that opens the possibility to have an indented block that works as the suite.

The second line of the BNF rule defines the syntax of elif clauses. In this line, you have the elif keyword as a terminal symbol.
Then, you have an expression, a colon, and again, a suite of indented code:

    if assignment_expression:
        suite
    elif assignment_expression:
        suite

You can have zero or more elif clauses in a conditional statement, which you know because of the asterisk after the closing parentheses. All of them will follow the same syntax.

The final part of the conditional BNF rule is the else clause, which consists of the else keyword followed by a colon and an indented suite of code. Here’s how this translates to Python syntax:

    if assignment_expression:
        suite
    elif assignment_expression:
        suite
    else:
        suite

The else clause is also optional in Python. In the BNF rule, you know that because of the square brackets surrounding the final line of the rule.

Here’s a toy example of a working conditional statement:

    >>> def read_temperature():
    ...     return 25
    ...
    >>> if (temperature := read_temperature()) < 10:
    ...     print("The weather is cold!")
    ... elif 10