PLP_5_2 PDF - Scope and Declarations in Programming Languages
Document Details
Uploaded by HilariousSagacity
Tags
Summary
This document discusses the scope and lifetime of variables in different programming languages. It explores how variables are declared and used in a variety of contexts and provides details on static and dynamic scoping, and how these concepts affect program execution and readability.
Full Transcript
The scope of a name defined with let inside a function definition is from the end of the defining expression to the end of the function. The scope of let can be limited by indenting the following code, which creates a new local scope. Although any indentation will work, the convention is that the in...
The scope of a name defined with let inside a function definition is from the end of the defining expression to the end of the function. The scope of let can be limited by indenting the following code, which creates a new local scope. Although any indentation will work, the convention is that the indentation is four spaces. Consider the following code: let n1 = let n2 = 7 let n3 = n2 + 3 n3;; let n4 = n3 + n1;; The scope of n1 extends over all of the code. However, the scope of n2 and n3 ends when the indentation ends. So, the use of n3 in the last let causes an error. The last line of the let n1 scope is the value bound to n1; it could be any expression. Chapter 15 includes more details of the let constructs in Scheme, ML, Haskell, and F#. 5.5.3 Declaration Order In C89, as well as in some other languages, all data declarations in a function except those in nested blocks must appear at the beginning of the function. However, some languages—for example, C99, C++, Java, JavaScript, and C# —allow variable declarations to appear anywhere a statement can appear in a program unit. Declarations may create scopes that are not associated with compound statements or subprograms. For example, in C99, C++, and Java, the scope of all local variables is from their declarations to the ends of the blocks in which those declarations appear. In the official documentation for C#, the scope of any variable declared in a block is said to be the whole block, regardless of the position of the declaration in the block, as long as it is not in a nested block. The same is true for methods. However, this is misleading, because the C# language definition requires that all variables be declared before they are used. Therefore, although the scope of a variable is said to extend from the declaration to the top of the block or subprogram in which that declaration appears, the variable still cannot be used above its declaration. Recall that C# does not allow the declaration of a variable in a nested block to have the same name as a variable in a nesting scope. This, together with the rule that the scope of a declaration is the whole block, makes the following nested declaration of x illegal: { } {int x; ... } int x; // Illegal Note that C# still requires that all be declared before they are used. Therefore, although the scope of a variable extends from the declaration to the top of the block or subprogram in which that declaration appears, the variable still cannot be used above its declaration. In JavaScript, local variables can be declared anywhere in a function, but the scope of such a variable is always the entire function. If used before its declaration in the function, such a variable has the value undefined. The reference is not illegal. The for statements of C++, Java, and C# allow variable definitions in their initialization expressions. In early versions of C++, the scope of such a variable was from its definition to the end of the smallest enclosing block. In the standard version, however, the scope is restricted to the for construct, as is the case with Java and C#. Consider the following skeletal method: void fun() { . . . for (int count = 0; count < 10; count++){ . . . } . . . } In later versions of C++, as well as in Java and C#, the scope of count is from the for statement to the end of its body (the right brace). 5.5.4 Global Scope Some languages, including C, C++, PHP, JavaScript, and Python, allow a program structure that is a sequence of function definitions, in which variable definitions can appear outside the functions. Definitions outside functions in a file create global variables, which potentially can be visible to those functions. C and C++ have both declarations and definitions of global data. Declarations specify types and other attributes but do not cause allocation of storage. Definitions specify attributes and cause storage allocation. For a specific global name, a C program can have any number of compatible declarations, but only a single definition. A declaration of a variable outside function definitions specifies that the variable is defined in a different file. A global variable in C is implicitly visible in all subsequent functions in the file, except those that include a declaration of a local variable with the same name. A global variable that is defined after a function can be made visible in the function by declaring it to be external, as in the following: extern int sum; In C99, definitions of global variables usually have initial values. Declarations of global variables never have initial values. If the declaration is outside function definitions, it need not include the extern qualifier. This idea of declarations and definitions carries over to the functions of C and C++, where prototypes declare names and interfaces of functions but do not provide their code. Function definitions, on the other hand, are complete. In C++, a global variable that is hidden by a local with the same name can be accessed using the scope operator (::). For example, if x is a global that is hidden in a function by a local named x, the global could be referenced as ::x. PHP statements can be interspersed with function definitions. Variables in PHP are implicitly declared when they appear in statements. Any variable that is implicitly declared outside any function is a global variable; variables implicitly declared in functions are local variables. The scope of global variables extends from their declarations to the end of the program but skips over any subsequent function definitions. So, global variables are not implicitly visible in any function. Global variables can be made visible in functions in their scope in two ways: (1) If the function includes a local variable with the same name as a global, that global can be accessed through the $GLOBALS array, using the name of the global as a string literal subscript, and (2) if there is no local variable in the function with the same name as the global, the global can be made visible by including it in a global declaration statement. Consider the following example: $day = "Monday"; $month = "January"; function calendar() { $day = "Tuesday"; global $month; print "local day is $day "; $gday = $GLOBALS['day']; print "global day is $gday <br \>"; print "global month is $month "; } calendar(); Interpretation of this code produces the following: local day is Tuesday global day is Monday global month is January The global variables of JavaScript are very similar to those of PHP, except that there is no way to access a global variable in a function that has declared a local variable with the same name. The visibility rules for global variables in Python are unusual. Variables are not normally declared, as in PHP. They are implicitly declared when they appear as the targets of assignment statements. A global variable can be referenced in a function, but a global variable can be assigned in a function only if it has been declared to be global in the function. Consider the following examples: day = "Monday" def tester(): print "The global day is:", day tester() The output of this script, because globals can be referenced directly in functions, is as follows: The global day is: Monday The following script attempts to assign a new value to the global day: day = "Monday" def tester(): print "The global day is:", day day = "Tuesday" print "The new value of day is:", day tester() This script creates an UnboundLocalError error message, because the assignment to day in the second line of the body of the function makes day a local variable, which makes the reference to day in the first line of the body of the function an illegal forward reference to the local. The assignment to day can be to the global variable if day is declared to be global at the beginning of the function. This prevents the assignment to day from creating a local variable. This is shown in the following script: day = "Monday" def tester(): global day print "The global day is:", day day = "Tuesday" print "The new value of day is:", day tester() The output of this script is as follows: The global day is: Monday The new value of day is: Tuesday Functions can be nested in Python. Variables defined in nesting functions are accessible in a nested function through static scoping, but such variables must be declared nonlocal in the nested function.9 An example skeletal program in Section 5.7 illustrates accesses to nonlocal variables. 9. The nonlocal reserved word was introduced in Python 3. All names defined outside function definitions in F# are globals. Their scope extends from their definitions to the end of the file. Declaration order and global variables are also issues in the class and member declarations in object-oriented languages. These are discussed in Chapter 12. 5.5.5 Evaluation of Static Scoping Static scoping provides a method of nonlocal access that works well in many situations. However, it is not without its problems. First, in most cases it allows more access to both variables and subprograms than is necessary. It is simply too crude a tool for concisely specifying such restrictions. Second, and perhaps more important, is a problem related to program evolution. Software is highly dynamic—programs that are used regularly continually change. These changes often result in restructuring, thereby destroying the initial structure that restricted variable and subprogram access in a staticscoped language. To avoid the complexity of maintaining these access restrictions, developers often discard structure when it gets in the way. Thus, getting around the restrictions of static scoping can lead to program designs that bear little resemblance to the original, even in areas of the program in which changes have not been made. Designers are encouraged to use far more globals than are necessary. All subprograms can end up being nested at the same level, in the main program, using globals instead of deeper levels of nesting.10 Moreover, the final design may be awkward and contrived, and it may not reflect the underlying conceptual design. These and other defects of static scoping are discussed in detail in Clarke, Wileden, and Wolf (1980). An alternative to the use of static scoping to control access to variables and subprograms is an encapsulation construct, which is included in many newer languages. Encapsulation constructs are discussed in Chapter 11. 10. Sounds like the structure of a C program, doesn’t it? 5.5.6 Dynamic Scope The scope of variables in APL, SNOBOL4, and the early versions of Lisp is dynamic. Perl and Common Lisp also allow variables to be declared to have dynamic scope, although the default scoping mechanism in these languages is static. Dynamic scoping is based on the calling sequence of subprograms, not on their spatial relationship to each other. Thus, the scope can be determined only at run time. Consider again the function big from Section 5.5.1, which is reproduced here, minus the function calls: function big() { function sub1() { var x = 7; } function sub2() { var y = x; var z = 3; } var x = 3; } Assume that dynamic-scoping rules apply to nonlocal references. The meaning of the identifier x referenced in sub2 is dynamic—it cannot be determined at compile time. It may reference the variable from either declaration of x, depending on the calling sequence. One way the correct meaning of x can be determined during execution is to begin the search with the local declarations. This is also the way the process begins with static scoping, but that is where the similarity between the two techniques ends. When the search of local declarations fails, the declarations of the dynamic parent, or calling function, are searched. If a declaration for x is not found there, the search continues in that function’s dynamic parent, and so forth, until a declaration for x is found. If none is found in any dynamic ancestor, it is a run-time error. Consider the two different call sequences for sub2 in the earlier example. First, big calls sub1, which calls sub2. In this case, the search proceeds from the local procedure, sub2, to its caller, sub1, where a declaration for x is found. So, the reference to x in sub2 in this case is to the x declared in sub1. Next, sub2 is called directly from big. In this case, the dynamic parent of sub2 is big, and the reference is to the x declared in big. Note that if static scoping were used, in either calling sequence discussed, the reference to x in sub2 would be to big’s x. Perl’s dynamic scoping is unusual—in fact, it is not exactly like that discussed in this section, although the semantics are often that of traditional dynamic scoping (see Programming Exercise 1). 5.5.7 Evaluation of Dynamic Scoping The effect of dynamic scoping on programming is profound. When dynamic scoping is used, the correct attributes of nonlocal variables visible to a program statement cannot be determined statically. Furthermore, a reference to the name of such a variable is not always to the same variable. A statement in a subprogram that contains a reference to a nonlocal variable can refer to different nonlocal variables during different executions of the subprogam. Several kinds of programming problems follow directly from dynamic scoping. First, during the time span beginning when a subprogram begins its execution and ending when that execution ends, the local variables of the subprogram are all visible to any other executing subprogram, regardless of its textual proximity or how execution got to the currently executing subprogram. There is no way to protect local variables from this accessibility. Subprograms are always executed in the environment of all previously called subprograms that have not yet completed their executions. As a result, dynamic scoping results in less reliable programs than static scoping. A second problem with dynamic scoping is the inability to type check references to nonlocals statically. This problem results from the inability to statically find the declaration for a variable referenced as a nonlocal. Dynamic scoping also makes programs much more difficult to read, because the calling sequence of subprograms must be known to determine the meaning of references to nonlocal variables. This task can be virtually impossible for a human reader. Finally, accesses to nonlocal variables in dynamic-scoped languages take far longer than accesses to nonlocals when static scoping is used. The reason for this is explained in Chapter 10. On the other hand, dynamic scoping is not without merit. In many cases, the parameters passed from one subprogram to another are variables that are defined in the caller. None of these needs to be passed in a dynamically scoped language, because they are implicitly visible in the called subprogram. It is not difficult to understand why dynamic scoping is not as widely used as static scoping. Programs in static-scoped languages are easier to read, are more reliable, and execute faster than equivalent programs in dynamicscoped languages. It was precisely for these reasons that dynamic scoping was replaced by static scoping in most current dialects of Lisp. Implementation methods for both static and dynamic scoping are discussed in Chapter 10. 5.6 Scope and Lifetime Sometimes the scope and lifetime of a variable appear to be related. For example, consider a variable that is declared in a Java method that contains no method calls. The scope of such a variable is from its declaration to the end of the method. The lifetime of that variable is the period of time beginning when the method is entered and ending when execution of the method terminates. Although the scope and lifetime of the variable are clearly not the same, because static scope is a textual, or spatial, concept whereas lifetime is a temporal concept, they at least appear to be related in this case. This apparent relationship between scope and lifetime does not hold in other situations. In C and C++, for example, a variable that is declared in a function using the specifier static is statically bound to the scope of that function and is also statically bound to storage. So, its scope is static and local to the function, but its lifetime extends over the entire execution of the program of which it is a part. Scope and lifetime are also unrelated when subprogram calls are involved. Consider the following C++ functions: void printheader() { . . . } /* end of printheader */ void compute() { int sum; . . . printheader(); } /* end of compute */ The scope of the variable sum is completely contained within the compute function. It does not extend to the body of the function printheader, although printheader executes in the midst of the execution of compute. However, the lifetime of sum extends over the time during which printheader executes. Whatever storage location sum is bound to before the call to printheader, that binding will continue during and after the execution of printheader. 5.7 Referencing Environments The referencing environment of a statement is the collection of all variables that are visible in the statement. The referencing environment of a statement in a static-scoped language is the variables declared in its local scope plus the collection of all variables of its ancestor scopes that are visible. In such a language, the referencing environment of a statement is needed while that statement is being compiled, so code and data structures can be created to allow references to variables from other scopes during run time. Techniques for implementing references to nonlocal variables in both static- and dynamic-scoped languages are discussed in Chapter 10. In Python, scopes can be created by function definitions. The referencing environment of a statement includes the local variables, plus all of the variables declared in the functions in which the statement is nested (excluding variables in nonlocal scopes that are hidden by declarations in nearer functions). Each function definition creates a new scope and thus a new environment. Consider the following Python skeletal program: g = 3; # A global def sub1(): a = 5; # Creates a local b = 7; # Creates another local . . . <------------------------------ 1 def sub2(): global g; # Global g is now assignable here c = 9; # Creates a new local . . . <------------------------------ 2 def sub3(): nonlocal c: # Makes nonlocal c visible here g = 11; # Creates a new local . . . <------------------------------ 3 The referencing environments of the indicated program points are as follows: Now consider the variable declarations of this skeletal program. First, note that, although the scope of sub1 is at a higher level (it is less deeply nested) than sub3, the scope of sub1 is not a static ancestor of sub3, so sub3 does not have access to the variables declared in sub1. There is a good reason for this. The variables declared in sub1 are stack dynamic, so they are not bound to storage if sub1 is not in execution. Because sub3 can be in execution when sub1 is not, it cannot be allowed to access variables in sub1, which would not necessarily be bound to storage during the execution of sub3. A subprogram is active if its execution has begun but has not yet terminated. The referencing environment of a statement in a dynamically scoped language is the locally declared variables, plus the variables of all other subprograms that are currently active. Once again, some variables in active subprograms can be hidden from the referencing environment. Recent subprogram activations can have declarations for variables that hide variables with the same names in previous subprogram activations. Consider the following example program. Assume that the only function calls are the following: main calls sub2, which calls sub1. void sub1() { int a, b; . . . <------------ 1 } /* end of sub1 */ void sub2() { int b, c; . . . . <------------ 2 sub1(); } /* end of sub2 */ void main() { int c, d; . . . <------------ 3 sub2(); } /* end of main */ The referencing environments of the indicated program points are as follows: 5.8 Named Constants A named constant is a variable that is bound to a value only once. Named constants are useful as aids to readability and program reliability. Readability can be improved, for example, using the name pi instead of the constant 3.14159265. Another important use of named constants is to parameterize a program. For example, consider a program that processes a fixed number of data values, say 100. Such a program usually uses the constant 100 in a number of locations for declaring array subscript ranges and for loop control limits. Consider the following skeletal Java program segment: void example() { int[] intList = new int[100]; String[] strList = new String[100]; . . . for (index = 0; index < 100; index++) { . . . } . . . for (index = 0; index < 100; index++) { . . . } . . . average = sum / 100; . . . } When this program must be modified to deal with a different number of data values, all occurrences of 100 must be found and changed. On a large program, this can be tedious and error prone. An easier and more reliable method is to use a named constant as a program parameter, as follows: void example() { final int len = 100; int[] intList = new int[len]; String[] strList = new String[len]; . . . for (index = 0; index < len; index++) { } . . . } . . . for (index = 0; index < len; index++) { . . . } . . . average = sum / len; . . . Now, when the length must be changed, only one line must be changed (the variable len), regardless of the number of times it is used in the program. This is another example of the benefits of abstraction. The name len is an abstraction for the number of elements in some arrays and the number of iterations in some loops. This illustrates how named constants can aid modifiability. C++ allows dynamic binding of values to named constants. This allows expressions containing variables to be assigned to constants in the declarations. For example, the C++ statement const int result = 2 * width + 1; declares result to be an integer type named constant whose value is set to the value of the expression 2 * width + 1, where the value of the variable width must be visible when result is allocated and bound to its value. Java also allows dynamic binding of values to named constants. In Java, named constants are defined with the final reserved word (as in the earlier example). The initial value can be given in the declaration statement or in a subsequent assignment statement. The assigned value can be specified with any expression. C# has two kinds of named constants: those defined with const and those defined with readonly. The const named constants, which are implicitly static, are statically bound to values; that is, they are bound to values at compile time, which means those values can be specified only with literals or other const members. The readonly named constants, which are dynamically bound to values, can be assigned in the declaration or with a static constructor.11 So, if a program needs a constant-valued object whose value is the same on every use of the program, a const constant is used. However, if a program needs a constant-valued object whose value is determined only when the object is created and can be different for different executions of the program, then a readonly constant is used. 11. Static constructors in C# run at some indeterminate time before the class is instantiated. The discussion of binding values to named constants naturally leads to the topic of initialization, because binding a value to a named constant is the same process, except it is permanent. In many instances, it is convenient for variables to have values before the code of the program or subprogram in which they are declared begins executing. The binding of a variable to a value at the time it is bound to storage is called initialization. If the variable is statically bound to storage, binding and initialization occur before run time. In these cases, the initial value must be specified as a literal or an expression whose only nonliteral operands are named constants that have already been defined. If the storage binding is dynamic, initialization is also dynamic and the initial values can be any expression. In most languages, initialization is specified on the declaration that creates the variable. For example, in C++, we could have int sum = 0; int* ptrSum = ∑ char name[] = "George Washington Carver"; SUMMARY Case sensitivity and the use of underscores are the design issues for names. Variables can be characterized by the sextuple of attributes: name, address, value, type, lifetime, and scope. Aliases are two or more variables bound to the same storage address. They are regarded as detrimental to reliability but are difficult to eliminate entirely from a language. Binding is the association of attributes with program entities. Knowledge of the binding times of attributes to entities is essential to understanding the semantics of programming languages. Binding can be static or dynamic. Declarations, either explicit or implicit, provide a means of specifying the static binding of variables to types. In general, dynamic binding allows greater flexibility but at the expense of readability, efficiency, and reliability. Scalar variables can be separated into four categories by considering their lifetimes: static, stack dynamic, explicit heap dynamic, and implicit heap dynamic. Static scoping is a central feature of ALGOL 60 and some of its descendants. It provides a simple, reliable, and efficient method of allowing visibility of nonlocal variables in subprograms. Dynamic scoping provides more flexibility than static scoping but, again, at the expense of readability, reliability, and efficiency. Most functional languages allow the user to create local scopes with let constructs, which limit the scope of their defined names. The referencing environment of a statement is the collection of all of the variables that are visible to that statement. Named constants are simply variables that are bound to values only once.