Functions
# Functions
The name of the function being called isn’t actually part of the call syntax. The thing being called, the callee, can be any expression that evaluates to a function.
getCallback()();
The first pair of parentheses has getCallback
as its callee. But the second call has the entire getCallback()
expression as its callee. It is the parentheses following an expression that indicate a function call. You can think of a call as sort of like a postfix operator that starts with (
.
Updating our grammar:
|
|
We can say that a function is one that implements an interface:
|
|
Interpreting function calls:
|
|
# Currying
Named after Haskell Curry, the rule uses *
to allow matching a series of calls like fn(1)(2)(3)
. In this style, defining a function that takes multiple arguments is as a series of nested functions. Each function takes one argument and returns a new function. That function consumes the next argument, returns yet another function, and so on.
# Arity
Arity is the fancy term for the number of arguments a function or operation expects. Unary operators have arity one, binary operators two, etc. With functions, the arity is determined by the number of parameters it declares.
# Native Functions
Primitives, external functions, or foreign functions. They are functions that the interpreter exposes to user code but that are implemented in the host language (in our case Java), not the language being implemented (Lox).
They provide access to the fundamental services that all programs are defined in terms of. If you don’t provide native functions to access the file system, a user’s going to have a hell of a time writing a program that reads and displays a file.
Add a new globals environment which will store all the native methods in fixed reference to the global scope:
|
|
# Function declaration
Updated grammar:
|
|
# Function Objects
The implementation of the call method is as follows:
|
|
- Functions should encapsulate its parameters, meaning no code outside the function should be able to see them. Create a new environment with access to global environment.
- Bind the params to the values based in as arguments
- Execute the body of the function in a block
# Call frames
The compiler allocates stack slots for local variables. How should that work when the set of local variables in a program is distributed across multiple functions?
We solved this above, by dynamically allocating memory for an environment each time a function was called or a block entered. In clox, we don’t want that kind of performance cost on every function call.
# Naive Static Option
One option would be to keep them totally separate. Each function would get its own dedicated set of slots in the VM stack that it would own forever, even when the function isn’t being called. Each local variable in the entire program would have a bit of memory in the VM that it keeps to itself.
# Frame Pointers
When a function is called, we don’t know where the top of the stack will be because it can be called from different contexts. But, wherever that top happens to be, we do know where all of the function’s local variables will be relative to that starting point.
|
|
At the beginning of each function call, the VM records the location of the first slot where that function’s own locals begin. The instructions for working with local variables access them by a slot index relative to that, instead of relative to the bottom of the stack like they do today. At compile time, we calculate those relative slots. At runtime, we convert that relative slot to an absolute stack index by adding the function call’s starting slot.
# Return Addresses
For each live function invocation—each call that hasn’t returned yet—we need to track where on the stack that function’s locals begin, and where the caller should resume. Again, thanks to recursion, there may be multiple return addresses for a single function, so this is a property of each invocation and not the function itself.
# Return Statements
In Lox, the body of a function is a list of statements which don’t produce values, so we need dedicated syntax for emitting a result.
|
|
# Closures
Because the interpreter does not keep the environment surrounding a function around, a closure is essentially a data structure which helps to hold onto surrounding variables where the function is declared.
|
|
Here we pass in the current state of the interpreter environment in function declaration semantics:
|
|
# Static Scoping
A variable usage refers to the preceding declaration with the same name in the innermost scope that encloses the expression where the variable is used. It is static scoping because running the program should not affect this.
|
|
“global” should be printed twice, as a
refers to the outermost a
which is the preceding declaration in the innermost scope. Code may not always execute in the textual order with the introduction of functions which can defer it.
A block is not all the same scope
It’s like each var
statement splits the block into two separate scopes, the scope before the variable is declared and the one after, which includes the new variable.
# Persistent Environments
Persistent data structures: unlike the squishy data structures you’re familiar with in imperative programming, a persistent data structure can never be directly modified. Instead, any “modification” to an existing structure produces a brand new object that contains all of the original data and the new modification. The original is left unchanged.
# Semantic Analysis for variable bindings
Where a parser tells only if a program is grammatically correct (a syntactic analysis), semantic analysis goes farther and starts to figure out what pieces of the program actually mean. In this case, our analysis will resolve variable bindings. We’ll know not just that an expression is a variable, but which variable it is.
# Variable resolution pass
After the parser produces the syntax tree, but before the interpreter starts executing it, we’ll do a single walk over the tree to resolve all of the variables it contains.
It walks the tree, visiting each node, but a static analysis is different from a dynamic execution:
- There are no side effects. When the static analysis visits a print statement, it doesn’t actually print anything. Calls to native functions or other operations that reach out to the outside world are stubbed out and have no effect.
- There is no control flow. Loops are visited only once. Both branches are visited in
if
statements. Logic operators are not short-circuited.