CS 440 MP3
Overview
In this machine problem you will be updating the interpreter we started building in class in order to implement a number of new language constructs.
The core language constructs you will be adding include:
- Boolean values: #t and #f
- If-expression: if
- Relational expressions: = and <
You will also add the following "syntactic sugar" forms:
- Subtraction: -
- Boolean expressions: and, or
- Cond-expression: cond
- Relational expressions: <=, >, >=
Finally, you will also be adding the define form, which can be used to define functions using the same syntax as Racket. These functions will be loaded before the REPL is started (or some expression is evaluated). Unlike the functions supported by the interpreter we built in class, define'd functions will support recursion.
Details
Core language additions
As with integer values, the Boolean values #t and #f evaluate to themselves (note that the reader already recognizes Boolean values, and you can match them in the parser using the boolean? predicate).
The if expression is a simplified version of Racket's. It has the following form:
(if BOOL-EXP TRUE-EXP FALSE-EXP)
where BOOL-EXP is a form that evaluates to a Boolean value, and TRUE-EXP and FALSE-EXP can each be any valid form. Its semantics are the same as Racket's.
The relational expressions = and < have the forms:
(= LHS RHS)
(< LHS RHS)
where LHS and RHS are forms that evaluate to integer values. = evaluates to #t if LHS and RHS are equal and #f otherwise. < evaluates to #t if LHS is less than RHS and #f otherwise.
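As a rough illustration of these semantics, the sketch below evaluates only the core forms described in this section. The struct names bool-exp, if-exp, and rel-exp and the function eval-core are hypothetical stand-ins for whatever your parser and eval actually use; variables and functions are left out entirely.

```racket
#lang racket

;; Hypothetical AST nodes; the names in your interpreter may differ.
(struct int-exp (val) #:transparent)
(struct bool-exp (val) #:transparent)
(struct if-exp (c t f) #:transparent)
(struct rel-exp (op lhs rhs) #:transparent)

;; Evaluates only integers, Booleans, if, =, and <.
(define (eval-core expr)
  (match expr
    [(int-exp n) n]
    [(bool-exp b) b]
    [(if-exp c t f)
     ;; Only the selected branch is evaluated.
     (if (eval-core c) (eval-core t) (eval-core f))]
    [(rel-exp "=" lhs rhs) (= (eval-core lhs) (eval-core rhs))]
    [(rel-exp "<" lhs rhs) (< (eval-core lhs) (eval-core rhs))]))

;; Corresponds to (if (< 1 2) 10 20).
(eval-core (if-exp (rel-exp "<" (int-exp 1) (int-exp 2))
                   (int-exp 10)
                   (int-exp 20)))
; => 10
```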
Syntactic sugar additions
Subtraction (-) takes two arguments that evaluate to integer values and evaluates to their difference. It is syntactic sugar for addition of the first value to the negative of the second. I.e.,
(- EXP1 EXP2)
desugars to:
(+ EXP1 (* -1 EXP2))
and takes one or more argument forms, and evaluates to #t if and only if all its arguments evaluate to #t; otherwise it evaluates to #f. It is syntactic sugar for one or more if forms. E.g.,
(and BEXP1 BEXP2 BEXP3)
desugars to:
(if BEXP1 (if BEXP2 (if BEXP3 #t #f) #f) #f)
or takes one or more argument forms, and evaluates to #t if any of its arguments evaluate to #t; otherwise it evaluates to #f. It is syntactic sugar for one or more if forms. E.g.,
(or BEXP1 BEXP2 BEXP3)
desugars to:
(if BEXP1 #t (if BEXP2 #t (if BEXP3 #t #f)))
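The three-argument examples for and and or generalize to any number of arguments by recursion on the argument list. The sketch below shows one way to do this. It works on raw s-expressions rather than the interpreter's AST structs, which is an assumption made to keep it self-contained, and the names desugar-and and desugar-or are hypothetical.

```racket
#lang racket

;; (and B1 ... Bn): nest ifs so that any false argument yields #f.
(define (desugar-and args)
  (match args
    [(list b) `(if ,b #t #f)]
    [(cons b bs) `(if ,b ,(desugar-and bs) #f)]))

;; (or B1 ... Bn): nest ifs so that any true argument yields #t.
(define (desugar-or args)
  (match args
    [(list b) `(if ,b #t #f)]
    [(cons b bs) `(if ,b #t ,(desugar-or bs))]))

(desugar-and '(BEXP1 BEXP2 BEXP3))
; => '(if BEXP1 (if BEXP2 (if BEXP3 #t #f) #f) #f)
(desugar-or '(BEXP1 BEXP2 BEXP3))
; => '(if BEXP1 #t (if BEXP2 #t (if BEXP3 #t #f)))
```

Because the result is built from nested if forms, later arguments are only evaluated when the earlier ones do not already decide the answer.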
cond is a multi-way conditional, similar to Racket's. It is syntactic sugar for one or more if forms. E.g.,
(cond [BEXP1 REXP1]
      [BEXP2 REXP2]
      [BEXP3 REXP3]
      [else REXP4])
desugars to:
(if BEXP1
    REXP1
    (if BEXP2
        REXP2
        (if BEXP3
            REXP3
            REXP4)))
Note that the last "default" case is mandatory, and that else is syntactically required but not an identifier/variable.
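One possible way to express the cond rewrite is to recurse over the clause list and treat else as a literal symbol rather than a variable. The sketch below operates on raw s-expressions, which is an assumption for the sake of a self-contained example, and desugar-cond is a hypothetical name.

```racket
#lang racket

;; clauses is the list of [TEST RESULT] pairs following the cond keyword.
(define (desugar-cond clauses)
  (match clauses
    ;; The mandatory final clause: else is matched as a symbol.
    [(list (list 'else r)) r]
    [(cons (list b r) more) `(if ,b ,r ,(desugar-cond more))]))

(desugar-cond '([BEXP1 REXP1] [BEXP2 REXP2] [BEXP3 REXP3] [else REXP4]))
; => '(if BEXP1 REXP1 (if BEXP2 REXP2 (if BEXP3 REXP3 REXP4)))
```

In this sketch a cond without a final else clause falls through both patterns and raises a match error, which is consistent with the default case being mandatory.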
<=, >, and >= are syntactic sugar for combinations of <, =, and or. E.g.,
(>= LHS RHS)
desugars to:
(or (< RHS LHS) (= LHS RHS))
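Only the >= rewrite is specified above. The sketch below adds plausible rewrites for <= and >, which are assumptions rather than required definitions. It again works on raw s-expressions, and desugar-rel is a hypothetical name.

```racket
#lang racket

;; Rewrites one relational sugar form into <, =, and or.
(define (desugar-rel sexp)
  (match sexp
    [(list '<= lhs rhs) `(or (< ,lhs ,rhs) (= ,lhs ,rhs))] ; assumed rewrite
    [(list '> lhs rhs)  `(< ,rhs ,lhs)]                    ; assumed rewrite
    [(list '>= lhs rhs) `(or (< ,rhs ,lhs) (= ,lhs ,rhs))]))

(desugar-rel '(>= LHS RHS))
; => '(or (< RHS LHS) (= LHS RHS))
(desugar-rel '(> LHS RHS))
; => '(< RHS LHS)
```

The output of the <= and >= cases still contains or, so it needs a further desugaring pass before it consists only of core forms.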
define
The define form will be used to define one or more functions in a separate source file. This source file will be loaded and evaluated to create an environment within which we can either run a REPL or evaluate an expression.
The syntax of define is identical to that of Racket's (though we will not support "rest" parameters or any other options). E.g., below we define a function named sum with two parameters x and y, which returns their sum.
(define (sum x y)
(+ x y))
You will implement the function load-defs, which takes the name of a file and returns an associative list containing all the name → function-value mappings defined in that file. E.g., given a file named "test1.defs" with the following contents:
(define (fn-a x)
(+ x 10))
(define (fn-b x)
(* x 20))
(define (fn-c x)
(fn-a (fn-b x)))
Calling (load-defs "test1.defs") would return the following (nested closures are omitted):
(list
 (cons 'fn-a
       (fun-val 'x
                (arith-exp "+" (var-exp 'x) (int-exp 10))
                '(...)))
 (cons 'fn-b
       (fun-val 'x
                (arith-exp "*" (var-exp 'x) (int-exp 20))
                '(...)))
 (cons 'fn-c
       (fun-val 'x
                (app-exp (var-exp 'fn-a)
                         (app-exp (var-exp 'fn-b) (var-exp 'x)))
                '(...))))
This list is suitable for passing as an initial env argument to eval. I.e., after modifying eval to take an initial environment, we can do:
> (eval (desugar (parse '(fn-c 10)))
(load-defs "test1.defs"))
210
Critically, define will allow us to define recursive functions. Note that our implementations of lambda and function application in class did not support recursion (it's worth taking some time to make sure you understand why not!). After correctly implementing define, however, we can evaluate a definition like:
(define (sum-to n)
  (if (= n 0)
      0
      (+ n
         (sum-to (- n 1)))))
Et voila:
> (eval (desugar (parse '(sum-to 10)))
(load-defs "test2.defs"))
55
This will likely be the toughest part of this machine problem (though it doesn't translate into much code!). The most straightforward implementation does require a new mechanism: a cyclic structure. If you feel up for a challenge and want to figure it out for yourself, check out Immutable Cyclic Data in the Racket documentation.
For more detailed hints, see the "Hint: On implementing recursion" section below.
Implementation and Starter code
All your changes should be made to "mp2.rkt". It is the only source file we will evaluate.
We provide you with the (slightly amended) interpreter that we wrote together in class. We also provide you with the following starter code for load-defs:
(define (load-defs filename)
  (let* ([sexps (file->list filename)]
         [fn-names (map (compose first second) sexps)])
    fn-names))
which reads all the s-expressions (corresponding to define forms) from the named file and returns a list of the function names being defined. You should use it as a starting point.
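To see what the starter code has to work with, it can help to take one define form apart. The sketch below reads forms from a string port instead of a file (an assumption made so that it runs without "test1.defs" on disk) and splits each one into its name, parameter list, and body. The helper split-define is hypothetical and deliberately stops short of building function values.

```racket
#lang racket

;; Stand-in for the contents of a .defs file.
(define src "(define (fn-a x) (+ x 10))\n(define (fn-b x) (* x 20))")

;; port->list with read yields one s-expression per define form,
;; much as file->list does for a file.
(define sexps (port->list read (open-input-string src)))

;; Each form has the shape (define (NAME PARAM ...) BODY).
(define (split-define sexp)
  (match sexp
    [(list 'define (cons name params) body)
     (list name params body)]))

(map split-define sexps)
; => '((fn-a (x) (+ x 10)) (fn-b (x) (* x 20)))

;; The starter code's name extraction, applied to the same forms.
(map (compose first second) sexps)
; => '(fn-a fn-b)
```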
You are free to add new struct definitions, alter existing structs, define new functions, alter existing functions, etc. Just take care that you do not change the APIs of the parse, desugar, eval, load-defs, and repl functions, as we will be testing those directly.
Hint: On implementing recursion
First of all, recall that a "function value" is a structure that contains a function definition (consisting of parameter names and a body) and a closure. A closure represents the environment at the time the function is created, and is in our case just an associative list.
Here's the structure we defined:
(struct fun-val (id body env) #:transparent)
When we apply this function to an argument, we evaluate its body in the environment env, with a new mapping for the parameter and argument.
Here's the relevant bit from eval:
[(app-exp f arg)
 (match-let ([(fun-val id body clenv) (eval f env)]
             [arg-val (eval arg env)])
   (eval body (cons (cons id arg-val) clenv)))]
Now imagine that the body of the function contains a recursive call (i.e., a call to the function itself). Would we be able to locate the value corresponding to the function's own name?
No! The problem is that when we create a closure, we are saving the "outside" environment, but we are not saving the name of the function itself (which should map to the self-same function value). Here's where we create function values from lambdas in eval:
[(lambda-exp id body)
 (fun-val id body env)] ; env is the closure
See how the function value (and closure) doesn't see its own "name"?
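The gap can be reproduced with plain association lists, without running the interpreter at all. In the sketch below the body is kept as a raw s-expression purely for illustration, and outer-env and env-with-f are hypothetical names.

```racket
#lang racket

(struct fun-val (id body env) #:transparent)

;; Environment in effect when the function value is created.
(define outer-env (list (cons 'y 5)))

;; The closure captures outer-env, which has no entry for sum-to.
(define f (fun-val 'n '(sum-to n) outer-env))

;; Binding the name afterwards produces a new, longer environment,
;; but the closure stored inside f is unchanged.
(define env-with-f (cons (cons 'sum-to f) outer-env))

(pair? (assoc 'sum-to env-with-f))   ; => #t, visible to outside callers
(assoc 'sum-to (fun-val-env f))      ; => #f, invisible inside the body
```

Since the body is evaluated in an extension of the captured closure, a lookup of sum-to during the recursive call would fail here.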
To fix this, your implementation will need to create a cyclic structure. Specifically, you want the closure to refer to the function value in which it is contained.
To create a cyclic structure in Racket, we can use the make-placeholder, placeholder-set!, and make-reader-graph functions. Intuitively, make-placeholder creates a bookmark that can later be filled in by placeholder-set!, and make-reader-graph constructs a graph (which, unlike a tree, may contain cycles) based on these bookmarks.
E.g., to create a cyclic list of the infinitely repeating sequence: 1, 2, 3, 1, 2, 3, 1, 2, 3, ..., we can do:
(define inf-list
  (let* ([ph (make-placeholder '())]            ; placeholder with val '()
         [lst (cons 1 (cons 2 (cons 3 ph)))])   ; acyclic list ending with ph
    (placeholder-set! ph lst)                   ; replace ph val with list head
    (make-reader-graph lst)))                   ; read off the resulting cyclic list
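A quick way to convince yourself that the result really is cyclic is to index past the third element and to compare the tail with the head. The sketch below repeats the definition so that it can be run on its own.

```racket
#lang racket

(define inf-list
  (let* ([ph (make-placeholder '())]
         [lst (cons 1 (cons 2 (cons 3 ph)))])
    (placeholder-set! ph lst)
    (make-reader-graph lst)))

;; list-ref only needs a long enough chain of pairs, so it can walk
;; around the cycle.
(for/list ([i (in-range 7)])
  (list-ref inf-list i))
; => '(1 2 3 1 2 3 1)

;; The tail after the third element is the very same pair as the head.
(eq? (cdddr inf-list) inf-list)
; => #t
```

Functions that expect a proper, finite list (length, map, and so on) are not suitable for such a value.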
We can use placeholders with Racket structs to create cyclic structures, too. We just need to mark those structs as "prefab", first. E.g., if we modify our function value struct as follows:
(struct fun-val (id body env) #:prefab)
We can do:
(define cyc-env
  (let* ([ph (make-placeholder '())]
         [env (list (cons 'f (fun-val 'x (int-exp 10) ph)))])
    (placeholder-set! ph env)
    (make-reader-graph env)))
And now we have a closure that refers back to the environment in which its associated function is defined!
Check it out:
> cyc-env
#0=(list (cons 'f (fun-val 'x (int-exp 10) #0#)))
> (fun-val-env (cdr (assoc 'f cyc-env)))
#0=(list (cons 'f (fun-val 'x (int-exp 10) #0#)))
(The #0= and #0# notation is to help us visualize the cyclical structure; those values aren't actually present as data.)
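The cycle can also be checked with eq? instead of relying on the printed form. The sketch below is self-contained, so it repeats the definitions; the int-exp struct here is a minimal stand-in for the one in the interpreter.

```racket
#lang racket

(struct int-exp (val) #:transparent)      ; stand-in for the interpreter's struct
(struct fun-val (id body env) #:prefab)

(define cyc-env
  (let* ([ph (make-placeholder '())]
         [env (list (cons 'f (fun-val 'x (int-exp 10) ph)))])
    (placeholder-set! ph env)
    (make-reader-graph env)))

(define f (cdr (assoc 'f cyc-env)))

;; The closure stored in f is the environment that contains f.
(eq? (fun-val-env f) cyc-env)
; => #t

;; Looking f up through its own closure gives back the same function value.
(eq? (cdr (assoc 'f (fun-val-env f))) f)
; => #t
```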
Testing
We have provided you with test cases in "mp2-test.rkt" and sample definition files in "test1.defs" and "test2.defs". Feel free to add to and alter any and all tests, as we will be using our own test suite to evaluate your work.
Note that passing all the tests does not guarantee full credit! In particular, we will be checking that your desugaring function correctly transforms the syntax of syntactic sugar to core language forms, and that you aren't using any metacircular hacks.