-
Notifications
You must be signed in to change notification settings - Fork 23
/
Overview.text
80 lines (75 loc) · 2.56 KB
/
Overview.text
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
Data types:
Rationale:
The more data types, the more code.
Basic:
character
()
pair (x . y)
Derived:
A string is a list of characters.
A symbol is a string that happens to be on the intern list.
Predicates return one of the symbols t and f.
cond checks for f as false, anything else as true.
Lexical syntax:
No comments. This starts to show how minimal we're getting.
List structure like a subset of Scheme, except:
* Case-sensitive symbols
* A character literal is written \x
Syntax:
program = form*
form = (define name expr)
| (to (name arg*) expr*)
| expr
expr = (quote s-expression)
| (cond (expr expr*)*)
| (symbol expr*)
| symbol
| other-s-expression
Semantics:
The usual Lisp eval, with tail call as jump with arguments.
(to (f x) foo) is like Scheme's (define (f x) foo).
Primitives:
eq?
null? char? pair?
cons car cdr set-car!
write-char read-char peek-char
abort
Implementation:
Bootstrap interpreter in C.
Test driver.
(Does the system still run on it?)
Compiler in Ichbins.
Mechanics:
"Standard library"
Usual utilities
Symbol table -- our one use of assignment.
Explain why we need primitives initially in symbol table.
Reader (no s-expression writer)
Some conventions:
e = expression
k = the remainder of the compiled code, which we'll cons onto
syms = symbol table (a mutable list of symbols)
Collect constants, symbols, defs, top-level expressions.
Re-express constants as calls to primitives.
Re-express refs to them as refs to a new global variable.
Turn top-level expressions into a (main) function.
Compile functions.
Lisp symbols to C identifiers.
How do we manage without arithmetic?
Punt to the C compiler.
Also, bp points to base of stack frame.
Runtime system in C incorporated literally as a string.
Heap, stack, data representation.
Stack-centric code because GC needs to know all the roots.
Mark-and-lazy-sweep garbage collector.
Calls and jumps.
System needs t and f symbols at the bottom of the stack.
Global variables get pushed after.
N.B. it's possible to reference a global before it's defined,
with dire consequences -- an unsound implementation in this
one respect.
Tracebacks are vaguely useful in gdb.
Testing and bootstrapping changes.
Performance.
Usability, or shortage thereof.
The overlapping datatypes make errors less immediate.