@@ -42,8 +42,6 @@ also open `blk/000` in a modern text editor. | |||
See `/emul/README.md` for getting an emulated system running. | |||
There is also `/notes.txt` for implementation notes. | |||
## Organisation of this repository | |||
* `forth`: Forth is slowly taking over this project (see issue #4). It comes | |||
@@ -1,3 +1,4 @@ | |||
MASTER INDEX | |||
2 Documentation | |||
3 Usage 30 Dictionary | |||
70 Implementation notes |
@@ -1,4 +0,0 @@ | |||
Documentation index | |||
3 Usage | |||
30 Dictionary |
@@ -12,5 +12,5 @@ ALLOT n -- Move HERE by n bytes | |||
C, b -- Write byte b in HERE and advance it. | |||
DELW a -- Delete wordref at a. If it shadows another | |||
definition, that definition is unshadowed. | |||
FORGET x -- Rewind the dictionary (both CURRENT and HERE) up to | |||
x's previous entry. (cont.) | |||
FORGET x -- Rewind the dictionary (both CURRENT and HERE) | |||
up to x's previous entry. (cont.) |
@@ -1,4 +1,5 @@ | |||
Disk | |||
BLK> -- a Address of the current block variable. | |||
LIST n -- Prints the contents of the block n on screen in the | |||
form of 16 lines of 64 columns. |
@@ -0,0 +1,6 @@ | |||
Implementation notes | |||
71 Execution model 73 Executing a word | |||
75 Stack management 77 Dictionary | |||
80 System variables 85 Word routines | |||
89 Initialization sequence |
@@ -0,0 +1,11 @@ | |||
EXECUTION MODEL | |||
After having read a line through readln, we want to interpret | |||
it. As a general rule, we go like this: | |||
1. read single word from line | |||
2. Can we find the word in dict? | |||
3. If yes, execute that word, goto 1 | |||
4. Is it a number? | |||
5. If yes, push that number to PS, goto 1 | |||
6. Error: undefined word. |
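
The six steps above can be sketched in Python as a rough model, not as the actual implementation. The names `interpret_line`, `dictionary`, `ps` and `add` are invented for this illustration: the dictionary is a plain mapping from word names to callables, and PS is a Python list.

```python
def interpret_line(line, dictionary, ps):
    """Interpret one line of input, word by word.

    `dictionary` maps a word name to a callable taking the parameter
    stack; `ps` is a plain list standing in for PS. Both are
    stand-ins invented for this sketch.
    """
    for token in line.split():          # 1. read a single word
        word = dictionary.get(token)    # 2. look it up in the dict
        if word is not None:
            word(ps)                    # 3. found: execute, goto 1
            continue
        try:
            number = int(token)         # 4. is it a number?
        except ValueError:
            # 6. neither a known word nor a number
            raise LookupError(f"undefined word: {token}")
        ps.append(number)               # 5. push it to PS, goto 1
    return ps


def add(ps):
    """Hypothetical native word: pop two numbers, push their sum."""
    b, a = ps.pop(), ps.pop()
    ps.append(a + b)


print(interpret_line("2 3 +", {"+": add}, []))
```

The dictionary lookup comes before number parsing, matching the order of the steps, so a word whose name looks like a number would shadow that number.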
@@ -0,0 +1,16 @@ | |||
EXECUTING A WORD | |||
At its core, executing a word is pushing the wordref on PS and | |||
calling EXECUTE. Then, we let the word do its thing. Some | |||
words are special, but most of them are of the compiledWord | |||
type, and it is their execution that we describe here. | |||
First of all, at all times during execution, the Interpreter | |||
Pointer (IP) points to the wordref we're executing next. | |||
When we execute a compiledWord, the first thing we do is push | |||
IP to the Return Stack (RS). Therefore, RS' top of stack will | |||
contain a wordref to execute next, after we EXIT. | |||
At the end of every compiledWord is an EXIT. This pops RS, sets | |||
IP to it, and continues. |
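
To make the interplay between IP, RS and EXIT concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: a compiledWord is represented as a Python list of words ending with an `EXIT` marker, native words are plain callables, and IP is a `(body, index)` pair rather than a memory address. None of these names come from the real code.

```python
EXIT = object()  # marker standing in for the EXIT word


def run(word, ps):
    """Execute `word` against the parameter stack `ps` (a list)."""
    rs = []        # Return Stack
    ip = None      # Interpreter Pointer; None means "top level"
    current = word
    while True:
        if current is EXIT:
            ip = rs.pop()              # pop RS, set IP to it
        elif callable(current):
            current(ps)                # native word: just run it
        else:
            rs.append(ip)              # compiledWord: push IP to RS
            ip = (current, 0)          # then point IP at its body
        if ip is None:
            return ps                  # nothing left to return to
        body, index = ip
        current = body[index]          # fetch the wordref at IP
        ip = (body, index + 1)         # IP now points to the next one


def dup(ps):
    ps.append(ps[-1])


def mul(ps):
    b, a = ps.pop(), ps.pop()
    ps.append(a * b)


square = [dup, mul, EXIT]
fourth_power = [square, square, EXIT]
print(run(fourth_power, [3]))
```

While the first `square` runs, the top of RS holds the position of the second `square` inside `fourth_power`, which is exactly the "wordref to execute next, after we EXIT".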
@@ -0,0 +1,14 @@ | |||
Stack management | |||
The Parameter stack (PS) is maintained by SP and the Return | |||
stack (RS) is maintained by IX. This allows us to generally use | |||
push and pop freely because PS is the most frequently used. | |||
However, this causes a problem with routine calls: because in | |||
Forth, the stack isn't balanced within each call, our return | |||
offset, when placed by a CALL, messes everything up. This is | |||
one of the reasons why we need stack management routines below. | |||
IX always points to RS' Top Of Stack (TOS). | |||
The return stack contains "Interpreter pointers", that is, a | |||
pointer to the address of a word, as seen in a compiled list of | |||
words. |
@@ -0,0 +1,16 @@ | |||
Dictionary | |||
A dictionary entry has this structure: | |||
- Xb name. Arbitrarily long number of characters (but can't be | |||
bigger than input buffer, of course). Not null-terminated. | |||
- 2b prev offset | |||
- 1b size + IMMEDIATE flag | |||
- 2b code pointer | |||
- Parameter field (PF) | |||
The prev offset is the number of bytes between the prev field | |||
and the previous word's code pointer. | |||
The size + flag byte holds the size of the name field, with | |||
the 7th bit being the IMMEDIATE flag. (cont.) |
@@ -0,0 +1,10 @@ | |||
(cont.) The code pointer points to "word routines". These | |||
routines expect to be called with IY pointing to the PF. They | |||
themselves are expected to end by jumping to the address at | |||
(IP). They will usually do so with "jp next". | |||
That's for "regular" words (words that are part of the dict | |||
chain). There are also "special words", for example NUMBER, | |||
LIT, FBR, that have a slightly different structure. They're | |||
also a pointer to an executable, but as for the other fields, | |||
the only one they have is the "flags" field. |
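
The entry layout for regular words can be exercised with a small Python model. This is a sketch resting on several assumptions that the notes do not spell out: multi-byte fields are little-endian (as is usual on the Z80), "7th bit" means the top bit (0x80), a wordref is the address of the code pointer field, and a prev offset of 0 marks the end of the chain. The helper names are invented.

```python
IMMEDIATE = 0x80  # assumption: "7th bit" read as the top bit


def make_entry(mem, name, prev_wordref, code, immediate=False):
    """Append one entry to `mem` (a bytearray); return its wordref."""
    mem.extend(name.encode("ascii"))               # Xb name
    prev_field = len(mem)
    # Distance from the prev field back to the previous code pointer.
    offset = 0 if prev_wordref is None else prev_field - prev_wordref
    mem.extend(offset.to_bytes(2, "little"))       # 2b prev offset
    mem.append(len(name) | (IMMEDIATE if immediate else 0))  # 1b size+flag
    wordref = len(mem)
    mem.extend(code.to_bytes(2, "little"))         # 2b code pointer
    return wordref


def read_entry(mem, wordref):
    """Return (name, is_immediate, previous wordref or None)."""
    flags = mem[wordref - 1]
    size = flags & 0x7F
    prev_field = wordref - 3
    offset = int.from_bytes(mem[prev_field:prev_field + 2], "little")
    name = mem[prev_field - size:prev_field].decode("ascii")
    prev = prev_field - offset if offset else None
    return name, bool(flags & IMMEDIATE), prev


def find(mem, latest, name):
    """Walk the chain backwards from `latest` looking for `name`."""
    wordref = latest
    while wordref is not None:
        entry_name, _, prev = read_entry(mem, wordref)
        if entry_name == name:
            return wordref
        wordref = prev
    return None


mem = bytearray()
w_dup = make_entry(mem, "DUP", None, 0x17)
w_if = make_entry(mem, "IF", w_dup, 0x0E, immediate=True)
print(read_entry(mem, w_if), find(mem, w_if, "DUP") == w_dup)
```

Because the name sits before the fixed-size fields, everything is located by counting backwards from the wordref, which is why the size byte is needed to find where the name starts.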
@@ -0,0 +1,16 @@ | |||
System variables | |||
There are some core variables in the core system that are | |||
referred to directly by their address in memory throughout the | |||
code. The place where they live is configurable by the RAMSTART | |||
constant in conf.fs, but their relative offset is not. In fact, | |||
they're mostly referred to directly as their numerical offset | |||
along with a comment indicating what this offset refers to. | |||
This system is a bit fragile because every time we change those | |||
offsets, we have to be careful to adjust all system variables | |||
offsets, but thankfully, there aren't many system variables. | |||
Here's a list of them: | |||
(cont.) |
@@ -0,0 +1,16 @@ | |||
(cont.) | |||
RAMSTART INITIAL_SP +53 readln's variables | |||
+02 CURRENT +55 adev's variables | |||
+04 HERE +57 blk's variables | |||
+06 IP +59 z80a's variables | |||
+08 FLAGS +5b FUTURE USES | |||
+0a PARSEPTR +70 DRIVERS | |||
+0c CINPTR +80 RAMEND | |||
+0e WORDBUF | |||
+2e BOOT C< PTR | |||
+4e INTJUMP | |||
+51 CURRENTPTR | |||
(cont.) |
@@ -0,0 +1,16 @@ | |||
(cont.) INITIAL_SP holds the initial Stack Pointer value so | |||
that we know where to reset it on ABORT | |||
CURRENT points to the last dict entry. | |||
HERE points to current write offset. | |||
IP is the Interpreter Pointer | |||
FLAGS holds global flags. Only used for prompt output control | |||
for now. | |||
PARSEPTR holds routine address called on (parse) | |||
CINPTR holds routine address called on C< | |||
(cont.) |
@@ -0,0 +1,16 @@ | |||
(cont.) WORDBUF is the buffer used by WORD | |||
BOOT C< PTR is used when Forth boots from in-memory | |||
source. See "Initialization sequence" below. | |||
INTJUMP All RST offsets (well, not *all* at this moment, I | |||
still have to free those slots...) in boot binaries are made to | |||
jump to this address. If you use one of those slots for an | |||
interrupt, write a jump to the appropriate offset in that RAM | |||
location. | |||
CURRENTPTR points to current CURRENT. The Forth CURRENT word | |||
doesn't return RAM+2 directly, but rather the value at this | |||
address. Most of the time, it points to RAM+2, but sometimes, | |||
when maintaining alternative dicts (during cross compilation | |||
for example), it can point elsewhere. (cont.) |
@@ -0,0 +1,6 @@ | |||
(cont.) FUTURE USES section is unused for now. | |||
DRIVERS section is reserved for recipe-specific | |||
drivers. Here is a list of known usages: | |||
* 0x70-0x78: ACIA buffer pointers in RC2014 recipes. |
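
Since system variables are addressed as RAMSTART plus a fixed offset, the table can be turned into a small lookup. The Python sketch below only restates the offsets listed above; the RAMSTART value `0x8000` and the helper names are hypothetical, chosen for illustration.

```python
# Offsets from RAMSTART, in hexadecimal, copied from the table above.
SYSVAR_OFFSETS = {
    "INITIAL_SP": 0x00,
    "CURRENT": 0x02,
    "HERE": 0x04,
    "IP": 0x06,
    "FLAGS": 0x08,
    "PARSEPTR": 0x0A,
    "CINPTR": 0x0C,
    "WORDBUF": 0x0E,
    "BOOT C< PTR": 0x2E,
    "INTJUMP": 0x4E,
    "CURRENTPTR": 0x51,
    "readln": 0x53,
    "adev": 0x55,
    "blk": 0x57,
    "z80a": 0x59,
    "FUTURE USES": 0x5B,
    "DRIVERS": 0x70,
    "RAMEND": 0x80,
}


def sysvar_addr(ramstart, name):
    """Absolute address of a system variable for a given RAMSTART."""
    return ramstart + SYSVAR_OFFSETS[name]


def sysvar_size(name):
    """Bytes available to `name`: distance to the next offset."""
    ordered = sorted(SYSVAR_OFFSETS.values())
    start = SYSVAR_OFFSETS[name]
    return ordered[ordered.index(start) + 1] - start


RAMSTART = 0x8000  # hypothetical value, configured in conf.fs
print(hex(sysvar_addr(RAMSTART, "HERE")), sysvar_size("WORDBUF"))
```

Keeping the offsets in one table like this is the opposite of what the notes describe (raw numeric offsets plus comments), which is why the notes call the current scheme fragile.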
@@ -0,0 +1,16 @@ | |||
Word routines | |||
This is the description of all word routines you can encounter | |||
in this Forth implementation. That is, a wordref will always | |||
point to a memory offset containing one of these numbers. | |||
0x17: nativeWord. This word's PFA contains native binary code | |||
and is jumped to directly. | |||
0x0e: compiledWord. This word's PFA contains an atom list and | |||
its execution is described in "EXECUTION MODEL" above. | |||
0x0b: cellWord. This word is usually followed by a 2-byte value | |||
in its PFA. Upon execution, the *address* of the PFA is pushed | |||
to PS. | |||
(cont.) |
@@ -0,0 +1,16 @@ | |||
(cont.) | |||
0x2b: doesWord. This word is created by "DOES>" and is followed | |||
by a 2-byte value as well as the address where "DOES>" was | |||
compiled. At that address is an atom list exactly like in a | |||
compiled word. Upon execution, after having pushed its cell | |||
addr to PSP, it executes its reference exactly like a | |||
compiledWord. | |||
0x20: numberWord. No word is actually compiled with this | |||
routine, but atoms are. Atoms with a reference to the number | |||
words routine are followed, *in the atom list*, by a 2-byte | |||
number. Upon execution, that number is fetched and IP is | |||
advanced by an extra 2 bytes. | |||
0x24: addrWord. Exactly like a numberWord, except that it is | |||
treated differently by meta-tools. (cont.) |
@@ -0,0 +1,6 @@ | |||
(cont.) | |||
0x22: litWord. Similar to a number word, except that instead of | |||
being followed by a 2 byte number, it is followed by a | |||
null-terminated string. Upon execution, the address of that | |||
null-terminated string is pushed on the PSP and IP is advanced | |||
to the address following the null. |
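
The way numberWord, addrWord and litWord atoms consume inline data from the atom list can be illustrated with a short Python sketch. It simplifies heavily: each 2-byte atom is taken to hold the routine number directly (in the real system an atom is a wordref that points to a location holding that number), values are assumed little-endian, and `run_atoms` is an invented name.

```python
NUMBER_WORD = 0x20
LIT_WORD = 0x22
ADDR_WORD = 0x24


def run_atoms(mem, ip, end, ps):
    """Execute the atoms in mem[ip:end], pushing results onto `ps`."""
    while ip < end:
        atom = int.from_bytes(mem[ip:ip + 2], "little")
        ip += 2
        if atom in (NUMBER_WORD, ADDR_WORD):
            # The 2-byte number follows in the atom list itself.
            ps.append(int.from_bytes(mem[ip:ip + 2], "little"))
            ip += 2                     # skip the inline number
        elif atom == LIT_WORD:
            ps.append(ip)               # address of the string
            ip = mem.index(0, ip) + 1   # continue after the null
        else:
            raise ValueError(f"unhandled atom {atom:#04x} in this sketch")
    return ps


mem = (
    bytes([0x20, 0x00, 0x2A, 0x00])    # numberWord 42
    + bytes([0x22, 0x00]) + b"hi\x00"  # litWord "hi"
    + bytes([0x20, 0x00, 0x07, 0x00])  # numberWord 7
)
print(run_atoms(mem, 0, len(mem), []))
```

In this model the only difference between numberWord and addrWord is the routine number, which matches the note that addrWord is merely treated differently by meta-tools.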
@@ -0,0 +1,16 @@ | |||
Initialization sequence | |||
On boot, we jump to the "main" routine in boot.fs which does | |||
very few things. | |||
1. Sets SP to 0x10000-6. | |||
2. Sets HERE to RAMEND (RAMSTART+0x80). | |||
3. Sets CURRENT to value of LATEST field in stable ABI. | |||
4. Looks for the word "BOOT" and calls it. | |||
In a normal system, BOOT is in icore and does a few things: | |||
1. Find "(parse)" and set "(parse*)" to it. | |||
2. Find "(c<)" and set CINPTR to it (what C< calls). | |||
3. Write LATEST in SYSTEM SCRATCHPAD ( see below ) | |||
4. Find "INIT". If found, execute. Otherwise, "INTERPRET"(cont) |
@@ -0,0 +1,16 @@ | |||
(cont.) On a bare system (only boot+icore), this sequence will | |||
result in "(parse)" reading only decimals and (c<) reading | |||
characters from memory starting from CURRENT (this is why we | |||
put CURRENT in SYSTEM SCRATCHPAD, it tracks current pos ). | |||
This means that you can put initialization code in source form | |||
right into your binary, right after your last compiled dict | |||
entry and it's going to be executed as such until you set a new | |||
(c<). | |||
Note that there is no EMIT in a bare system. You have to take | |||
care of supplying one before you load core.fs and its higher | |||
levels. | |||
(cont.) |
@@ -0,0 +1,7 @@ | |||
(cont.) In the "/emul" binaries, "HERE" is readjusted to | |||
"CURRENT @" so that we don't have to relocate compiled dicts. | |||
Note that in this context, the initialization code is fighting | |||
for space with HERE: New entries to the dict will overwrite | |||
that code! Also, because we're barebone, we can't have | |||
comments. This can lead to peculiar code in this area where we | |||
try to "waste" space in initialization code. |
@@ -20,11 +20,11 @@ BLKPACK = ../tools/blkpack | |||
.PHONY: all | |||
all: $(TARGETS) | |||
$(STRIPFC): | |||
$(SLATEST): | |||
$(BIN2C): | |||
$(BLKPACK): | |||
$(MAKE) -C ../tools | |||
$(STRIPFC): $(BLKPACK) | |||
$(SLATEST): $(BLKPACK) | |||
$(BIN2C): $(BLKPACK) | |||
# z80c.bin is not in the prerequisites because it's a bootstrap | |||
# binary that should be updated manually through make updatebootstrap. | |||
@@ -77,5 +77,5 @@ updatebootstrap: forth/stage2 | |||
.PHONY: clean | |||
clean: | |||
rm -f $(TARGETS) emul.o forth/*-bin.h forth/forth?.bin | |||
rm -f $(TARGETS) emul.o forth/*-bin.h forth/forth?.bin blkfs | |||
$(MAKE) -C ../tools clean |
@@ -1,203 +0,0 @@ | |||
Collapse OS' Forth implementation notes | |||
*** EXECUTION MODEL | |||
After having read a line through readln, we want to interpret it. As a general | |||
rule, we go like this: | |||
1. read single word from line | |||
2. Can we find the word in dict? | |||
3. If yes, execute that word, goto 1 | |||
4. Is it a number? | |||
5. If yes, push that number to PS, goto 1 | |||
6. Error: undefined word. | |||
*** EXECUTING A WORD | |||
At it's core, executing a word is pushing the wordref on PS and calling EXECUTE. | |||
Then, we let the word do its things. Some words are special, but most of them | |||
are of the compiledWord type, and that's their execution that we describe here. | |||
First of all, at all time during execution, the Interpreter Pointer (IP) points | |||
to the wordref we're executing next. | |||
When we execute a compiledWord, the first thing we do is push IP to the Return | |||
Stack (RS). Therefore, RS' top of stack will contain a wordref to execute next, | |||
after we EXIT. | |||
At the end of every compiledWord is an EXIT. This pops RS, sets IP to it, and | |||
continues. | |||
*** Stack management | |||
The Parameter stack (PS) is maintained by SP and the Return stack (RS) is | |||
maintained by IX. This allows us to generally use push and pop freely because PS | |||
is the most frequently used. However, this causes a problem with routine calls: | |||
because in Forth, the stack isn't balanced within each call, our return offset, | |||
when placed by a CALL, messes everything up. This is one of the reasons why we | |||
need stack management routines below. IX always points to RS' Top Of Stack (TOS) | |||
This return stack contain "Interpreter pointers", that is a pointer to the | |||
address of a word, as seen in a compiled list of words. | |||
*** Dictionary | |||
A dictionary entry has this structure: | |||
- Xb name. Arbitrary long number of character (but can't be bigger than | |||
input buffer, of course). not null-terminated | |||
- 2b prev offset | |||
- 1b size + IMMEDIATE flag | |||
- 2b code pointer | |||
- Parameter field (PF) | |||
The prev offset is the number of bytes between the prev field and the previous | |||
word's code pointer. | |||
The size + flag indicate the size of the name field, with the 7th bit being the | |||
IMMEDIATE flag. | |||
The code pointer point to "word routines". These routines expect to be called | |||
with IY pointing to the PF. They themselves are expected to end by jumping to | |||
the address at (IP). They will usually do so with "jp next". | |||
That's for "regular" words (words that are part of the dict chain). There are | |||
also "special words", for example NUMBER, LIT, FBR, that have a slightly | |||
different structure. They're also a pointer to an executable, but as for the | |||
other fields, the only one they have is the "flags" field. | |||
*** System variables | |||
There are some core variables in the core system that are referred to directly | |||
by their address in memory throughout the code. The place where they live is | |||
configurable by the RAMSTART constant in conf.fs, but their relative offset is | |||
not. In fact, they're mostly referred to directly as their numerical offset | |||
along with a comment indicating what this offset refers to. | |||
This system is a bit fragile because every time we change those offsets, we | |||
have to be careful to adjust all system variables offsets, but thankfully, | |||
there aren't many system variables. Here's a list of them: | |||
RAMSTART INITIAL_SP | |||
+02 CURRENT | |||
+04 HERE | |||
+06 IP | |||
+08 FLAGS | |||
+0a PARSEPTR | |||
+0c CINPTR | |||
+0e WORDBUF | |||
+2e BOOT C< PTR | |||
+4e INTJUMP | |||
+51 CURRENTPTR | |||
+53 readln's variables | |||
+55 adev's variables | |||
+57 blk's variables | |||
+59 z80a's variables | |||
+5b FUTURE USES | |||
+70 DRIVERS | |||
+80 RAMEND | |||
INITIAL_SP holds the initial Stack Pointer value so that we know where to reset | |||
it on ABORT | |||
CURRENT points to the last dict entry. | |||
HERE points to current write offset. | |||
IP is the Interpreter Pointer | |||
FLAGS holds global flags. Only used for prompt output control for now. | |||
PARSEPTR holds routine address called on (parse) | |||
CINPTR holds routine address called on C< | |||
WORDBUF is the buffer used by WORD | |||
BOOT C< PTR is used when Forth boots from in-memory source. See "Initialization | |||
sequence" below. | |||
INTJUMP All RST offsets (well, not *all* at this moment, I still have to free | |||
those slots...) in boot binaries are made to jump to this address. If you use | |||
one of those slots for an interrupt, write a jump to the appropriate offset in | |||
that RAM location. | |||
CURRENTPTR points to current CURRENT. The Forth CURRENT word doesn't return | |||
RAM+2 directly, but rather the value at this address. Most of the time, it | |||
points to RAM+2, but sometimes, when maintaining alternative dicts (during | |||
cross compilation for example), it can point elsewhere. | |||
FUTURE USES section is unused for now. | |||
DRIVERS section is reserved for recipe-specific drivers. Here is a list of | |||
known usages: | |||
* 0x70-0x78: ACIA buffer pointers in RC2014 recipes. | |||
*** Word routines | |||
This is the description of all word routine you can encounter in this Forth | |||
implementation. That is, a wordref will always point to a memory offset | |||
containing one of these numbers. | |||
0x17: nativeWord. This words PFA contains native binary code and is jumped to | |||
directly. | |||
0x0e: compiledWord. This word's PFA contains an atom list and its execution is | |||
described in "EXECUTION MODEL" above. | |||
0x0b: cellWord. This word is usually followed by a 2-byte value in its PFA. | |||
Upon execution, the *address* of the PFA is pushed to PS. | |||
0x2b: doesWord. This word is created by "DOES>" and is followed by a 2-byte | |||
value as well as the adress where "DOES>" was compiled. At that address is an | |||
atom list exactly like in a compiled word. Upon execution, after having pushed | |||
its cell addr to PSP, it execute its reference exactly like a compiledWord. | |||
0x20: numberWord. No word is actually compiled with this routine, but atoms are. | |||
Atoms with a reference to the number words routine are followed, *in the atom | |||
list*, of a 2-byte number. Upon execution, that number is fetched and IP is | |||
avdanced by an extra 2 bytes. | |||
0x24: addrWord. Exactly like a numberWord, except that it is treated | |||
differently by meta-tools. | |||
0x22: litWord. Similar to a number word, except that instead of being followed | |||
by a 2 byte number, it is followed by a null-terminated string. Upon execution, | |||
the address of that null-terminated string is pushed on the PSP and IP is | |||
advanced to the address following the null. | |||
*** Initialization sequence | |||
On boot, we jump to the "main" routine in boot.fs which does very few things. | |||
1. Set SP to 0x10000-6 | |||
2. Sets HERE to RAMEND (RAMSTART+0x80). | |||
3. Sets CURRENT to value of LATEST field in stable ABI. | |||
4. Look for the word "BOOT" and calls it. | |||
In a normal system, BOOT is in icore and does a few things: | |||
1. Find "(parse)" and set "(parse*)" to it. | |||
2. Find "(c<)" a set CINPTR to it (what C< calls). | |||
3. Write LATEST in SYSTEM SCRATCHPAD ( see below ) | |||
4. Find "INIT". If found, execute. Otherwise, execute "INTERPRET" | |||
On a bare system (only boot+icore), this sequence will result in "(parse)" | |||
reading only decimals and (c<) reading characters from memory starting from | |||
CURRENT (this is why we put CURRENT in SYSTEM SCRATCHPAD, it tracks current | |||
pos ). | |||
This means that you can put initialization code in source form right into your | |||
binary, right after your last compiled dict entry and it's going to be executed | |||
as such until you set a new (c<). | |||
Note that there is no EMIT in a bare system. You have to take care of supplying | |||
one before your load core.fs and its higher levels. | |||
In the "/emul" binaries, "HERE" is readjusted to "CURRENT @" so that we don't | |||
have to relocate compiled dicts. Note that in this context, the initialization | |||
code is fighting for space with HERE: New entries to the dict will overwrite | |||
that code! Also, because we're barebone, we can't have comments. This can lead | |||
to peculiar code in this area where we try to "waste" space in initialization | |||
code. |