|
- Collapse OS' Forth implementation notes
-
- *** EXECUTION MODEL
-
- After having read a line through readln, we want to interpret it. As a general
- rule, we go like this:
-
- 1. read single word from line
- 2. Can we find the word in dict?
- 3. If yes, execute that word, goto 1
- 4. Is it a number?
- 5. If yes, push that number to PS, goto 1
- 6. Error: undefined word.
-
- *** EXECUTING A WORD
-
- At it's core, executing a word is pushing the wordref on PS and calling EXECUTE.
- Then, we let the word do its things. Some words are special, but most of them
- are of the compiledWord type, and that's their execution that we describe here.
-
- First of all, at all time during execution, the Interpreter Pointer (IP) points
- to the wordref we're executing next.
-
- When we execute a compiledWord, the first thing we do is push IP to the Return
- Stack (RS). Therefore, RS' top of stack will contain a wordref to execute next,
- after we EXIT.
-
- At the end of every compiledWord is an EXIT. This pops RS, sets IP to it, and
- continues.
-
- *** Stack management
-
- The Parameter stack (PS) is maintained by SP and the Return stack (RS) is
- maintained by IX. This allows us to generally use push and pop freely because PS
- is the most frequently used. However, this causes a problem with routine calls:
- because in Forth, the stack isn't balanced within each call, our return offset,
- when placed by a CALL, messes everything up. This is one of the reasons why we
- need stack management routines below. IX always points to RS' Top Of Stack (TOS)
-
- This return stack contain "Interpreter pointers", that is a pointer to the
- address of a word, as seen in a compiled list of words.
-
- *** Dictionary
-
- A dictionary entry has this structure:
-
- - Xb name. Arbitrary long number of character (but can't be bigger than
- input buffer, of course). not null-terminated
- - 2b prev offset
- - 1b size + IMMEDIATE flag
- - 2b code pointer
- - Parameter field (PF)
-
- The prev offset is the number of bytes between the prev field and the previous
- word's code pointer.
-
- The size + flag indicate the size of the name field, with the 7th bit being the
- IMMEDIATE flag.
-
- The code pointer point to "word routines". These routines expect to be called
- with IY pointing to the PF. They themselves are expected to end by jumping to
- the address at (IP). They will usually do so with "jp next".
-
- That's for "regular" words (words that are part of the dict chain). There are
- also "special words", for example NUMBER, LIT, FBR, that have a slightly
- different structure. They're also a pointer to an executable, but as for the
- other fields, the only one they have is the "flags" field.
-
- *** System variables
-
- There are some core variables in the core system that are referred to directly
- by their address in memory throughout the code. The place where they live is
- configurable by the RAMSTART constant in conf.fs, but their relative offset is
- not. In fact, they're mostlly referred to directly as their numerical offset
- along with a comment indicating what this offset refers to.
-
- This system is a bit fragile because every time we change those offsets, we
- have to be careful to adjust all system variables offsets, but thankfully,
- there aren't many system variables. Here's a list of them:
-
- RAMSTART INITIAL_SP
- +02 CURRENT
- +04 HERE
- +06 IP
- +08 FLAGS
- +0a PARSEPTR
- +0c CINPTR
- +0e WORDBUF
- +2e SYSVNXT
- +4e INTJUMP
- +51 SYSTEM SCRATCHPAD
- +60 RAMEND
-
- INITIAL_SP holds the initial Stack Pointer value so that we know where to reset
- it on ABORT
-
- CURRENT points to the last dict entry.
-
- HERE points to current write offset.
-
- IP is the Interpreter Pointer
-
- FLAGS holds global flags. Only used for prompt output control for now.
-
- PARSEPTR holds routine address called on (parse)
-
- CINPTR holds routine address called on C<
-
- WORDBUF is the buffer used by WORD
-
- SYSVNXT is the buffer+tracker used by (sysv)
-
- INTJUMP All RST offsets (well, not *all* at this moment, I still have to free
- those slots...) in boot binaries are made to jump to this address. If you use
- one of those slots for an interrupt, write a jump to the appropriate offset in
- that RAM location.
-
- SYSTEM SCRATCHPAD is reserved for temporary system storage or can be reserved
- by low-level drivers. These are the current usages of this space throughout the
- project:
-
- * 0x51-0x53: (c<) pointer during in-memory initialization (see below)
- * 0x53-0x5b: ACIA buffer pointers in RC2014 recipes.
-
- *** Initialization sequence
-
- On boot, we jump to the "main" routine in boot.fs which does very few things.
- It sets up the SP register, CURRENT and HERE to LATEST (saved in stable ABI),
- then look for the BOOT word and calls it.
-
- In a normal system, BOOT is in icore and does a few things:
-
- 1. Find "(parse)" and set "(parse*)" to it.
- 2. Find "(c<)" a set CINPTR to it (what C< calls).
- 3. Write LATEST in SYSTEM SCRATCHPAD ( see below )
- 4. Find "INIT". If found, execute. Otherwise, execute "INTERPRET"
-
- On a bare system (only boot+icore), this sequence will result in "(parse)"
- reading only decimals and (c<) reading characters from memory starting from
- CURRENT (this is why we put CURRENT in SYSTEM SCRATCHPAD, it tracks current
- pos ).
-
- This means that you can put initialization code in source form right into your
- binary, right after your last compiled dict entry and it's going to be executed
- as such until you set a new (c<).
-
- Note that there is no EMIT in a bare system. You have to take care of supplying
- one before your load core.fs and its higher levels.
-
- Also note that this initialization code is fighting for space with HERE: New
- entries to the dict will overwrite that code! Also, because we're barebone, we
- can't have comments. This leads to peculiar code in this area. If you see weird
- whitespace usage, it's probably because not using those whitespace would result
- in dict entry creation overwriting the code before it has the chance to be
- interpreted.
|