Mirror of CollapseOS

  1. Collapse OS' Forth implementation notes
  2. *** EXECUTION MODEL
  3. After having read a line through readln, we want to interpret it. As a general
  4. rule, we go like this:
  5. 1. read single word from line
  6. 2. Can we find the word in dict?
  7. 3. If yes, execute that word, goto 1
  8. 4. Is it a number?
  9. 5. If yes, push that number to PS, goto 1
  10. 6. Error: undefined word.
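The loop above can be sketched in a few lines of Python. This is only a model of the six steps, not the actual implementation: the dictionary is assumed to be a plain Python dict mapping names to callables, and `interpret_line`, `add` and `dup` are hypothetical names chosen for the illustration.

```python
def interpret_line(line, dictionary, ps):
    """Interpret one input line against `dictionary`, using `ps` as the PS."""
    for word in line.split():            # 1. read a single word from the line
        if word in dictionary:           # 2. can we find the word in the dict?
            dictionary[word](ps)         # 3. yes: execute it, then go to 1
            continue
        try:
            number = int(word)           # 4. is it a number?
        except ValueError:
            # 6. neither a word nor a number
            raise LookupError(f"undefined word: {word}")
        ps.append(number)                # 5. yes: push it to PS, go to 1
    return ps


def add(ps):
    b, a = ps.pop(), ps.pop()
    ps.append(a + b)


def dup(ps):
    ps.append(ps[-1])


words = {"+": add, "DUP": dup}
print(interpret_line("2 3 + DUP +", words, []))  # [10]
```

Note that the dictionary lookup comes first, so a word named like a number would shadow that number.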
  11. *** EXECUTING A WORD
  12. At its core, executing a word is pushing the wordref on PS and calling EXECUTE.
  13. Then, we let the word do its thing. Some words are special, but most of them
  14. are of the compiledWord type, and it is their execution that we describe here.
  15. First of all, at all times during execution, the Interpreter Pointer (IP) points
  16. to the wordref we're executing next.
  17. When we execute a compiledWord, the first thing we do is push IP to the Return
  18. Stack (RS). Therefore, RS' top of stack will contain a wordref to execute next,
  19. after we EXIT.
  20. At the end of every compiledWord is an EXIT. This pops RS, sets IP to it, and
  21. continues.
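Here is a minimal Python model of that mechanism, under loud assumptions: atom lists are Python lists, IP is a (list, index) pair rather than a memory address, native words are Python callables, and `VM`, `execute`, `SQUARE` and `FOURTH` are hypothetical names. It only illustrates how IP and RS cooperate around a compiledWord and its EXIT.

```python
EXIT = object()  # sentinel standing in for the EXIT word


class VM:
    def __init__(self):
        self.ps = []    # Parameter Stack
        self.rs = []    # Return Stack
        self.ip = None  # (atom list, index) of the wordref to execute next


def execute(vm, word):
    """Run `word` until control returns to the caller of `execute`."""
    vm.ip = None
    while word is not None:
        if word is EXIT:
            vm.ip = vm.rs.pop()       # EXIT: pop RS and set IP to it
        elif callable(word):
            word(vm)                  # native word: just run it
        else:
            vm.rs.append(vm.ip)       # compiledWord: push IP to RS...
            vm.ip = (word, 0)         # ...and point IP at its first atom
        # "next": fetch the wordref at IP and advance IP past it
        if vm.ip is None:
            word = None
        else:
            atoms, index = vm.ip
            word = atoms[index]
            vm.ip = (atoms, index + 1)


def dup(vm):
    vm.ps.append(vm.ps[-1])


def mul(vm):
    b, a = vm.ps.pop(), vm.ps.pop()
    vm.ps.append(a * b)


SQUARE = [dup, mul, EXIT]          # : SQUARE DUP * ;
FOURTH = [SQUARE, SQUARE, EXIT]    # : FOURTH SQUARE SQUARE ;

vm = VM()
vm.ps.append(3)
execute(vm, FOURTH)
print(vm.ps)  # [81]
```

When FOURTH calls SQUARE, the IP pushed to RS already points at the atom after the call, which is why EXIT can simply restore it and continue.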
  22. *** Stack management
  23. The Parameter stack (PS) is maintained by SP and the Return stack (RS) is
  24. maintained by IX. This allows us to generally use push and pop freely because PS
  25. is the most frequently used. However, this causes a problem with routine calls:
  26. because in Forth, the stack isn't balanced within each call, our return offset,
  27. when placed by a CALL, messes everything up. This is one of the reasons why we
  28. need stack management routines below. IX always points to RS' Top Of Stack (TOS).
  29. This return stack contains "Interpreter pointers", that is, pointers to the
  30. address of a word, as seen in a compiled list of words.
  31. *** Dictionary
  32. A dictionary entry has this structure:
  33. - Xb name. Arbitrarily long number of characters (but can't be bigger than
  34. input buffer, of course). Not null-terminated.
  35. - 2b prev offset
  36. - 1b size + IMMEDIATE flag
  37. - 2b code pointer
  38. - Parameter field (PF)
  39. The prev offset is the number of bytes between the prev field and the previous
  40. word's code pointer.
  41. The size + flag byte indicates the size of the name field, with the 7th bit being the
  42. IMMEDIATE flag.
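To make the layout concrete, here is a sketch that writes two entries into a byte buffer and walks the chain backwards. Several things are assumptions not stated above: 2-byte fields are little-endian (as is usual on the Z80), "7th bit" is taken to mean mask 0x80, a prev offset of 0 is taken to mean "end of chain", and a wordref is taken to be the address of the code pointer. `append_entry`, `read_entry` and `find` are hypothetical helper names.

```python
import struct

IMMEDIATE = 0x80  # assumption: "7th bit" is the top bit of the size byte


def append_entry(mem, prev_wordref, name, code_ptr, immediate=False):
    """Append an entry to `mem`; return its wordref (address of the code pointer)."""
    mem.extend(name.encode("ascii"))                 # Xb name, not null-terminated
    prev_field = len(mem)
    # prev offset: bytes between this prev field and the previous code pointer
    offset = 0 if prev_wordref is None else prev_field - prev_wordref
    mem.extend(struct.pack("<H", offset))            # 2b prev offset
    mem.append(len(name) | (IMMEDIATE if immediate else 0))  # 1b size + flag
    wordref = len(mem)
    mem.extend(struct.pack("<H", code_ptr))          # 2b code pointer
    return wordref


def read_entry(mem, wordref):
    """Return (name, is_immediate, code_ptr, prev_wordref) for `wordref`."""
    flags = mem[wordref - 1]
    size = flags & 0x7F
    prev_field = wordref - 3
    (offset,) = struct.unpack_from("<H", mem, prev_field)
    name = bytes(mem[prev_field - size:prev_field]).decode("ascii")
    (code_ptr,) = struct.unpack_from("<H", mem, wordref)
    prev = prev_field - offset if offset else None
    return name, bool(flags & IMMEDIATE), code_ptr, prev


def find(mem, current, name):
    """Walk the chain from `current` backwards; return the wordref or None."""
    wordref = current
    while wordref is not None:
        entry_name, _, _, prev = read_entry(mem, wordref)
        if entry_name == name:
            return wordref
        wordref = prev
    return None


mem = bytearray()
w_dup = append_entry(mem, None, "DUP", 0x17)
w_if = append_entry(mem, w_dup, "IF", 0x0E, immediate=True)
print(read_entry(mem, w_if))
```

Because the name sits before the fixed-size fields, everything is located by counting backwards from the wordref, which is why the size byte is needed.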
  43. The code pointer points to "word routines". These routines expect to be called
  44. with IY pointing to the PF. They themselves are expected to end by jumping to
  45. the address at (IP). They will usually do so with "jp next".
  46. That's for "regular" words (words that are part of the dict chain). There are
  47. also "special words", for example NUMBER, LIT, FBR, that have a slightly
  48. different structure. They're also a pointer to an executable, but as for the
  49. other fields, the only one they have is the "flags" field.
  50. *** System variables
  51. There are some core variables in the core system that are referred to directly
  52. by their address in memory throughout the code. The place where they live is
  53. configurable by the RAMSTART constant in conf.fs, but their relative offset is
  54. not. In fact, they're mostly referred to directly as their numerical offset
  55. along with a comment indicating what this offset refers to.
  56. This system is a bit fragile because every time we change those offsets, we
  57. have to be careful to adjust all system variable offsets, but thankfully,
  58. there aren't many system variables. Here's a list of them:
  59. RAMSTART INITIAL_SP
  60. +02 CURRENT
  61. +04 HERE
  62. +06 IP
  63. +08 FLAGS
  64. +0a PARSEPTR
  65. +0c CINPTR
  66. +0e WORDBUF
  67. +2e BOOT C< PTR
  68. +4e INTJUMP
  69. +51 CURRENTPTR
  70. +53 readln's variables
  71. +55 adev's variables
  72. +57 FUTURE USES
  73. +59 z80a's variables
  74. +5b FUTURE USES
  75. +70 DRIVERS
  76. +80 RAMEND
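As an illustration of how the table above turns into absolute addresses, here is a small sketch. The RAMSTART value used is purely hypothetical (the real one comes from conf.fs), and `SYSVAR_OFFSETS` and `sysvar_addr` are names invented for the example. Only the named variables are listed; the "variables" and "FUTURE USES" slots are left out.

```python
RAMSTART = 0x8000  # hypothetical value; the real one is set in conf.fs

# Offsets copied from the table above (relative to RAMSTART).
SYSVAR_OFFSETS = {
    "INITIAL_SP": 0x00,
    "CURRENT": 0x02,
    "HERE": 0x04,
    "IP": 0x06,
    "FLAGS": 0x08,
    "PARSEPTR": 0x0A,
    "CINPTR": 0x0C,
    "WORDBUF": 0x0E,
    "BOOT C< PTR": 0x2E,
    "INTJUMP": 0x4E,
    "CURRENTPTR": 0x51,
    "DRIVERS": 0x70,
    "RAMEND": 0x80,
}


def sysvar_addr(name, ramstart=RAMSTART):
    """Absolute address of a system variable for a given RAMSTART."""
    return ramstart + SYSVAR_OFFSETS[name]


print(hex(sysvar_addr("HERE")))  # 0x8004
```

Moving RAMSTART relocates every variable at once, while the offsets themselves stay fixed.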
  77. INITIAL_SP holds the initial Stack Pointer value so that we know where to reset
  78. it on ABORT.
  79. CURRENT points to the last dict entry.
  80. HERE points to the current write offset.
  81. IP is the Interpreter Pointer.
  82. FLAGS holds global flags. Only used for prompt output control for now.
  83. PARSEPTR holds the address of the routine called on (parse).
  84. CINPTR holds the address of the routine called on C<.
  85. WORDBUF is the buffer used by WORD.
  86. BOOT C< PTR is used when Forth boots from in-memory source. See "Initialization
  87. sequence" below.
  88. INTJUMP is the address that all RST offsets (well, not *all* at this moment, I
  89. still have to free those slots...) in boot binaries are made to jump to. If you
  90. use one of those slots for an interrupt, write a jump to the appropriate offset
  91. in that RAM location.
  92. CURRENTPTR points to current CURRENT. The Forth CURRENT word doesn't return
  93. RAM+2 directly, but rather the value at this address. Most of the time, it
  94. points to RAM+2, but sometimes, when maintaining alternative dicts (during
  95. cross compilation for example), it can point elsewhere.
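The indirection through CURRENTPTR can be modeled as below. This is a sketch under assumptions: RAM is a flat 64K byte array, cells are little-endian, the RAMSTART value and the alternative dict address 0x9000 are hypothetical, and `current_word` and `latest_entry` are names made up for the illustration.

```python
RAMSTART = 0x8000              # hypothetical; set in conf.fs in practice
CURRENT = RAMSTART + 0x02      # RAM+2, the default CURRENT cell
CURRENTPTR = RAMSTART + 0x51   # cell holding the address of the active CURRENT


def read16(mem, addr):
    """Read a 2-byte little-endian cell (byte order is an assumption)."""
    return mem[addr] | (mem[addr + 1] << 8)


def write16(mem, addr, value):
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def current_word(mem):
    """What the Forth word CURRENT pushes: the value stored at CURRENTPTR."""
    return read16(mem, CURRENTPTR)


def latest_entry(mem):
    """Equivalent of `CURRENT @`: the last entry of the active dict."""
    return read16(mem, current_word(mem))


mem = bytearray(0x10000)
write16(mem, CURRENTPTR, CURRENT)   # normal case: CURRENTPTR points to RAM+2
write16(mem, CURRENT, 0x1234)
print(hex(latest_entry(mem)))       # 0x1234

write16(mem, 0x9000, 0x5678)        # an alternative dict's CURRENT cell
write16(mem, CURRENTPTR, 0x9000)    # e.g. during cross compilation
print(hex(latest_entry(mem)))       # 0x5678
```

Code that goes through CURRENT never needs to know which dict is active; only the cell at CURRENTPTR changes.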
  96. FUTURE USES section is unused for now.
  97. DRIVERS section is reserved for recipe-specific drivers. Here is a list of
  98. known usages:
  99. * 0x70-0x78: ACIA buffer pointers in RC2014 recipes.
  100. *** Word routines
  101. This is the description of all word routines you can encounter in this Forth
  102. implementation. That is, a wordref will always point to a memory offset
  103. containing one of these numbers.
  104. 0x17: nativeWord. This word's PFA contains native binary code and is jumped to
  105. directly.
  106. 0x0e: compiledWord. This word's PFA contains an atom list and its execution is
  107. described in "EXECUTION MODEL" above.
  108. 0x0b: cellWord. This word is usually followed by a 2-byte value in its PFA.
  109. Upon execution, the *address* of the PFA is pushed to PS.
  110. 0x2b: doesWord. This word is created by "DOES>" and is followed by a 2-byte
  111. value as well as the address where "DOES>" was compiled. At that address is an
  112. atom list exactly like in a compiled word. Upon execution, after having pushed
  113. its cell addr to PSP, it executes its reference exactly like a compiledWord.
  114. 0x20: numberWord. No word is actually compiled with this routine, but atoms are.
  115. Atoms with a reference to the numberWord routine are followed, *in the atom
  116. list*, by a 2-byte number. Upon execution, that number is fetched and IP is
  117. advanced by an extra 2 bytes.
  118. 0x24: addrWord. Exactly like a numberWord, except that it is treated
  119. differently by meta-tools.
  120. 0x22: litWord. Similar to a numberWord, except that instead of being followed
  121. by a 2-byte number, it is followed by a null-terminated string. Upon execution,
  122. the address of that null-terminated string is pushed on the PSP and IP is
  123. advanced to the address following the null.
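The way numberWord and litWord consume inline data from the atom list can be sketched as follows. Assumptions: the atom list lives in a flat byte array, IP is an integer address already pointing just past the atom's own reference, and numbers are little-endian. `run_number_atom` and `run_lit_atom` are hypothetical names, and "fetched" is modeled here as a push onto the parameter stack.

```python
def run_number_atom(mem, ip, ps):
    """numberWord: push the inline 2-byte number, skip IP over it."""
    ps.append(mem[ip] | (mem[ip + 1] << 8))  # little-endian is an assumption
    return ip + 2                            # IP advanced by an extra 2 bytes


def run_lit_atom(mem, ip, ps):
    """litWord: push the address of the inline string, skip IP past its null."""
    ps.append(ip)                  # address of the null-terminated string
    end = mem.index(0, ip)         # position of the terminating null
    return end + 1                 # IP now points right after the null


# Inline data only: a 2-byte number (42), then the string "hi" and its null.
mem = bytearray(b"\x2a\x00hi\x00")
ps = []
ip = run_number_atom(mem, 0, ps)
ip = run_lit_atom(mem, ip, ps)
print(ps, ip)  # [42, 2] 5
```

In both cases the routine itself is responsible for stepping IP over its payload, otherwise the interpreter would try to execute the data as wordrefs.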
  124. *** Initialization sequence
  125. On boot, we jump to the "main" routine in boot.fs which does very few things.
  126. 1. Set SP to 0x10000-6.
  127. 2. Set HERE to RAMEND (RAMSTART+0x80).
  128. 3. Set CURRENT to the value of the LATEST field in the stable ABI.
  129. 4. Look for the word "BOOT" and call it.
  130. In a normal system, BOOT is in icore and does a few things:
  131. 1. Find "(parse)" and set "(parse*)" to it.
  132. 2. Find "(c<)" and set CINPTR to it (what C< calls).
  133. 3. Write LATEST in SYSTEM SCRATCHPAD (see below).
  134. 4. Find "INIT". If found, execute. Otherwise, execute "INTERPRET".
  135. On a bare system (only boot+icore), this sequence will result in "(parse)"
  136. reading only decimals and (c<) reading characters from memory starting from
  137. CURRENT (this is why we put CURRENT in SYSTEM SCRATCHPAD: it tracks the current
  138. pos).
  139. This means that you can put initialization code in source form right into your
  140. binary, right after your last compiled dict entry and it's going to be executed
  141. as such until you set a new (c<).
  142. Note that there is no EMIT in a bare system. You have to take care of supplying
  143. one before you load core.fs and its higher levels.
  144. In the "/emul" binaries, "HERE" is readjusted to "CURRENT @" so that we don't
  145. have to relocate compiled dicts. Note that in this context, the initialization
  146. code is fighting for space with HERE: New entries to the dict will overwrite
  147. that code! Also, because we're bare-bones, we can't have comments. This can lead
  148. to peculiar code in this area where we try to "waste" space in initialization
  149. code.