Mirror of CollapseOS

# Implementation notes

# Execution model

After having read a line through readln, we want to interpret
it. As a general rule, we go like this:

1. read single word from line
2. Can we find the word in dict?
3. If yes, execute that word, goto 1
4. Is it a number?
5. If yes, push that number to PS, goto 1
6. Error: undefined word.
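
The six steps above can be sketched in a few lines of Python.
This is only a model of the control flow, not the actual
implementation: the dict is assumed to be a plain mapping from
names to callables, PS a Python list, and number parsing is
Python's own int() rather than Collapse OS's number syntax.

```python
def interpret_line(line, dictionary, ps):
    """Interpret one input line, word by word."""
    for word in line.split():                 # step 1
        if word in dictionary:                # step 2: dict lookup comes first
            dictionary[word](ps)              # step 3
            continue
        try:
            number = int(word)                # step 4 (Python syntax, an assumption)
        except ValueError:
            raise LookupError(f"undefined word: {word}")  # step 6
        ps.append(number)                     # step 5


def add(ps):
    b, a = ps.pop(), ps.pop()
    ps.append(a + b)


def dup(ps):
    ps.append(ps[-1])


words = {"+": add, "DUP": dup}   # hypothetical two-word dictionary
stack = []
interpret_line("2 3 + DUP +", words, stack)
print(stack)  # [10]
```

Because the dict lookup happens before the number check, a word
defined with a numeric-looking name would shadow that number.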

# Executing a word

At its core, executing a word is pushing the wordref on PS and
calling EXECUTE. Then, we let the word do its thing. Some
words are special, but most of them are of the "compiled"
type (regular nonnative word), and it's their execution that
we describe here.

First of all, at all times during execution, the Interpreter
Pointer (IP) points to the wordref we're executing next.

When we execute a compiled word, the first thing we do is push
IP to the Return Stack (RS). Therefore, RS' top of stack will
contain a wordref to execute next, after we EXIT.

At the end of every compiled word is an EXIT. This pops RS, sets
IP to it, and continues.
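
The IP/RS dance described above can be modeled as follows. This
is a sketch under simplifying assumptions, not the real inner
interpreter: atom lists live in one Python list, IP is an index
into it, native words are Python callables, and a string
sentinel stands in for the EXIT wordref.

```python
EXIT = "EXIT"  # sentinel atom closing every compiled word (an assumption)


def run(code, entry, ps):
    """Run the compiled word whose atom list starts at code[entry].

    An atom is a callable (stand-in for a native word), an int (start
    index of another compiled word's atom list), or EXIT.
    """
    rs = []          # Return Stack
    ip = entry       # IP: index of the atom to execute next
    while True:
        atom = code[ip]
        ip += 1                  # IP now designates the following atom
        if atom == EXIT:
            if not rs:           # leaving the outermost word: we're done
                return
            ip = rs.pop()        # pop RS, set IP to it, continue
        elif callable(atom):
            atom(ps)             # native word: just run it
        else:
            rs.append(ip)        # compiled word: push IP to RS...
            ip = atom            # ...then walk the callee's atom list


def lit(n):
    return lambda ps: ps.append(n)


def add(ps):
    b, a = ps.pop(), ps.pop()
    ps.append(a + b)


def dup(ps):
    ps.append(ps[-1])


code = [
    dup, add, EXIT,     # index 0: DOUBLE = DUP +
    0, 0, EXIT,         # index 3: QUAD   = DOUBLE DOUBLE
    lit(3), 3, EXIT,    # index 6: MAIN   = 3 QUAD
]
ps = []
run(code, 6, ps)
print(ps)  # [12]
```

While DOUBLE runs for the first time, RS holds 8 (where MAIN
resumes) and 4 (where QUAD resumes), which is exactly the "what
to execute next, after we EXIT" bookkeeping.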

# Stack management

In all supported arches, the Parameter Stack and Return Stack
tops are tracked by a register assigned to this purpose. For
example, in z80, it's SP and IX that do that. The values in
those registers are referred to as PS Pointer (PSP) and RS
Pointer (RSP).

Those stacks are contiguous and grow in opposite directions. PS
grows "down", RS grows "up".

Stack underflow and overflow: In each native word involving
PS popping, we check whether the stack is big enough. If it's
not, we enter the "uflw" (underflow) error condition, then
abort.

We don't check RS for underflow because the cost of the check
is significant and its usefulness is dubious: if RS isn't
tightly in control, we're screwed anyway, well before we reach
underflow.

The overflow condition happens when RSP and PSP meet somewhere
in the middle. That check is made at each "next" call.
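
To make the layout concrete, here is a small model of two
stacks sharing one region. It is a sketch only: the 2-byte cell
size, the exact pointer conventions (RSP designating the next
free RS slot, PSP the top PS cell) and the class itself are
assumptions for illustration.

```python
class Stacks:
    """PS and RS sharing one memory region, growing toward each other."""

    CELL = 2  # assumed cell size in bytes

    def __init__(self, rs_addr, ps_addr):
        self.ps_addr = ps_addr
        self.mem = {}
        self.rsp = rs_addr    # RS grows "up" from RS_ADDR
        self.psp = ps_addr    # PS grows "down" from PS_ADDR

    def push(self, n):
        self.psp -= self.CELL
        self.mem[self.psp] = n

    def pop(self):
        if self.psp >= self.ps_addr:      # check done by words that pop PS
            raise RuntimeError("uflw")
        n = self.mem[self.psp]
        self.psp += self.CELL
        return n

    def rpush(self, n):
        self.mem[self.rsp] = n
        self.rsp += self.CELL

    def rpop(self):                       # deliberately unchecked
        self.rsp -= self.CELL
        return self.mem.get(self.rsp, 0)

    def next(self):
        if self.rsp > self.psp:           # the stacks ran into each other
            raise RuntimeError("oflw")


s = Stacks(rs_addr=0x1000, ps_addr=0x1008)
s.push(1); s.push(2)       # PSP: 0x1008 -> 0x1004
s.rpush(7); s.rpush(8)     # RSP: 0x1000 -> 0x1004
s.next()                   # fine: the stacks touch but don't overlap
s.rpush(9)                 # overwrites the PS top cell at 0x1004
try:
    s.next()
except RuntimeError as e:
    print(e)               # oflw
```

Note that the overflow is only detected at the following
"next", after the damage is done, which matches the idea of a
cheap check rather than a guard on every push.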

# Dictionary entry

A dictionary entry has this structure:

- Xb name. Arbitrarily long number of characters (but can't be
  bigger than input buffer, of course). Not null-terminated.
- 2b prev offset
- 1b name size + IMMEDIATE flag (7th bit)
- 1b entry type
- Parameter field (PF)

The prev offset is the number of bytes between the prev field
and the previous word's code pointer.

The size + flag byte indicates the size of the name field, with
the 7th bit being the IMMEDIATE flag.

The entry type is simply a number corresponding to a type which
will determine how the word will be executed. See "Word types"
below.
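
The layout above can be decoded backward from a wordref. The
sketch below makes several assumptions that the text doesn't
spell out: the "code pointer" is taken to be the wordref (the
entry type byte), the prev offset is little-endian, the "7th
bit" is mask 0x80, and a prev offset of 0 marks the first
entry. append_entry is a hypothetical helper used only to
build test data.

```python
import struct


def read_entry(mem, wordref):
    """Decode the header located just before `wordref`."""
    size_flag = mem[wordref - 1]
    name_len = size_flag & 0x7F
    prev_field = wordref - 3                      # 2b prev + 1b size/flag
    (prev_off,) = struct.unpack_from("<H", mem, prev_field)
    name = bytes(mem[prev_field - name_len:prev_field]).decode("ascii")
    return {
        "name": name,
        "immediate": bool(size_flag & 0x80),
        "type": mem[wordref],
        # Assumed convention: offset 0 means "no previous entry".
        "prev": prev_field - prev_off if prev_off else None,
    }


def append_entry(mem, name, prev_wordref, entry_type, immediate=False):
    """Append a header (with an empty PF) and return the new wordref."""
    mem.extend(name.encode("ascii"))
    prev_field = len(mem)
    prev_off = prev_field - prev_wordref if prev_wordref is not None else 0
    mem.extend(struct.pack("<H", prev_off))
    mem.append(len(name) | (0x80 if immediate else 0))
    mem.append(entry_type)
    return len(mem) - 1


mem = bytearray()
dup_ref = append_entry(mem, "DUP", None, 0)
if_ref = append_entry(mem, "IF", dup_ref, 1, immediate=True)

# Walk the dictionary from the last entry back to the first.
ref = if_ref
while ref is not None:
    entry = read_entry(mem, ref)
    print(entry["name"], entry["immediate"])
    ref = entry["prev"]
```

Since the name sits before the fixed-size fields, a lookup can
compare the name size first and only then look at the
characters.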

# Word types

There are 4 word types in Collapse OS. Whenever you have a
wordref, it's pointing to a byte holding a number from 0 to 3.
This number is the word type and the word's behavior depends on
it.

0: native. This word's PFA (the address of its parameter field)
contains native binary code and is jumped to directly.

1: compiled. This word's PFA contains an atom list and its
execution is described in "Execution model" above.

2: cell. This word is usually followed by a 2-byte value in its
PFA. Upon execution, the address of the PFA is pushed to PS.

3: DOES>. This word is created by "DOES>" and is followed
by a 2-byte value as well as the address where "DOES>" was
compiled. At that address is an atom list exactly like in a
compiled word. Upon execution, after having pushed its cell
addr to PS, it executes its reference exactly like a
compiled word.
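
A dispatcher over the four types might look like the sketch
below. It doesn't run native code or atom lists; it only
reports what would happen next. Little-endian 2-byte values are
an assumption, and the function name and return convention are
made up for the example.

```python
import struct

NATIVE, COMPILED, CELL, DOES = 0, 1, 2, 3


def execute(mem, wordref, ps):
    """Dispatch on the type byte at `wordref`.

    Returns ("jump", addr) for native code, ("atoms", addr) when an
    atom list has to be walked, or None when nothing is left to run.
    """
    kind = mem[wordref]
    pfa = wordref + 1                 # PF starts right after the type byte
    if kind == NATIVE:
        return ("jump", pfa)
    if kind == COMPILED:
        return ("atoms", pfa)
    if kind == CELL:
        ps.append(pfa)                # push the cell's address, not its value
        return None
    if kind == DOES:
        ps.append(pfa)                # same as a cell...
        (target,) = struct.unpack_from("<H", mem, pfa + 2)
        return ("atoms", target)      # ...then run the DOES> atom list
    raise ValueError(f"bad word type {kind}")


mem = bytearray(0x20)
mem[0x10] = CELL
struct.pack_into("<H", mem, 0x11, 1234)          # the cell's 2-byte value
mem[0x14] = DOES
struct.pack_into("<HH", mem, 0x15, 42, 0x0008)   # value, then DOES> address

ps = []
print(execute(mem, 0x10, ps), ps)   # None [17]
print(execute(mem, 0x14, ps), ps)   # ('atoms', 8) [17, 21]
```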

# System variables

There are some core variables in the core system that are
referred to directly by their address in memory throughout the
code. The place where they live is configurable by the SYSVARS
constant in the xcomp unit, but their relative offsets are not.
In fact, they're mostly referred to directly as their numerical
offset along with a comment indicating what this offset refers
to.

This system is a bit fragile because every time we change those
offsets, we have to be careful to adjust all system variables
offsets, but thankfully, there aren't many system variables.
Here's a list of them:

SYSVARS   FUTURE USES          +3c       BLK(*
+02       CURRENT              +3e       A@*
+04       HERE                 +40       A!*
+06       C<?                  +42       FUTURE USES
+08       C<* override         +51       CURRENTPTR
+0a       NLPTR                +53       (emit) override
+0c       C<*                  +55       (key) override
+0e       WORDBUF              +57       FUTURE USES
+2e       BOOT C< PTR
+30       IN>
+32       IN(*                 +70       DRIVERS
+34       BLK@*                +80       RAMEND
+36       BLK!*
+38       BLK>
+3a       BLKDTY

CURRENT points to the last dict entry.

HERE points to the current write offset.

IP is the Interpreter Pointer.

PARSEPTR holds the routine address called on (parse).

C<* holds the routine address called on C<. If the C<* override
at +08 is nonzero, that override routine is called instead.

IN> is the current position in IN(, which is the input buffer.

IN(* is a pointer to the input buffer, allocated at runtime.

CURRENTPTR points to current CURRENT. The Forth CURRENT word
doesn't return RAM+2 directly, but rather the value at this
address. Most of the time, it points to RAM+2, but sometimes,
when maintaining alternative dicts (during cross compilation
for example), it can point elsewhere.

NLPTR points to an alternative routine for NL (by default,
CRLF).

BLK* see B416.

FUTURE USES section is unused for now.

DRIVERS section is reserved for recipe-specific drivers.
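
As an illustration of "base address plus fixed offset", here is
a sketch of two of the lookups described above. The offsets are
copied from the list; the SYSVARS base value, the little-endian
16-bit cells and the helper names are assumptions.

```python
import struct

SYSVARS = 0xE800                 # hypothetical; set in the xcomp unit in practice
CURRENT, HERE = 0x02, 0x04
CIN_OVERRIDE, CIN = 0x08, 0x0C   # "C<* override" and "C<*"
CURRENTPTR = 0x51


def peek(mem, addr):
    return struct.unpack_from("<H", mem, addr)[0]


def poke(mem, addr, value):
    struct.pack_into("<H", mem, addr, value)


def cin_routine(mem):
    """Routine C< should call: the override when nonzero, else C<*."""
    return peek(mem, SYSVARS + CIN_OVERRIDE) or peek(mem, SYSVARS + CIN)


def current_addr(mem):
    """Address yielded by the CURRENT word, taken through CURRENTPTR."""
    return peek(mem, SYSVARS + CURRENTPTR)


mem = bytearray(0x10000)
poke(mem, SYSVARS + CIN, 0x1234)
poke(mem, SYSVARS + CURRENTPTR, SYSVARS + CURRENT)   # the usual case
print(hex(cin_routine(mem)))     # 0x1234
poke(mem, SYSVARS + CIN_OVERRIDE, 0x4321)
print(hex(cin_routine(mem)))     # 0x4321
print(hex(current_addr(mem)))    # 0xe802
```

Pointing CURRENTPTR somewhere else is all it takes to make
CURRENT operate on an alternative dict.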

# Initialization sequence

(this describes the z80 boot sequence, but other arches have
a very similar sequence, and, of course, once we enter Forth
territory, identical)

On boot, we jump to the "main" routine in B289 which does
very few things.

1. Set SP to PS_ADDR and IX to RS_ADDR.
2. Set HERE to SYSVARS+0x80.
3. Set CURRENT to value of LATEST field in stable ABI.
4. Execute the word referred to by 0x04 (BOOT) in stable ABI.

In a normal system, BOOT is in core words at B396 and does a
few things:

1. Initialize all overrides to 0.
2. Write LATEST in BOOT C< PTR ( see below )
3. Set "C<*", the word that C< calls, to (boot<).
4. Call INTERPRET which interprets boot source code until
   ASCII EOT (4) is met. This usually initializes drivers.
5. Initialize rdln buffer, _sys entry (for EMPTY), print
   "CollapseOS" and then call (main).
6. (main) interprets from rdln input (usually from KEY) until
   EOT is met, then calls BYE.

In a RAM-only environment, we will typically have a
"CURRENT @ HERE !" line during init to have HERE begin at the
end of the binary instead of RAMEND.
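
Steps 2 to 4 of "main", plus the RAM-only adjustment, can be
modeled like this. It is a sketch under assumptions: the binary
is taken to start at offset 0, cells are little-endian 16-bit,
the SYSVARS base is invented, and CURRENT is read directly at
SYSVARS+02 (the usual CURRENTPTR target).

```python
import struct

SYSVARS = 0xE800                    # hypothetical base address
CURRENT, HERE = 0x02, 0x04          # sysvar offsets
ABI_BOOT, ABI_LATEST = 0x04, 0x08   # stable ABI offsets


def peek(mem, addr):
    return struct.unpack_from("<H", mem, addr)[0]


def poke(mem, addr, value):
    struct.pack_into("<H", mem, addr, value)


def main(mem):
    """Steps 2-4 of "main"; returns the BOOT wordref to execute."""
    poke(mem, SYSVARS + HERE, SYSVARS + 0x80)             # step 2
    poke(mem, SYSVARS + CURRENT, peek(mem, ABI_LATEST))   # step 3
    return peek(mem, ABI_BOOT)                            # step 4


def ram_only_fixup(mem):
    """Effect of "CURRENT @ HERE !": HERE takes CURRENT's value."""
    poke(mem, SYSVARS + HERE, peek(mem, SYSVARS + CURRENT))


mem = bytearray(0x10000)
poke(mem, ABI_BOOT, 0x0123)      # made-up BOOT wordref
poke(mem, ABI_LATEST, 0x1F00)    # made-up LATEST value
print(hex(main(mem)))                      # 0x123
print(hex(peek(mem, SYSVARS + HERE)))      # 0xe880
ram_only_fixup(mem)
print(hex(peek(mem, SYSVARS + HERE)))      # 0x1f00
```

After the fixup, new definitions are written right after the
last word of the binary rather than at SYSVARS+0x80.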

# Stable ABI

Across all architectures, some items are referred to by offsets
that don't change (well, not without some binary manipulation).
Here's the complete list of these references:

04 BOOT addr         06 (uflw) addr      08 LATEST
13 (oflw) addr       2b (s) wordref      33 2>R wordref
42 EXIT wordref      53 (br) wordref     67 (?br) wordref
80 (loop) wordref    bf (n) wordref

BOOT, (uflw) and (oflw) exist because they are referred to
before those words are defined (in core words). LATEST is a
critical part of the initialization sequence.

Stable wordrefs are there for more complicated reasons. When
cross-compiling Collapse OS, we use immediate words from the
host and some of them compile wordrefs (IF compiles (?br),
LOOP compiles (loop), etc.). These compiled wordrefs need to
be stable across binaries, so they're part of the stable ABI.

Another layer of complexity is the fact that some binaries
don't begin at offset 0. In that case, the stable ABI doesn't
begin at 0 either. The EXECUTE word has special handling for
those cases, where any wordref < 0x100 has the binary offset
applied to it.

But that's not the end of our problems. If an offsetted binary
cross compiles a binary with a different offset, stable ABI
references will be > 0x100 and be broken.

For this reason, any stable wordref compiled in the "hot zone"
(B397-B400) has to be compiled by direct offset reference to
avoid having any binary offset applied to it.
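
The offset rule and its failure mode fit in a few lines. This
is a sketch of the rule as described above, not the actual
EXECUTE code; the function name and the 0x4000/0x8000 binary
offsets are invented for the example.

```python
ABI_LIMIT = 0x100   # wordrefs below this are treated as stable ABI refs
QBR = 0x67          # (?br) wordref from the stable ABI list


def resolve(wordref, binary_offset):
    """Address EXECUTE would use for a compiled wordref."""
    if wordref < ABI_LIMIT:
        return wordref + binary_offset
    return wordref


print(hex(resolve(QBR, 0x0000)))      # 0x67
print(hex(resolve(QBR, 0x4000)))      # 0x4067

# Failure mode: a host living at 0x4000 compiles its own absolute
# (?br) address into a target binary that lives at 0x8000.
leaked = resolve(QBR, 0x4000)
print(hex(resolve(leaked, 0x8000)))   # 0x4067, not 0x8067
```

Compiling the raw 0x67 instead keeps the reference below 0x100,
so the target applies its own offset when it runs.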