Mirror of CollapseOS
  1. # Implementation notes
  2. # Execution model
  3. After having read a line through readln, we want to interpret
  4. it. As a general rule, we go like this:
  5. 1. read single word from line
  6. 2. Can we find the word in dict?
  7. 3. If yes, execute that word, goto 1
  8. 4. Is it a number?
  9. 5. If yes, push that number to PS, goto 1
  10. 6. Error: undefined word.
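The six steps above can be sketched in Python. This is only an illustration, not the Collapse OS code: the `dictionary` mapping, the `interpret_line` and `add` names, and decimal-only number parsing are all assumptions made for the sketch.

```python
def interpret_line(line, dictionary, ps):
    """Interpret one input line, following the six steps above.

    `dictionary` maps word names to Python callables taking the stack;
    it stands in for the real dict lookup plus EXECUTE (hypothetical).
    """
    for word in line.split():                 # step 1: read a single word
        if word in dictionary:                # step 2: found in dict?
            dictionary[word](ps)              # step 3: execute it, goto 1
            continue
        try:
            number = int(word, 10)            # step 4: is it a number?
        except ValueError:
            # step 6: neither a known word nor a number
            raise LookupError("undefined word: " + word)
        ps.append(number & 0xFFFF)            # step 5: push to PS, goto 1


def add(ps):
    """Hypothetical '+' word: pop two cells, push their 16-bit sum."""
    b, a = ps.pop(), ps.pop()
    ps.append((a + b) & 0xFFFF)


ps = []
interpret_line("2 3 +", {"+": add}, ps)
```

Note the ordering: the dictionary is consulted before number parsing, so a word named like a number would shadow that number.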
  11. # Executing a word
  12. At its core, executing a word is pushing the wordref on PS and
  13. calling EXECUTE. Then, we let the word do its thing. Some
  14. words are special, but most of them are of the "compiled"
  15. type (regular nonnative word), and it is their execution that
  16. we describe here.
  17. First of all, at all times during execution, the Interpreter
  18. Pointer (IP) points to the wordref we're executing next.
  19. When we execute a compiled word, the first thing we do is push
  20. IP to the Return Stack (RS). Therefore, RS' top of stack will
  21. contain a wordref to execute next, after we EXIT.
  22. At the end of every compiled word is an EXIT. This pops RS, sets
  23. IP to it, and continues.
  24. A compiled word is simply a list of wordrefs, but not all those
  25. wordrefs are 2 bytes in length. Some wordrefs are special. For
  26. example, a reference to (n) will be followed by an extra 2 bytes
  27. number. It's the responsibility of the (n) word to advance IP
  28. by 2 extra bytes.
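The IP/RS mechanics described above can be modeled with a small Python sketch. It is a toy under stated assumptions, not the real implementation: the `VM` class, its method names, the little-endian cell layout and the use of Python functions in place of native machine code are all hypothetical.

```python
NATIVE, COMPILED = 0, 1   # type bytes, numbered as in "Word types"


class VM:
    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.here = 0          # next free byte in mem
        self.ip = 0            # Interpreter Pointer
        self.ps, self.rs = [], []
        self.natives = {}      # wordref -> Python function (stand-in for machine code)

    def rd16(self, addr):
        return self.mem[addr] | (self.mem[addr + 1] << 8)

    def _type_byte(self, wtype):
        wordref = self.here    # a wordref is the address of the type byte
        self.mem[wordref] = wtype
        self.here += 1
        return wordref

    def native(self, fn):
        wordref = self._type_byte(NATIVE)
        self.natives[wordref] = fn
        return wordref

    def compiled(self, cells):
        wordref = self._type_byte(COMPILED)
        for cell in cells:                  # each cell takes 2 bytes
            self.mem[self.here] = cell & 0xFF
            self.mem[self.here + 1] = cell >> 8
            self.here += 2
        return wordref

    def execute(self, wordref):
        if self.mem[wordref] == NATIVE:
            self.natives[wordref](self)
        else:
            self.rs.append(self.ip)         # push IP to RS
            self.ip = wordref + 1           # continue inside this word's list

    def run(self, wordref):
        depth = len(self.rs)
        self.execute(wordref)
        while len(self.rs) > depth:         # loop until the matching EXIT
            target = self.rd16(self.ip)
            self.ip += 2                    # IP now points to the next wordref
            self.execute(target)


def w_exit(vm):                             # EXIT: pop RS, set IP to it
    vm.ip = vm.rs.pop()

def w_lit(vm):                              # (n): push the inline number,
    vm.ps.append(vm.rd16(vm.ip))            # then skip its 2 extra bytes
    vm.ip += 2

def w_plus(vm):
    b, a = vm.ps.pop(), vm.ps.pop()
    vm.ps.append((a + b) & 0xFFFF)


vm = VM()
EXIT, LIT, PLUS = vm.native(w_exit), vm.native(w_lit), vm.native(w_plus)
FIVE = vm.compiled([LIT, 2, LIT, 3, PLUS, EXIT])
TEN = vm.compiled([FIVE, FIVE, PLUS, EXIT])
vm.run(TEN)
```

In this sketch, `run` fetches a wordref and advances IP before executing it, which is what lets `(n)` find its number at IP and lets a nested compiled word save the correct return position on RS.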
  29. # Stack management
  30. In all supported arches, the Parameter Stack and Return Stack
  31. tops are each tracked by a register assigned to this purpose.
  32. For example, in z80, it's SP and IX that do that. The values in
  33. those registers are referred to as PS Pointer (PSP) and RS
  34. Pointer (RSP).
  35. Those stacks are contiguous and grow in opposite directions. PS
  36. grows "down", RS grows "up".
  37. Stack underflow and overflow: In each native word involving
  38. PS popping, we check whether the stack is big enough. If it's
  39. not, we go into the "uflw" (underflow) error condition, then abort.
  40. We don't check RS for underflow because the cost of the check
  41. is significant and its usefulness is dubious: if RS isn't
  42. tightly under control, we're screwed anyway, well before we
  43. reach underflow.
  44. The overflow condition happens when RSP and PSP meet somewhere
  45. in the middle. That check is made at each "next" call.
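A rough Python model of two stacks sharing one region may help visualize this. Everything here is an assumption made for illustration: the `Stacks` class, the `StackError` name, the exact pre/post-increment conventions and the precise comparison used to detect the pointers meeting.

```python
class StackError(Exception):
    pass


class Stacks:
    """PS and RS sharing one contiguous region of 2-byte cells."""

    def __init__(self, rs_addr, ps_addr):
        self.ps_addr = ps_addr
        self.rsp = rs_addr       # RS grows "up" from rs_addr
        self.psp = ps_addr       # PS grows "down" from ps_addr
        self.mem = {}            # address -> 16-bit value

    def ps_push(self, value):
        self.psp -= 2
        self.mem[self.psp] = value & 0xFFFF

    def ps_pop(self):
        if self.psp >= self.ps_addr:     # underflow check on every PS pop
            raise StackError("uflw")
        value = self.mem[self.psp]
        self.psp += 2
        return value

    def rs_push(self, value):
        self.mem[self.rsp] = value & 0xFFFF
        self.rsp += 2

    def rs_pop(self):
        self.rsp -= 2                    # deliberately no underflow check
        return self.mem[self.rsp]

    def check_overflow(self):
        """Stand-in for the check made at each "next" call."""
        if self.rsp > self.psp:          # the two stacks ran into each other
            raise StackError("oflw")


stacks = Stacks(rs_addr=0, ps_addr=8)    # room for 4 cells in total
stacks.ps_push(1)
stacks.ps_push(2)
stacks.rs_push(3)
stacks.rs_push(4)
stacks.check_overflow()                  # region exactly full: still fine
```

Because the overflow check only runs at "next" in this model, a push can already have clobbered the other stack by the time the error is raised; the check reports the condition rather than preventing it.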
  46. # Dictionary entry
  47. A dictionary entry has this structure:
  48. - Xb name. Arbitrarily long sequence of characters (but can't
  49. be bigger than the input buffer, of course). Not null-terminated.
  50. - 2b prev offset
  51. - 1b name size + IMMEDIATE flag (7th bit)
  52. - 1b entry type
  53. - Parameter field (PF)
  54. The prev offset is the number of bytes between the prev field
  55. and the previous word's entry type.
  56. The size + flag byte indicates the size of the name field, with the
  57. 7th bit being the IMMEDIATE flag.
  58. The entry type is simply a number corresponding to a type which
  59. will determine how the word will be executed. See "Word types"
  60. below.
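The entry layout above can be exercised with a short Python sketch that appends entries to a byte buffer and reads them back from a wordref. Several details are assumptions: little-endian 2-byte fields (as on z80), "7th bit" read as mask 0x80, a prev offset of 0 meaning "no previous entry", and the function names themselves.

```python
import struct

IMMEDIATE = 0x80   # assumption: "7th bit" counted from bit 0


def append_entry(mem, prev_wordref, name, wtype, pf=b"", immediate=False):
    """Append one entry to bytearray `mem`; return its wordref.

    The wordref is the address of the entry type byte.
    """
    raw = name.encode("ascii")
    assert len(raw) < 0x80, "name size must fit in 7 bits"
    mem.extend(raw)                                   # Xb name
    prev_field = len(mem)
    # Bytes between this prev field and the previous word's entry type.
    offset = 0 if prev_wordref is None else prev_field - prev_wordref
    mem.extend(struct.pack("<H", offset))             # 2b prev offset
    mem.append(len(raw) | (IMMEDIATE if immediate else 0))  # 1b size + flag
    wordref = len(mem)
    mem.append(wtype)                                 # 1b entry type
    mem.extend(pf)                                    # parameter field
    return wordref


def parse_entry(mem, wordref):
    """Return (name, immediate, type, previous wordref or None)."""
    flags = mem[wordref - 1]
    size = flags & 0x7F
    prev_field = wordref - 3
    (offset,) = struct.unpack_from("<H", mem, prev_field)
    name = bytes(mem[prev_field - size:prev_field]).decode("ascii")
    prev = prev_field - offset if offset else None
    return name, bool(flags & IMMEDIATE), mem[wordref], prev


def find(mem, current, name):
    """Walk the chain backwards from `current`; return a wordref or None."""
    wordref = current
    while wordref is not None:
        entry_name, _, _, prev = parse_entry(mem, wordref)
        if entry_name == name:
            return wordref
        wordref = prev
    return None


mem = bytearray()
w_dup = append_entry(mem, None, "DUP", 0, pf=b"\xc9")
w_if = append_entry(mem, w_dup, "IF", 1, immediate=True)
```

Since the name sits before the fixed-size fields, everything is located by walking backwards from the wordref, which is why the name needs a size byte rather than a terminator.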
  61. # Word types
  62. There are 6 word types in Collapse OS. Whenever you have a
  63. wordref, it's pointing to a byte holding a number from 0 to 5.
  64. This number is the word type and the word's behavior depends on it.
  65. 0: native. This word's PFA contains native binary code and is
  66. jumped to directly.
  67. 1: compiled. This word's PFA contains a list of wordrefs and its
  68. execution is described in "Execution model" above.
  69. 2: cell. This word is usually followed by a 2-byte value in its
  70. PFA. Upon execution, the address of the PFA is pushed to PS.
  71. 3: DOES>. This word is created by "DOES>" and is followed
  72. by a 2-byte value as well as the address where "DOES>" was
  73. compiled. At that address is a wordref list exactly like in a
  74. compiled word. Upon execution, after having pushed its cell
  75. addr to PS, it executes its reference exactly like a
  76. compiled word.
  77. 4: alias. See usage.txt. PFA is like a cell, but instead of
  78. pushing it to PS, we execute it.
  79. 5: switch. Same as alias, but with an added indirection.
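As an illustration of type-based dispatch, here is a partial Python sketch covering the native, cell, alias and switch types. It is a simplification under assumptions: native code is replaced by Python callables, 2-byte values are little-endian, and "added indirection" for switch is read as "the PFA holds the address of a cell that holds the wordref". Compiled and DOES> words are left out because they need the IP/RS machinery.

```python
NATIVE, COMPILED, CELL, DOES, ALIAS, SWITCH = range(6)


def rd16(mem, addr):
    return mem[addr] | (mem[addr + 1] << 8)


def execute(mem, wordref, ps, natives):
    """Dispatch on the type byte that `wordref` points to."""
    wtype = mem[wordref]
    pfa = wordref + 1                       # parameter field follows the type
    if wtype == NATIVE:
        natives[wordref](ps)                # stand-in for jumping to the PFA
    elif wtype == CELL:
        ps.append(pfa)                      # push the PFA's address
    elif wtype == ALIAS:
        execute(mem, rd16(mem, pfa), ps, natives)
    elif wtype == SWITCH:
        execute(mem, rd16(mem, rd16(mem, pfa)), ps, natives)
    else:
        raise NotImplementedError("type %d needs IP/RS handling" % wtype)


# Hand-assembled memory image for the demo (addresses are arbitrary).
mem = bytearray(16)
mem[0] = NATIVE                             # wordref 0: native word
mem[1] = CELL                               # wordref 1: cell, PFA at 2
mem[4] = ALIAS                              # wordref 4: alias to wordref 1
mem[5], mem[6] = 1, 0
mem[7] = SWITCH                             # wordref 7: switch via address 10
mem[8], mem[9] = 10, 0
mem[10], mem[11] = 0, 0                     # address 10 holds wordref 0

natives = {0: lambda ps: ps.append(99)}
ps = []
execute(mem, 4, ps, natives)                # alias -> cell: pushes 2
execute(mem, 7, ps, natives)                # switch -> native: pushes 99
```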
  80. # System variables
  81. There are some core variables in the core system that are
  82. referred to directly by their address in memory throughout the
  83. code. The place where they live is configurable by the SYSVARS
  84. constant in xcomp unit, but their relative offset is not. In
  85. fact, they're mostly referred to directly as their numerical
  86. offset along with a comment indicating what this offset refers
  87. to.
  88. This system is a bit fragile because every time we change those
  89. offsets, we have to be careful to adjust all system variable
  90. offsets, but thankfully, there aren't many system variables.
  91. Here's a list of them:
  92. SYSVARS   FUTURE USES          +3c       BLK(*
  93. +02       CURRENT              +3e       A@*
  94. +04       HERE                 +40       A!*
  95. +06       C<?                  +42       FUTURE USES
  96. +08       C<* override         +51       CURRENTPTR
  97. +0a       NLPTR                +53       (emit) override
  98. +0c       C<*                  +55       (key) override
  99. +0e       WORDBUF              +57       FUTURE USES
  100. +2e       BOOT C< PTR
  101. +30       IN>
  102. +32       IN(*                 +70       DRIVERS
  103. +34       BLK@*                +80       RAMEND
  104. +36       BLK!*
  105. +38       BLK>
  106. +3a       BLKDTY
  107. CURRENT points to the last dict entry.
  108. HERE points to the current write offset.
  109. C<* holds the address of the routine called on C<. If the C<*
  110. override at +08 is nonzero, the override routine is called instead.
  111. IN> is the current position in IN(, which is the input buffer.
  112. IN(* is a pointer to the input buffer, allocated at runtime.
  113. CURRENTPTR points to current CURRENT. The Forth CURRENT word
  114. doesn't return RAM+2 directly, but rather the value at this
  115. address. Most of the time, it points to RAM+2, but sometimes,
  116. when maintaining alternative dicts (during cross compilation
  117. for example), it can point elsewhere.
  118. NLPTR points to an alternative routine for NL (by default,
  119. CRLF).
  120. BLK* see B416.
  121. DRIVERS section is reserved for recipe-specific drivers.
  122. FUTURE USES section is unused for now.
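The addressing scheme (a configurable base plus fixed offsets) and the two indirections described above, CURRENTPTR and the C<* override, can be sketched in Python. The base address `0x8000`, the helper names and the little-endian 2-byte reads are assumptions for this example; only the offsets come from the list above.

```python
SYSVARS = 0x8000   # hypothetical base; the real value comes from the xcomp unit

# A few offsets copied from the list above.
OFFSETS = {
    "CURRENT": 0x02,
    "HERE": 0x04,
    "C<* override": 0x08,
    "C<*": 0x0c,
    "CURRENTPTR": 0x51,
}


def sysvar_addr(name):
    return SYSVARS + OFFSETS[name]


def rd16(mem, addr):
    return mem[addr] | (mem[addr + 1] << 8)


def wr16(mem, addr, value):
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def current_addr(mem):
    """Address yielded by the Forth CURRENT word: the value at CURRENTPTR."""
    return rd16(mem, sysvar_addr("CURRENTPTR"))


def c_in_routine(mem):
    """Routine address used for C<: the override if nonzero, else C<*."""
    override = rd16(mem, sysvar_addr("C<* override"))
    return override if override else rd16(mem, sysvar_addr("C<*"))


mem = bytearray(0x10000)
# Normal case: CURRENTPTR points at SYSVARS+02.
wr16(mem, sysvar_addr("CURRENTPTR"), sysvar_addr("CURRENT"))
wr16(mem, sysvar_addr("C<*"), 0x1234)
```

Pointing CURRENTPTR at some other address is all it takes, in this model, to make CURRENT operate on an alternative dictionary.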
  123. # Initialization sequence
  124. (this describes the z80 boot sequence, but other arches have
  125. a very similar sequence, and, of course, once we enter Forth
  126. territory, identical)
  127. On boot, we jump to the "main" routine in B289 which does
  128. very few things.
  129. 1. Set SP to PS_ADDR and IX to RS_ADDR.
  130. 2. Set CURRENT to value of LATEST field in stable ABI.
  131. 3. Set HERE to HERESTART const if defined, to CURRENT
  132. otherwise.
  133. 4. Execute the word referred to by 0x04 (BOOT) in stable ABI.
  134. In a normal system, BOOT is in core words at B396 and does a
  135. few things:
  136. 1. Initialize all overrides to 0.
  137. 2. Write LATEST in BOOT C< PTR ( see below ).
  138. 3. Set "C<*", the word that C< calls, to (boot<).
  139. 4. Call INTERPRET which interprets boot source code until
  140. ASCII EOT (4) is met. This usually initializes drivers.
  141. 5. Initialize the rdln buffer and the _sys entry (for EMPTY),
  142. print "CollapseOS", then call (main).
  143. 6. (main) interprets from rdln input (usually from KEY) until
  144. EOT is met, then calls BYE.
  145. # Stable ABI
  146. The Stable ABI lives at the beginning of the binary and
  147. provides a way for Collapse OS code to access values that would
  148. otherwise be difficult to access. Here's the complete list of
  149. these references:
  150. 04 BOOT addr      06 (uflw) addr    08 LATEST
  151. 13 (oflw) addr    1a next addr
  152. BOOT, (uflw) and (oflw) exist because they are referred to
  153. before those words are defined (in core words). LATEST is a
  154. critical part of the initialization sequence.
  155. All Collapse OS binaries, regardless of architecture, have
  156. those values at those offsets. Some binaries are built to run
  157. at an offset other than zero. In that case, the stable ABI
  158. lives at that offset, not 0.
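Reading the stable ABI from a memory image can be sketched as follows. The offsets come from the list above; the `read_abi` helper, the `origin` parameter and the assumption that each entry is a 2-byte little-endian value (as on z80) are illustrative only.

```python
import struct

# Offsets from the list above, relative to where the binary runs.
STABLE_ABI = {
    "BOOT": 0x04,
    "(uflw)": 0x06,
    "LATEST": 0x08,
    "(oflw)": 0x13,
    "next": 0x1a,
}


def read_abi(memory, origin=0):
    """Return the stable ABI values found in `memory`.

    `memory` is a bytes-like view of the address space and `origin` is the
    address the binary was built to run at (hypothetical parameter names).
    """
    return {
        name: struct.unpack_from("<H", memory, origin + offset)[0]
        for name, offset in STABLE_ABI.items()
    }


# Demo: a binary built to run at 0x100 instead of 0.
memory = bytearray(0x200)
struct.pack_into("<H", memory, 0x100 + STABLE_ABI["LATEST"], 0x1234)
struct.pack_into("<H", memory, 0x100 + STABLE_ABI["BOOT"], 0x0abc)
abi = read_abi(memory, origin=0x100)
```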