#ifndef CHAPTER_1_HEADER
#define CHAPTER_1_HEADER

#include "chapter_0.h" // We need this header file for some core enumeration definitions and function declarations.

/*
In this chapter, you'll learn about:

- Keywords
- Normal identifiers
- Types
- Character functions
- File functions
- ASCII table

Now that you've read chapter zero, we should get into some technical details about the C programming language. This will be the most important chapter for people who know some
other lower level programming language, or who intuitively understand the "building blocks" of some system. Below is just a simple matrix of the keywords, 8 x 4, so you can see
that there are really only 32 of them (ANSI C standard, I dislike newer standards), and further below we'll categorize them. Also, keep in mind that I'll briefly talk about other
C standards such as K&R, C99, etc., and use parts of them in some places, so you don't get confused when you see them in other people's source code.
- Keywords:

static   signed   if      typedef
extern   unsigned else    enum
const    float    for     union
auto     double   do      struct
register char     while   return
volatile int      switch  goto
sizeof   short    case    break
void     long     default continue
Keywords that you should never use, under any circumstances (except for writing your own C compiler or working in embedded), are:

- register: Hint to the compiler that some variable will be used a lot, so it can be stored in a CPU register. Modern compilers mostly ignore this keyword.
- volatile: Hint to the compiler that some variable may be changed by some external source. Such a cool keyword, but never use it.
- auto:     It specifies automatic storage duration for local variables inside a block. Simply put, the variable won't exist outside the {} block it's defined in.
- signed:   We'll talk about types in C more, but this is assumed for every integer type except plain 'char' (whose signedness depends on the compiler), so only use 'unsigned', and only when you really need to.
- union:    You'll have very little excuse to use unions, because they often don't go well with type checking and introduce complexity for no reason.
- goto:     I'm kidding, feel free to use the 'goto' statement, it's not harmful, some people just abused it to the point of it being bullied.

Keywords that you should really consider not using in general, but only in some very specific cases (compiler warnings or using some libraries):

- const:    Constant qualifier, it tells the compiler that a value won't be changed. Use it only to silence compiler warnings...
- unsigned: Again, you shouldn't care if your variable is signed or unsigned, because C really likes integers, and hates naturals. We'll talk more about it.
- float:    Unless you're making a game engine with GPU acceleration or a neural network, I wouldn't use this type at all due to its limitations.
- double:   Well, same as 'float', but twice as big and with more precise floating point values. You'll see it being used very rarely.
- short:    Use it only to silence warnings, especially those from the XCB and Xlib libraries, for whatever reason (read: X11) they use a million different types.
- long:     Also use this only to silence warnings, because some standard library functions use this type, pure C-hating cancer...
- typedef:  I don't like it at all, but this one is my personal preference, because it just introduces unneeded mental overhead when programming.
- do:       It's used before '{', then you do the loop thing, then '}' and then the 'while' statement. I prefer just to use 'for' everywhere.
Okay, now we're left with the following actually useful and C-loving keywords, 18 of them, that I'll cover more:

- char:     Type for storing ASCII characters, 8 bits wide, implicitly converted to 'int' in some expressions. Arrays of them hold strings, and string literals are such arrays.
- int:      Type for storing signed numerical values, 32 bits wide on common platforms. Most number literals have this type, and the compiler can warn you about mixing types.
- void:     This is a black hole, nothing can escape it.
- sizeof:   Some people say this behaves like a function, but they are wrong, it's an operator. It literally does what it says, more on that later...
- static:   This qualifier has several meanings: outside a function, that function or variable will only be used in that file, and inside a function, that variable persists between calls.
- extern:   This qualifier is simpler, and implicit in older compilers, newer ones warn when it's not used. Something will be used outside that file.
- if:       If some condition(s) are met (equal to true or non-zero value), do the thing specified in the block.
- else:     With 'else' you can "bind" multiple 'if' statements, but know that you can completely avoid it, and get the same results.
- for:      Very complex kind of loop statement, that leads to segmentation faults and many other memory and safety related bugs. I love to use it.
- while:    Very simple kind of loop statement, that leads to segmentation faults and many other memory and safety related bugs. I don't love to use it.
- switch:   This statement is the most powerful in the whole C language, it's not just if-else made pretty, we'll see examples later.
- case:     Used only inside a 'switch' statement, it sets some number literal as an implicit label, so when the expression in 'switch' equals it, execution jumps there.
- default:  Used only because C didn't force curly brackets everywhere back when it was made, but every 'switch' should have just one 'default'.
- enum:     Sometimes treated as a type (with 'typedef'), and always a very good way to define a lot of constants, we'll talk about them more.
- struct:   I really advise against using structures, but most libraries use them, so you need to know how they work. In C, they are a very weak part of the language.
- return:   Used in functions to return from them with or without a result (return value). It's best when a function only uses it once.
- break:    When inside a loop (or a 'switch'), this statement will exit it. In some other languages it can take simple arguments, but never in C, and we'll see why that's bad.
- continue: Even more specific case, when inside a loop, this statement will skip to the end of the loop body, and then continue with the next iteration.
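To make two of the keywords above less abstract, here's a minimal sketch. The function names 'count_calls' and 'elements_in_table' are made up for this example and aren't part
of this book's code. It shows that a 'static' variable inside a function keeps its value between calls, and that 'sizeof' is an operator that works without parentheses:

```c
#include <stddef.h>

// Hypothetical helper: returns how many times it has been called so far.
static int count_calls (void) {
	static int calls = 0; // Initialized once, then it persists between calls.

	calls = calls + 1;

	return (calls);
}

// Hypothetical helper: 'sizeof' is applied to expressions here, no function call syntax needed.
static size_t elements_in_table (void) {
	int table [8] = { 0 };

	return (sizeof table / sizeof table [0]);
}
```

Calling 'count_calls' twice gives 1 and then 2, while 'elements_in_table' gives 8 no matter how wide 'int' is.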
Now, of those 18 actually useful C keywords, I like to avoid 'struct', 'switch', 'case', 'default' and 'while', and use (functional) 'static' and 'extern' only in order to silence
compiler warnings, 'static' inside a function is more useful. That leaves us (me) with 12 C keywords that I love, out of the complete 32 keywords in the ANSI C standard. However,
we'll see that sometimes it's preferable to use a switch statement somewhere, or a while loop when a for loop feels like overkill. In most real-world cases, you'll need to use some
API or library that internally uses structures everywhere, so you'll need to adapt to it, we'll see examples later...

So, real men need these keywords { char, int, void, sizeof, static, if, else, for, enum, return }, and use the rest of them in order to silence compiler warnings, use some
standard library functions, clean the source code or access an API / library / header file. Let's see some more examples and keep code formatting in mind... Also, since this is a
book about C, I'll write comments and use those keywords I dislike, much to my dismay. You surely have a different point of view on certain things than me, which is completely
natural, but always remember one thing:

There's a huge difference between being smart and not being stupid. I'm simply not stupid, but I'm not smart, use this book to write something smarter than I did.
@C
extern int this_is_kind (kind_type this);

// extern       - we declare this function as an external one, because we'll create an object file (.o), and link it with other programs (or object files).
// int          - we set 'int', integer, as the return type of this function, and use 0 as false, and 1 as true, instead of the 'bool' type from <stdbool.h>.
// this_is_kind - we named our function like this, you'll see more verbose naming when I write programs, because I think it's better...
// kind_type    - we choose the type of the argument of our function, since our function deals with 'kind', we choose the proper 'kind_type'.
// this         - we named our argument after what it's supposed to be, use a similar approach for return types, functions, argument types and argument names.
@

You'll see this pattern a lot in my code. Being consistent makes you more productive and efficient when doing any kind of task, from programming and studying foreign languages, to
cleaning your living room and washing your dishes. Be consistent no matter what you do.
Before continuing, let's describe some important types in C very simply (sizes as found on a typical 64-bit GNU/Linux system):

Word:           Bytes:  Bits:  Kind:            Minimum:               Maximum:
void            /       /      Black hole       /                      /
void *          8       64     Memory address   /                      /
char            1       8      Integer          -128                   127
short           2       16     Integer          -32768                 32767
int             4       32     Integer          -2147483648            2147483647
long            8       64     Integer          -9223372036854775808   9223372036854775807
unsigned char   1       8      Natural          0                      255
unsigned short  2       16     Natural          0                      65535
unsigned int    4       32     Natural          0                      4294967295
unsigned long   8       64     Natural          0                      18446744073709551615
float           4       32     Real (IEEE754)   /                      /
double          8       64     Real (IEEE754)   /                      /
Note that you shouldn't care for now about 'void' and 'void *', because they're special cases, nor about their minimum and maximum. Also, the fewer types you use, the more type
safe your code is. That's the reason why I use 'int', 'char *' and 'char', and sometimes 'void *' and 'int *'. You get fewer compiler and linter warnings about conversions, but you
need to know exactly what you're doing and what your code will do in order to have no bugs. Again, think twice, write once. When I write programs in the Ada language however, I
make a lot of types, since it's a completely different language and has different coding practices.
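The sizes in the table above are what you'll usually find on a 64-bit GNU/Linux machine, the C standard itself only guarantees minimums. Here's a small sketch that asks your own
compiler instead, assuming nothing but the standard header <limits.h>. The three function names are hypothetical, made up for this example only:

```c
#include <limits.h>

// Hypothetical helpers that report what the compiler in front of you really uses.
static int bits_in_int           (void) { return ((int) sizeof (int)  * CHAR_BIT); }
static int bits_in_long          (void) { return ((int) sizeof (long) * CHAR_BIT); }
static int largest_unsigned_char (void) { return ((int) UCHAR_MAX);                }
```

On the kind of system the table describes, these return 32, 64 and 255, but only the minimums (16, 32 and 255) are promised everywhere.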
*/

enum {
	// We won't even cover all of those file formats, this is just an example of how to do a similar task without hardcoding file extensions, we'll use it later.
	FILE_TYPE_TEXT,             FILE_TYPE_COMMON_ASSEMBLY,      FILE_TYPE_FLAT_ASSEMBLY, FILE_TYPE_GNU_ASSEMBLY,
	FILE_TYPE_NETWIDE_ASSEMBLY, FILE_TYPE_YET_ANOTHER_ASSEMBLY, FILE_TYPE_C_SOURCE,      FILE_TYPE_C_HEADER,
	FILE_TYPE_ADA_BODY,         FILE_TYPE_ADA_SPECIFICATION,    FILE_TYPE_CPP_SOURCE,    FILE_TYPE_CPP_HEADER,
	FILE_TYPE_COUNT
};

/*
Here are some "utility" functions that we'll maybe use, reimplementations of those from the standard library header file <ctype.h>. I prefer names like this, and this is for
learning purposes, so it's nice that you can see it here instead of searching through folders such as "/usr/include/". But before that, let's talk about the ASCII table now!

In functions 'character_is_uppercase', 'character_is_lowercase' and 'character_is_digit', we use characters that are in a certain range of the ASCII table, which we'll show just
below. So, it's safe to use the '>=' and '<=' operators, but in other cases, we want to compare them selectively, and for simplicity we use the function
'character_compare_array'... Here's what the ASCII table looks like, I don't like encodings like UTF-8 and others, so neither should you. We'll also write a subprogram that prints
this to a terminal or graphical window.

ASCII table:

- 0B: Binary representation.
- 0O: Octal representation.
- 0D: Decimal representation.
- 0X: Hexadecimal representation.
 ___________________________________________________________________________________________________________________
|_0B______|_0O__|_0D__|_0X_|_SYM_|_Full_name______________|_0B______|_0O__|_0D__|_0X_|_SYM_|_Full_name______________|
|         |     |     |    |     |                        |         |     |     |    |     |                        |
| 0000000 | 000 |   0 | 00 | NUL | Null                   | 0000001 | 001 |   1 | 01 | SOH | Start of heading       |
| 0000010 | 002 |   2 | 02 | STX | Start of text          | 0000011 | 003 |   3 | 03 | ETX | End of text            |
| 0000100 | 004 |   4 | 04 | EOT | End of transmission    | 0000101 | 005 |   5 | 05 | ENQ | Enquiry                |
| 0000110 | 006 |   6 | 06 | ACK | Acknowledge            | 0000111 | 007 |   7 | 07 | BEL | Bell                   |
| 0001000 | 010 |   8 | 08 | BS  | Backspace              | 0001001 | 011 |   9 | 09 | HT  | Horizontal tab         |
| 0001010 | 012 |  10 | 0A | LF  | Line feed              | 0001011 | 013 |  11 | 0B | VT  | Vertical tab           |
| 0001100 | 014 |  12 | 0C | FF  | Form feed              | 0001101 | 015 |  13 | 0D | CR  | Carriage return        |
| 0001110 | 016 |  14 | 0E | SO  | Shift out              | 0001111 | 017 |  15 | 0F | SI  | Shift in               |
| 0010000 | 020 |  16 | 10 | DLE | Data link escape       | 0010001 | 021 |  17 | 11 | DC1 | Device control 1       |
| 0010010 | 022 |  18 | 12 | DC2 | Device control 2       | 0010011 | 023 |  19 | 13 | DC3 | Device control 3       |
| 0010100 | 024 |  20 | 14 | DC4 | Device control 4       | 0010101 | 025 |  21 | 15 | NAK | Negative acknowledge   |
| 0010110 | 026 |  22 | 16 | SYN | Synchronous idle       | 0010111 | 027 |  23 | 17 | ETB | End transmission block |
| 0011000 | 030 |  24 | 18 | CAN | Cancel                 | 0011001 | 031 |  25 | 19 | EM  | End of medium          |
| 0011010 | 032 |  26 | 1A | SUB | Substitute             | 0011011 | 033 |  27 | 1B | ESC | Escape                 |
| 0011100 | 034 |  28 | 1C | FS  | File separator         | 0011101 | 035 |  29 | 1D | GS  | Group separator        |
| 0011110 | 036 |  30 | 1E | RS  | Record separator       | 0011111 | 037 |  31 | 1F | US  | Unit separator         |
| 0100000 | 040 |  32 | 20 |     | Space                  | 0100001 | 041 |  33 | 21 | !   | Exclamation mark       |
| 0100010 | 042 |  34 | 22 | "   | Speech mark            | 0100011 | 043 |  35 | 23 | #   | Number sign            |
| 0100100 | 044 |  36 | 24 | $   | Dollar sign            | 0100101 | 045 |  37 | 25 | %   | Percent                |
| 0100110 | 046 |  38 | 26 | &   | Ampersand              | 0100111 | 047 |  39 | 27 | '   | Quote                  |
| 0101000 | 050 |  40 | 28 | (   | Open parenthesis       | 0101001 | 051 |  41 | 29 | )   | Close parenthesis      |
| 0101010 | 052 |  42 | 2A | *   | Asterisk               | 0101011 | 053 |  43 | 2B | +   | Plus                   |
| 0101100 | 054 |  44 | 2C | ,   | Comma                  | 0101101 | 055 |  45 | 2D | -   | Minus                  |
| 0101110 | 056 |  46 | 2E | .   | Period                 | 0101111 | 057 |  47 | 2F | /   | Slash                  |
| 0110000 | 060 |  48 | 30 | 0   | Zero                   | 0110001 | 061 |  49 | 31 | 1   | One                    |
| 0110010 | 062 |  50 | 32 | 2   | Two                    | 0110011 | 063 |  51 | 33 | 3   | Three                  |
| 0110100 | 064 |  52 | 34 | 4   | Four                   | 0110101 | 065 |  53 | 35 | 5   | Five                   |
| 0110110 | 066 |  54 | 36 | 6   | Six                    | 0110111 | 067 |  55 | 37 | 7   | Seven                  |
| 0111000 | 070 |  56 | 38 | 8   | Eight                  | 0111001 | 071 |  57 | 39 | 9   | Nine                   |
| 0111010 | 072 |  58 | 3A | :   | Colon                  | 0111011 | 073 |  59 | 3B | ;   | Semicolon              |
| 0111100 | 074 |  60 | 3C | <   | Open angled bracket    | 0111101 | 075 |  61 | 3D | =   | Equal                  |
| 0111110 | 076 |  62 | 3E | >   | Close angled bracket   | 0111111 | 077 |  63 | 3F | ?   | Question mark          |
| 1000000 | 100 |  64 | 40 | @   | At sign                | 1000001 | 101 |  65 | 41 | A   | Uppercase A            |
| 1000010 | 102 |  66 | 42 | B   | Uppercase B            | 1000011 | 103 |  67 | 43 | C   | Uppercase C            |
| 1000100 | 104 |  68 | 44 | D   | Uppercase D            | 1000101 | 105 |  69 | 45 | E   | Uppercase E            |
| 1000110 | 106 |  70 | 46 | F   | Uppercase F            | 1000111 | 107 |  71 | 47 | G   | Uppercase G            |
| 1001000 | 110 |  72 | 48 | H   | Uppercase H            | 1001001 | 111 |  73 | 49 | I   | Uppercase I            |
| 1001010 | 112 |  74 | 4A | J   | Uppercase J            | 1001011 | 113 |  75 | 4B | K   | Uppercase K            |
| 1001100 | 114 |  76 | 4C | L   | Uppercase L            | 1001101 | 115 |  77 | 4D | M   | Uppercase M            |
| 1001110 | 116 |  78 | 4E | N   | Uppercase N            | 1001111 | 117 |  79 | 4F | O   | Uppercase O            |
| 1010000 | 120 |  80 | 50 | P   | Uppercase P            | 1010001 | 121 |  81 | 51 | Q   | Uppercase Q            |
| 1010010 | 122 |  82 | 52 | R   | Uppercase R            | 1010011 | 123 |  83 | 53 | S   | Uppercase S            |
| 1010100 | 124 |  84 | 54 | T   | Uppercase T            | 1010101 | 125 |  85 | 55 | U   | Uppercase U            |
| 1010110 | 126 |  86 | 56 | V   | Uppercase V            | 1010111 | 127 |  87 | 57 | W   | Uppercase W            |
| 1011000 | 130 |  88 | 58 | X   | Uppercase X            | 1011001 | 131 |  89 | 59 | Y   | Uppercase Y            |
| 1011010 | 132 |  90 | 5A | Z   | Uppercase Z            | 1011011 | 133 |  91 | 5B | [   | Opening bracket        |
| 1011100 | 134 |  92 | 5C | \   | Backslash              | 1011101 | 135 |  93 | 5D | ]   | Closing bracket        |
| 1011110 | 136 |  94 | 5E | ^   | Caret                  | 1011111 | 137 |  95 | 5F | _   | Underscore             |
| 1100000 | 140 |  96 | 60 | `   | Grave                  | 1100001 | 141 |  97 | 61 | a   | Lowercase a            |
| 1100010 | 142 |  98 | 62 | b   | Lowercase b            | 1100011 | 143 |  99 | 63 | c   | Lowercase c            |
| 1100100 | 144 | 100 | 64 | d   | Lowercase d            | 1100101 | 145 | 101 | 65 | e   | Lowercase e            |
| 1100110 | 146 | 102 | 66 | f   | Lowercase f            | 1100111 | 147 | 103 | 67 | g   | Lowercase g            |
| 1101000 | 150 | 104 | 68 | h   | Lowercase h            | 1101001 | 151 | 105 | 69 | i   | Lowercase i            |
| 1101010 | 152 | 106 | 6A | j   | Lowercase j            | 1101011 | 153 | 107 | 6B | k   | Lowercase k            |
| 1101100 | 154 | 108 | 6C | l   | Lowercase l            | 1101101 | 155 | 109 | 6D | m   | Lowercase m            |
| 1101110 | 156 | 110 | 6E | n   | Lowercase n            | 1101111 | 157 | 111 | 6F | o   | Lowercase o            |
| 1110000 | 160 | 112 | 70 | p   | Lowercase p            | 1110001 | 161 | 113 | 71 | q   | Lowercase q            |
| 1110010 | 162 | 114 | 72 | r   | Lowercase r            | 1110011 | 163 | 115 | 73 | s   | Lowercase s            |
| 1110100 | 164 | 116 | 74 | t   | Lowercase t            | 1110101 | 165 | 117 | 75 | u   | Lowercase u            |
| 1110110 | 166 | 118 | 76 | v   | Lowercase v            | 1110111 | 167 | 119 | 77 | w   | Lowercase w            |
| 1111000 | 170 | 120 | 78 | x   | Lowercase x            | 1111001 | 171 | 121 | 79 | y   | Lowercase y            |
| 1111010 | 172 | 122 | 7A | z   | Lowercase z            | 1111011 | 173 | 123 | 7B | {   | Opening brace          |
| 1111100 | 174 | 124 | 7C | |   | Vertical bar           | 1111101 | 175 | 125 | 7D | }   | Closing brace          |
| 1111110 | 176 | 126 | 7E | ~   | Tilde                  | 1111111 | 177 | 127 | 7F | DEL | Delete                 |
|_________|_____|_____|____|_____|________________________|_________|_____|_____|____|_____|________________________|
You can see that the values of 'A' ... 'Z', 'a' ... 'z' and '0' ... '9' are sequential, but symbols and "system" characters are mixed up. You can also look at it this way:

- 0 ... 7: Upper 3 bits (0B0---0000).
- 0 ... F: Lower 4 bits (0B0000----).
 ___________________________________________________
|___|__0__|__1__|__2__|__3__|__4__|__5__|__6__|__7__|
|   |     |     |     |     |     |     |     |     |
| 0 | NUL | DLE |     | 0   | @   | P   | `   | p   |
| 1 | SOH | DC1 | !   | 1   | A   | Q   | a   | q   |
| 2 | STX | DC2 | "   | 2   | B   | R   | b   | r   |
| 3 | ETX | DC3 | #   | 3   | C   | S   | c   | s   |
| 4 | EOT | DC4 | $   | 4   | D   | T   | d   | t   |
| 5 | ENQ | NAK | %   | 5   | E   | U   | e   | u   |
| 6 | ACK | SYN | &   | 6   | F   | V   | f   | v   |
| 7 | BEL | ETB | '   | 7   | G   | W   | g   | w   |
| 8 | BS  | CAN | (   | 8   | H   | X   | h   | x   |
| 9 | HT  | EM  | )   | 9   | I   | Y   | i   | y   |
| A | LF  | SUB | *   | :   | J   | Z   | j   | z   |
| B | VT  | ESC | +   | ;   | K   | [   | k   | {   |
| C | FF  | FS  | ,   | <   | L   | \   | l   | |   |
| D | CR  | GS  | -   | =   | M   | ]   | m   | }   |
| E | SO  | RS  | .   | >   | N   | ^   | n   | ~   |
| F | SI  | US  | /   | ?   | O   | _   | o   | DEL |
|___|_____|_____|_____|_____|_____|_____|_____|_____|
You can notice that if you toggle bit 5 (counting from bit 0) of alphabet characters, you switch them between lowercase and uppercase. Since we're dealing with binary in this
case, when only bit 5 is 1, and the others are 0, that's 2**5 (2 to the power of 5), which is 32, which is again equal to the space character. That table also works for
hexadecimals, you can see that, for example, character 'H' is in the '4' column and the '8' row, so the hexadecimal value of character literal 'H' is 0X48. Of course, there's no
need to memorize any of those, they can just be handy if you feel exceptionally smart and want to do some bit manipulation on strings.
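If you do feel exceptionally smart, here's a minimal sketch of that bit manipulation. The three function names are hypothetical, they aren't declared anywhere in this book, and
the code assumes plain 7-bit ASCII input:

```c
// Hypothetical helper: flips bit 5 (value 32, 0X20) of an ASCII letter and leaves everything else alone.
static char character_toggle_case (char character) {
	if (((character >= 'A') && (character <= 'Z')) || ((character >= 'a') && (character <= 'z'))) {
		return ((char) (character ^ 0X20));
	}

	return (character);
}

// Hypothetical helpers: split a character into the column (upper 3 bits) and the row (lower 4 bits) of the small table.
static int character_column (char character) { return ((character >> 4) & 0X7); }
static int character_row    (char character) { return (character & 0XF);        }
```

For 'H' (0X48), 'character_column' gives 4 and 'character_row' gives 8, and 'character_toggle_case' turns it into 'h' (0X68).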
Let's talk very shortly about the C preprocessor. I don't like to use it, but sometimes you have to, and I'll show a few useful examples in later chapters. Note that you can make a
complete project in the C programming language without using the preprocessor even once, but since modern programs are large and often split into separate source files, and there's
no truly good build system (in any language), people use it in order not to copy and paste structures, unions, enumerations and function declarations. Global variables were
vilified because of multiple people working on the same project and fucking things up. I felt the need to repeat this below too...

In the C language, we have C source files with the extension '.c', and C header files with the extension '.h'. Both of those are just plain text files, and please use 7-bit ASCII
encoding, since it's common sense, UTF is cancer, and 8-bit ASCII is for enlightened people like Terrence Andrew Davis. The C language is completely separate (on some C compilers)
from its preprocessor, whose directives start with the '#' character, continue onto the next line with the '\' character and end at the '\n' (read: LINE FEED) character.
@C
#include <path/to/file/file_name.h> // Copy the entire file from a system directory (usually '/usr/include/') into this file, at the place where it was specified.
#include "path/to/file/file_name.h" // Copy the entire file from the current directory into this file, again at the place where it was specified.

#define SOMETHING // This will add additional information to the preprocessor about this file, it's mostly used for flags and header guards.
#undef  SOMETHING // This will remove that additional information you've provided...

#if     SOMETHING // If SOMETHING (a condition, obviously) is true, then the code until '#elif', '#else' or '#endif' will be included.
#ifdef  SOMETHING // If SOMETHING was previously '#define'-d, then the code until '#elif', '#else' or '#endif' will be included.
#ifndef SOMETHING // If SOMETHING wasn't (NOT!) previously '#define'-d, then the code until '#elif', '#else' or '#endif' will be included.
#elif             // Essentially "else if" for the preprocessor (it takes a condition, like '#if'), it's very ugly, and nesting them looks bad and is a bad practice.
#else             // Essentially "else" for the preprocessor, I don't think I ever used it in my entire life, but I saw other people use it.
#endif            // End if... Self-explanatory, and a sad thing that we need to ruin the beautiful C code with it.
@
Okay, that's all you really need to know about the C preprocessor, since we won't use it much. You can write a completely pure C project, using only the C language, but you'll end
up copying and pasting a lot of code, especially external function and variable declarations. Because of that we need the '#include' directive, and because of it, we need header
guards, so it's all C-hating in the end. However, we need to cover some simple macros, so you can deal with other people's code bases. Remember, even with few "building blocks",
if you learn them well, you can make anything, and you should be proud of "reinventing the wheel". If wheels weren't reinvented over and over again, then some expensive BMW would
have wooden wheels attached to it. You can also use '#define' to write entire functions, but that can lead to a lot of compiler warnings for only forgetting to put one character,
and it bloats the code, so we'll show how to write them, and never use them again.
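Here's a minimal sketch of how little it takes for a function-like macro to go wrong. Both macros and both wrapper functions are hypothetical, written only for this example.
Forgetting the parentheses doesn't even give a warning here, it silently computes something else:

```c
// Hypothetical macros, only to show the problem, neither comes from this book's code.
#define SQUARE_UNSAFE(x) x * x
#define SQUARE_SAFE(x)   ((x) * (x))

// The preprocessor pastes text, so the first one expands to: a + b * a + b
static int square_unsafe_of_sum (int a, int b) { return (SQUARE_UNSAFE (a + b)); }

// And the second one expands to: ((a + b) * (a + b))
static int square_safe_of_sum (int a, int b) { return (SQUARE_SAFE (a + b)); }
```

With a = 1 and b = 2, the unsafe version returns 5 (1 + 2 * 1 + 2), while the safe one returns 9.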
Then, as you probably noticed, comments in older C standards (K&R and ANSI) begin with "/(merged)*" and end with "*(merged)/". I put merged there, because otherwise it'd end this
comment there, and everything below it would be treated as C source code. Usually syntax highlighting will be weird if you do something like this, and you'll fix it easily, so
don't worry much about it. The preprocessor modifies C source code before the compiler sees it, so it removes all comments, copies and pastes the content of those files we
included, conditionally if we used if / ifdef / ifndef, and more. In newer standards (C99 and forward), you can have single-line comments that begin with "//" and end with (you
guessed it) a new line, aka character literal '\n', aka line feed. Again, I don't write comments at all except the licence notice in my projects, but this book project is an
exception. You should express what your code does by writing it properly, not by writing obfuscated code and adding comments about what it does.

Otherwise, roughly speaking, you have constants, variables, functions and pointers. In the end, it all comes down to CPU instructions that use registers, immediate values and
memory addresses (REG / IMM / MEM), which we'll mention way later.
Your constants can be internal, external, defined or enumerated:

@C
static const char DELETE = '\177'; // Used in only one file, where it's declared and defined.

extern const char DELETE;              // Used in C header file (.h).
       const char DELETE = (char) 127; // Used in C source file (.c).

#define DELETE (0X7F) // Used where the file containing this line was '#include'-d.

// Note that I consider this bad practice, since you also need to include the file containing this. Enumeration values start from 0, so in this case we need to set it to 127.
// With the 'typedef'-ed example, we can provide a name for that enumeration, and use it as 'enum my_enumeration_verbose_or_same_name' or 'my_enumeration'.
// Enumerations are only useful (from my experience) when you need to '#define' a lot of values in incremental order, from 0 to some number, and they are 'int' type by default.
enum { DELETE = 127 };
typedef enum my_enumeration_verbose_or_same_name { DELETE = 127 } my_enumeration;
@
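The "incremental order, from 0 to some number" case is where enumerations pay off, just like with the FILE_TYPE_* values earlier in this chapter. Here's a minimal sketch of that
idea. The COLOUR_* names, 'colour_name' and 'colour_to_string' are hypothetical and exist only in this example:

```c
// Hypothetical enumeration, values go 0, 1, 2 and the last entry counts the ones before it.
enum {
	COLOUR_RED, COLOUR_GREEN, COLOUR_BLUE,
	COLOUR_COUNT
};

// One name per enumerated value, COLOUR_COUNT sizes the array for us.
static char * colour_name [COLOUR_COUNT] = { "red", "green", "blue" };

// Hypothetical helper: maps an enumerated value to its name, or to "unknown" when out of range.
static char * colour_to_string (int colour) {
	if ((colour < 0) || (colour >= COLOUR_COUNT)) {
		return ("unknown");
	}

	return (colour_name [colour]);
}
```

Adding a new colour before COLOUR_COUNT grows the count automatically, you only need to add its name to the array in the same position.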
It's very similar for variables, but they can't be defined with '#define' (nor should they be!) or enumerated (unless you're thinking about arrays). You can have variables inside
and outside functions, those inside are called local variables, and those outside are called global variables. Global variables can make the code shorter, simpler and easier to
change, but if you or the people you work with don't know what the hell they're doing, they can lead to messy code or difficult to track bugs. Just don't think that they are evil
and should never be used, because that's not the case.

Local variables (that you'll see inside my functions) are by default declared with 'auto' instead of 'static' or 'extern', but you don't need to write 'auto' before them, since
it's implicitly there. In the old C language (K&R standard), they used to write code like the example below, which isn't something you should do nowadays. Also, local variables
aren't accessible or modifiable after the function ends, you'll see examples of that later.
@C
// K&R example (old C language):
output (data, size)
char * data; {
	write (1, data, size);
}

// Which would be something like the following in newer standards:
static int output (char * data, int size) {
	(void) write (STDOUT_FILENO, (void *) data, (size_t) size * sizeof (* data));

	return (0);
}

static char * string_pointer = NULL; // Later we can modify them, use them in functions and much more, but only in the file where they were defined and declared.
static char * string         = NULL;

// Somewhere later in the file, inside some function:
// string = calloc (1024UL, sizeof (* string));
// string = strncpy (string, "Heyo world!", 1023UL);
// ...
// free (string);

extern int subprogram_id;     // Used in C header file (.h).
       int subprogram_id = 0; // Used in C source file (.c).

// Later, any function can modify this variable with just:
// subprogram_id = 144000;
@
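Here's a minimal runnable sketch of the commented-out lines from the block above, showing a file-level 'static' variable next to a local one. The function names 'string_store'
and 'string_forget' are hypothetical, and the sizes 1024 and 1023 are only example values:

```c
#include <stdlib.h>
#include <string.h>

static char * string = NULL; // Visible only in this file, and it lives as long as the program does.

// Hypothetical helper: allocates 'string', copies 'text' into it and returns its length, or -1 on failure.
static int string_store (char * text) {
	int length = 0; // Local ('auto') variable, gone once this function returns.

	free (string); // Freeing NULL is allowed, so the first call is safe too.

	string = calloc (1024UL, sizeof (* string));

	if (string == NULL) {
		return (-1);
	}

	(void) strncpy (string, text, 1023UL); // The last byte stays '\0', because 'calloc' zeroed all 1024 of them.

	length = (int) strlen (string);

	return (length);
}

// Hypothetical helper: releases the memory and resets the pointer.
static void string_forget (void) {
	free (string);

	string = NULL;
}
```

After 'string_store ("Heyo world!")' returns 11, any function in the same file can read 'string', while 'length' no longer exists.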
Functions are simply parts of the program that modify variables or execute some code that causes a side effect, and then return a value or nothing (void). For example, our
'character_*' functions below perform some operations on the operand 'character' of type 'char', without modifying it, and return some value of type 'int', and they are declared as
external with 'extern', because they are defined in another text file. In this case, our family of functions 'character_*' will return FALSE (0) or TRUE (1), depending on what
they do in their definitions, and I like to put '_is_' in the names of functions that return a boolean value (but not always).
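To give an idea of what such definitions could look like, here's a minimal sketch of a few of them. The names match the declarations below, but the bodies are my assumption, the
real definitions live in another file and may differ, especially in what counts as "blank":

```c
// Range checks are safe here, because 'A' ... 'Z', 'a' ... 'z' and '0' ... '9' are sequential in ASCII.
int character_is_uppercase (char character) { return ((character >= 'A') && (character <= 'Z')); }
int character_is_lowercase (char character) { return ((character >= 'a') && (character <= 'z')); }
int character_is_digit     (char character) { return ((character >= '0') && (character <= '9')); }

// Returns 1 if 'character' appears anywhere in the null-terminated 'character_array', otherwise 0.
int character_compare_array (char character, char * character_array) {
	int i = 0;

	for (i = 0; character_array [i] != '\0'; ++i) {
		if (character == character_array [i]) {
			return (1);
		}
	}

	return (0);
}

// Assumption: "blank" means space, horizontal tab, carriage return or line feed.
int character_is_blank (char character) { return (character_compare_array (character, " \t\r\n")); }
```

The first three rely on ranges, while 'character_is_blank' compares selectively, because blank characters aren't next to each other in the ASCII table.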
*/

extern int character_is_uppercase   (char character); // Notice how we align those functions, I believe this improves the readability of any program, in any programming language.
extern int character_is_lowercase   (char character); // Some people would just use 'ischrlow' or 'islower', but I hate reading code written like that...
extern int character_is_digit       (char character); // An important note is also that a programming language is not, and shouldn't be, like a natural language, why?
extern int character_is_blank       (char character); // Because we need strict rules in a programming language, same as in mathematical languages, now now, don't be scared.
extern int character_is_alpha       (char character);
extern int character_is_symbol      (char character);
extern int character_is_visible     (char character);
extern int character_is_invisible   (char character);
extern int character_is_escape      (char character);
extern int character_is_underscore  (char character);
extern int character_is_hexadecimal (char character);

extern int character_compare_array (char character, char * character_array); // This function is singled out, because it's different from those above, and we use it internally.

extern int character_count (char * string, char character, int from, int to, char stop); // We'll use this to count characters in null-terminated strings.
/*
And here are also utility functions that handle files. Most of them are reimplemented using "system calls" from <fcntl.h> and <unistd.h>, but you also have access to <stdio.h>,
which is probably the most used header file in the C language. It handles the 'FILE *' type, not file descriptors, which are 'int', and has functions that are prefixed with the
character 'f', for example, 'fopen / fclose / fread / fwrite / fseek' and many more.
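As a taste of the file descriptor approach, here's a minimal sketch that gets the size of a file with 'open', 'lseek' and 'close'. The name 'sketch_file_size' is hypothetical,
this isn't the book's 'file_size' definition, and it assumes a POSIX system and a file small enough for its size to fit in 'int':

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical sketch: returns the size of the file 'name' in bytes, or -1 if it can't be opened.
int sketch_file_size (char * name) {
	int file = open (name, O_RDONLY);
	int size = 0;

	if (file == -1) {
		return (-1);
	}

	size = (int) lseek (file, 0, SEEK_END); // Seeking to the end returns the offset, which is the size.

	(void) close (file); // Every opened file descriptor should be closed.

	return (size);
}
```

Notice that the file descriptor is a plain 'int', and that it's closed before returning, no matter what 'lseek' reported.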
*/

extern int    file_open   (char * name, int mode);            // We open a file descriptor for the file 'name' with 'mode', obviously...
extern int    file_close  (int file);                         // We close the file descriptor 'file'. Every opened file descriptor should be closed when the program finishes.
extern void   file_read   (int file, void * data, int size);  // We read from 'file' into 'data', by 'size' amount, similar to 'in'.
extern void   file_write  (int file, void * data, int size);  // We write from 'data' into 'file', by 'size' amount, similar to 'out'.
extern int    file_seek   (int file, int whence);             // We retrieve data about offsets in 'file'.
extern int    file_size   (char * name);                      // We get the size of the file by its 'name'.
extern int    file_type   (char * name);                      // We get the type of the file by its 'name' (by file name extension).
extern char * file_record (char * name);                      // We store an entire file at some memory address.

// These will be useful in chapter three, where we'll learn about the 'printf' function. It's too complex to cover at this point.
extern char * number_to_string (int number);
extern char * format_to_string (int number, int sign, int base, int amount, char character);

extern int randomize (int minimum, int maximum);

#endif