xhartae/chapters/chapter_1.h

252 lines
28 KiB
C

/*
Copyright (c) 2023 : Ognjen 'xolatile' Milan Robovic
Xhartae is free software! You will redistribute it or modify it under the terms of the GNU General Public License by Free Software Foundation.
And when you do redistribute it or modify it, it will use either version 3 of the License, or (at yours truly opinion) any later version.
It is distributed in the hope that it will be useful or harmful, it really depends... But no warranty what so ever, seriously. See GNU/GPLv3.
*/
#ifndef CHAPTER_1_HEADER
#define CHAPTER_1_HEADER
#include "chapter_0.h" // We need this header file for some core enumeration definitions and function declarations.
/*
Now that you've read the chapter zero, we should get into some technical details about C programming language. This will be the most imporant chapter for people who know some
other lower level programming language, or who intuitively understand "building blocks" of some system. Below is just a simple matrix of them, 8 x 4, so you can see that there are
really 32 keywords (ANSI C standard, I dislike newer standards), and even more below we'll categorize them. Also, keep in mind that I'll briefly talk about other C standards such
as K&R, C99, etc., and use parts of them in some places, so you don't get confused when you see them in other peoples' source code.
- Keywords:
static signed if typedef
extern unsigned else enum
const float for union
auto double do struct
register char while return
volatile int switch goto
sizeof short case break
void long default continue
Keywords that you should never use, under any circumstances (except for writing your own C compiler or working in embedded) are:
- register: Hint to the compiler that some variable will be used a lot, so it can be stored in CPU register, modern compilers ignore this keyword.
- volatile: Hint to the compiler that some variable may be changed by some external source. Such a cool keyword, but never use it.
- auto: It specifies automatic storage duration for local variables inside a block. Simply, variable won't exist outside {} block it's defined in.
- signed: We'll talk about types in C more, but this is always assumed, so use only 'unsigned', when you really need to.
- union: You'll have very little excuse to use unions, because they often don't go well with type checking and introduce complexity for no reason.
- goto: I'm kidding, feel free to use 'goto' statement, it's not harmful, some people just abused it to the point of it being bullied.
Keywords that you should really consider not to use in general, but only for some very specific cases (compiler warnings or using some libraries):
- const: Constant qualifier sometimes tells to the compiler that a value won't be changed. Use it only to silence the compiler warnings...
- unsigned: Again, you shouldn't care if your variable is signed or unsigned, because C really likes integers, and hates naturals. We'll talk more about it.
- float: Unless you're making a game engine with GPU acceleration or neural network, I wouldn't use this type at all due to it's limitations.
- double: Well, same as 'float', but twice bigger and more precise floating point values. You'll see it being used very rarely.
- short: Use it only to silence warnings, especially those from XCB and Xlib libraries, for whatever reason (read: X11) they use million different types.
- long: Also use this only to silence warnings, because some standard library functions use this type, pure C-hating cancer...
- typedef: I don't like it at all, but this one is my personal preference, because it just introduces unneeded mental overhead when programming.
- do: It can only be used before '{', then you do the loop thing, then '}' and then 'while' statement. I prefer just to use 'for' everywhere.
Okay, now we're left with following actually useful and C-loving keywords, 18 of them, that I'll cover more:
- char: Type for storing ASCII characters, 8 bits wide, implicitly casted to 'int' in sometimes, arrays that use them are string literals.
- int: Type for storing signed numerical values, 32 bits wide, most number literals are casted to this type, compiler will warn you for mixing types.
- void: This is a black hole, nothing can escape it.
- sizeof: Some people say this behaves like a function, but they are wrong, it's a statement. It literally does what is says, more on that later...
- static: This qualifier has several meanings, that function or variable will only be used in that file or when inside a function, that variables persists.
- extern: This qualifier is more simple, and implicit in older compilers, newer ones warn when it's not used. Something will be used outside that file.
- if: If some condition(s) are met (equal to true or non-zero value), do the thing specified in the block.
- else: With 'else' you can "bind" multiple 'if' statements, but know that you can completely avoid it, and get the same results.
- for: Very complex kind of loop statement, that leads to segmentation faults and many other memory and safety related bugs. I love to use it.
- while: Very simple king of loop statement, that leads to segmentation faults and many other memory and safety related bugs. I don't love to use it.
- switch: This statement is the most powerful in the whole C language, it's not just if-else made pretty, we'll see examples later.
- case: Used only inside 'switch' statement, it sets some number literal as an implicit label, so when the expression in 'switch' is equals it, it jumps there.
- default: Used only because C didn't force curly brackets everywhere back when it was made, but every 'switch' should have just one 'default'.
- enum: Sometimes treated as type (with 'typedef'), and always very good way to define a lot of constants, we'll talk about them more.
- struct: I really advice against using structures, but most libraries use them, so you need to know how they work. In C, they are very weak part of the language.
- return: Used in functions to return from them with or without a result (return value). It's best when function only uses it once.
- break: When inside a loop, this statement will exit the loop. In newer standards it can take simple arguments, but we'll see why that's bad.
- continue: Even more specific case, when inside a loop, this statement will skip to the end of the loop, and then continue again.
Now, of those 18 actually useful C keywords, I like to avoid 'struct', 'switch', 'case', 'default', 'while', and use (functional) 'static' and 'extern' only in order to silence
compiler warnings, 'static' inside a function is more useful. That leaves us (me) with 12 C keywords that I love, out of complete 32 keywords in ANSI C standard. However, we'll
see that sometimes is preferable to use switch statement somewhere, or while loop when for loop feels like overkill. In most real-world cases, you'll need to use some API or
library that internally used structures everywhere, so you'll need to adapt to it, we'll see examples later...
So, real men need these keywords { char, int, void, sizeof, static, if, else, for, enum, return }, and use the rest of them in order to silence compiler warnings, use some
standard library functions, clean the source code or access an API / library / header file. Lets see some more examples and keep in mind code formatting...
@C
extern int this_is_kind (kind_type this);
// extern - we declare this function as external one, because we'll create an object file (.o), and link it with other programs (or object files).
// int - we set 'int', integer as return type of this function, and use 0 as false, and 1 as true, instead of 'bool' type from <stdbool.h>.
// this_is_kind - we named our function like this, you'll see more of verbose naming when I write programs, because I think it's better...
// kind_type - we choose the type of arguments in our function, since our function deals with 'kind', we choose proper 'kind_type'.
// this - we named our argument about what's it supposed to be, use similar approach in return type, function, argument type and argument names.
@
Before continuing, lets describe some imporant types in C very simply:
Word: Bytes: Bits: Kind: Minimum: Maximum:
void / / Black hole / /
void * 8 64 Memory address / /
char 1 8 Integer -128 127
short 2 16 Integer -32768 32767
int 4 32 Integer -2147483648 2147483647
long 8 64 Integer -9223372036854775808 9223372036854775807
unsigned char 1 8 Natural 0 256
unsigned short 2 16 Natural 0 65535
unsigned int 4 32 Natural 0 4294967295
unsigned long 8 64 Natural 0 18446744073709551615
float 4 32 Real (IEEE754) / /
double 8 64 Real (IEEE754) / /
Note that you shouldn't care for now about 'void' and 'void *', because they're special cases, nor about their minimum and maximum. Also, the less types you use, the more
type-safe your code is. That's the reason why I use 'int', 'char *' and 'char', and sometimes 'void *' and 'int *'. You get less compiler and linter warnings about conversions,
but you need to know exactly what you're doing and what your code will do in order to have no bugs. Again, think twice, write once.
*/
enum { // We won't even cover all of those, this is just an example of how to do similar task without hardcoding file extensions.
FILE_TYPE_TEXT, FILE_TYPE_COMMON_ASSEMBLY, FILE_TYPE_FLAT_ASSEMBLY, FILE_TYPE_GNU_ASSEMBLY,
FILE_TYPE_NETWIDE_ASSEMBLY, FILE_TYPE_YET_ANOTHER_ASSEMBLY, FILE_TYPE_C_SOURCE, FILE_TYPE_C_HEADER,
FILE_TYPE_ADA_BODY, FILE_TYPE_ADA_SPECIFICATION, FILE_TYPE_CPP_SOURCE, FILE_TYPE_CPP_HEADER,
FILE_TYPE_COUNT
};
/*
Here are some "utility" functions that we'll maybe use, reimplementation of those from standard library header file <ctype.h>. I prefer names like this, and this is for learning
purposes, so it's nice that you can see it here instead of searching through folders such as "/usr/include/". Lets talk about ASCII table now!
In functions 'character_is_uppercase', 'character_is_lowercase' and 'character_is_digit', we use characters that are in certain range on ASCII table, which we'll show just below.
So, it's safe to use '>=' and '<=' operators, but in other cases, we want to compare them selectively, and for simplicity we use function 'character_compare_array'... Here's how
ASCII table looks like, I don't like encodings like UTF-8 and others, so neither should you. We'll also write a subprogram that prints this to terminal or graphical window.
ASCII table:
_______________________________________________________________________________________________________________________________________________________________
|_0B______|_0O__|_0D__|_0X_|_SYM_|_Full_name____________________________________|_0B______|_0O__|_0D__|_0X_|_SYM_|_Full_name____________________________________|
| | |
| 0000000 | 000 | 0 | 00 | NUL | Null | 0000001 | 001 | 1 | 01 | SOH | Start of heading |
| 0000010 | 002 | 2 | 02 | STX | Start of text | 0000011 | 003 | 3 | 03 | ETX | End of text |
| 0000100 | 004 | 4 | 04 | EOT | End of transmission | 0000101 | 005 | 5 | 05 | ENQ | Enquiry |
| 0000110 | 006 | 6 | 06 | ACK | Acknowledge | 0000111 | 007 | 7 | 07 | BEL | Bell |
| 0001000 | 010 | 8 | 08 | BS | Backspace | 0001001 | 011 | 9 | 09 | HT | Horizontal tab |
| 0001010 | 012 | 10 | 0A | LF | Line feed | 0001011 | 013 | 11 | 0B | VT | Vertical tab |
| 0001100 | 014 | 12 | 0C | FF | Form feed | 0001101 | 015 | 13 | 0D | CR | Carriage return |
| 0001110 | 016 | 14 | 0E | SO | Shift out | 0001111 | 017 | 15 | 0F | SI | Shift in |
| 0010000 | 020 | 16 | 10 | DLE | Data link escape | 0010001 | 021 | 17 | 11 | DC1 | Device control 1 |
| 0010010 | 022 | 18 | 12 | DC2 | Device control 2 | 0010011 | 023 | 19 | 13 | DC3 | Device control 3 |
| 0010100 | 024 | 20 | 14 | DC4 | Device control 4 | 0010101 | 025 | 21 | 15 | NAK | Negative acknowledge |
| 0010110 | 026 | 22 | 16 | SYN | Synchronous idle | 0010111 | 027 | 23 | 17 | ETB | End transmission block |
| 0011000 | 030 | 24 | 18 | CAN | Cancel | 0011001 | 031 | 25 | 19 | EM | End of medium |
| 0011010 | 032 | 26 | 1A | SUB | Substitute | 0011011 | 033 | 27 | 1B | ESC | Escape |
| 0011100 | 034 | 28 | 1C | FS | File separator | 0011101 | 035 | 29 | 1D | GS | Group separator |
| 0011110 | 036 | 30 | 1E | RS | Record separator | 0011111 | 037 | 31 | 1F | US | Unit separator |
| 0100000 | 040 | 32 | 20 | | Space | 0100001 | 041 | 33 | 21 | ! | Exclamation mark |
| 0100010 | 042 | 34 | 22 | " | Speech mark | 0100011 | 043 | 35 | 23 | # | Number sign |
| 0100100 | 044 | 36 | 24 | $ | Dollar sign | 0100101 | 045 | 37 | 25 | % | Percent |
| 0100110 | 046 | 38 | 26 | & | Ampersand | 0100111 | 047 | 39 | 27 | ' | Quote |
| 0101000 | 050 | 40 | 28 | ( | Open parenthesis | 0101001 | 051 | 41 | 29 | ) | Close parenthesis |
| 0101010 | 052 | 42 | 2A | * | Asterisk | 0101011 | 053 | 43 | 2B | + | Plus |
| 0101100 | 054 | 44 | 2C | , | Comma | 0101101 | 055 | 45 | 2D | - | Minus |
| 0101110 | 056 | 46 | 2E | . | Period | 0101111 | 057 | 47 | 2F | / | Slash |
| 0110000 | 060 | 48 | 30 | 0 | Zero | 0110001 | 061 | 49 | 31 | 1 | One |
| 0110010 | 062 | 50 | 32 | 2 | Two | 0110011 | 063 | 51 | 33 | 3 | Three |
| 0110100 | 064 | 52 | 34 | 4 | Four | 0110101 | 065 | 53 | 35 | 5 | Five |
| 0110110 | 066 | 54 | 36 | 6 | Six | 0110111 | 067 | 55 | 37 | 7 | Seven |
| 0111000 | 070 | 56 | 38 | 8 | Eight | 0111001 | 071 | 57 | 39 | 9 | Nine |
| 0111010 | 072 | 58 | 3A | : | Colon | 0111011 | 073 | 59 | 3B | ; | Semicolon |
| 0111100 | 074 | 60 | 3C | < | Open angled bracket | 0111101 | 075 | 61 | 3D | = | Equal |
| 0111110 | 076 | 62 | 3E | > | Close angled bracket | 0111111 | 077 | 63 | 3F | ? | Question mark |
| 1000000 | 100 | 64 | 40 | @ | At sign | 1000001 | 101 | 65 | 41 | A | Uppercase A |
| 1000010 | 102 | 66 | 42 | B | Uppercase B | 1000011 | 103 | 67 | 43 | C | Uppercase C |
| 1000100 | 104 | 68 | 44 | D | Uppercase D | 1000101 | 105 | 69 | 45 | E | Uppercase E |
| 1000110 | 106 | 70 | 46 | F | Uppercase F | 1000111 | 107 | 71 | 47 | G | Uppercase G |
| 1001000 | 110 | 72 | 48 | H | Uppercase H | 1001001 | 111 | 73 | 49 | I | Uppercase I |
| 1001010 | 112 | 74 | 4A | J | Uppercase J | 1001011 | 113 | 75 | 4B | K | Uppercase K |
| 1001100 | 114 | 76 | 4C | L | Uppercase L | 1001101 | 115 | 77 | 4D | M | Uppercase M |
| 1001110 | 116 | 78 | 4E | N | Uppercase N | 1001111 | 117 | 79 | 4F | O | Uppercase O |
| 1010000 | 120 | 80 | 50 | P | Uppercase P | 1010001 | 121 | 81 | 51 | Q | Uppercase Q |
| 1010010 | 122 | 82 | 52 | R | Uppercase R | 1010011 | 123 | 83 | 53 | S | Uppercase S |
| 1010100 | 124 | 84 | 54 | T | Uppercase T | 1010101 | 125 | 85 | 55 | U | Uppercase U |
| 1010110 | 126 | 86 | 56 | V | Uppercase V | 1010111 | 127 | 87 | 57 | W | Uppercase W |
| 1011000 | 130 | 88 | 58 | X | Uppercase X | 1011001 | 131 | 89 | 59 | Y | Uppercase Y |
| 1011010 | 132 | 90 | 5A | Z | Uppercase Z | 1011011 | 133 | 91 | 5B | [ | Opening bracket |
| 1011100 | 134 | 92 | 5C | \ | Backslash | 1011101 | 135 | 93 | 5D | ] | Closing bracket |
| 1011110 | 136 | 94 | 5E | ^ | Caret | 1011111 | 137 | 95 | 5F | _ | Underscore |
| 1100000 | 140 | 96 | 60 | ` | Grave | 1100001 | 141 | 97 | 61 | a | Lowercase a |
| 1100010 | 142 | 98 | 62 | b | Lowercase b | 1100011 | 143 | 99 | 63 | c | Lowercase c |
| 1100100 | 144 | 100 | 64 | d | Lowercase d | 1100101 | 145 | 101 | 65 | e | Lowercase e |
| 1100110 | 146 | 102 | 66 | f | Lowercase f | 1100111 | 147 | 103 | 67 | g | Lowercase g |
| 1101000 | 150 | 104 | 68 | h | Lowercase h | 1101001 | 151 | 105 | 69 | i | Lowercase i |
| 1101010 | 152 | 106 | 6A | j | Lowercase j | 1101011 | 153 | 107 | 6B | k | Lowercase k |
| 1101100 | 154 | 108 | 6C | l | Lowercase l | 1101101 | 155 | 109 | 6D | m | Lowercase m |
| 1101110 | 156 | 110 | 6E | n | Lowercase n | 1101111 | 157 | 111 | 6F | o | Lowercase o |
| 1110000 | 160 | 112 | 70 | p | Lowercase p | 1110001 | 161 | 113 | 71 | q | Lowercase q |
| 1110010 | 162 | 114 | 72 | r | Lowercase r | 1110011 | 163 | 115 | 73 | s | Lowercase s |
| 1110100 | 164 | 116 | 74 | t | Lowercase t | 1110101 | 165 | 117 | 75 | u | Lowercase u |
| 1110110 | 166 | 118 | 76 | v | Lowercase v | 1110111 | 167 | 119 | 77 | w | Lowercase w |
| 1111000 | 170 | 120 | 78 | x | Lowercase x | 1111001 | 171 | 121 | 79 | y | Lowercase y |
| 1111010 | 172 | 122 | 7A | z | Lowercase z | 1111011 | 173 | 123 | 7B | { | Opening brace |
| 1111100 | 174 | 124 | 7C | | | Vertical bar | 1111101 | 175 | 125 | 7D | } | Closing brace |
| 1111110 | 176 | 126 | 7E | ~ | Tilde | 1111111 | 177 | 127 | 7F | DEL | Delete |
|_______________________________________________________________________________|_______________________________________________________________________________|
You can see that values of 'A' ... 'Z', 'a' ... 'z' and '0' ... '9' are sequential, but symbols and "system" characters are mixed up.
In C language, we have C source files with the extension '.c', and C header files with the extension '.h'. Both of those are just plain text files, and please use 7-bit ASCII
encoding, since it's common sense, UTF is cancer, and 8-bit ASCII is for enlightened people like Terrence Andrew Davis. C language is completely separate (on some C compilers)
from its' preprocessor, whose directives start with '#' character, continue on '\' character and break on '\n' (read: LINE FEED) character.
@C
#include <path/to/file/file_name.h> // Copy the entire file from '/usr/include/' directory into this file, on the place where it was specified.
#include "path/to/file/file_name.h" // Copy the entire file from current directory into this file, again on the place where it was specified.
#define SOMETHING // This will add additional information to the preprocessor about this file, it's mostly used for flags and header-guards.
#undef SOMETHING // This will remove that additional information you've provided...
#if SOMETHING // If SOMETHING (condition obviously) is true, then code until '#elif', '#else' or '#endif' will be included.
#ifdef SOMETHING // If SOMETHING was previously '#define'-d, then code until '#elif', '#else' or '#endif' will be included.
#ifndef SOMETHING // If SOMETHING wasn't (NOT!) previously '#define'-d, then code until '#elif', '#else' or '#endif' will be included.
#elif // Essentially "else if" for preprocessor, it's very ugly, and nesting them looks bad and is a bad practice.
#else // Essentially "else" for preprocessor, I don't think I ever used it in my entire life, but I saw other people use it.
#endif // End if... Self-explanatory, and a sad thing that we need to ruin the beautiful C code with it.
@
Okay, that's all you really need to know about C preprocessor, since we won't use it much. You can write a completely pure C project, using only C language, but you'll end up with
copying and pasting a lot of code, especially external function and variable declarations. Because of that we need '#include' directive, and because of it, we need header guards,
so it's all C-hating in the end. However, we need to cover some simple macros, so you can deal with other peoples' code bases. Remember, the less "building blocks" you have, if
you learn them well, you can make anything, and you should be proud of "reinventing the wheel". If wheels weren't reinvented over and over again, then some expensive BMW would've
wooden wheels attached to it.
*/
extern int character_is_uppercase (char character); // Notice how we align those functions, I believe this improves the readability of any program, in any programming language.
extern int character_is_lowercase (char character); // Some people would just use 'ischrupp' or something, but I hate reading code written like that...
extern int character_is_digit (char character); // Important note is also that a programming language is not, and it should be like natural language, why?
extern int character_is_blank (char character); // Because we need strict rules in programming language, same like in mathematical languages, now now, don't be scared.
extern int character_is_alpha (char character);
extern int character_is_symbol (char character);
extern int character_is_visible (char character);
extern int character_is_invisible (char character);
extern int character_is_escape (char character);
extern int character_is_underscore (char character);
extern int character_is_hexadecimal (char character);
extern int character_compare_array (char character, char * character_array); // This function is singled out, because it's different from those above, and we use it internally.
/*
And here are also utility functions that handle files, most of them are reimplemented using "system calls" from <fcntl.h> and <unistd.h>, but you also have access to <stdio.h>,
which is probably the most used header file in C language. It handles the 'FILE *' type, not a file descriptors which are 'int', and has functions that are prefixed with character
'f', for example, 'fopen / fclose / fread / fwrite / fseek' and many more.
*/
extern int file_open (char * name, int mode); // We open a file descriptor 'name' with 'mode', obviously...
extern int file_close (int file); // We. Every opened file descriptor should be closed when program finishes.
extern void file_read (int file, void * data, int size); // We read from 'file' into 'data', by 'size' amount, similar to 'in'.
extern void file_write (int file, void * data, int size); // We write from 'data' into 'file', by 'size' amount, similar to 'out'.
extern int file_seek (int file, int whence); // We retrieve data about offsets in 'file'.
extern int file_size (char * name); // We get the size of the file by its' 'name'.
extern int file_type (char * name); // We get the type of the file by its' 'name' (by file name extension).
extern void * file_record (char * name); // We store an entire file into some memory address.
#endif