# libhl
## API
```C
int hl_init(void);
int hl_deinit(void);
```
These functions manage the library's lifetime.
`hl_init()` must be called before any other library function.
`hl_deinit()` ensures that all memory held by the library is freed.
```C
#define HLPATH ~/.local/hl/:~/.vim/syntax/
```
Colon-separated list of directories to be searched for syntax scripts. `#undef` it to disable the lookup entirely.
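As an illustration of how a colon-separated list like this can be walked, here is a minimal sketch. `path_list_entry()` is a hypothetical helper written for this README, not a libhl function, and it does not expand `~`; how libhl itself parses the list is not shown here.

```C
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libhl: copies the index-th entry of a
 * colon-separated list into dst and returns 1. Returns 0 when there is no
 * such entry or when it does not fit into dst. Empty entries are kept. */
int path_list_entry(const char * list, size_t index, char * dst, size_t cap) {
    assert(list != NULL && dst != NULL);
    const char * start = list;
    for (;;) {
        const char * end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        if (index == 0) {
            if (len + 1 > cap) {
                return 0;       /* entry would be truncated */
            }
            memcpy(dst, start, len);
            dst[len] = '\0';
            return 1;
        }
        if (end == NULL) {
            return 0;           /* ran out of entries */
        }
        start = end + 1;        /* continue after the ':' */
        index--;
    }
}
```

Calling it with increasing indices until it returns 0 visits every directory in order.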
```C
void render_string(const char * const string, const char * const mode); //XXX: rename
```
This function matches _string_ against all known highlighting rules and dispatches the appropriate callback depending on _mode_.
```C
int token_fits(const token_t * const token, const char * const to, const int string_offset, const bool is_start_of_line, int * match_offset);
```
Fit a specific token against a string. `render_string()` uses this function internally.
```C
typedef void (*attribute_callback_t)(const char * const string, const int length, void * const attributes);
```
The type used for defining appropriate callbacks for render\_string().
+ string - string to be processed (probably printed)
+ length - number of characters to be processed from _string_
+ attributes - arbitrary data associated with the matched token, intended to hold color/font information, for example; if no token was matched, NULL is passed
```C
struct token_table_t;
```
Holds a group of tokens belonging to the same language.
```C
typedef struct {
char * key;
attribute_callback_t callback;
} display_t;
```
The type for defining display modes.
```C
void new_display_mode(display_t * mode);
```
This appends a display mode. render\_string() selects a display mode by comparing its _mode_ argument against _.key_.
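The key-based lookup can be pictured with the following sketch. `find_display_mode()` is a hypothetical function written for illustration and it assumes exact string comparison on _.key_; the library's internal storage and lookup may differ.

```C
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copied from the API above. */
typedef void (*attribute_callback_t)(const char * const string, const int length, void * const attributes);

typedef struct {
    char * key;
    attribute_callback_t callback;
} display_t;

/* Hypothetical lookup, not part of libhl: returns the first mode whose key
 * equals `mode`, or NULL when no such mode has been registered. */
const display_t * find_display_mode(const display_t * modes, size_t count, const char * mode) {
    assert(modes != NULL && mode != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(modes[i].key, mode) == 0) {
            return &modes[i];
        }
    }
    return NULL;
}
```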
```C
typedef enum {
KEYSYMBOL,
KEYWORD,
MATCH,
REGION
} token_type_t;
```
These are the valid token types.
+ KEYSYMBOL - a context-free string; the surrounding text is ignored.
  "mysymbol" will match inside both of these:
  "something mysymbol something" and
  "somethingmysymbolsomething".
  It is intended to match things such as programming language operators.
+ KEYWORD - a string which is recognized only when surrounded by word boundaries such as ' ' or '\t'
+ MATCH - a regular expression to be recognized
+ REGION - a regular expression where the starting and ending patterns are to be distinguished from the contents
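The difference between KEYSYMBOL and KEYWORD can be illustrated with plain string functions. This is only a sketch of the semantics described above, not libhl's implementation; both function names are hypothetical, and treating every character other than a letter, digit or '_' as a word boundary is an assumption.

```C
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* KEYSYMBOL-like test: the text at `pos` only has to start with `symbol`;
 * whatever surrounds it is ignored. */
bool symbol_matches_at(const char * text, size_t pos, const char * symbol) {
    assert(text != NULL && symbol != NULL && pos <= strlen(text));
    return strncmp(text + pos, symbol, strlen(symbol)) == 0;
}

/* Assumption: a word character is a letter, a digit or '_'. */
static bool is_word_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* KEYWORD-like test: the word must additionally be delimited by a non-word
 * character, or by the start or end of the text, on both sides. */
bool keyword_matches_at(const char * text, size_t pos, const char * word) {
    if (!symbol_matches_at(text, pos, word)) {
        return false;
    }
    bool left_ok = pos == 0 || !is_word_char(text[pos - 1]);
    char after = text[pos + strlen(word)];
    bool right_ok = after == '\0' || !is_word_char(after);
    return left_ok && right_ok;
}
```

With these definitions "mysymbol" is found as a symbol in both example strings, but it only counts as a keyword in the first one.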
The universal way to add a new pattern to be recognized is with:
```C
token_t * new_token(const char * const syntax, const token_type_t t, const hl_group_t * const g);
```
There are also convenience functions:
```C
// NOTE: the return value is the number of tokens successfully inserted
int new_keyword_tokens(const char * const * words, hl_group_t * const g); // _words_ must be NULL terminated
int new_syntax_char_tokens(const char * const chars, hl_group_t * const g);
token_t * new_symbol_token(const char * const c, hl_group_t * const g);
int new_symbol_tokens(const char * const * symbols, hl_group_t * const g);
int new_char_tokens(const char * str, hl_group_t * const g);
token_t * new_keyword_token(const char * const word, hl_group_t * const g);
token_t * new_region_token(const char * start, const char * end, hl_group_t * g);
```
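The NULL-terminated array convention used by the `*_tokens()` functions, together with their "number of tokens inserted" return value, can be sketched as follows. `insert_word()` and `insert_all_words()` are hypothetical stand-ins written for this example; the real functions also take a highlight group and create actual tokens.

```C
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for inserting a single token: it "succeeds" (1) for
 * non-empty words and "fails" (0) for empty ones. */
static int insert_word(const char * word) {
    return word[0] != '\0';
}

/* Walks a NULL-terminated array of words and returns how many of them were
 * inserted successfully. */
int insert_all_words(const char * const * words) {
    assert(words != NULL);
    int inserted = 0;
    for (size_t i = 0; words[i] != NULL; i++) {
        inserted += insert_word(words[i]);
    }
    return inserted;
}
```

Forgetting the terminating NULL makes such a loop read past the end of the array, so the word list should always be declared with it, for example `{"if", "else", NULL}`.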
The regex engine used for MATCH-es is Jeger by default, emulating Vim regex.
However, the regex engine can be overridden:
```C
// ?!
```
### Default
Defaults are defined for most things for convenience. They can be disabled by `#undef`-ing the following macro:
```C
#define HL_DEFAULTS
```
```C
hl_group_t * normal_hl;
hl_group_t * error_hl;
hl_group_t * warning_hl;
hl_group_t * search_hl;
hl_group_t * underlined_hl;
hl_group_t * bold_hl;
hl_group_t * italics_hl;
hl_group_t * comment_hl;
hl_group_t * block_hl;
hl_group_t * operator_hl;
hl_group_t * constant_hl;
hl_group_t * special_hl;
hl_group_t * identifier_hl;
hl_group_t * type_hl;
// ---
token_table_t std_token_table;
```
---
# hl
General-purpose highlighter (and demo program for libhl).
## Usage
hl will read from stdin and write to stdout.
```bash
hl < source/main.c
```
### CLI options
```bash
-h : display help message
-I <dir> : syntax file look up directory
-s <syntax> : specify syntax to load
```
### Environment variables
```bash
$HLPATH : colon-separated list of directories searched for syntax script files;
          overrides the value of the HLPATH macro
```
---
# Scripting
hl can parse a small subset of VimScript: the few instructions related to highlighting. It ignores everything else.
All Vim highlighting scripts should be valid hl scripts.
The recognized instructions are:
```vimscript
sy[ntax] keyword <hl_group> <word>+
sy[ntax] match <hl_group> <regex>
sy[ntax] region <hl_group> start=<string|match> end=<string|match>
hi[ghlight] link <from_group> <to_group>
hi[ghlight] def <group> <display_t>=<data>+
```
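The bracket notation follows Vim's convention for abbreviable commands: the part before the bracket is mandatory and any prefix of the bracketed part may follow, so `sy`, `syn` and `syntax` all name the same command. A minimal sketch of that rule is shown below; `matches_abbreviation()` is a hypothetical helper and hl's actual parser may be implemented differently.

```C
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when `word` consists of `required` followed by any prefix of
 * `optional`. With required = "sy" and optional = "ntax" this accepts
 * "sy", "syn", "synt", "synta" and "syntax". */
bool matches_abbreviation(const char * word, const char * required, const char * optional) {
    assert(word != NULL && required != NULL && optional != NULL);
    size_t required_len = strlen(required);
    if (strncmp(word, required, required_len) != 0) {
        return false;           /* mandatory part missing or different */
    }
    const char * rest = word + required_len;
    size_t rest_len = strlen(rest);
    return rest_len <= strlen(optional) && strncmp(rest, optional, rest_len) == 0;
}
```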
Additionally hl recognizes:
```vimscript
sy[ntax] keysymbol <char>+
```