libhl/README.md
2023-09-20 22:42:06 +02:00

libhl

API

int hl_init(void);
int hl_deinit(void);

These functions manage the library's lifetime. hl_init() must be called before any other library function. hl_deinit() frees all memory held by the library.

#define HLPATH ~/.local/hl/:~/.vim/syntax/

Colon-separated list of directories to be searched for syntax scripts. #undef it to disable the search entirely.
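As an illustration of the list format, here is a minimal sketch of how a colon-separated directory list could be walked. It is not libhl's actual implementation; the function name next_dir() and its behavior of skipping empty entries are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libhl.
 * Advances *cursor past the next entry of a colon-separated list and
 * stores the start of that entry in *dir. The entry is NOT NUL-terminated;
 * the returned length says how many characters belong to it.
 * Returns 0 once the list is exhausted. */
size_t next_dir(const char ** cursor, const char ** dir) {
	const char * p = *cursor;
	while (*p == ':') {	// skip separators and empty entries (assumption)
		++p;
	}
	const size_t length = strcspn(p, ":");
	*dir = p;
	*cursor = p + length;
	return length;
}
```

A caller would initialize a cursor to the list and call next_dir() in a loop until it returns 0, printing or opening each entry with its reported length.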

void render_string(const char * const string, const char * const mode);	//XXX: rename

This function matches string against all known highlighting rules and dispatches the appropriate callback depending on mode.

int token_fits(const token_t * const token, const char * const to, const int string_offset, const bool is_start_of_line, int * match_offset);

Fit a specific token against a string. render_string() uses this function internally.

typedef void (*attribute_callback_t)(const char * const string, const int length, void * const attributes);

The type used for defining appropriate callbacks for render_string().

  • string - string to be processed (probably printed)
  • length - number of characters to be processed from string
  • attributes - arbitrary data associated with the matched token, intended to hold color or font information, for example; NULL is passed if no token was matched

struct token_table_t;

Holds a group of tokens belonging to the same language.

typedef struct {
	char * key;
	attribute_callback_t callback;
} display_t;

The type for defining display modes.

void new_display_mode(display_t * mode);

Appends a display mode. render_string() selects the display mode to use by searching for a matching .key.
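To make the callback and display mode types more concrete, the following is a minimal sketch of a display mode that wraps matched text in ANSI escape sequences. The two typedefs are copied from the declarations above so that the sketch compiles on its own. The names ansi_attr_t, ansi_out, ansi_callback and ansi_mode, and the idea of storing an SGR code in the attributes, are assumptions made for this example and are not part of libhl.

```c
#include <stdio.h>
#include <string.h>

/* Copied from the README so the sketch is self-contained. */
typedef void (*attribute_callback_t)(const char * const string, const int length, void * const attributes);

typedef struct {
	char * key;
	attribute_callback_t callback;
} display_t;

/* Hypothetical attribute payload: an ANSI SGR code such as "31" for red. */
typedef struct {
	const char * sgr;
} ansi_attr_t;

/* Hypothetical sink. The output is appended here so it can be inspected;
 * a terminal front end would more likely write straight to stdout. */
char ansi_out[256];

void ansi_callback(const char * const string, const int length, void * const attributes) {
	const ansi_attr_t * const attr = attributes;
	const size_t used = strlen(ansi_out);
	char * const dst = ansi_out + used;
	const size_t room = sizeof(ansi_out) - used;
	if (attr != NULL) {
		// A token was matched: colorize exactly `length` characters.
		snprintf(dst, room, "\033[%sm%.*s\033[0m", attr->sgr, length, string);
	} else {
		// No token was matched: copy the text through unchanged.
		snprintf(dst, room, "%.*s", length, string);
	}
}

display_t ansi_mode = {
	.key = "ansi",
	.callback = ansi_callback,
};
```

Under these assumptions, &ansi_mode would be handed to new_display_mode(), and "ansi" would then be passed as the mode argument of render_string().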

typedef enum {
	KEYSYMBOL,
	KEYWORD,
	MATCH,
	REGION
} token_type_t;

These are the valid token types.

  • KEYSYMBOL - a contextless string; the surrounding text is ignored. "mysymbol" will match inside both "something mysymbol something" and "somethingmysymbolsomething". It is intended to match things such as programming language operators.
  • KEYWORD - a string that is recognized only when surrounded by word boundaries such as ' ' or '\t'
  • MATCH - a regular expression to be recognized
  • REGION - a regular expression whose starting and ending patterns are distinguished from the contents
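The difference between KEYSYMBOL and KEYWORD can be sketched with two small predicates. This is only an illustration of the semantics described above, not libhl's matching code. The function names are hypothetical, and treating every character that is not alphanumeric or '_' as a word boundary is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: word characters are alphanumerics and '_'. */
static bool is_word_char(const char c) {
	return isalnum((unsigned char)c) || c == '_';
}

/* KEYSYMBOL-like: contextless, only the characters at `offset` matter.
 * `offset` must not exceed strlen(text). */
bool symbol_matches_at(const char * text, size_t offset, const char * symbol) {
	return strncmp(text + offset, symbol, strlen(symbol)) == 0;
}

/* KEYWORD-like: the same comparison, but both neighbours must be
 * boundaries (start of string, end of string, or a non-word character). */
bool keyword_matches_at(const char * text, size_t offset, const char * word) {
	const size_t n = strlen(word);
	if (strncmp(text + offset, word, n) != 0) {
		return false;
	}
	const bool left_ok = offset == 0 || !is_word_char(text[offset - 1]);
	const bool right_ok = !is_word_char(text[offset + n]);	// '\0' counts as a boundary
	return left_ok && right_ok;
}
```

With these definitions, "mysymbol" matches at offset 9 of "somethingmysymbolsomething" as a symbol but not as a keyword, while it matches both ways at offset 10 of "something mysymbol something".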

The universal way to add a new pattern to be recognized is with:

token_t * new_token(const char * const syntax, const token_type_t t, const hl_group_t * const g);

There are also convenience functions:

// NOTE: the return value is the number of tokens successfully inserted
int new_keyword_tokens(const char * const * words, hl_group_t * const g);	// _words_ must be NULL terminated
int new_syntax_char_tokens(const char * const chars, hl_group_t * const g);
token_t * new_symbol_token(const char * const c, hl_group_t * const g);
int new_symbol_tokens(const char * const * symbols, hl_group_t * const g);
int new_char_tokens(const char * str, hl_group_t * const g);
token_t * new_keyword_token(const char * const word, hl_group_t * const g);
token_t * new_region_token(const char * start, const char * end, hl_group_t * g);
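The NULL terminator required by new_keyword_tokens() is what tells the function where the word list ends. The sketch below shows what such a list looks like and how it is walked. The array c_keywords and the helper count_words() are hypothetical and only illustrate the convention; they are not part of libhl.

```c
#include <stddef.h>

/* Hypothetical word list in the shape new_keyword_tokens() expects:
 * an array of strings whose last element is NULL. */
const char * const c_keywords[] = {
	"if",
	"else",
	"while",
	"return",
	NULL	// terminator; without it a consumer would read past the array
};

/* Hypothetical helper: counts the entries before the terminator. */
size_t count_words(const char * const * words) {
	size_t n = 0;
	while (words[n] != NULL) {
		++n;
	}
	return n;
}
```

A list like c_keywords would be passed as the words argument together with a highlight group, and the return value could then be compared against the number of entries to detect failed insertions.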

The regex engine used for MATCH-es is Jeger by default, emulating Vim regex. However, the regex engine can be overridden:

	// ?!

Default

Defaults are defined for most things for convenience. They can be disabled by #undef-ing the following macro:

#define HL_DEFAULTS
hl_group_t * normal_hl
hl_group_t * error_hl
hl_group_t * warning_hl
hl_group_t * search_hl

hl_group_t * underlined_hl
hl_group_t * bold_hl
hl_group_t * italics_hl

hl_group_t * comment_hl
hl_group_t * block_hl
hl_group_t * operator_hl
hl_group_t * constant_hl
hl_group_t * special_hl
hl_group_t * identifier_hl
hl_group_t * type_hl

// ---
token_table_t std_token_table;

hl

General purpose highlighter (and demo program for libhl).

Usage

hl will read from stdin and write to stdout.

hl < source/main.c

CLI Options

-h          : display help message
-I <dir>    : syntax file lookup directory
-s <syntax> : specify syntax to load

Environment variables

$HLPATH	: colon-separated list of directories searched for syntax script files;
           overrides the value of the HLPATH macro
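The precedence between the environment variable and the compiled-in macro can be sketched as below. This is an assumption about how such an override is typically resolved, not hl's actual code. The function resolve_hlpath() is hypothetical, and the compiled-in default is passed in as a plain string to keep the example independent of how the macro is expanded.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of hl.
 * Returns the search path to use: $HLPATH if it is set in the
 * environment, otherwise the compiled-in default handed in by the caller. */
const char * resolve_hlpath(const char * compiled_default) {
	const char * from_env = getenv("HLPATH");
	return from_env != NULL ? from_env : compiled_default;
}
```

For example, running with HLPATH=/tmp/syntax in the environment would make resolve_hlpath("~/.local/hl/:~/.vim/syntax/") return "/tmp/syntax".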

Scripting

hl can parse a small subset of VimScript: it handles the few instructions related to highlighting and ignores everything else. All Vim highlighting scripts should be valid hl scripts. The instructions in particular are:

sy[ntax] keyword <hl_group> <word>+
sy[ntax] match   <hl_group> <regex>
sy[ntax] region  <hl_group> start=<string|match> end=<string|match>
hi[ghlight] link <from_group> <to_group>
hi[ghlight] def  <group> <display_t>=<data>+

Additionally hl recognizes:

sy[ntax] keysymbol <char>+
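The bracket notation in the command names follows the Vim convention: the part outside the brackets is mandatory and the part inside may be cut short. A minimal sketch of that rule is shown below. The function matches_abbrev() is hypothetical, and whether hl accepts every intermediate spelling the way Vim does is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of hl.
 * Returns true if `word` is a prefix of `full` that is at least
 * `min_len` characters long, e.g. "sy", "syn" or "syntax" for
 * sy[ntax] (full = "syntax", min_len = 2). */
bool matches_abbrev(const char * word, const char * full, size_t min_len) {
	const size_t n = strlen(word);
	return n >= min_len
	    && n <= strlen(full)
	    && strncmp(word, full, n) == 0;
}
```

Under this rule "sy", "syn" and "syntax" would all select the syntax command, while "s" and "syntaxx" would not.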