# API documentation

**Table of Contents**  *generated with [DocToc](http://doctoc.herokuapp.com/)*

- [Synopsis](#synopsis)
- [Description](#description)
	- [Parser functions](#parser-functions)
	- [Emitting functions](#emitting-functions)
	- [Conversion functions](#conversion-functions)
	- [Generation functions](#generation-functions)
	- [Iteration functions](#iteration-functions)
	- [Validation functions](#validation-functions)
	- [Utility functions](#utility-functions)
- [Parser functions](#parser-functions-1)
	- [ucl_parser_new](#ucl_parser_new)
	- [ucl_parser_register_macro](#ucl_parser_register_macro)
	- [ucl_parser_register_variable](#ucl_parser_register_variable)
	- [ucl_parser_add_chunk](#ucl_parser_add_chunk)
	- [ucl_parser_add_string](#ucl_parser_add_string)
	- [ucl_parser_add_file](#ucl_parser_add_file)
	- [ucl_parser_get_object](#ucl_parser_get_object)
	- [ucl_parser_get_error](#ucl_parser_get_error)
	- [ucl_parser_free](#ucl_parser_free)
	- [ucl_pubkey_add](#ucl_pubkey_add)
	- [ucl_parser_set_filevars](#ucl_parser_set_filevars)
	- [Parser usage example](#parser-usage-example)
- [Emitting functions](#emitting-functions-1)
	- [ucl_object_emit](#ucl_object_emit)
	- [ucl_object_emit_full](#ucl_object_emit_full)
- [Conversion functions](#conversion-functions-1)
- [Generation functions](#generation-functions-1)
	- [ucl_object_new](#ucl_object_new)
	- [ucl_object_typed_new](#ucl_object_typed_new)
	- [Primitive objects generation](#primitive-objects-generation)
	- [ucl_object_fromstring_common](#ucl_object_fromstring_common)
- [Iteration functions](#iteration-functions-1)
	- [ucl_iterate_object](#ucl_iterate_object)
- [Validation functions](#validation-functions-1)
	- [ucl_object_validate](#ucl_object_validate)

# Synopsis

`#include <ucl.h>`

# Description

Libucl is a parser and `C` API for parsing and generating `ucl` objects. Libucl consists of several groups of functions:

### Parser functions
Used to parse `ucl` files and to provide an interface for extracting the resulting `ucl` object. Currently, `libucl` can parse only complete `ucl` documents: it cannot parse part of a document, so it cannot be used as a streaming parser. This limitation may be removed in the future.

### Emitting functions
Convert `ucl` objects to some textual or binary representation. Currently, libucl supports the following exports:

- `JSON` - valid json format (can possibly lose some original data, such as implicit arrays)
- `Config` - human-readable configuration format (lossless)
- `YAML` - embedded yaml format (has the same limitations as `json` output)

### Conversion functions
Help to convert `ucl` objects to C types. These functions are used to convert `ucl_object_t` to C primitive types, such as numbers, strings or boolean values.

### Generation functions
Allow creating `ucl` objects from C types and building complex `ucl` objects, such as hashes or arrays, from primitive `ucl` objects, such as numbers or strings.

### Iteration functions
Iterate over complex `ucl` objects or over a chain of values, for example when a key in an object has multiple values (which can be treated as an implicit array or implicit consolidation).

### Validation functions
Validation functions are used to validate some object `obj` using json-schema compatible object `schema`. Both input and schema must be UCL objects to perform validation.

### Utility functions
Provide basic utilities to manage `ucl` objects: creating and removing objects, retaining and releasing references, and so on.

# Parser functions

Parser functions operate on `struct ucl_parser`.

### ucl_parser_new

~~~C
struct ucl_parser* ucl_parser_new (int flags);
~~~

Creates a new parser with the specified flags:

- `UCL_PARSER_KEY_LOWERCASE` - convert parsed keys to lowercase
- `UCL_PARSER_ZEROCOPY` - try to use zero-copy mode when reading files (in zero-copy mode the text chunk is parsed without copying strings, so it must remain valid for as long as any parsed object is in use)
- `UCL_PARSER_NO_TIME` - treat time values as strings without parsing them as floats

### ucl_parser_register_macro

~~~C
void ucl_parser_register_macro (struct ucl_parser *parser,
    const char *macro, ucl_macro_handler handler, void* ud);
~~~

Registers a new macro named .`macro`, parsed by the handler `handler`, which accepts the opaque data pointer `ud`. The macro handler should be of the following type:

~~~C
bool (*ucl_macro_handler) (const unsigned char *data,
    size_t len, void* ud);
~~~

The handler function accepts the macro text `data` of length `len` and the opaque pointer `ud`. If the macro is parsed successfully, the handler should return `true`. A return value of `false` indicates a parsing failure, and the parser may then be terminated.
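As a rough sketch of a handler with the signature documented above, the following function counts macro invocations and rejects empty macro text. The names `macro_stats` and `count_macro` are hypothetical and not part of libucl; only the handler signature is taken from this document.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user data passed through the opaque `ud` pointer */
struct macro_stats {
	size_t calls;      /* number of successful invocations */
	size_t total_len;  /* total bytes of macro text seen */
};

/* Matches the ucl_macro_handler signature documented above */
bool
count_macro (const unsigned char *data, size_t len, void *ud)
{
	struct macro_stats *stats = ud;

	if (data == NULL || len == 0 || stats == NULL) {
		/* Returning false reports a parsing failure to the parser */
		return false;
	}
	stats->calls++;
	stats->total_len += len;
	return true;
}

/*
 * With <ucl.h> available, registration would look roughly like:
 *   struct macro_stats stats = {0, 0};
 *   ucl_parser_register_macro (parser, "count", count_macro, &stats);
 */
```

Keeping the state in a caller-owned structure, rather than in globals, lets several parsers use the same handler independently.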

### ucl_parser_register_variable

~~~C
void ucl_parser_register_variable (struct ucl_parser *parser,
    const char *var, const char *value);
~~~

Registers a new variable $`var` that the parser replaces with the string `value`.

### ucl_parser_add_chunk

~~~C
bool ucl_parser_add_chunk (struct ucl_parser *parser, 
    const unsigned char *data, size_t len);
~~~

Adds a new text chunk with `data` of length `len` to the parser. At the moment, the `libucl` parser is not a streaming parser, and a chunk *must* contain a complete, *valid* ucl object. For example, this object is valid:

~~~json
{ "var": "value" }
~~~

while this one won't be parsed correctly:

~~~json
{ "var": 
~~~

This limitation may be removed in the future.

### ucl_parser_add_string
~~~C
bool ucl_parser_add_string (struct ucl_parser *parser, 
    const char *data, size_t len);
~~~

This function acts exactly like `ucl_parser_add_chunk`, except that if the `len` argument is zero, the string `data` must be null-terminated and its actual length is calculated up to the `\0` character.

### ucl_parser_add_file

~~~C
bool ucl_parser_add_file (struct ucl_parser *parser, 
    const char *filename);
~~~

Loads the file `filename` and parses it with the specified `parser`. This function uses the `mmap` call to load the file, so the file must not be shrunk (truncated) during parsing. Otherwise, `libucl` can cause memory corruption and terminate the calling application. This function is also used by the internal handler of the `include` macro, so that macro has the same limitation.

### ucl_parser_get_object

~~~C
ucl_object_t* ucl_parser_get_object (struct ucl_parser *parser);
~~~

If the `ucl` data has been parsed correctly, this function returns the top object for the parser. Otherwise, it returns `NULL`. The reference count of the returned `ucl` object is increased by one, so the caller should release it with `ucl_object_unref` after use.

### ucl_parser_get_error

~~~C
const char *ucl_parser_get_error(struct ucl_parser *parser);
~~~

Returns the constant error string for the parser object. If no error occurred during parsing, `NULL` is returned. A caller should not try to free or modify this string.

### ucl_parser_free

~~~C
void ucl_parser_free (struct ucl_parser *parser);
~~~

Frees memory occupied by the parser object. The reference count of the top object is decreased as well; however, if `ucl_parser_get_object` was called previously, the top object is not freed.

### ucl_pubkey_add

~~~C
bool ucl_pubkey_add (struct ucl_parser *parser, 
    const unsigned char *key, size_t len);
~~~

This function adds a public key from the text blob `key` of length `len` to the `parser` object. The public key should be in `PEM` format and can be used by the `.includes` macro to check the signatures of included files. `Openssl` support must be enabled for this function to work. If the key cannot be added (e.g. due to a format error) or `openssl` was not linked to `libucl`, this function returns `false`.

### ucl_parser_set_filevars

~~~C
bool ucl_parser_set_filevars (struct ucl_parser *parser, 
    const char *filename, bool need_expand);
~~~

Add the standard file variables to the `parser` based on the `filename` specified:

- `$FILENAME` - the filename of the `ucl` input
- `$CURDIR` - the directory of the input

For example, if a `filename` param is `../something.conf` then the variables will have the following values:

- `$FILENAME` - "../something.conf"
- `$CURDIR` - ".."

If the `need_expand` parameter is `true`, all relative paths are expanded using the `realpath` call. In this example, if the current directory is `/etc/dir`, so that `..` resolves to `/etc`, the variables have these values:

- `$FILENAME` - "/etc/something.conf"
- `$CURDIR` - "/etc"

## Parser usage example

The following example loads, parses and extracts `ucl` object from stdin using `libucl` parser functions (the length of input is limited to 8K):

~~~C
char inbuf[8192];
struct ucl_parser *parser = NULL;
int ret = 0;
size_t r = 0;
ucl_object_t *obj = NULL;
FILE *in;

in = stdin;
parser = ucl_parser_new (0);
/* Read until EOF, a read error, or a full buffer */
while (!feof (in) && !ferror (in) && r < sizeof (inbuf)) {
	r += fread (inbuf + r, 1, sizeof (inbuf) - r, in);
}
ucl_parser_add_chunk (parser, (const unsigned char *)inbuf, r);
fclose (in);

if (ucl_parser_get_error (parser)) {
	printf ("Error occurred: %s\n", ucl_parser_get_error (parser));
	ret = 1;
}
else {
	/* The returned object holds its own reference */
	obj = ucl_parser_get_object (parser);
}

if (parser != NULL) {
	ucl_parser_free (parser);
}
if (obj != NULL) {
	ucl_object_unref (obj);
}
return ret;
~~~

# Emitting functions

Libucl can transform UCL objects to a number of textual formats:

- configuration: `UCL_EMIT_CONFIG` - nginx-like human-readable configuration file where implicit arrays are transformed to duplicate keys
- compact json: `UCL_EMIT_JSON_COMPACT` - single line valid json without spaces
- formatted json: `UCL_EMIT_JSON` - pretty formatted JSON with newlines and spaces
- compact yaml: `UCL_EMIT_YAML` - compact YAML output

Moreover, the libucl API allows selecting a custom set of emitting functions for efficient, zero-copy output of libucl objects. Libucl uses the following structure to support this feature:

~~~C
struct ucl_emitter_functions {
	/** Append a single character */
	int (*ucl_emitter_append_character) (unsigned char c, size_t nchars, void *ud);
	/** Append a string of a specified length */
	int (*ucl_emitter_append_len) (unsigned const char *str, size_t len, void *ud);
	/** Append a 64 bit integer */
	int (*ucl_emitter_append_int) (int64_t elt, void *ud);
	/** Append floating point element */
	int (*ucl_emitter_append_double) (double elt, void *ud);
	/** Opaque userdata pointer */
	void *ud;
};
~~~

This structure defines the following callbacks:

- `ucl_emitter_append_character` - called to append `nchars` characters equal to `c`
- `ucl_emitter_append_len` - used to append a string of length `len` starting from pointer `str`
- `ucl_emitter_append_int` - used to append a 64-bit integer number
- `ucl_emitter_append_double` - used to append a floating point number

Together, these functions can be used to write text representations of `UCL` objects to different structures or streams.
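As an illustration, the callbacks below append everything to a growable in-memory buffer. This is only a sketch: `struct out_buf` and the `buf_append_*` names are hypothetical, and returning 0 on success and -1 on allocation failure is an assumption, since this document does not define the meaning of the callbacks' return value.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable text buffer, passed as the opaque `ud` pointer */
struct out_buf {
	char *data;
	size_t len;
	size_t cap;
};

/* Ensure room for `extra` more bytes plus a terminator; 0 on success */
int
out_buf_reserve (struct out_buf *b, size_t extra)
{
	size_t need = b->len + extra + 1;
	size_t ncap = b->cap ? b->cap : 64;
	char *p;

	if (need <= b->cap) {
		return 0;
	}
	while (ncap < need) {
		ncap *= 2;
	}
	p = realloc (b->data, ncap);
	if (p == NULL) {
		return -1;
	}
	b->data = p;
	b->cap = ncap;
	return 0;
}

/* Append `nchars` copies of the character `c` */
int
buf_append_character (unsigned char c, size_t nchars, void *ud)
{
	struct out_buf *b = ud;

	if (out_buf_reserve (b, nchars) != 0) {
		return -1;
	}
	memset (b->data + b->len, c, nchars);
	b->len += nchars;
	b->data[b->len] = '\0';
	return 0;
}

/* Append `len` bytes starting at `str` */
int
buf_append_len (const unsigned char *str, size_t len, void *ud)
{
	struct out_buf *b = ud;

	if (out_buf_reserve (b, len) != 0) {
		return -1;
	}
	if (len > 0) {
		memcpy (b->data + b->len, str, len);
	}
	b->len += len;
	b->data[b->len] = '\0';
	return 0;
}

/* Append the decimal form of a 64-bit integer */
int
buf_append_int (int64_t elt, void *ud)
{
	char tmp[32];
	int n = snprintf (tmp, sizeof (tmp), "%lld", (long long)elt);

	return buf_append_len ((const unsigned char *)tmp, (size_t)n, ud);
}

/* Append a floating point number (the %.6g format is an arbitrary choice) */
int
buf_append_double (double elt, void *ud)
{
	char tmp[64];
	int n = snprintf (tmp, sizeof (tmp), "%.6g", elt);

	return buf_append_len ((const unsigned char *)tmp, (size_t)n, ud);
}
```

With `<ucl.h>` available, these functions would be assigned to the four callback members of `struct ucl_emitter_functions`, with `ud` pointing at a zero-initialized `struct out_buf`; the caller frees `data` when done.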

Libucl provides the following functions for emitting UCL objects:

### ucl_object_emit

~~~C
unsigned char *ucl_object_emit (const ucl_object_t *obj, enum ucl_emitter emit_type);
~~~

Allocates a string large enough to hold the textual representation of the UCL object `obj` and fills it according to the style `emit_type`. The caller should free the returned string after use.

### ucl_object_emit_full

~~~C
bool ucl_object_emit_full (const ucl_object_t *obj, enum ucl_emitter emit_type,
		struct ucl_emitter_functions *emitter);
~~~

This function is similar to `ucl_object_emit`, except that it accepts an additional argument `emitter` that defines the concrete set of output functions. It is useful for emitting to custom structures or streams (including C++ ones, for example).

# Conversion functions

Conversion functions are used to convert UCL objects to primitive types, such as strings, numbers or boolean values. There are two types of conversion functions:

- safe: try to convert a ucl object to a primitive type and fail if such a conversion is not possible
- unsafe: return the primitive type without additional checks; if the object cannot be converted, a reasonable default is returned (NULL for strings and 0 for numbers)

There is also a single `ucl_object_tostring_forced` function that converts any UCL object (including the compound types, arrays and objects) to a string representation. For compound and numeric types, this function actually emits the object in compact json format.

Here is a list of all conversion functions:

- `ucl_object_toint` - returns `int64_t` of UCL object
- `ucl_object_todouble` - returns `double` of UCL object
- `ucl_object_toboolean` - returns `bool` of UCL object
- `ucl_object_tostring` - returns `const char *` of UCL object (this string is null-terminated)
- `ucl_object_tolstring` - returns `const char *` and `size_t` len of UCL object (the string may not be null-terminated)
- `ucl_object_tostring_forced` - returns string representation of any UCL object

Strings returned by these functions are associated with the UCL object and remain valid for its lifetime. A caller should not free this memory.

# Generation functions

It is possible to generate UCL objects from C primitive types. Moreover, libucl allows creating and modifying complex UCL objects, such as arrays or associative objects.

## ucl_object_new
~~~C
ucl_object_t * ucl_object_new (void);
~~~

Creates a new object of type `UCL_NULL`. The caller should release this object.

## ucl_object_typed_new
~~~C
ucl_object_t * ucl_object_typed_new (unsigned int type);
~~~

Creates an object of the specified type:

- `UCL_OBJECT` - UCL object - key/value pairs
- `UCL_ARRAY` - UCL array
- `UCL_INT` - integer number
- `UCL_FLOAT` - floating point number
- `UCL_STRING` - null-terminated string
- `UCL_BOOLEAN` - boolean value
- `UCL_TIME` - time value (floating point number of seconds)
- `UCL_USERDATA` - opaque userdata pointer (may be used in macros)
- `UCL_NULL` - null value

The caller should release this object.

## Primitive objects generation
Libucl provides functions that are the inverse of the conversion functions; each is called with the specific C type:

- `ucl_object_fromint` - converts `int64_t` to UCL object
- `ucl_object_fromdouble` - converts `double` to UCL object
- `ucl_object_fromboolean` - converts `bool` to UCL object
- `ucl_object_fromstring` - converts `const char *` to UCL object (this string is null-terminated)
- `ucl_object_fromlstring` - converts `const char *` and `size_t` len to UCL object (the string may not be null-terminated)

There is also a function, `ucl_object_fromstring_common`, that generates a UCL object from a string while performing various parsing or conversion operations.

## ucl_object_fromstring_common
~~~C
ucl_object_t * ucl_object_fromstring_common (const char *str, 
	size_t len, enum ucl_string_flags flags);
~~~

This function converts a string `str` of size `len` to a UCL object, applying the conversions selected by `flags`. If `len` is equal to zero, `str` is assumed to be null-terminated. This function supports the following flags (a set of flags can be combined using the bitwise `OR` operation):

- `UCL_STRING_ESCAPE` - perform JSON escape
- `UCL_STRING_TRIM` - trim leading and trailing whitespace
- `UCL_STRING_PARSE_BOOLEAN` - parse passed string and detect boolean
- `UCL_STRING_PARSE_INT` - parse passed string and detect integer number
- `UCL_STRING_PARSE_DOUBLE` - parse passed string and detect integer or float number
- `UCL_STRING_PARSE_TIME` - parse time values as floating point numbers
- `UCL_STRING_PARSE_NUMBER` - parse passed string and detect number (float, integer and time types)
- `UCL_STRING_PARSE` - parse passed string (and detect booleans, numbers and time values)
- `UCL_STRING_PARSE_BYTES` - assume that numeric multipliers are in bytes notation, for example `10k` means `10*1024` and not `10*1000` as assumed without this flag

If parsing fails, the resulting UCL object is a `UCL_STRING`. A caller should always check the type of the returned object and release it after use.

# Iteration functions

Iteration functions are used to iterate over the UCL compound types: arrays and objects. Moreover, iteration can be performed over keys with multiple values (implicit arrays). The function `ucl_iterate_object` iterates over an object, an array or a key with multiple values.

## ucl_iterate_object
~~~C
const ucl_object_t* ucl_iterate_object (const ucl_object_t *obj, 
	ucl_object_iter_t *iter, bool expand_values);
~~~

This function accepts an opaque iterator pointer `iter`. Before the first call this iterator *must* be initialized to `NULL`; each call then updates it. `ucl_iterate_object` returns the next UCL object in the compound object `obj`, or `NULL` once all objects have been iterated. The reference count of the returned object is not increased, so a caller should not unref the object or modify its content (e.g. by inserting it into another compound object). The object `obj` should not be changed during iteration either.

The `expand_values` flag specifies whether `ucl_iterate_object` should expand keys with multiple values. As a general rule, to iterate through an *object* or an *explicit array*, always set this flag to `true`. To extract all values of a single key obtained from an object, set `expand_values` to `false`. Mixing iteration types is not permitted, since the iterator is set up according to the iteration type and cannot be reused.

Here is an example of iteration over objects using the libucl API (assuming that `top` is a `UCL_OBJECT`):

~~~C
ucl_object_iter_t it = NULL, it_obj = NULL;
const ucl_object_t *obj, *cur;

/* Iterate over the object */
while ((obj = ucl_iterate_object (top, &it, true))) {
	printf ("key: \"%s\"\n", ucl_object_key (obj));
	/* Each new iteration must start from a NULL iterator */
	it_obj = NULL;
	/* Iterate over the values of a key */
	while ((cur = ucl_iterate_object (obj, &it_obj, false))) {
		printf ("value: \"%s\"\n", 
			ucl_object_tostring_forced (cur));
	}
}
~~~

# Validation functions

Currently, there is only one validation function, `ucl_object_validate`. It validates an object against the specified schema. This function is defined as follows:

## ucl_object_validate
~~~C
bool ucl_object_validate (const ucl_object_t *schema,
	const ucl_object_t *obj, struct ucl_schema_error *err);
~~~

This function uses the ucl object `schema`, which must be valid in terms of `json-schema` draft v4, to validate the input object `obj`. If it returns `true`, validation has succeeded. Otherwise, `false` is returned and `err` is filled with details of the failure. If the caller passes `NULL` as `err`, the function reports no error details and just returns `false`. The error structure is defined as follows:

~~~C
struct ucl_schema_error {
	enum ucl_schema_error_code code;	/* error code */
	char msg[128];				/* error message */
	ucl_object_t *obj;			/* object where error occurred */
};
~~~

Caller may use `code` field to get a numeric error code:

~~~C
enum ucl_schema_error_code {
	UCL_SCHEMA_OK = 0,          /* no error */
	UCL_SCHEMA_TYPE_MISMATCH,   /* type of object is incorrect */
	UCL_SCHEMA_INVALID_SCHEMA,  /* schema is invalid */
	UCL_SCHEMA_MISSING_PROPERTY,/* missing properties */
	UCL_SCHEMA_CONSTRAINT,      /* constraint found */
	UCL_SCHEMA_MISSING_DEPENDENCY, /* missing dependency */
	UCL_SCHEMA_UNKNOWN          /* generic error */
};
~~~

`msg` is a string description of the error and `obj` is the object where the error occurred. The error object is not allocated by libucl, so there is no need to free it after validation (a static object should thus be used).