# Builder Interface Reference
<!-- vim-markdown-toc GFM -->
* [Introduction](#introduction)
* [Size Prefixed Buffers](#size-prefixed-buffers)
* [Namespaces](#namespaces)
* [Error Codes](#error-codes)
* [Endianess](#endianess)
* [Deprecated](#deprecated)
* [Buffers](#buffers)
* [Tables](#tables)
* [Adding Fields](#adding-fields)
* [Nested Tables](#nested-tables)
* [Packing tables](#packing-tables)
* [Strings](#strings)
* [Structs](#structs)
* [Fixed Length Arrays in Structs](#fixed-length-arrays-in-structs)
* [Nested Buffers](#nested-buffers)
* [Scalars and Enums](#scalars-and-enums)
* [Vectors](#vectors)
* [Unions](#unions)
* [Union Vectors](#union-vectors)
* [Unions of Strings and Structs](#unions-of-strings-and-structs)
* [Error Handling](#error-handling)
* [Type System Overview](#type-system-overview)
* [Cloning](#cloning)
* [Picking](#picking)
* [Sorting Vectors](#sorting-vectors)
* [Dangers of Sorting](#dangers-of-sorting)
* [Scanning](#scanning)
* [Example of different interface type users](#example-of-different-interface-type-users)
* [Special Emitters](#special-emitters)
<!-- vim-markdown-toc -->
## Introduction
We assume a separate read-only file and add extensions to this with
support from a builder library and a builder object.
The underlying builder library supports two modes of operation that mix
together: `create` which sends data directly to the target buffer
(emitter object) and a stack driven `start/end` approach which allocates
objects and vectors on the stack. The code generator chooses the most
efficient approach given the circumstances.
Unlike most FlatBuffer language interfaces, tables and vectors are not
created back to front: They are either created completely in one
operation, or they are constructed on a stack front to back until they
can be emitted. The final buffer is still constructed back to front.
For big-endian platforms this may require temporary stack allocation of
complete vectors where little endian platforms can emit directly.
Tables and vectors stored in other tables or vectors must be completed
before they can be stored, but unlike most language interfaces they can
be constructed while a parent is also being constructed as long as
nesting remains balanced. While this occasionally may require more
stack, it may also avoid external temporary allocation.
A builder object is required to start buffer construction. The builder
must be initialized first and can be reset and reused between buffers,
reusing stack allocation. The builder can have a customized emitter
object but here we use the default. Finalizing the buffer depends on
the emitter and we can use a default finalizer only because we use the
default emitter - it allocates and populates a linear buffer from a
paged emitter ring buffer.
Note that in most cases `flatcc_builder_finalize_buffer` is sufficient,
but to be strictly portable, use
`flatcc_builder_finalize_aligned_buffer` and `aligned_free`.
`aligned_free` is often implemented as `free` in `flatcc/portable` but
not on all platforms. As of flatcc version 0.5.0
`flatcc_builder_aligned_free` is provided to add robustness in case the
application's `aligned_free` implementation might differ from the library
version due to changes in compile time flags.
Generally we use the monster example with various extensions, but to
show a simple complete example we use a very simple schema (`myschema.fbs`):
    table mytable { myfield1: int; myfield2: int; }

    #include <assert.h>
    #include "myschema_builder.h"

    void testfun() {
        void *buffer;
        size_t size;
        flatcc_builder_t builder, *B;
        mytable_table_t mt;

        B = &builder;
        flatcc_builder_init(B);

        /* Construct a buffer specific to schema. */
        mytable_create_as_root(B, 1, 2);

        /* Retrieve buffer - see also `flatcc_builder_get_direct_buffer`. */
        /* buffer = flatcc_builder_finalize_buffer(B, &size); */
        buffer = flatcc_builder_finalize_aligned_buffer(B, &size);

        /* This is read-only buffer access. */
        mt = mytable_as_root(buffer);
        assert(mytable_myfield1(mt) == 1);
        assert(mytable_myfield2(mt) == 2);

        /* free(buffer); */
        flatcc_builder_aligned_free(buffer);

        /*
         * Reset, but keep allocated stack etc.,
         * or optionally reduce memory using `flatcc_builder_custom_reset`.
         */
        flatcc_builder_reset(B);

        /* ... construct another buffer */

        /* Reclaim all memory. */
        flatcc_builder_clear(B);
    }
Note that a compiled schema generates a `myschema_reader.h` file and
optionally a `myschema_builder.h` and some common support files. When
building a buffer the `myschema_builder.h` must be used but when only
reading then the `myschema_reader.h` file should be used instead. Here
we are only concerned with building. When building, it is necessary to
link with the `libflatccrt.a` runtime library but when reading, all
necessary code is contained in the generated header files.
The builder object only manages a stack of currently active objects and
does not store an object that is complete. Instead it calls an emitter
object with the partial data ready for emission, similar to a write
function. A default emitter is provided which implements a ring buffer
and the result may be written to a file, copied to a buffer or
finalized to an allocated buffer. The builder supports these methods
directly for the default emitter, and only the default emitter because
emitters are otherwise defined by only one simple emit function - see
`emit_test.c` for a simple example of a custom emitter.
A custom allocator may be useful when working with small buffers in a
constrained environment - the allocator handles temporary stacks,
virtual table caches etc. but not the emitter.
The allocator and emitter interface is documented in the builder library
header [flatcc_builder.h] and the default implementation in
[flatcc_emitter.h]. The default allocator is implemented as part of the
flatcc_builder source.
The builder can be reused between buffers using the `reset` operation.
The default emitter can also be reused and will automatically reset
when the buffer is. For custom emitters, any reset operation must be
called manually. The same applies to clear. The reset operations
maintain allocated memory but also reduce memory consumption across
multiple resets heuristically.
## Size Prefixed Buffers
Buffers can be created with a size prefix of type `uoffset_t`. When
doing this, the buffer is aligned relative to the size prefix such that
buffers can be stacked in a file and for example be accessed via memory
mapping.
The usual `create_as_root` and `start_as_root` have variants called
`create_as_root_with_size` and `start_as_root_with_size`.
To read a buffer with a size prefix use:
    size_t size;
    buffer = flatbuffers_read_size_prefix(rawbuffer, &size);
The size is the size of the buffer excluding the size prefix. When
verifying buffers the buffer and size arguments should be used. See also
[monster_test.c] for an example.
Note that the size prefix ensures internal alignment but does not
guarantee that the next buffer in a file can be appended directly
because the next buffer's alignment is unknown and because it potentially
wastes padding bytes. The buffer size at offset 0 can be increased to the
needed alignment as long as endianness is handled and the size of the
size field is subtracted, and zeroes are appended as necessary.
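As a rough illustration of the last point, the sketch below decodes a size prefix by hand and computes how many bytes a size prefixed buffer would occupy after padding it up to a given alignment. The helper names `read_le32` and `padded_size` are hypothetical and not part of flatcc, and the sketch assumes that `uoffset_t` is 32 bits wide. In real code `flatbuffers_read_size_prefix` should be preferred for reading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit little endian value from raw bytes,
 * independent of the host byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: total bytes occupied by a size prefixed buffer
 * (prefix included) after rounding up to `align`, a power of two. */
static size_t padded_size(const unsigned char *rawbuffer, size_t align)
{
    size_t total = (size_t)read_le32(rawbuffer) + sizeof(uint32_t);

    assert(align != 0 && (align & (align - 1)) == 0);
    return (total + align - 1) & ~(align - 1);
}
```

Under these assumptions the updated prefix value would be `padded_size(rawbuffer, align) - sizeof(uint32_t)`, stored in little endian, with zero bytes appended up to the padded size.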
## Namespaces
The generated code is typically wrapped in a custom namespace and
functions and definitions that are library specific are usually mapped
into the namespace. We often use an empty namespace for custom types and
`flatbuffers_` for library names, but usually a `foo_` prefix could also
be used in both cases, where `foo` is a custom namespace.
Note that the name `flatcc_emitter` is only used with the default emitter
and the name [flatcc_builder] is only used for buffer management but not
for constructing content. Once a valid buffer is ready the common
namespace (`flatbuffers`) and the schema specific (or empty) namespace are used
with schema specific operations.
All schema specific content is prefixed with a namespace to avoid
conflicts - although the namespace is empty if the schema doesn't
specify any. Note that the same schema can have multiple
namespaces. An example of a namespace prefixed operation:
    MyGame_Example_Monster_create_as_root(B, ... lots of args);
To simplify this we can use a macro to prefix a namespace. The use
of the name `ns` is arbitrary and we can choose different names for
different namespaces.
    #undef ns
    #define ns(x) MyGame_Example_ ## x
But the above doesn't work with nested calls to `ns` such as
    ns(Monster_color_add(B, ns(Color_Green)));
it would have to be:
    ns(Monster_color_add)(B, ns(Color_Green));
Therefore we have a helper macro that does allow nesting:
    #undef ns
    #define ns(x) FLATBUFFERS_WRAP_NAMESPACE(MyGame_Example, x)
The common namespace can also be wrapped for a more consistent
appearance:
    #undef nsc
    #define nsc(x) FLATBUFFERS_WRAP_NAMESPACE(flatbuffers, x)
    nsc(string_ref_t) s;
    s = nsc(string_create_str(B, "hello, world!"));
instead of
    flatbuffers_string_ref_t s;
    s = flatbuffers_string_create_str(B, "hello, world!");
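The reason a wrapping macro can nest while the direct token-pasting form cannot comes down to C preprocessor rules: a macro argument that is an operand of `##` is not expanded before pasting, while an argument passed on to another macro is. The following self-contained sketch demonstrates this with a local stand-in named `WRAP_NAMESPACE`, which is assumed here to behave like `FLATBUFFERS_WRAP_NAMESPACE`. The function `color_value` and the enum values are hypothetical and are not generated by flatcc.

```c
#include <assert.h>

/* Local stand-in for FLATBUFFERS_WRAP_NAMESPACE (an assumption of this sketch). */
#define WRAP_NAMESPACE(prefix, x) prefix ## _ ## x

/* x is not an operand of ## in this definition, so any nested ns(...) inside
 * x is expanded before the prefix is pasted on by WRAP_NAMESPACE. */
#undef ns
#define ns(x) WRAP_NAMESPACE(MyGame_Example, x)

enum { ns(Color_Red) = 1, ns(Color_Green) = 2 };

/* Hypothetical namespaced function, only here to have something to call. */
static int ns(color_value)(int color)
{
    return color * 10;
}

static int nested_namespace_demo(void)
{
    /* Expands to MyGame_Example_color_value(MyGame_Example_Color_Green). */
    return ns(color_value(ns(Color_Green)));
}
```

With `#define ns(x) MyGame_Example_ ## x` the inner `ns(Color_Green)` would be left unexpanded and the call in `nested_namespace_demo` would not compile.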
## Error Codes
Function return values can be grouped roughly into 4 groups: functions
returning pointers, references, `size_t` lengths, and `int` status codes.
Pointers and references return 0 on error. Sizes do not return error.
Status codes return 0 on success or an error code that is usually -1.
Status codes may be checked with `flatbuffers_failed(...)`.
## Endianess
The function `flatbuffers_is_native_pe()` provides an efficient runtime
check for endianness. It returns true when the native endianness matches
the protocol endianness, which for FlatBuffers is little endian. We do
not hardcode little endian
because it enables us to support other protocols in the future - for
example the struct conversions may be very useful for big endian network
protocols.
> As of flatcc 0.4.0 it is possible to compile flatcc with native
> big-endian support which has been tested on AIX. More details in
> [README Endianness](https://github.com/dvidelabs/flatcc#endianness)
By testing `is_native_pe`, dependencies on specific compile time flags,
which are fragile, can be avoided.
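To illustrate what a runtime check buys over compile time flags, here is a minimal sketch of a byte order probe and a conversion that only swaps when the host needs it. This is not how flatcc implements `flatbuffers_is_native_pe()`; the names `is_native_little_endian` and `to_protocol_endian32` are hypothetical, and the protocol endianness is assumed to be little endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical runtime probe: returns 1 on a little endian host, 0 otherwise. */
static int is_native_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: convert a native 32-bit value to little endian
 * (protocol) byte order, swapping bytes only when the host requires it. */
static uint32_t to_protocol_endian32(uint32_t v)
{
    if (is_native_little_endian()) {
        return v;
    }
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

Because the probe is evaluated at runtime, the same source behaves correctly on either kind of host without relying on a preprocessor flag having been detected.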
During build, vectors and structs behave differently from tables: A
table updates one field at a time, doing endian conversion along the
way. A struct is either placed in a table, and is converted by the table
specific operation, or it is placed in a vector. A vector only does the
endian conversion when the vector is finished, so when a vector is not
created atomically with a single `create` call, the elements are placed on a
stack. By default this is in native format, but the user may choose to
place buffer encoded structs or scalars in the vector and call
`vec_end_pe`. The same `push` operation can be used to place a
natively encoded struct and a buffer encoded struct in the vector
because it does no conversion at that point. Therefore there is also no
`push_pe` method that would mean to push an unconverted element onto
the stack. Only for tables and entire vectors does the pe command make
sense. If a vector wishes to push a buffer encoded struct when the
vector is otherwise constructed in native encoding or vice versa, the
vector may be extended empty and then assigned using any of the
`assign`, `assign_from_pe` or `assign_to_pe` calls.
We did not mention that a struct can also be a standalone object
as a buffer root, and for that it has an `end_pe` call that essentially
works like a single element vector without a length prefix.
The `clone` operation is a more user-friendly `pe` operation which takes
an object or a vector from an existing buffer and places it in a new
buffer without endian conversion.
### Deprecated
__NOTE: `FLATBUFFERS_LITTLEENDIAN` is deprecated and will be removed in
a future version. It just complicates endian handling.__
The header files try to define `FLATBUFFERS_LITTLEENDIAN` to 0 or 1
based on system definitions but otherwise leave the flag undefined.
Simply testing for
    #if FLATBUFFERS_LITTLEENDIAN
    ...
    #endif
will not fail if the endianness is undetected but rather give the
impression that the system is big endian, which is not necessarily true.
`flatbuffers_is_native_pe()` relates to the detected or system
provided conversion functions, which are used if a suitable `endian.h`
file is found after the
header file gave up on its own detection (e.g. `le16toh(1) == 1`).
Therefore, it is better to use `flatbuffers_is_native_pe()` in most
cases. It also avoids making assumptions on whether the protocol is
little or big endian.
## Buffers
A buffer can most simply be created with the `create_as_root` call for
a table or a struct as seen earlier. The `as_root` part is just a thin
wrapper around buffer start and stop calls and using these allows for
more flexibility. The `as_root` call also automatically uses the defined file
identifier if any.
The build process begins with starting a buffer. The buffer may contain
a struct or table, so one of these should be constructed subsequently.
Structs are generally created inline in tables; only at the buffer level
is a struct created independently. The API actually permits other
formats, but the result will not be a valid FlatBuffer then.
    flatcc_builder_ref_t root;
    flatcc_builder_init(B);
    /* 0 indicates no file identifier. */
    flatcc_builder_buffer_start(B, 0);
    root = /* ... construct a table or a struct */
    flatcc_builder_buffer_end(B, root);
`buffer_start` takes a file identifier as second argument. If null or a
string with null characters, the identifier is not stored in the buffer.
Regardless of whether a struct or table is declared as root in the schema or
not, there are methods to automatically start both the buffer and struct or buffer
and table such as `Monster_start/end_as_root`. This is also valid for
nested buffers. If the schema has a file identifier, it is used as
identifier for the created object. The alternative
`create_as_root_with_identifier` allows for explicitly setting an id or
explicitly dropping an id by providing a null argument. The
corresponding reader function `Monster_as_root(buffer)` also has a
`Monster_as_root_with_identifier(buffer, id)`. Here the id is ignored if the id
is null, and otherwise the operation returns null if the id does not match.
For the most part ids are handled transparently by these defaults.
The buffer can be started with block alignment and/or a custom
identifier using `flatcc_builder_buffer_start_aligned`:
    flatcc_builder_buffer_start_aligned(B, "myid", 16);
    ...
    flatcc_builder_buffer_end(B, root);
The alignment can be 0, in which case the minimum required alignment is
used, which is
derived from the operations between `start/end`. The alignment argument
is called `block_align` and is useful if the emitter operates on blocks
such as encryption, cache line isolation, or compression blocks where
the final buffer should align with the blocks used during construction.
This can lead to significant zero padding just after the block header,
depending on block size.
The schema specified identifier is given as:
    flatbuffers_identifier
and defaults to null. The schema specified extension is given as:
    flatbuffers_extension
and defaults to null. Note that `flatbuffers_` is replaced by whatever
namespace is chosen. Each specific schema type also has a named file
identifier reflecting the identifier active when the type was defined,
for example:
    MyGame_Example_Monster_file_identifier
This define is used when `create_as_root` automatically sets a file
identifier.
NOTE: before flatcc 0.6.1, the identifier was named
    MyGame_Example_Monster_identifier (DEPRECATED)
but that would conflict with a table field named `identifier` which
happened often enough to be a problem. This naming is now removed on
conflict and will be completely removed in a future version.
When the buffer is ended, nothing special happens but only at this point
does it really make sense to access the resulting buffer. The default
emitter provides a copy method and a direct buffer access method. These
are made available in the builder interface and will return null for
other emitters. See also [flatcc_builder.h] and the default emitter in
`flatcc_emitter.h`.
## Tables
### Adding Fields
If `Monster` is a table, we can create a Monster buffer (after
builder init) as follows:
    Monster_start(B);
    Monster_hp_add(B, 80);
    ...
    flatcc_builder_buffer_create(B, Monster_end(B));
All scalars and enums are added similar to the `Monster_hp_add` call. We
will subsequently see how to deal with other types.
A table can also be created in a single operation using `create`:
    Monster_ref_t m;
    m = Monster_create(B, 80, ...);
The create arguments are those taken by the individual fields' `add`
operations which are either a scalar, an enum, or a reference returned by
another create or end call. Note that unlike the C++ interface, unions
only take a single argument that is also accepted by the `add` operation
of a union field. Deprecated fields are not included in the argument
list.
As of v0.5.3 the arguments are given in field id order which is usually
the same as the schema listed order, except when id attributes are
given explicitly. Using id order ensures version stability. Note that
since deprecated fields are omitted, deprecated fields can still break
existing code.
BREAKING: Prior to flatcc v0.5.3 the create call would use the schema order
also when fields have id attributes specifying a different order. This
could break code across versions and did not match the C++ behavior.
It was also documented that the `original_order` attribute affected create
argument order, but that was incorrect.
NOTE: If the `original_order` attribute is set on a table, the `create`
implementation adds fields to the table in schema listed order,
otherwise it adds fields in order of decreasing size to reduce alignment
overhead. Generally there should be no need to use the `original_order`
attribute. This doesn't affect the call argument order although that
was incorrectly documented prior to v0.5.3.
NOTE: the `create` and `create_as_root` operations are not guaranteed to
be available when the number of fields is sufficiently large because it
might break some compilers. Currently there are no such restrictions.
Scalars and enums do not store the value if it matches the default
value which is by default 0 and otherwise defined in the schema. To
override this behavior, use `force_add`. In the monster example, health
points default to 100 (percent), so if we wish to force store it in the
buffer we could use:
    Monster_hp_force_add(B, 100);
Only scalar fields and enums have a `force_add` operation since only these
types have a default value, and other types have a meaningful
interpretation of null. (It is not quite clear if empty tables separate
from null/absent are valid in all implementations).
`force_add` may be useful when roundtripping data from a database where it is
relevant to distinguish between any valid value and null. Most readers will not
be able to tell the difference, but it is possible to inspect a flatbuffer to
see if a table field is present, present and default, or absent, meaning null.
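The difference between `add` and `force_add` can be modelled with a toy structure that tracks whether a value was physically stored. This is only a sketch of the semantics described above and not of how flatcc stores tables; `struct toy_table` and the `toy_` functions are hypothetical, and the default of 100 is taken from the monster example.

```c
#include <assert.h>
#include <stddef.h>

#define HP_DEFAULT 100 /* schema default assumed for this sketch */

/* Toy stand-in for a table with one scalar field; not a flatcc type. */
struct toy_table {
    int has_hp; /* 1 when a value is physically stored */
    int hp;
};

/* Like `add`: a value equal to the default is not stored. */
static void toy_hp_add(struct toy_table *t, int hp)
{
    assert(t != NULL);
    if (hp != HP_DEFAULT) {
        t->has_hp = 1;
        t->hp = hp;
    }
}

/* Like `force_add`: the value is stored even when it equals the default. */
static void toy_hp_force_add(struct toy_table *t, int hp)
{
    assert(t != NULL);
    t->has_hp = 1;
    t->hp = hp;
}

/* Ordinary readers see the default when the field is absent. */
static int toy_hp(const struct toy_table *t)
{
    return t->has_hp ? t->hp : HP_DEFAULT;
}

/* Only an explicit presence check tells "absent" from "stored default". */
static int toy_hp_is_present(const struct toy_table *t)
{
    return t->has_hp;
}
```

In this model both paths read back as 100, and only the presence check distinguishes them, which is the distinction that matters when a database null must survive a round trip.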
NOTE: As of mid 2020, FlatBuffers added optional scalar table fields with support in flatcc 0.6.1. These fields automatically imply `force_add` to represent null values when a field is absent and therefore these fields do not have a `force_add` method and these fields also do not have a default value other than `null`, i.e. null if not added.
If Monster is declared as root, the above may also be called as:
    Monster_start_as_root(B);
    Monster_hp_add(B, 80);
    ...
    Monster_end_as_root(B);
(Calling `Monster_end` instead would require a subsequent `buffer_end`
call, and is basically a violation of nesting).
### Nested Tables
Tables can be nested, for example the `mini` field may again have the
Monster table type (a recursive type):
    flatcc_builder_buffer_start(B, 0);
    Monster_start(B);
    Monster_hp_add(B, 80);
    Monster_start(B);
    Monster_hp_add(B, 81);
    ...
    Monster_mini_add(B, Monster_end(B));
    ...
    flatcc_builder_buffer_end(B, Monster_end(B));
The child Monster table may be created before the parent or as above
between the table's start and end. If created before, the reference must be
stored until it can be added. The only requirement is that start and
end are balanced, that the sub-table is ended before the parent, and
that both are created in the same buffer (nested buffers can be created
while the parent buffer is still being created, similar to sub-tables,
so it is possible to mess this up):
    Monster_ref_t root, mini;
    flatcc_builder_buffer_start(B, 0);
    Monster_start(B);
    Monster_hp_add(B, 81);
    mini = Monster_end(B);
    Monster_start(B);
    Monster_hp_add(B, 80);
    Monster_mini_add(B, mini);
    root = Monster_end(B);
    flatcc_builder_buffer_end(B, root);
Rather than adding a child table explicitly, it can be started and ended
as an operation on the field name, here with `Monster_mini_start/end`:
    Monster_ref_t root;
    flatcc_builder_buffer_start(B, 0);
    Monster_start(B);
    Monster_hp_add(B, 80);
    Monster_mini_start(B);
    Monster_hp_add(B, 81);
    Monster_mini_end(B);
    root = Monster_end(B);
    flatcc_builder_buffer_end(B, root);
We can repeat the table nesting as deep as we like, provided our
builder is willing to allocate enough stack space.
**Warning**: It is possible to use the wrong table type operations
between `start/end` - don't do that. It is a tradeoff between usability
and type safety.
Note that vectors, strings and structs map several standard operations
to a field name, for example `mytable_myfield_push(B, x)`. This is not the
case with table fields which only map `start/end/create` in part because it
would never terminate for recursive types and in part because each table
is different making a generic mapping rather complex and with very long
names.
A table may be created with a constructor, but it requires all
non-scalar objects to be references or pointers. Struct fields must be
pointers to zero padded structs, and strings, vectors and tables must be
references. The constructors are probably most useful for simple tables
with mostly scalar values (here we use the original Monster fields and
leave out any we have invented for the sake of illustration):
IMPORTANT: objects can generally only be created within a buffer
context, i.e. after `buffer_start`. For example calling
`flatbuffers_uint8_vec_create` before `Monster_create_as_root`
technically violates this rule because the create call also starts the
buffer. It is, however, allowed at the top level. For nested buffers
(see later) this must be avoided because the vector would end up in the
wrong buffer.
    Monster_ref_t m;
    uint8_t invdata[4] = { 1, 2, 3, 4 };
    Vec3_t vec; /* assumed to be filled in and zero padded elsewhere */
    /* name: a string reference created earlier in the same buffer context */
    flatbuffers_uint8_vec_ref_t inventory =
        flatbuffers_uint8_vec_create(B, invdata, 4);
    m = Monster_create(B, &vec, 150, 80, name, inventory,
            Color_Red, Any_as_NONE());
    flatcc_builder_buffer_create(B, m);
or
    Monster_create_as_root(B, &vec, 150, 80, name, inventory,
            Color_Red, Any_as_NONE());
## Packing tables
By reordering the fields, the table may be packed better, or be better
able to reuse an existing vtable. The `create` call already does this
unless the attribute `original_order` has been set. Unions present a
special problem since it is two fields treated as one and the type field
will generally waste padding space if stored in order:
To help pack unions better these can be added with the type
separate from the value reference using `add_type(B, test.type)`,
`add_value(B, test)` where the value is only added if the type is
not `NONE`. The `add_type` should be called last since it is the
smallest type.
The same field should be added at most once. Internal
reservations that track offset fields may overflow otherwise. An
assertion will fail in debug builds.
Required table fields will be asserted in debug builds as part of the
`end/create` call. Only offset fields can have a required attribute.
The generated `monster_test_reader.h` from [monster_test.fbs] shows how
the default packing takes place in generated `create` calls, see for
example the typealias test. Note that for example vectors are stored
together with integers like `uint32` because references to vectors have
the same size as `uint32`.
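The padding cost of storing a small field before a larger one can be illustrated with a deliberately simplified model. The sketch below is not how flatcc lays out tables internally; it only assumes that each field is aligned to its own size, and `sketch_end_offset` is a hypothetical helper introduced for illustration:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of flatcc: returns the end offset after
 * placing fields in the given order, where each field is aligned to its
 * own size (all sizes must be non-zero).
 */
static size_t sketch_end_offset(const size_t *sizes, size_t count)
{
    size_t offset = 0;
    size_t i;

    for (i = 0; i < count; ++i) {
        /* Round up to the field's alignment, then place the field. */
        offset = (offset + sizes[i] - 1) / sizes[i] * sizes[i];
        offset += sizes[i];
    }
    return offset;
}
```

In this model a 1-byte union type followed by a 4-byte value reference ends at offset 8, while placing the 1-byte type last ends at offset 5, which is why adding the type last tends to pack better.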
## Strings
Strings can be added to tables with zero terminated strings as source:
Monster_start(B);
...
Monster_name_create_str(B, "Mega Monster");
Monster_end(B);
or strings potentially containing zeroes:
#define MONSTER "Mega\0Monster"
Monster_start(B);
...
/* Includes the embedded zero, but not the terminating zero. */
Monster_name_create(B, MONSTER, sizeof(MONSTER) - 1);
Monster_end(B);
or zero terminated source up to at most `max_len` characters:
#define MONSTER "Mega\0Monster"
Monster_start(B);
...
/* "Mega" */
Monster_name_create_strn(B, MONSTER, 12);
Monster_end(B);
The `create_str` and `create_strn` versions find the string length via `strlen`
and `strnlen` respectively. `append_string` also has `_str/_strn` versions.
A string can also be created from an existing flatbuffer string in which
case the length is expected to be stored 4 bytes before the pointer in
little endian format, and aligned properly:
Monster_name_clone(B, mybufferstring);
or, create a string at most 4 characters long starting at 0-based index
10, if present:
Monster_name_slice(B, mybufferstring, 10, 4);
If index or index + len goes beyond the source, the result is truncated
accordingly, possibly resulting in an empty string.
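The length prefix and the truncation rule described above can be sketched in plain C. This is only an illustration under stated assumptions: `sketch_string_len` and `sketch_slice_len` are hypothetical helpers, not flatcc functions, and the sketch assumes the pointer really does refer to a string inside a valid buffer:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: reads the 32-bit little endian length stored
 * 4 bytes before a FlatBuffers string pointer, independent of host
 * endianness.
 */
static uint32_t sketch_string_len(const char *s)
{
    const unsigned char *p = (const unsigned char *)s - 4;

    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: clamps a requested slice [index, index + len)
 * to a source of src_len elements and returns the resulting length,
 * which is 0 when index is beyond the source.
 */
static size_t sketch_slice_len(size_t src_len, size_t index, size_t len)
{
    if (index >= src_len) {
        return 0;
    }
    if (len > src_len - index) {
        return src_len - index;
    }
    return len;
}
```

For a 12 character source, a slice at index 10 with length 4 is truncated to 2 characters, and a slice starting at index 12 is empty.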
A string can also be created independently. The above are just shortcuts
for that:
flatbuffers_string_ref_t monster_name;
monster_name = flatbuffers_string_create_str(B, "Mega Monster");
Monster_name_add(B, monster_name);
Strings are generally expected to be utf-8, but any binary data will be
stored. Zero termination or embedded control codes are included as is.
The string gets a final zero termination regardless, not counted in the
string length (in compliance with the FlatBuffers format).
A string can also be constructed from a more elaborate sequence of
operations. A string can be extended, appended to, or truncated and
reappended to, but it cannot be edited after other calls including calls
to update the same string. This may be useful if stripping escape codes
or parsed delimiters, etc., but here we just create the same "Mega
Monster" string in a more convoluted way:
flatbuffers_string_ref_t name;
char *s;
#define N 20
Monster_start(B);
...
flatbuffers_string_start(B);
flatbuffers_string_append(B, "Mega", 4);
flatbuffers_string_append(B, " ", 1);
s = flatbuffers_string_extend(B, N);
strncpy(s, "Monster", N);
flatbuffers_string_truncate(B, N - strlen(s));
name = flatbuffers_string_end(B);
Monster_name_add(B, name);
...
Monster_end(B);
`flatbuffers_string_create...` calls are also available when creating
the string separate from adding it to a table, for example:
flatbuffers_string_ref_t name;
name = flatbuffers_string_create_str(B, "Mini Monster");
It is guaranteed that any returned string buffer is zero filled and
has an extra zero after the requested length such that strlen can be
called on the content, but only the requested bytes may be updated.
Every call only returns the substring being added to the string in that
operation. It is also possible to call `flatbuffers_string_edit` to get a
modifiable pointer to the start of the string.
`flatbuffers_string_reserved_len(B)` returns the current string length
including any embedded zeroes, but excluding final zero termination. It
is only valid until `string_end` is called.
See [flatcc_builder.h] for detailed documentation. Essentially `extend`
reserves zeroed space on the stack and returns a buffer to the new
space, and truncate reduces the overall size again, and the string is
then given the final length and a zero termination at the end.
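The extend and truncate semantics can be mimicked with a small fixed-size buffer. The following is a sketch only, assuming a 64 byte capacity; the `sketch_string_*` names are hypothetical and do not reflect how the builder stack is actually implemented:

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_CAP 64

typedef struct {
    char buf[SKETCH_CAP];
    size_t len; /* Current length, excluding zero termination. */
} sketch_string_t;

static void sketch_string_start(sketch_string_t *s)
{
    memset(s, 0, sizeof(*s));
}

/* Reserves n zeroed bytes plus a terminator and returns the new segment. */
static char *sketch_string_extend(sketch_string_t *s, size_t n)
{
    char *p;

    if (n >= SKETCH_CAP - s->len) {
        return NULL;
    }
    p = s->buf + s->len;
    memset(p, 0, n + 1);
    s->len += n;
    return p;
}

/* Appends n bytes by extending and copying into the new segment. */
static int sketch_string_append(sketch_string_t *s, const char *data, size_t n)
{
    char *p = sketch_string_extend(s, n);

    if (!p) {
        return -1;
    }
    memcpy(p, data, n);
    return 0;
}

/* Drops the last n bytes and zeroes them so termination is preserved. */
static void sketch_string_truncate(sketch_string_t *s, size_t n)
{
    if (n > s->len) {
        n = s->len;
    }
    memset(s->buf + s->len - n, 0, n);
    s->len -= n;
}
```

Replaying the "Mega Monster" sequence against this model (append 4, append 1, extend 20, `strncpy`, truncate by `20 - strlen(s)`) leaves a length of 12 and a zero terminated buffer.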
There is no endian conversion (except internally for the string length),
because UTF-8 strings are not sensitive to endianness.
Like tables, the string may be created while a parent container is being
constructed, or before.
Strings can also be used as vector elements, but we will get to that when
discussing vectors.
## Structs
Structs in tables can be added as:
Monster_pos_create(B, 1, 2, 3);
The above essentially does the following:
Vec3_t *v;
v = Monster_pos_start(B);
Vec3_assign(v, 1, 2, 3);
Monster_pos_end(B);
Some versions of the monster schema have extra test fields - these would
break the assign approach above because there would be extra arguments.
Instead we can rely on the zero initialization and assign known fields:
Vec3_t *v;
v = Monster_pos_start(B);
v->x = 1, v->y = 2, v->z = -3.2;
Monster_pos_end(B);
`Monster_pos_end_pe(B)` can be used when the struct is known to be
little endian (pe for protocol endian, meaning no conversion is necessary),
for example copied from an existing buffer, but then `clone` is a better
choice:
Monster_pos_clone(B, &v);
When the struct is created alone for use as root:
Vec3_ref_t root;
root = Vec3_create(B, 1, 2, 3);
flatcc_builder_buffer_create(B, root);
An existing struct can be added as:
Vec3_t v;
Vec3_assign(&v, 1, 2, 3);
/* v does not have to be zero padded. */
Monster_pos_add(B, &v);
When adding a struct that is already little endian, presumably from an
existing buffer, it can be cloned using:
Monster_pos_clone(B, &v);
Clone assumes that the source struct is little endian and that its
padding is already zeroed. It is roughly equivalent to the following,
where `end_pe` does nothing (the example ignores error handling):
*Monster_pos_start(B) = v;
Monster_pos_end_pe(B);
There are several assignment types that convert between host (native)
endianness and buffer endianness. We use `pe` to indicate
`protocol_endian` rather than just `le` for `little endian` because it
allows us to change endianness to big endian in the future and it
more clearly states the intention. While big endian is not allowed in
FlatBuffers, big endian structs may be useful in other network
protocols - but it is not currently supported because it would force
little endian platforms to support byte-swapping. The operations are:
`assign_from_pe`, `assign_to_pe`, `copy`, `copy_from_pe`,
`copy_to_pe`, `to_pe` and `from_pe`.
All the copy operations take a const pointer as source, and
`to/from_pe` is just copy with same source and destination:
Vec3_t v, v2;
Vec3_assign_to_pe(&v2, 1, 2, 3);
Vec3_copy_from_pe(Vec3_clear(&v), &v2);
Vec3_to_pe(&v);
`from_pe` means from little endian to native endian, and `to_pe`
is the opposite. On little endian platforms all copy operations behave
the same and only move fields, not padding. `to/from_pe` conversion
will leave deprecated fields either as they were, or zero them because
the operation may be skipped entirely on protocol endian native platforms.
While struct fields cannot be deprecated officially, they are supported
if the schema compiler is flagged to accept them. The struct fields are
renamed and assigned 0 when using assign or copy, and assign / create has
no argument for them.
Because padding can carry noise and unintended information, structs
should be cleared before assignment - but if used as a source to copy
the padding is not copied so only the destination needs to be zeroed.
If a struct is nested, the assign operation includes all fields as if
the struct was flattened:
typedef struct Plane Plane_t;
struct Plane {
Vec3_t direction;
Vec3_t normal;
};
Plane_t plane;
Plane_clear(&plane);
Plane_assign(&plane, 1, 2, 3, 7, 8, 9);
Structs can also be created standalone, similar to tables and vectors,
but FlatBuffers only supports this when the struct is used as root.
Assuming Vec3 is declared as root, a buffer only holding a Vec3 struct
can be created using:
Vec3_create_as_root(B, 1, 2, 3);
Important: do not store the above as a nested buffer - it would be
missing the vector size field. If `Monster_playground` is a ubyte vector
with `nested_flatbuffer` attribute, then
`Monster_playground_start/end_as_root` may be used.
Structs also support `start/end_as_root`. In this case `start` returns
the struct pointer, and `end_pe_as_root` is supported:
Vec3_t *v;
v = Vec3_start_as_root(B);
v->x = 1, v->y = 2, v->z = 3;
Vec3_end_as_root(B);
(Be careful with the different result codes since a table's `start_as_root`
returns an integer result code where 0 is success while a struct returns
a pointer that is null on failure.)
The following also creates a buffer at top-level, but it may also be
added as a nested buffer because the stack frame detects the nesting:
Vec3_t *v;
flatcc_builder_buffer_start(B);
v = Vec3_start(B);
v->x = 1, v->y = 2, v->z = 3;
flatcc_builder_buffer_end(B, Vec3_end(B));
or
flatcc_builder_buffer_start(B);
...
Monster_start(B);
flatcc_builder_buffer_start(B);
v = Vec3_start(B);
v->x = 1, v->y = 2, v->z = 3;
Monster_playground_add(B,
flatcc_builder_buffer_end(B, Vec3_end(B)));
flatcc_builder_buffer_end(B, Monster_end(B));
or
flatcc_builder_buffer_ref_t nested_root;
flatcc_builder_buffer_start(B);
nested_root = Vec3_create_as_root(B, 1, 2, 3);
Monster_start(B);
Monster_playground_add(B, nested_root);
flatcc_builder_buffer_end(B, Monster_end(B));
A `buffer_ref_t` can be used as `uint8_vec_ref_t` when the
buffer is nested, and otherwise the reference cannot be used
for anything other than testing for failure. The buffer content
should match the type declared in a `nested_flatbuffer` attribute
but it isn't enforced, and a root can be stored in any field of
[ubyte] type.
When `Monster_playground` is declared as nested:
...
Monster_start(B);
Monster_playground_create_as_root(B, 1, 2, 3);
flatcc_builder_buffer_end(B, Monster_end(B));
...
Be aware that `Vec3_t` is for native updates while `Vec3_struct_t` is a const
pointer to an endian encoded struct used in the reader interface, and actually
also as source type in the clone operation.
### Fixed Length Arrays in Structs
As of flatcc 0.6.0 it is possible to have fixed length arrays as struct
members. A fixed length array is equivalent to having a struct field repeated
one or more times. The schema syntax is `name : [type:count];` similar to an
ordinary struct field `name : type;`. The type is any type that can be a valid
struct field type including enums and nested structs. The size cannot be 0 and
the overall size is limited by the maximum size of the struct that contains
the array, which is typically 65535 (2^16-1).
For example, given the schema:
struct MyStruct {
counters:[int:3];
// char is only valid as a fixed length array type
name:[char:6];
}
table MyTable {
mystruct:MyStruct;
}
The table can be created with:
ns(MyStruct_t) *x;
ns(MyTable_start_as_root(B));
x = ns(MyTable_mystruct_start(B));
x->counters[0] = 1;
x->counters[1] = 2;
x->counters[2] = 3;
strncpy(x->name, "Kermit", sizeof(x->name));
ns(MyTable_mystruct_end(B));
ns(MyTable_end_as_root(B));
Note that char arrays are not zero terminated but they are zero padded, so
strncpy is exactly the right operation to use when assigning to char arrays,
at least when the source does not contain embedded nulls (which are valid
in char arrays).
Char arrays are expected to be ASCII or UTF-8, but an application may use
other encodings if this is clear to all users.
With assignment:
int data[3] = { 1, 2, 3 };
ns(MyStruct_t) *x;
ns(MyTable_start_as_root(B));
x = ns(MyTable_mystruct_start(B));
// Careful: the name argument does not use strncpy internally
// so the source must be at least the expected length
// like other array arguments. Strings can have embedded nulls.
ns(MyStruct_assign(x, data, "Kermit"));
ns(MyTable_mystruct_end(B));
ns(MyTable_end_as_root(B));
To read a struct, the pointer to the struct is retrieved first:
int sum;
int i;
const char *name;
size_t name_len;
ns(MyTable_table_t) t;
ns(MyStruct_struct_t) x;
t = ns(MyTable_as_root(buf));
x = ns(MyTable_mystruct_get(t));
for (sum = 0, i = 0; i < ns(MyStruct_counters_get_len()); ++i) {
    sum += ns(MyStruct_counters_get(x, i));
}
// char arrays are endian neutral, so we can use pointer access.
name = ns(MyStruct_name_get_ptr(x));
name_len = strnlen(name, ns(MyStruct_name_get_len()));
// The precision argument of %.*s must be an int.
printf("Added counters from %.*s", (int)name_len, name);
// char arrays can be accessed like other arrays:
// ns(MyStruct_name_get(x, i));
An alternative to `strnlen` is to strip trailing zeroes, which allows for
char arrays with embedded zeroes, but there is no direct support for this. The JSON
printer uses this approach to shorten the printed char array string.
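Stripping trailing zeroes is straightforward to write by hand. A minimal sketch, where `sketch_char_array_len` is a hypothetical helper and not something flatcc generates:

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the length of a zero padded char array
 * of n bytes after stripping trailing zeroes. Embedded zeroes before
 * the last non-zero byte are counted as content.
 */
static size_t sketch_char_array_len(const char *a, size_t n)
{
    while (n > 0 && a[n - 1] == '\0') {
        --n;
    }
    return n;
}
```

For the 6 byte array `{ 'a', 0, 'b', 0, 0, 0 }` this yields 3, whereas `strnlen` would stop at the first zero and report 1.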
The `_get` suffix can be omitted in the above if the flatcc `-g` option,
which reduces the risk of name conflicts, has not been supplied, but not
for `_get_len` and `_get_ptr`.
Note that it is not possible to have fixed length arrays as part of a table but
it is possible to wrap such data in a struct, and it is also possible to have
vectors of structs that contain fixed length arrays.
## Nested Buffers
These are discussed under Structs and Table sections but it is worth
noting that nested buffers can also be added as pe ubyte vectors
which is probably the original intention with nested buffers. However,
when doing so it can be difficult to ensure the buffer is correctly
aligned. The untyped `flatcc_builder` has various options to deal with
this, but with generated code it is better to create a nested buffer
inline when suitable (with nested `buffer_start/end` or
`mytable_myfield_create_as_root`) - for example a message wrapper with
a union of tables holding a buffer for a specific message type. In other
cases the buffer may truly be created independently of the current
buffer and then it can be added with controlled alignment using either
the `flatcc_builder` api for full control, or the `nest` operation on
nested table and struct fields.
To create and add a ubyte vector with a higher alignment than the
single-byte alignment of ubyte, the following operation is available
on a nested buffer field:
Monster_playground_nest(B, void *data, size_t size, uint16_t align);
If alignment is unknown, it can be set to 0, and it will default to 8
for nested table types, and to the struct alignment for struct buffers.
Block alignment is inherited from the parent buffer so the child buffer
ends up in its own set of blocks, if block alignment is being used. If
the nested buffer needs a different block alignment, the `flatcc_builder`
api must be used.
All structs and tables have `start/end/create_as_root` operations even if they
are not referenced by any `nested_flatbuffer` field and they will
create [ubyte] vectors containing a nested buffer but only [ubyte]
fields with the `nested_flatbuffer` attribute will have dedicated
`start/end/create_as_root` on the field name. Structs also have
`end_pe_as_root`.
## Scalars and Enums
Scalars keep their original type names `uint8_t`, `double`, etc, but
they get some operations similar to structs. These are contained in a
namespace which by default is `flatbuffers_`, for example:
uint16_t *flatbuffers_uint16_to_pe(uint16_t *p);
uint16_t *flatbuffers_uint16_from_pe(uint16_t *p);
flatbuffers_bool_t *flatbuffers_bool_to_pe(flatbuffers_bool_t *p);
flatbuffers_bool_t *flatbuffers_bool_from_pe(flatbuffers_bool_t *p);
These may be used freely, but are primarily present as an interface to
the vector operations also defined for structs.
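What an in-place `to_pe`/`from_pe` conversion amounts to can be sketched portably. This is an illustration only, not the flatcc implementation (which would normally resolve endianness at compile time); the `sketch_` names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little endian host, 0 otherwise. */
static int sketch_host_is_le(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

static uint16_t sketch_bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Converts *p in place from native to protocol (little) endian. */
static uint16_t *sketch_uint16_to_pe(uint16_t *p)
{
    if (!sketch_host_is_le()) {
        *p = sketch_bswap16(*p);
    }
    return p;
}

/* Converts *p in place from protocol (little) endian to native. */
static uint16_t *sketch_uint16_from_pe(uint16_t *p)
{
    if (!sketch_host_is_le()) {
        *p = sketch_bswap16(*p);
    }
    return p;
}
```

On a little endian host both functions leave the value untouched, which matches the remark that such conversions may be skipped entirely on protocol endian platforms.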
Enums have similar definitions which may be used to convert endianness
without being concerned with the underlying integer type, for example:
Color_enum_t *Color_to_pe(Color_enum_t *p);
## Vectors
Vectors can be created independently, or directly when updating a table - the
end result is the same. Builder vector operations always reference element
values by pointer, or by reference for offset types like tables and strings.
uint8_t v;
Monster_inventory_start(B);
v = 1;
flatbuffers_uint8_vec_push(B, &v);
v = 2;
flatbuffers_uint8_vec_push(B, &v);
v = 3;
flatbuffers_uint8_vec_push(B, &v);
Monster_inventory_end(B);
or
flatbuffers_uint8_vec_ref_t inv;
uint8_t v;
flatbuffers_uint8_vec_start(B);
v = 1;
flatbuffers_uint8_vec_push(B, &v);
v = 2;
flatbuffers_uint8_vec_push(B, &v);
v = 3;
flatbuffers_uint8_vec_push(B, &v);
inv = flatbuffers_uint8_vec_end(B);
Monster_inventory_add(B, inv);
Because it can be tedious and error-prone to recall the exact field
type, and because the operations are not type safe (any kind of push
would be accepted), some vector operations are also mapped to the field
name:
uint8_t v;
Monster_inventory_start(B);
v = 1;
Monster_inventory_push(B, &v);
v = 2;
Monster_inventory_push(B, &v);
v = 3;
Monster_inventory_push(B, &v);
Monster_inventory_end(B);
Note: vector operations on a type use the `_vec_<operation>` syntax, for
example `uint8_vec_push` or `Monster_vec_push` while operations that are mapped
onto table field names of vector type do not use the `_vec` infix because it is
not a type name, for example `Monster_inventory_push`.
A slightly faster operation preallocates the vector:
uint8_t *v;
Monster_inventory_start(B);
v = Monster_inventory_extend(B, 3);
v[0] = 1, v[1] = 2, v[2] = 3;
v = Monster_inventory_extend(B, 2);
v[0] = 4, v[1] = 5;
Monster_inventory_end(B);
Push just extends one element at a time. Note that `extend` returns the
pointer to the extended vector segment. The full vector can be accessed
with `edit` and `reserved_len` between `start/end` (recalling that pointers
cannot be reused across buffer calls):
uint8_t *v, i;
uint8_t data[] = { 1, 2 };
Monster_inventory_start(B);
Monster_inventory_push(B, &data[0]);
Monster_inventory_push(B, &data[1]);
v = Monster_inventory_edit(B);
for (i = 1; i < Monster_inventory_reserved_len(B); ++i) {
v[i] = v[i - 1] + v[i];
}
Monster_inventory_end(B);
Note that the name `reserved_len` is to avoid confusion with
`_vec_len` read operation. It also indicates that it is not the final
size since it may change with `truncate/extend`.
A vector can also contain structs. Let us extend the Monster example
with a vector of positions, so we can have a breadcrumb trail:
Monster_breadcrumbs_start(B);
Vec3_vec_push_create(B, 1, 2, 3);
Vec3_vec_push_create(B, 3, 4, 5);
Monster_breadcrumbs_end(B);
or
Monster_breadcrumbs_start(B);
Monster_breadcrumbs_push_create(B, 1, 2, 3);
Monster_breadcrumbs_push_create(B, 3, 4, 5);
Monster_breadcrumbs_end(B);
or
/* `extend` returns a pointer to the two new inline struct elements. */
Vec3_t *trails;
Monster_breadcrumbs_start(B);
trails = Monster_breadcrumbs_extend(B, 2);
Vec3_assign(&trails[0], 1, 2, 3);
Vec3_assign(&trails[1], 4, 5, 6);
Monster_breadcrumbs_end(B);
The `vec_start/extend/end/end_pe/create/create_pe/clone/slice` are
translated into similar calls prefixed with the field name instead of
`vector` and except for `start`, the calls also add the vector to the
table if successful, for example:
uint8_t data[] = { 1, 2, 3 };
Monster_inventory_create(B, data, 3);
Monster_breadcrumbs_slice(B, some_other_breadcrumbs, 0, 10);
Vector operations that are allowed between `vec_start` and
`vec_end(_pe)` are also mapped. These are
`vec_extend/append/truncate/edit/reserved_len`, and `push/push_create/push_copy`.
`push_copy` ensures only valid fields are copied, not zero padding (or
the unofficial deprecated fields).
A struct `push_clone` is the same as a `push_copy` operation
because structs are stored inline in vectors - with the
exception of union vectors which have `push_clone` that does the
right thing.
The `add` call adds a vector created independently from the table field,
and this is what is going on under the surface in the other calls:
Vec3_t x;
Vec3_vec_ref_t inv;
/* Clear any padding in `x` because it is not allocated by builder. */
Vec3_assign(Vec3_clear(&x), 3, 4, 5);
Vec3_vec_start(B);
Vec3_vec_push_create(B, 1, 2, 3);
Vec3_vec_push(B, &x);
inv = Vec3_vec_end(B);
Monster_breadcrumbs_add(B, inv);
As always, a reference such as `inv` may only be used at most once, and
should be used once to avoid garbage.
Note that `Vec3_start` would create an independent struct instead of a
vector of structs. Also note that `vec_ref_t` is a builder specific
temporary type while `vec_t` is intended as a const pointer to the first
element in an existing buffer in little endian encoding with a size
prefix (to be used with clone, for example).
An existing Vec3 struct can also be pushed with `Vec3_vec_push(B, &v)`. The
argument must be zero padded. Because vectors are converted at the end,
there is no `push_pe`, but a struct may be in little endian using push
on all platforms if `vec_end_pe` is used at the end.
A vector may also be created from an existing array:
uint8_t data[] = { 1, 2, 3 };
Monster_inventory_add(B, flatbuffers_uint8_vec_create(B, data, 3));
This also applies to arrays of structs as long as they are properly zero
padded. `create_pe` is similar but does not do any endian conversion,
and is similar to `clone` except there is no header prefix.
Likewise an existing vector with proper zero padding may be appended
using the `extend` operation. The format must be native or little endian
depending on whether `vec_end` or `vec_end_pe` is called at the end.
All vectors are converted to little endian when the `end` command is
called. `end_pe` prevents this from happening.
`clone` and `slice` can be used to copy an entire, or a partial
array from an existing buffer. The pointer must be to the first vector
element in little endian format, and it must have a size prefix and be
aligned (like any flatbuffer vector). `slice` takes a base-0 index and
a vector length where the result is truncated if the source is not
large enough.
Monster_inventory_clone(B, v);
or
Monster_inventory_add(B, flatbuffers_uint8_vec_clone(B, v));
or
Monster_inventory_add(B, flatbuffers_uint8_vec_slice(B, v, 2, 4));
or
Monster_inventory_slice(B, v, 2, 4);
A vector of strings can be constructed as (`friends` is a string
vector field that we just invented for the occasion):
flatbuffers_string_ref_t friend, *p;
Monster_friends_start(B);
friend = flatbuffers_string_create_str(B, "Peter Pan");
Monster_friends_push_create_str(B, "Shrek");
Monster_friends_push_create_str(B, "Pinocchio");
Monster_friends_push_create_str(B, "Pinocchio");
Monster_friends_push_create(B, "Hector", 6);
Monster_friends_push(B, friend);
p = Monster_friends_extend(B, 1);
*p = flatbuffers_string_create_str(B, "Cinderella");
Monster_friends_push_start(B);
flatbuffers_string_append(B, "The Little ", 11);
flatbuffers_string_append(B, "Mermaid", 7);
Monster_friends_push_end(B);
Monster_friends_end(B);
Vectors and strings have a second argument to start, see also the `spawn` example
below.
Finally, vectors can contain tables. Table vectors are offset
vectors just like string vectors. `push_start` pushes a new table and
allows for updates until `push_end`. If we have a spawn vector of monsters in
the Monster table, we can populate it like this:
Monster_spawn_start(B);
Monster_vec_push_start(B);
Monster_hp_add(B, 27);
Monster_vec_push_end(B);
Monster_vec_push_create(B,
/* Approximate argument list for illustration only. */
&vec, 150, 80, name, inventory, Color_Red, Any_as_NONE());
Monster_spawn_end(B);
The push operation has constructors `push_start/end/create` for table,
struct, and string elements. String elements also have
`push_create_str/create_strn/clone/slice`. Structs also have
`push_copy`. Between `push_start` and
`push_end` the operations valid for the given table or string element can be
used (typically `add` for tables, and `append` for strings).
Instead of `Monster_vec_push_start` we can also use
`Monster_spawn_push_start` etc. - in this case the child type is the
same as the parent, but using the field specific `push_start` ensures we
get the right table element type.
`Monster_spawn_push_start(B)` takes no length argument because it is a
table element, while `Monster_friends_push_start(B)` because it is a
string element (similar to a vector).
`Monster_spawn_start(B)` should just be followed by push operations
rather than following up with `Monster_spawn_extend(B, n)` because we
risk loose references that can lead to crashes. But handled carefully
it is possible:
Monster_ref_t *mvec;
Monster_spawn_start(B);
mvec = Monster_spawn_extend(B, 2);
mvec[0] = Monster_create(B, ...);
mvec[1] = Monster_create(B, ...);
Monster_spawn_end(B);
We can also push a reference to an independently created monster table,
all as seen before with strings.
As of flatcc version 0.5.2 it is also possible to clone tables.
Therefore we also have `push_clone` on vectors of tables.
While the use of `extend` and `truncate` is possible with vectors of
strings and tables, they should be used with care because the elements
are references and will just end up as garbage if truncated. On the
other hand, unused elements should be truncated because 0 elements in an
offset vector are not valid.
A vector of tables or strings can be created using an externally built
array of references, for example:
Monster_ref_t monsters[20];
Monster_vec_ref_t mvec;
monsters[0] = Monster_create(B, ...);
...
mvec = Monster_vec_create(B, monsters, 20);
By convention, create calls bypass the internal stack when the endian
format is otherwise compatible, and thus feed the emitter directly.
This is not possible with table and string vectors because the
references in the source vectors must be translated into offsets.
Therefore these create calls are similar to start, append, end calls.
There is an internal, but unexposed `flatcc_builder` version
`create_offset_vector_direct` which destroys the source vector instead
of allocating a stack copy.
## Unions
Unlike the C++ FlatBuffers library, we do not expose a separate union
type field except via a small struct with a union of typed references
and a type field. This struct is given to the create argument, and above
it is zero initialized meaning default None.
Unions can be created with value specific `start/end/create` calls. The add
call is not specialized since it takes a union reference:
Monster_test_Weapon_start(B);
Weapon_rounds_add(B, 50);
Monster_test_Weapon_end(B);
or
Monster_test_Weapon_create(B, 50);
or
Monster_test_Weapon_add(B, Weapon_create(B, 50));
or
Monster_test_Pickup_start(B);
Pickup_location_create(B, 0, 0, 17);
Pickup_hint_create_str(B, "Jump High!");
Monster_test_Pickup_end(B);
or
Pickup_ref_t test;
Pickup_start(B);
Pickup_location_create(B, 0, 0, 17);
test = Pickup_end(B);
Monster_test_add(B, Any_as_Pickup(test));
or
Any_union_ref_t test;
Pickup_start(B);
Pickup_location_create(B, 0, 0, 17);
/* test.Pickup = Pickup_end(B); no longer possible as of v0.5.0 */
test.value = Pickup_end(B); /* As of v0.5.1. */
test.type = Any_Pickup;
Monster_test_add(B, test);
The following is valid and will not return an error, but also has no effect:
Monster_test_add(B, Any_as_NONE());
_Note: the union structure has been changed for v0.5.0, and v0.5.1.
Both unions and union vectors are now represented by a struct with the
fields { type, value } in the low level interfaces. Before 0.5.0 only
unions of tables were supported._
### Union Vectors
The `monster_test.fbs` schema has a field named manyany in the Monster
table. It is a vector of unions of type Any.
We can create a vector using
Any_union_vec_ref_t anyvec_ref;
Any_vec_start(B);
Any_vec_push(B, Any_as_TestSimpleTableWithEnum(
TestSimpleTableWithEnum_create(B)));
anyvec_ref = Any_vec_end(B);
Monster_manyany_add(B, anyvec_ref);
A union can be constructed with type specific `_push` or `_push_create` operations:
Monster_manyany_start(B);
Monster_manyany_push(B, Any_as_TestSimpleTableWithEnum(ref));
Monster_manyany_end(B);
Monster_manyany_start(B);
Monster_manyany_TestSimpleTableWithEnum_push(B, ref);
Monster_manyany_end(B);
Monster_manyany_start(B);
Monster_manyany_TestSimpleTableWithEnum_push_create(B, args);
Monster_manyany_end(B);
and other similar operations, much like other vectors.
Note that internally `anyvec_ref` is really two references, one to type
vector and one to a table vector. The vector is constructed as a single
vector of unions and later split into two before final storage. If it is
necessary to create a union vector from a vector of tables and types,
the low level builder interface has a `direct` call to do this.
Union vectors generally use more temporary stack space during
construction because each element is a struct of type and reference
which doesn't pack as densely as two separate vectors. In addition the
separated type and table vectors must be constructed temporarily. The
final buffer result is reasonably compact since the type vector does
not use much space. Union vectors will also be somewhat slower to construct,
but not unreasonably so.
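The split from one vector of `{ type, value }` elements into a type vector and a value vector can be pictured as follows. This is a conceptual sketch and not the builder's implementation; `sketch_union_ref_t` and `sketch_split_union_vec` are hypothetical names, and a plain `uint32_t` stands in for the builder's reference type:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a union element held during construction. */
typedef struct {
    uint8_t type;   /* Union type code, 0 meaning NONE. */
    uint32_t value; /* Placeholder for a builder reference. */
} sketch_union_ref_t;

/*
 * Splits n combined elements into a densely packed type vector and a
 * separate value vector, as done before final storage.
 */
static void sketch_split_union_vec(const sketch_union_ref_t *src, size_t n,
                                   uint8_t *types, uint32_t *values)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        types[i] = src[i].type;
        values[i] = src[i].value;
    }
}
```

Because of struct padding, the combined array is typically larger than the two split vectors together, which is the extra temporary space mentioned above.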
### Unions of Strings and Structs
_Note: as of v0.5.0 unions can also contain strings and structs in
addition to tables. Support for these types in other languages may vary,
but C++ does support them too._
All union values are stored by reference. Structs that are not unions
are stored inline in tables and cannot be shared but unions of struct
type are stored by reference and can be shared. A union value is
therefore always a reference. This is mostly transparent because the
generated table field methods have `create/start/end` calls for each union
value type in addition to `add`.
To illustrate the use of these variations we use the Movie table from
[monster_test.fbs]:
namespace Fantasy;
table Attacker {
sword_attack_damage: int;
}
struct Rapunzel {
hair_length: uint16;
}
struct BookReader {
books_read: int;
}
union Character {
MuLan: Attacker = 2, // Can have name be different from type.
Rapunzel = 8, // Or just both the same, as before.
Belle: Fantasy.BookReader,
BookFan: BookReader,
Other: string,
Unused: string = 255
}
table Movie {
main_character: Character;
antagonist: Character;
side_kick: Character;
cameo: Character;
characters: [Character];
}
and the mixed type test case from [monster_test.c]:
nsf(Character_union_ref_t) ut;
nsf(Rapunzel_ref_t) cameo_ref;
nsf(Attacker_ref_t) attacker_ref;
nsf(BookReader_ref_t) br_ref;
nsf(BookReader_t *) pbr;
nsf(Movie_table_t) mov;
nsf(Movie_start_as_root(B));
br_ref = nsf(BookReader_create(B, 10));
cameo_ref = nsf(Rapunzel_create(B, 22));
ut = nsf(Character_as_Rapunzel(cameo_ref));
nsf(Movie_main_character_Rapunzel_create(B, 19));
nsf(Movie_cameo_Rapunzel_add(B, cameo_ref));
attacker_ref = nsf(Attacker_create(B, 42));
nsf(Movie_antagonist_MuLan_add(B, attacker_ref));
nsf(Movie_side_kick_Other_create_str(B, "Nemo"));
nsf(Movie_characters_start(B));
nsf(Movie_characters_push(B, ut));
nsf(Movie_characters_MuLan_push(B, attacker_ref));
nsf(Movie_characters_MuLan_push_create(B, 1));
nsf(Character_vec_push(B, nsf(Character_as_Other(nsc(string_create_str(B, "other"))))));
nsf(Movie_characters_Belle_push(B, br_ref));
pbr = nsf(Movie_characters_Belle_push_start(B));
pbr->books_read = 3;
nsf(Movie_characters_Belle_push_end(B));
nsf(Movie_characters_Belle_push(B, nsf(BookReader_create(B, 1))));
nsf(Movie_characters_Belle_push_create(B, 2));
nsf(Movie_characters_Other_push(B, nsc(string_create_str(B, "another"))));
nsf(Movie_characters_Other_push_create_str(B, "yet another"));
nsf(Movie_characters_end(B));
nsf(Movie_end_as_root(B));
Note that reading a union of string type requires a cast which can be
seen in the full test case in [monster_test.c].
## Error Handling
The API generally expects all error codes to be checked but the
following table and vector operations will accept and return an error:
- `add` null reference to table, vector, or string.
- `push` null reference to table or string.
- `buffer_end/create` null reference to root.
This can simplify pushing or adding atomically created objects, for
example by adding a cloned vector to table field.
It is especially important to check start operations because the builder
will not be in the expected stack frame context after failure and will
not have reserved necessary internal memory, for example when adding a
table field.
On a server with reasonable amount of memory using the default
allocator, and with an emitter that will not return errors, and when it
can be expected that inputs will not exceed the size constraints of the
flatbuffer data types, and if the api is being used correctly, then there
is no reason for failure and error handling may be skipped. However,
it is sometimes desirable for servers to restrict a single client's
memory usage, and then errors are very likely unless the source data is
already limited. As an opposite example, an embedded device sending
small network packages using a fixed but large enough allocation pool,
would be in total control and need not be concerned with any errors.
## Type System Overview
The generated methods for building buffers may look the same but
have different semantics. For example `_clone` on a table field
such as `Monster_enemy_clone` will actually create a table based
on the content of a table in another buffer, then add that
table to the currently open table. But `Monster_clone` will
create a clone and just return a reference without adding the
reference to any table. There is also `push_clone` which adds
an element to an open vector. The same applies to many other
operations.
Basically there are the following different types of methods:
- Methods on native flatbuffer types, such as
  `flatbuffers_string_start`.
- Methods on generated types such as `Monster_start`.
- Methods on field members such as `Monster_enemy_start`.
- Methods on vectors of the above such as
  `flatbuffers_string_vec_start`, `Monster_vec_start`,
  `Monster_inventory_vec_start`.
- Slight adaptations for buffer roots and nested buffer roots.
For unions and union vectors the story is more complex - and the
api might need to be cleaned up further, but generally there are
both union type fields, union value fields, and union fields
representing both, and vectors of the same. In addition there
are pseudo fields for each union member because `create` on a
union does not make sense, but
`Monster_myvariant_MyTable_create` does create a `MyTable`
table and assigns it with the correct type to the fields
`Monster_myvariant_type` and `Monster_myvariant`.
## Cloning
As of flatcc v0.5.2 it is also possible to clone tables, unions,
vectors of tables, vectors of strings, and vectors of unions.
Previously many operations did have a clone or a `push_clone`
operator, but these were all raw byte copies. Table cloning and
union cloning are significantly more involved as a simple copy
will not work due to stored references, possible sharing of
references, and because the required alignment of a table is hard
to reason about without building a new table. Unions and union
vectors are even more difficult.
That said, cloning is now implemented for all relevant data
types.
All clone operations expect the content to originate from
another finalized buffer. For scalars and structs there are
copy operations that are almost the same as clone - they both
avoid endian conversion.
Structs have a special case with clone and copy: Whenever a
struct is stored inline in the destination buffer, it behaves
like copy. Whenever the destination is a buffer root, or a union
member, the result is a reference to an independent memory
block. When calling clone on a struct type the destination is
unknown and an independent reference is created. If this is not
the intention a `copy` operation can be used. When using field
methods the destination type is known and the right thing will
happen.
Cloning a table will, by default, expand any shared references
in the source into separate copies. This is also true when
cloning string vectors, or any other data that holds references.
Worst case this can blow up memory (which is also true when
printing JSON from a buffer).
It is possible to preserve the exact DAG structure when cloning.
It may not be worthwhile for simple use cases but it goes as
follows:
The builder has a pointer to a `flatcc_refmap_t` object. This is
a fairly small stack allocated object that implements a
hashtable. By default this pointer is null, and we have the
above mentioned expansion. If it is not null, each newly cloned
object will have its reference stored in the refmap. The next
time the same object is cloned, the existing reference will be
taken from the refmap instead. See source comments in
`flatcc_refmap.h` and `flatcc_builder.h`, and `monster_test.c`
clone tests.
Note that, for example, it might be relevant to preserve DAG
structure when cloning one object with all its sub-objects, but
if it is cloned a second time, a new copy is still desired while
preserving the inner DAG structure. This can be done by working
with multiple refmaps and simply swapping them out via
`flatcc_builder_set_refmap`. It is also possible to add
references manually to a refmap before cloning.
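The refmap idea described above can be pictured with a toy model. This is a sketch only, not the implementation in `flatcc_refmap.h`: the `toy_*` names, the fixed 16-slot table, and the hash function are all assumptions made for illustration, and the toy swaps maps by assigning a struct field rather than calling `flatcc_builder_set_refmap`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_REFMAP_SLOTS 16 /* power of two; the toy never grows */

typedef int toy_ref_t; /* 0 means "no reference" */

typedef struct {
    const void *src[TOY_REFMAP_SLOTS]; /* source object, NULL if slot is empty */
    toy_ref_t ref[TOY_REFMAP_SLOTS];   /* reference of the clone made from src */
    size_t count;
} toy_refmap_t;

typedef struct {
    toy_refmap_t *refmap; /* NULL by default: shared objects are expanded */
    toy_ref_t next_ref;
    int copies_made;
} toy_builder_t;

void toy_refmap_init(toy_refmap_t *m)
{
    size_t i;
    for (i = 0; i < TOY_REFMAP_SLOTS; ++i) {
        m->src[i] = NULL;
        m->ref[i] = 0;
    }
    m->count = 0;
}

static size_t toy_hash(const void *p)
{
    return (size_t)(((uintptr_t)p >> 3) * 2654435761u) & (TOY_REFMAP_SLOTS - 1);
}

/* Linear probing lookup; returns 0 when src has not been cloned yet. */
toy_ref_t toy_refmap_find(const toy_refmap_t *m, const void *src)
{
    size_t i = toy_hash(src), n;
    for (n = 0; n < TOY_REFMAP_SLOTS; ++n) {
        if (m->src[i] == NULL) return 0;
        if (m->src[i] == src) return m->ref[i];
        i = (i + 1) & (TOY_REFMAP_SLOTS - 1);
    }
    return 0;
}

/* Returns -1 when the toy table is full; one slot is always kept empty. */
int toy_refmap_insert(toy_refmap_t *m, const void *src, toy_ref_t ref)
{
    size_t i = toy_hash(src);
    if (m->count + 1 >= TOY_REFMAP_SLOTS) return -1;
    while (m->src[i] != NULL && m->src[i] != src) {
        i = (i + 1) & (TOY_REFMAP_SLOTS - 1);
    }
    if (m->src[i] == NULL) m->count++;
    m->src[i] = src;
    m->ref[i] = ref;
    return 0;
}

/* Clone: reuse a known reference if a refmap is set, otherwise copy. */
toy_ref_t toy_clone(toy_builder_t *B, const void *src)
{
    toy_ref_t ref;
    assert(B != NULL && src != NULL);
    if (B->refmap && (ref = toy_refmap_find(B->refmap, src)) != 0) {
        return ref;
    }
    ref = ++B->next_ref;
    B->copies_made++;
    if (B->refmap) {
        (void)toy_refmap_insert(B->refmap, src, ref); /* a full map only loses sharing */
    }
    return ref;
}
```

With `refmap` left null, cloning the same source twice yields two copies. With a map installed the second clone returns the first reference, and installing a fresh map produces one new copy that is again shared within that map.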
Warning: the refmap MUST not hold any foreign references when
starting a nested root clone or when cloning inside a nested
buffer that has been started but not ended because it is
invalid to share references between buffers and there are no
safety checks for this.
## Picking
Picking is a method that is related to clone and also introduced
with flatcc 0.5.2. A pick method is only defined on a table
field or a struct field. Instead of taking a read reference
of the same type as the field, it takes a reference to the same
container type (table or struct). Essentially pick means: find
myself in the other table, clone me, and add me to the new table
which is currently open. So clone takes an entire table where
pick takes a single field. Table cloning is implemented as a
sequence of pick methods, one for each field, as can be seen in
the generated builder source. A pick operation does nothing if
the field is not set. Pick also works with refmaps because it
does an internal clone operation. In the generated code, only
clone on types will use the refmap but other clone and pick
operations do depend on these type clone methods.
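The relation between pick and clone can be sketched with a toy table. The struct below is purely hypothetical: real flatbuffer tables are not C structs with presence flags, and `toy_monster_t` and its functions are invented for illustration. It only shows that a pick is a no-op for an absent field and that a table clone amounts to one pick per field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table with two optional fields; has_* marks a field as set. */
typedef struct {
    int has_hp;
    int hp;
    int has_mana;
    int mana;
} toy_monster_t;

/* Pick: find the field in the source table and, if set, add it to dst. */
void toy_monster_hp_pick(toy_monster_t *dst, const toy_monster_t *src)
{
    assert(dst != NULL && src != NULL);
    if (!src->has_hp) {
        return; /* field not set in source: do nothing */
    }
    dst->has_hp = 1;
    dst->hp = src->hp;
}

void toy_monster_mana_pick(toy_monster_t *dst, const toy_monster_t *src)
{
    assert(dst != NULL && src != NULL);
    if (!src->has_mana) {
        return;
    }
    dst->has_mana = 1;
    dst->mana = src->mana;
}

/* Clone: start from an empty table and pick every field in turn. */
void toy_monster_clone(toy_monster_t *dst, const toy_monster_t *src)
{
    toy_monster_t empty = { 0, 0, 0, 0 };
    assert(dst != NULL && src != NULL);
    *dst = empty;
    toy_monster_hp_pick(dst, src);
    toy_monster_mana_pick(dst, src);
}
```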
## Sorting Vectors
Vectors can be sorted, but not by the primary builder interface because:
String and table elements cannot be accessed after they have been
emitted. The emitter can do all sorts of async operations other than
actually building a buffer, for example encrypting blocks and / or
sending partial buffers over the network. Scalars could be sorted, but
the most efficient way of emitting vectors does not create a temporary
vector but emits the source directly when endianness allows for it. Less
significantly, the buffer producer is likely busy processing content
and / or running on a resource constrained device. Altogether, it is
much simpler to not support sorting at this interface level.
To understand how sorting is implemented, let's first look at how an
already sorted vector can be searched:
Every vector of string, scalar and enum element types has a `find`
operation in the reader interface that performs a binary search. Every
vector of table and struct elements has a `find_by_<field_name>` iff
there is a key attribute on at least one top-level scalar, enum or
string field type. FlatBuffers do not officially allow for multiple key
attributes, but if enabled, there will be a `find_by` operation for
every keyed element field. In addition there is a `find` operation that
maps to the first keyed field.
The read interface returns a vector type, which is a const pointer, when
accessing a table field of vector type. The find operation takes such a
vector as first argument, and a key as second. Strings have variations
to allow for keys with a given length (similar to strcmp vs strncmp).
This leads us to the sort interface:
Every `find` and `find_by` operation has a matching `sort` and `sort_by`
operation. Table and struct vectors map `sort` to the first keyed
`sort_by` operation. The sort operation takes a non-const vector which
has the type name suffix `_mutable_vec_t`. These
vectors are not available via the reader interface and must be cast
explicitly from `_vec_t` to `_mutable_vec_t`. When this is done, the
vector can be sorted in-place in the buffer without any memory
allocation and without any recursion.
If the namespace is `flatbuffers`, a string vector is sorted by:

    flatbuffers_string_vec_t vec;
    vec = ...;
    flatbuffers_string_vec_sort((flatbuffers_string_mutable_vec_t)vec);

Scalar and enum vectors have similar inline sort operations, for
example:

    flatbuffers_uint8_vec_sort(flatbuffers_uint8_mutable_vec_t vec);

For vectors of tables or structs the sort function is named by the key
field. Assuming the Monster table has a key attribute on the `Hp` field,
the following sort operation is available:

    MyGame_Example_Monster_vec_t monsters;
    monsters = ...;
    MyGame_Example_Monster_vec_sort_by_Hp(
            (MyGame_Example_Monster_mutable_vec_t)monsters);

Note: this is the reader interface. Any kind of `ref_t` type used by the
builder does not apply here. (Advanced: if an emitter builds a buffer, the
ref type can be used to find the actual vector pointer and then it can
be sorted by casting the pointer to a vector, even if the buffer isn't
finished).
Multiple keys per table or struct is an optional feature. Each key will
have its own sort and find function similar to the above. The first key
also has the shortcut:

    MyGame_Example_Monster_vec_sort(m);

The current implementation uses heap sort which is nearly as fast as
quicksort and has a compact implementation that does not require
recursion or external memory and is robust against DoS attacks by having
worst case O(n log n). It is, however, not a stable sort. The sort
assumes structs have a reasonable size so swap operations can be done
efficiently. For large structs a dedicated sort operation building an
external index vector would be better, but this is not supported.
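To make the properties listed above concrete, here is a minimal textbook heap sort over a plain `int32_t` array. It is a sketch of the general technique, not the code flatcc generates, and `toy_heap_sort` and `toy_sift_down` are hypothetical names. It sorts in place, uses a loop instead of recursion, and allocates nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Restore the max-heap property for the subtree rooted at `root`,
 * looking only at v[0..end). Iterative, so there is no recursion. */
static void toy_sift_down(int32_t *v, size_t root, size_t end)
{
    for (;;) {
        size_t child = 2 * root + 1;
        int32_t tmp;
        if (child >= end) {
            return;
        }
        if (child + 1 < end && v[child] < v[child + 1]) {
            child++; /* pick the larger of the two children */
        }
        if (v[root] >= v[child]) {
            return;
        }
        tmp = v[root];
        v[root] = v[child];
        v[child] = tmp;
        root = child;
    }
}

/* Sort v[0..n) ascending, in place, in O(n log n) worst case. */
void toy_heap_sort(int32_t *v, size_t n)
{
    size_t i;
    int32_t tmp;
    assert(v != NULL || n == 0);
    if (n < 2) {
        return;
    }
    /* Build a max-heap bottom up, starting at the last parent node. */
    for (i = n / 2; i-- > 0; ) {
        toy_sift_down(v, i, n);
    }
    /* Move the current maximum to the end and shrink the heap. */
    for (i = n - 1; i > 0; --i) {
        tmp = v[0];
        v[0] = v[i];
        v[i] = tmp;
        toy_sift_down(v, 0, i);
    }
}
```

Equal keys may change relative order during the swaps, which is why heap sort is not stable.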
Note that a DAG is valid so there can be multiple vectors referring to
the same table elements, and each can be sorted by a different key.
The find operations are stable, meaning they always return the lowest
index of any matching key, or `flatbuffers_not_found`, which is larger
than any other index.
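A binary search with this lowest-index guarantee can be sketched as follows. The function works on a plain sorted `int32_t` array and is not the generated `find` code; `toy_find` and `TOY_NOT_FOUND` are hypothetical names, and the sentinel value is an assumption chosen to be larger than any valid index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NOT_FOUND ((size_t)-1) /* larger than any valid index */

/* Return the lowest index i with v[i] == key in the sorted array v[0..n),
 * or TOY_NOT_FOUND if the key is absent. */
size_t toy_find(const int32_t *v, size_t n, int32_t key)
{
    size_t lo = 0, hi = n;
    assert(v != NULL || n == 0);
    /* Invariant: elements before lo are < key, elements from hi on are >= key. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (v[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return (lo < n && v[lo] == key) ? lo : TOY_NOT_FOUND;
}
```

The search does not stop at the first match it meets. It keeps narrowing until `lo` is the first position not less than the key, which is what makes the result the lowest matching index.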
### Dangers of Sorting
If a buffer was received over, say, an untrusted network, the buffer
should be verified before being accessed. But verification only makes it
safe to read a buffer, not to modify a buffer, because, for example, two
vectors can be crafted to overlap each other without breaking any
verification rules.
Thus, sorting is intended to be done shortly after the buffer is
constructed while it can still be trusted.
Using find on a buffer that is supposed to be sorted, but isn't, can
yield unexpected search results, but the result will always be an
element in the vector being searched, not a buffer overrun.
### Scanning
Some vectors can be sorted by different keys depending on which
version of `_sort_by` is being used. Obviously `_find_by` must match the
sorted key.
If we need to search for a key that is not sorted, or if we simply do
not want to sort the vector, it is possible to use scanning operations
instead by using `_scan` or `_scan_by`. Scanning is similar to find
except that it does a linear search and it supports scanning from a
given position.
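A linear scan that can resume from a given position might look like the sketch below. It operates on a plain `int32_t` array rather than a flatbuffer vector, and `toy_scan_from`, `toy_count_matches` and `TOY_NOT_FOUND` are hypothetical names, not the generated `_scan` functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NOT_FOUND ((size_t)-1) /* larger than any valid index */

/* Return the first index i >= begin with v[i] == key, or TOY_NOT_FOUND.
 * The array does not need to be sorted. */
size_t toy_scan_from(const int32_t *v, size_t n, size_t begin, int32_t key)
{
    size_t i;
    assert(v != NULL || n == 0);
    for (i = begin; i < n; ++i) {
        if (v[i] == key) {
            return i;
        }
    }
    return TOY_NOT_FOUND;
}

/* Visit every match by resuming the scan just past the previous hit. */
size_t toy_count_matches(const int32_t *v, size_t n, int32_t key)
{
    size_t count = 0;
    size_t i = toy_scan_from(v, n, 0, key);
    while (i != TOY_NOT_FOUND) {
        count++;
        i = toy_scan_from(v, n, i + 1, key);
    }
    return count;
}
```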
More information on scanning is available in the
[README](https://github.com/dvidelabs/flatcc#searching-and-sorting)
file, and in the [monster_test.c] test file.
## Example of different interface type users
A resource constrained microcontroller is building flatbuffers from
sensor data using an emitter that sends UDP packages of the flatbuffer
as soon as enough data is ready. A server reassembles the packages or
discards them if any UDP package was lost. Once the package is assembled,
the server sorts specific vectors such as temperature levels in the buffer
before it sends the buffer upstream to a storage service through a
TCP/IP connection. The analyzers perform tasks such as detecting
abnormal temperature readings based on the sorted vector data.
In the above example, the original sensor devices are not interested in
the reader interface nor the sort interface. While the sort and find
operations may be available, they are dead inline code that does not
inflate the binary code image size, but the `libflatccrt` library is
linked in. The collecting server is not interested in the builder
interface and does not link with the `libflatccrt` library but uses
both the inline functions of the reader interface and the sort interface.
The upstream data storage service uses no interface at all since it
treats the buffers as binary blobs in a database indexed by device and
time. The end users only use the read only interface to visualize and
analyze and have no need for the builder or the sort interface.
## Special Emitters
An emitter only needs to implement one function to replace or wrap the
default emitter. See [flatcc_builder.h] on `flatcc_builder_emit_fun` for
details, and also `emit_test.c` for a very simple custom emitter that
just prints debug messages, and [flatcc_emitter.h].
When adding padding `flatcc_builder_padding_base` is used as base in iov
entries and an emitter may detect this pointer and assume the entire
content is just nulls. Usually padding is of limited size by its very
nature so the benefit of handling this is also limited, but it, or a
similar user-provided constant, can be used for similar purposes:
When creating a vector in a single operation from an external C-array,
no copying takes place on the internal builder stack. Therefore it is
valid to provide a null pointer or a valid array such as
`flatcc_builder_padding_base` that is too small for the given length,
provided that the emitter is aware of it. This in turn can be used to
allocate space in the emitter's internal data structure so the vector can
be filled after the fact if so desired. Pointer tagging may be another
way to communicate special intent. Be aware that only `create` calls
support this - any `append`, `start/end` or other dynamic operation will
require valid input and will stack allocate temporary space.
Emitters always receive a small table of iov entries that together form
a single object including necessary headers and padding, for example a
vector, a string, a nested buffer header, or a vtable. This is
guaranteed by the api, but there is no coordination to provide details
about which call is being made, in order to keep the interface simple
and fast. If this is desired the user must hint the emitter out of band
before calling the relevant build operation. This can also be done
indirectly by setting `user_state` in the emitter and having the emitter
inspect this setting.
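The shape of such an emitter can be sketched with local stand-in types. Everything named `toy_*` below is hypothetical and exists only so the example compiles on its own. The real callback type and iov structure are defined in [flatcc_builder.h], and a real emitter must also honor the buffer offset passed to it, which this toy ignores by simply appending objects in call order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for an iov entry, not the flatcc definition. */
typedef struct {
    const void *iov_base;
    size_t iov_len;
} toy_iovec_t;

/* Stand-in for a shared block of zeros used as the base of padding entries. */
static const unsigned char toy_padding_base[16] = { 0 };

typedef struct {
    unsigned char data[256];
    size_t used;
    size_t objects;       /* one emit call corresponds to one complete object */
    size_t padding_bytes; /* bytes recognized as padding by pointer identity */
} toy_emitter_t;

/* Append one object given as iov_count entries totalling len bytes.
 * Returns 0 on success and -1 on a length mismatch or lack of space. */
int toy_emit(void *emit_context, const toy_iovec_t *iov, int iov_count, size_t len)
{
    toy_emitter_t *E = emit_context;
    size_t total = 0;
    int i;

    assert(E != NULL && (iov != NULL || iov_count == 0));
    for (i = 0; i < iov_count; ++i) {
        total += iov[i].iov_len;
    }
    if (total != len || len > sizeof(E->data) - E->used) {
        return -1; /* nothing has been written at this point */
    }
    for (i = 0; i < iov_count; ++i) {
        if (iov[i].iov_base == toy_padding_base) {
            /* Known to be all zeros: no need to read the source. */
            memset(E->data + E->used, 0, iov[i].iov_len);
            E->padding_bytes += iov[i].iov_len;
        } else {
            memcpy(E->data + E->used, iov[i].iov_base, iov[i].iov_len);
        }
        E->used += iov[i].iov_len;
    }
    E->objects++;
    return 0;
}
```

Because the padding entry is detected by pointer identity, its length may exceed the size of the zero block without the emitter ever reading past it.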
When adding vectors piecemeal using `append` or similar, as opposed to
the zero or less than zero copy approach above, the memory cost is
obviously higher, but unless the individual objects grow large, the
stack will operate in hot cpu cache so the bandwidth from main memory to
cpu and back will not necessarily double. If the stack grows large it
may also be worthwhile trimming the stack with a custom allocator and
custom builder reset between buffers to reduce stack size and
initialization overhead.
[monster_test.c]: https://github.com/dvidelabs/flatcc/blob/master/test/monster_test/monster_test.c
[flatcc_builder.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_builder.h
[flatcc_emitter.h]: https://github.com/dvidelabs/flatcc/blob/master/include/flatcc/flatcc_emitter.h
[monster_test.fbs]: https://github.com/dvidelabs/flatcc/blob/master/test/monster_test/monster_test.fbs