diff options
Diffstat (limited to 'doc')
-rw-r--r-- | doc/.gitignore | 7 | ||||
-rw-r--r-- | doc/ChangeLog.txt | 267 | ||||
-rw-r--r-- | doc/Makefile | 18 | ||||
-rw-r--r-- | doc/banner.png | bin | 0 -> 20477 bytes | |||
-rw-r--r-- | doc/banner.svg | 451 | ||||
-rw-r--r-- | doc/google315d692c9c632ed0.html | 1 | ||||
-rw-r--r-- | doc/index.html | 122 | ||||
-rw-r--r-- | doc/license.html | 55 | ||||
-rw-r--r-- | doc/rss.png | bin | 0 -> 689 bytes | |||
-rw-r--r-- | doc/styles.css | 141 | ||||
-rw-r--r-- | doc/userguide.txt | 1901 | ||||
-rw-r--r-- | doc/utarray.txt | 383 | ||||
-rw-r--r-- | doc/uthash-mini.png | bin | 0 -> 3611 bytes | |||
-rw-r--r-- | doc/uthash-mini.svg | 288 | ||||
-rw-r--r-- | doc/uthash.png | bin | 0 -> 21518 bytes | |||
-rw-r--r-- | doc/utlist.txt | 293 | ||||
-rw-r--r-- | doc/utringbuffer.txt | 374 | ||||
-rw-r--r-- | doc/utstack.txt | 158 | ||||
-rw-r--r-- | doc/utstring.txt | 239 |
19 files changed, 4698 insertions, 0 deletions
diff --git a/doc/.gitignore b/doc/.gitignore new file mode 100644 index 000000000..9a2594e2a --- /dev/null +++ b/doc/.gitignore @@ -0,0 +1,7 @@ +ChangeLog.html +userguide.html +utarray.html +utlist.html +utringbuffer.html +utstack.html +utstring.html diff --git a/doc/ChangeLog.txt b/doc/ChangeLog.txt new file mode 100644 index 000000000..281de80fe --- /dev/null +++ b/doc/ChangeLog.txt @@ -0,0 +1,267 @@ +uthash ChangeLog +================ + +Click to return to the link:index.html[uthash home page]. + +NOTE: This ChangeLog may be incomplete and/or incorrect. See the git commit log. + +Version 2.1.0 (2018-12-20) +-------------------------- +* silence some Clang static analysis warnings +* add LL_INSERT_INORDER and LL_LOWER_BOUND etc (thanks, Jeffrey Lovitz and Mattias Eriksson!) +* add uthash_bzero for platforms without <string.h> +* fix a missing HASH_BLOOM_ADD in HASH_SELECT (thanks, Pawel Veselov!) +* permit malloc failure to be recoverable via HASH_NONFATAL_OOM (thanks, Pawel Veselov!) +* avoid repeated calls to uthash_strlen in HASH_FIND_STR +* rename uthash_memcmp to HASH_KEYCMP, leaving the old name for compatibility +* add utstack.h +* remove libut + +Version 2.0.2 (2017-03-02) +-------------------------- +* fix segfault in HASH_ADD_INORDER etc (thanks, Yana Kireyonok!) +* remove spurious cast to unsigned in utstring_len (thanks, Michal Sestrienka!) +* add uthash_memcmp and uthash_strlen for platforms without <stdlib.h> (thanks, Pawel Veselov!) +* fix a C++ incompatibility in utringbuffer + +Version 2.0.1 (2016-07-05) +-------------------------- +* in-order insertion macros HASH_ADD_INORDER etc (thanks, Thilo Schulz!) +* by-hashvalue insertion macros HASH_ADD_BYHASHVALUE etc +* during key comparison, check hashvalue before doing a full memcmp +* add utringbuffer.h + +Version 1.9.9.1 (2014-11-18) +---------------------------- +* inclusion of experimental libut bundle with utvector in opt/ +* use shift in Bernstein hash instead of multiply (thanks, Jimmy Zhuo!) +* switch ssize_t types in utarray/utstring to size_t (thanks, Hong Xu!) +* fix utstring pointer math on >4GB strings (thanks, Thomas Bottesch!) +* change FNV hash to FNV-1a varation (thanks, dwest1975!) +* fix bug in HASH_REPLACE_STR (thanks, Ilya Kaliman!) + +Version 1.9.9 (2014-02-25) +-------------------------- +* made HASH_ADD_STR compatible with char* or char[] (thanks, Samuel Thibault!) +* fixed header inclusion of sys/types.h for ssize_t (thanks, Fernando Campos!) +* added LL_COUNT/DL_COUNT/CDL_COUNT (thansk, Paul Praet!) +* added LRU cache example in `tests/lru_cache` (thanks, Oliver Lorenz!) +* fix LL_DELETE2 for VS2008 (thanks, Greg Davydouski!) +* fix missing argument in `HASH_REPLACE_STR` (thanks, Alex!) +* bump version number in source files to match docs (thanks, John Crow!) +* add `HASH_OVERHEAD` macro to get overhead size for hash table + +Version 1.9.8 (2013-03-10) +-------------------------- +* `HASH_REPLACE` now in uthash (thanks, Nick Vatamaniuc!) +* fixed clang warnings (thanks wynnw!) +* fixed `utarray_insert` when inserting past array end (thanks Rob Willett!) +* you can now find http://troydhanson.github.com/uthash/[uthash on GitHub] +* there's a https://groups.google.com/d/forum/uthash[uthash Google Group] +* uthash has been downloaded 29,000+ times since 2006 on SourceForge + +Version 1.9.7 (2012-10-09) +-------------------------- +* utstring now supports substring search using `utstring_find` (thanks, Joe Wei!) +* utlist now supports element 'prepend' and 'replace' (thanks, Zoltán Lajos Kis!) +* utlist element prev/next fields can now have any names (thanks, Pawel S. Veselov!) +* uthash cast quiets a clang warning (thanks, Roman Divacky and Baptiste Daroussin!) +* uthash userguide example shows how to check key uniqueness (thanks, Richard Cook!) +* uthash HASH_MUR compiles under MSVC++ 10 in C mode (thanks, Arun Kirthi Cherian!) +* `utstring_printf` now supports format checking (thanks, Donald Carr!) + +Version 1.9.6 (2012-04-28) +-------------------------- +* add utarray_prev (thanks, Ben Hiett!) +* add parens/casts for greater compatibility (thanks, Atis, Debasis Ganguly, and Steve McClellan!) +* added ifndef to uthash_malloc and related hooks (thanks, Holger Machens!) +* edit examples so they do not leak memory (thanks, 任晶磊!) + +Version 1.9.5 (2011-11-16) +-------------------------- +* added `utarray_renew` +* fixed memory leak in `uthash_clear` when using Bloom filter (thanks, Jan Hättig!) +* utarray now copies the UT_icd on array creation rather than storing a pointer +* add parentheses to `HASH_ADD` to fix preprocessing of certain arguments (thanks, Aaron Rosen!) +* more parenthesizations for greater macro argument flexibility + +Version 1.9.4 (2011-06-05) +-------------------------- +* uthash now supports MurmurHash v3 +* utlist now includes concatenation macros (`LL_CONCAT` and `DL_CONCAT`) +* utarray now supports binary search (`utarray_find`) +* utstring now supports a new-or-clear-existing macro (`utstring_renew`) +* documented technique for a multi-level hash table +* clarified scope requirements for `UT_icd` in the utarray documentation +* fixed termination when `utstring_clear` is followed by `utstring_body` +* fixed utarray_inserta macro when used with complex arguments +* on Visual Studio define missing type `uint8_t` +* Debian/Ubuntu include uthash in the package `uthash-dev`. +* uthash has been downloaded 16,211 times. + +Thanks to Yu Feng, Richard Cook, Dino Ciuffetti, Chris Groer, and Arun Cherian +for feedback and fixes in this release! + +Version 1.9.3 (2010-10-31) +-------------------------- +* fix an `ifdef` for compatibility with Intel compiler (thanks, degski!) +* fix `HASH_ITER` macro to satisfy C++ casting rules (thanks, Erik Bai!) + +Version 1.9.2 (2010-10-04) +-------------------------- +* new `HASH_ITER` macro for more convenient deletion-safe iteration +* `hashscan` can now run on FreeBSD 8.1 and later (thanks, Markus Gebert!) +* More parens to evaluate complex macro arguments properly (thanks, ngg!) +* Add sz parameter to the `uthash_free` hook for platforms that do their own memory management. Hopefully this minor API change doesn't cause too much breakage for people. (thanks, Niall Douglas!) +* uthash has been downloaded 12,294 times + +Version 1.9.1 (2010-05-15) +-------------------------- +* Fix a redefinition warning when using `uthash.h` and `utstring.h` together +* Fix a bug in `utstring_init` +* Added `HASH_FIND_PTR` and `HASH_ADD_PTR` (thanks, Niall Douglas!) + +Version 1.9 (2010-03-31) +-------------------------- +* uthash now supports Visual Studio 2008 and 2010 in C or C++ code! +* new headers link:utarray.html[utarray.h] and link:utstring.html[utstring.h] + are now included. These implement dynamic arrays and strings using macros +* link:utlist.html[utlist.h] now has deletion-safe iterators and search macros +* the test suite now runs under Visual Studio (thanks again degski!) +* special thanks for suggesting utarray and utlist features to Charalampos P.! +* uthash has been downloaded 9,616 times + +Version 1.8 (2009-09-08) +-------------------------- +* Added the `hashscan` utility that can report on the size and quality of + hash tables in a running process (Linux-only) +* Added Bloom filter support. This has the potential to speed up certain + types of programs that look up non-existant keys in sufficient numbers. +* Restored the MurmurHash, which can once again be used, if an additional + symbol is defined. This is a "safety" by which the user declares they + understand that `-fno-strict-aliasing` flag must be used if they are + using MurmurHash under gcc with optimization. +* Unified the bucket/table malloc hooks; now there is only one malloc hook +* Re-organized the manual into a main section and advanced topics section +* Fixed a bug in `utlist.h` where sorting a singly-linked list threw a + compile-time error. +* Fixed a bug in `utlist.h` where a doubly-linked list that is sorted + did not maintain the special `head->prev` pointer back to the list tail. + +Version 1.7 (2009-06-11) +-------------------------- +* The MurmurHash has been removed, and Jenkin's hash is once again the default. + While MurmurHash performed well, it's unsafe with regard to the strict + aliasing rule. This results in incorrect code when compiled with optimization. + It's not possible to enable `-fno-strict-aliasing` from within a header file. +* The linked list macros in `utlist.h` now comply with the strict-aliasing + rule so they generate correct code under high optimization levels (O2 or O3). + The use of the `__typeof__` extension, which was originally a GNU extension, + may reduce portability to other compilers that do not support this extension. + This extension is used in the singly-linked list macros and the sort macros. + +Version 1.6 (2009-05-08) +-------------------------- +Special thanks to Alfred Heisner for contributing several enhancements: + +* Support for two new hash functions: + - the Paul Hsieh hash function (`HASH_SFH`) + - Austin Appleby's MurmurHash function (`HASH_MUR`) +* Because of its excellent performance, MurmurHash is now the default hash function. +* `keystats` now has much better elapsed time accuracy under Cygwin and MinGW +* fixed casting in `HASH_FNV`, `HASH_SAX` and `HASH_OAT` for non-char keys + +This release also includes: + +* a new `HASH_CLEAR` operation clears a hash table in one step. +* a new `HASH_SELECT` operation inserts those elements from one hash that + satisfy a given condition into another hash. The selected items have + dual presence in both hash tables. For example a game could select the + visible polygons from a hash of all polygons. +* fixed a compile-time error which occurred if the final argument to + `HASH_ADD_KEYPTR` was a pointer to an array member like `&a[i]` +* added another test script `tests/all_funcs` which executes the test suite + using every supported hash function + +And lastly, + +* a new, separate header called link:utlist.html[utlist.h] is included which + provides 'linked list macros' for C structures, similar in style to the + uthash macros + +Version 1.5 (2009-02-19) +-------------------------- +* now thread-safe for concurrent readers +* use scratch variables on stack rather than in table (thanks, Petter Arvidsson!). + This change made HASH_FIND about 13% faster and enabled reader concurrency. +* made link:license.html[BSD license] terms even more permissive +* added link:userguide.pdf[PDF version] of User Guide +* added http://troydhanson.wordpress.com/feed/[update news] image:rss.png[(RSS)] + +Version 1.4 (2008-09-23) +-------------------------- +* Add `HASH_COUNT` for counting items in the hash +* Compatibility with C\+\+. Satisfy additional casting requirements. + Also in the `tests/` directory, running `make cplusplus` now compiles + all the test programs with the C++ compiler. +* Eliminate `elmt` pointer from the UT_hash_handle. Calculate elmt + from hash handle address by subtracting `hho` (hash handle offset). +* Contributed by L.S.Chin: + Cast `void*` to char* before pointer arithmetic to suppress compiler + warnings. We assume compilers abide to C standards which impose + requirement that `sizeof(void*) == sizeof(char*)`. +* Return meaningful exit status from do_tests per Tiago Cunha, + so that package manager-based install can verify tests are successful + + +Version 1.3 (2008-07-27) +------------------------ +* use integer-only math-- no floating point! Support FPU-less CPU's. +* eliminate `hash_q` metric, which measured the fraction of items with + non-ideal chain positions. We only need to know if this fraction + is below 0.5. This is now determined using fast bitwise tests. +* when an item is added to the hash, calculate the key's hash value + upfront and store it, instead of recomputing it as needed. This hashv + is stored in the hash handle. Potentially major speed benefit for + bucket expansion algorithm. Deleting is marginally improved too. +* fixed a minor bug in the calculation of the max ideal chain length; + line 446 in v1.2 erroneously calculated a/b*2 instead of a/(b*2). + The effect of this bug was that bucket expansion could occur more + readily because the per-bucket 'max chain length multiplier factor' + (which delays bucket expansion when certain buckets are overused) + was set to a lower, expansion-favoring value than intended. +* improved source commenting and improved variable names in structures +* remove `HASH_JSW`. Lengthy random number array made code less readable +* add `HASH_SRT(hh,hash,cmp)` as a generalized `HASH_SORT(hash,cmp)`. + It was an omission in uthash 1.2 that there was no sort macro for + hash handles with names other than hh. +* Corrected `HASH_FSCK` so it works with any name for the hash handle. +* behave properly in pathological `HASH_DEL(a,a)` case where the same + variable references the head and the deletee (advancing the head + then loses the correct reference to the deletee); fix by using + scratch area in the hash table to store deletee hash handle. +* made tests runnable on MinGW +* 3000+ downloads since uthash-1.0 + + +Version 1.2 (2006-11-22) +------------------------ +* new `HASH_SORT` macro +* Cygwin support +* User Guide now features a clickable Table of Contents. + (The technique for generating the TOC on the browser was contributed + back to the AsciiDoc project and incorporated into AsciiDoc v8.1.0). + + +Version 1.1 (2006-06-28) +------------------------ +* uthash-1.1 released +* supports several built-in user-selectable hash functions +* new keystats utility quantifies performance of hash functions + + +Version 1.0 (2006-06-02) +------------------------ +* Initial release + +// vim: set syntax=asciidoc: diff --git a/doc/Makefile b/doc/Makefile new file mode 100644 index 000000000..099a373d5 --- /dev/null +++ b/doc/Makefile @@ -0,0 +1,18 @@ +HTML=$(patsubst %.txt,%.html,$(wildcard *.txt)) + +all: $(HTML) + +# when each target of a multi-target rule has its own prereq +# we use a static pattern rule. +$(HTML): %.html: %.txt + asciidoc -a toc2 $< + +TMP=/tmp/uthash-gh-pages +stage: + mkdir -p ${TMP} + rm -if ${TMP}/* + cp *.html *.css *.png ${TMP} + +.PHONY: clean +clean: + $(RM) $(HTML) diff --git a/doc/banner.png b/doc/banner.png Binary files differnew file mode 100644 index 000000000..de4f310b9 --- /dev/null +++ b/doc/banner.png diff --git a/doc/banner.svg b/doc/banner.svg new file mode 100644 index 000000000..f3143f171 --- /dev/null +++ b/doc/banner.svg @@ -0,0 +1,451 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://web.resource.org/cc/" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" + width="728px" + height="90px" + id="svg1307" + sodipodi:version="0.32" + inkscape:version="0.45.1" + sodipodi:docbase="/Users/thanson/code/uthash/trunk/doc/html/img" + sodipodi:docname="banner.svg" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/banner.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90" + inkscape:output_extension="org.inkscape.output.svg.inkscape"> + <defs + id="defs1309"> + <linearGradient + id="linearGradient12743"> + <stop + style="stop-color:#99e1fa;stop-opacity:1;" + offset="0" + id="stop12745" /> + <stop + id="stop12753" + offset="0" + style="stop-color:#99e1fa;stop-opacity:0.49803922;" /> + <stop + style="stop-color:#99e1fa;stop-opacity:0;" + offset="1" + id="stop12747" /> + </linearGradient> + <marker + inkscape:stockid="Arrow1Mend" + orient="auto" + refY="0.0" + refX="0.0" + id="Arrow1Mend" + style="overflow:visible;"> + <path + id="path7755" + d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none;" + transform="scale(0.4) rotate(180)" /> + </marker> + <marker + inkscape:stockid="Arrow1Sstart" + orient="auto" + refY="0.0" + refX="0.0" + id="Arrow1Sstart" + style="overflow:visible"> + <path + id="path7752" + d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none" + transform="scale(0.2)" /> + </marker> + <marker + inkscape:stockid="Arrow1Send" + orient="auto" + refY="0.0" + refX="0.0" + id="Arrow1Send" + style="overflow:visible;"> + <path + id="path7749" + d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none;" + transform="scale(0.2) rotate(180)" /> + </marker> + <marker + inkscape:stockid="StopM" + orient="auto" + refY="0.0" + refX="0.0" + id="StopM" + style="overflow:visible"> + <path + id="path7651" + d="M 0.0,5.65 L 0.0,-5.65" + style="fill:none;fill-opacity:0.75000000;fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt" + transform="scale(0.4)" /> + </marker> + <marker + inkscape:stockid="Arrow2Mend" + orient="auto" + refY="0.0" + refX="0.0" + id="Arrow2Mend" + style="overflow:visible;"> + <path + id="path7737" + style="font-size:12.0;fill-rule:evenodd;stroke-width:0.62500000;stroke-linejoin:round;" + d="M 8.7185878,4.0337352 L -2.2072895,0.016013256 L 8.7185884,-4.0017078 C 6.9730900,-1.6296469 6.9831476,1.6157441 8.7185878,4.0337352 z " + transform="scale(0.6) rotate(180) translate(-5,0)" /> + </marker> + <marker + inkscape:stockid="TriangleInM" + orient="auto" + refY="0.0" + refX="0.0" + id="TriangleInM" + style="overflow:visible"> + <path + id="path7669" + d="M 5.77,0.0 L -2.88,5.0 L -2.88,-5.0 L 5.77,0.0 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none" + transform="scale(-0.4)" /> + </marker> + <marker + inkscape:stockid="StopL" + orient="auto" + refY="0.0" + refX="0.0" + id="StopL" + style="overflow:visible"> + <path + id="path7654" + d="M 0.0,5.65 L 0.0,-5.65" + style="fill:none;fill-opacity:0.75000000;fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt" + transform="scale(0.8)" /> + </marker> + <marker + inkscape:stockid="TriangleOutM" + orient="auto" + refY="0.0" + refX="0.0" + id="TriangleOutM" + style="overflow:visible"> + <path + id="path7660" + d="M 5.77,0.0 L -2.88,5.0 L -2.88,-5.0 L 5.77,0.0 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none" + transform="scale(0.4)" /> + </marker> + <marker + inkscape:stockid="DiamondS" + orient="auto" + refY="0.0" + refX="0.0" + id="DiamondS" + style="overflow:visible"> + <path + id="path7675" + d="M -2.1579186e-005,-7.0710768 L -7.0710894,-8.9383918e-006 L -2.1579186e-005,7.0710589 L 7.0710462,-8.9383918e-006 L -2.1579186e-005,-7.0710768 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none" + transform="scale(0.2)" /> + </marker> + <marker + inkscape:stockid="Tail" + orient="auto" + refY="0.0" + refX="0.0" + id="Tail" + style="overflow:visible"> + <g + id="g7716" + transform="scale(-1.2)"> + <path + id="path7718" + d="M -3.8048674,-3.9585227 L 0.54352094,-0.00068114835" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.8;marker-start:none;marker-end:none;stroke-linecap:round" /> + <path + id="path7720" + d="M -1.2866832,-3.9585227 L 3.0617053,-0.00068114835" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.8;marker-start:none;marker-end:none;stroke-linecap:round" /> + <path + id="path7722" + d="M 1.3053582,-3.9585227 L 5.6537466,-0.00068114835" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.8;marker-start:none;marker-end:none;stroke-linecap:round" /> + <path + id="path7724" + d="M -3.8048674,4.1775838 L 0.54352094,0.21974226" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.8;marker-start:none;marker-end:none;stroke-linecap:round" /> + <path + id="path7726" + d="M -1.2866832,4.1775838 L 3.0617053,0.21974226" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.8;marker-start:none;marker-end:none;stroke-linecap:round" /> + <path + id="path7728" + d="M 1.3053582,4.1775838 L 5.6537466,0.21974226" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.8;marker-start:none;marker-end:none;stroke-linecap:round" /> + </g> + </marker> + <marker + inkscape:stockid="Arrow1Lstart" + orient="auto" + refY="0.0" + refX="0.0" + id="Arrow1Lstart" + style="overflow:visible"> + <path + id="path7764" + d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z " + style="fill-rule:evenodd;stroke:#000000;stroke-width:1.0pt;marker-start:none" + transform="scale(0.8)" /> + </marker> + <linearGradient + inkscape:collect="always" + id="linearGradient3964"> + <stop + style="stop-color:#00eb00;stop-opacity:1;" + offset="0" + id="stop3966" /> + <stop + style="stop-color:#00eb00;stop-opacity:0;" + offset="1" + id="stop3968" /> + </linearGradient> + <radialGradient + inkscape:collect="always" + xlink:href="#linearGradient3964" + id="radialGradient3996" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1,0,0,0.237347,4.901628e-13,36.5688)" + cx="176.99219" + cy="47.949429" + fx="176.99219" + fy="47.949429" + r="78.257812" /> + <radialGradient + inkscape:collect="always" + xlink:href="#linearGradient12743" + id="radialGradient12751" + cx="165.91866" + cy="45.584854" + fx="165.91866" + fy="45.584854" + r="56.51194" + gradientTransform="matrix(1,0,0,0.603517,0,18.07364)" + gradientUnits="userSpaceOnUse" /> + </defs> + <sodipodi:namedview + id="base" + pagecolor="#ffffff" + bordercolor="#666666" + borderopacity="1.0" + inkscape:pageopacity="0.0" + inkscape:pageshadow="2" + inkscape:zoom="0.9793956" + inkscape:cx="372.32157" + inkscape:cy="45" + inkscape:document-units="px" + inkscape:current-layer="g2335" + inkscape:window-width="791" + inkscape:window-height="581" + inkscape:window-x="4" + inkscape:window-y="48" /> + <metadata + id="metadata1312"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + </cc:Work> + </rdf:RDF> + </metadata> + <g + id="layer1" + inkscape:label="Layer 1" + inkscape:groupmode="layer"> + <rect + style="opacity:1;fill:#393be9;fill-opacity:1;stroke:#f18a00;stroke-width:5.65522385;stroke-linecap:round;stroke-linejoin:bevel;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect3981" + width="435.17825" + height="78.666664" + x="5.1747785" + y="6" + rx="29.141403" + ry="20" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90" /> + <flowRoot + transform="matrix(1.673678,0,0,1.673678,-141.8484,-37.12273)" + style="font-size:47.99999774;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr;text-anchor:start;fill:#faf599;fill-opacity:1;stroke:#f3bf33;stroke-opacity:1;font-family:Bitstream Vera Sans" + id="flowRoot3988" + xml:space="preserve" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90"><flowRegion + style="fill:url(#radialGradient3996);fill-opacity:1" + id="flowRegion3990"><rect + style="font-size:47.99999774;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr;text-anchor:start;fill:#faf599;fill-opacity:1;stroke:#f3bf33;stroke-opacity:1;font-family:Bitstream Vera Sans" + y="18" + x="94.666664" + height="61.333332" + width="321.33334" + id="rect3992" /></flowRegion><flowPara + id="flowPara7831">ut hash</flowPara></flowRoot> <rect + style="opacity:1;fill:url(#radialGradient12751);fill-opacity:1.0;stroke:none;stroke-width:2.82532263;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect10995" + width="113.02388" + height="68.211792" + x="109.40672" + y="11.478957" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90" /> + <g + id="g7808" + transform="matrix(0.807859,0,0,0.807859,-140.848,9.677403)" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90"> + <rect + y="37.730064" + x="382.39673" + height="18.405188" + width="23.206543" + id="rect4882" + style="opacity:1;fill:#9be5ea;fill-opacity:1;stroke:#000000;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <rect + style="opacity:1;fill:#d48c21;fill-opacity:0.97777776;stroke:#000000;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect4886" + width="23.206543" + height="18.405188" + x="416.39673" + y="37.730064" /> + <path + inkscape:connector-type="polyline" + id="path4890" + d="M 372.60327,46.932658 L 381.39673,46.932658" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path4892" + d="M 406.60327,46.932658 L 415.39673,46.932658" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <rect + y="9.7300644" + x="348.39673" + height="18.405188" + width="23.206543" + id="rect4896" + style="opacity:1;fill:#79c71a;fill-opacity:1;stroke:#000000;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <rect + style="opacity:1;fill:#f5e1a2;fill-opacity:1;stroke:#000000;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect4898" + width="23.206543" + height="18.405188" + x="382.39673" + y="9.7300644" /> + <path + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 372.60327,18.932658 L 381.39673,18.932658" + id="path4902" + inkscape:connector-type="polyline" /> + <rect + y="14.207111" + x="318.45328" + height="10.1194" + width="10.1194" + id="rect4906" + style="opacity:1;fill:#1336e6;fill-opacity:1;stroke:none;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path5789" + d="M 328.57268,19.220474 L 347.39673,19.048081" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 328.57268,19.220474 L 347.39673,19.048081" + id="path5795" + inkscape:connector-type="polyline" /> + <rect + y="37.789951" + x="348.20978" + height="18.405188" + width="23.206543" + id="rect5803" + style="opacity:1;fill:#e5e5e5;fill-opacity:1;stroke:#000000;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <rect + y="42.267002" + x="318.26633" + height="10.1194" + width="10.1194" + id="rect5805" + style="opacity:1;fill:#1336e6;fill-opacity:1;stroke:none;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <path + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 328.38572,47.280365 L 347.20977,47.107972" + id="path5807" + inkscape:connector-type="polyline" /> + <rect + style="opacity:1;fill:#ddf9ed;fill-opacity:1;stroke:#000000;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect5809" + width="23.206543" + height="18.405188" + x="348.20978" + y="63.720913" /> + <rect + style="opacity:1;fill:#1336e6;fill-opacity:1;stroke:none;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect5811" + width="10.1194" + height="10.1194" + x="318.26633" + y="68.197968" /> + <path + inkscape:connector-type="polyline" + id="path5813" + d="M 328.38572,73.211328 L 347.20977,73.038935" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path5833" + d="M 323.47927,24.326511 L 323.35974,42.267002" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#2f29df;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path5835" + d="M 323.32603,52.386402 L 323.32603,68.197968" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#2f29df;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + id="path6716" + d="M 429.08836,47.11641 L 394.37307,18.527349 L 394.37307,49.158485 L 359.65778,18.527349 L 359.65778,50.179523 L 359.65778,75.70547" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#f3bf33;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;marker-start:url(#StopM);marker-end:url(#Arrow1Send);stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + </g> + <g + id="g2335" + transform="translate(0,-10)" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo_tag.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90"> + <text + xml:space="preserve" + style="font-size:18.43119621px;font-style:normal;font-weight:normal;text-align:center;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;font-family:Bitstream Vera Sans" + x="565.8512" + y="50.633156" + id="text2331"><tspan + sodipodi:role="line" + id="tspan2333" + x="565.85119" + y="50.633156">a hash table</tspan><tspan + sodipodi:role="line" + x="565.8512" + y="73.672151" + id="tspan2361">for C structures</tspan></text> + </g> + </g> +</svg> diff --git a/doc/google315d692c9c632ed0.html b/doc/google315d692c9c632ed0.html new file mode 100644 index 000000000..c79038e6e --- /dev/null +++ b/doc/google315d692c9c632ed0.html @@ -0,0 +1 @@ +google-site-verification: google315d692c9c632ed0.html
\ No newline at end of file diff --git a/doc/index.html b/doc/index.html new file mode 100644 index 000000000..11b120e22 --- /dev/null +++ b/doc/index.html @@ -0,0 +1,122 @@ +<!DOCTYPE html + PUBLIC "-//W3C//DTD XTHML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <link rel="stylesheet" type="text/css" href="styles.css" /> + <title>uthash: a hash table for C structures</title> + </head> + <body> + + <div id="banner"> + <img src="banner.png" alt="uthash: a hash table for C structures" /> + </div> <!-- banner --> + + <div id="topnav"> + <a href="http://github.com/troydhanson/uthash">GitHub page</a> > + uthash home <!-- http://troydhanson.github.com/uthash/ --> + +<a href="https://twitter.com/share" class="twitter-share-button" data-via="troydhanson">Tweet</a> +<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs");</script> + </div> + + <hr /> + <div id="mid"> + + <div id="nav"> + + <h2>documentation</h2> + <div><a href="userguide.html">uthash</a></div> + <div><a href="utlist.html">utlist</a></div> + <div><a href="utarray.html">utarray</a></div> + <div><a href="utringbuffer.html">utringbuffer</a></div> + <div><a href="utstack.html">utstack</a></div> + <div><a href="utstring.html">utstring</a></div> + + <h2>download</h2> + <h3>GNU/Linux, Windows</h3> + <div><a href=https://github.com/troydhanson/uthash/archive/master.zip>uthash-master.zip</a></div> + <div><a href=https://github.com/troydhanson/uthash>git clone</a></div> + + <h2>license</h2> + <div><a href="license.html">BSD revised</a></div> + + + <h2>developer</h2> + <div><a href="http://troydhanson.github.io/">Troy D. Hanson</a></div> + + <h2>maintainer</h2> + <div><a href="https://github.com/Quuxplusone">Arthur O'Dwyer</a></div> + + + </div> + + <div id="main"> + Any C structure can be stored in a hash table using uthash. Just add a + <em>UT_hash_handle</em> to the structure and choose one or more fields + in your structure to act as the key. Then use these macros to store, + retrieve or delete items from the hash table. + +<div class="listing"> +Example 1. Adding an item to a hash. +<div class="code"> +<pre> +#include "uthash.h" + +struct my_struct { + int id; /* we'll use this field as the key */ + char name[10]; + UT_hash_handle hh; /* makes this structure hashable */ +}; + +struct my_struct *users = NULL; + +void add_user(struct my_struct *s) { + HASH_ADD_INT( users, id, s ); +} + +</pre> +</div> <!-- code --> +</div> <!-- listing --> + +<div class="listing"> +Example 2. Looking up an item in a hash. +<div class="code"> +<pre> +struct my_struct *find_user(int user_id) { + struct my_struct *s; + + HASH_FIND_INT( users, &user_id, s ); + return s; +} + +</pre> +</div> <!-- code --> +</div> <!-- listing --> + +<div class="listing"> +Example 3. Deleting an item from a hash. +<div class="code"> + +<pre> +void delete_user(struct my_struct *user) { + HASH_DEL( users, user); +} + +</pre> +</div> <!-- code --> +</div> <!-- listing --> + + For more information and examples, please see the <a href="userguide.html">User Guide.</a> + +</div> <!-- main --> +</div> <!-- mid --> + + <hr /> + <div id="footer"> + </div> <!-- footer --> + + </body> + +</html> + diff --git a/doc/license.html b/doc/license.html new file mode 100644 index 000000000..c70684558 --- /dev/null +++ b/doc/license.html @@ -0,0 +1,55 @@ +<!DOCTYPE html + PUBLIC "-//W3C//DTD XTHML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <link rel="stylesheet" type="text/css" href="styles.css" /> + <title>uthash: a hash table for C structures</title> + </head> + <body> + + <div id="banner"> + <img src="banner.png" alt="uthash: a hash table for C structures" /> + </div> <!-- banner --> + + <div id="topnav"> + <a href="http://troydhanson.github.com/uthash/">uthash home</a> > + BSD license + </div> + + <hr /> + <div id="mid"> + <div id="main"> +<pre> +Copyright (c) 2005-2018, Troy D. Hanson http://troydhanson.github.com/uthash/ +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS +IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +</pre> +</div> <!-- mid --> +</div> <!-- main --> + + <hr /> + <div id="footer"> + </div> <!-- footer --> + + </body> + +</html> + diff --git a/doc/rss.png b/doc/rss.png Binary files differnew file mode 100644 index 000000000..b3c949d22 --- /dev/null +++ b/doc/rss.png diff --git a/doc/styles.css b/doc/styles.css new file mode 100644 index 000000000..061172232 --- /dev/null +++ b/doc/styles.css @@ -0,0 +1,141 @@ +#banner { + /* font-size: x-large; */ + /* background: #ff00ff; */ + /* height: 100px; */ +} + +#topnav { + /* background-image: url(img/grad_topnav.png); */ + /* background-repeat: repeat-y; */ + /* background-color: #af00af; */ + /* height: 25px; */ + margin-top: 5px; + margin-bottom: 10px; + padding: 3px; + font-size: 9pt; + font-family: sans-serif; + /* border-style: solid; */ + /* border-width: 1px; */ +} + +#topnav a { + padding: 8px; +} + +h1,p { margin: 0; } /* non-0 margin on firefox */ + +#mid { + /* background-image: url(img/grad_blue.png); */ + background-repeat: repeat-y; + /* background-color: #ffddaa; */ + padding-top: 20px; + padding-top: 20px; + margin-bottom: 10px; +} + +#mid img { + padding-left: 10px; + vertical-align: middle; +} + +a img { + border: 0 +} + +.twitter-share-button { + float: right; +} + +.twitter-follow-button { + padding: 5px; +} + +#nav { + background-color: #fff8f1; + margin-left: 10px; + margin-top: 20px; + margin-right: 20px; + float: left; + padding: 10px; + border-style: solid; + border-width: 2px; + font-family: sans-serif; +} + + +#nav h2 { + font-weight: bold; + font-size: 10pt; +} + +#nav h3 { + /* font-weight: bold; */ + padding-left: 5px; + /* font-style: oblique; */ + font-family: sans-serif; + font-size: 7pt; +} + +#nav div { + font-size: 9pt; + padding-left: 15px; +} + +#main { + background: #ffffff; + margin-top: 20px; + margin-left: 170px; + padding-left: 20px; + height: 100%; +} + +#main h1 { + font-family: sans-serif; +} + +.listing { + margin: 20px; + font-family: sans-serif; + font-weight: bold; +} + +.code { + padding: 10px; + margin: 10px; + font-size: 8pt; + font-weight: normal; + background: #f3f3f3; + width: 100%; + border-style: solid; + border-width: 1px; +} + +#footer { + /* background: #00ffff; */ + margin-top: 5px; + font-size: small; + font-family: sans-serif; +} + +hr { + height: 0.04em; + background: black; + margin: 0 10% 0 0; +} + +#footer { + width: 90%; +} + +#footer img { + margin-right: 5px; + float: right; +} + +#footer #support { + float: right; +} + +body { + width: 80%; +} diff --git a/doc/userguide.txt b/doc/userguide.txt new file mode 100644 index 000000000..4d8ce8f0a --- /dev/null +++ b/doc/userguide.txt @@ -0,0 +1,1901 @@ +uthash User Guide +================= +Troy D. Hanson, Arthur O'Dwyer +v2.1.0, December 2018 + +To download uthash, follow this link back to the +https://github.com/troydhanson/uthash[GitHub project page]. +Back to my http://troydhanson.github.io/[other projects]. + +A hash in C +----------- +This document is written for C programmers. Since you're reading this, chances +are that you know a hash is used for looking up items using a key. In scripting +languages, hashes or "dictionaries" are used all the time. In C, hashes don't +exist in the language itself. This software provides a hash table for C +structures. + +What can it do? +~~~~~~~~~~~~~~~~~ +This software supports these operations on items in a hash table: + +1. add/replace +2. find +3. delete +4. count +5. iterate +6. sort + +Is it fast? +~~~~~~~~~~~ +Add, find and delete are normally constant-time operations. This is influenced +by your key domain and the hash function. + +This hash aims to be minimalistic and efficient. It's around 1000 lines of C. +It inlines automatically because it's implemented as macros. It's fast as long +as the hash function is suited to your keys. You can use the default hash +function, or easily compare performance and choose from among several other +<<hash_functions,built-in hash functions>>. + +Is it a library? +~~~~~~~~~~~~~~~~ +No, it's just a single header file: `uthash.h`. All you need to do is copy +the header file into your project, and: + + #include "uthash.h" + +Since uthash is a header file only, there is no library code to link against. + +C/C++ and platforms +~~~~~~~~~~~~~~~~~~~ +This software can be used in C and C++ programs. It has been tested on: + + * Linux + * Windows using Visual Studio 2008 and 2010 + * Solaris + * OpenBSD + * FreeBSD + * Android + +Test suite +^^^^^^^^^^ +To run the test suite, enter the `tests` directory. Then, + + * on Unix platforms, run `make` + * on Windows, run the "do_tests_win32.cmd" batch file. (You may edit the + batch file if your Visual Studio is installed in a non-standard location). + +BSD licensed +~~~~~~~~~~~~ +This software is made available under the +link:license.html[revised BSD license]. +It is free and open source. + +Download uthash +~~~~~~~~~~~~~~~ +Follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file. + +Getting help +~~~~~~~~~~~~ +Please use the https://groups.google.com/d/forum/uthash[uthash Google Group] to +ask questions. You can email it at uthash@googlegroups.com. + +Contributing +~~~~~~~~~~~~ +You may submit pull requests through GitHub. However, the maintainers of uthash +value keeping it unchanged, rather than adding bells and whistles. + +Extras included +~~~~~~~~~~~~~~~ +Three "extras" come with uthash. These provide lists, dynamic arrays and +strings: + + * link:utlist.html[utlist.h] provides linked list macros for C structures. + * link:utarray.html[utarray.h] implements dynamic arrays using macros. + * link:utstring.html[utstring.h] implements a basic dynamic string. + +History +~~~~~~~ +I wrote uthash in 2004-2006 for my own purposes. Originally it was hosted on +SourceForge. Uthash was downloaded around 30,000 times between 2006-2013 then +transitioned to GitHub. It's been incorporated into commercial software, +academic research, and into other open-source software. It has also been added +to the native package repositories for a number of Unix-y distros. + +When uthash was written, there were fewer options for doing generic hash tables +in C than exist today. There are faster hash tables, more memory-efficient hash +tables, with very different API's today. But, like driving a minivan, uthash is +convenient, and gets the job done for many purposes. + +As of July 2016, uthash is maintained by Arthur O'Dwyer. + +Your structure +-------------- + +In uthash, a hash table is comprised of structures. Each structure represents a +key-value association. One or more of the structure fields constitute the key. +The structure pointer itself is the value. + +.Defining a structure that can be hashed +---------------------------------------------------------------------- +#include "uthash.h" + +struct my_struct { + int id; /* key */ + char name[10]; + UT_hash_handle hh; /* makes this structure hashable */ +}; +---------------------------------------------------------------------- + +Note that, in uthash, your structure will never be moved or copied into another +location when you add it into a hash table. This means that you can keep other +data structures that safely point to your structure-- regardless of whether you +add or delete it from a hash table during your program's lifetime. + +The key +~~~~~~~ +There are no restrictions on the data type or name of the key field. The key +can also comprise multiple contiguous fields, having any names and data types. + +.Any data type... really? +***************************************************************************** +Yes, your key and structure can have any data type. Unlike function calls with +fixed prototypes, uthash consists of macros-- whose arguments are untyped-- and +thus able to work with any type of structure or key. +***************************************************************************** + +Unique keys +^^^^^^^^^^^ +As with any hash, every item must have a unique key. Your application must +enforce key uniqueness. Before you add an item to the hash table, you must +first know (if in doubt, check!) that the key is not already in use. You +can check whether a key already exists in the hash table using `HASH_FIND`. + +The hash handle +~~~~~~~~~~~~~~~ +The `UT_hash_handle` field must be present in your structure. It is used for +the internal bookkeeping that makes the hash work. It does not require +initialization. It can be named anything, but you can simplify matters by +naming it `hh`. This allows you to use the easier "convenience" macros to add, +find and delete items. + +A word about memory +~~~~~~~~~~~~~~~~~~~ + +Overhead +^^^^^^^^ +The hash handle consumes about 32 bytes per item on a 32-bit system, or 56 bytes +per item on a 64-bit system. The other overhead costs-- the buckets and the +table-- are negligible in comparison. You can use `HASH_OVERHEAD` to get the +overhead size, in bytes, for a hash table. See <<Macro_reference,Macro +Reference>>. + +How clean up occurs +^^^^^^^^^^^^^^^^^^^ +Some have asked how uthash cleans up its internal memory. The answer is simple: +'when you delete the final item' from a hash table, uthash releases all the +internal memory associated with that hash table, and sets its pointer to NULL. + + +Hash operations +--------------- + +This section introduces the uthash macros by example. For a more succinct +listing, see <<Macro_reference,Macro Reference>>. + +.Convenience vs. general macros: +***************************************************************************** +The uthash macros fall into two categories. The 'convenience' macros can be used +with integer, pointer or string keys (and require that you chose the conventional +name `hh` for the `UT_hash_handle` field). The convenience macros take fewer +arguments than the general macros, making their usage a bit simpler for these +common types of keys. + +The 'general' macros can be used for any types of keys, or for multi-field keys, +or when the `UT_hash_handle` has been named something other than `hh`. These +macros take more arguments and offer greater flexibility in return. But if the +convenience macros suit your needs, use them-- your code will be more readable. +***************************************************************************** + +Declare the hash +~~~~~~~~~~~~~~~~ +Your hash must be declared as a `NULL`-initialized pointer to your structure. + + struct my_struct *users = NULL; /* important! initialize to NULL */ + +Add item +~~~~~~~~ +Allocate and initialize your structure as you see fit. The only aspect +of this that matters to uthash is that your key must be initialized to +a unique value. Then call `HASH_ADD`. (Here we use the convenience macro +`HASH_ADD_INT`, which offers simplified usage for keys of type `int`). + +.Add an item to a hash +---------------------------------------------------------------------- +void add_user(int user_id, char *name) { + struct my_struct *s; + + s = malloc(sizeof(struct my_struct)); + s->id = user_id; + strcpy(s->name, name); + HASH_ADD_INT( users, id, s ); /* id: name of key field */ +} +---------------------------------------------------------------------- + +The first parameter to `HASH_ADD_INT` is the hash table, and the +second parameter is the 'name' of the key field. Here, this is `id`. The +last parameter is a pointer to the structure being added. + +[[validc]] +.Wait.. the field name is a parameter? +******************************************************************************* +If you find it strange that `id`, which is the 'name of a field' in the +structure, can be passed as a parameter... welcome to the world of macros. Don't +worry; the C preprocessor expands this to valid C code. +******************************************************************************* + +Key must not be modified while in-use +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Once a structure has been added to the hash, do not change the value of its key. +Instead, delete the item from the hash, change the key, and then re-add it. + +Checking uniqueness +^^^^^^^^^^^^^^^^^^^ +In the example above, we didn't check to see if `user_id` was already a key +of some existing item in the hash. *If there's any chance that duplicate keys +could be generated by your program, you must explicitly check the uniqueness* +before adding the key to the hash. If the key is already in the hash, you can +simply modify the existing structure in the hash rather than adding the item. +'It is an error to add two items with the same key to the hash table'. + +Let's rewrite the `add_user` function to check whether the id is in the hash. +Only if the id is not present in the hash, do we create the item and add it. +Otherwise we just modify the structure that already exists. + + void add_user(int user_id, char *name) { + struct my_struct *s; + + HASH_FIND_INT(users, &user_id, s); /* id already in the hash? */ + if (s==NULL) { + s = (struct my_struct *)malloc(sizeof *s); + s->id = user_id; + HASH_ADD_INT( users, id, s ); /* id: name of key field */ + } + strcpy(s->name, name); + } + + +Why doesn't uthash check key uniqueness for you? It saves the cost of a hash +lookup for those programs which don't need it- for example, programs whose keys +are generated by an incrementing, non-repeating counter. + +However, if replacement is a common operation, it is possible to use the +`HASH_REPLACE` macro. This macro, before adding the item, will try to find an +item with the same key and delete it first. It also returns a pointer to the +replaced item, so the user has a chance to de-allocate its memory. + +Passing the hash pointer into functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +In the example above `users` is a global variable, but what if the caller wanted +to pass the hash pointer 'into' the `add_user` function? At first glance it would +appear that you could simply pass `users` as an argument, but that won't work +right. + + /* bad */ + void add_user(struct my_struct *users, int user_id, char *name) { + ... + HASH_ADD_INT( users, id, s ); + } + +You really need to pass 'a pointer' to the hash pointer: + + /* good */ + void add_user(struct my_struct **users, int user_id, char *name) { ... + ... + HASH_ADD_INT( *users, id, s ); + } + +Note that we dereferenced the pointer in the `HASH_ADD` also. + +The reason it's necessary to deal with a pointer to the hash pointer is simple: +the hash macros modify it (in other words, they modify the 'pointer itself' not +just what it points to). + + +Replace item +~~~~~~~~~~~~ +`HASH_REPLACE` macros are equivalent to HASH_ADD macros except they attempt +to find and delete the item first. If it finds and deletes an item, it will +also return that items pointer as an output parameter. + + +Find item +~~~~~~~~~ +To look up a structure in a hash, you need its key. Then call `HASH_FIND`. +(Here we use the convenience macro `HASH_FIND_INT` for keys of type `int`). + +.Find a structure using its key +---------------------------------------------------------------------- +struct my_struct *find_user(int user_id) { + struct my_struct *s; + + HASH_FIND_INT( users, &user_id, s ); /* s: output pointer */ + return s; +} +---------------------------------------------------------------------- + +Here, the hash table is `users`, and `&user_id` points to the key (an integer +in this case). Last, `s` is the 'output' variable of `HASH_FIND_INT`. The +final result is that `s` points to the structure with the given key, or +is `NULL` if the key wasn't found in the hash. + +[NOTE] +The middle argument is a 'pointer' to the key. You can't pass a literal key +value to `HASH_FIND`. Instead assign the literal value to a variable, and pass +a pointer to the variable. + + +Delete item +~~~~~~~~~~~ +To delete a structure from a hash, you must have a pointer to it. (If you only +have the key, first do a `HASH_FIND` to get the structure pointer). + +.Delete an item from a hash +---------------------------------------------------------------------- +void delete_user(struct my_struct *user) { + HASH_DEL(users, user); /* user: pointer to deletee */ + free(user); /* optional; it's up to you! */ +} +---------------------------------------------------------------------- + +Here again, `users` is the hash table, and `user` is a pointer to the +structure we want to remove from the hash. + +uthash never frees your structure +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Deleting a structure just removes it from the hash table-- it doesn't `free` +it. The choice of when to free your structure is entirely up to you; uthash +will never free your structure. For example when using `HASH_REPLACE` macros, +a replaced output argument is returned back, in order to make it possible for +the user to de-allocate it. + +Delete can change the pointer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +The hash table pointer (which initially points to the first item added to the +hash) can change in response to `HASH_DEL` (i.e. if you delete the first item +in the hash table). + +Iterative deletion +^^^^^^^^^^^^^^^^^^ +The `HASH_ITER` macro is a deletion-safe iteration construct which expands +to a simple 'for' loop. + +.Delete all items from a hash +---------------------------------------------------------------------- +void delete_all() { + struct my_struct *current_user, *tmp; + + HASH_ITER(hh, users, current_user, tmp) { + HASH_DEL(users,current_user); /* delete; users advances to next */ + free(current_user); /* optional- if you want to free */ + } +} +---------------------------------------------------------------------- + +All-at-once deletion +^^^^^^^^^^^^^^^^^^^^ +If you only want to delete all the items, but not free them or do any +per-element clean up, you can do this more efficiently in a single operation: + + HASH_CLEAR(hh,users); + +Afterward, the list head (here, `users`) will be set to `NULL`. + +Count items +~~~~~~~~~~~ + +The number of items in the hash table can be obtained using `HASH_COUNT`: + +.Count of items in the hash table +---------------------------------------------------------------------- +unsigned int num_users; +num_users = HASH_COUNT(users); +printf("there are %u users\n", num_users); +---------------------------------------------------------------------- + +Incidentally, this works even the list (`users`, here) is `NULL`, in +which case the count is 0. + +Iterating and sorting +~~~~~~~~~~~~~~~~~~~~~ + +You can loop over the items in the hash by starting from the beginning and +following the `hh.next` pointer. + +.Iterating over all the items in a hash +---------------------------------------------------------------------- +void print_users() { + struct my_struct *s; + + for(s=users; s != NULL; s=s->hh.next) { + printf("user id %d: name %s\n", s->id, s->name); + } +} +---------------------------------------------------------------------- + +There is also an `hh.prev` pointer you could use to iterate backwards through +the hash, starting from any known item. + +[[deletesafe]] +Deletion-safe iteration +^^^^^^^^^^^^^^^^^^^^^^^ +In the example above, it would not be safe to delete and free `s` in the body +of the 'for' loop, (because `s` is derefenced each time the loop iterates). +This is easy to rewrite correctly (by copying the `s->hh.next` pointer to a +temporary variable 'before' freeing `s`), but it comes up often enough that a +deletion-safe iteration macro, `HASH_ITER`, is included. It expands to a +`for`-loop header. Here is how it could be used to rewrite the last example: + + struct my_struct *s, *tmp; + + HASH_ITER(hh, users, s, tmp) { + printf("user id %d: name %s\n", s->id, s->name); + /* ... it is safe to delete and free s here */ + } + +.A hash is also a doubly-linked list. +******************************************************************************* +Iterating backward and forward through the items in the hash is possible +because of the `hh.prev` and `hh.next` fields. All the items in the hash can +be reached by repeatedly following these pointers, thus the hash is also a +doubly-linked list. +******************************************************************************* + +If you're using uthash in a C++ program, you need an extra cast on the `for` +iterator, e.g., `s=(struct my_struct*)s->hh.next`. + +Sorting +^^^^^^^ +The items in the hash are visited in "insertion order" when you follow the +`hh.next` pointer. You can sort the items into a new order using `HASH_SORT`. + + HASH_SORT( users, name_sort ); + +The second argument is a pointer to a comparison function. It must accept two +pointer arguments (the items to compare), and must return an `int` which is +less than zero, zero, or greater than zero, if the first item sorts before, +equal to, or after the second item, respectively. (This is the same convention +used by `strcmp` or `qsort` in the standard C library). + + int sort_function(void *a, void *b) { + /* compare a to b (cast a and b appropriately) + * return (int) -1 if (a < b) + * return (int) 0 if (a == b) + * return (int) 1 if (a > b) + */ + } + +Below, `name_sort` and `id_sort` are two examples of sort functions. + +.Sorting the items in the hash +---------------------------------------------------------------------- +int name_sort(struct my_struct *a, struct my_struct *b) { + return strcmp(a->name,b->name); +} + +int id_sort(struct my_struct *a, struct my_struct *b) { + return (a->id - b->id); +} + +void sort_by_name() { + HASH_SORT(users, name_sort); +} + +void sort_by_id() { + HASH_SORT(users, id_sort); +} +---------------------------------------------------------------------- + +When the items in the hash are sorted, the first item may change position. In +the example above, `users` may point to a different structure after calling +`HASH_SORT`. + +A complete example +~~~~~~~~~~~~~~~~~~ + +We'll repeat all the code and embellish it with a `main()` function to form a +working example. + +If this code was placed in a file called `example.c` in the same directory as +`uthash.h`, it could be compiled and run like this: + + cc -o example example.c + ./example + +Follow the prompts to try the program. + +.A complete program +---------------------------------------------------------------------- +#include <stdio.h> /* gets */ +#include <stdlib.h> /* atoi, malloc */ +#include <string.h> /* strcpy */ +#include "uthash.h" + +struct my_struct { + int id; /* key */ + char name[10]; + UT_hash_handle hh; /* makes this structure hashable */ +}; + +struct my_struct *users = NULL; + +void add_user(int user_id, char *name) { + struct my_struct *s; + + HASH_FIND_INT(users, &user_id, s); /* id already in the hash? */ + if (s==NULL) { + s = (struct my_struct *)malloc(sizeof *s); + s->id = user_id; + HASH_ADD_INT( users, id, s ); /* id: name of key field */ + } + strcpy(s->name, name); +} + +struct my_struct *find_user(int user_id) { + struct my_struct *s; + + HASH_FIND_INT( users, &user_id, s ); /* s: output pointer */ + return s; +} + +void delete_user(struct my_struct *user) { + HASH_DEL(users, user); /* user: pointer to deletee */ + free(user); +} + +void delete_all() { + struct my_struct *current_user, *tmp; + + HASH_ITER(hh, users, current_user, tmp) { + HASH_DEL(users, current_user); /* delete it (users advances to next) */ + free(current_user); /* free it */ + } +} + +void print_users() { + struct my_struct *s; + + for(s=users; s != NULL; s=(struct my_struct*)(s->hh.next)) { + printf("user id %d: name %s\n", s->id, s->name); + } +} + +int name_sort(struct my_struct *a, struct my_struct *b) { + return strcmp(a->name,b->name); +} + +int id_sort(struct my_struct *a, struct my_struct *b) { + return (a->id - b->id); +} + +void sort_by_name() { + HASH_SORT(users, name_sort); +} + +void sort_by_id() { + HASH_SORT(users, id_sort); +} + +int main(int argc, char *argv[]) { + char in[10]; + int id=1, running=1; + struct my_struct *s; + unsigned num_users; + + while (running) { + printf(" 1. add user\n"); + printf(" 2. add/rename user by id\n"); + printf(" 3. find user\n"); + printf(" 4. delete user\n"); + printf(" 5. delete all users\n"); + printf(" 6. sort items by name\n"); + printf(" 7. sort items by id\n"); + printf(" 8. print users\n"); + printf(" 9. count users\n"); + printf("10. quit\n"); + gets(in); + switch(atoi(in)) { + case 1: + printf("name?\n"); + add_user(id++, gets(in)); + break; + case 2: + printf("id?\n"); + gets(in); id = atoi(in); + printf("name?\n"); + add_user(id, gets(in)); + break; + case 3: + printf("id?\n"); + s = find_user(atoi(gets(in))); + printf("user: %s\n", s ? s->name : "unknown"); + break; + case 4: + printf("id?\n"); + s = find_user(atoi(gets(in))); + if (s) delete_user(s); + else printf("id unknown\n"); + break; + case 5: + delete_all(); + break; + case 6: + sort_by_name(); + break; + case 7: + sort_by_id(); + break; + case 8: + print_users(); + break; + case 9: + num_users=HASH_COUNT(users); + printf("there are %u users\n", num_users); + break; + case 10: + running=0; + break; + } + } + + delete_all(); /* free any structures */ + return 0; +} +---------------------------------------------------------------------- + +This program is included in the distribution in `tests/example.c`. You can run +`make example` in that directory to compile it easily. + +Standard key types +------------------ +This section goes into specifics of how to work with different kinds of keys. +You can use nearly any type of key-- integers, strings, pointers, structures, etc. + +[NOTE] +.A note about float +================================================================================ +You can use floating point keys. This comes with the same caveats as with any +program that tests floating point equality. In other words, even the tiniest +difference in two floating point numbers makes them distinct keys. +================================================================================ + +Integer keys +~~~~~~~~~~~~ +The preceding examples demonstrated use of integer keys. To recap, use the +convenience macros `HASH_ADD_INT` and `HASH_FIND_INT` for structures with +integer keys. (The other operations such as `HASH_DELETE` and `HASH_SORT` are +the same for all types of keys). + +String keys +~~~~~~~~~~~ +If your structure has a string key, the operations to use depend on whether your +structure 'points to' the key (`char *`) or the string resides `within` the +structure (`char a[10]`). *This distinction is important*. As we'll see below, +you need to use `HASH_ADD_KEYPTR` when your structure 'points' to a key (that is, +the key itself is 'outside' of the structure); in contrast, use `HASH_ADD_STR` +for a string key that is contained *within* your structure. + +[NOTE] +.char[ ] vs. char* +================================================================================ +The string is 'within' the structure in the first example below-- `name` is a +`char[10]` field. In the second example, the key is 'outside' of the +structure-- `name` is a `char *`. So the first example uses `HASH_ADD_STR` but +the second example uses `HASH_ADD_KEYPTR`. For information on this macro, see +the <<Macro_reference,Macro reference>>. +================================================================================ + +String 'within' structure +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.A string-keyed hash (string within structure) +---------------------------------------------------------------------- +#include <string.h> /* strcpy */ +#include <stdlib.h> /* malloc */ +#include <stdio.h> /* printf */ +#include "uthash.h" + +struct my_struct { + char name[10]; /* key (string is WITHIN the structure) */ + int id; + UT_hash_handle hh; /* makes this structure hashable */ +}; + + +int main(int argc, char *argv[]) { + const char *names[] = { "joe", "bob", "betty", NULL }; + struct my_struct *s, *tmp, *users = NULL; + + for (int i = 0; names[i]; ++i) { + s = (struct my_struct *)malloc(sizeof *s); + strcpy(s->name, names[i]); + s->id = i; + HASH_ADD_STR( users, name, s ); + } + + HASH_FIND_STR( users, "betty", s); + if (s) printf("betty's id is %d\n", s->id); + + /* free the hash table contents */ + HASH_ITER(hh, users, s, tmp) { + HASH_DEL(users, s); + free(s); + } + return 0; +} +---------------------------------------------------------------------- + +This example is included in the distribution in `tests/test15.c`. It prints: + + betty's id is 2 + +String 'pointer' in structure +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Now, here is the same example but using a `char *` key instead of `char [ ]`: + +.A string-keyed hash (structure points to string) +---------------------------------------------------------------------- +#include <string.h> /* strcpy */ +#include <stdlib.h> /* malloc */ +#include <stdio.h> /* printf */ +#include "uthash.h" + +struct my_struct { + const char *name; /* key */ + int id; + UT_hash_handle hh; /* makes this structure hashable */ +}; + + +int main(int argc, char *argv[]) { + const char *names[] = { "joe", "bob", "betty", NULL }; + struct my_struct *s, *tmp, *users = NULL; + + for (int i = 0; names[i]; ++i) { + s = (struct my_struct *)malloc(sizeof *s); + s->name = names[i]; + s->id = i; + HASH_ADD_KEYPTR( hh, users, s->name, strlen(s->name), s ); + } + + HASH_FIND_STR( users, "betty", s); + if (s) printf("betty's id is %d\n", s->id); + + /* free the hash table contents */ + HASH_ITER(hh, users, s, tmp) { + HASH_DEL(users, s); + free(s); + } + return 0; +} +---------------------------------------------------------------------- + +This example is included in `tests/test40.c`. + +Pointer keys +~~~~~~~~~~~~ +Your key can be a pointer. To be very clear, this means the 'pointer itself' +can be the key (in contrast, if the thing 'pointed to' is the key, this is a +different use case handled by `HASH_ADD_KEYPTR`). + +Here is a simple example where a structure has a pointer member, called `key`. + +.A pointer key +---------------------------------------------------------------------- +#include <stdio.h> +#include <stdlib.h> +#include "uthash.h" + +typedef struct { + void *key; + int i; + UT_hash_handle hh; +} el_t; + +el_t *hash = NULL; +char *someaddr = NULL; + +int main() { + el_t *d; + el_t *e = (el_t *)malloc(sizeof *e); + if (!e) return -1; + e->key = (void*)someaddr; + e->i = 1; + HASH_ADD_PTR(hash,key,e); + HASH_FIND_PTR(hash, &someaddr, d); + if (d) printf("found\n"); + + /* release memory */ + HASH_DEL(hash,e); + free(e); + return 0; +} +---------------------------------------------------------------------- + +This example is included in `tests/test57.c`. Note that the end of the program +deletes the element out of the hash, (and since no more elements remain in the +hash), uthash releases its internal memory. + +Structure keys +~~~~~~~~~~~~~~ +Your key field can have any data type. To uthash, it is just a sequence of +bytes. Therefore, even a nested structure can be used as a key. We'll use the +general macros `HASH_ADD` and `HASH_FIND` to demonstrate. + +NOTE: Structures contain padding (wasted internal space used to fulfill +alignment requirements for the members of the structure). These padding bytes +'must be zeroed' before adding an item to the hash or looking up an item. +Therefore always zero the whole structure before setting the members of +interest. The example below does this-- see the two calls to `memset`. + +.A key which is a structure +---------------------------------------------------------------------- +#include <stdlib.h> +#include <stdio.h> +#include "uthash.h" + +typedef struct { + char a; + int b; +} record_key_t; + +typedef struct { + record_key_t key; + /* ... other data ... */ + UT_hash_handle hh; +} record_t; + +int main(int argc, char *argv[]) { + record_t l, *p, *r, *tmp, *records = NULL; + + r = (record_t *)malloc(sizeof *r); + memset(r, 0, sizeof *r); + r->key.a = 'a'; + r->key.b = 1; + HASH_ADD(hh, records, key, sizeof(record_key_t), r); + + memset(&l, 0, sizeof(record_t)); + l.key.a = 'a'; + l.key.b = 1; + HASH_FIND(hh, records, &l.key, sizeof(record_key_t), p); + + if (p) printf("found %c %d\n", p->key.a, p->key.b); + + HASH_ITER(hh, records, p, tmp) { + HASH_DEL(records, p); + free(p); + } + return 0; +} + +---------------------------------------------------------------------- + +This usage is nearly the same as use of a compound key explained below. + +Note that the general macros require the name of the `UT_hash_handle` to be +passed as the first argument (here, this is `hh`). The general macros are +documented in <<Macro_reference,Macro Reference>>. + +Advanced Topics +--------------- + +Compound keys +~~~~~~~~~~~~~ +Your key can even comprise multiple contiguous fields. + +.A multi-field key +---------------------------------------------------------------------- +#include <stdlib.h> /* malloc */ +#include <stddef.h> /* offsetof */ +#include <stdio.h> /* printf */ +#include <string.h> /* memset */ +#include "uthash.h" + +#define UTF32 1 + +typedef struct { + UT_hash_handle hh; + int len; + char encoding; /* these two fields */ + int text[]; /* comprise the key */ +} msg_t; + +typedef struct { + char encoding; + int text[]; +} lookup_key_t; + +int main(int argc, char *argv[]) { + unsigned keylen; + msg_t *msg, *tmp, *msgs = NULL; + lookup_key_t *lookup_key; + + int beijing[] = {0x5317, 0x4eac}; /* UTF-32LE for 北京 */ + + /* allocate and initialize our structure */ + msg = (msg_t *)malloc( sizeof(msg_t) + sizeof(beijing) ); + memset(msg, 0, sizeof(msg_t)+sizeof(beijing)); /* zero fill */ + msg->len = sizeof(beijing); + msg->encoding = UTF32; + memcpy(msg->text, beijing, sizeof(beijing)); + + /* calculate the key length including padding, using formula */ + keylen = offsetof(msg_t, text) /* offset of last key field */ + + sizeof(beijing) /* size of last key field */ + - offsetof(msg_t, encoding); /* offset of first key field */ + + /* add our structure to the hash table */ + HASH_ADD( hh, msgs, encoding, keylen, msg); + + /* look it up to prove that it worked :-) */ + msg=NULL; + + lookup_key = (lookup_key_t *)malloc(sizeof(*lookup_key) + sizeof(beijing)); + memset(lookup_key, 0, sizeof(*lookup_key) + sizeof(beijing)); + lookup_key->encoding = UTF32; + memcpy(lookup_key->text, beijing, sizeof(beijing)); + HASH_FIND( hh, msgs, &lookup_key->encoding, keylen, msg ); + if (msg) printf("found \n"); + free(lookup_key); + + HASH_ITER(hh, msgs, msg, tmp) { + HASH_DEL(msgs, msg); + free(msg); + } + return 0; +} +---------------------------------------------------------------------- + +This example is included in the distribution in `tests/test22.c`. + +If you use multi-field keys, recognize that the compiler pads adjacent fields +(by inserting unused space between them) in order to fulfill the alignment +requirement of each field. For example a structure containing a `char` followed +by an `int` will normally have 3 "wasted" bytes of padding after the char, in +order to make the `int` field start on a multiple-of-4 address (4 is the length +of the int). + +[[multifield_note]] +.Calculating the length of a multi-field key: +******************************************************************************* +To determine the key length when using a multi-field key, you must include any +intervening structure padding the compiler adds for alignment purposes. + +An easy way to calculate the key length is to use the `offsetof` macro from +`<stddef.h>`. The formula is: + + key length = offsetof(last_key_field) + + sizeof(last_key_field) + - offsetof(first_key_field) + +In the example above, the `keylen` variable is set using this formula. +******************************************************************************* + +When dealing with a multi-field key, you must zero-fill your structure before +`HASH_ADD`'ing it to a hash table, or using its fields in a `HASH_FIND` key. + +In the previous example, `memset` is used to initialize the structure by +zero-filling it. This zeroes out any padding between the key fields. If we +didn't zero-fill the structure, this padding would contain random values. The +random values would lead to `HASH_FIND` failures; as two "identical" keys will +appear to mismatch if there are any differences within their padding. + +Alternatively, you can customize the global <<hash_keycompare,key comparison function>> +and <<hash_functions,key hashing function>> to ignore the padding in your key. +See <<hash_keycompare,Specifying an alternate key comparison function>>. + +[[multilevel]] +Multi-level hash tables +~~~~~~~~~~~~~~~~~~~~~~~ +A multi-level hash table arises when each element of a hash table contains its +own secondary hash table. There can be any number of levels. In a scripting +language you might see: + + $items{bob}{age}=37 + +The C program below builds this example in uthash: the hash table is called +`items`. It contains one element (`bob`) whose own hash table contains one +element (`age`) with value 37. No special functions are necessary to build +a multi-level hash table. + +While this example represents both levels (`bob` and `age`) using the same +structure, it would also be fine to use two different structure definitions. +It would also be fine if there were three or more levels instead of two. + +.Multi-level hash table +---------------------------------------------------------------------- +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include "uthash.h" + +/* hash of hashes */ +typedef struct item { + char name[10]; + struct item *sub; + int val; + UT_hash_handle hh; +} item_t; + +item_t *items=NULL; + +int main(int argc, char *argvp[]) { + item_t *item1, *item2, *tmp1, *tmp2; + + /* make initial element */ + item_t *i = malloc(sizeof(*i)); + strcpy(i->name, "bob"); + i->sub = NULL; + i->val = 0; + HASH_ADD_STR(items, name, i); + + /* add a sub hash table off this element */ + item_t *s = malloc(sizeof(*s)); + strcpy(s->name, "age"); + s->sub = NULL; + s->val = 37; + HASH_ADD_STR(i->sub, name, s); + + /* iterate over hash elements */ + HASH_ITER(hh, items, item1, tmp1) { + HASH_ITER(hh, item1->sub, item2, tmp2) { + printf("$items{%s}{%s} = %d\n", item1->name, item2->name, item2->val); + } + } + + /* clean up both hash tables */ + HASH_ITER(hh, items, item1, tmp1) { + HASH_ITER(hh, item1->sub, item2, tmp2) { + HASH_DEL(item1->sub, item2); + free(item2); + } + HASH_DEL(items, item1); + free(item1); + } + + return 0; +} +---------------------------------------------------------------------- +The example above is included in `tests/test59.c`. + +[[multihash]] +Items in several hash tables +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A structure can be added to more than one hash table. A few reasons you might do +this include: + +- each hash table may use a different key; +- each hash table may have its own sort order; +- or you might simply use multiple hash tables for grouping purposes. E.g., + you could have users in an `admin_users` and a `users` hash table. + +Your structure needs to have a `UT_hash_handle` field for each hash table to +which it might be added. You can name them anything. E.g., + + UT_hash_handle hh1, hh2; + +Items with multiple keys +~~~~~~~~~~~~~~~~~~~~~~~~ +You might create a hash table keyed on an ID field, and another hash table keyed +on username (if usernames are unique). You can add the same user structure to +both hash tables (without duplication of the structure), allowing lookup of a +user structure by their name or ID. The way to achieve this is to have a +separate `UT_hash_handle` for each hash to which the structure may be added. + +.A structure with two different keys +---------------------------------------------------------------------- +struct my_struct { + int id; /* first key */ + char username[10]; /* second key */ + UT_hash_handle hh1; /* handle for first hash table */ + UT_hash_handle hh2; /* handle for second hash table */ +}; +---------------------------------------------------------------------- + +In the example above, the structure can now be added to two separate hash +tables. In one hash, `id` is its key, while in the other hash, `username` is +its key. (There is no requirement that the two hashes have different key +fields. They could both use the same key, such as `id`). + +Notice the structure has two hash handles (`hh1` and `hh2`). In the code +below, notice that each hash handle is used exclusively with a particular hash +table. (`hh1` is always used with the `users_by_id` hash, while `hh2` is +always used with the `users_by_name` hash table). + +.Two keys on a structure +---------------------------------------------------------------------- + struct my_struct *users_by_id = NULL, *users_by_name = NULL, *s; + int i; + char *name; + + s = malloc(sizeof(struct my_struct)); + s->id = 1; + strcpy(s->username, "thanson"); + + /* add the structure to both hash tables */ + HASH_ADD(hh1, users_by_id, id, sizeof(int), s); + HASH_ADD(hh2, users_by_name, username, strlen(s->username), s); + + /* find user by ID in the "users_by_id" hash table */ + i=1; + HASH_FIND(hh1, users_by_id, &i, sizeof(int), s); + if (s) printf("found id %d: %s\n", i, s->username); + + /* find user by username in the "users_by_name" hash table */ + name = "thanson"; + HASH_FIND(hh2, users_by_name, name, strlen(name), s); + if (s) printf("found user %s: %d\n", name, s->id); +---------------------------------------------------------------------- + + +Sorted insertion of new items +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +If you would like to maintain a sorted hash you have two options. The first +option is to use the HASH_SRT() macro, which will sort any unordered list in +'O(n log(n))'. This is the best strategy if you're just filling up a hash +table with items in random order with a single final HASH_SRT() operation +when all is done. Obviously, this won't do what you want if you need +the list to be in an ordered state at times between insertion of +items. You can use HASH_SRT() after every insertion operation, but that will +yield a computational complexity of 'O(n^2 log n)'. + +The second route you can take is via the in-order add and replace macros. +The `HASH_ADD_INORDER*` macros work just like their `HASH_ADD*` counterparts, but +with an additional comparison-function argument: + + int name_sort(struct my_struct *a, struct my_struct *b) { + return strcmp(a->name,b->name); + } + + HASH_ADD_KEYPTR_INORDER(hh, items, &item->name, strlen(item->name), item, name_sort); + +New items are sorted at insertion time in 'O(n)', thus resulting in a +total computational complexity of 'O(n^2)' for the creation of the hash +table with all items. +For in-order add to work, the list must be in an ordered state before +insertion of the new item. + +Several sort orders +~~~~~~~~~~~~~~~~~~~ +It comes as no surprise that two hash tables can have different sort orders, but +this fact can also be used advantageously to sort the 'same items' in several +ways. This is based on the ability to store a structure in several hash tables. + +Extending the previous example, suppose we have many users. We have added each +user structure to the `users_by_id` hash table and the `users_by_name` hash table. +(To reiterate, this is done without the need to have two copies of each structure.) +Now we can define two sort functions, then use `HASH_SRT`. + + int sort_by_id(struct my_struct *a, struct my_struct *b) { + if (a->id == b->id) return 0; + return (a->id < b->id) ? -1 : 1; + } + + int sort_by_name(struct my_struct *a, struct my_struct *b) { + return strcmp(a->username,b->username); + } + + HASH_SRT(hh1, users_by_id, sort_by_id); + HASH_SRT(hh2, users_by_name, sort_by_name); + +Now iterating over the items in `users_by_id` will traverse them in id-order +while, naturally, iterating over `users_by_name` will traverse them in +name-order. The items are fully forward-and-backward linked in each order. +So even for one set of users, we might store them in two hash tables to provide +easy iteration in two different sort orders. + +Bloom filter (faster misses) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Programs that generate a fair miss rate (`HASH_FIND` that result in `NULL`) may +benefit from the built-in Bloom filter support. This is disabled by default, +because programs that generate only hits would incur a slight penalty from it. +Also, programs that do deletes should not use the Bloom filter. While the +program would operate correctly, deletes diminish the benefit of the filter. +To enable the Bloom filter, simply compile with `-DHASH_BLOOM=n` like: + + -DHASH_BLOOM=27 + +where the number can be any value up to 32 which determines the amount of memory +used by the filter, as shown below. Using more memory makes the filter more +accurate and has the potential to speed up your program by making misses bail +out faster. + +.Bloom filter sizes for selected values of n +[width="50%",cols="10m,30",grid="none",options="header"] +|===================================================================== +| n | Bloom filter size (per hash table) +| 16 | 8 kilobytes +| 20 | 128 kilobytes +| 24 | 2 megabytes +| 28 | 32 megabytes +| 32 | 512 megabytes +|===================================================================== + +Bloom filters are only a performance feature; they do not change the results of +hash operations in any way. The only way to gauge whether or not a Bloom filter +is right for your program is to test it. Reasonable values for the size of the +Bloom filter are 16-32 bits. + +Select +~~~~~~ +An experimental 'select' operation is provided that inserts those items from a +source hash that satisfy a given condition into a destination hash. This +insertion is done with somewhat more efficiency than if this were using +`HASH_ADD`, namely because the hash function is not recalculated for keys of the +selected items. This operation does not remove any items from the source hash. +Rather the selected items obtain dual presence in both hashes. The destination +hash may already have items in it; the selected items are added to it. In order +for a structure to be usable with `HASH_SELECT`, it must have two or more hash +handles. (As described <<multihash,here>>, a structure can exist in many +hash tables at the same time; it must have a separate hash handle for each one). + + user_t *users=NULL, *admins=NULL; /* two hash tables */ + + typedef struct { + int id; + UT_hash_handle hh; /* handle for users hash */ + UT_hash_handle ah; /* handle for admins hash */ + } user_t; + +Now suppose we have added some users, and want to select just the administrator +users who have id's less than 1024. + + #define is_admin(x) (((user_t*)x)->id < 1024) + HASH_SELECT(ah,admins,hh,users,is_admin); + +The first two parameters are the 'destination' hash handle and hash table, the +second two parameters are the 'source' hash handle and hash table, and the last +parameter is the 'select condition'. Here we used a macro `is_admin()` but we +could just as well have used a function. + + int is_admin(void *userv) { + user_t *user = (user_t*)userv; + return (user->id < 1024) ? 1 : 0; + } + +If the select condition always evaluates to true, this operation is +essentially a 'merge' of the source hash into the destination hash. Of course, +the source hash remains unchanged under any use of `HASH_SELECT`. It only adds +items to the destination hash selectively. + +The two hash handles must differ. An example of using `HASH_SELECT` is included +in `tests/test36.c`. + +[[hash_keycompare]] +Specifying an alternate key comparison function +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When you call `HASH_FIND(hh, head, intfield, sizeof(int), out)`, uthash will +first call <<hash_functions,`HASH_FUNCTION`>>`(intfield, sizeof(int), hashvalue)` to +determine the bucket `b` in which to search, and then, for each element `elt` +of bucket `b`, uthash will evaluate +`elt->hh.hashv == hashvalue && elt.hh.keylen == sizeof(int) && HASH_KEYCMP(intfield, elt->hh.key, sizeof(int)) == 0`. +`HASH_KEYCMP` should return `0` to indicate that `elt` is a match and should be +returned, and any non-zero value to indicate that the search for a matching +element should continue. + +By default, uthash defines `HASH_KEYCMP` as an alias for `memcmp`. On platforms +that do not provide `memcmp`, you can substitute your own implementation. + +---------------------------------------------------------------------------- +#undef HASH_KEYCMP +#define HASH_KEYCMP(a,b,len) bcmp(a,b,len) +---------------------------------------------------------------------------- + +Another reason to substitute your own key comparison function is if your "key" is not +trivially comparable. In this case you will also need to substitute your own `HASH_FUNCTION`. + +---------------------------------------------------------------------------- +struct Key { + short s; + /* 2 bytes of padding */ + float f; +}; +/* do not compare the padding bytes; do not use memcmp on floats */ +unsigned key_hash(struct Key *s) { return s + (unsigned)f; } +bool key_equal(struct Key *a, struct Key *b) { return a.s == b.s && a.f == b.f; } + +#define HASH_FUNCTION(s,len,hashv) (hashv) = key_hash((struct Key *)s) +#define HASH_KEYCMP(a,b,len) (!key_equal((struct Key *)a, (struct Key *)b)) +---------------------------------------------------------------------------- + +Another reason to substitute your own key comparison function is to trade off +correctness for raw speed. During its linear search of a bucket, uthash always +compares the 32-bit `hashv` first, and calls `HASH_KEYCMP` only if the `hashv` +compares equal. This means that `HASH_KEYCMP` is called at least once per +successful find. Given a good hash function, we expect the `hashv` comparison to +produce a "false positive" equality only once in four billion times. Therefore, +we expect `HASH_KEYCMP` to produce `0` most of the time. If we expect many +successful finds, and our application doesn't mind the occasional false positive, +we might substitute a no-op comparison function: + +---------------------------------------------------------------------------- +#undef HASH_KEYCMP +#define HASH_KEYCMP(a,b,len) 0 /* occasionally wrong, but very fast */ +---------------------------------------------------------------------------- + +Note: The global equality-comparison function `HASH_KEYCMP` has no relationship +at all to the lessthan-comparison function passed as a parameter to `HASH_ADD_INORDER`. + +[[hash_functions]] +Built-in hash functions +~~~~~~~~~~~~~~~~~~~~~~~ +Internally, a hash function transforms a key into a bucket number. You don't +have to take any action to use the default hash function, currently Jenkins. + +Some programs may benefit from using another of the built-in hash functions. +There is a simple analysis utility included with uthash to help you determine +if another hash function will give you better performance. + +You can use a different hash function by compiling your program with +`-DHASH_FUNCTION=HASH_xyz` where `xyz` is one of the symbolic names listed +below. E.g., + + cc -DHASH_FUNCTION=HASH_BER -o program program.c + +.Built-in hash functions +[width="50%",cols="^5m,20",grid="none",options="header"] +|=============================================================================== +|Symbol | Name +|JEN | Jenkins (default) +|BER | Bernstein +|SAX | Shift-Add-Xor +|OAT | One-at-a-time +|FNV | Fowler/Noll/Vo +|SFH | Paul Hsieh +|=============================================================================== + +Which hash function is best? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +You can easily determine the best hash function for your key domain. To do so, +you'll need to run your program once in a data-collection pass, and then run +the collected data through an included analysis utility. + +First you must build the analysis utility. From the top-level directory, + + cd tests/ + make + +We'll use `test14.c` to demonstrate the data-collection and analysis steps +(here using `sh` syntax to redirect file descriptor 3 to a file): + +.Using keystats +-------------------------------------------------------------------------------- +% cc -DHASH_EMIT_KEYS=3 -I../src -o test14 test14.c +% ./test14 3>test14.keys +% ./keystats test14.keys +fcn ideal% #items #buckets dup% fl add_usec find_usec del-all usec +--- ------ ---------- ---------- ----- -- ---------- ---------- ------------ +SFH 91.6% 1219 256 0% ok 92 131 25 +FNV 90.3% 1219 512 0% ok 107 97 31 +SAX 88.7% 1219 512 0% ok 111 109 32 +OAT 87.2% 1219 256 0% ok 99 138 26 +JEN 86.7% 1219 256 0% ok 87 130 27 +BER 86.2% 1219 256 0% ok 121 129 27 +-------------------------------------------------------------------------------- + +[NOTE] +The number 3 in `-DHASH_EMIT_KEYS=3` is a file descriptor. Any file descriptor +that your program doesn't use for its own purposes can be used instead of 3. +The data-collection mode enabled by `-DHASH_EMIT_KEYS=x` should not be used in +production code. + +Usually, you should just pick the first hash function that is listed. Here, this +is `SFH`. This is the function that provides the most even distribution for +your keys. If several have the same `ideal%`, then choose the fastest one +according to the `find_usec` column. + +keystats column reference +^^^^^^^^^^^^^^^^^^^^^^^^^ +fcn:: + symbolic name of hash function +ideal%:: + The percentage of items in the hash table which can be looked up within an + ideal number of steps. (Further explained below). +#items:: + the number of keys that were read in from the emitted key file +#buckets:: + the number of buckets in the hash after all the keys were added +dup%:: + the percent of duplicate keys encountered in the emitted key file. + Duplicates keys are filtered out to maintain key uniqueness. (Duplicates + are normal. For example, if the application adds an item to a hash, + deletes it, then re-adds it, the key is written twice to the emitted file.) +flags:: + this is either `ok`, or `nx` (noexpand) if the expansion inhibited flag is + set, described in <<expansion,Expansion internals>>. It is not recommended + to use a hash function that has the `noexpand` flag set. +add_usec:: + the clock time in microseconds required to add all the keys to a hash +find_usec:: + the clock time in microseconds required to look up every key in the hash +del-all usec:: + the clock time in microseconds required to delete every item in the hash + +[[ideal]] +ideal% +^^^^^^ + +.What is ideal%? +***************************************************************************** +The 'n' items in a hash are distributed into 'k' buckets. Ideally each bucket +would contain an equal share '(n/k)' of the items. In other words, the maximum +linear position of any item in a bucket chain would be 'n/k' if every bucket is +equally used. If some buckets are overused and others are underused, the +overused buckets will contain items whose linear position surpasses 'n/k'. +Such items are considered non-ideal. + +As you might guess, `ideal%` is the percentage of ideal items in the hash. These +items have favorable linear positions in their bucket chains. As `ideal%` +approaches 100%, the hash table approaches constant-time lookup performance. +***************************************************************************** + +[[hashscan]] +hashscan +~~~~~~~~ +NOTE: This utility is only available on Linux, and on FreeBSD (8.1 and up). + +A utility called `hashscan` is included in the `tests/` directory. It +is built automatically when you run `make` in that directory. This tool +examines a running process and reports on the uthash tables that it finds in +that program's memory. It can also save the keys from each table in a format +that can be fed into `keystats`. + +Here is an example of using `hashscan`. First ensure that it is built: + + cd tests/ + make + +Since `hashscan` needs a running program to inspect, we'll start up a simple +program that makes a hash table and then sleeps as our test subject: + + ./test_sleep & + pid: 9711 + +Now that we have a test program, let's run `hashscan` on it: + + ./hashscan 9711 + Address ideal items buckets mc fl bloom/sat fcn keys saved to + ------------------ ----- -------- -------- -- -- --------- --- ------------- + 0x862e038 81% 10000 4096 11 ok 16 14% JEN + +If we wanted to copy out all its keys for external analysis using `keystats`, +add the `-k` flag: + + ./hashscan -k 9711 + Address ideal items buckets mc fl bloom/sat fcn keys saved to + ------------------ ----- -------- -------- -- -- --------- --- ------------- + 0x862e038 81% 10000 4096 11 ok 16 14% JEN /tmp/9711-0.key + +Now we could run `./keystats /tmp/9711-0.key` to analyze which hash function +has the best characteristics on this set of keys. + +hashscan column reference +^^^^^^^^^^^^^^^^^^^^^^^^^ +Address:: + virtual address of the hash table +ideal:: + The percentage of items in the table which can be looked up within an ideal + number of steps. See <<ideal>> in the `keystats` section. +items:: + number of items in the hash table +buckets:: + number of buckets in the hash table +mc:: + the maximum chain length found in the hash table (uthash usually tries to + keep fewer than 10 items in each bucket, or in some cases a multiple of 10) +fl:: + flags (either `ok`, or `NX` if the expansion-inhibited flag is set) +bloom/sat:: + if the hash table uses a Bloom filter, this is the size (as a power of two) + of the filter (e.g. 16 means the filter is 2^16 bits in size). The second + number is the "saturation" of the bits expressed as a percentage. The lower + the percentage, the more potential benefit to identify cache misses quickly. +fcn:: + symbolic name of hash function +keys saved to:: + file to which keys were saved, if any + +.How hashscan works +***************************************************************************** +When hashscan runs, it attaches itself to the target process, which suspends +the target process momentarily. During this brief suspension, it scans the +target's virtual memory for the signature of a uthash hash table. It then +checks if a valid hash table structure accompanies the signature and reports +what it finds. When it detaches, the target process resumes running normally. +The hashscan is performed "read-only"-- the target process is not modified. +Since hashscan is analyzing a momentary snapshot of a running process, it may +return different results from one run to another. +***************************************************************************** + +[[expansion]] +Expansion internals +~~~~~~~~~~~~~~~~~~~ +Internally this hash manages the number of buckets, with the goal of having +enough buckets so that each one contains only a small number of items. + +.Why does the number of buckets matter? +******************************************************************************** +When looking up an item by its key, this hash scans linearly through the items +in the appropriate bucket. In order for the linear scan to run in constant +time, the number of items in each bucket must be bounded. This is accomplished +by increasing the number of buckets as needed. +******************************************************************************** + +Normal expansion +^^^^^^^^^^^^^^^^ +This hash attempts to keep fewer than 10 items in each bucket. When an item is +added that would cause a bucket to exceed this number, the number of buckets in +the hash is doubled and the items are redistributed into the new buckets. In an +ideal world, each bucket will then contain half as many items as it did before. + +Bucket expansion occurs automatically and invisibly as needed. There is +no need for the application to know when it occurs. + +Per-bucket expansion threshold +++++++++++++++++++++++++++++++ +Normally all buckets share the same threshold (10 items) at which point bucket +expansion is triggered. During the process of bucket expansion, uthash can +adjust this expansion-trigger threshold on a per-bucket basis if it sees that +certain buckets are over-utilized. + +When this threshold is adjusted, it goes from 10 to a multiple of 10 (for that +particular bucket). The multiple is based on how many times greater the actual +chain length is than the ideal length. It is a practical measure to reduce +excess bucket expansion in the case where a hash function over-utilizes a few +buckets but has good overall distribution. However, if the overall distribution +gets too bad, uthash changes tactics. + +Inhibited expansion +^^^^^^^^^^^^^^^^^^^ +You usually don't need to know or worry about this, particularly if you used +the `keystats` utility during development to select a good hash for your keys. + +A hash function may yield an uneven distribution of items across the buckets. +In moderation this is not a problem. Normal bucket expansion takes place as +the chain lengths grow. But when significant imbalance occurs (because the hash +function is not well suited to the key domain), bucket expansion may be +ineffective at reducing the chain lengths. + +Imagine a very bad hash function which always puts every item in bucket 0. No +matter how many times the number of buckets is doubled, the chain length of +bucket 0 stays the same. In a situation like this, the best behavior is to +stop expanding, and accept 'O(n)' lookup performance. This is what uthash +does. It degrades gracefully if the hash function is ill-suited to the keys. + +If two consecutive bucket expansions yield `ideal%` values below 50%, uthash +inhibits expansion for that hash table. Once set, the 'bucket expansion +inhibited' flag remains in effect as long as the hash has items in it. +Inhibited expansion may cause `HASH_FIND` to exhibit worse than constant-time +performance. + +Diagnostic hooks +^^^^^^^^^^^^^^^^ + +There are two "notification" hooks which get executed if uthash is +expanding buckets, or setting the 'bucket expansion inhibited' flag. +There is no need for the application to set these hooks or take action in +response to these events. They are mainly for diagnostic purposes. +Normally both of these hooks are undefined and thus compile away to nothing. + +The `uthash_expand_fyi` hook can be defined to execute code whenever +uthash performs a bucket expansion. + +---------------------------------------------------------------------------- +#undef uthash_expand_fyi +#define uthash_expand_fyi(tbl) printf("expanded to %u buckets\n", tbl->num_buckets) +---------------------------------------------------------------------------- + +The `uthash_noexpand_fyi` hook can be defined to execute code whenever +uthash sets the 'bucket expansion inhibited' flag. + +---------------------------------------------------------------------------- +#undef uthash_noexpand_fyi +#define uthash_noexpand_fyi(tbl) printf("warning: bucket expansion inhibited\n") +---------------------------------------------------------------------------- + +Hooks +~~~~~ +You don't need to use these hooks -- they are only here if you want to modify +the behavior of uthash. Hooks can be used to replace standard library functions +that might be unavailable on some platforms, to change how uthash allocates +memory, or to run code in response to certain internal events. + +The `uthash.h` header will define these hooks to default values, unless they +are already defined. It is safe either to `#undef` and redefine them +after including `uthash.h`, or to define them before inclusion; for +example, by passing `-Duthash_malloc=my_malloc` on the command line. + +Specifying alternate memory management functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +By default, uthash uses `malloc` and `free` to manage memory. +If your application uses its own custom allocator, uthash can use them too. + +---------------------------------------------------------------------------- +#include "uthash.h" + +/* undefine the defaults */ +#undef uthash_malloc +#undef uthash_free + +/* re-define, specifying alternate functions */ +#define uthash_malloc(sz) my_malloc(sz) +#define uthash_free(ptr,sz) my_free(ptr) + +... +---------------------------------------------------------------------------- + +Notice that `uthash_free` receives two parameters. The `sz` parameter is for +convenience on embedded platforms that manage their own memory. + +Specifying alternate standard library functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Uthash also uses `strlen` (in the `HASH_FIND_STR` convenience macro, for +example) and `memset` (used only for zeroing memory). On platforms that do not +provide these functions, you can substitute your own implementations. + +---------------------------------------------------------------------------- +#undef uthash_bzero +#define uthash_bzero(a,len) my_bzero(a,len) + +#undef uthash_strlen +#define uthash_strlen(s) my_strlen(s) +---------------------------------------------------------------------------- + +Out of memory +^^^^^^^^^^^^^ +If memory allocation fails (i.e., the `uthash_malloc` function returns `NULL`), +the default behavior is to terminate the process by calling `exit(-1)`. This +can be modified by re-defining the `uthash_fatal` macro. + +---------------------------------------------------------------------------- +#undef uthash_fatal +#define uthash_fatal(msg) my_fatal_function(msg) +---------------------------------------------------------------------------- + +The fatal function should terminate the process or `longjmp` back to a safe +place. Note that an allocation failure may leave allocated memory that cannot +be recovered. After `uthash_fatal`, the hash table object should be considered +unusable; it might not be safe even to run `HASH_CLEAR` on the hash table +when it is in this state. + +To enable "returning a failure" if memory cannot be allocated, define the +macro `HASH_NONFATAL_OOM` before including the `uthash.h` header file. In this +case, `uthash_fatal` is not used; instead, each allocation failure results in +a single call to `uthash_nonfatal_oom(elt)` where `elt` is the address of the +element whose insertion triggered the failure. The default behavior of +`uthash_nonfatal_oom` is a no-op. + +---------------------------------------------------------------------------- +#undef uthash_nonfatal_oom +#define uthash_nonfatal_oom(elt) perhaps_recover((element_t *) elt) +---------------------------------------------------------------------------- + +Before the call to `uthash_nonfatal_oom`, the hash table is rolled back +to the state it was in prior to the problematic insertion; no memory is +leaked. It is safe to `throw` or `longjmp` out of the `uthash_nonfatal_oom` +handler. + +The `elt` argument will be of the correct pointer-to-element type, unless +`uthash_nonfatal_oom` is invoked from `HASH_SELECT`, in which case it will +be of `void*` type and must be cast before using. In any case, `elt->hh.tbl` +will be `NULL`. + +Allocation failure is possible only when adding elements to the hash table +(including the `ADD`, `REPLACE`, and `SELECT` operations). +`uthash_free` is not allowed to fail. + +Debug mode +~~~~~~~~~~ +If a program that uses this hash is compiled with `-DHASH_DEBUG=1`, a special +internal consistency-checking mode is activated. In this mode, the integrity +of the whole hash is checked following every add or delete operation. This is +for debugging the uthash software only, not for use in production code. + +In the `tests/` directory, running `make debug` will run all the tests in +this mode. + +In this mode, any internal errors in the hash data structure will cause a +message to be printed to `stderr` and the program to exit. + +The `UT_hash_handle` data structure includes `next`, `prev`, `hh_next` and +`hh_prev` fields. The former two fields determine the "application" ordering +(that is, insertion order-- the order the items were added). The latter two +fields determine the "bucket chain" order. These link the `UT_hash_handles` +together in a doubly-linked list that is a bucket chain. + +Checks performed in `-DHASH_DEBUG=1` mode: + +- the hash is walked in its entirety twice: once in 'bucket' order and a + second time in 'application' order +- the total number of items encountered in both walks is checked against the + stored number +- during the walk in 'bucket' order, each item's `hh_prev` pointer is compared + for equality with the last visited item +- during the walk in 'application' order, each item's `prev` pointer is compared + for equality with the last visited item + +.Macro debugging: +******************************************************************************** +Sometimes it's difficult to interpret a compiler warning on a line which +contains a macro call. In the case of uthash, one macro can expand to dozens of +lines. In this case, it is helpful to expand the macros and then recompile. +By doing so, the warning message will refer to the exact line within the macro. + +Here is an example of how to expand the macros and then recompile. This uses the +`test1.c` program in the `tests/` subdirectory. + + gcc -E -I../src test1.c > /tmp/a.c + egrep -v '^#' /tmp/a.c > /tmp/b.c + indent /tmp/b.c + gcc -o /tmp/b /tmp/b.c + +The last line compiles the original program (test1.c) with all macros expanded. +If there was a warning, the referenced line number can be checked in `/tmp/b.c`. +******************************************************************************** + +Thread safety +~~~~~~~~~~~~~ +You can use uthash in a threaded program. But you must do the locking. Use a +read-write lock to protect against concurrent writes. It is ok to have +concurrent readers (since uthash 1.5). + +For example using pthreads you can create an rwlock like this: + + pthread_rwlock_t lock; + if (pthread_rwlock_init(&lock,NULL) != 0) fatal("can't create rwlock"); + +Then, readers must acquire the read lock before doing any `HASH_FIND` calls or +before iterating over the hash elements: + + if (pthread_rwlock_rdlock(&lock) != 0) fatal("can't get rdlock"); + HASH_FIND_INT(elts, &i, e); + pthread_rwlock_unlock(&lock); + +Writers must acquire the exclusive write lock before doing any update. Add, +delete, and sort are all updates that must be locked. + + if (pthread_rwlock_wrlock(&lock) != 0) fatal("can't get wrlock"); + HASH_DEL(elts, e); + pthread_rwlock_unlock(&lock); + +If you prefer, you can use a mutex instead of a read-write lock, but this will +reduce reader concurrency to a single thread at a time. + +An example program using uthash with a read-write lock is included in +`tests/threads/test1.c`. + +[[Macro_reference]] +Macro reference +--------------- + +Convenience macros +~~~~~~~~~~~~~~~~~~ +The convenience macros do the same thing as the generalized macros, but +require fewer arguments. + +In order to use the convenience macros, + +1. the structure's `UT_hash_handle` field must be named `hh`, and +2. for add or find, the key field must be of type `int` or `char[]` or pointer + +.Convenience macros +[width="90%",cols="10m,30m",grid="none",options="header"] +|=============================================================================== +|macro | arguments +|HASH_ADD_INT | (head, keyfield_name, item_ptr) +|HASH_REPLACE_INT | (head, keyfiled_name, item_ptr,replaced_item_ptr) +|HASH_FIND_INT | (head, key_ptr, item_ptr) +|HASH_ADD_STR | (head, keyfield_name, item_ptr) +|HASH_REPLACE_STR | (head,keyfield_name, item_ptr, replaced_item_ptr) +|HASH_FIND_STR | (head, key_ptr, item_ptr) +|HASH_ADD_PTR | (head, keyfield_name, item_ptr) +|HASH_REPLACE_PTR | (head, keyfield_name, item_ptr, replaced_item_ptr) +|HASH_FIND_PTR | (head, key_ptr, item_ptr) +|HASH_DEL | (head, item_ptr) +|HASH_SORT | (head, cmp) +|HASH_COUNT | (head) +|=============================================================================== + +General macros +~~~~~~~~~~~~~~ + +These macros add, find, delete and sort the items in a hash. You need to +use the general macros if your `UT_hash_handle` is named something other +than `hh`, or if your key's data type isn't `int` or `char[]`. + +.General macros +[width="90%",cols="10m,30m",grid="none",options="header"] +|=============================================================================== +|macro | arguments +|HASH_ADD | (hh_name, head, keyfield_name, key_len, item_ptr) +|HASH_ADD_BYHASHVALUE | (hh_name, head, keyfield_name, key_len, hashv, item_ptr) +|HASH_ADD_KEYPTR | (hh_name, head, key_ptr, key_len, item_ptr) +|HASH_ADD_KEYPTR_BYHASHVALUE | (hh_name, head, key_ptr, key_len, hashv, item_ptr) +|HASH_ADD_INORDER | (hh_name, head, keyfield_name, key_len, item_ptr, cmp) +|HASH_ADD_BYHASHVALUE_INORDER | (hh_name, head, keyfield_name, key_len, hashv, item_ptr, cmp) +|HASH_ADD_KEYPTR_INORDER | (hh_name, head, key_ptr, key_len, item_ptr, cmp) +|HASH_ADD_KEYPTR_BYHASHVALUE_INORDER | (hh_name, head, key_ptr, key_len, hashv, item_ptr, cmp) +|HASH_REPLACE | (hh_name, head, keyfield_name, key_len, item_ptr, replaced_item_ptr) +|HASH_REPLACE_BYHASHVALUE | (hh_name, head, keyfield_name, key_len, hashv, item_ptr, replaced_item_ptr) +|HASH_REPLACE_INORDER | (hh_name, head, keyfield_name, key_len, item_ptr, replaced_item_ptr, cmp) +|HASH_REPLACE_BYHASHVALUE_INORDER | (hh_name, head, keyfield_name, key_len, hashv, item_ptr, replaced_item_ptr, cmp) +|HASH_FIND | (hh_name, head, key_ptr, key_len, item_ptr) +|HASH_FIND_BYHASHVALUE | (hh_name, head, key_ptr, key_len, hashv, item_ptr) +|HASH_DELETE | (hh_name, head, item_ptr) +|HASH_VALUE | (key_ptr, key_len, hashv) +|HASH_SRT | (hh_name, head, cmp) +|HASH_CNT | (hh_name, head) +|HASH_CLEAR | (hh_name, head) +|HASH_SELECT | (dst_hh_name, dst_head, src_hh_name, src_head, condition) +|HASH_ITER | (hh_name, head, item_ptr, tmp_item_ptr) +|HASH_OVERHEAD | (hh_name, head) +|=============================================================================== + +[NOTE] +`HASH_ADD_KEYPTR` is used when the structure contains a pointer to the +key, rather than the key itself. + +The `HASH_VALUE` and `..._BYHASHVALUE` macros are a performance mechanism mainly for the +special case of having different structures, in different hash tables, having +identical keys. It allows the hash value to be obtained once and then passed +in to the `..._BYHASHVALUE` macros, saving the expense of re-computing the hash value. + + +Argument descriptions +^^^^^^^^^^^^^^^^^^^^^ +hh_name:: + name of the `UT_hash_handle` field in the structure. Conventionally called + `hh`. +head:: + the structure pointer variable which acts as the "head" of the hash. So + named because it initially points to the first item that is added to the hash. +keyfield_name:: + the name of the key field in the structure. (In the case of a multi-field + key, this is the first field of the key). If you're new to macros, it + might seem strange to pass the name of a field as a parameter. See + <<validc,note>>. +key_len:: + the length of the key field in bytes. E.g. for an integer key, this is + `sizeof(int)`, while for a string key it's `strlen(key)`. (For a + multi-field key, see <<multifield_note,this note>>.) +key_ptr:: + for `HASH_FIND`, this is a pointer to the key to look up in the hash + (since it's a pointer, you can't directly pass a literal value here). For + `HASH_ADD_KEYPTR`, this is the address of the key of the item being added. +hashv:: + the hash value of the provided key. This is an input parameter for the + `..._BYHASHVALUE` macros, and an output parameter for `HASH_VALUE`. + Reusing a cached hash value can be a performance optimization if + you're going to do repeated lookups for the same key. +item_ptr:: + pointer to the structure being added, deleted, replaced, or looked up, or the current + pointer during iteration. This is an input parameter for the `HASH_ADD`, + `HASH_DELETE`, and `HASH_REPLACE` macros, and an output parameter for `HASH_FIND` + and `HASH_ITER`. (When using `HASH_ITER` to iterate, `tmp_item_ptr` + is another variable of the same type as `item_ptr`, used internally). +replaced_item_ptr:: + used in `HASH_REPLACE` macros. This is an output parameter that is set to point + to the replaced item (if no item is replaced it is set to NULL). +cmp:: + pointer to comparison function which accepts two arguments (pointers to + items to compare) and returns an int specifying whether the first item + should sort before, equal to, or after the second item (like `strcmp`). +condition:: + a function or macro which accepts a single argument (a void pointer to a + structure, which needs to be cast to the appropriate structure type). The + function or macro should evaluate to a non-zero value if the + structure should be "selected" for addition to the destination hash. + +// vim: set tw=80 wm=2 syntax=asciidoc: diff --git a/doc/utarray.txt b/doc/utarray.txt new file mode 100644 index 000000000..25d94e260 --- /dev/null +++ b/doc/utarray.txt @@ -0,0 +1,383 @@ +utarray: dynamic array macros for C +=================================== +Troy D. Hanson <tdh@tkhanson.net> +v2.1.0, December 2018 + +Here's a link back to the https://github.com/troydhanson/uthash[GitHub project page]. + +Introduction +------------ +A set of general-purpose dynamic array macros for C structures are included with +uthash in `utarray.h`. To use these macros in your own C program, just +copy `utarray.h` into your source directory and use it in your programs. + + #include "utarray.h" + +The dynamic array supports basic operations such as push, pop, and erase on the +array elements. These array elements can be any simple datatype or structure. +The array <<operations,operations>> are based loosely on the C++ STL vector methods. + +Internally the dynamic array contains a contiguous memory region into which +the elements are copied. This buffer is grown as needed using `realloc` to +accommodate all the data that is pushed into it. + +Download +~~~~~~~~ +To download the `utarray.h` header file, +follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file, +then look in the src/ sub-directory. + +BSD licensed +~~~~~~~~~~~~ +This software is made available under the +link:license.html[revised BSD license]. +It is free and open source. + +Platforms +~~~~~~~~~ +The 'utarray' macros have been tested on: + + * Linux, + * Mac OS X, + * Windows, using Visual Studio 2008 and Visual Studio 2010 + +Usage +----- + +Declaration +~~~~~~~~~~~ + +The array itself has the data type `UT_array`, regardless of the type of +elements to be stored in it. It is declared like, + + UT_array *nums; + +New and free +~~~~~~~~~~~~ +The next step is to create the array using `utarray_new`. Later when you're +done with the array, `utarray_free` will free it and all its elements. + +Push, pop, etc +~~~~~~~~~~~~~~ +The central features of the utarray involve putting elements into it, taking +them out, and iterating over them. There are several <<operations,operations>> +to pick from that deal with either single elements or ranges of elements at a +time. In the examples below we will use only the push operation to insert +elements. + +Elements +-------- + +Support for dynamic arrays of integers or strings is especially easy. These are +best shown by example: + +Integers +~~~~~~~~ +This example makes a utarray of integers, pushes 0-9 into it, then prints it. +Lastly it frees it. + +.Integer elements +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utarray.h" + +int main() { + UT_array *nums; + int i, *p; + + utarray_new(nums,&ut_int_icd); + for(i=0; i < 10; i++) utarray_push_back(nums,&i); + + for(p=(int*)utarray_front(nums); + p!=NULL; + p=(int*)utarray_next(nums,p)) { + printf("%d\n",*p); + } + + utarray_free(nums); + + return 0; +} +------------------------------------------------------------------------------- + +The second argument to `utarray_push_back` is always a 'pointer' to the type +(so a literal cannot be used). So for integers, it is an `int*`. + +Strings +~~~~~~~ +In this example we make a utarray of strings, push two strings into it, print +it and free it. + +.String elements +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utarray.h" + +int main() { + UT_array *strs; + char *s, **p; + + utarray_new(strs,&ut_str_icd); + + s = "hello"; utarray_push_back(strs, &s); + s = "world"; utarray_push_back(strs, &s); + p = NULL; + while ( (p=(char**)utarray_next(strs,p))) { + printf("%s\n",*p); + } + + utarray_free(strs); + + return 0; +} +------------------------------------------------------------------------------- + +In this example, since the element is a `char*`, we pass a pointer to it +(`char**`) as the second argument to `utarray_push_back`. Note that "push" makes +a copy of the source string and pushes that copy into the array. + +About UT_icd +~~~~~~~~~~~~ + +Arrays be made of any type of element, not just integers and strings. The +elements can be basic types or structures. Unless you're dealing with integers +and strings (which use pre-defined `ut_int_icd` and `ut_str_icd`), you'll need +to define a `UT_icd` helper structure. This structure contains everything that +utarray needs to initialize, copy or destruct elements. + + typedef struct { + size_t sz; + init_f *init; + ctor_f *copy; + dtor_f *dtor; + } UT_icd; + +The three function pointers `init`, `copy`, and `dtor` have these prototypes: + + typedef void (ctor_f)(void *dst, const void *src); + typedef void (dtor_f)(void *elt); + typedef void (init_f)(void *elt); + +The `sz` is just the size of the element being stored in the array. + +The `init` function will be invoked whenever utarray needs to initialize an +empty element. This only happens as a byproduct of `utarray_resize` or +`utarray_extend_back`. If `init` is `NULL`, it defaults to zero filling the +new element using memset. + +The `copy` function is used whenever an element is copied into the array. +It is invoked during `utarray_push_back`, `utarray_insert`, `utarray_inserta`, +or `utarray_concat`. If `copy` is `NULL`, it defaults to a bitwise copy using +memcpy. + +The `dtor` function is used to clean up an element that is being removed from +the array. It may be invoked due to `utarray_resize`, `utarray_pop_back`, +`utarray_erase`, `utarray_clear`, `utarray_done` or `utarray_free`. If the +elements need no cleanup upon destruction, `dtor` may be `NULL`. + +Scalar types +~~~~~~~~~~~~ + +The next example uses `UT_icd` with all its defaults to make a utarray of +`long` elements. This example pushes two longs, prints them, and frees the +array. + +.long elements +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utarray.h" + +UT_icd long_icd = {sizeof(long), NULL, NULL, NULL }; + +int main() { + UT_array *nums; + long l, *p; + utarray_new(nums, &long_icd); + + l=1; utarray_push_back(nums, &l); + l=2; utarray_push_back(nums, &l); + + p=NULL; + while( (p=(long*)utarray_next(nums,p))) printf("%ld\n", *p); + + utarray_free(nums); + return 0; +} +------------------------------------------------------------------------------- + +Structures +~~~~~~~~~~ + +Structures can be used as utarray elements. If the structure requires no +special effort to initialize, copy or destruct, we can use `UT_icd` with all +its defaults. This example shows a structure that consists of two integers. Here +we push two values, print them and free the array. + +.Structure (simple) +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utarray.h" + +typedef struct { + int a; + int b; +} intpair_t; + +UT_icd intpair_icd = {sizeof(intpair_t), NULL, NULL, NULL}; + +int main() { + + UT_array *pairs; + intpair_t ip, *p; + utarray_new(pairs,&intpair_icd); + + ip.a=1; ip.b=2; utarray_push_back(pairs, &ip); + ip.a=10; ip.b=20; utarray_push_back(pairs, &ip); + + for(p=(intpair_t*)utarray_front(pairs); + p!=NULL; + p=(intpair_t*)utarray_next(pairs,p)) { + printf("%d %d\n", p->a, p->b); + } + + utarray_free(pairs); + return 0; +} +------------------------------------------------------------------------------- + +The real utility of `UT_icd` is apparent when the elements of the utarray are +structures that require special work to initialize, copy or destruct. + +For example, when a structure contains pointers to related memory areas that +need to be copied when the structure is copied (and freed when the structure is +freed), we can use custom `init`, `copy`, and `dtor` members in the `UT_icd`. + +Here we take an example of a structure that contains an integer and a string. +When this element is copied (such as when an element is pushed into the array), +we want to "deep copy" the `s` pointer (so the original element and the new +element point to their own copies of `s`). When an element is destructed, we +want to "deep free" its copy of `s`. Lastly, this example is written to work +even if `s` has the value `NULL`. + +.Structure (complex) +------------------------------------------------------------------------------- +#include <stdio.h> +#include <stdlib.h> +#include "utarray.h" + +typedef struct { + int a; + char *s; +} intchar_t; + +void intchar_copy(void *_dst, const void *_src) { + intchar_t *dst = (intchar_t*)_dst, *src = (intchar_t*)_src; + dst->a = src->a; + dst->s = src->s ? strdup(src->s) : NULL; +} + +void intchar_dtor(void *_elt) { + intchar_t *elt = (intchar_t*)_elt; + if (elt->s) free(elt->s); +} + +UT_icd intchar_icd = {sizeof(intchar_t), NULL, intchar_copy, intchar_dtor}; + +int main() { + UT_array *intchars; + intchar_t ic, *p; + utarray_new(intchars, &intchar_icd); + + ic.a=1; ic.s="hello"; utarray_push_back(intchars, &ic); + ic.a=2; ic.s="world"; utarray_push_back(intchars, &ic); + + p=NULL; + while( (p=(intchar_t*)utarray_next(intchars,p))) { + printf("%d %s\n", p->a, (p->s ? p->s : "null")); + } + + utarray_free(intchars); + return 0; +} + +------------------------------------------------------------------------------- + +[[operations]] +Reference +--------- +This table lists all the utarray operations. These are loosely based on the C++ +vector class. + +Operations +~~~~~~~~~~ + +[width="100%",cols="50<m,40<",grid="none",options="none"] +|=============================================================================== +| utarray_new(UT_array *a, UT_icd *icd)| allocate a new array +| utarray_free(UT_array *a) | free an allocated array +| utarray_init(UT_array *a,UT_icd *icd)| init an array (non-alloc) +| utarray_done(UT_array *a) | dispose of an array (non-allocd) +| utarray_reserve(UT_array *a,int n) | ensure space available for 'n' more elements +| utarray_push_back(UT_array *a,void *p) | push element p onto a +| utarray_pop_back(UT_array *a) | pop last element from a +| utarray_extend_back(UT_array *a) | push empty element onto a +| utarray_len(UT_array *a) | get length of a +| utarray_eltptr(UT_array *a,int j) | get pointer of element from index +| utarray_eltidx(UT_array *a,void *e) | get index of element from pointer +| utarray_insert(UT_array *a,void *p, int j) | insert element p to index j +| utarray_inserta(UT_array *a,UT_array *w, int j) | insert array w into array a at index j +| utarray_resize(UT_array *dst,int num) | extend or shrink array to num elements +| utarray_concat(UT_array *dst,UT_array *src) | copy src to end of dst array +| utarray_erase(UT_array *a,int pos,int len) | remove len elements from a[pos]..a[pos+len-1] +| utarray_clear(UT_array *a) | clear all elements from a, setting its length to zero +| utarray_sort(UT_array *a,cmpfcn *cmp) | sort elements of a using comparison function +| utarray_find(UT_array *a,void *v, cmpfcn *cmp) | find element v in utarray (must be sorted) +| utarray_front(UT_array *a) | get first element of a +| utarray_next(UT_array *a,void *e) | get element of a following e (front if e is NULL) +| utarray_prev(UT_array *a,void *e) | get element of a before e (back if e is NULL) +| utarray_back(UT_array *a) | get last element of a +|=============================================================================== + +Notes +~~~~~ + +1. `utarray_new` and `utarray_free` are used to allocate a new array and free it, + while `utarray_init` and `utarray_done` can be used if the UT_array is already + allocated and just needs to be initialized or have its internal resources + freed. + +2. `utarray_reserve` takes the "delta" of elements to reserve, not the total + desired capacity of the array. This differs from the C++ STL "reserve" notion. + +3. `utarray_sort` expects a comparison function having the usual `strcmp`-like + convention where it accepts two elements (a and b) and returns a negative + value if a precedes b, 0 if a and b sort equally, and positive if b precedes a. + This is an example of a comparison function: + + int intsort(const void *a, const void *b) { + int _a = *(const int *)a; + int _b = *(const int *)b; + return (_a < _b) ? -1 : (_a > _b); + } + +4. `utarray_find` uses a binary search to locate an element having a certain value + according to the given comparison function. The utarray must be first sorted + using the same comparison function. An example of using `utarray_find` with + a utarray of strings is included in `tests/test61.c`. + +5. A 'pointer' to a particular element (obtained using `utarray_eltptr` or + `utarray_front`, `utarray_next`, `utarray_prev`, `utarray_back`) becomes invalid whenever + another element is inserted into the utarray. This is because the internal + memory management may need to `realloc` the element storage to a new address. + For this reason, it's usually better to refer to an element by its integer + 'index' in code whose duration may include element insertion. + +6. To override the default out-of-memory handling behavior (which calls `exit(-1)`), + override the `utarray_oom()` macro before including `utarray.h`. + For example, + + #define utarray_oom() do { longjmp(error_handling_location); } while (0) + ... + #include "utarray.h" + +// vim: set nowrap syntax=asciidoc: diff --git a/doc/uthash-mini.png b/doc/uthash-mini.png Binary files differnew file mode 100644 index 000000000..9536b2a78 --- /dev/null +++ b/doc/uthash-mini.png diff --git a/doc/uthash-mini.svg b/doc/uthash-mini.svg new file mode 100644 index 000000000..ea2d0749d --- /dev/null +++ b/doc/uthash-mini.svg @@ -0,0 +1,288 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://web.resource.org/cc/" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" + width="118.44112" + height="22.655222" + id="svg2018" + sodipodi:version="0.32" + inkscape:version="0.44" + version="1.0" + sodipodi:docbase="/home/thanson/code/uthash/trunk/doc/html/img" + sodipodi:docname="uthash-mini.svg"> + <defs + id="defs3"> + <linearGradient + inkscape:collect="always" + id="linearGradient3964"> + <stop + style="stop-color:#00eb00;stop-opacity:1;" + offset="0" + id="stop3966" /> + <stop + style="stop-color:#00eb00;stop-opacity:0;" + offset="1" + id="stop3968" /> + </linearGradient> + <radialGradient + inkscape:collect="always" + xlink:href="#linearGradient3964" + id="radialGradient3996" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1,0,0,0.237347,0,36.5688)" + cx="176.99219" + cy="47.949429" + fx="176.99219" + fy="47.949429" + r="78.257812" /> + <linearGradient + id="linearGradient12743"> + <stop + style="stop-color:#99e1fa;stop-opacity:1;" + offset="0" + id="stop12745" /> + <stop + id="stop12753" + offset="0" + style="stop-color:#99e1fa;stop-opacity:0.49803922;" /> + <stop + style="stop-color:#99e1fa;stop-opacity:0;" + offset="1" + id="stop12747" /> + </linearGradient> + <radialGradient + inkscape:collect="always" + xlink:href="#linearGradient12743" + id="radialGradient12751" + cx="165.91866" + cy="45.584854" + fx="165.91866" + fy="45.584854" + r="56.51194" + gradientTransform="matrix(0.268675,0,0,0.16215,17.28599,40.67469)" + gradientUnits="userSpaceOnUse" /> + <marker + inkscape:stockid="Arrow1Send" + orient="auto" + refY="0" + refX="0" + id="Arrow1Send" + style="overflow:visible"> + <path + id="path7749" + d="M 0,0 L 5,-5 L -12.5,0 L 5,5 L 0,0 z " + style="fill-rule:evenodd;stroke:black;stroke-width:1pt;marker-start:none" + transform="scale(-0.2,-0.2)" /> + </marker> + <marker + inkscape:stockid="StopM" + orient="auto" + refY="0" + refX="0" + id="StopM" + style="overflow:visible"> + <path + id="path7651" + d="M 0,5.65 L 0,-5.65" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1pt" + transform="scale(0.4,0.4)" /> + </marker> + </defs> + <sodipodi:namedview + id="base" + pagecolor="#ffffff" + bordercolor="#666666" + borderopacity="1.0" + inkscape:pageopacity="0.0" + inkscape:pageshadow="2" + inkscape:zoom="1.2" + inkscape:cx="160" + inkscape:cy="90" + inkscape:document-units="mm" + inkscape:current-layer="layer1" + inkscape:window-width="916" + inkscape:window-height="626" + inkscape:window-x="5" + inkscape:window-y="73" /> + <metadata + id="metadata2022"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + </cc:Work> + </rdf:RDF> + </metadata> + <g + inkscape:label="Layer 1" + inkscape:groupmode="layer" + id="layer1" + transform="translate(-17.9166,-36.67108)"> + <rect + style="opacity:1;fill:#393be9;fill-opacity:1;stroke:#f18a00;stroke-width:1.51941979;stroke-linecap:round;stroke-linejoin:bevel;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect3981" + width="116.92171" + height="21.135801" + x="18.67631" + y="37.430794" + rx="7.8295798" + ry="5.3735089" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90" /> + <flowRoot + transform="matrix(0.449676,0,0,0.449676,-20.8252,25.84477)" + style="font-size:47.99999619px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#faf599;fill-opacity:1;stroke:#f3bf33;stroke-opacity:1;font-family:Bitstream Vera Sans" + id="flowRoot3988" + xml:space="preserve" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90"><flowRegion + style="fill:url(#radialGradient3996);fill-opacity:1" + id="flowRegion3990"><rect + style="font-size:47.99999619px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#faf599;fill-opacity:1;stroke:#f3bf33;stroke-opacity:1;font-family:Bitstream Vera Sans" + y="18" + x="94.666664" + height="61.333332" + width="321.33334" + id="rect3992" /></flowRegion><flowPara + id="flowPara7831">ut hash</flowPara></flowRoot> <rect + style="opacity:1;fill:url(#radialGradient12751);fill-opacity:1;stroke:none;stroke-width:2.82532263;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect10995" + width="30.366741" + height="18.326834" + x="46.68087" + y="38.902855" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90" /> + <g + id="g7808" + transform="matrix(0.217052,0,0,0.217052,-20.55641,38.41883)" + inkscape:export-filename="/home/thanson/code/uthash/doc/html/img/logo.png" + inkscape:export-xdpi="90" + inkscape:export-ydpi="90"> + <rect + y="37.730064" + x="382.39673" + height="18.405188" + width="23.206543" + id="rect4882" + style="opacity:1;fill:#9be5ea;fill-opacity:1;stroke:black;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <rect + style="opacity:1;fill:#d48c21;fill-opacity:0.97777776;stroke:black;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect4886" + width="23.206543" + height="18.405188" + x="416.39673" + y="37.730064" /> + <path + inkscape:connector-type="polyline" + id="path4890" + d="M 372.60327,46.932658 L 381.39673,46.932658" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path4892" + d="M 406.60327,46.932658 L 415.39673,46.932658" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <rect + y="9.7300644" + x="348.39673" + height="18.405188" + width="23.206543" + id="rect4896" + style="opacity:1;fill:#79c71a;fill-opacity:1;stroke:black;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <rect + style="opacity:1;fill:#f5e1a2;fill-opacity:1;stroke:black;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect4898" + width="23.206543" + height="18.405188" + x="382.39673" + y="9.7300644" /> + <path + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 372.60327,18.932658 L 381.39673,18.932658" + id="path4902" + inkscape:connector-type="polyline" /> + <rect + y="14.207111" + x="318.45328" + height="10.1194" + width="10.1194" + id="rect4906" + style="opacity:1;fill:#1336e6;fill-opacity:1;stroke:none;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path5789" + d="M 328.57268,19.220474 L 347.39673,19.048081" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 328.57268,19.220474 L 347.39673,19.048081" + id="path5795" + inkscape:connector-type="polyline" /> + <rect + y="37.789951" + x="348.20978" + height="18.405188" + width="23.206543" + id="rect5803" + style="opacity:1;fill:#e5e5e5;fill-opacity:1;stroke:black;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <rect + y="42.267002" + x="318.26633" + height="10.1194" + width="10.1194" + id="rect5805" + style="opacity:1;fill:#1336e6;fill-opacity:1;stroke:none;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + <path + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 328.38572,47.280365 L 347.20977,47.107972" + id="path5807" + inkscape:connector-type="polyline" /> + <rect + style="opacity:1;fill:#ddf9ed;fill-opacity:1;stroke:black;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect5809" + width="23.206543" + height="18.405188" + x="348.20978" + y="63.720913" /> + <rect + style="opacity:1;fill:#1336e6;fill-opacity:1;stroke:none;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" + id="rect5811" + width="10.1194" + height="10.1194" + x="318.26633" + y="68.197968" /> + <path + inkscape:connector-type="polyline" + id="path5813" + d="M 328.38572,73.211328 L 347.20977,73.038935" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:black;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path5833" + d="M 323.47927,24.326511 L 323.35974,42.267002" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#2f29df;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + inkscape:connector-type="polyline" + id="path5835" + d="M 323.32603,52.386402 L 323.32603,68.197968" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#2f29df;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + <path + id="path6716" + d="M 429.08836,47.11641 L 394.37307,18.527349 L 394.37307,49.158485 L 359.65778,18.527349 L 359.65778,50.179523 L 359.65778,75.70547" + style="fill:none;fill-opacity:0.75;fill-rule:evenodd;stroke:#f3bf33;stroke-width:2;stroke-linecap:butt;stroke-linejoin:miter;marker-start:url(#StopM);marker-end:url(#Arrow1Send);stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1" /> + </g> + </g> +</svg> diff --git a/doc/uthash.png b/doc/uthash.png Binary files differnew file mode 100644 index 000000000..20df5a7d3 --- /dev/null +++ b/doc/uthash.png diff --git a/doc/utlist.txt b/doc/utlist.txt new file mode 100644 index 000000000..06b6b00dc --- /dev/null +++ b/doc/utlist.txt @@ -0,0 +1,293 @@ +utlist: linked list macros for C structures +=========================================== +Troy D. Hanson <tdh@tkhanson.net> +v2.1.0, December 2018 + +Here's a link back to the https://github.com/troydhanson/uthash[GitHub project page]. + +Introduction +------------ +A set of general-purpose 'linked list' macros for C structures are included with +uthash in `utlist.h`. To use these macros in your own C program, just +copy `utlist.h` into your source directory and use it in your programs. + + #include "utlist.h" + +These macros support the basic linked list operations: adding and deleting +elements, sorting them and iterating over them. + +Download +~~~~~~~~ +To download the `utlist.h` header file, +follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file, +then look in the src/ sub-directory. + +BSD licensed +~~~~~~~~~~~~ +This software is made available under the +link:license.html[revised BSD license]. +It is free and open source. + +Platforms +~~~~~~~~~ +The 'utlist' macros have been tested on: + + * Linux, + * Mac OS X, and + * Windows, using Visual Studio 2008, Visual Studio 2010, or Cygwin/MinGW. + +Using utlist +------------ + +Types of lists +~~~~~~~~~~~~~~ +Three types of linked lists are supported: + +- *singly-linked* lists, +- *doubly-linked* lists, and +- *circular, doubly-linked* lists + +Efficiency +^^^^^^^^^^ +Prepending elements:: + Constant-time on all list types. +Appending:: + 'O(n)' on singly-linked lists; constant-time on doubly-linked list. + (The utlist implementation of the doubly-linked list keeps a tail pointer in + `head->prev` so that append can be done in constant time). +Deleting elements:: + 'O(n)' on singly-linked lists; constant-time on doubly-linked list. +Sorting:: + 'O(n log(n))' for all list types. +Insertion in order (for sorted lists):: + 'O(n)' for all list types. +Iteration, counting and searching:: + 'O(n)' for all list types. + +List elements +~~~~~~~~~~~~~ +You can use any structure with these macros, as long as the structure +contains a `next` pointer. If you want to make a doubly-linked list, +the element also needs to have a `prev` pointer. + + typedef struct element { + char *name; + struct element *prev; /* needed for a doubly-linked list only */ + struct element *next; /* needed for singly- or doubly-linked lists */ + } element; + +You can name your structure anything. In the example above it is called `element`. +Within a particular list, all elements must be of the same type. + +Flexible prev/next naming +^^^^^^^^^^^^^^^^^^^^^^^^^ +You can name your `prev` and `next` pointers something else. If you do, there is +a <<flex_names,family of macros>> that work identically but take these names as +extra arguments. + +List head +~~~~~~~~~ +The list head is simply a pointer to your element structure. You can name it +anything. *It must be initialized to `NULL`*. + + element *head = NULL; + +List operations +~~~~~~~~~~~~~~~ +The lists support inserting or deleting elements, sorting the elements and +iterating over them. + +[width="100%",cols="10<m,10<m,10<m",grid="cols",options="header"] +|=============================================================================== +|Singly-linked | Doubly-linked | Circular, doubly-linked +|LL_PREPEND(head,add); | DL_PREPEND(head,add); | CDL_PREPEND(head,add); +|LL_PREPEND_ELEM(head,ref,add); | DL_PREPEND_ELEM(head,ref,add); | CDL_PREPEND_ELEM(head,ref,add); +|LL_APPEND_ELEM(head,ref,add); | DL_APPEND_ELEM(head,ref,add); | CDL_APPEND_ELEM(head,ref,add); +|LL_REPLACE_ELEM(head,del,add); | DL_REPLACE_ELEM(head,del,add); | CDL_REPLACE_ELEM(head,del,add); +|LL_APPEND(head,add); | DL_APPEND(head,add); | CDL_APPEND(head,add); +|LL_INSERT_INORDER(head,add,cmp); | DL_INSERT_INORDER(head,add,cmp); | CDL_INSERT_INORDER(head,add,cmp); +|LL_CONCAT(head1,head2); | DL_CONCAT(head1,head2); | +|LL_DELETE(head,del); | DL_DELETE(head,del); | CDL_DELETE(head,del); +|LL_SORT(head,cmp); | DL_SORT(head,cmp); | CDL_SORT(head,cmp); +|LL_FOREACH(head,elt) {...}| DL_FOREACH(head,elt) {...} | CDL_FOREACH(head,elt) {...} +|LL_FOREACH_SAFE(head,elt,tmp) {...}| DL_FOREACH_SAFE(head,elt,tmp) {...} | CDL_FOREACH_SAFE(head,elt,tmp1,tmp2) {...} +|LL_SEARCH_SCALAR(head,elt,mbr,val);| DL_SEARCH_SCALAR(head,elt,mbr,val); | CDL_SEARCH_SCALAR(head,elt,mbr,val); +|LL_SEARCH(head,elt,like,cmp);| DL_SEARCH(head,elt,like,cmp); | CDL_SEARCH(head,elt,like,cmp); +|LL_LOWER_BOUND(head,elt,like,cmp); | DL_LOWER_BOUND(head,elt,like,cmp); | CDL_LOWER_BOUND(head,elt,like,cmp); +|LL_COUNT(head,elt,count); | DL_COUNT(head,elt,count); | CDL_COUNT(head,elt,count); +|=============================================================================== + +'Prepend' means to insert an element in front of the existing list head (if any), +changing the list head to the new element. 'Append' means to add an element at the +end of the list, so it becomes the new tail element. 'Concatenate' takes two +properly constructed lists and appends the second list to the first. (Visual +Studio 2008 does not support `LL_CONCAT` and `DL_CONCAT`, but VS2010 is ok.) +To prepend before an arbitrary element instead of the list head, use the +`_PREPEND_ELEM` macro family. +To append after an arbitrary element element instead of the list head, use the +`_APPEND_ELEM` macro family. +To 'replace' an arbitrary list element with another element use the `_REPLACE_ELEM` +family of macros. + +The 'sort' operation never moves the elements in memory; rather it only adjusts +the list order by altering the `prev` and `next` pointers in each element. Also +the sort operation can change the list head to point to a new element. + +The 'foreach' operation is for easy iteration over the list from the head to the +tail. A usage example is shown below. You can of course just use the `prev` and +`next` pointers directly instead of using the 'foreach' macros. +The 'foreach_safe' operation should be used if you plan to delete any of the list +elements while iterating. + +The 'search' operation is a shortcut for iteration in search of a particular +element. It is not any faster than manually iterating and testing each element. +There are two forms: the "scalar" version searches for an element using a +simple equality test on a given structure member, while the general version takes an +element to which all others in the list will be compared using a `cmp` function. + +The 'lower_bound' operation finds the first element of the list which is no greater +than the provided `like` element, according to the provided `cmp` function. +The 'lower_bound' operation sets `elt` to a suitable value for passing to +`LL_APPEND_ELEM`; i.e., `elt=NULL` if the proper insertion point is at the front of +the list, and `elt=p` if the proper insertion point is between `p` and `p->next`. + +The 'count' operation iterates over the list and increments a supplied counter. + +The parameters shown in the table above are explained here: + +head:: + The list head (a pointer to your list element structure). +add:: + A pointer to the list element structure you are adding to the list. +del:: + A pointer to the list element structure you are replacing or + deleting from the list. +elt:: + A pointer that will be assigned to each list element in succession (see + example) in the case of iteration macros; or, the output pointer from + the search macros. +ref:: + Reference element for prepend and append operations that will be + prepended before or appended after. + If `ref` is a pointer with value NULL, the new element will be appended to the + list for _PREPEND_ELEM() operations and prepended for _APPEND_ELEM() operations. + `ref` must be the name of a pointer variable and cannot be literally NULL, + use _PREPEND() and _APPEND() macro family instead. +like:: + An element pointer, having the same type as `elt`, for which the search macro + seeks a match (if found, the match is stored in `elt`). A match is determined + by the given `cmp` function. +cmp:: + pointer to comparison function which accepts two arguments-- these are + pointers to two element structures to be compared. The comparison function + must return an `int` that is negative, zero, or positive, which specifies + whether the first item should sort before, equal to, or after the second item, + respectively. (In other words, the same convention that is used by `strcmp`). + Note that under Visual Studio 2008 you may need to declare the two arguments + as `void *` and then cast them back to their actual types. +tmp:: + A pointer of the same type as `elt`. Used internally. Need not be initialized. +mbr:: + In the scalar search macro, the name of a member within the `elt` structure which + will be tested (using `==`) for equality with the value `val`. +val:: + In the scalar search macro, specifies the value of (of structure member + `field`) of the element being sought. +count:: + integer which will be set to the length of the list + +Example +~~~~~~~ +This example program reads names from a text file (one name per line), and +appends each name to a doubly-linked list. Then it sorts and prints them. + +.A doubly-linked list +-------------------------------------------------------------------------------- +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include "utlist.h" + +#define BUFLEN 20 + +typedef struct el { + char bname[BUFLEN]; + struct el *next, *prev; +} el; + +int namecmp(el *a, el *b) { + return strcmp(a->bname,b->bname); +} + +el *head = NULL; /* important- initialize to NULL! */ + +int main(int argc, char *argv[]) { + el *name, *elt, *tmp, etmp; + + char linebuf[BUFLEN]; + int count; + FILE *file; + + if ( (file = fopen( "test11.dat", "r" )) == NULL ) { + perror("can't open: "); + exit(-1); + } + + while (fgets(linebuf,BUFLEN,file) != NULL) { + if ( (name = (el *)malloc(sizeof *name)) == NULL) exit(-1); + strcpy(name->bname, linebuf); + DL_APPEND(head, name); + } + DL_SORT(head, namecmp); + DL_FOREACH(head,elt) printf("%s", elt->bname); + DL_COUNT(head, elt, count); + printf("%d number of elements in list\n", count); + + memcpy(&etmp.bname, "WES\n", 5); + DL_SEARCH(head,elt,&etmp,namecmp); + if (elt) printf("found %s\n", elt->bname); + + /* now delete each element, use the safe iterator */ + DL_FOREACH_SAFE(head,elt,tmp) { + DL_DELETE(head,elt); + free(elt); + } + + fclose(file); + + return 0; +} +-------------------------------------------------------------------------------- + +[[flex_names]] +Other names for prev and next +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +If the `prev` and `next` fields are named something else, a separate group of +macros must be used. These work the same as the regular macros, but take the +field names as extra parameters. + +These "flexible field name" macros are shown below. They all end with `2`. Each +operates the same as its counterpart without the `2`, but they take the name of +the `prev` and `next` fields (as applicable) as trailing arguments. + +[width="100%",cols="10<m,10<m,10<m",grid="cols",options="header"] +|=============================================================================== +|Singly-linked | Doubly-linked | Circular, doubly-linked +|LL_PREPEND2(head,add,next); | DL_PREPEND2(head,add,prev,next); | CDL_PREPEND2(head,add,prev,next); +|LL_PREPEND_ELEM2(head,ref,add,next); | DL_PREPEND_ELEM2(head,ref,add,prev,next); | CDL_PREPEND_ELEM2(head,ref,add,prev,next); +|LL_APPEND_ELEM2(head,ref,add,next); | DL_APPEND_ELEM2(head,ref,add,prev,next); | CDL_APPEND_ELEM2(head,ref,add,prev,next); +|LL_REPLACE_ELEM2(head,del,add,next); | DL_REPLACE_ELEM2(head,del,add,prev,next); | CDL_REPLACE_ELEM2(head,del,add,prev,next); +|LL_APPEND2(head,add,next); | DL_APPEND2(head,add,prev,next); | CDL_APPEND2(head,add,prev,next); +|LL_INSERT_INORDER2(head,add,cmp,next); | DL_INSERT_INORDER2(head,add,cmp,prev,next); | CDL_INSERT_INORDER2(head,add,cmp,prev,next); +|LL_CONCAT2(head1,head2,next); | DL_CONCAT2(head1,head2,prev,next); | +|LL_DELETE2(head,del,next); | DL_DELETE2(head,del,prev,next); | CDL_DELETE2(head,del,prev,next); +|LL_SORT2(head,cmp,next); | DL_SORT2(head,cmp,prev,next); | CDL_SORT2(head,cmp,prev,next); +|LL_FOREACH2(head,elt,next) {...} | DL_FOREACH2(head,elt,next) {...} | CDL_FOREACH2(head,elt,next) {...} +|LL_FOREACH_SAFE2(head,elt,tmp,next) {...} | DL_FOREACH_SAFE2(head,elt,tmp,next) {...} | CDL_FOREACH_SAFE2(head,elt,tmp1,tmp2,prev,next) {...} +|LL_SEARCH_SCALAR2(head,elt,mbr,val,next); | DL_SEARCH_SCALAR2(head,elt,mbr,val,next); | CDL_SEARCH_SCALAR2(head,elt,mbr,val,next); +|LL_SEARCH2(head,elt,like,cmp,next); | DL_SEARCH2(head,elt,like,cmp,next); | CDL_SEARCH2(head,elt,like,cmp,next); +|LL_LOWER_BOUND2(head,elt,like,cmp,next); | DL_LOWER_BOUND2(head,elt,like,cmp,next); | CDL_LOWER_BOUND2(head,elt,like,cmp,next); +|LL_COUNT2(head,elt,count,next); | DL_COUNT2(head,elt,count,next); | CDL_COUNT2(head,elt,count,next); +|=============================================================================== + +// vim: set tw=80 wm=2 syntax=asciidoc: diff --git a/doc/utringbuffer.txt b/doc/utringbuffer.txt new file mode 100644 index 000000000..86206c1e5 --- /dev/null +++ b/doc/utringbuffer.txt @@ -0,0 +1,374 @@ +utringbuffer: dynamic ring-buffer macros for C +============================================== +Arthur O'Dwyer <arthur.j.odwyer@gmail.com> +v2.1.0, December 2018 + +Here's a link back to the https://github.com/troydhanson/uthash[GitHub project page]. + +Introduction +------------ +The functions in `utringbuffer.h` are based on the general-purpose array macros +provided in `utarray.h`, so before reading this page you should read +link:utarray.html[that page] first. + +To use these macros in your own C program, copy both `utarray.h` and `utringbuffer.h` +into your source directory and use `utringbuffer.h` in your program. + + #include "utringbuffer.h" + +The provided <<operations,operations>> are based loosely on the C++ STL vector methods. +The ring-buffer data type supports construction (with a specified capacity), +destruction, iteration, and push, but not pop; once the ring-buffer reaches full +capacity, pushing a new element automatically pops and destroys the oldest element. +The elements contained in the ring-buffer can be any simple datatype or structure. + +Internally the ring-buffer contains a pre-allocated memory region into which the +elements are copied, starting at position 0. When the ring-buffer reaches full +capacity, the next element to be pushed is pushed at position 0, overwriting the +oldest element, and the internal index representing the "start" of the ring-buffer +is incremented. A ring-buffer, once full, can never become un-full. + + +Download +~~~~~~~~ +To download the `utringbuffer.h` header file, +follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file, +then look in the src/ sub-directory. + +BSD licensed +~~~~~~~~~~~~ +This software is made available under the +link:license.html[revised BSD license]. +It is free and open source. + +Platforms +~~~~~~~~~ +The 'utringbuffer' macros have been tested on: + + * Linux, + * Mac OS X, + * Windows, using Visual Studio 2008 and Visual Studio 2010 + +Usage +----- + +Declaration +~~~~~~~~~~~ + +The ring-buffer itself has the data type `UT_ringbuffer`, regardless of the type of +elements to be stored in it. It is declared like, + + UT_ringbuffer *history; + +New and free +~~~~~~~~~~~~ +The next step is to create the ring-buffer using `utringbuffer_new`. Later when you're +done with the ring-buffer, `utringbuffer_free` will free it and all its elements. + +Push, etc +~~~~~~~~~ +The central features of the ring-buffer involve putting elements into it +and iterating over them. There are several <<operations,operations>> +that deal with either single elements or ranges of elements at a +time. In the examples below we will use only the push operation to insert +elements. + +Elements +-------- + +Support for dynamic arrays of integers or strings is especially easy. These are +best shown by example: + +Integers +~~~~~~~~ +This example makes a ring-buffer of integers, pushes 0-9 into it, then prints it +two different ways. Lastly it frees it. + +.Integer elements +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utringbuffer.h" + +int main() { + UT_ringbuffer *history; + int i, *p; + + utringbuffer_new(history, 7, &ut_int_icd); + for(i=0; i < 10; i++) utringbuffer_push_back(history, &i); + + for (p = (int*)utringbuffer_front(history); + p != NULL; + p = (int*)utringbuffer_next(history, p)) { + printf("%d\n", *p); /* prints "3 4 5 6 7 8 9" */ + } + + for (i=0; i < utringbuffer_len(history); i++) { + p = utringbuffer_eltptr(history, i); + printf("%d\n", *p); /* prints "3 4 5 6 7 8 9" */ + } + + utringbuffer_free(history); + + return 0; +} +------------------------------------------------------------------------------- + +The second argument to `utringbuffer_push_back` is always a 'pointer' to the type +(so a literal cannot be used). So for integers, it is an `int*`. + +Strings +~~~~~~~ +In this example we make a ring-buffer of strings, push two strings into it, print +it and free it. + +.String elements +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utringbuffer.h" + +int main() { + UT_ringbuffer *strs; + char *s, **p; + + utringbuffer_new(strs, 7, &ut_str_icd); + + s = "hello"; utringbuffer_push_back(strs, &s); + s = "world"; utringbuffer_push_back(strs, &s); + p = NULL; + while ( (p=(char**)utringbuffer_next(strs,p))) { + printf("%s\n",*p); + } + + utringbuffer_free(strs); + + return 0; +} +------------------------------------------------------------------------------- + +In this example, since the element is a `char*`, we pass a pointer to it +(`char**`) as the second argument to `utringbuffer_push_back`. Note that "push" makes +a copy of the source string and pushes that copy into the array. + +About UT_icd +~~~~~~~~~~~~ + +Arrays can be made of any type of element, not just integers and strings. The +elements can be basic types or structures. Unless you're dealing with integers +and strings (which use pre-defined `ut_int_icd` and `ut_str_icd`), you'll need +to define a `UT_icd` helper structure. This structure contains everything that +utringbuffer (or utarray) needs to initialize, copy or destruct elements. + + typedef struct { + size_t sz; + init_f *init; + ctor_f *copy; + dtor_f *dtor; + } UT_icd; + +The three function pointers `init`, `copy`, and `dtor` have these prototypes: + + typedef void (ctor_f)(void *dst, const void *src); + typedef void (dtor_f)(void *elt); + typedef void (init_f)(void *elt); + +The `sz` is just the size of the element being stored in the array. + +The `init` function is used by utarray but is never used by utringbuffer; +you may safely set it to any value you want. + +The `copy` function is used whenever an element is copied into the buffer. +It is invoked during `utringbuffer_push_back`. +If `copy` is `NULL`, it defaults to a bitwise copy using memcpy. + +The `dtor` function is used to clean up an element that is being removed from +the buffer. It may be invoked due to `utringbuffer_push_back` (on the oldest +element in the buffer), `utringbuffer_clear`, `utringbuffer_done`, or +`utringbuffer_free`. +If the elements need no cleanup upon destruction, `dtor` may be `NULL`. + +Scalar types +~~~~~~~~~~~~ + +The next example uses `UT_icd` with all its defaults to make a ring-buffer of +`long` elements. This example pushes two longs into a buffer of capacity 1, +prints the contents of the buffer (which is to say, the most recent value +pushed), and then frees the buffer. + +.long elements +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utringbuffer.h" + +UT_icd long_icd = {sizeof(long), NULL, NULL, NULL }; + +int main() { + UT_ringbuffer *nums; + long l, *p; + utringbuffer_new(nums, 1, &long_icd); + + l=1; utringbuffer_push_back(nums, &l); + l=2; utringbuffer_push_back(nums, &l); + + p=NULL; + while((p = (long*)utringbuffer_next(nums,p))) printf("%ld\n", *p); + + utringbuffer_free(nums); + return 0; +} +------------------------------------------------------------------------------- + +Structures +~~~~~~~~~~ + +Structures can be used as utringbuffer elements. If the structure requires no +special effort to initialize, copy or destruct, we can use `UT_icd` with all +its defaults. This example shows a structure that consists of two integers. Here +we push two values, print them and free the buffer. + +.Structure (simple) +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utringbuffer.h" + +typedef struct { + int a; + int b; +} intpair_t; + +UT_icd intpair_icd = {sizeof(intpair_t), NULL, NULL, NULL}; + +int main() { + + UT_ringbuffer *pairs; + intpair_t ip, *p; + utringbuffer_new(pairs, 7, &intpair_icd); + + ip.a=1; ip.b=2; utringbuffer_push_back(pairs, &ip); + ip.a=10; ip.b=20; utringbuffer_push_back(pairs, &ip); + + for(p=(intpair_t*)utringbuffer_front(pairs); + p!=NULL; + p=(intpair_t*)utringbuffer_next(pairs,p)) { + printf("%d %d\n", p->a, p->b); + } + + utringbuffer_free(pairs); + return 0; +} +------------------------------------------------------------------------------- + +The real utility of `UT_icd` is apparent when the elements stored in the +ring-buffer are structures that require special work to initialize, copy or +destruct. + +For example, when a structure contains pointers to related memory areas that +need to be copied when the structure is copied (and freed when the structure is +freed), we can use custom `init`, `copy`, and `dtor` members in the `UT_icd`. + +Here we take an example of a structure that contains an integer and a string. +When this element is copied (such as when an element is pushed), +we want to "deep copy" the `s` pointer (so the original element and the new +element point to their own copies of `s`). When an element is destructed, we +want to "deep free" its copy of `s`. Lastly, this example is written to work +even if `s` has the value `NULL`. + +.Structure (complex) +------------------------------------------------------------------------------- +#include <stdio.h> +#include <stdlib.h> +#include "utringbuffer.h" + +typedef struct { + int a; + char *s; +} intchar_t; + +void intchar_copy(void *_dst, const void *_src) { + intchar_t *dst = (intchar_t*)_dst, *src = (intchar_t*)_src; + dst->a = src->a; + dst->s = src->s ? strdup(src->s) : NULL; +} + +void intchar_dtor(void *_elt) { + intchar_t *elt = (intchar_t*)_elt; + free(elt->s); +} + +UT_icd intchar_icd = {sizeof(intchar_t), NULL, intchar_copy, intchar_dtor}; + +int main() { + UT_ringbuffer *intchars; + intchar_t ic, *p; + utringbuffer_new(intchars, 2, &intchar_icd); + + ic.a=1; ic.s="hello"; utringbuffer_push_back(intchars, &ic); + ic.a=2; ic.s="world"; utringbuffer_push_back(intchars, &ic); + ic.a=3; ic.s="peace"; utringbuffer_push_back(intchars, &ic); + + p=NULL; + while( (p=(intchar_t*)utringbuffer_next(intchars,p))) { + printf("%d %s\n", p->a, (p->s ? p->s : "null")); + /* prints "2 world 3 peace" */ + } + + utringbuffer_free(intchars); + return 0; +} + +------------------------------------------------------------------------------- + +[[operations]] +Reference +--------- +This table lists all the utringbuffer operations. These are loosely based on the C++ +vector class. + +Operations +~~~~~~~~~~ + +[width="100%",cols="50<m,40<",grid="none",options="none"] +|=============================================================================== +| utringbuffer_new(UT_ringbuffer *a, int n, UT_icd *icd) | allocate a new ringbuffer +| utringbuffer_free(UT_ringbuffer *a) | free an allocated ringbuffer +| utringbuffer_init(UT_ringbuffer *a, int n, UT_icd *icd) | init a ringbuffer (non-alloc) +| utringbuffer_done(UT_ringbuffer *a) | dispose of a ringbuffer (non-alloc) +| utringbuffer_clear(UT_ringbuffer *a) | clear all elements from a, making it empty +| utringbuffer_push_back(UT_ringbuffer *a, element *p) | push element p onto a +| utringbuffer_len(UT_ringbuffer *a) | get length of a +| utringbuffer_empty(UT_ringbuffer *a) | get whether a is empty +| utringbuffer_full(UT_ringbuffer *a) | get whether a is full +| utringbuffer_eltptr(UT_ringbuffer *a, int j) | get pointer of element from index +| utringbuffer_eltidx(UT_ringbuffer *a, element *e) | get index of element from pointer +| utringbuffer_front(UT_ringbuffer *a) | get oldest element of a +| utringbuffer_next(UT_ringbuffer *a, element *e) | get element of a following e (front if e is NULL) +| utringbuffer_prev(UT_ringbuffer *a, element *e) | get element of a before e (back if e is NULL) +| utringbuffer_back(UT_ringbuffer *a) | get newest element of a +|=============================================================================== + +Notes +~~~~~ + +1. `utringbuffer_new` and `utringbuffer_free` are used to allocate a new ring-buffer + and to free it, + while `utringbuffer_init` and `utringbuffer_done` can be used if the UT_ringbuffer + is already allocated and just needs to be initialized or have its internal resources + freed. +2. Both `utringbuffer_new` and `utringbuffer_init` take a second parameter `n` indicating + the capacity of the ring-buffer, that is, the size at which the ring-buffer is considered + "full" and begins to overwrite old elements with newly pushed ones. +3. Once a ring-buffer has become full, it will never again become un-full except by + means of `utringbuffer_clear`. There is no way to "pop" a single old item from the + front of the ring-buffer. You can simulate this ability by maintaining a separate + integer count of the number of "logically popped elements", and starting your iteration + with `utringbuffer_eltptr(a, popped_count)` instead of with `utringbuffer_front(a)`. +4. Pointers to elements (obtained using `utringbuffer_eltptr`, `utringbuffer_front`, + `utringbuffer_next`, etc.) are not generally invalidated by `utringbuffer_push_back`, + because utringbuffer does not perform reallocation; however, a pointer to the oldest + element may suddenly turn into a pointer to the 'newest' element if + `utringbuffer_push_back` is called while the buffer is full. +5. The elements of a ring-buffer are stored in contiguous memory, but once the ring-buffer + has become full, it is no longer true that the elements are contiguously in order from + oldest to newest; i.e., `(element *)utringbuffer_front(a) + utringbuffer_len(a)-1` is + not generally equal to `(element *)utringbuffer_back(a)`. + +// vim: set nowrap syntax=asciidoc: diff --git a/doc/utstack.txt b/doc/utstack.txt new file mode 100644 index 000000000..0c628c06d --- /dev/null +++ b/doc/utstack.txt @@ -0,0 +1,158 @@ +utstack: intrusive stack macros for C +===================================== +Arthur O'Dwyer <arthur.j.odwyer@gmail.com> +v2.1.0, December 2018 + +Here's a link back to the https://github.com/troydhanson/uthash[GitHub project page]. + +Introduction +------------ +A set of very simple stack macros for C structures are included with +uthash in `utstack.h`. To use these macros in your own C program, just +copy `utstack.h` into your source directory and use it in your programs. + + #include "utstack.h" + +These macros support the basic operations of a stack, implemented as +an intrusive linked list. A stack supports the "push", "pop", and "count" +operations, as well as the trivial operation of getting the top element +of the stack. + +Download +~~~~~~~~ +To download the `utstack.h` header file, +follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file, +then look in the src/ sub-directory. + +BSD licensed +~~~~~~~~~~~~ +This software is made available under the +link:license.html[revised BSD license]. +It is free and open source. + +Platforms +~~~~~~~~~ +The 'utstack' macros have been tested on: + + * Linux, + * Mac OS X, + * Windows, using Visual Studio 2008 and Visual Studio 2010 + +Usage +----- + +Stack (list) head +~~~~~~~~~~~~~~~~~ +The stack head is simply a pointer to your element structure. You can name it +anything. *It must be initialized to `NULL`*. It doubles as a pointer to the +top element of your stack. + + element *stack = NULL; + +Stack operations +~~~~~~~~~~~~~~~ +The only operations on a stack are O(1) pushing, O(1) popping, and +O(n) counting the number of elements on the stack. None of the provided +macros permit directly accessing stack elements other than the top element. + +To increase the readability of your code, you can use the macro +`STACK_EMPTY(head)` as a more readable alternative to `head == NULL`, +and `STACK_TOP(head)` as a more readable alternative to `head`. + +[width="100%",cols="50<m,40<",grid="none",options="none"] +|=============================================================================== +|STACK_PUSH(stack,add); | push `add` onto `stack` +|STACK_POP(stack,elt); | pop `stack` and save previous top as `elt` +|STACK_COUNT(stack,tmp,count); | store number of elements into `count` +|STACK_TOP(stack) | return `stack` +|STACK_EMPTY(stack) | return `stack == NULL` +|=============================================================================== + +The parameters shown in the table above are explained here: + +stack:: + The stack head (a pointer to your element structure). +add:: + A pointer to the element structure you are adding to the stack. +elt:: + A pointer that will be assigned the address of the popped element. Need not be initialized. +tmp:: + A pointer of the same type as `elt`. Used internally. Need not be initialized. +count:: + An integer that will be assigned the size of the stack. Need not be initialized. + + +Example +~~~~~~~ +This example program reads names from a text file (one name per line), and +pushes each name on the stack; then pops and prints them in reverse order. + +.A stack of names +-------------------------------------------------------------------------------- +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include "utstack.h" + +#define BUFLEN 20 + +typedef struct el { + char bname[BUFLEN]; + struct el *next; +} el; + +el *head = NULL; /* important- initialize to NULL! */ + +int main(int argc, char *argv[]) { + el *elt, *tmp; + + char linebuf[sizeof el->bname]; + int count; + FILE *file = fopen("test11.dat", "r"); + if (file == NULL) { + perror("can't open: "); + exit(-1); + } + + while (fgets(linebuf, sizeof linebuf, file) != NULL) { + el *name = malloc(sizeof *name); + if (name == NULL) exit(-1); + strcpy(name->bname, linebuf); + STACK_PUSH(head, name); + } + fclose(file); + + STACK_COUNT(head, elt, count); + printf("%d elements were read into the stack\n", count); + + /* now pop, print, and delete each element */ + while (!STACK_EMPTY(head)) { + printf("%s\n", STACK_TOP(head)->bname); + STACK_POP(head, elt); + free(elt); + } + + return 0; +} +-------------------------------------------------------------------------------- + +[[flex_names]] +Other names for next +~~~~~~~~~~~~~~~~~~~~ +If the element structure's `next` field is named something else, a separate group +of macros must be used. These work the same as the regular macros, but take the +field name as an extra parameter. + +These "flexible field name" macros are shown below. They all end with `2`. Each +operates the same as its counterpart without the `2`, but they take the name of +the `next` field as a trailing argument. + +[width="100%",cols="50<m,40<",grid="none",options="none"] +|=============================================================================== +|STACK_PUSH2(stack,add,next); | push `add` onto `stack` +|STACK_POP2(stack,elt,next); | pop `stack` and save previous top as `elt` +|STACK_COUNT2(stack,tmp,count,next); | store number of elements into `count` +|=============================================================================== + + +// vim: set nowrap syntax=asciidoc: diff --git a/doc/utstring.txt b/doc/utstring.txt new file mode 100644 index 000000000..56e2f4c00 --- /dev/null +++ b/doc/utstring.txt @@ -0,0 +1,239 @@ +utstring: dynamic string macros for C +===================================== +Troy D. Hanson <tdh@tkhanson.net> +v2.1.0, December 2018 + +Here's a link back to the https://github.com/troydhanson/uthash[GitHub project page]. + +Introduction +------------ +A set of basic dynamic string macros for C programs are included with +uthash in `utstring.h`. To use these in your own C program, just copy +`utstring.h` into your source directory and use it in your programs. + + #include "utstring.h" + +The dynamic string supports operations such as inserting data, concatenation, +getting the length and content, substring search, and clear. It's ok to put +binary data into a utstring too. The string <<operations,operations>> are +listed below. + +Some utstring operations are implemented as functions rather than macros. + +Download +~~~~~~~~ +To download the `utstring.h` header file, +follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file, +then look in the src/ sub-directory. + +BSD licensed +~~~~~~~~~~~~ +This software is made available under the +link:license.html[revised BSD license]. +It is free and open source. + +Platforms +~~~~~~~~~ +The 'utstring' macros have been tested on: + + * Linux, + * Windows, using Visual Studio 2008 and Visual Studio 2010 + +Usage +----- + +Declaration +~~~~~~~~~~~ + +The dynamic string itself has the data type `UT_string`. It is declared like, + + UT_string *str; + +New and free +~~~~~~~~~~~~ +The next step is to create the string using `utstring_new`. Later when you're +done with it, `utstring_free` will free it and all its content. + +Manipulation +~~~~~~~~~~~~ +The `utstring_printf` or `utstring_bincpy` operations insert (copy) data into +the string. To concatenate one utstring to another, use `utstring_concat`. To +clear the content of the string, use `utstring_clear`. The length of the string +is available from `utstring_len`, and its content from `utstring_body`. This +evaluates to a `char*`. The buffer it points to is always null-terminated. +So, it can be used directly with external functions that expect a string. +This automatic null terminator is not counted in the length of the string. + +Samples +~~~~~~~ + +These examples show how to use utstring. + +.Sample 1 +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utstring.h" + +int main() { + UT_string *s; + + utstring_new(s); + utstring_printf(s, "hello world!" ); + printf("%s\n", utstring_body(s)); + + utstring_free(s); + return 0; +} +------------------------------------------------------------------------------- + +The next example demonstrates that `utstring_printf` 'appends' to the string. +It also shows concatenation. + +.Sample 2 +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utstring.h" + +int main() { + UT_string *s, *t; + + utstring_new(s); + utstring_new(t); + + utstring_printf(s, "hello " ); + utstring_printf(s, "world " ); + + utstring_printf(t, "hi " ); + utstring_printf(t, "there " ); + + utstring_concat(s, t); + printf("length: %u\n", utstring_len(s)); + printf("%s\n", utstring_body(s)); + + utstring_free(s); + utstring_free(t); + return 0; +} +------------------------------------------------------------------------------- + +The next example shows how binary data can be inserted into the string. It also +clears the string and prints new data into it. + +.Sample 3 +------------------------------------------------------------------------------- +#include <stdio.h> +#include "utstring.h" + +int main() { + UT_string *s; + char binary[] = "\xff\xff"; + + utstring_new(s); + utstring_bincpy(s, binary, sizeof(binary)); + printf("length is %u\n", utstring_len(s)); + + utstring_clear(s); + utstring_printf(s,"number %d", 10); + printf("%s\n", utstring_body(s)); + + utstring_free(s); + return 0; +} +------------------------------------------------------------------------------- + +[[operations]] +Reference +--------- +These are the utstring operations. + +Operations +~~~~~~~~~~ + +[width="100%",cols="50<m,40<",grid="none",options="none"] +|=============================================================================== +| utstring_new(s) | allocate a new utstring +| utstring_renew(s) | allocate a new utstring (if s is `NULL`) otherwise clears it +| utstring_free(s) | free an allocated utstring +| utstring_init(s) | init a utstring (non-alloc) +| utstring_done(s) | dispose of a utstring (non-alloc) +| utstring_printf(s,fmt,...) | printf into a utstring (appends) +| utstring_bincpy(s,bin,len) | insert binary data of length len (appends) +| utstring_concat(dst,src) | concatenate src utstring to end of dst utstring +| utstring_clear(s) | clear the content of s (setting its length to 0) +| utstring_len(s) | obtain the length of s as an unsigned integer +| utstring_body(s) | get `char*` to body of s (buffer is always null-terminated) +| utstring_find(s,pos,str,len) | forward search from pos for a substring +| utstring_findR(s,pos,str,len) | reverse search from pos for a substring +|=============================================================================== + +New/free vs. init/done +~~~~~~~~~~~~~~~~~~~~~~ +Use `utstring_new` and `utstring_free` to allocate a new string or free it. If +the UT_string is statically allocated, use `utstring_init` and `utstring_done` +to initialize or free its internal memory. + +Substring search +~~~~~~~~~~~~~~~~ +Use `utstring_find` and `utstring_findR` to search for a substring in a utstring. +It comes in forward and reverse varieties. The reverse search scans from the end of +the string backward. These take a position to start searching from, measured from 0 +(the start of the utstring). A negative position is counted from the end of +the string, so, -1 is the last position. Note that in the reverse search, the +initial position anchors to the 'end' of the substring being searched for; +e.g., the 't' in 'cat'. The return value always refers to the offset where the +substring 'starts' in the utstring. When no substring match is found, -1 is +returned. + +For example if a utstring called `s` contains: + + ABC ABCDAB ABCDABCDABDE + +Then these forward and reverse substring searches for `ABC` produce these results: + + utstring_find( s, -9, "ABC", 3 ) = 15 + utstring_find( s, 3, "ABC", 3 ) = 4 + utstring_find( s, 16, "ABC", 3 ) = -1 + utstring_findR( s, -9, "ABC", 3 ) = 11 + utstring_findR( s, 12, "ABC", 3 ) = 4 + utstring_findR( s, 2, "ABC", 3 ) = 0 + +"Multiple use" substring search +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +The preceding examples show "single use" versions of substring matching, where +the internal Knuth-Morris-Pratt (KMP) table is internally built and then freed +after the search. If your program needs to run many searches for a given +substring, it is more efficient to save the KMP table and reuse it. + +To reuse the KMP table, build it manually and then pass it into the internal +search functions. The functions involved are: + + _utstring_BuildTable (build the KMP table for a forward search) + _utstring_BuildTableR (build the KMP table for a reverse search) + _utstring_find (forward search using a prebuilt KMP table) + _utstring_findR (reverse search using a prebuilt KMP table) + +This is an example of building a forward KMP table for the substring "ABC", and +then using it in a search: + + long *KPM_TABLE, offset; + KPM_TABLE = (long *)malloc( sizeof(long) * (strlen("ABC")) + 1)); + _utstring_BuildTable("ABC", 3, KPM_TABLE); + offset = _utstring_find(utstring_body(s), utstring_len(s), "ABC", 3, KPM_TABLE ); + free(KPM_TABLE); + +Note that the internal `_utstring_find` has the length of the UT_string as its +second argument, rather than the start position. You can emulate the position +parameter by adding to the string start address and subtracting from its length. + +Notes +~~~~~ + +1. To override the default out-of-memory handling behavior (which calls `exit(-1)`), + override the `utstring_oom()` macro before including `utstring.h`. + For example, + + #define utstring_oom() do { longjmp(error_handling_location); } while (0) + ... + #include "utstring.h" + +// vim: set nowrap syntax=asciidoc: |