path: root/utils/gambling_sites_download.sh
Commit message / Author / Age
* Swap from Aho-Corasick to an experimental/home-grown algorithm that uses a probabilistic approach for handling Internet domain names. (Luca Deri, 2023-08-29)

  To switch back to Aho-Corasick, edit ndpi-typedefs.h and uncomment the line
      // #define USE_LEGACY_AHO_CORASICK

  [1] With Aho-Corasick
      $ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
      nDPI Memory statistics:
          nDPI Memory (once):      37.34 KB
          Flow Memory (per flow):  960 B
          Actual Memory:           33.09 MB
          Peak Memory:             33.09 MB

  [2] With the new algorithm
      $ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
      nDPI Memory statistics:
          nDPI Memory (once):      37.31 KB
          Flow Memory (per flow):  960 B
          Actual Memory:           7.42 MB
          Peak Memory:             7.42 MB

  In essence, memory usage drops from ~33 MB to ~7 MB.

  This new algorithm will enable larger lists to be loaded (e.g. the top 1M domains, https://s3-us-west-1.amazonaws.com/umbrella-static/index.html).

  The files in ./lists are named <category>_<string>.list. With -G, ndpiReader can load all of them at startup.
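  The <category>_<string>.list naming convention mentioned above can be illustrated with a small Python sketch. This is not nDPI code: the helper name split_list_name is hypothetical, the file name used in the example is only illustrative, and the sketch assumes the category is everything before the first underscore, which the commit message does not spell out.

```python
from pathlib import Path
from typing import Optional, Tuple


def split_list_name(filename: str) -> Optional[Tuple[str, str]]:
    """Split '<category>_<string>.list' into (category, string).

    Hypothetical helper, not part of nDPI. Assumes the category is the
    text before the FIRST underscore. Returns None when the name does
    not follow the convention.
    """
    path = Path(filename)
    # Only files with the .list extension are considered.
    if path.suffix != ".list":
        return None
    # partition() splits on the first underscore only, so the <string>
    # part may itself contain underscores.
    category, sep, rest = path.stem.partition("_")
    if not sep or not category or not rest:
        return None
    return category, rest


if __name__ == "__main__":
    # Illustrative names only; the real contents of ./lists may differ.
    for name in ("lists/107_gambling.list", "README.md", "nounderscore.list"):
        print(name, "->", split_list_name(name))
```

  Splitting on the first underscore keeps the category unambiguous even when the remaining part of the name contains underscores of its own.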
* Changes for supporting more efficient sub-string matching (Luca Deri, 2023-08-26)
* Included Gambling website data from the Polish `hazard.mf.gov.pl` list (#2041) (snicket2100, 2023-07-14)

  * Refreshed the Belgium Gambling Site list data
    Unfortunately some hostnames have been removed from that list, which means they are disappearing from the `ndpi_gambling_match.c.inc` file as well.
  * build: added `libxml2-utils` (for `xmllint`)
  * Included Gambling website data from the Polish `hazard.mf.gov.pl` list
    The list contains over 30k gambling website hostnames as of today.
* Improved helper scripts. (#1986) (Toni, 2023-05-28)

  * added additional (more restrictive) checks

  Signed-off-by: lns <matzeton@googlemail.com>
* Added scripts to auto generate hostname/SNI *.inc files. (#1984) (Toni, 2023-05-20)

  * add illegal gambling sites (Belgium)

  Signed-off-by: lns <matzeton@googlemail.com>