libndpi.git - Open Source Deep Packet Inspection Software Toolkit

	Commit message	Author	Age
*	Test multiple `ndpiReader` configurations (#1931)	Ivan Nardi	2023-04-06
\| \| \| \| \| \| \| \| \|	Extend internal unit tests to handle multiple configurations. As examples, add tests for: * disabling some protocols; * disabling Ookla aggressiveness. Each configuration's data is stored in a dedicated directory under `tests/cfgs`.
*	Add a new protocol id for generic advertisement/analytics/tracking stuff (#1904)	Ivan Nardi	2023-03-20
\|
*	ndpiReader: print how many packets (per flow) were needed to perform full DPI (#1891)	Ivan Nardi	2023-03-01
\| \| \| \| \| \|	Average values are already printed, but this change should make it easier to identify regressions/improvements.
*	Further reduction of the size of some traces used as unit test (#1879)	Ivan Nardi	2023-01-30
\| \| \|	See a944514d. No flow/classification/metadata have been removed.
*	Improved connection refused detection	Luca Deri	2023-01-25
\|
*	Fix classification "by-port" (#1851)	Ivan Nardi	2023-01-17
\| \| \| \| \| \| \| \| \| \| \|	Classification "by-port" should be the last possible effort, after having tested all the LRU caches. Remove some dead code from `ndpi_detection_giveup()`: `flow->guessed_protocol_id` is never set to any of those VoIP protocols, and at that point in this function we never have both a master and an application protocol. Coverage reports (both from unit tests and from fuzzing) confirm that this was dead code.
*	Reduce the size of some traces used as unit test (#1845)	Ivan Nardi	2023-01-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No traces and no flows have been removed; only long sessions have been reduced, keeping only their first packets. This is quite important in fuzzing systems, since these pcaps are used as the initial seed. There is no value in fuzzing long sessions, because only the very first packets are really used/processed by nDPI. Before: ``` du -h tests/pcap/ 200M tests/pcap/ ``` After: ``` du -h tests/pcap/ 98M tests/pcap/ ```
*	STUN: add detection of ZOOM peer-to-peer flows (#1825)	Ivan Nardi	2022-12-11
\| \| \| \|	See: "Enabling Passive Measurement of Zoom Performance in Production Networks" https://dl.acm.org/doi/pdf/10.1145/3517745.3561414
*	Make LRU caches ipv6 aware (#1810)	Ivan Nardi	2022-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Simplest solution, keeping the existing cache data structure. The TLS certificate cache is used for DTLS traffic, too. Note that the Ookla cache already works with ipv6 flows. TODO: * make the key/hashing more robust (extending the key size?); * update the bittorrent cache too. That task is quite difficult because ntopng uses a public function (`ndpi_guess_undetected_protocol()`) that is intrinsically ipv4 only...
*	TLS: improve handling of ALPN(s) (#1784)	Ivan Nardi	2022-10-25
\| \| \| \| \| \| \| \|	Tell "Advertised" ALPN list from "Negotiated" ALPN; the former is extracted from the CH, the latter from the SH. Add some entries to the known ALPN list. Fix printing of "TLS Supported Versions" field.
*	Extend content match lists	Nardi Ivan	2022-09-22
\|
*	Remove classification "by-ip" from protocol stack (#1743)	Ivan Nardi	2022-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Basically: * "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip`) is NEVER returned in the protocol stack (i.e. `flow->detected_protocol_stack[]`); * if the application is interested in such information, it can access `ndpi_protocol->protocol_by_ip` itself. There are mainly 4 points in the code that set the "classification by-ip" in the protocol stack: the generic `ndpi_set_detected_protocol()`/ `ndpi_detection_giveup()` functions and the HTTP/STUN dissectors. In the unit tests output, a print about `ndpi_protocol->protocol_by_ip` has been added for each flow: the huge diff of this commit is mainly due to that. Strictly speaking, this change is NOT an API/ABI breakage, but there are important differences in the classification results. For example: * TLS flows without the initial handshake (or without a matching SNI/certificate) are simply classified as `TLS`; * similarly for HTTP or QUIC flows; * DNS flows without a matching request domain are simply classified as `DNS`; we don't have `DNS/Google` anymore just because the server is 8.8.8.8 (that was an outrageous behaviour...); * flows previously classified only "by-ip" are now classified as `NDPI_PROTOCOL_UNKNOWN`. See #1425 for other examples of why adding the "classification by-ip" in the protocol stack is a bad idea. Please note that IPv6 is not supported :( (long standing issue in nDPI), i.e. `ndpi_protocol->protocol_by_ip` will always be `NDPI_PROTOCOL_UNKNOWN` for IPv6 flows. Define `NDPI_CONFIDENCE_MATCH_BY_IP` has been removed. Close #1687
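The separation described in the commit above can be pictured with a small sketch. Everything in it is hypothetical: the `demo_` types, the function and the protocol ids are illustrative stand-ins, not the library's real definitions; only the idea of keeping the by-ip match in its own field comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical protocol ids, chosen arbitrarily for this sketch. */
#define DEMO_PROTO_UNKNOWN 0
#define DEMO_PROTO_DNS     1
#define DEMO_PROTO_GOOGLE  2

/* Hypothetical result type: the protocol stack holds only what the
 * dissectors found; the by-ip match lives in a separate field. */
typedef struct {
  uint16_t detected_stack[2]; /* [0] = application, [1] = master */
  uint16_t protocol_by_ip;    /* informational only */
} demo_result_t;

/* Fill the result. The by-ip match is stored, but it is never copied
 * into the protocol stack, even when the stack is still unknown. */
static void demo_finalize(demo_result_t *res, uint16_t app,
                          uint16_t master, uint16_t ip_match)
{
  assert(res != NULL);
  res->detected_stack[0] = app;
  res->detected_stack[1] = master;
  res->protocol_by_ip = ip_match;
}
```

With this layout, a DNS flow towards a Google address would report `DEMO_PROTO_DNS` in the stack and `DEMO_PROTO_GOOGLE` only in `protocol_by_ip`, leaving the application free to use or ignore the latter.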
*	Avoid useless host automa lookup (#1724)	Ivan Nardi	2022-09-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The host automa is used for two tasks: * protocol sub-classification (obviously); * DGA evaluation: the idea is that if a domain is present in this automa, it can't be a DGA, regardless of its format/name. In most dissectors both checks are executed, i.e. the code is something like: ``` ndpi_match_host_subprotocol(..., flow->host_server_name, ...); ndpi_check_dga_name(..., flow->host_server_name,...); ``` In that common case, we can perform only one automa lookup: if we check the sub-classification before the DGA, we can avoid the second lookup in the DGA function itself.
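A minimal sketch of the single-lookup idea from the commit above. All names here (`demo_host_lookup`, `demo_check_dga`, `demo_process_host`, the tiny host table and the digit-counting heuristic) are assumptions made for illustration; they stand in for the real automa and DGA logic and do not reproduce them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the host automa: a tiny table of known hosts. */
static const char *const demo_known_hosts[] = { "example.com", "example.org" };
static unsigned int demo_num_lookups = 0; /* counts "automa" lookups */

static bool demo_host_lookup(const char *name)
{
  size_t i;

  demo_num_lookups++;
  for (i = 0; i < sizeof(demo_known_hosts) / sizeof(demo_known_hosts[0]); i++)
    if (strcmp(name, demo_known_hosts[i]) == 0)
      return true;
  return false;
}

/* Placeholder heuristic (NOT the real DGA logic): a name looks random
 * when more than half of its characters are digits. */
static bool demo_looks_random(const char *name)
{
  size_t len = strlen(name), digits = 0, i;

  for (i = 0; i < len; i++)
    if (name[i] >= '0' && name[i] <= '9')
      digits++;
  return len > 0 && digits * 2 > len;
}

/* DGA check. If the caller already did the lookup, it passes the result
 * via `known`; passing NULL forces a lookup here. */
static bool demo_check_dga(const char *name, const bool *known)
{
  bool is_known = (known != NULL) ? *known : demo_host_lookup(name);

  return is_known ? false : demo_looks_random(name);
}

/* Sub-classification first, then DGA: only one lookup in total. */
static bool demo_process_host(const char *name, bool *is_dga)
{
  bool known;

  assert(name != NULL && is_dga != NULL);
  known = demo_host_lookup(name);
  *is_dga = demo_check_dga(name, &known);
  return known;
}
```

Calling `demo_process_host()` increments `demo_num_lookups` once per host name, whereas calling the lookup and `demo_check_dga(name, NULL)` separately would increment it twice.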
*	Patricia tree, Aho-Corasick automa, LRU cache: add statistics (#1683)	Ivan Nardi	2022-07-29
\| \| \| \| \| \| \| \| \| \|	Add (basic) internal stats to the main data structures used by the library; they might be useful to check how effective these structures are. Add an option to `ndpiReader` to dump them; enabled by default in the unit tests. This new option enables/disables dumping of "num dissectors calls" values, too (see b4cb14ec).
*	Update the protocol bitmask for some protocols (#1675)	Ivan Nardi	2022-07-27
\| \| \| \| \| \| \|	TCP retransmissions should be ignored. Remove some unused protocol bitmasks. Update script to download Whatsapp IP list.
*	Improved Jabber/XMPP detection. (#1661)	Toni	2022-07-13
\| \| \|	Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
*	Keep track of how many dissectors calls we made for each flow (#1657)	Ivan Nardi	2022-07-11
\|
*	Fix handling of NDPI_UNIDIRECTIONAL_TRAFFIC risk (#1636)	Ivan Nardi	2022-07-05
\|
*	Added unidirectional traffic flow risk	Luca Deri	2022-06-20
\|
*	Add a "confidence" field about the reliability of the classification. (#1395)	Ivan Nardi	2022-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	As a general rule, the higher the confidence value, the higher the "reliability/precision" of the classification. In other words, this new field provides a hint about "how" the flow classification has been obtained. For example, the application may want to ignore classifications "by-port" (they are not real DPI classifications, after all) or give a second glance at flows classified via LRU caches (because of false positives). Setting only one value for the confidence field is a bit tricky: more work is probably needed in the near future to tweak/fix/improve the logic.
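How an application might act on such a field can be sketched as follows. The enum values and the `demo_action_for()` helper are hypothetical and only mirror the levels mentioned in the commit message (by-port, LRU cache, full DPI); the library's actual definitions may differ.

```c
#include <assert.h>

/* Hypothetical confidence levels, ordered from least to most reliable. */
typedef enum {
  DEMO_CONFIDENCE_UNKNOWN = 0,
  DEMO_CONFIDENCE_BY_PORT,   /* guessed from the port number only */
  DEMO_CONFIDENCE_LRU_CACHE, /* derived from an LRU cache hit */
  DEMO_CONFIDENCE_DPI        /* obtained by real packet inspection */
} demo_confidence_t;

/* Hypothetical application-side policy. */
typedef enum {
  DEMO_ACTION_IGNORE,  /* do not trust the classification */
  DEMO_ACTION_RECHECK, /* accept, but give it a second glance */
  DEMO_ACTION_ACCEPT   /* trust the classification */
} demo_action_t;

static demo_action_t demo_action_for(demo_confidence_t c)
{
  assert(c <= DEMO_CONFIDENCE_DPI); /* reject values outside the enum */

  switch (c) {
  case DEMO_CONFIDENCE_DPI:
    return DEMO_ACTION_ACCEPT;
  case DEMO_CONFIDENCE_LRU_CACHE:
    return DEMO_ACTION_RECHECK; /* caches may yield false positives */
  case DEMO_CONFIDENCE_BY_PORT:
  case DEMO_CONFIDENCE_UNKNOWN:
  default:
    return DEMO_ACTION_IGNORE;  /* not a real DPI classification */
  }
}
```

Because the levels are ordered, an application could also apply a simple threshold such as `c >= DEMO_CONFIDENCE_LRU_CACHE` instead of a per-value policy.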
*	Improve/add several protocols (#1383)	Ivan Nardi	2021-12-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve Microsoft, GMail, Likee, Whatsapp, DisneyPlus and Tiktok detection. Add Vimeo, Fuze, Alibaba and Firebase Crashlytics detection. Try to differentiate between Messenger/Signal standard flows (i.e. chat) and their VOIP (video)calls (like we already do for Whatsapp and Snapchat). Add a partial list of some ADS/Tracking stuff. Fix Cassandra, Radius and GTP false positives. Fix DNS, Syslog and SIP false negatives. Improve GTP (sub)classification: differentiate among GTP-U, GTP_C and GTP_PRIME. Fix 3 LGTM warnings.
*	ndpiReader: slight simplification of the output (#1378)	Ivan Nardi	2021-11-27
\|
*	Reworked HTTP protocol dissection including HTTP proxy and HTTP connect	Luca Deri	2021-11-25
\|
*	Rework how hostname/SNI info is saved (#1330)	Ivan Nardi	2021-11-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looking at `struct ndpi_flow_struct`, the two biggest fields are `host_server_name[240]` (mainly for HTTP hostnames and DNS domains) and `protos.tls_quic.client_requested_server_name[256]` (for TLS/QUIC SNIs). This commit aims to reduce the size of `struct ndpi_flow_struct`, according to two simple observations: 1) at most one of these two fields is used for each flow, so it seems safe to merge them; 2) even if hostnames/SNIs might be very long, in practice they are rarely longer than a few tens of bytes, so using a (single) large buffer is a waste of memory for all kinds of flows. If we need to truncate the name, we keep the last characters, easing domain matching. Analyzing some real traffic, it seems safe to assume that the vast majority of hostnames/SNIs are shorter than 80 bytes. Hostnames/SNIs are always converted to lowercase. Care was taken to ensure that unit-test outputs are not affected by this change. Because of a bug, TLS/QUIC SNIs were always truncated to 64 bytes (the first 64 ones): as a consequence, there were some "Suspicious DGA domain name" and "TLS Certificate Mismatch" false positives.
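The truncation rule described above (keep the last characters, always lowercase) can be sketched like this. `demo_save_hostname()` is a hypothetical helper written for illustration under those two stated rules; it is not the function the commit introduces, and the buffer sizes in the usage note are arbitrary.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the NUL-terminated `src` into `dst`
 * (capacity `dst_size`, terminator included). If the name does not fit,
 * keep its LAST `dst_size - 1` characters, so the registered domain at
 * the end of the name survives. The result is always lowercase. */
static void demo_save_hostname(char *dst, size_t dst_size, const char *src)
{
  size_t src_len, keep, skip, i;

  assert(dst != NULL && src != NULL && dst_size > 0);

  src_len = strlen(src);
  keep = (src_len < dst_size - 1) ? src_len : dst_size - 1;
  skip = src_len - keep; /* number of leading characters dropped */

  for (i = 0; i < keep; i++)
    dst[i] = (char)tolower((unsigned char)src[skip + i]);
  dst[keep] = '\0';
}
```

For instance, saving `"WWW.Example.COM"` into an 8-byte buffer keeps the tail `"ple.com"`, while a 16-byte buffer holds the whole name as `"www.example.com"`. Keeping the tail rather than the head is what eases suffix-based domain matching.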
*	Fixed cleartext protocol assignment (#1357)	Ivan Nardi	2021-10-25
\|
*	Refreshed results list	Luca Deri	2021-10-16
\|
*	Updated output	Luca Deri	2021-08-07
\|
*	ndpiReader: add statistics about nDPI performance (#1240)	Ivan Nardi	2021-07-13
\| \| \| \| \| \| \|	The goal is to have a (rough) idea of how many packets nDPI needs to properly classify a flow. Log this information (and the number of guessed flows too) during unit tests, to keep track of improvements/regressions across commits.
*	Added flow risk score	Luca Deri	2021-05-18
\|
*	Added browser TLS heuristic	Luca Deri	2021-05-13
\|
*	Improved DGA detection	Luca Deri	2021-03-03
\| \| \| \| \| \| \| \|	Before: Accuracy 66%, Precision 86%, Recall 38%. After: Accuracy 71%, Precision 89%, Recall 49%.
*	Removed check for known protocols (major and app protocols)	Luca Deri	2021-03-03
\|
*	Improved DGA detection with trigrams. Disadvantage: slower startup time	Luca Deri	2021-03-03
\| \| \| \| \|	Reworked Tor dissector embedded in TLS (fixes #1141). Removed false positive on HTTP User-Agent.
*	Fixes #1029	Luca Deri	2020-11-27
\|
*	Add Reddit support. (#1060)	Zied Aouini	2020-11-16
	* Add Reddit protocol. * Add Reddit test file and result. Co-authored-by: Luca Deri <lucaderi@users.noreply.github.com>