libndpi.git - Open Source Deep Packet Inspection Software Toolkit

	Commit message	Author	Age
*	Remove classification "by-ip" from protocol stack (#1743)	Ivan Nardi	2022-09-20
	Basically:
	* "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip`) is NEVER returned in the protocol stack (i.e. `flow->detected_protocol_stack[]`);
	* if the application is interested in such information, it can access `ndpi_protocol->protocol_by_ip` itself.
	There are mainly 4 points in the code that set the "classification by-ip" in the protocol stack: the generic `ndpi_set_detected_protocol()`/`ndpi_detection_giveup()` functions and the HTTP/STUN dissectors.
	In the unit tests output, a print about `ndpi_protocol->protocol_by_ip` has been added for each flow: the huge diff of this commit is mainly due to that.
	Strictly speaking, this change is NOT an API/ABI breakage, but there are important differences in the classification results. For example:
	* TLS flows without the initial handshake (or without a matching SNI/certificate) are simply classified as `TLS`;
	* similar for HTTP or QUIC flows;
	* DNS flows without a matching request domain are simply classified as `DNS`; we don't have `DNS/Google` anymore just because the server is 8.8.8.8 (that was an outrageous behaviour...);
	* flows previously classified only "by-ip" are now classified as `NDPI_PROTOCOL_UNKNOWN`.
	See #1425 for other examples of why adding the "classification by-ip" in the protocol stack is a bad idea.
	Please note that IPv6 is not supported :( (long-standing issue in nDPI), i.e. `ndpi_protocol->protocol_by_ip` will always be `NDPI_PROTOCOL_UNKNOWN` for IPv6 flows.
	The `NDPI_CONFIDENCE_MATCH_BY_IP` define has been removed.
	Close #1687
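	The split described above (dissection result in the stack, address-based hint in a separate field) might look like this from the application side. This is only a minimal sketch: `demo_protocol`, the `DEMO_PROTO_*` values and `demo_effective_app()` are hypothetical stand-ins, not the nDPI definitions; only the field name `protocol_by_ip` is taken from the commit message.

```c
#include <stdint.h>

/* Hypothetical protocol ids, chosen arbitrarily for this sketch. */
enum { DEMO_PROTO_UNKNOWN = 0, DEMO_PROTO_DNS = 1, DEMO_PROTO_GOOGLE = 2 };

/* Simplified stand-in for a classification result (not the real nDPI struct). */
typedef struct {
  uint16_t master_protocol; /* set by dissection only */
  uint16_t app_protocol;    /* sub-classification set by dissection only */
  uint16_t protocol_by_ip;  /* address-based hint, never merged into the above */
} demo_protocol;

/* Return the application protocol; fall back to the by-ip hint only
 * when the caller explicitly opts in. */
static uint16_t demo_effective_app(const demo_protocol *p, int use_ip_hint) {
  if (p->app_protocol != DEMO_PROTO_UNKNOWN)
    return p->app_protocol;
  return use_ip_hint ? p->protocol_by_ip : DEMO_PROTO_UNKNOWN;
}
```

	With a DNS flow towards a well-known resolver address, the result stays plain `DNS` unless the application asks for the hint itself.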
*	Remove a case of guessed sub-classification	Nardi Ivan	2022-09-14
	This code is triggered only for "unknown" flows with a valid SNI/hostname. Why, in that case, should the guessed classification be something like `DNS/Subprotocol_depending_on_hostname`? Why DNS as master and not HTTP or TLS or QUIC? Furthermore, I have not been able to trigger a positive match from that lookup. I strongly think that if we had a valid subprotocol, we would have a valid master in the first place. When in doubt, remove it completely.
	As a follow-up, we should investigate why some dissectors (the HTTP one, at least) set the SNI/hostname field without setting a valid protocol in the first place. This behaviour seems quite suspicious, if not plainly buggy.
*	HTTP: remove some wrong code	Nardi Ivan	2022-09-13
	This code seems wrong, or at least in the wrong place:
	* "classification by port" and "classification by ip" protocols (i.e. "guessed" protocols) should be used to set the protocol stack only after trying all the dissectors, and only by the generic code;
	* there is no reason (for a dissector) to update the "guessed" information using the protocol stack values: it is usually the other way around (see previous point).
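	The ordering argued for above (dissectors first, guessed protocols only afterwards, applied by generic code) can be sketched as follows. All names here (`demo_dissector_fn`, `demo_dissect_http()`, `demo_classify()`, the `DEMO_*` ids) are hypothetical and only illustrate the control flow; this is not the nDPI implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical protocol ids for this sketch. */
enum { DEMO_UNKNOWN = 0, DEMO_HTTP = 1, DEMO_GUESS_BY_PORT = 2 };

/* A dissector inspects the payload and returns a protocol id or DEMO_UNKNOWN. */
typedef uint16_t (*demo_dissector_fn)(const uint8_t *payload, size_t len);

/* Toy dissector: matches payloads starting with "GET ". It never looks at
 * (or updates) any guessed information. */
static uint16_t demo_dissect_http(const uint8_t *payload, size_t len) {
  return (len >= 4 && memcmp(payload, "GET ", 4) == 0) ? DEMO_HTTP : DEMO_UNKNOWN;
}

/* Generic code: try every dissector; use the guess only if all of them fail. */
static uint16_t demo_classify(const demo_dissector_fn *dissectors, size_t n,
                              const uint8_t *payload, size_t len,
                              uint16_t guessed) {
  for (size_t i = 0; i < n; i++) {
    uint16_t id = dissectors[i](payload, len);
    if (id != DEMO_UNKNOWN)
      return id; /* a real match always wins over a guess */
  }
  return guessed;
}
```

	Keeping the fallback in one generic function means no individual dissector needs to know that a guess exists.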
*	Fix `ndpi_do_guess()` (#1731)	Ivan Nardi	2022-09-12
	Avoid a double call of `ndpi_guess_host_protocol_id()`. Some code paths work for both IPv4 and IPv6. Remove some never-used code.
*	HTTP: correctly set the classification protocols	Nardi Ivan	2022-09-06
	Classification should always be set via `ndpi_set_detected_protocol()`, to be sure a correct `confidence` value is set too. Having a "known" protocol stack with `NDPI_CONFIDENCE_UNKNOWN` as confidence is not valid. This code in the HTTP dissector likely needs some more thought (the classification itself of the attached example doesn't make a lot of sense), but the goal of this commit is only to always have a valid `confidence` value.
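	The invariant stated above (a known protocol stack must never carry an unknown confidence) is easier to keep when the stack and the confidence are always written together by one setter. A minimal sketch with hypothetical names (`demo_flow`, `demo_set_detected()`, `demo_flow_is_consistent()`); the real `ndpi_set_detected_protocol()` signature is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical confidence levels for this sketch. */
enum demo_confidence { DEMO_CONFIDENCE_UNKNOWN = 0, DEMO_CONFIDENCE_DPI = 1 };

/* Simplified flow state: a two-entry protocol stack plus its confidence. */
typedef struct {
  uint16_t stack[2]; /* 0 means "unknown" in this sketch */
  enum demo_confidence confidence;
} demo_flow;

/* The only function that should write the stack: it always sets the
 * confidence at the same time. */
static void demo_set_detected(demo_flow *f, uint16_t upper, uint16_t lower,
                              enum demo_confidence c) {
  f->stack[0] = upper;
  f->stack[1] = lower;
  f->confidence = c;
}

/* Returns 1 if the flow respects the invariant, 0 otherwise. */
static int demo_flow_is_consistent(const demo_flow *f) {
  int known = (f->stack[0] != 0 || f->stack[1] != 0);
  return !known || f->confidence != DEMO_CONFIDENCE_UNKNOWN;
}
```

	A dissector that writes the stack directly, bypassing the setter, is exactly the case such a consistency check would catch.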