| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
Extend internal unit tests to handle multiple configurations.
As some examples, add tests about:
* disabling some protocols
* disabling Ookla aggressiveness
Every configurations data is stored in a dedicated directory under
`tests\cfgs`
|
|
|
|
|
|
| |
DPI (#1891)
Average values are already printed, but this change should ease to
identify regressions/improvements.
|
| |
|
|
|
|
|
|
| |
We need to keep separete counters to keep tracks of packet numbers with
and without any payload.
Regression introduced in 5849863ef
|
| |
|
|
|
|
|
| |
Avoid some LineCall and Jabber false positives.
Detect Discord mid flows.
Fix Bittorrent detection.
|
|
|
|
|
|
|
|
|
|
|
| |
Classification "by-port" should be the last possible effort, *after*
having test all the LRU caches.
Remove some dead code from `ndpi_detection_giveup()`:
`flow->guessed_protocol_id` is never set to any od those voip protocols
and at that point in this function we never have both a master *and* a
application protocols.
Coverage reports (both from unit tests and from fuzzing) confirms that
was dead code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These protocols:
* have been addeded in the OpenDPI era
* have never been updated since then
* we don't have any pcap examples [*]
If (and it is a big if...) some of these protocols are still somehow
used and if someone is still interested in them, we can probably
re-add them starting from scratch (because the current detection
rules are probably outdated)
Protocols removed: DIRECT_DOWNLOAD_LINK, APPLEJUICE, DIRECTCONNECT,
OPENFT, FASTTRACK, SHOUTCAST, THUNDER, AYIYA, STEALTHNET, FIESTA,
FLORENSIA, AIMINI, SOPCAST
PPSTREAM dissector works (...) only on UDP.
[*]: with do have an AIMINI test pcap but it was some trivial http
traffic detected only by hostname matching, on domains no more
available...
|
|
|
|
|
| |
Signed-off-by: Darryl Sokoloski <darryl@sokoloski.ca>
Signed-off-by: Darryl Sokoloski <darryl@sokoloski.ca>
|
|
|
|
| |
See: "Enabling Passive Measurement of Zoom Performance in Production Networks"
https://dl.acm.org/doi/pdf/10.1145/3517745.3561414
|
|
|
|
|
|
|
| |
We already performed exactly these lookups in the generic code to
populate `flow->guessed_protocol_id_by_ip`: use it!
This code probably needs a deeper review, since it is basicaly a simple
matching on ip + port.
|
|
|
|
|
|
|
| |
* all credits goes to @verzulli
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
|
|
|
|
|
|
| |
* all credits goes to @verzulli
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
|
|
|
|
|
|
|
|
| |
0 as size value disable the cache.
The diffs in unit tests are due to the fact that some lookups are
performed before the first insert: before this change these lookups
weren't counted because the cache was not yet initialized, now they are.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically:
* "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip` is
NEVER returned in the protocol stack (i.e.
`flow->detected_protocol_stack[]`);
* if the application is interested into such information, it can access
`ndpi_protocol->protocol_by_ip` itself.
There are mainly 4 points in the code that set the "classification
by-ip" in the protocol stack: the generic `ndpi_set_detected_protocol()`/
`ndpi_detection_giveup()` functions and the HTTP/STUN dissectors.
In the unit tests output, a print about `ndpi_protocol->protocol_by_ip`
has been added for each flow: the huge diff of this commit is mainly due
to that.
Strictly speaking, this change is NOT an API/ABI breakage, but there are
important differences in the classification results. For examples:
* TLS flows without the initial handshake (or without a matching
SNI/certificate) are simply classified as `TLS`;
* similar for HTTP or QUIC flows;
* DNS flows without a matching request domain are simply classified as
`DNS`; we don't have `DNS/Google` anymore just because the server is
8.8.8.8 (that was an outrageous behaviour...);
* flows previusoly classified only "by-ip" are now classified as
`NDPI_PROTOCOL_UNKNOWN`.
See #1425 for other examples of why adding the "classification by-ip" in
the protocol stack is a bad idea.
Please, note that IPV6 is not supported :( (long standing issue in nDPI) i.e.
`ndpi_protocol->protocol_by_ip` wil be always `NDPI_PROTOCOL_UNKNOWN` for
IPv6 flows.
Define `NDPI_CONFIDENCE_MATCH_BY_IP` has been removed.
Close #1687
|
|
|
|
|
| |
Avoid a double call of `ndpi_guess_host_protocol_id()`.
Some code paths work for ipv4/6 both
Remove some never used code.
|
|
|
|
|
|
|
|
|
| |
Add detection over TCP and fix detection over IPv6.
Rename some variables since Stun dissector is no more "udp-centric".
Stun dissector should always classified the flow as `STUN` or
`STUN/Something`.
Don't touch `flow->guessed_host_protocol_id` field, which should be
always be related to "ip-classification" only.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The host automa is used for two tasks:
* protocol sub-classification (obviously);
* DGA evaluation: the idea is that if a domain is present in this
automa, it can't be a DGA, regardless of its format/name.
In most dissectors both checks are executed, i.e. the code is something
like:
```
ndpi_match_host_subprotocol(..., flow->host_server_name, ...);
ndpi_check_dga_name(..., flow->host_server_name,...);
```
In that common case, we can perform only one automa lookup: if we check the
sub-classification before the DGA, we can avoid the second lookup in
the DGA function itself.
|
|
|
|
|
|
|
|
| |
* CQL: fixed byte order conversion (BigEndian not LittleEndian)
* CQL: increased required successful dissected packets to prevent false-positives
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
|
|
|
|
|
| |
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
|
|
|
|
|
| |
* added static assert if supported, to complain if the flow struct changes
Signed-off-by: lns <matzeton@googlemail.com>
|
|
|
|
|
|
|
|
|
|
| |
Add (basic) internal stats to the main data structures used by the
library; they might be usefull to check how effective these structures
are.
Add an option to `ndpiReader` to dump them; enabled by default in the
unit tests.
This new option enables/disables dumping of "num dissectors calls"
values, too (see b4cb14ec).
|
|
|
|
|
|
|
|
|
|
|
| |
Since e6b332aa, we have proper support for detecting client/server
direction. So Tinc dissector is now able to properly initialize the
cache entry only when needed and not anymore at the SYN time; initializing
that entry for **every** SYN packets was a complete waste of resources.
Since 4896dabb, the various `struct ndpi_call_function_struct`
structures are not more separate objects and therefore comparing them
using only their pointers is bogus: this bug was triggered by this
change because `ndpi_str->callback_buffer_size_tcp_no_payload` is now 0.
|
|
|
|
|
|
|
| |
Tcp retransmissions should be ignored.
Remove some unused protocol bitmasks.
Update script to download Whatsapp IP list.
|
|
|
| |
Signed-off-by: lns <matzeton@googlemail.com>
|
|
|
|
| |
Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
|
|
|
|
|
|
| |
Skype detection over TCP has been completely disable since 659f75138 (3
years ago!).
Since that logic was too weak anyway, remove it.
|
|
|
|
| |
We should stop processing a flow if all protocols have been excluded or
if we have already processed too many packets.
|
|
|
| |
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
|
| |
|
| |
|
|
|
|
| |
Added ability to identify application and network protocols
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add few scripts to easily update some IPs lists
Some IPs lists should be updated frequently: try to easy the process.
The basic idea is taken from d59fefd0 and a8fe74e5 (for Azure
addresses): one specific .c.inc file and one script for each protocol.
Add the possibility to don't load a specific list.
Rename the old NDPI_PROTOCOL_HOTMAIL id to NDPI_PROTOCOL_MS_OUTLOOK,
to identify Hotmail/Outlook/Exchange flows.
TODO: ipv6
Remove the 9 addresses associated to BitTorrent: they have been added in
e2f21116 but it is not clear why all the traffic to/from these ips
should be classified as BitTorrent.
* Added quotes
* Added quotes
Co-authored-by: Luca Deri <lucaderi@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We should have two protocols in classification results only when the
"master" protocol allows some sub-protocols.
Classifications like `AmazonAWS`, `TLS/AmazonAWS`, `DNS/AmazonAWS` are
fine. However classifications like `NTP/Apple`, `BitTorrent/Azure`,
`DNScrypt.AmazonAWS` or `NestLogSink.Google` are misleading.
For example, `ndpiReader`shows `BitTorrent/Azure` flows under `Azure`
statistics; that seems to be wrong or, at least, very misleading.
This is quite important since we have lots of addresses from CDN
operators.
The only drawback of this solution is that right now ICMP traffic is
classified simply as `ICMP`; if we are really interested in ICMP stuff
we can restore the old behaviour later.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a general rule, the higher the confidence value, the higher the
"reliability/precision" of the classification.
In other words, this new field provides an hint about "how" the flow
classification has been obtained.
For example, the application may want to ignore classification "by-port"
(they are not real DPI classifications, after all) or give a second
glance at flows classified via LRU caches (because of false positives).
Setting only one value for the confidence field is a bit tricky: more
work is probably needed in the next future to tweak/fix/improve the logic.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
The goal is to have a (roughly) idea about how many packets nDPI needs
to properly classify a flow.
Log this information (and guessed flows number too) during unit tests,
to keep track of improvements/regressions across commits.
|
| |
|
|
|
|
| |
Optimized stddev calculation
|
| |
|
| |
|
|
|
|
| |
Added packet lenght distribution bins
|
| |
|
|
|
|
| |
The first decoded address is now reported by ndpiReader
|
|
|
|
|
| |
Used IP-based detection to compute the application protocol
Improved application detection
|
| |
|
| |
|
| |
|