| Commit message (Collapse) | Author | Age |
|
|
| |
Fix: 2c7fb9179
|
|
|
|
| |
Attribute 0xC057 is defined in the Google public implementation of
webrtc (which is used by Google products but also by other applications)
|
| |
|
| |
|
|
|
|
|
|
| |
Regardless of the name, the removed trace doesn't contain meaningful
Hangout traffic.
Remove last piece of sub-classifiction based only on ip addresses.
|
| |
|
|
|
|
| |
TCP framing is optional
|
| |
|
|
|
|
| |
Look for RTP packets in the STUN sessions.
TODO: tell RTP from RTCP
|
|
|
|
|
| |
For a lot of protocols, reduce the number of packets after which the
protocols dissector gives up.
The values are quite arbitary, tring to not impact on classification
|
|
|
|
| |
The list has been taken from https://www.similarweb.com/top-websites/adult/
Fix a GoTo false positive.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` define from
`fuzz/Makefile.am`; it is already included by the main configure script
(when fuzzing).
Add a knob to force disabling of AESNI optimizations: this way we can
fuzz also no-aesni crypto code.
Move CRC32 algorithm into the library.
Add some fake traces to extend fuzzing coverage. Note that these traces
are hand-made (via scapy/curl) and must not be used as "proof" that the
dissectors are really able to identify this kind of traffic.
Some small updates to some dissectors:
CSGO: remove a wrong rule (never triggered, BTW). Any UDP packet starting
with "VS01" will be classified as STEAM (see steam.c around line 111).
Googling it, it seems right so.
XBOX: XBOX only analyses UDP flows while HTTP only TCP ones; therefore
that condition is false.
RTP, STUN: removed useless "break"s
Zattoo: `flow->zattoo_stage` is never set to any values greater or equal
to 5, so these checks are never true.
PPStream: `flow->l4.udp.ppstream_stage` is never read. Delete it.
TeamSpeak: we check for `flow->packet_counter == 3` just above, so the
following check `flow->packet_counter >= 3` is always false.
|
|
|
|
|
|
|
|
| |
All dissector callbacks should not be exported by the library; make static
some other local functions.
The callback logic in `ndpiReader` has never been used.
With internal libgcrypt, `gcry_control()` should always return no
errors.
We can check `categories` length at compilation time.
|
|
|
| |
Two caches already implemented a similar mechanism: make it generic.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The application may enable only some protocols.
Disabling a protocol means:
*) don't register/use the protocol dissector code (if any)
*) disable classification by-port for such a protocol
*) disable string matchings for domains/certificates involving this protocol
*) disable subprotocol registration (if any)
This feature can be tested with `ndpiReader -B list_of_protocols_to_disable`.
Custom protocols are always enabled.
Technically speaking, this commit doesn't introduce any API/ABI
incompatibility. However, calling `ndpi_set_protocol_detection_bitmask2()`
is now mandatory, just after having called `ndpi_init_detection_module()`.
Most of the diffs (and all the diffs in `/src/lib/protocols/`) are due to
the removing of some function parameters.
Fix the low level macro `NDPI_LOG`. This issue hasn't been detected
sooner simply because almost all the code uses only the helpers `NDPI_LOG_*`
|
|
|
|
| |
See: "Enabling Passive Measurement of Zoom Performance in Production Networks"
https://dl.acm.org/doi/pdf/10.1145/3517745.3561414
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplest solution, keeping the existing cache data structure
TLS certificate cache is used for DTLS traffic, too.
Note that Ookla cache already works with ipv6 flows.
TODO:
* make the key/hashing more robust (extending the key size?)
* update bittorrent cache too. That task is quite difficult because
ntopng uses a public function (`ndpi_guess_undetected_protocol()`)
intrinsically ipv4 only...
|
|
|
|
|
|
|
|
|
| |
LRU callbacks have been added in 460ff3c7a, but they have never been
used and they have never been extended to the other LRU caches.
`ndpi_search_tcp_or_udp()` basically returns the classification by port/ip
of the flow; calling it from the dissector is useless.
The same for TOR detection: ips are checked in the generic code
|
|
|
|
|
|
|
|
| |
0 as size value disable the cache.
The diffs in unit tests are due to the fact that some lookups are
performed before the first insert: before this change these lookups
weren't counted because the cache was not yet initialized, now they are.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically:
* "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip` is
NEVER returned in the protocol stack (i.e.
`flow->detected_protocol_stack[]`);
* if the application is interested into such information, it can access
`ndpi_protocol->protocol_by_ip` itself.
There are mainly 4 points in the code that set the "classification
by-ip" in the protocol stack: the generic `ndpi_set_detected_protocol()`/
`ndpi_detection_giveup()` functions and the HTTP/STUN dissectors.
In the unit tests output, a print about `ndpi_protocol->protocol_by_ip`
has been added for each flow: the huge diff of this commit is mainly due
to that.
Strictly speaking, this change is NOT an API/ABI breakage, but there are
important differences in the classification results. For examples:
* TLS flows without the initial handshake (or without a matching
SNI/certificate) are simply classified as `TLS`;
* similar for HTTP or QUIC flows;
* DNS flows without a matching request domain are simply classified as
`DNS`; we don't have `DNS/Google` anymore just because the server is
8.8.8.8 (that was an outrageous behaviour...);
* flows previusoly classified only "by-ip" are now classified as
`NDPI_PROTOCOL_UNKNOWN`.
See #1425 for other examples of why adding the "classification by-ip" in
the protocol stack is a bad idea.
Please, note that IPV6 is not supported :( (long standing issue in nDPI) i.e.
`ndpi_protocol->protocol_by_ip` wil be always `NDPI_PROTOCOL_UNKNOWN` for
IPv6 flows.
Define `NDPI_CONFIDENCE_MATCH_BY_IP` has been removed.
Close #1687
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The field `flow->guessed_host_protocol_id` is set at the beginning of
the flow analysis and it represents the "classification by ip" of the flow
itself.
This field should never be changed. Dissectors which want to provide an
"hint" about the classification, should update `flow->guessed_protocol_id`
instead. Such "hint" is useless if the dissector set the "extra-dissection"
data-path.
Rename such field to `guessed_protocol_id_by_ip` to better describe its
role.
Preliminary work necessary for #1687
|
|
|
|
|
|
|
|
|
| |
Add detection over TCP and fix detection over IPv6.
Rename some variables since Stun dissector is no more "udp-centric".
Stun dissector should always classified the flow as `STUN` or
`STUN/Something`.
Don't touch `flow->guessed_host_protocol_id` field, which should be
always be related to "ip-classification" only.
|
| |
|
| |
|
|
|
| |
See #1312
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a general rule, the higher the confidence value, the higher the
"reliability/precision" of the classification.
In other words, this new field provides an hint about "how" the flow
classification has been obtained.
For example, the application may want to ignore classification "by-port"
(they are not real DPI classifications, after all) or give a second
glance at flows classified via LRU caches (because of false positives).
Setting only one value for the confidence field is a bit tricky: more
work is probably needed in the next future to tweak/fix/improve the logic.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve Microsoft, GMail, Likee, Whatsapp, DisneyPlus and Tiktok
detection.
Add Vimeo, Fuze, Alibaba and Firebase Crashlytics detection.
Try to differentiate between Messenger/Signal standard flows (i.e chat)
and their VOIP (video)calls (like we already do for Whatsapp and
Snapchat).
Add a partial list of some ADS/Tracking stuff.
Fix Cassandra, Radius and GTP false positives.
Fix DNS, Syslog and SIP false negatives.
Improve GTP (sub)classification: differentiate among GTP-U, GTP_C and
GTP_PRIME.
Fix 3 LGTM warnings.
|
|
|
|
|
|
|
|
|
|
|
| |
There are no valid reasons for a (generic) protocol to ignore IPv6
traffic.
Note that:
* I have not found the specifications of "CheckPoint High Availability
Protocol", so I don't know how/if it supports IPv6
* all LRU caches are still IPv4 only
Even if src_id/dst_id stuff is probably useless (see #1279), the right
way to update the protocol classification is via `ndpi_set_detected_protocol()`
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looking at `struct ndpi_flow_struct` the two bigger fields are
`host_server_name[240]` (mainly for HTTP hostnames and DNS domains) and
`protos.tls_quic.client_requested_server_name[256]`
(for TLS/QUIC SNIs).
This commit aims to reduce `struct ndpi_flow_struct` size, according to
two simple observations:
1) maximum one of these two fields is used for each flow. So it seems safe
to merge them;
2) even if hostnames/SNIs might be very long, in practice they are rarely
longer than a fews tens of bytes. So, using a (single) large buffer is a
waste of memory for all kinds of flows. If we need to truncate the name,
we keep the *last* characters, easing domain matching.
Analyzing some real traffic, it seems safe to assume that the vast
majority of hostnames/SNIs is shorter than 80 bytes.
Hostnames/SNIs are always converted to lowercase.
Attention was given so as to be sure that unit-tests outputs are not
affected by this change.
Because of a bug, TLS/QUIC SNI were always truncated to 64 bytes (the
*first* 64 ones): as a consequence, there were some "Suspicious DGA
domain name" and "TLS Certificate Mismatch" false positives.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can write to `flow->protos` only after a proper classification.
This issue has been found in Kerberos, DHCP, HTTP, STUN, IMO, FTP,
SMTP, IMAP and POP code.
There are two kinds of fixes:
* write to `flow->protos` only if a final protocol has been detected
* move protocol state out of `flow->protos`
The hard part is to find, for each protocol, the right tradeoff between
memory usage and code complexity.
Handle Kerberos like DNS: if we find a request, we set the protocol
and an extra callback to further parsing the reply.
For all the other protocols, move the state out of `flow->protos`. This
is an issue only for the FTP/MAIL stuff.
Add DHCP Class Identification value to the output of ndpiReader and to
the Jason serialization.
Extend code coverage of fuzz tests.
Close #1343
Close #1342
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are no real reasons to embed `struct ndpi_packet_struct` (i.e. "packet")
in `struct ndpi_flow_struct` (i.e. "flow"). In other words, we can avoid
saving dissection information of "current packet" into the "flow" state,
i.e. in the flow management table.
The nDPI detection module processes only one packet at the time, so it is
safe to save packet dissection information in `struct ndpi_detection_module_struct`,
reusing always the same "packet" instance and saving a huge amount of memory.
Bottom line: we need only one copy of "packet" (for detection module),
not one for each "flow".
It is not clear how/why "packet" ended up in "flow" in the first place.
It has been there since the beginning of the GIT history, but in the original
OpenDPI code `struct ipoque_packet_struct` was embedded in
`struct ipoque_detection_module_struct`, i.e. there was the same exact
situation this commit wants to achieve.
Most of the changes in this PR are some boilerplate to update something
like "flow->packet" into something like "module->packet" throughout the code.
Some attention has been paid to update `ndpi_init_packet()` since we need
to reset some "packet" fields before starting to process another packet.
There has been one important change, though, in ndpi_detection_giveup().
Nothing changed for the applications/users, but this function can't access
"packet" anymore.
The reason is that this function can be called "asynchronously" with respect
to the data processing, i.e in context where there is no valid notion of
"current packet"; for example ndpiReader calls it after having processed all
the traffic, iterating the entire session table.
Mining LRU stuff seems a bit odd (even before this patch): probably we need
to rethink it, as a follow-up.
|
|
|
|
| |
While at it, improve detection of Facebook Messenger
|
| |
|
|
|
|
| |
Cleaned up TLS code for DTLS detection by defining a new DTLS protocol
|
|
|
|
| |
prevented Skype calls to be properly identified
|
| |
|
|
|
| |
STUN traffic doesn't use multicast addresses
|
|
|
|
|
| |
Optimized various UDP dissectors
Removed dead protocols such as pando and pplive
|
| |
|
| |
|
|
|
|
| |
Cleaned Microsoft IP addresses list
|
| |
|
| |
|
| |
|
| |
|
| |
|