author Paul Donald <itsascambutmailmeanyway@gmail.com> 2023-11-27 09:08:25 +0100
committer GitHub <noreply@github.com> 2023-11-27 09:08:25 +0100
commit a5dcc1739616f9fe1cda6bd1dea06c30f07dcdcf
tree bfa882c3e4770993a881b122d11a22f207f7c941 /README.md
parent 3416db11dc3ff5b6c259f25d955e283d1d6e4b40
Update README.md (#32)
Sp/gr.
Co-authored-by: Toni <matzeton@googlemail.com>
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 82
1 files changed, 42 insertions, 40 deletions
diff --git a/README.md b/README.md
index 70908402f..3443b3d10 100644
--- a/README.md
+++ b/README.md
@@ -16,26 +16,26 @@
# Disclaimer
-Please respect&protect the privacy of others.
+Please respect & protect the privacy of others.
The purpose of this software is not to spy on others, but to detect network anomalies and malicious traffic.
# Abstract
nDPId is a set of daemons and tools to capture, process and classify network traffic.
-It's minimal dependencies (besides a half-way modern c library and POSIX threads) are libnDPI (>=4.8.0 or current github dev branch) and libpcap.
+Its minimal dependencies (besides a half-way modern C library and POSIX threads) are libnDPI (>=4.8.0 or current github dev branch) and libpcap.
The daemon `nDPId` is capable of multithreading for packet processing, but w/o mutexes for performance reasons.
-Instead synchronization is achieved by a packet distribution mechanism.
-To balance all workload to all threads (more or less) equally a unique identifier represented as hash value is calculated using a 3-tuple consisting of IPv4/IPv6 src/dst address, IP header value of the layer4 protocol and (for TCP/UDP) src/dst port. Other protocols e.g. ICMP/ICMPv6 are lacking relevance for DPI, thus nDPId does not distinguish between different ICMP/ICMPv6 flows coming from the same host. Saves memory and performance, but might change in the future.
+Instead, synchronization is achieved by a packet distribution mechanism.
+To balance the workload to all threads (more or less) equally, a unique identifier represented as hash value is calculated using a 3-tuple consisting of: IPv4/IPv6 src/dst address; IP header value of the layer4 protocol; and (for TCP/UDP) src/dst port. Other protocols e.g. ICMP/ICMPv6 lack relevance for DPI, thus nDPId does not distinguish between different ICMP/ICMPv6 flows coming from the same host. This saves memory and performance, but might change in the future.
-`nDPId` uses libnDPI's JSON serialization interface to generate a JSON strings for each event it receive from the library and which it then sends out to a UNIX-socket (default: /tmp/ndpid-collector.sock ). From such a socket, `nDPIsrvd` (or other custom applications) can retrieve incoming JSON-messages and further proceed working/distributing messages to higher-level applications.
+`nDPId` uses libnDPI's JSON serialization interface to generate a JSON strings for each event it receives from the library and which it then sends out to a UNIX-socket (default: `/tmp/ndpid-collector.sock` ). From such a socket, `nDPIsrvd` (or other custom applications) can retrieve incoming JSON-messages and further proceed working/distributing messages to higher-level applications.
-Unfortunately `nDPIsrvd` does currently not support any encryption/authentication for TCP connections (TODO!).
+Unfortunately, `nDPIsrvd` does not yet support any encryption/authentication for TCP connections (TODO!).
# Architecture
-This project uses some kind of microservice architecture.
+This project uses a kind of microservice architecture.
```text
connect to UNIX socket [1] connect to UNIX/TCP socket [2]
@@ -71,7 +71,7 @@ where:
JSON messages streamed by both `nDPId` and `nDPIsrvd` are presented with:
-* a 5-digit-number describing (as decimal number) of the **entire** JSON string including the newline `\n` at the end;
+* a 5-digit-number describing (as decimal number) the **entire** JSON string including the newline `\n` at the end;
* the JSON messages
```text
@@ -88,12 +88,12 @@ as with the following example:
The full stream of `nDPId` generated JSON-events can be retrieved directly from `nDPId`, without relying on `nDPIsrvd`, by providing a properly managed UNIX-socket.
-Technical details about JSON-messages format can be obtained from related `.schema` file included in the `schema` directory
+Technical details about the JSON-message format can be obtained from the related `.schema` file included in the `schema` directory
# Events
-`nDPId` generates JSON strings whereas each string is assigned to a certain event.
+`nDPId` generates JSON strings whereby each string is assigned to a certain event.
Those events specify the contents (key-value-pairs) of the JSON string.
They are divided into four categories, each with a number of subevents.
@@ -132,10 +132,10 @@ Detailed JSON-schema is available [here](schema/daemon_event_schema.json)
## Packet Events
-There are 2 events containing base64 encoded packet payload either belonging to a flow or not:
+There are 2 events containing base64 encoded packet payloads either belonging to a flow or not:
1. packet: does not belong to any flow
-2. packet-flow: does belong to a flow e.g. TCP/UDP or ICMP
+2. packet-flow: belongs to a flow e.g. TCP/UDP or ICMP
Detailed JSON-schema is available [here](schema/packet_event_schema.json)
@@ -143,11 +143,11 @@ Detailed JSON-schema is available [here](schema/packet_event_schema.json)
There are 9 distinct events related to a flow:
1. new: a new TCP/UDP/ICMP flow seen which will be tracked
-2. end: a TCP connections terminates
+2. end: a TCP connection terminates
3. idle: a flow timed out, because there was no packet on the wire for a certain amount of time
4. update: inform nDPIsrvd or other apps about a long-lasting flow, whose detection was finished a long time ago but is still active
5. analyse: provide some information about extracted features of a flow (Experimental; disabled per default, enable with `-A`)
-6. guessed: `libnDPI` was not able to reliable detect a layer7 protocol and falls back to IP/Port based detection
+6. guessed: `libnDPI` was not able to reliably detect a layer7 protocol and falls back to IP/Port based detection
7. detected: `libnDPI` sucessfully detected a layer7 protocol
8. detection-update: `libnDPI` dissected more layer7 protocol data (after detection already done)
9. not-detected: neither detected nor guessed
@@ -158,8 +158,8 @@ Detailed JSON-schema is available [here](schema/flow_event_schema.json). Also, a
A flow can have three different states while it is been tracked by `nDPId`.
-1. skipped: the flow will be tracked, but no detection will happen to safe memory, see command line argument `-I` and `-E`
-2. finished: detection finished and the memory used for the detection is free'd
+1. skipped: the flow will be tracked, but no detection will happen to safe memory. See command line argument `-I` and `-E`
+2. finished: detection finished and the memory used for the detection is freed
3. info: detection is in progress and all flow memory required for `libnDPI` is allocated (this state consumes most memory)
# Build (CMake)
@@ -181,7 +181,7 @@ see below for a full/test live-session
![](examples/ndpid_install_and_run.gif)
-Based on your building environment and/or desiderata, you could need:
+Based on your build environment and/or desiderata, you could need:
```shell
mkdir build
@@ -197,8 +197,8 @@ cd build
cmake .. -DSTATIC_LIBNDPI_INSTALLDIR=[path/to/your/libnDPI/installdir]
```
-If you're using the latter one, make sure that you've configured libnDPI with `./configure --prefix=[path/to/your/libnDPI/installdir]`
-and do not forget to set the all necessary CMake variables to link against shared libraries used by your nDPI build.
+If you use the latter, make sure that you've configured libnDPI with `./configure --prefix=[path/to/your/libnDPI/installdir]`
+and remember to set the all-necessary CMake variables to link against shared libraries used by your nDPI build.
e.g.:
@@ -216,19 +216,21 @@ cd build
cmake .. -DBUILD_NDPI=ON
```
-The CMake cache variable `-DBUILD_NDPI=ON` builds a version of `libnDPI` residing as git submodule in this repository.
+The CMake cache variable `-DBUILD_NDPI=ON` builds a version of `libnDPI` residing as a git submodule in this repository.
# run
-As mentioned above, in order to run `nDPId` a UNIX-socket need to be provided in order to stream our related JSON-data.
+As mentioned above, in order to run `nDPId`, a UNIX-socket needs to be provided in order to stream our related JSON-data.
Such a UNIX-socket can be provided by both the included `nDPIsrvd` daemon, or, if you simply need a quick check, with the [ncat](https://nmap.org/book/ncat-man.html) utility, with a simple `ncat -U /tmp/listen.sock -l -k`. Remember that OpenBSD `netcat` is not able to handle multiple connections reliably.
-Once the socket is ready, you can run `nDPId` capturing and analyzing your own traffic, with something similar to:
+Once the socket is ready, you can run `nDPId` capturing and analyzing your own traffic, with something similar to: `sudo nDPId -c /tmp/listen.sock`
+If you're using OpenBSD `netcat`, you need to run: `sudo nDPId -c /tmp/listen.sock -o max-reader-threads=1`
+Make sure that the UNIX socket is accessible by the user (see -u) to whom nDPId changes to, default: nobody.
-Of course, both `ncat` and `nDPId` need to point to the same UNIX-socket (`nDPId` provides the `-c` option, exactly for this. As a default, `nDPId` refer to `/tmp/ndpid-collector.sock`, and the same default-path is also used by `nDPIsrvd` as for the incoming socket).
+Of course, both `ncat` and `nDPId` need to point to the same UNIX-socket (`nDPId` provides the `-c` option, exactly for this. By default, `nDPId` refers to `/tmp/ndpid-collector.sock`, and the same default-path is also used by `nDPIsrvd` for the incoming socket).
-You also need to provide `nDPId` some real-traffic. You can capture your own traffic, with something similar to:
+Give `nDPId` some real-traffic. You can capture your own traffic, with something similar to:
```shell
socat -u UNIX-Listen:/tmp/listen.sock,fork - # does the same as `ncat`
@@ -256,7 +258,7 @@ Daemons:
make -C [path-to-a-build-dir] daemon
```
-Or you can proceed with a manual approach with:
+Or a manual approach with:
```shell
./nDPIsrvd -d
@@ -291,22 +293,22 @@ Suboptions for `-o`:
Format: `subopt` (unit, comment): description
* `max-flows-per-thread` (N, caution advised): affects max. memory usage
- * `max-idle-flows-per-thread` (N, safe): max. allowed idle flows which memory get's free'd after `flow-scan-interval`
+ * `max-idle-flows-per-thread` (N, safe): max. allowed idle flows whose memory gets freed after `flow-scan-interval`
* `max-reader-threads` (N, safe): amount of packet processing threads, every thread can have a max. of `max-flows-per-thread` flows
- * `daemon-status-interval` (ms, safe): specifies how often daemon event `status` will be generated
- * `compression-scan-interval` (ms, untested): specifies how often `nDPId` should scan for inactive flows ready for compression
- * `compression-flow-inactivity` (ms, untested): the earliest period of time that must elapse before `nDPId` may consider compressing a flow that did neither send nor receive any data
- * `flow-scan-interval` (ms, safe): min. amount of time after which `nDPId` will scan for idle or long-lasting flows
- * `generic-max-idle-time` (ms, untested): time after which a non TCP/UDP/ICMP flow will time out
- * `icmp-max-idle-time` (ms, untested): time after which an ICMP flow will time out
- * `udp-max-idle-time` (ms, caution advised): time after which an UDP flow will time out
- * `tcp-max-idle-time` (ms, caution advised): time after which a TCP flow will time out
- * `tcp-max-post-end-flow-time` (ms, caution advised): a TCP flow that received a FIN or RST will wait that amount of time before flow tracking will be stopped and the flow memory free'd
- * `max-packets-per-flow-to-send` (N, safe): max. `packet-flow` events that will be generated for the first N packets of each flow
- * `max-packets-per-flow-to-process` (N, caution advised): max. packets that will be processed by `libnDPI`
+ * `daemon-status-interval` (ms, safe): specifies how often daemon event `status` is generated
+ * `compression-scan-interval` (ms, untested): specifies how often `nDPId` scans for inactive flows ready for compression
+ * `compression-flow-inactivity` (ms, untested): the shortest period of time elapsed before `nDPId` considers compressing a flow that neither sent nor received any data
+ * `flow-scan-interval` (ms, safe): min. amount of time after which `nDPId` scans for idle or long-lasting flows
+ * `generic-max-idle-time` (ms, untested): time after which a non TCP/UDP/ICMP flow times out
+ * `icmp-max-idle-time` (ms, untested): time after which an ICMP flow times out
+ * `udp-max-idle-time` (ms, caution advised): time after which an UDP flow times out
+ * `tcp-max-idle-time` (ms, caution advised): time after which a TCP flow times out
+ * `tcp-max-post-end-flow-time` (ms, caution advised): a TCP flow that received a FIN or RST waits this amount of time before flow tracking stops and the flow memory is freed
+ * `max-packets-per-flow-to-send` (N, safe): max. `packet-flow` events generated for the first N packets of each flow
+ * `max-packets-per-flow-to-process` (N, caution advised): max. amount of packets processed by `libnDPI`
* `max-packets-per-flow-to-analyze` (N, safe): max. packets to analyze before sending an `analyse` event, requires `-A`
- * `error-event-threshold-n` (N, safe): max. error events to sent until threshold time passed by
- * `error-event-threshold-time` (N, safe): time after which the error event thresold will be reset
+ * `error-event-threshold-n` (N, safe): max. error events to send until threshold time has passed
+ * `error-event-threshold-time` (N, safe): time after which the error event threshold resets
# test
@@ -329,7 +331,7 @@ e.g.:
Remember that all test results are tied to a specific libnDPI commit hash
as part of the `git submodule`. Using `test/run_tests.sh` for other commit hashes
-will most likely result in PCAP diff's.
+will most likely result in PCAP diffs.
Why not use `examples/py-flow-dashboard/flow-dash.py` to visualize nDPId's output.
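Note on the hunk at `@@ -71,7 +71,7 @@`: the README describes each streamed message as a 5-digit decimal length followed by the JSON string, where the length covers the entire JSON string including the trailing newline. A minimal Python sketch of how a consumer might split such a stream is shown below. The function names `frame` and `split_messages` are hypothetical and not part of nDPId, and the sketch assumes the length counts bytes rather than characters, which the README text does not state explicitly.

```python
import json

LENGTH_DIGITS = 5  # per the README: a 5-digit decimal length prefix


def frame(obj) -> bytes:
    """Encode obj the way the README describes (hypothetical helper)."""
    body = (json.dumps(obj) + "\n").encode("utf-8")
    # Assumption: the prefix counts the bytes of the JSON string plus newline.
    return f"{len(body):0{LENGTH_DIGITS}d}".encode("ascii") + body


def split_messages(buffer: bytes):
    """Return (decoded_messages, unconsumed_remainder) for a byte buffer."""
    messages = []
    while len(buffer) >= LENGTH_DIGITS:
        length = int(buffer[:LENGTH_DIGITS].decode("ascii"))
        end = LENGTH_DIGITS + length
        if len(buffer) < end:
            break  # incomplete message: wait for more data
        # json.loads tolerates the trailing newline inside the payload.
        messages.append(json.loads(buffer[LENGTH_DIGITS:end]))
        buffer = buffer[end:]
    return messages, buffer
```

A reader loop would append each chunk received from the socket to the remainder returned by the previous call, so that messages split across reads are reassembled before decoding.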