author | Ivan Nardi <12729895+IvanNardi@users.noreply.github.com> | 2024-03-06 19:25:59 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-03-06 19:25:59 +0100 |
commit | 21da53d3a03cad32dffa8447d9c4ae5bae62a3a2 (patch) | |
tree | 6ed55bf40dfcf2976df48933b38e6f014a3fe852 /.gitattributes | |
parent | 8f63a1173539a79c1fa1bb5c618fe561175a1ab5 (diff) |
ahocorasick: improve matching with subdomains (#2331)
The basic idea is to have the following logic:
* pattern "DOMAIN" matches the domain itself (i.e. exact match) *and* any
subdomain (i.e. "ANYTHING.DOMAIN")
* pattern "DOMAIN." *also* matches any string of which it is a prefix
[please note that this kind of match is handy but quite dangerous,
since it ignores whatever follows the pattern]
* pattern "-DOMAIN" *also* matches any string of which it is a postfix
Examples:
* pattern "wikipedia.it":
* "wikipedia.it" -> OK
* "foo.wikipedia.it" -> OK
* "foowikipedia.it" -> NO MATCH
* "wikipedia.it.com" -> NO MATCH
* pattern "wikipedia.":
* "wikipedia.it" -> OK
* "foo.wikipedia.it" -> OK
* "foowikipedia.it" -> NO MATCH
* "wikipedia.it.com" -> OK
* pattern "-wikipedia.it":
* "wikipedia.it" -> NO MATCH
* "foo.wikipedia.it" -> NO MATCH
* "0001-wikipedia.it" -> OK
* "foo.0001-wikipedia.it" -> OK
Bottom line:
* exact match
* prefix with "." (always, implicit)
* prefix with "-" (only if explicitly set)
* postfix with "." (only if explicitly set)
That means that the patterns cannot start with '.' anymore.
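The boundary rules above can be sketched as follows. This is only an illustrative Python model of the logic described in this message, not the C code changed by the commit: the function name `domain_matches` is hypothetical, and a plain substring search stands in for the Aho-Corasick automaton.

```python
def domain_matches(pattern: str, host: str) -> bool:
    """Hypothetical model of the matching rules described in this commit.

    An occurrence of `pattern` inside `host` is accepted only if both of
    its boundaries are valid:
      * left:  start of string, or preceded by ".", or the pattern itself
               starts with "-" (explicit "-" prefix match);
      * right: end of string, or the pattern itself ends with "."
               (explicit "." postfix match).
    """
    if not pattern or pattern.startswith("."):
        # Per the commit message, patterns cannot start with "." anymore.
        raise ValueError("pattern must be non-empty and must not start with '.'")

    start = host.find(pattern)
    while start != -1:
        end = start + len(pattern)
        left_ok = start == 0 or host[start - 1] == "." or pattern.startswith("-")
        right_ok = end == len(host) or pattern.endswith(".")
        if left_ok and right_ok:
            return True
        # Try the next occurrence of the pattern, if any.
        start = host.find(pattern, start + 1)
    return False


if __name__ == "__main__":
    for pattern in ("wikipedia.it", "wikipedia.", "-wikipedia.it"):
        for host in ("wikipedia.it", "foo.wikipedia.it",
                     "foowikipedia.it", "wikipedia.it.com",
                     "0001-wikipedia.it"):
            result = "OK" if domain_matches(pattern, host) else "NO MATCH"
            print(f'pattern "{pattern}": "{host}" -> {result}')
```

Under these assumptions the sketch reproduces the examples listed above; for instance, pattern "-wikipedia.it" does not match "wikipedia.it" because the leading "-" is part of the string being searched for.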
Close #2330