linkify.tokenize breaks string as url and text on encountering org.
Opened this issue · 1 comments
PratyushGar commented
I have a text: org.org_custom_connector
On parsing through tokenize, it breaks it into a URL and text token -
url: "org.org" && text: "_custom_connector
[
{
"t": "url",
"v": "org.org",
"tk": [
{
"t": "TLD",
"v": "org",
"s": 0,
"e": 3
},
{
"t": "DOT",
"v": ".",
"s": 3,
"e": 4
},
{
"t": "TLD",
"v": "org",
"s": 4,
"e": 7
}
]
},
{
"t": "text",
"v": "_custom_connector",
"tk": [
{
"t": "UNDERSCORE",
"v": "_",
"s": 7,
"e": 8
},
{
"t": "DOMAIN",
"v": "custom",
"s": 8,
"e": 14
},
{
"t": "UNDERSCORE",
"v": "_",
"s": 14,
"e": 15
},
{
"t": "DOMAIN",
"v": "connector",
"s": 15,
"e": 24
}
]
}
]
Is there any way to override this behavior of not splitting the URL?
artemik commented
Confirm, I see the same issue. For example backend.eu-central1.internal:8080
gets parsed as backend.eu
only. These kind of urls are common for cloud dns names. Would be nice to have a fix/option to handle these.