url-black-list

url-black-list is a JavaScript library that blocks specified URLs, including ones written with Unicode characters, by using IDNA and Punycode.

Motivation

In my personal app, some malicious users posted spam containing URLs. In the beginning, I could deal with this kind of spam by filtering posts against a blacklist of domains. Later, however, the spammers started posting URLs that bypass the filter but still send victims to the same location. For example, browsers transform "ℰ𝓍𝒜m𝓅le.𝒞ℴ𝓂" into "example.com" when you enter it in the address bar. Likewise, "℡" can be transformed into "tel" and, even more surprisingly, "㍑" can become "リットル". A blacklist based on simple text matching is therefore not a good defense against this technique, because spammers can easily generate a huge number of equivalent URLs ("ℰ𝓍𝒜m𝓅le.𝒞ℴ𝓂", "E𝓍am𝓅le.𝒞ℴ𝓂", "e𝓍𝒜m𝓅le.co𝓂", "EXAMPLE.COM", "example.com", and so on).
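
The folding described above is close to Unicode NFKC normalization followed by lowercasing. The following is a minimal sketch of that idea using the built-in String.prototype.normalize. It is not necessarily how url-black-list works internally, and `canonicalize` is a hypothetical helper name chosen for illustration.

```javascript
// Hypothetical helper, not part of url-black-list.
// NFKC replaces compatibility characters (script letters, squared words, ...)
// with their plain equivalents; toLowerCase() then folds the case.
function canonicalize(text) {
  return text.normalize('NFKC').toLowerCase();
}

console.log(canonicalize('ℰ𝓍𝒜m𝓅le.𝒞ℴ𝓂')); // example.com
console.log(canonicalize('EXAMPLE.COM'));    // example.com
console.log(canonicalize('℡'));              // tel
console.log(canonicalize('㍑'));             // リットル
```

Comparing canonical forms instead of raw text means that a single blacklist entry covers every visually distinct spelling that normalizes to the same string.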

Installation

yarn add url-black-list
# or
npm install --save url-black-list

Examples

import { URLBlackList } from 'url-black-list';

const blackList = new URLBlackList();
blackList.add('example.com');
blackList.add('𝒜𝒜𝒜𝒜');
blackList.add('あいうえお.com');

blackList.isValidText('example.com'); // false
blackList.isValidText('ℰ𝓍𝒜m𝓅le.𝒞ℴ𝓂'); // false
blackList.isValidText('aaaa'); // false
blackList.isValidText('AAAA'); // false
blackList.isValidText('xn--l8jegik.com'); // false (this is the Punycode form of あいうえお)

blackList.isValidText('valid.domain.com'); // true
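
The Punycode equivalence used in the example above can be checked independently of this library with the standard WHATWG URL class, which is available as a global in Node.js and in browsers. This is only a sketch for verifying such pairs by hand, and `toAsciiHost` is a hypothetical helper name, not part of url-black-list.

```javascript
// Hypothetical helper, not part of url-black-list: returns the ASCII
// (IDNA/Punycode) form of a host name using the built-in WHATWG URL parser.
function toAsciiHost(host) {
  return new URL(`http://${host}`).hostname;
}

console.log(toAsciiHost('あいうえお.com')); // xn--l8jegik.com
console.log(toAsciiHost('EXAMPLE.COM'));    // example.com
```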

License

MIT