Take a list of known malware and ad-serving domains and generate an amalgamated configuration file fragment for unbound. This fragment, when included in the main body of unbound.conf, blocks the hosts and domains that serve malware and/or intrusive ads.
You will need GNU Make (any recent version) and a recent golang toolchain (>1.11). Assuming GNU Make is available as gmake, type:
gmake
This will generate two config file fragments for unbound:
- bad-hosts.conf: Config file fragment with a few trackers; the list of blocklist items is in myfeed.txt
- big.conf: Very large list of blocklist domains and hosts (~30MB, ~700k entries). The blocklist feed comes from bigfeed.txt (auto-generated).
Include one of the config files (bad-hosts.conf or big.conf) in your unbound.conf as follows:
# include auto-generated ad-block/malware list
include: /path/to/bad-hosts.conf
Then reload the unbound configuration to use the new blocklist.
The blocklist is generated by a golang program in the blgen directory. It is built using the shell script named build, and the output binary is placed in a platform-specific directory (bin/$os-$arch/blgen). Usage:
blgen [options] [blocklist ...]

Read one or more blocklist files and generate a composite file containing blocked hosts and domains. The final output is written to STDOUT or to an output file.

blgen can optionally read a feed (txt file) of well known 3rd party malware and tracker URLs. The feed.txt is a simple file:
- Each line starts with either 'txt' or 'json' followed by a URL.
- The keyword 'txt' or 'json' identifies the type of output returned by the URL.

Example:
    txt http://pgl.yoyo.org/files/adhosts/plaintext
    txt http://mirror2.malwaredomains.com/files/justdomains

Options:
    -c, --cache-dir D       Use 'D' as the cache directory ["."]
    -F, --feed F            Read blocklists from feed file 'F' [""]
    --no-cache              Ignore the cache and re-fetch every blocklist [false]
    -o, --output-file F     Write output to file 'F' [""]
    -f, --output-format T   Set output format to 'T' (text or unbound) [""]
    -v, --verbose           Show verbose output [false]
    -W, --allowlist F       Add allowlist entries from file 'F' [[]]
The -W flag can be used multiple times to add multiple allowlist sources.
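The feed file format described in the usage text is simple enough to parse in a few lines. The following is a minimal sketch, not the code in internal/blgen: the feedEntry type and the parseFeed function are hypothetical names, and skipping blank lines and '#' comments is an assumption that the usage text does not confirm.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// feedEntry is a hypothetical type holding one line of a feed file.
type feedEntry struct {
	Kind string // "txt" or "json"
	URL  string
}

// parseFeed reads lines of the form "<txt|json> <URL>".
// Skipping blank lines and '#' comments is an assumption made for this sketch.
func parseFeed(text string) ([]feedEntry, error) {
	var entries []feedEntry
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("line %d: expected '<txt|json> <URL>'", n)
		}
		if fields[0] != "txt" && fields[0] != "json" {
			return nil, fmt.Errorf("line %d: unknown type %q", n, fields[0])
		}
		entries = append(entries, feedEntry{Kind: fields[0], URL: fields[1]})
	}
	return entries, sc.Err()
}

func main() {
	feed := "txt http://pgl.yoyo.org/files/adhosts/plaintext\n" +
		"txt http://mirror2.malwaredomains.com/files/justdomains\n"
	entries, err := parseFeed(feed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(e.Kind, e.URL)
	}
}
```

Rejecting unknown keywords up front keeps a mistyped feed line from being silently ignored; whether blgen itself is this strict is not stated here.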
blgen caches the downloaded blocklists and refreshes them at most once a day. In the default invocation of blgen in GNUmakefile, the cache-dir is the current directory. Each cache file name uses the URL as the prefix and a truncated SHA256 sum of the URL as the suffix. The cache can be bypassed with the --no-cache option.
The Go program is organized as follows:
- internal/blgen: contains the implementation of the blocklist DB, fetching host-lists etc.
- blgen/: contains the driver program ("main()") along with a few helper routines to generate the output.
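To give a feel for what an output helper for the unbound format might do, here is a minimal sketch. It is not taken from the blgen sources: unboundLines is a hypothetical function, and the choice of the local-zone directive with the always_nxdomain zone type is an assumption, since the directive that blgen actually emits is not stated above.

```go
package main

import (
	"fmt"
	"sort"
)

// unboundLines turns blocked names into unbound config lines.
// Duplicates and empty names are dropped and the result is sorted.
// The "always_nxdomain" zone type is an assumption made for this sketch.
func unboundLines(names []string) []string {
	seen := make(map[string]bool, len(names))
	var uniq []string
	for _, n := range names {
		if n == "" || seen[n] {
			continue
		}
		seen[n] = true
		uniq = append(uniq, n)
	}
	sort.Strings(uniq)

	lines := make([]string, 0, len(uniq))
	for _, n := range uniq {
		lines = append(lines, fmt.Sprintf("local-zone: %q always_nxdomain", n))
	}
	return lines
}

func main() {
	// Placeholder names for illustration only.
	blocked := []string{"tracker.example", "ads.example", "tracker.example"}
	for _, l := range unboundLines(blocked) {
		fmt.Println(l)
	}
}
```

Sorting and de-duplicating before writing keeps the generated fragment stable between runs, which makes differences between two generated files easier to review.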