/hosts-compress-windows

Compress (consolidate) multiple URLs into single IP lines in the Windows hosts file.

Primary language: C++. License: GNU General Public License v3.0 (GPL-3.0).


About

This is a Windows command line utility that compresses (consolidates) multiple blocked URLs (blocked with 0.0.0.0 and/or 127.0.0.1) onto single IP lines in a system's hosts file[1]. Options let you set the URL count per IP line (/c, /count), discard any content other than the compressed IP lines (/d, /discard), and set the input and output files (/i, /input and /o, /output respectively). Output goes to the console by default, but can be directed to a file with the aforementioned /o argument. The default input file is the default Windows path (C:\Windows\System32\drivers\etc\hosts), but you can provide any other path with the aforementioned /i argument.

I wrote the Windows version first, as the primary version, because large hosts files only seem to be a problem in Windows, owing to the DNS Cache service.

This is the evolutionary step up from my scripts (PowerShell and Bash): it is written in C++ and is far faster than the scripts. Where the scripts can take anywhere from 3-5 minutes for a 200k+ line file, this program takes under 1 second (usually around 500ms) for a file of the same size.

NOTE: This tool is highly recommended over the scripts, but you can, of course, still use the scripts.

Description

The hosts file is a text file, present in every operating system that I know of, that maps hostnames[2] (google.com, for example) to IP addresses[3] (0.0.0.0 or 127.0.0.1, for example). It can become very large if you use a custom one (especially a consolidated one such as one of Steven Black's). In Linux the file's size is not much of an issue, but in Windows especially it can slow down DNS caching and therefore slow the internet/external network connection or even bring it to a halt. One way around this issue is to compress (or aggregate) multiple hostnames onto single lines. Another is to disable the DNS Client, which I definitely do not recommend for various reasons, especially if you use WSL. Windows can handle 9 hostnames per line; for example:

0.0.0.0 fakename_1.url fakename_2.url fakename_3.url fakename_4.url fakename_5.url fakename_6.url fakename_7.url fakename_8.url fakename_9.url

whereas Linux has no limit (that I know of; I have tested 10000 names on a single line).
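
The grouping described above can be sketched in C++ roughly as follows. This is a minimal illustration and not the project's actual source; the function name `compress_hosts` and its parameters are assumptions made for the example. Note that every output line, including a shorter final one, starts with the IP address.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not taken from the project's source): joins up to
// `per_line` hostnames behind a single IP address on each output line.
std::vector<std::string> compress_hosts(const std::string& ip,
                                        const std::vector<std::string>& hosts,
                                        std::size_t per_line = 9) {
    std::vector<std::string> lines;
    if (per_line == 0) return lines;  // nothing sensible to emit
    for (std::size_t i = 0; i < hosts.size(); i += per_line) {
        std::string line = ip;  // the IP prefixes every line, partial or full
        const std::size_t end = std::min(i + per_line, hosts.size());
        for (std::size_t j = i; j < end; ++j) {
            line += ' ';
            line += hosts[j];
        }
        lines.push_back(line);
    }
    return lines;
}
```

Called with 11 hostnames and the default of 9 per line, this sketch returns two lines: one with 9 hostnames and one with the remaining 2.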

Therefore it's a great idea to compress large hosts files, especially in Windows, and you can do so with the tool found here. My last compression (as of the official release of this project) turned 189k+ hostnames into 21k+ lines, and I went from an hour plus of no internet after boot to instantaneous internet.
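
The line count after compression is just a ceiling division of the hostname count by the hostnames per line. A small sketch (the name `lines_needed` is hypothetical and only for illustration):

```cpp
#include <cstddef>

// Hypothetical helper: number of output lines needed for `hostnames` entries
// at `per_line` hostnames per line (ceiling division).
constexpr std::size_t lines_needed(std::size_t hostnames, std::size_t per_line) {
    return per_line == 0 ? 0 : (hostnames + per_line - 1) / per_line;
}
```

For instance, 189,000 hostnames at 9 per line need 21,000 lines, which lines up with the 189k+ to 21k+ figures above.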

NOTE: There's a misconception that the hosts file's size matters (some people say over 1MB is problematic), and it really doesn't. It's more about the number of URLs per IP line. For example, a 6MB+ file compresses down to 5MB and performs far better, to the point that any slowdown is nearly unnoticeable.

I will be releasing a Linux version shortly, but to be honest it's not really needed. I am only releasing it because not much alteration to the Windows source is needed to convert it for Linux and, you never know, someone may want to use it.

Motivation

A need to use a large hosts file in Windows and actually have access to the internet.


Usage

Environment

This is a Windows command line tool with arguments and as such can be run from any console/terminal, Run, shortcut, AutoHotkey, or any other method of launching a console application in the Windows environment.

The main focus for now is Windows and Linux, but as stated in the Motivation section the whole reason I started this is for the shipwreck that is the Windows environment. I love Windows and Linux, but to deny there are some issues in Windows (in both really, but most notably Windows) is folly.

How To Use

As with any portable program, this can be placed anywhere you like on your machine, but preferably in a directory that is in your %PATH% environment variable so you can run it without a full path (hostscompress, for example). I recommend using a dedicated Bin directory, but of course, it's your choice. If that directory isn't already in your %PATH% environment variable then I suggest adding it[4]; otherwise you have to provide the full path (C:\Path\To\hostscompress.exe, for example) when executing the program.

Examples

Here you can find basic examples of how to use this specific program. For more information about how to work with the hosts file (such as replacing it), you can hop on over to my "Hosts Compression Scripts" repository located here: https://github.com/Lateralus138/hosts-compress-windows.

Get Help:

 PS> hostscompress /help

  Hosts Compress - Consolidate multiple blocked URLs to single IP lines in a
  system's hosts file with various options.

  @USAGE
    hostscompress [SWITCHES] [[OPTIONS] <PARAMS>]

  @SWITCHES:
    /h, /help       This help message.
    /m, /monochrome Verbose output is void of color.
    /q, /quiet      No verbosity; silences all errors and output with the
                    exception of the resulting compression results if no
                    output file is provided.
    /d, /discard    Discard everything except the compressed lines from the
                    resulting output. This is only recommended if the HOSTS file
                    is only used for blocking URLs.

  @OPTIONS:
    /i, /input      Path to an optional INPUT FILE to parse. Defaults to the
                    default system hosts file location.
    /o, /output     Path to an optional OUTPUT FILE. The default is to output
                    to the console.
    /c, /count      NUMBER of URLs to compress to a single line. The default is
                    9 (2-9)
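
For readers curious how switches and options in this style might be handled, here is a minimal C++ sketch. It is not the program's real parser; the `Options` struct, the `parse_args` function and their field names are assumptions for illustration, and only the flag names, the default input path and the 2-9 count range come from the help text above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical settings structure; the field names are illustrative only.
struct Options {
    bool help = false, monochrome = false, quiet = false, discard = false;
    std::string input = R"(C:\Windows\System32\drivers\etc\hosts)";
    std::string output;  // empty means "print to the console"
    int count = 9;       // hostnames per line, accepted range 2-9
};

// Returns false on an unknown argument, a missing parameter, or a count
// outside 2-9; otherwise fills `opts` and returns true.
bool parse_args(const std::vector<std::string>& args, Options& opts) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg == "/h" || arg == "/help") { opts.help = true; continue; }
        if (arg == "/m" || arg == "/monochrome") { opts.monochrome = true; continue; }
        if (arg == "/q" || arg == "/quiet") { opts.quiet = true; continue; }
        if (arg == "/d" || arg == "/discard") { opts.discard = true; continue; }

        // Everything left must be an option that consumes the next argument.
        const bool is_input = (arg == "/i" || arg == "/input");
        const bool is_output = (arg == "/o" || arg == "/output");
        const bool is_count = (arg == "/c" || arg == "/count");
        if (!(is_input || is_output || is_count) || i + 1 >= args.size()) return false;

        const std::string& value = args[++i];
        if (is_input) {
            opts.input = value;
        } else if (is_output) {
            opts.output = value;
        } else {
            // The count must be a single digit between 2 and 9.
            if (value.size() != 1 || value[0] < '2' || value[0] > '9') return false;
            opts.count = value[0] - '0';
        }
    }
    return true;
}
```

In a real `main`, the vector would typically be built from `argv + 1` to `argv + argc` before calling `parse_args`.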

Compress URLs of the default hosts file and display in the console:

 PS> hostscompress
 Reading "C:\WINDOWS\System32\drivers\etc\hosts" content...
Compiling urls for [0.0.0.0]...
Found urls for [0.0.0.0]; Compressing...
〘████████████████████████████████████████████████████████████████████████████████████████████████████〙100%
Compressed [193500] urls to [21500] lines...
Compiling urls for [127.0.0.1]...
No urls found for [127.0.0.1]...
# Title: StevenBlack/hosts
#
# This hosts file is a merged collection of hosts from reputable sources,
# with a dash of crowd sourcing via GitHub
#
# Date: 11 September 2023 01:16:59 (UTC)
# Number of unique domains: 193,508
#
# Fetch the latest version of this file: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
# Project home page: https://github.com/StevenBlack/hosts
# Project releases: https://github.com/StevenBlack/hosts/releases
#
# ===============================================================

127.0.0.1 localhost
127.0.0.1 localhost.localdomain
127.0.0.1 local
255.255.255.255 broadcasthost
::1 localhost
::1 ip6-localhost
::1 ip6-loopback
fe80::1%lo0 localhost
ff00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
0.0.0.0 0.0.0.0

# Custom host records are listed here.


# End of custom host records.
# Start StevenBlack

#=====================================
# Title: Hosts contributed by Steven Black
# http://stevenblack.com

0.0.0.0 ck.getcookiestxt.com eu1.clevertap-prod.com wizhumpgyros.com coccyxwickimp.com webmail-who-int.000webhostapp.com 010sec.com 01mspmd5yalky8.com 0byv9m0.0.0.0 aircovid19virus.com 
...
...
...

Compress URLs of the default hosts file and display in the console with only 3 urls per line and discarding excess:

 PS> hostscompress /c 3 /d
 Reading "C:\WINDOWS\System32\drivers\etc\hosts" content...
Compiling urls for [0.0.0.0]...
Found urls for [0.0.0.0]; Compressing...
〘████████████████████████████████████████████████████████████████████████████████████████████████████〙100%
Compressed [193500] urls to [64500] lines...
Compiling urls for [127.0.0.1]...
No urls found for [127.0.0.1]...
0.0.0.0 ck.getcookiestxt.com eu1.clevertap-prod.com wizhumpgyros.com
0.0.0.0 coccyxwickimp.com webmail-who-int.000webhostapp.com 010sec.com
0.0.0.0 01mspmd5yalky8.com 0byv9m0.0.0.0 aircovid19virus.com 
...
...
...
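
With /d, only the compressed blocking lines survive, so each input line has to be classified first. One way to do that is sketched below in C++. This is an assumption about a reasonable approach rather than the program's actual logic; `BlockedEntry` and `parse_blocked_line` are hypothetical names, and the sketch makes no attempt to special-case entries such as `127.0.0.1 localhost`, which the sample output above leaves untouched.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result of parsing one hosts-file line.
struct BlockedEntry {
    std::string ip;
    std::vector<std::string> hosts;
};

// Returns true and fills `out` when the first field of `line` is 0.0.0.0 or
// 127.0.0.1 and at least one hostname follows. Text after '#' is ignored.
bool parse_blocked_line(const std::string& line, BlockedEntry& out) {
    // find() returns npos when there is no comment, so substr keeps the whole line.
    std::istringstream fields(line.substr(0, line.find('#')));
    std::string ip;
    if (!(fields >> ip) || (ip != "0.0.0.0" && ip != "127.0.0.1")) return false;
    out.ip = ip;
    out.hosts.clear();
    for (std::string host; fields >> host;) out.hosts.push_back(host);
    return !out.hosts.empty();
}
```

Lines for which this returns false (comments, blank lines, other IPs) are the "excess" that a discard mode would drop.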

Compress URLs of the default hosts file and write to a new file in the current directory:

 PS> hostscompress /o .\hosts
Reading "C:\WINDOWS\System32\drivers\etc\hosts" content...
Compiling urls for [0.0.0.0]...
Found urls for [0.0.0.0]; Compressing...
〘████████████████████████████████████████████████████████████████████████████████████████████████████〙100%
Compressed [193500] urls to [21500] lines...
Compiling urls for [127.0.0.1]...
No urls found for [127.0.0.1]...
Compressed data has been successfully written to:
"C:\Users\<USERNAME>\hosts"...
 PS>

Benchmark process (Measure-Command won't work here for now):

 PS> Function Epoch() { return [decimal] [datetimeoffset]::UtcNow.ToUnixTimeMilliseconds() }
 PS> $begin = Epoch; hostscompress.exe /o hosts; $end = Epoch; [String]($end - $begin) + "ms"
Reading "C:\WINDOWS\System32\drivers\etc\hosts" content...
Compiling urls for [0.0.0.0]...
Found urls for [0.0.0.0]; Compressing...
〘████████████████████████████████████████████████████████████████████████████████████████████████████〙100%
Compressed [193500] urls to [21500] lines...
Compiling urls for [127.0.0.1]...
No urls found for [127.0.0.1]...
Compressed data has been successfully written to:
"C:\Users\<USERNAME>\hosts"...
778ms
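
The same begin/end subtraction can be done from inside a C++ program with std::chrono, which is handy if you want to time a single step rather than the whole process. A minimal sketch; the helper name `elapsed_ms` is hypothetical and not part of this project:

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` once and returns the elapsed time in
// whole milliseconds, measured with a monotonic clock.
long long elapsed_ms(const std::function<void()>& work) {
    const auto begin = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
}
```

steady_clock is used instead of system_clock so the measurement is not affected by changes to the system time while the work runs.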

Project Information

This project is written in C++.

Changelog

  • 1.0.0.0 - Initial release.
  • 1.1.0.0 - Fixed regular expression match for URLs that begin with 0\.0\.0\.0\..* and 127\.0\.0\.1\..* while still keeping 0.0.0.0 0.0.0.0 and 127.0.0.1 127.0.0.1.
  • 1.1.1.0 - Bug fix for last entry not being prepended with the correct IP address as reported in issue: Missing 0.0.0.0 in front of the last line in the output file #1.
  • 2.0.0.0 - Complete removal of progress bar and any console mode code and therefore considered a completely different version.
    • Progress bar removed because this program is lightning fast even for large files. There's no need for it.
    • Console mode removed because it no longer works well in modern Windows/CMD. I can't imagine a reason for people to be using CMD in this day and age anyway and I no longer feel compelled to cater to it. Having said that, I do still plan to look into possibilities in the future, but it's not a priority.
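
To illustrate the kind of matching the 1.1.0.0 entry is about, here is a hedged C++ sketch using std::regex. The project's actual expression is not shown in this README, so the pattern and the name `split_blocked_line` are assumptions. The idea is that requiring whitespace right after the IP keeps a hostname that merely begins with 0.0.0.0. or 127.0.0.1. from being treated as the IP field, while a line such as 0.0.0.0 0.0.0.0 still matches.

```cpp
#include <regex>
#include <string>

// Hypothetical pattern: a blocking IP at the start of the line, at least one
// whitespace character, then the rest of the line (the hostnames).
bool split_blocked_line(const std::string& line, std::string& ip, std::string& rest) {
    static const std::regex pattern(R"(^\s*(0\.0\.0\.0|127\.0\.0\.1)\s+(\S.*)$)");
    std::smatch match;
    if (!std::regex_match(line, match, pattern)) return false;
    ip = match[1].str();    // the IP field
    rest = match[2].str();  // everything after the separating whitespace
    return true;
}
```

This sketch assumes each line has already had its line ending stripped, since `.` in the default ECMAScript grammar does not match line terminators.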

Coming Updates

  • 2.1.0.0 - I will possibly be looking into proper display usage for console mode in modern CMD. This is not a priority and I make no promise of a timeline.

Source File Quality

This is graded by CodeFactor and is subjective, but helps me to refactor my work.

Name Status
codefactor.io

File MD5 Hashes

All hashes are retrieved at compile/build time.

Current Windows X86 MD5

WINDOWS X86 MD5

Current Windows X64 MD5

WINDOWS X64 MD5

Other Miscellaneous File Information

Description Status
Project Release Date GitHub Release Date
Total downloads for this project GitHub all releases
Complete repository size This Repo Size
Commits in last month GitHub commit activity
Commits in last year GitHub commit activity

Notes

Note 1

hosts file search @ DuckDuckGo

Note 2

what is host/domain name search @ DuckDuckGo

Note 3

what is an IP address search @ DuckDuckGo

Note 4

Adding a path to the Windows %PATH% environment variable search @ DuckDuckGo



Support Me If You Like

If you like any of the projects below and care to donate to my PayPal:

PayPal Donation

Or Buy Me A Coffee if you prefer:

Buy Me A Coffee


License Info

License Excerpt
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.