
Various scripts to help compress the hosts file.

Primary language: PowerShell. License: GNU General Public License v3.0 (GPL-3.0).

Hosts Compression Scripts


NOTE: The C++ Windows version of this project is finished and highly recommended over the PowerShell script found here. You can find the project at https://github.com/Lateralus138/hosts-compress-windows. It is far faster (around 500ms for 220k+ URLs) and has a few extra features.



About

Scripts to compress the hosts file in various operating systems.

NOTICE: The universal PowerShell script now replaces the Steven Black only version and provides a slight performance increase. If you downloaded the previous version then you may want to download and replace the old version.

Description

Here you will find scripts (eventually programs) to help compress the hosts file found in various operating systems. This is especially necessary for extremely large hosts files similar to aggregated files you might find at repositories such as Steven Black's (which I highly endorse).

The hosts[1] file is a text file found in all operating systems (that I know of) that maps hostnames[2] (google.com, for example) to IP addresses[3] (0.0.0.0 or 127.0.0.1, for example). It can become very large if you use a custom one (especially a consolidated one such as one of Steven Black's). In Linux the file's size is not much of an issue, but in Windows especially a large file can slow down DNS caching and therefore cause internet/application issues, from slowing down the internet/external network to bringing it to a halt. One way to circumvent this issue is to compress (or aggregate) multiple hostnames into single lines. Another is to disable the DNS Client, which I definitely do not recommend for various reasons (especially if you use WSL). Windows can handle 9 hostnames per line; for example:

0.0.0.0 fakename_1.url fakename_2.url fakename_3.url fakename_4.url fakename_5.url fakename_6.url fakename_7.url fakename_8.url fakename_9.url

whereas Linux has no limit (that I know of; I have tested 10000 names on a single line).
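The grouping idea described above can be sketched in a few lines of Bash and awk. This is only an illustration, not the script shipped in this repository: the function name `compress_hosts_sketch` and its optional second argument are hypothetical, it drops comments and blank lines instead of preserving them, and it does not remove duplicate entries.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: group hostnames that share an IP onto lines of at most
# N names (default 9, the Windows limit described above).
# Not the repository's script; comments and blank lines are dropped here.
compress_hosts_sketch() {
  local input="$1" max="${2:-9}"
  awk -v max="$max" '
    {
      sub(/#.*/, "")            # strip full-line and trailing comments
      if (NF < 2) next          # skip blank lines and lines without a hostname
      ip = $1
      for (i = 2; i <= NF; i++) {
        if (!(ip in count)) order[++nips] = ip   # remember first-seen IP order
        n = ++count[ip]
        names[ip, n] = $i
      }
    }
    END {
      for (k = 1; k <= nips; k++) {
        ip = order[k]
        line = ip
        for (j = 1; j <= count[ip]; j++) {
          line = line " " names[ip, j]
          # emit the line when it is full or when this IP has no names left
          if (j % max == 0 || j == count[ip]) { print line; line = ip }
        }
      }
    }
  ' "$input"
}
```

For instance, `compress_hosts_sketch /etc/hosts 9 > hosts.compressed` would write the grouped lines to a new file and leave the original untouched.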

Therefore it's a great idea to compress large hosts files, especially in Windows, and one can do so with the scripts found here. My last compression (as of the official release of this project) turned 189k+ hostnames into 21k+ lines and went from an hour plus of no internet after boot to instantaneous internet.

NOTE 1: There's a misconception that the hosts file's file size matters (some people say over 1MB is problematic), and it really doesn't. It's more about the number of URLs per IP line. For example, a 6MB+ file compresses down to only about 5MB, yet it performs far better, to the point that the slowdown is nearly unnoticeable.
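Since the line layout matters more than the byte size, it can help to look at a file's line statistics rather than its size on disk. The helper below is a rough sketch with a hypothetical name (`hosts_line_stats`); it ignores comments and blank lines and reports how many mapping lines there are, how many hostnames they carry, and the largest number of names on any one line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize the layout of a hosts file.
# Counts only lines that map an IP to at least one hostname.
hosts_line_stats() {
  awk '
    {
      sub(/#.*/, "")                      # ignore comments
      if (NF < 2) next                    # ignore blank or name-less lines
      lines++
      total += NF - 1                     # every field after the IP is a name
      if (NF - 1 > maxn) maxn = NF - 1
    }
    END { printf "lines=%d hostnames=%d max_per_line=%d\n", lines, total, maxn }
  ' "$1"
}
```

Running it on the file before and after compression should show the same hostname count with far fewer lines.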

Motivation

A need to use a large hosts file in Windows and actually have access to the internet.


TODO

Please bear with me as this is a lot of work and I am a busy man, but in my opinion the most important script is ready and usable.

  • Scripts
    • Windows
      • Create PowerShell script - Universal
      • Create CMD script - Universal
      • Create AutoHotkey script/compiled executable for both.
    • Linux
      • Create Bash script - Universal
      • Create POSIX shell script - Universal
    • MacOS
      • Probably not, we'll see
    • Cross Platform
      • Create Python script
  • Programs
    • Create C++ program
    • Create (maybe) Rust program

Usage

For now these are all command line tools, whether they be a script or binary executable. No immediate plans for GUIs, but that may eventually change.

The main focus for now is Windows and Linux, but as stated in the Motivation section the whole reason I started this is for the shipwreck that is the Windows environment. I love Windows and Linux, but to deny there are some issues in Windows (in both really, but most notably Windows) is folly.

Guides And Examples

Here you will find guides for Windows and Linux as these are my primary environments for which I have the most knowledge. If you care to contribute a MacOS or any other script and/or other guides (or anything else) please feel free to fork and make a pull request.

These guides are not for use with Pi-hole or adblockers; they are specifically for hosts files found in the default directories, such as:

Windows:

C:\Windows\System32\drivers\etc\hosts

and Linux:

/etc/hosts

I won't give specific instructions on editing or installing the hosts file; if you need to compress yours, you've probably already passed that point. Of course, you can look it up yourself, and here's a starting point via DuckDuckGo if you like:

How to edit the hosts file @ DDG

More will be added as I write more scripts. These guides assume you have already replaced or altered your hosts file. These scripts do not edit the file in place (for now) and only print the output to the console by default.

Compressing Windows

The Windows hosts file is located at C:\Windows\System32\drivers\etc\hosts

PowerShell Compression - Universal

NOTE 1: This script could take some time depending on your machine. Reason 1 is that this is a shell script, and though PowerShell has come a long way, it can't compare to a compiled binary (in C/C++, for example). Reason 2 is that you are parsing hundreds of thousands (more or less) of URLs! If you weren't, you wouldn't need this script. This newer universal script has been refactored and has a slight performance increase over the original Steven Black version.

NOTE 2: You can use this script to compress the default hosts file or any other hosts file by using the -InputFile parameter; for example:

  PS> compress_hosts -InputFile C:\Path\to\alternate\hosts

The script provides information to enable Get-Help; for example:

 PS> Get-Help C:\Path\to\compress_hosts.ps1 -Full
 # or just "Get-Help compress_hosts -Full" if in %PATH%

Get-Help

 PS> Get-Help compress_hosts -Full

NAME
    C:\Users\flux\bin\compress_hosts.ps1
    
SYNOPSIS
    Windows hosts file compression script.
    
    
SYNTAX
    C:\Users\flux\bin\compress_hosts.ps1 [[-OutputFile] <String>] [[-InputFile] <String>] [<CommonParameters>]
    
    
DESCRIPTION
    PowerShell script to compress a large hosts file in Windows. The default output is to the console.
    

PARAMETERS
    -OutputFile <String>
        Path to an output file.
        
        Required?                    false
        Position?                    1
        Default value
        Accept pipeline input?       false
        Accept wildcard characters?  false
        
    -InputFile <String>
        Path to an input file. Defaults to 'C:\Windows\System32\drivers\etc\hosts'.
        
        Required?                    false
        Position?                    2
        Default value                C:\Windows\System32\drivers\etc\hosts
        Accept pipeline input?       false
        Accept wildcard characters?  false
        
    <CommonParameters>
        This cmdlet supports the common parameters: Verbose, Debug,
        ErrorAction, ErrorVariable, WarningAction, WarningVariable,
        OutBuffer, PipelineVariable, and OutVariable. For more information, see
        about_CommonParameters (https://go.microsoft.com/fwlink/?LinkID=113216). 
    
INPUTS
    
OUTPUTS
    
NOTES
    
    
        compress_hosts.ps1
        Author: Ian Pride 
        Modified date: 1:36 PM Saturday, August 26, 2023
        Version 1.0.0 - Added Get-Help comments

        ╔═══════════════════════════════════════════════════════════╗
        ║ Universal hosts file compression script for Windows       ║
        ║ © 2023 Ian Pride - New Pride Software/Services            ║
        ║ https://github.com/Lateralus138/hosts-compression-scripts ║
        ╚═══════════════════════════════════════════════════════════╝

    -------------------------- EXAMPLE 1 --------------------------

    PS>compress_hosts -OutputFile $Env:USERPROFILE\Documents\hosts

RELATED LINKS

This assumes you have already installed a custom hosts file and it is located in the appropriate directory as stated above.

  1. Download compress_hosts.ps1 from the current Releases Page and place it anywhere you like.
  2. Open a PowerShell terminal from the Start Menu, Run (Win+r), or from CMD (powershell or pwsh).
  3. Change directory to the location of the PS1 script you downloaded from here; you don't have to, but if you don't you must provide the path to the full script (for example C:\Path\To\ScriptLocation\compress_hosts.ps1). For example:
 PS> cd C:\Users\<USERNAME>\Downloads

You can also place the script somewhere in your %PATH% and run it without the extension, for example:

 PS> compress_hosts
  4. It's possible that executing PowerShell scripts is disabled by default on your Windows machine. If so, enable it. You'll need to run this next command as Administrator in an administrative PowerShell instance:

    • You can either Right Click->Run as administrator from the Start Menu or use the hotkey Win+x and either press 'a' or click the option from the menu.

    • From the administrative PowerShell instance type Set-ExecutionPolicy unrestricted and hit [Enter].
     PS> Set-ExecutionPolicy unrestricted
  5. If you only want to see the compressed result on the command line then (from the directory of the script, unless you want to type the full path) type .\compress_hosts.ps1 and it will somewhat verbosely run through the process of compression and output the results to the screen:

 PS> .\compress_hosts.ps1

or if it's in your %PATH%:

 PS> compress_hosts
  6. To actually output the compressed result to a file, run the same command with the -OutputFile parameter: compress_hosts.ps1 -OutputFile hosts. I do not recommend overwriting the original file (as this is harder to do and I provide a more reliable method below in Replacing Windows Hosts File).
 PS> .\compress_hosts.ps1 -OutputFile hosts

or if it's in your %PATH%:

 PS> compress_hosts -OutputFile hosts

You will now have a new compressed version of your hosts file. All of the original content of the file will be there, but reordered. Because some comments (#[Added]... for example) may have been mixed in with the original URLs, they are now placed at the bottom, where they no longer make sense. The other text (header and footer) in the original file remains in its appropriate place.

Replacing Windows Hosts File

Replacing the Windows hosts file can be a pain as it is almost always "in use by another program", but there are at least 2 ways it can be done with ease (either by using Linux to do it or by force deleting with an external program). Here I will only explain how to do it natively, using an external program to force delete the file and then replacing it with the new one. I will not explain how to use Linux to do it, as anyone who knows how to dual boot or use a live distro to access Windows more than likely doesn't need an explanation, and that is more work than is necessary here.

As with editing or replacing any file (especially a system file), I recommend making a backup of the file you are replacing. You can do this by copying and pasting the file from one place to another (or the same place really, but I'd recommend backing up to your Documents or Desktop folder or somewhere you know you'll find it). Here I will show how to do things with a GUI first and foremost, but some things cannot be done that way (flushing the DNS).

As stated above the Windows hosts file is located at C:\Windows\System32\drivers\etc\hosts

  1. [Optional] Back up your original hosts file.
  2. [Possibly Optional] If the file is set to read only (not usually, but mine is, and I know some security software has been known to set the hosts file to read only) then you must set the file to writable. To do so, right click on the hosts file and choose 'Properties' (in newer versions of Windows it might be Right Click->Show More Options->Properties), or simply press [Alt+Enter] while the file is selected. Then uncheck the 'Read Only' check box at the bottom of the 'General' tab and press [Apply].
  3. [Possibly Optional] Often your hosts file is in use (by svchost.exe), and if you try to replace or delete the file it won't let you. The best option is to use an unlocker program to unlock and delete the file. If you only unlock the file and then try to delete it manually, it will more than likely be locked again, so it's best to use the 'Unlock and Delete' option. I highly recommend IObit's Unlocker, which is a completely free application that I have used for years WITHOUT FAIL, but, of course, you can search for your own as there are several options (here's a head start if you like: file unlocker programs @ DDG). With IObit Unlocker you would right click on the hosts file (and possibly "Show More Options"), choose the 'IObit Unlocker' option, and when the program starts select "Unlock & Delete".
  4. Flush the DNS. This is the only step that must be done from the command line, and it can be done in CMD or PowerShell. It's best to do this step before replacing the hosts file with the new one, because once you place the new hosts file it might be in use, and flushing won't be possible while it's in use. Start your terminal, type the following command, and press [Enter]:
PS> ipconfig /flushdns
  5. Copy and paste the new hosts file to the etc directory where you just deleted the original file.
  6. [Somewhat Optional] Reboot your system. You can wait to do this, but it's possible that not all services and applications that utilize the hosts file/DNS caching will pick up the new changes until reboot.

Compressing Linux

The Linux hosts file is located at /etc/hosts.

Bash Compression - Universal

This assumes you have already installed a custom hosts file or have altered it yourself. Unlike on Windows, this method is much easier (of course).

There are two modes; you can just view the compressed result (default) in the terminal or output the compressed result to a file.

  1. Download compress_linux_generic_hosts.bash from the current Releases Page and place it anywhere you like.
  2. Open a Bash shell in the terminal of your choice and navigate to the directory where the script is located (not necessary, but if you don't then you must provide the full path).
  3. You may need to set the permissions of the script to executable. This can be done in a few ways using chmod.
    • Set the file to executable for the current user only:
     $ chmod u+x /path/to/compress_linux_generic_hosts.bash
    • Set the file to executable for everyone:
    $ chmod +x /path/to/compress_linux_generic_hosts.bash
    # or
    $ chmod 0755 /path/to/compress_linux_generic_hosts.bash
  4. [Default Output] To see the resulting compressed output:
     $ ./compress_linux_generic_hosts.bash
  5. [File Output] Output the compressed result to a file. You can either make a different file or overwrite the original, but if you overwrite you must use sudo:
    • To a new file
     $ ./compress_linux_generic_hosts.bash hosts
     # or
     $ ./compress_linux_generic_hosts.bash /path/to/new/hosts
    • Overwrite /etc/hosts
     $ sudo ./compress_linux_generic_hosts.bash /etc/hosts
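Before overwriting /etc/hosts, it may be worth confirming that a compressed file still carries the same mappings as the original. The following is one possible check, sketched under assumptions: the function names `hosts_pairs` and `same_mappings` are hypothetical and not part of the script from this repository, comments are ignored, and duplicate entries count only once.

```shell
#!/usr/bin/env bash
# Hypothetical check: list every "IP hostname" pair in a hosts file, one per
# line, sorted and de-duplicated, ignoring comments and blank lines.
hosts_pairs() {
  awk '
    {
      sub(/#.*/, "")                          # ignore comments
      for (i = 2; i <= NF; i++) print $1, $i  # one pair per hostname
    }
  ' "$1" | sort -u
}

# Succeeds (exit status 0) when both files define the same set of mappings,
# regardless of how the hostnames are spread across lines.
same_mappings() {
  [ "$(hosts_pairs "$1")" = "$(hosts_pairs "$2")" ]
}
```

A typical use would be `same_mappings /etc/hosts /path/to/new/hosts && echo "mappings match"`.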

Replacing Linux Hosts File

Unlike Windows this is simple and only necessary if you didn't use the overwrite method:

  1. Copy the new file to /etc/hosts using sudo.
     $ sudo cp /path/to/new/hosts /etc/hosts
  2. [Optional] You may want to flush the DNS cache, but it's not usually necessary. I won't go in-depth here, but one possible way is to use resolvectl flush-caches.
     $ resolvectl flush-caches
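If you would like a safety net when replacing the file, the copy step can be wrapped so that the current file is saved first. This is a minimal sketch, not something this project provides: the function name `replace_hosts_with_backup` and the `.bak.<timestamp>` naming are assumptions, and because sudo cannot call a shell function directly, it would need to be run from a root shell (or adapted to prefix the `cp` calls with sudo) when the target is /etc/hosts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep a timestamped copy of the current hosts file, then
# copy the new file over it. The target is an argument (default /etc/hosts) so
# the function can be tried on scratch files first.
replace_hosts_with_backup() {
  local new="$1" target="${2:-/etc/hosts}"
  local backup
  backup="${target}.bak.$(date +%Y%m%d%H%M%S)"
  # refuse to continue if the new file cannot be read
  [ -r "$new" ] || { echo "cannot read $new" >&2; return 1; }
  if [ -e "$target" ]; then
    cp -p "$target" "$backup" || return 1   # preserve mode and timestamps
    echo "backup: $backup"
  fi
  cp "$new" "$target"
}
```

Trying it on copies in a scratch directory first, for example `replace_hosts_with_backup ./new_hosts ./test_hosts`, is a low-risk way to see what it does.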

Project Information

Source File Quality

This is graded by CodeFactor and is subjective, but helps me to refactor my work.

Name Status
codefactor.io

File MD5 Hashes

All hashes are retrieved at compile/build time.

Current Universal Hosts PowerShell Script

WINDOWS Universal PowerShell Script MD5

Current Linux Universal Hosts Bash Compression Script

Linux Universal Hosts Bash Compression Script MD5

Other Miscellaneous File Information

Description Status
Project Release Date GitHub Release Date
Total downloads for this project GitHub all releases
Complete repository size This Repo Size
Commits in last month GitHub commit activity
Commits in last year GitHub commit activity

Notes

Note 1

What is the hosts file @ DDG

Note 2

What is a hostname

What is a hostname @ DDG

Note 3

What is an IP address @ DDG


Contribute

Please feel free to contribute by forking and making a pull request.



Support Me If You Like

If you like any of the projects below and care to donate to my PayPal:

PayPal Donation

Or Buy Me A Coffee if you prefer:

Buy Me A Coffee


Change Log

  • v1.0.1691110144084
    • Initial release
    • Only file was the original Steven Black's PowerShell script.
  • v1.0.1691784030635
    • Added the Bash script for Linux.
  • v1.0.1693086432 (8/26/2023 9:47:12 PM UTC)
    • Replaced older Steven Black PowerShell version.
    • New version provides a universal approach to facilitate all types of normal hosts files.
    • Essentially a whole rewrite with a slight performance increase.
    • Added an argument -InputFile to accept files other than the default.
    • Add information to the script file to facilitate Get-Help.

License Info

License Excerpt
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.