- Searches for proxies in open sources
- Intelligently stores state on disk across restarts
- Validates via configurable speed thresholds and anonymity
- Multiplexes HTTP/HTTPS MITM to HTTP, HTTPS, SOCKS4, and SOCKS5
- Exposes REST API for refresh stats and pool health
- Exposes a minimal query language for filtering History and Proxy Stats
- Records request history in-memory for further UI inspection
- Displays real-time statistics about the available pool
- Packaged as a single executable binary that also includes the Web UI
Download the service, start it up, and wait a couple of minutes for the pool to pick up proxies. Then run `curl -D - -x http://127.0.0.1:8090 -k http://httpbin.org/get` a couple of times and see different origins and user agent headers.
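The same check can be scripted. The following is a minimal sketch in Python, assuming the service is already running with the default proxy address `127.0.0.1:8090`; the function names `build_opener` and `fetch_origins` are illustrative and not part of the project. It relies on httpbin.org echoing the caller's IP in the `origin` field and the request headers under `headers`.

```python
import json
import urllib.request

PROXY_ADDR = "http://127.0.0.1:8090"  # default proxy frontend address from this README


def build_opener(proxy_addr: str = PROXY_ADDR) -> urllib.request.OpenerDirector:
    """Return an opener that forwards plain HTTP requests through the local proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_addr})
    return urllib.request.build_opener(handler)


def fetch_origins(n: int = 3, url: str = "http://httpbin.org/get") -> list:
    """Request `url` n times and collect (origin, user agent) pairs as seen by httpbin."""
    opener = build_opener()
    seen = []
    for _ in range(n):
        with opener.open(url, timeout=30) as resp:
            body = json.load(resp)
        seen.append((body.get("origin"), body.get("headers", {}).get("User-Agent")))
    return seen
```

Calling `fetch_origins()` once the pool has warmed up should return a different origin for most requests, which mirrors running the `curl` command repeatedly.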
- A Source is an async process that looks at one or more pages for a refreshed proxy list.
- The Refresher component schedules items on a best-effort basis.
- Some sources perform better when forwarded through a Pool, warming it up.
- One proxy may be seen in multiple sources, so for each source we track, across refreshes, only its exclusive proxies: those not found in any other source.
- A Proxy consists of a protocol (HTTP, HTTPS, SOCKS4, or SOCKS5) and IP:PORT.
- A Proxy becomes Scheduled immediately after it's seen in a source.
- A Scheduled proxy transitions into the Probing queue unless it's Ignored (e.g. because of Timeouts or the Blacklist).
- Probing uses a configurable pool of rotating anonymity checkers to check for liveness.
- Timed-out items are re-added to the Scheduled queue under the Reverify source, so that an item is probed up to 5 times.
- The Blacklist holds historically faulty proxies that should never be probed again.
- A successful check puts the proxy into the Found queue and adds it to a Pool.
- The Pool subdivides its memory into shards for randomized rotation and minimal resource contention.
- The Pool uses configurable and backpressure-controlled workers to perform HTTP request forwarding.
- Every forwarded request gets a serial number (returned in the `X-Proxy-Serial` header) and picks a different shard for each attempt, which is reflected in the response in the `X-Proxy-Attempt` header.
- Every forwarded request can later be inspected through `GET /api/history` or the UI.
- Every attempt picks the first available working random proxy from a shard and marks it as Offered. The total number of offers per used proxy is returned in the response in the `X-Proxy-Offered` header.
- In the event of no working proxies in a shard, proxy pool exhaustion errors can apply backpressure and slow down the issuing of serial numbers through a simple leaky bucket algorithm.
- Every succeeded attempt through a proxy increases its Success Rate (Succeeded/Offered), which is also calculated per hour. The total number of succeeded attempts of the used proxy is returned via the `X-Proxy-Succeed` header. The proxy used is returned in the `X-Proxy-Through` header.
- Every failed attempt marks the proxy as not working and suspends offering it for 5 minutes.
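The pool behaviour described above can be illustrated with a small model. This is a sketch under stated assumptions, not the project's implementation: the class and method names (`ShardedPool`, `Entry`, `offer`, `report`, `forward`) are hypothetical, proxies are distributed over shards round-robin for simplicity, and the leaky bucket backpressure and per-hour statistics are left out. Only the header names, the Succeeded/Offered ratio, and the 5 minute suspension come from the text.

```python
import random
import time
from dataclasses import dataclass

SUSPEND_SECONDS = 5 * 60  # a failed proxy is not offered again for 5 minutes


@dataclass
class Entry:
    proxy: str                     # e.g. "socks5://203.0.113.7:1080"
    offered: int = 0
    succeeded: int = 0
    suspended_until: float = 0.0

    @property
    def success_rate(self) -> float:
        """Succeeded/Offered, or 0.0 for a proxy that was never offered."""
        return self.succeeded / self.offered if self.offered else 0.0


class ShardedPool:
    def __init__(self, proxies, shards=4, clock=time.monotonic, rng=None):
        self.clock = clock
        self.rng = rng or random.Random()
        self.shards = [[] for _ in range(shards)]
        for i, proxy in enumerate(proxies):  # assumption: round-robin placement
            self.shards[i % shards].append(Entry(proxy))
        self.serial = 0

    def offer(self, shard_index):
        """Pick a random working proxy from one shard and mark it as Offered."""
        now = self.clock()
        working = [e for e in self.shards[shard_index] if e.suspended_until <= now]
        if not working:
            return None  # shard exhausted
        entry = self.rng.choice(working)
        entry.offered += 1
        return entry

    def report(self, entry, ok):
        """Record the outcome of an attempt; failures suspend the proxy."""
        if ok:
            entry.succeeded += 1
        else:
            entry.suspended_until = self.clock() + SUSPEND_SECONDS

    def forward(self, attempt_fn, max_attempts=3):
        """Try distinct shards until `attempt_fn(proxy)` returns True."""
        self.serial += 1
        count = min(max_attempts, len(self.shards))
        order = self.rng.sample(range(len(self.shards)), k=count)
        for attempt, shard_index in enumerate(order, start=1):
            entry = self.offer(shard_index)
            if entry is None:
                continue
            ok = attempt_fn(entry.proxy)
            self.report(entry, ok)
            if ok:
                return {
                    "X-Proxy-Serial": self.serial,
                    "X-Proxy-Attempt": attempt,
                    "X-Proxy-Through": entry.proxy,
                    "X-Proxy-Offered": entry.offered,
                    "X-Proxy-Succeed": entry.succeeded,
                }
        return None  # every attempted shard was exhausted or failed
```

Injecting `clock` and `rng` keeps the model deterministic in tests, for example to step past the 5 minute suspension without sleeping.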
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
http://localhost:8089/ shows the current source refresh status and stats.
http://localhost:8089/proxies provides a search interface over the active pool of found proxies. By default, entries are sorted with the most recently working proxies on top.
http://localhost:8089/history provides a search interface over the last 1000 forwarding attempts (configurable).
http://localhost:8089/blacklist provides a search interface over unsuccessful probes.
The configuration file is looked up in the following paths:
$PWD/slrp.yml
$PWD/config.yml
$HOME/.slrp/config.yml
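As an illustration of this lookup, here is a minimal Python sketch. It assumes the paths are tried in the order listed and that the first existing file wins; the README does not state the precedence, so treat that as an assumption. The function names `candidate_paths` and `find_config` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def candidate_paths(cwd: Path, home: Path) -> list:
    """The three locations named above, in the order they are listed."""
    return [
        cwd / "slrp.yml",
        cwd / "config.yml",
        home / ".slrp" / "config.yml",
    ]


def find_config(cwd: Optional[Path] = None, home: Optional[Path] = None) -> Optional[Path]:
    """Return the first candidate that exists, or None to fall back to defaults."""
    cwd = cwd if cwd is not None else Path.cwd()
    home = home if home is not None else Path.home()
    for path in candidate_paths(cwd, home):
        if path.is_file():
            return path
    return None
```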
Default configuration is approximately the following:

    app:
      state: $HOME/.slrp/data
      sync: 1m
    log:
      level: info
      format: pretty
    server:
      addr: "localhost:8089"
      read_timeout: 15s
    mitm:
      addr: "localhost:8090"
      read_timeout: 15s
      idle_timeout: 15s
      write_timeout: 15s
    pprof:
      enable: false
      addr: "localhost:6060"
    checker:
      timeout: 5s
      strategy: simple
    history:
      limit: 1000

Every configuration property can be overridden through an environment variable by using the `SLRP_` prefix followed by the section name and key, divided by `_`. For example, to set the log level to trace, run `SLRP_LOG_LEVEL=TRACE slrp`.
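The naming rule can be expressed as a short Python sketch. The helper names `env_name` and `resolve` are hypothetical, and the sketch only models the name mapping and the precedence of the environment over the file value; how the application normalizes values such as `TRACE` is not covered here.

```python
import os

PREFIX = "SLRP"


def env_name(section: str, key: str) -> str:
    """Build the variable name: prefix, section, and key joined by underscores."""
    return "_".join([PREFIX, section, key]).upper()


def resolve(section: str, key: str, file_value, environ=None):
    """Return the environment override if present, else the value from the file."""
    environ = os.environ if environ is None else environ
    return environ.get(env_name(section, key), file_value)
```

For example, `env_name("mitm", "read_timeout")` yields `SLRP_MITM_READ_TIMEOUT`.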
Fabric that holds application components together.
- `state` - where data persists on disk through restarts of the application. Default is `.slrp/data` of your home directory.
- `sync` - how often data is synchronised to disk, pending availability of any updates of component state. Default is every minute.
Structured logging meta-components.
- `level` - log level of the application. Default is `info`. Possible values are `trace`, `debug`, `info`, `warn`, and `error`.
- `format` - format of the log lines printed. Default is `pretty`, though for performance reasons it's recommended for exploratory use only. Possible values are `pretty`, `json`, and `file` (experimental). `file` will create a `$PWD/slrp.log`, unless another path is specified by the `log.file` property.
- `file` (experimental) - application logs in JSON format. Default value is `$PWD/slrp.log`.
API and UI serving component.
- `addr` - address of the listening HTTP server. Default is http://127.0.0.1:8089.
- `read_timeout` - default is `15s`.
- `enable_profiler` - whether to enable profiler endpoints. Default is `false`. Developer use only.
HTTP proxy frontend.
- `addr` - address of the listening HTTP proxy server. Default is http://127.0.0.1:8090.
- `read_timeout` - default is `15s`.
- `idle_timeout` - default is `15s`.
- `write_timeout` - default is `15s`.
Component for verification of proxy liveness and anonymity.
- `timeout` - time to wait while performing verification. Default is `5s`.
- `strategy` - verification strategy to check the IP of the proxy. Default is `simple`, which will randomly select one of the publicly available sites: ifconfig.me, ifconfig.io, myexternalip.com, ipv4.icanhazip.com/, https://ipinfo.io/, api.ipify.org/, or wtfismyip.com. Another strategy is `headers`, which will look in https://ifconfig.me/all or https://ifconfig.io/all.json for the real IP address, which might have been added to HTTP headers while forwarding. And there's the `twopass` strategy, which will first perform the `simple` check and the `headers` check afterwards.
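The relationship between the three strategies can be sketched as follows. This is an illustration under loud assumptions, not the project's checker: `Fetch` is a hypothetical callable that requests a site through the proxy under test and returns the body as text, the `simple` check is assumed to pass when the site echoes the proxy's IP, and the `headers` check is assumed to pass when the real client IP is absent from the echoed request dump. Only the site names and the ordering of `twopass` come from the text above.

```python
import random
from typing import Callable

# Sites named in this README for each strategy.
SIMPLE_SITES = [
    "ifconfig.me",
    "ifconfig.io",
    "myexternalip.com",
    "ipv4.icanhazip.com",
    "ipinfo.io",
    "api.ipify.org",
    "wtfismyip.com",
]
HEADERS_SITES = ["https://ifconfig.me/all", "https://ifconfig.io/all.json"]

# Hypothetical helper type: fetches a site through the proxy, returns body text.
Fetch = Callable[[str], str]


def check_simple(fetch: Fetch, proxy_ip: str, rng: random.Random) -> bool:
    """Assumption: passes when a randomly chosen site reports the proxy's IP."""
    return proxy_ip in fetch(rng.choice(SIMPLE_SITES))


def check_headers(fetch: Fetch, real_ip: str, rng: random.Random) -> bool:
    """Assumption: passes when our real IP does not appear in the echoed request."""
    return real_ip not in fetch(rng.choice(HEADERS_SITES))


def check_twopass(fetch: Fetch, proxy_ip: str, real_ip: str, rng: random.Random) -> bool:
    """Run the simple check first and the headers check afterwards."""
    return check_simple(fetch, proxy_ip, rng) and check_headers(fetch, real_ip, rng)
```

A proxy that forwards the client address in a header such as `X-Forwarded-For` would pass `check_simple` but fail `check_twopass` in this model.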
Component for recording forwarded requests through a pool of proxies.
- `limit` - number of requests to keep in memory. Default is `1000`.
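A bounded in-memory history of this kind is commonly built on a fixed-size ring buffer. The sketch below uses Python's `collections.deque` with `maxlen` to show the effect of `limit`; the `History` class and its record fields are hypothetical and do not describe the project's actual storage.

```python
from collections import deque


class History:
    def __init__(self, limit: int = 1000):
        # Once `limit` items are stored, appending drops the oldest one.
        self._items = deque(maxlen=limit)

    def record(self, serial: int, attempt: int, proxy: str, ok: bool) -> None:
        """Store one forwarding attempt."""
        self._items.append(
            {"serial": serial, "attempt": attempt, "proxy": proxy, "ok": ok}
        )

    def last(self, n: int) -> list:
        """Return up to n most recent attempts, newest first."""
        return list(reversed(self._items))[:n]

    def __len__(self) -> int:
        return len(self._items)
```

With `limit` set to 1000, memory use stays bounded no matter how many requests are forwarded, at the cost of losing older attempts.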
You can optionally enable this feature. This product includes GeoLite2 Data created by MaxMind, available from https://www.maxmind.com.
- `license` - your (free) license key for MaxMind downloads. You can skip specifying the license key if `mmdb_asn` and `mmdb_city` are already downloaded in any other way and configured.
- `mmdb_asn` - already (or automatically) downloaded snapshot of the MaxMind database. Default is `$HOME/.slrp/maxmind/GeoLite2-ASN.mmdb`
- `mmdb_city` - already (or automatically) downloaded snapshot of the MaxMind database. Default is `$HOME/.slrp/maxmind/GeoLite2-City.mmdb`
Retrieve last sync status for all components
Get information about refresh status for all sources
Get 20 last used proxies
Get 100 last forwarding attempts
Get sanitized HTTP response from forwarding attempt
Get first 20 blacklisted items sorted by proxy along with common error stats
- ProxyBroker is a pretty similar project in nature. It requires a couple of Python module dependencies and had its last commit in March 2019.
- Scylla is a pretty similar project in nature. It requires a couple of Python module dependencies.
- ProxyBuilder