Multifetcher is a web server that makes parallel requests to external resources on a client's behalf. This lightweight service uses Python's asynchronous I/O to fetch data from multiple URLs concurrently and efficiently.
- Asynchronous HTTP requests to external resources
- Dockerized application for easy setup and isolation
- HTTP POST API to receive multiple request details
- Streamed responses for real-time results
- Timeout handling for each request
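Conceptually, the concurrent fetching and per-request timeouts work along these lines. This is only a minimal sketch, not the actual server code: it assumes the aiohttp library, and the function names (`fetch_one`, `fetch_all`) and the 10-second default timeout are illustrative.

```python
import asyncio
import json

import aiohttp  # assumed async HTTP client; the real server may differ


async def fetch_one(session: aiohttp.ClientSession, spec: dict, timeout: float = 10.0) -> dict:
    """Fetch a single request spec ({"id", "method", "url", "headers"}) with its own timeout."""
    try:
        async with session.request(
            spec.get("method", "GET"),
            spec["url"],
            headers=spec.get("headers"),
            timeout=aiohttp.ClientTimeout(total=timeout),
        ) as resp:
            body = await resp.text()
            return {"id": spec["id"], "url": spec["url"], "response": body}
    except asyncio.TimeoutError:
        # A slow upstream only affects its own entry, not the whole batch.
        return {"id": spec["id"], "url": spec["url"], "error": "timeout"}


async def fetch_all(specs: list[dict]) -> list[dict]:
    """Run all request specs concurrently and collect their results."""
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch_one(session, s) for s in specs))


if __name__ == "__main__":
    specs = [{"id": "1", "method": "GET", "url": "https://httpbin.org/json"}]
    for result in asyncio.run(fetch_all(specs)):
        print(json.dumps(result))
```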
These instructions will get you a copy of the project up and running on your local machine.
- Docker
- Clone this repository:
git clone https://github.com/yourusername/multifetcher.git
cd multifetcher
- Build the Docker image:
docker build -t multifetcher .
- Run the Docker container:
docker run -d -p 8000:8000 multifetcher
The server is now running at http://localhost:8000.
Multifetcher listens for POST requests at its root URL. The body of the request should be a JSON array of objects representing the HTTP requests to make. An example POST request body might look like this:
[
  {
    "id": "1",
    "method": "GET",
    "url": "https://google.com"
  },
  {
    "id": "2",
    "method": "GET",
    "headers": {"Cookie": "foo=bar"},
    "url": "https://yandex.ru"
  },
  {
    "id": "3",
    "method": "GET",
    "headers": {"Foo": "Bar"},
    "url": "https://httpbin.org/json"
  }
]
The server responds with a stream of newline-separated JSON objects. Each object corresponds to a response from one of the HTTP requests:
{"id": "1", "url": "https://google.com", "response": "<!doctype html><html itemscope=\"\"<...>"}
{"id": "2", "url": "https://yandex.ru", "response": "<!DOCTYPE html><html class=\"i-ua_js_<...>"}
{"id": "3", "url": "https://httpbin.org/json", "response": "{\n \"slideshow\": {\n<...>"}
You can test Multifetcher by running the provided test.py script:
python test.py
This script sends a series of test HTTP requests to the server and prints the responses.
This project is licensed under the MIT License. See the LICENSE file for details.