Paginator is a service for fetching HTML documents, with JavaScript support.
- Requires Java 8 or later
- Chrome must be installed on the machine
- Please report issues and feature requests on GitHub: https://github.com/YunaBraska/paginator
ENV VARIABLE | DEFAULT | DESCRIPTION |
---|---|---|
SERVER_PORT | 8089 | Server port |
N/A | 10000 | HTML page cache limit |
N/A | 10800000 ms (3 hours) | HTML page cache lifetime |
METHOD | URL | REQUEST BODY | RETURN BODY | DESCRIPTION |
---|---|---|---|---|
GET/PUT | /pages | url, page_cache_ms* [optional] | | Get HTML page from URL |
GET/PUT | /pages/elements | url, `Map<queryId, cssQuery>`, page_cache_ms* [optional] | `Map<queryId, List<Elements>>` | Get specific HTML elements |
GET/PUT | /pages | url, content, page_cache_ms* [optional] | | Manually add HTML page to cache |
GET/PUT | /pages/statistics | | size, maxLifeTime, sizeLimit | Get cache statistics |
* page_cache_ms is optional. A value sent on a second call does not overwrite the value set by the first call.
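A minimal client sketch for the `/pages/elements` endpoint. It assumes the service accepts a JSON body with the fields `url`, `css_queries`, and `page_cache_ms`, sent with a `Content-Type: application/json` header. Neither the header nor the spelling of `page_cache_ms` as a body field is confirmed by this README. The helper names `build_elements_request` and `send` are hypothetical. Building the request is kept separate from sending it, so the payload can be inspected without a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8089"  # default SERVER_PORT from the table above


def build_elements_request(page_url, css_queries, page_cache_ms=None,
                           method="GET", base_url=BASE_URL):
    """Build (but do not send) a /pages/elements request with a JSON body."""
    body = {"url": page_url, "css_queries": css_queries}
    if page_cache_ms is not None:
        # Assumption: the optional cache lifetime is a top-level JSON field.
        body["page_cache_ms"] = page_cache_ms
    return urllib.request.Request(
        url=base_url + "/pages/elements",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method=method,
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response (needs a running Paginator)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_elements_request(
    "parse.example.com", {"form_text": "form p"}, page_cache_ms=60000
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The table also lists PUT for this endpoint. Passing `method="PUT"` may be the safer choice when a proxy or HTTP library drops the body of a GET request.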
- Request:
    GET http://localhost:8089/pages/elements
- Body:
    {
      "url": "parse.example.com",
      "css_queries": {
        "form_text": "form p"
      }
    }
- Response:
    {
      "form_text": [
        {
          "tag": "P",
          "text": "Some example text here.",
          "selector": "html > body > div > form > p:nth-child(1)",
          "attributes": {},
          "children": []
        }
      ]
    }
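The response maps each query id to a list of matched elements. The sketch below shows one way a client might pull the text and selector out of that structure. The sample data is copied from the response above; the function names `texts_by_query` and `selectors_by_query` are hypothetical, and the code assumes every element carries the `text` and `selector` keys shown in the example.

```python
import json

# Sample response copied from the example above.
SAMPLE_RESPONSE = """
{
  "form_text": [
    {
      "tag": "P",
      "text": "Some example text here.",
      "selector": "html > body > div > form > p:nth-child(1)",
      "attributes": {},
      "children": []
    }
  ]
}
"""


def texts_by_query(response):
    """Map each query id to the text of its matched elements."""
    return {
        query_id: [element.get("text", "") for element in elements]
        for query_id, elements in response.items()
    }


def selectors_by_query(response):
    """Map each query id to the CSS selector path of its matched elements."""
    return {
        query_id: [element.get("selector", "") for element in elements]
        for query_id, elements in response.items()
    }


parsed = json.loads(SAMPLE_RESPONSE)
print(texts_by_query(parsed))      # {'form_text': ['Some example text here.']}
print(selectors_by_query(parsed))
```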
- Request:
    POST http://localhost:8089/pages
- Body:
    {
      "url": "my.own.example.com",
      "content": "<!doctype html><html><head><title>Example Domain</title></head><body><div><h1>Example page</h1></div></body></html>"
    }
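When the HTML being cached contains double quotes or newlines, the `content` value has to be escaped to remain valid JSON. A minimal sketch, assuming the same JSON body shape as the request above and an assumed `Content-Type: application/json` header; the function name `build_cache_request` is hypothetical.

```python
import json
import urllib.request


def build_cache_request(page_url, html, base_url="http://localhost:8089"):
    """Build (but do not send) a POST /pages request that stores html under page_url."""
    # json.dumps escapes quotes, backslashes, and newlines inside the HTML.
    payload = json.dumps({"url": page_url, "content": html})
    return urllib.request.Request(
        url=base_url + "/pages",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


html = '<!doctype html><html><body><a href="/next">Next "page"</a></body></html>'
request = build_cache_request("my.own.example.com", html)
print(request.data.decode("utf-8"))

# Sending requires a running Paginator instance:
# urllib.request.urlopen(request, timeout=30)
```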
- Create jar file:
mvn clean -Dmaven.test.skip=true package
- Build the local image:
docker build -t paginator .
- Tag the image as latest for the repository:
docker tag paginator SOME_REPO_PATH/paginator:latest
- Push the image to the repository:
docker push SOME_REPO_PATH/paginator:latest
- Async page call implementation (remove `synchronized`)
- Endpoint to clear the cache
- Configurable default cache limits