REQUIRED ENVIRONMENT VARIABLES
# MongoDB to store posts
MONGODB_HOST
MONGODB_PORT
MONGODB_DB
MONGODB_USER
MONGODB_PASSWORD
MONGODB_AUTH_DB
MONGODB_RESULT_COLLECTION
- collection to store posts
MONGODB_TASK_COLLECTION
- collection to store completed tasks
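As an illustration of how the MONGODB_* variables fit together, here is a minimal sketch that assembles a standard MongoDB connection URI from them. The function name `build_mongodb_uri` is hypothetical and not part of this project; the sketch assumes MONGODB_AUTH_DB is meant to be passed as the `authSource` option. The two collection variables are not part of the URI and would be used after connecting.

```python
import os
from urllib.parse import quote_plus


def build_mongodb_uri(env=os.environ):
    """Assemble a MongoDB connection URI from the MONGODB_* variables.

    Hypothetical helper: the project's real connection code may differ.
    """
    # Credentials are percent-encoded so characters like '@' or ':' are safe.
    user = quote_plus(env["MONGODB_USER"])
    password = quote_plus(env["MONGODB_PASSWORD"])
    host = env["MONGODB_HOST"]
    port = env["MONGODB_PORT"]
    db = env["MONGODB_DB"]
    # Assumption: MONGODB_AUTH_DB is the database that holds the user's credentials.
    auth_db = quote_plus(env["MONGODB_AUTH_DB"])
    return f"mongodb://{user}:{password}@{host}:{port}/{db}?authSource={auth_db}"
```

The resulting string can be handed to any MongoDB client that accepts a connection URI.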
# RabbitMQ to distribute tasks
RABBITMQ_HOST
RABBITMQ_PORT
RABBITMQ_USER
RABBITMQ_PASSWORD
RABBITMQ_QUEUE
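Since the MongoDB and RabbitMQ variables above are all required, a startup check can report every missing one at once instead of failing on the first. This is only a sketch; `REQUIRED_VARIABLES` and `missing_variables` are hypothetical names, not part of this project.

```python
import os

# Names taken from the MongoDB and RabbitMQ lists above.
REQUIRED_VARIABLES = [
    "MONGODB_HOST",
    "MONGODB_PORT",
    "MONGODB_DB",
    "MONGODB_USER",
    "MONGODB_PASSWORD",
    "MONGODB_AUTH_DB",
    "MONGODB_RESULT_COLLECTION",
    "MONGODB_TASK_COLLECTION",
    "RABBITMQ_HOST",
    "RABBITMQ_PORT",
    "RABBITMQ_USER",
    "RABBITMQ_PASSWORD",
    "RABBITMQ_QUEUE",
]


def missing_variables(env=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```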
# Proxy settings
PROXIES_FILE_PATH_OR_URL
- may be a URL (requesting it returns a list of proxies), a file path, or empty (no proxy is used)
PROXIES_TYPE
- http(s), socks5, or empty (when no proxy is used)
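The three forms that PROXIES_FILE_PATH_OR_URL can take might be told apart roughly as follows. This is a sketch under assumptions: `proxy_source_kind` is a hypothetical helper, and it assumes that a URL always starts with `http://` or `https://` and that anything else non-empty is a file path. The project's own detection logic may differ.

```python
import os


def proxy_source_kind(value):
    """Classify a PROXIES_FILE_PATH_OR_URL value as 'url', 'file', or 'none'."""
    value = (value or "").strip()
    if not value:
        # Empty or unset: no proxy is used.
        return "none"
    if value.startswith(("http://", "https://")):
        # Assumption: proxy list URLs use an http(s) scheme.
        return "url"
    # Assumption: any other non-empty value is a path to a proxies file.
    return "file"


if __name__ == "__main__":
    print(proxy_source_kind(os.environ.get("PROXIES_FILE_PATH_OR_URL")))
```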
# If you use GitLab CI/CD (.gitlab-ci.yml is in the project) with Kubernetes, you should also define:
K8S_SERVER
- Kubernetes API URL
K8S_CERT
- Kubernetes certificate
K8S_TOKEN
- Kubernetes token
MOUNT_CONTAINER_PATH
- if PROXIES_FILE_PATH_OR_URL is a file path, the path in the container
MOUNT_HOST_PATH
- if PROXIES_FILE_PATH_OR_URL is a file path, the path on your Kubernetes host

- add_tasks_initial.py - adds tasks (different keywords, dates, and so on) to RabbitMQ
- consumer-twitter-advanced-search_api.py - worker; executes tasks from RabbitMQ (parses tweets)
- job_task.py - parses tweets for every specified keyword and language for the last DAYS_COUNT days, keeping the data up to date. DAYS_COUNT defaults to 3
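
The DAYS_COUNT window used by job_task.py could be computed along these lines. This is a sketch, not the script's actual code: `days_count` and `last_days` are hypothetical names, and the sketch assumes the window includes the current day.

```python
import os
from datetime import date, timedelta


def days_count(env=os.environ):
    """Read DAYS_COUNT from the environment, falling back to the default of 3."""
    return int(env.get("DAYS_COUNT") or 3)


def last_days(count, today=None):
    """Return the most recent `count` dates, newest first.

    Assumption: the window starts at (and includes) today.
    """
    today = today or date.today()
    return [today - timedelta(days=offset) for offset in range(count)]


if __name__ == "__main__":
    for day in last_days(days_count()):
        print(day.isoformat())
```

Each returned date could then be combined with every keyword and language to form one task.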