bipabo1l/scrapycluster_url_processing
Process crawler crawl Json data from Kafka Scrapy CLuster, filter out url information and de-duplicate, then put the results into mongodb
Python
Process crawler crawl Json data from Kafka Scrapy CLuster, filter out url information and de-duplicate, then put the results into mongodb
Python