- Dockerized
- Kafka Input/Output
Deploy to Kubernetes:
envsubst < ./kubernetes/pubmed-parser-deploy.yaml | kubectl apply -f -
Consumer command (arguments: Kafka broker, input topic, output topic):
["python3", "-m", "server.kafka_consumer", "kafka:9092", "gzfiles", "pubmeds"]
Input topic: gzfiles
Input format: { "path": "pubmed_baseline/pubmed19n0971.xml.gz", "limit": 10 }
Output topic: pubmeds
Output format: JSON Lines
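
The message formats above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the parser's own code: `build_request` and `parse_json_lines` are hypothetical helper names, and it assumes messages are UTF-8 encoded and that an output payload may carry one or more JSON Lines records. Sending and receiving through a Kafka client is deliberately left out.

```python
import json


def build_request(path, limit):
    """Encode one message for the input topic as UTF-8 JSON bytes.

    The keys mirror the documented input format; UTF-8 is an assumption.
    """
    return json.dumps({"path": path, "limit": limit}).encode("utf-8")


def parse_json_lines(payload):
    """Decode a JSON Lines payload (bytes or str) into a list of objects.

    Each non-empty line is parsed as one JSON value. Splitting on "\n"
    only, so other Unicode line separators inside strings stay intact.
    """
    text = payload.decode("utf-8") if isinstance(payload, bytes) else payload
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Example request matching the documented input format.
request = build_request("pubmed_baseline/pubmed19n0971.xml.gz", 10)
print(request.decode("utf-8"))
```

The bytes returned by `build_request` would be the value of a message published to `gzfiles`; `parse_json_lines` can be applied to a message value read from `pubmeds`. The fields of the output records are not specified here, so the sketch returns them as plain dictionaries.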