/covid19-with-pulsar

Parsing Covid19 data with Pulsar and Cassandra


Pulsar to Cassandra with Covid19


If you just want to try Pulsar with Elasticsearch, check out this link.

Step 1: Make sure Docker is installed and running on your machine, with enough resources allocated


  1. Right-click the Docker Desktop icon
  2. Select Preferences
  3. Select Resources
  4. Set CPUs to 6
  5. Set Memory to at least 6GB
  6. Press the Apply & Restart button to apply the changes

Step 2: Start Pulsar and Elasticsearch (about 2GB to download, so this may take a long time)

docker-compose up

Step 3: Open another terminal tab

Validate that Pulsar is running. Wait until the broker logs report that the messaging service is ready:

docker logs pulsar | grep "messaging service is ready" 

You should see a line like this in the logs:

23:26:24.517 [main] INFO org.apache.pulsar.broker.PulsarService - messaging service is ready
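
If you would rather not re-run the grep by hand, the same check can be polled from Python. This is only a sketch: it assumes the container is named pulsar (as in the command above), and the function names, timeout, and poll interval are made up for illustration.

```python
import subprocess
import time

# The marker text comes from the log line shown above.
READY_MARKER = "messaging service is ready"


def is_ready(log_text: str) -> bool:
    """Return True if the broker's ready line appears in the log text."""
    return READY_MARKER in log_text


def wait_for_pulsar(container: str = "pulsar",
                    timeout_s: float = 600.0,
                    poll_s: float = 5.0) -> bool:
    """Poll `docker logs` until the ready line shows up or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["docker", "logs", container],
            capture_output=True,
            text=True,
        )
        # `docker logs` can write to both stdout and stderr, so check both.
        if is_ready(result.stdout + result.stderr):
            return True
        time.sleep(poll_s)
    return False
```

Calling `wait_for_pulsar()` returns True once the broker is ready, or False if the (assumed) ten-minute timeout passes first.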

Step 4: Load the Covid19 data into Pulsar

docker run  -ti --network covid19-with-pulsar_default -v `pwd`/python_client:/usr/src/app   apachepulsar/pulsar  python3.7 /usr/src/app/covid19_datacleaner.py
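
For orientation, here is roughly what publishing records to Pulsar from Python looks like. This is not the contents of covid19_datacleaner.py. The topic name, the service URL, and the record fields below are all assumptions for illustration. Only the pulsar-client calls (`pulsar.Client`, `create_producer`, `send`, `close`) are the library's real API.

```python
import json


def encode_record(record: dict) -> bytes:
    """Serialize one record as UTF-8 JSON, since Pulsar messages are raw bytes."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def publish(records, service_url="pulsar://pulsar:6650", topic="covid19"):
    """Send each record to a Pulsar topic.

    service_url and topic are hypothetical defaults; adjust them to
    whatever your setup actually uses.
    """
    # Imported here so encode_record works without pulsar-client installed.
    import pulsar

    client = pulsar.Client(service_url)
    try:
        producer = client.create_producer(topic)
        for record in records:
            producer.send(encode_record(record))
    finally:
        # Closing the client also closes producers created from it.
        client.close()
```

For example, `publish([{"country": "X", "confirmed": 1}])` would send one JSON message, provided it runs somewhere that can resolve the broker's hostname, such as a container on the same Docker network.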

Step 5: Open another terminal tab

Step 6: Start the Elasticsearch connector

docker run  -ti --network covid19-with-pulsar_default -v `pwd`:/usr/src/app   apachepulsar/pulsar /usr/src/app/pulsar_to_elasticsearch_localrun.sh

Step 7: Open another terminal tab

Step 8: Get some records! Refresh the index, then search it:

curl -s http://localhost:9200/my_index/_refresh
curl -s http://localhost:9200/my_index/_search
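
The same two requests can be made from Python if you want to work with the records rather than just print them. A minimal sketch using only the standard library: the host, port, and index name are taken from the curl commands above, while the helper names are made up. It assumes the usual Elasticsearch search response shape, where documents sit under hits.hits[]._source.

```python
import json
import urllib.request


def index_url(action: str,
              base: str = "http://localhost:9200",
              index: str = "my_index") -> str:
    """Build the URL for an index-level endpoint such as _refresh or _search."""
    return f"{base}/{index}/{action}"


def extract_sources(search_response: dict) -> list:
    """Pull the stored documents out of a search response."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits]


def fetch_records(base: str = "http://localhost:9200",
                  index: str = "my_index") -> list:
    """Refresh the index so recent writes are searchable, then run a search."""
    with urllib.request.urlopen(index_url("_refresh", base, index)) as resp:
        resp.read()
    with urllib.request.urlopen(index_url("_search", base, index)) as resp:
        return extract_sources(json.load(resp))
```

With no query, a search returns only the first page of results, so `fetch_records()` gives you a sample of the index rather than all of it.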