Hello Samza Base is derived from Samza's Hello Samza project (apache/samza-hello-samza@2214946).
It removes the bootstrapping and cluster-management tools to serve as a boilerplate from which you can create your own simple Samza tasks.
To run the tasks, you must already have the Hello Samza infrastructure set up. Once that is ready, compile the tasks in this repository and run them on the existing cluster:
mvn clean package
mkdir -p deploy/samza
tar -xvf ./target/hello-samza-0.10.1-dist.tar.gz -C deploy/samza
deploy/samza/bin/run-job.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-feed.properties
deploy/samza/bin/run-job.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-parser.properties
deploy/samza/bin/run-job.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-stats.properties
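The three run-job.sh invocations above differ only in the properties file they point to, so they can be driven from a loop. The following is a minimal sketch, not part of the original project: `DEPLOY_DIR`, `job_command`, and the `DRY_RUN` switch are hypothetical names introduced here. By default it only prints the commands; set `DRY_RUN=0` to actually submit the jobs.

```shell
#!/bin/sh
# Sketch: build the run-job.sh command line for each job config.
# DEPLOY_DIR, job_command and DRY_RUN are illustrative names, not part of Samza.
DEPLOY_DIR="${DEPLOY_DIR:-$PWD/deploy/samza}"
CONFIG_FACTORY="org.apache.samza.config.factories.PropertiesConfigFactory"

job_command() {
  # $1 is the job name, e.g. wikipedia-feed
  printf '%s --config-factory=%s --config-path=file://%s/config/%s.properties\n' \
    "$DEPLOY_DIR/bin/run-job.sh" "$CONFIG_FACTORY" "$DEPLOY_DIR" "$1"
}

for job in wikipedia-feed wikipedia-parser wikipedia-stats; do
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only show what would be run.
    job_command "$job"
  else
    "$DEPLOY_DIR/bin/run-job.sh" \
      --config-factory="$CONFIG_FACTORY" \
      --config-path="file://$DEPLOY_DIR/config/$job.properties"
  fi
done
```

If you rename the properties files, only the job list in the `for` loop needs to change.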
There are some customizations you should make to turn this repository into your own.
- Update pom.xml with your project, version, organization, and developer details.
- This will change the deploy commands above. E.g. hello-samza and 0.10.1 in the tarball name are driven by artifactId and version.
- Remove or update LICENSE to match your licensing needs.
- Remove or update the code under src/main/java/samza.
- Remove or update the job configuration files matching src/main/config/wikipedia-*.properties.
- This will change the deploy commands above. E.g. replace wikipedia-feed.properties with your new properties file.
- Update or remove this README.
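To illustrate how the pom.xml change feeds into the deploy commands, here is a small sketch that derives the distribution tarball path from artifactId and version. It assumes the `-dist.tar.gz` suffix seen in the commands above stays unchanged; `dist_tarball`, `my-samza-jobs`, and `1.0.0` are hypothetical names and values used only for illustration.

```shell
#!/bin/sh
# Sketch: derive the distribution tarball path from the pom.xml coordinates.
# Assumes the "-dist.tar.gz" suffix from the deploy commands is unchanged.
dist_tarball() {
  # $1 = artifactId, $2 = version
  printf './target/%s-%s-dist.tar.gz\n' "$1" "$2"
}

# Values from the original project:
dist_tarball hello-samza 0.10.1    # prints ./target/hello-samza-0.10.1-dist.tar.gz

# Hypothetical values after customizing pom.xml:
ARTIFACT_ID="my-samza-jobs"
VERSION="1.0.0"
echo "tar -xvf $(dist_tarball "$ARTIFACT_ID" "$VERSION") -C deploy/samza"
```

The last line prints the adjusted tar command rather than running it, so you can check the path against the contents of ./target first.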