Java idiomatic client for Dataproc.
If you are using Maven with BOM, add this to your pom.xml file:
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>9.1.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-dataproc</artifactId>
  </dependency>
</dependencies>
If you are using Maven without BOM, add this to your dependencies:
<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-dataproc</artifactId>
  <version>1.0.0</version>
</dependency>
If you are using Gradle, add this to your dependencies:
implementation 'com.google.cloud:google-cloud-dataproc:1.1.0'
If you are using SBT, add this to your dependencies:
libraryDependencies += "com.google.cloud" % "google-cloud-dataproc" % "1.1.0"
See the Authentication section in the base directory's README.
You will need a Google Cloud Platform Console project with the Dataproc API enabled.
You will need to enable billing to use Google Dataproc.
Follow these instructions to get your project set up. You will also need to set up the local development environment by
installing the Google Cloud SDK and running the following commands on the command line:
`gcloud auth login` and `gcloud config set project [YOUR PROJECT ID]`.
You'll need to obtain the `google-cloud-dataproc` library. See the Quickstart section to add `google-cloud-dataproc` as a dependency in your code.
Dataproc is a faster, easier, more cost-effective way to run Apache Spark and Apache Hadoop.
See the Dataproc client library docs to learn how to use this Dataproc Client Library.
Samples are in the `samples/` directory. The samples' `README.md` has instructions for running the samples.
Sample | Source Code |
---|---|
Create Cluster | source code |
Create Cluster With Autoscaling | source code |
Instantiate Inline Workflow Template | source code |
Quickstart | source code |
Submit Hadoop Fs Job | source code |
To get help, follow the instructions in the shared Troubleshooting document.
Dataproc uses gRPC for the transport layer.
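Because the transport is gRPC, the client connects to a `host:port` endpoint. Dataproc exposes a global endpoint, `dataproc.googleapis.com:443`, and regional endpoints of the form `REGION-dataproc.googleapis.com:443`. The sketch below shows one way to build that string; the class `DataprocEndpoints` and its method are hypothetical helpers for illustration, not part of this library, and treating a blank or `global` region as the global endpoint is an assumption of the sketch.

```java
/** Hypothetical helper (not part of google-cloud-dataproc) that builds gRPC endpoints. */
final class DataprocEndpoints {
    /** Global Dataproc endpoint, used when no specific region is requested. */
    static final String GLOBAL_ENDPOINT = "dataproc.googleapis.com:443";

    private DataprocEndpoints() {}

    /**
     * Returns the host:port endpoint for a region such as "us-central1".
     * Assumption of this sketch: null, blank, or "global" maps to the global endpoint.
     */
    static String forRegion(String region) {
        String trimmed = region == null ? "" : region.trim();
        if (trimmed.isEmpty() || trimmed.equals("global")) {
            return GLOBAL_ENDPOINT;
        }
        return trimmed + "-dataproc.googleapis.com:443";
    }
}
```

The resulting string is the kind of value you would typically hand to the settings builder's `setEndpoint(...)` before creating a client for a regional cluster.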
Java 7 or above is required for using this client.
This library follows Semantic Versioning.
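To illustrate what Semantic Versioning implies when you change the pinned version of this dependency, here is a minimal sketch that parses plain `MAJOR.MINOR.PATCH` strings and checks whether an upgrade stays within the same major version. The class `SemVer` is a hypothetical helper, not part of this library, and it deliberately ignores pre-release and build-metadata suffixes.

```java
/** Hypothetical helper (not part of google-cloud-dataproc) for plain MAJOR.MINOR.PATCH versions. */
final class SemVer implements Comparable<SemVer> {
    final int major;
    final int minor;
    final int patch;

    private SemVer(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    /** Parses "1.1.0" style strings; suffixes such as "-beta" are rejected in this sketch. */
    static SemVer parse(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH but got: " + version);
        }
        return new SemVer(
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2]));
    }

    /** Orders versions by major, then minor, then patch. */
    @Override
    public int compareTo(SemVer other) {
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        if (minor != other.minor) {
            return Integer.compare(minor, other.minor);
        }
        return Integer.compare(patch, other.patch);
    }

    /**
     * True when moving from {@code older} to this version should not break callers:
     * same major version (1 or higher) and not a downgrade.
     */
    boolean isCompatibleUpgradeFrom(SemVer older) {
        return major > 0 && major == older.major && compareTo(older) >= 0;
    }
}
```

Under these rules, moving from 1.0.0 to 1.1.0 counts as a compatible upgrade, while moving from 1.1.0 to 2.0.0 does not, because a major version bump signals breaking changes.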
Contributions to this library are always welcome and highly encouraged.
See CONTRIBUTING for more information on how to get started.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. See Code of Conduct for more information.
Apache 2.0 - See LICENSE for more information.
Java Version | Status |
---|---|
Java 7 | |
Java 8 | |
Java 8 OSX | |
Java 8 Windows | |
Java 11 | |