The official Chalk client library for Java.
- at least JDK 17
The Java client is hosted on Maven Central.
Add the following dependency block to your build.gradle
.
implementation 'ai.chalk:chalk-java:0.12.0'
Add the following dependency block to your pom.xml
.
<dependencies>
<dependency>
<groupId>ai.chalk</groupId>
<artifactId>chalk-java</artifactId>
<version>0.12.0</version>
</dependency>
</dependencies>
Create a client using the ChalkClient.create()
method. The returned client gets its configuration:
- From environment variables
- From a ~/.chalk.yml file if (1) is not available
import ai.chalk.client.ChalkClient;
ChalkClient client = ChalkClient.create();
Alternatively, create a client using specific overrides. The overrides will take precedence over (1) and (2) above.
import ai.chalk.client.ChalkClient;
ChalkClient client = ChalkClient.builder()
.withEnvironmentId("tmnma7wesandrsn")
.withClientId("kjhsfd43kjhfd8fdjk")
.withClientSecret("mnxcvuyqwmj7cnwe4mgfnuywe6jnas9sdn")
.build();
Chalk uses Arrow objects during serialization of requests and deserialization of responses. These objects need to be closed to free up memory.
If you're creating an instance of ChalkClient
that persists throughout the lifetime of your application, you can
choose not to call the close()
method.
Otherwise, wrap the client in a try-with-resources block to ensure that the Arrow resources are closed properly.
try (ChalkClient client = ChalkClient.create()) {
// Use the client
}
When making a query, wrap it in a try-with-resources block, too.
try (OnlineQueryResult result = client.onlineQuery(params)) {
// Use the result
}
Use the method ChalkClient.printConfig
to debug a connection or authentication issue. This method prints all config
variables and their sources to stdout
. See the list of environment variables in ai.chalk.client.ConfigEnvVars
.
---------------------------------------------------------------------------------------------------------------------------
Config Value Source
---------------------------------------------------------------------------------------------------------------------------
Environment ID prod ChalkClient builder argument
Client ID client-4b8c614111f93f8255127e89d154a8a6 environment variable 'CHALK_CLIENT_ID'
Api Server http://127.0.0.1:8000 config file /Users/papan/.chalk.yml
Client Secret ************************************************* environment variable 'CHALK_CLIENT_SECRET'
---------------------------------------------------------------------------------------------------------------------------
If you're just getting started, here's an example query with minimal prerequisite setup steps:
import ai.chalk.client.ChalkClient;
import ai.chalk.models.OnlineQueryParams;
import ai.chalk.models.OnlineQueryResult;
import org.apache.arrow.vector.table.Table;
import org.apache.arrow.vector.table.Row;
import org.apache.arrow.vector.FieldVector;
import org.apache.arrow.vector.complex.reader.FieldReader;
public class Main {
public static void main(String[] args) {
try {
ChalkClient client = ChalkClient.create();
} catch (ChalkException e) {
e.printStackTrace();
return;
}
var userIds = Arrays.asList(1, 2, 3);
var params = OnlineQueryParams.builder()
.withInput("card_user.id", userIds)
.withOutputs("card_user.full_name")
.build();
try (OnlineQueryResult result = client.onlineQuery(params)) {
for (Row row : result.getScalarsTable()) {
long userId = row.getInt64("card_user.id");
double meanSpent = row.getFloat8("card_user.spending_mean_30d");
System.out.println("User " + userId + " spent an average of $" + meanSpent + " per day in the last 30 days");
}
} catch (ChalkException e) {
e.printStackTrace();
}
}
}
The same example, with codegen set up, would look like this:
try {
ChalkClient client = ChalkClient.create();
} catch (ChalkException e) {
e.printStackTrace();
return;
}
var userIds = Arrays.asList(1, 2, 3);
var params = OnlineQueryParams.builder()
.withInput(Features.user.id, userIds)
.withOutputs(Features.user.fullName)
.build();
try (OnlineQueryResult result = client.onlineQuery(params)) {
CardUser[] users = result.unmarshal(CardUser.class);
for (CardUser user : users) {
long userId = user.id.getValue();
double meanSpent = user.spending_mean_30d.getValue();
System.out.println("User " + userId + " spent an average of $" + meanSpent + " per day in the last 30 days");
}
} catch (ChalkException e) {
e.printStackTrace();
}
For the best query experience, we recommend using the codegen feature of Chalk CLI. With codegen, you can generate Java classes that are exactly equivalent to the features that you have defined in Python. For instance, given these Python features and dataclasses:
@features
class Transaction:
id: Primary[str]
user_id: str
amount: float
ts: datetime = feature_time()
tags: List[str]
user: User = has_one(lambda: User.id == Transaction.user_id)
@features
class Account:
id: Primary[str]
balance: float
@features
class User:
id: Primary[str]
account_id: str
account: Account = has_one(lambda: Account.id == User.account_id)
count_payments: Windowed[int] = windowed("1m", "5m")
spending_mean_30d: float
transactions: DataFrame[Transaction] = has_many(lambda: Transaction.user_id == User.id)
@dataclasses.dataclass
class Address:
street: str
city: str
Chalk CLI will generate:
// Transaction.java
public class Transaction extends FeaturesClass {
public Feature<String> id;
public Feature<String> user_id;
public Feature<Double> amount;
public Feature<java.time.LocalDateTime> ts;
public Feature<List<String>> tags;
public CardUser user;
}
// Account.java
public class Account extends FeaturesClass {
public Feature<String> id;
public Feature<Double> balance;
}
// User.java
public class CardUser extends FeaturesClass {
public Feature<String> id;
public Feature<String> name;
public Feature<String> account_id;
public Account account;
public _WindowedFeatures_1m_5m count_payments;
public Feature<Double> spending_mean_30d;
@HasMany(localKey = "id", foreignKey = "user_id")
public Feature<List<Transaction>> transactions;
}
// Features.java. This is the root class that contains all features
// that can be used when specifying inputs.
public class Features {
public static CardUser user;
public static Account account;
public static Transaction transaction;
private static Exception initException = Initializer.initFeatures(Features.class);
public static Exception getInitException() {
return Features.initException;
}
}
// _WindowedFeatures_1m_5m.java
public class _WindowedFeatures_1m_5m extends WindowedFeaturesClass {
public Feature<Integer> bucket_1m;
public Feature<Integer> bucket_5m;
}
// Address.java
public class Address extends StructFeaturesClass {
public Feature<String> street;
public Feature<String> city;
}
With these classes, we can now write type-checked queries.
import com.example.my_project.codegen_output_folder.Features;
var userIds = Arrays.asList("user_1");
var params = OnlineQueryParams.builder()
.withInput(Features.user.id, userIds)
.withOutputs(Features.user.id, Feature.card_user.name) // Scalar features
.withOutputs(Features.user.account.balance) // Has-one feature
.withOutputs(Features.user.transactions) // Has-many feature
.withOutputs(Features.user.count_payments.bucket_5m) // Windowed feature
.withOutputs(Features.user.address) // Struct-like feature
.build();
Make sure to first check whether the feature classes in Features
have been initialized successfully:
if (Features.getInitException() != null) {
Features.getInitException().printStackTrace();
return;
}
We can also unmarshal the result of a query from its Arrow representation into the generated classes.
try (OnlineQueryResult result = client.onlineQuery(params)) {
CardUser[] users = result.unmarshal(CardUser.class);
for (CardUser user : users) {
String name = user.id.name.getValue(); // Scalar features
Double balance = user.account.balance.getValue(); // Has-one feature
List<Transaction> txns = user.transactions.getValue(); // Has-many feature
Integer countPayments = user.count_payments.window_5m.getValue(); // Windowed feature
String street = user.address.street.getValue(); // Struct-like feature
}
} catch (ChalkException e) {
e.printStackTrace();
}
Querying without codegen means accessing the underlying Arrow data structures directly. Here's an example querying
for both a scalar feature (user.spending_mean_30d
) and a has-many feature (user.transactions
):
import ai.chalk.client.ChalkClient;
import ai.chalk.models.OnlineQueryParams;
import ai.chalk.models.OnlineQueryResult;
import org.apache.arrow.vector.table.Table;
import org.apache.arrow.vector.table.Row;
import org.apache.arrow.vector.FieldVector;
import org.apache.arrow.vector.complex.reader.FieldReader;
public class Main {
public static void main(String[] args) {
try {
ChalkClient client = ChalkClient.create();
} catch (ChalkException e) {
e.printStackTrace();
return;
}
var userIds = Arrays.asList(1, 2, 3);
var params = OnlineQueryParams.builder()
.withInput("card_user.id", userIds)
.withOutputs("card_user.spending_mean_30d", "card_user.transactions")
.build();
try (OnlineQueryResult result = client.onlineQuery(params)) {
for (Row row : result.getScalarsTable()) {
long userId = row.getVarCharObj("card_user.id");
double meanSpent = row.getFloat8("card_user.spending_mean_30d");
System.out.println("User " + userId + " spent an average of $" + meanSpent + " per day in the last 30 days");
}
Table txnTable = result.getGroupsTables().get("card_user.transactions");
FieldVector txnAmountVector = txnTable.getColumn("amount");
// Do something with the transaction amount vector
} catch (ChalkException e) {
e.printStackTrace();
}
}
}