[BUG] OpenAI.Evals APIs are not documented
Closed this issue · 2 comments
Describe the bug
I'm excited to see that there's an OpenAI.Evals namespace now and an EvaluationClient type! However, I'm not able to figure out how to use it - doesn't look like there's documentation, and the APIs are using generic System.ClientModel types. Should these be usable yet, or just an exciting tease of more to come?
Steps to reproduce
- Create a new console app, and add the OpenAI NuGet package
- Use the namespace 'OpenAI.Evals'
- Create an evaluation client and try to use it - unfortunately the types aren't self documenting:
EvaluationClient client = new(apiKey);
client.CreateEvaluation(
The IntelliSense for that method shows parameters of type System.ClientModel.BinaryContent and System.ClientModel.Primitives.RequestOptions. There is no documentation online describing how to use it.
Code snippets
OS
macOS
.NET version
9.0.300
Library version
2.2.0
Hi @jmatthiesen. Thanks for reaching out and we regret that you're experiencing difficulties. The EvaluationClient is currently at the "protocol form" stage in the OpenAI feature lifecycle. At this stage, the client exposes only a basic low-level API that lets you use the library's infrastructure to call the OpenAI endpoint, but requires that you form the request payload and parse the response yourself. Looking forward, the Evals API will grow into an experimental .NET-focused shape and eventually transition to a stable form.
The API Reference from the OpenAI platform docs is a great reference for doing so, as it gives you detailed information about the data flowing in both directions. I generally find that using anonymous objects to form the input is pretty straightforward, but parsing the response is more difficult. Feeding the example response from the docs to an LLM and asking it to generate classes for deserialization is quite helpful for that.
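To illustrate the shape of that workflow without calling the service, here is a minimal sketch that uses only System.Text.Json. It serializes an anonymous object into a request payload and reads a few fields from a response with JsonDocument, which can be enough when you only need a value or two and don't want to define classes. The response JSON is hand-written and abbreviated for illustration (the id and timestamp are made up), so treat the field names as assumptions to check against the API Reference.

```csharp
using System;
using System.Text.Json;

// Form the request payload from an anonymous object; property names are
// written in snake_case so that they serialize exactly as written.
string requestJson = JsonSerializer.Serialize(new
{
    name = "Sentiment",
    data_source_config = new { type = "stored_completions" }
});
Console.WriteLine($"Request payload: {requestJson}");

// Hypothetical, abbreviated response body. This is NOT captured from the
// service; the values are placeholders for illustration only.
string responseJson = """
{
  "id": "eval_hypothetical123",
  "object": "eval",
  "created_at": 1700000000,
  "name": "Sentiment"
}
""";

// Read individual fields without defining any deserialization classes.
using JsonDocument doc = JsonDocument.Parse(responseJson);
JsonElement root = doc.RootElement;

string? id = root.GetProperty("id").GetString();
long createdAt = root.GetProperty("created_at").GetInt64();

// TryGetProperty avoids an exception when an optional field is absent.
bool hasMetadata = root.TryGetProperty("metadata", out _);

Console.WriteLine($"Created {id} at {createdAt} (metadata present: {hasMetadata})");
```

For anything beyond a handful of fields, typed classes are usually easier to maintain than walking the document by hand.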
Applying that to Evals, a translation of the create operation example looks something like:
Create the Eval
// Namespaces used by this snippet and by the response classes that follow it.
using System.ClientModel;
using System.Text.Json;
using System.Text.Json.Serialization;
using OpenAI.Evals;

var client = new EvaluationClient("<< YOUR API KEY >>");

// The protocol method takes the raw JSON payload, built here from an anonymous object.
var response = await client.CreateEvaluationAsync(
    BinaryContent.Create(BinaryData.FromObjectAsJson(new
    {
        name = "Sentiment",
        data_source_config = new
        {
            type = "stored_completions",
            metadata = new
            {
                usecase = "chatbot"
            }
        },
        testing_criteria = new[]
        {
            new
            {
                type = "label_model",
                model = "o3-mini",
                input = new[]
                {
                    new
                    {
                        role = "developer",
                        content = "Classify the sentiment of the following statement as one of 'positive', 'neutral', or 'negative'"
                    },
                    new
                    {
                        role = "user",
                        content = "Statement: {{item.input}}"
                    }
                },
                passing_labels = new[] { "positive" },
                labels = new[]
                {
                    "positive",
                    "neutral",
                    "negative"
                },
                name = "Example label grader"
            }
        }
    })));

// Parse the raw response body into the classes defined below.
var evalResult = JsonSerializer.Deserialize<Eval>(response.GetRawResponse().Content, new JsonSerializerOptions
{
    PropertyNameCaseInsensitive = true
});
Console.WriteLine($"Created {evalResult.Id}");
Objects for response deserialization
public class Eval
{
    [JsonPropertyName("id")]
    public string Id { get; set; }
    [JsonPropertyName("object")]
    public string ObjectType { get; set; }
    [JsonPropertyName("created_at")]
    public long CreatedAt { get; set; }
    [JsonPropertyName("data_source_config")]
    public DataSourceConfig DataSourceConfig { get; set; }
    [JsonPropertyName("name")]
    public string Name { get; set; }
    [JsonPropertyName("testing_criteria")]
    public List<TestingCriteria> TestingCriteria { get; set; }
    [JsonPropertyName("metadata")]
    public Dictionary<string, object> Metadata { get; set; }
}

public class DataSourceConfig
{
    [JsonPropertyName("type")]
    public string Type { get; set; }
    [JsonPropertyName("max_items")]
    public object MaxItems { get; set; }
    [JsonPropertyName("schema")]
    public Schema Schema { get; set; }
    [JsonPropertyName("metadata")]
    public DataSourceMetadata Metadata { get; set; }
}

public class DataSourceMetadata
{
    [JsonPropertyName("usecase")]
    public string UseCase { get; set; }
}

public class Schema
{
    [JsonPropertyName("type")]
    public string Type { get; set; }
    [JsonPropertyName("properties")]
    public Dictionary<string, object> Properties { get; set; }
    [JsonPropertyName("required")]
    public List<string> Required { get; set; }
}

public class TestingCriteria
{
    [JsonPropertyName("id")]
    public string Id { get; set; }
    [JsonPropertyName("type")]
    public string Type { get; set; }
    [JsonPropertyName("input")]
    public List<Message> Input { get; set; }
    [JsonPropertyName("labels")]
    public List<string> Labels { get; set; }
    [JsonPropertyName("model")]
    public string Model { get; set; }
    [JsonPropertyName("name")]
    public string Name { get; set; }
    [JsonPropertyName("passing_labels")]
    public List<string> PassingLabels { get; set; }
    [JsonPropertyName("sampling_params")]
    public object SamplingParams { get; set; }
}

public class Message
{
    [JsonPropertyName("type")]
    public string Type { get; set; }
    [JsonPropertyName("role")]
    public string Role { get; set; }
    [JsonPropertyName("content")]
    public string Content { get; set; }
}
Thanks for the response @jsquire - super helpful and I appreciate all the details. This gives me enough to be able to play around for now. Feel free to close this out unless you need a tracking issue for the future work to build out the API.