/ai_engineering


Please note: The code in these repos is sourced from the DataRobot user community and is not owned or maintained by DataRobot, Inc. You may need to make edits or updates for this code to function properly in your environment.

AI ENGINEERING

This is a repository for the AI Engineering team to share code with customers and the community. The content ranges from simple API snippets to demonstrations and end-to-end tutorials with associated code.

Additional code examples can be accessed from the following locations:

Important links

Usage

This repo hosts multiple integrations and projects. See the individual READMEs in each subfolder for details relevant to that entry.

Related documentation

These are articles created in the DataRobot community by the AI Engineering group. They reference both inline code in the articles and the contents of this repository.

Database integration examples

DataRobot data connections - To enable integration with a variety of enterprise databases, DataRobot provides a “self-service” JDBC platform for database connectivity setup. Once configured, you can read data from production databases for model building and predictions.
Create a Feature Discovery project - Feature Discovery is based on relationships between datasets and the features within those datasets.
Prediction intake options - You can configure a prediction source using the Predictions > Job Definitions tab or the Batch Prediction API.
Batch Prediction API: Snowflake scoring - Using JDBC to transfer data can be costly in terms of IOPS (input/output operations per second) and expense for data warehouses. This adapter reduces the load on database engines during prediction scoring by using cloud storage and bulk insert to create a hybrid JDBC-cloud storage solution.

Data processing

DataRobot Pipelines v7.3.0+ - DataRobot Pipelines enable data science and engineering teams to build and run machine learning data flows. Teams start by collecting data from various sources, cleaning them, and combining them.

MLOps custom model hosting on DataRobot

Workshop: Create, test, and deploy a custom model - Custom inference models allow you to bring your own pretrained models to DataRobot. By uploading a model artifact, you can create, test, and deploy custom inference models to a centralized deployment hub. DataRobot supports models built with a variety of coding languages, including Python, R, and Java.

MLOps overview, automatic retraining, accuracy monitoring, MLOps agent, agent use cases

MLOps overview - DataRobot MLOps provides a central hub to deploy, monitor, manage, and govern all your models in production, regardless of how they were created or when and where they were deployed.
Continuous AI: Set up automatic retraining - To maintain model performance after deployment without extensive manual work, DataRobot provides an automatic retraining capability for deployments.
Enable accuracy monitoring - You can monitor a deployment for accuracy using the Accuracy tab, which lets you analyze the performance of the model deployment over time, using standard statistical measures and exportable visualizations.
MLOps agent - A powerful tool for tracking and managing models for prediction.
MLOps agent use cases - Monitoring use cases that show how to apply the MLOps agent.

Portable Prediction Server (PPS) - DataRobot models in Docker images deployed as containers within Kubernetes

Portable Prediction Server - The Portable Prediction Server (PPS) is a DataRobot execution environment for DataRobot model packages (.mlpkg files) distributed as a self-contained Docker image.
Portable batch predictions - Portable batch predictions (PBP) let you score large amounts of data on disconnected environments.
Custom model Portable Prediction Server - The custom model Portable Prediction Server (PPS) is a solution for deploying a custom model to an external prediction environment. It can be built and run disconnected from main installation environments.

Platform administration

Administrator's guide - The DataRobot Administrator's Guide is intended to help administrators manage their DataRobot application.

DataRobot API

API Quickstart (documentation) - The DataRobot API provides a programmatic alternative to the web interface for creating and managing DataRobot projects. The API can be used via REST or with DataRobot's Python or R clients in Windows, UNIX, and OS X environments.
DataRobot University: Python API Starter Quest (Free) - Provides the foundation skills for using Python to work with the DataRobot API. The courses are self-paced.
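
As a rough illustration of the REST option mentioned above, the following minimal Python sketch builds an authenticated request using only the standard library. The endpoint URL, the `projects/` path, and the helper name `build_request` are assumptions made for illustration and are not taken from this repository; check the API Quickstart for the values that apply to your installation.

```python
import urllib.request

# Assumption: this is the managed-cloud endpoint; self-managed
# installations would use their own host name instead.
DEFAULT_ENDPOINT = "https://app.datarobot.com/api/v2"


def build_request(token, path="projects/", endpoint=DEFAULT_ENDPOINT):
    """Build (but do not send) an authenticated GET request.

    `build_request` is a hypothetical helper, not part of any DataRobot client.
    """
    # Join endpoint and path with exactly one slash between them.
    url = endpoint.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        # Assumption: the API token is passed as a bearer token.
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )


if __name__ == "__main__":
    req = build_request("YOUR_API_TOKEN")
    # Show what would be sent; no network call is made here.
    print(req.get_method(), req.full_url)
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the JSON response body. In practice the Python or R client is usually the more convenient route, since it handles authentication and pagination for you.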

Batch scoring

Scoring at the command line - Syntax for scoring at the command line.

Solutions and applications

AI App Builder - The AI App Builder allows you to build and configure AI-powered applications using a no-code interface to enable core DataRobot services without having to build models and evaluate their performance in DataRobot.
DataRobot Zepl Notebooks - Apply and share notebook-powered analytics across the enterprise.
Algorithmia - DataRobot Algorithmia is an MLOps platform where you can deploy, govern, and monitor your models as microservices. The platform lets you connect models to data sources and deploy them quickly to production.

Exported code deployment examples

DataRobot Prime - DataRobot Prime optimizes prediction models for use outside of the DataRobot application, which can provide multiple benefits. Once created, you can export these models as a Python module or a Java class, and run the exported script.
DataRobot Prime examples - You can generate source code for the model as a Python module or Java class.
Scoring Code JAR integrations - Although DataRobot provides its own scalable prediction servers that are fully integrated with other platforms, there are several reasons you may decide to deploy Scoring Code on another platform.

Development and contribution

If you'd like to report an issue or bug, suggest improvements, or contribute code to this project, please refer to CONTRIBUTING.md.

Code of conduct

This project has adopted the Contributor Covenant for its Code of Conduct. See CODE_OF_CONDUCT.md to read it in full.

License

Licensed under the Apache License 2.0. See LICENSE to read it in full.