Pinned Repositories
automl-docker
CLI-based tool to automatically build ML models from training data into a servable Docker container
bricks
Open-source natural language enrichments at your fingertips.
embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
refinery-neural-search
Neural search for refinery. Manages similarity search powered by Qdrant and outlier detection, both based on vector representations of the project records.
refinery-python-sdk
Official Python SDK for Kern AI refinery.
refinery-sample-projects
Contains example projects you can use to test refinery. Select a use case from the branches.
sequence-learn
With sequence-learn, you can build models for named entity recognition as quickly as if you were building a sklearn classifier.
twitter-issues-classifier
Since the Twitter algorithm was open-sourced, the issues section of its repository has been polluted. Let's try to fix that.
weak-nlp
With weak-nlp, you can integrate heuristics like labeling functions and active learners based on weak supervision. Automate data labeling and improve label quality.
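As a rough illustration of the weak supervision idea behind weak-nlp, the sketch below combines a few labeling functions by majority vote. It does not use the weak-nlp API; the function names, labels, and the tie-breaking rule are all hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical labeling functions (not part of weak-nlp):
# each returns a label, or None to abstain.
def lf_contains_refund(text: str) -> Optional[str]:
    return "complaint" if "refund" in text.lower() else None

def lf_contains_thanks(text: str) -> Optional[str]:
    return "praise" if "thank" in text.lower() else None

def lf_many_exclamations(text: str) -> Optional[str]:
    return "complaint" if text.count("!") >= 2 else None

LABELING_FUNCTIONS: List[Callable[[str], Optional[str]]] = [
    lf_contains_refund,
    lf_contains_thanks,
    lf_many_exclamations,
]

def weak_label(
    text: str,
    lfs: List[Callable[[str], Optional[str]]] = LABELING_FUNCTIONS,
) -> Optional[str]:
    """Combine labeling-function votes by simple majority.

    Returns None if every function abstains or if the top vote is tied.
    """
    votes = [label for lf in lfs if (label := lf(text)) is not None]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    # Assumption for this sketch: abstain on ties instead of picking arbitrarily.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

if __name__ == "__main__":
    for sample in ["I want a refund!!", "Thank you so much", "Hello"]:
        print(sample, "->", weak_label(sample))
```

Real weak supervision approaches typically also weight each heuristic by its estimated quality; plain majority voting is only the simplest baseline.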
Kern AI's Repositories
code-kern-ai/refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
code-kern-ai/bricks
Open-source natural language enrichments at your fingertips.
code-kern-ai/embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
code-kern-ai/refinery-neural-search
Neural search for refinery. Manages similarity search powered by Qdrant and outlier detection, both based on vector representations of the project records.
code-kern-ai/refinery-submodule-model
Data model for refinery. Manages entities and their access for multiple services, e.g. the gateway.
code-kern-ai/refinery-embedder
Embedder for refinery. Manages the creation of document- and token-level embeddings using the embedders library.
code-kern-ai/refinery-entry
Login and registration screen for refinery. Implemented via Ory Kratos.
code-kern-ai/refinery-gateway
Gateway for refinery. Manages incoming requests and holds the workflow logic. To interact with the gateway, the UI or Python SDK can be used.
code-kern-ai/refinery-submodule-parent-images
Submodule which contains the requirements of the different parent images of refinery.
code-kern-ai/refinery-submodule-s3
S3-related AWS and MinIO logic.
code-kern-ai/refinery-tokenizer
Tokenizer for refinery. Manages the creation and storage of spaCy tokens for text-based record attributes and supports multiple language models. It is used by the gateway.
code-kern-ai/refinery-ui
code-kern-ai/cicd-deployment-scripts
Scripts used for Kern AI CI/CD efforts.
code-kern-ai/cognition-pdf2md
A PDF-to-Markdown converter.
code-kern-ai/refinery-ac-exec-env
Execution environment for attribute calculation in refinery. Containerized function as a service to build custom attributes derived from the original data.
code-kern-ai/refinery-authorizer
Evaluates whether a user has access to certain resources.
code-kern-ai/refinery-common-parent-image
Defines the parent image for the Docker images of the refinery services that require the integration of the model and S3 submodules.
code-kern-ai/refinery-exec-env-parent-image
Defines the parent image for the Docker images of the refinery services that provide an execution environment.
code-kern-ai/refinery-gateway-proxy
Gateway proxy for refinery. Manages incoming requests and forwards them to the gateway. Used by the Python SDK.
code-kern-ai/refinery-lf-exec-env
Execution environment for labeling functions in refinery. Containerized function as a service to execute user-defined Python scripts.
code-kern-ai/refinery-mini-parent-image
Defines the parent image for the Docker images of the refinery services with the smallest set of requirements.
code-kern-ai/refinery-ml-exec-env
Execution environment for the active learning module in refinery. Containerized function as a service to build active learning models using scikit-learn and sequence-learn.
code-kern-ai/refinery-next-parent-image
code-kern-ai/refinery-torch-cpu-parent-image
Defines the parent image for the Docker images of the refinery services that require torch (CPU).
code-kern-ai/refinery-torch-cuda-parent-image
Defines the parent image for the Docker images of the refinery services that require torch (GPU).
code-kern-ai/refinery-updater
Updater for refinery. Manages migration logic to new versions if required.
code-kern-ai/refinery-weak-supervisor
Weak supervision for refinery. Manages the integration of heuristics such as labeling functions, active learners or zero-shot classifiers. Uses the weak-nlp library for the actual integration logic and algorithms.
code-kern-ai/submodule-javascript-functions
code-kern-ai/submodule-react-components
code-kern-ai/submodule-tailwind-config