Pinned Repositories
cdh-twitter-example
Example application for analyzing Twitter data using CDH - Flume, Oozie, Hive
cloudera-playbook
Cloudera deployment automation with Ansible
cm_api
Cloudera Manager API Client
cm_ext
Cloudera Manager Extensibility Tools and Documentation.
cod-examples
flink-tutorials
flume
WE HAVE MOVED to the Apache Incubator (https://cwiki.apache.org/FLUME/). Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault-tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.
hue
Open source SQL Query Assistant service for Databases/Warehouses
impyla
Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
Cloudera's Repositories
cloudera/impyla
Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
cloudera/cm_csds
A collection of Custom Service Descriptors
cloudera/CML_AMP_Anomaly_Detection
Apply modern, deep learning techniques for anomaly detection to identify network intrusions.
cloudera/native-toolchain
cloudera/community-ml-runtimes
cloudera/CML_AMP_Few-Shot_Text_Classification
Perform topic classification on news articles in several limited-labeled data regimes.
cloudera/ml-runtimes
cloudera/cloudera-airflow-plugins
cloudera/CML_AMP_Image_Analysis
Build a semantic search application with deep learning models.
cloudera/cmlextensions
Added functionality to the cml Python package
cloudera/CML_AMP_LLM_Fine_Tuning_Studio
cloudera/CML_AMP_Structural_Time_Series
Applying a structural time series approach to California hourly electricity demand data.
cloudera/flink-basic-auth-handler
cloudera/CML_AMP_Knowledge_Graph_Backed_RAG
cloudera/cmlutils
cloudera/CML_AMP_Video_Classification
Demonstration of how to perform video classification using pre-trained TensorFlow models.
cloudera/CML_AMP_Summarize
Automatic text summarization with extractive and abstractive models.
cloudera/CML_AMP_Image-Analysis-with-Anthropic-Claude
This AMP enables transcription and information extraction from images using Anthropic Claude models, covering use cases like text extraction, document QA, and converting unstructured content into structured formats like JSON.
cloudera/CML_AMP_PromptBrew
Create, iterate, refine and test your LLM prompts with AI assistance.
cloudera/CML_AMP_Active_Learning
An interactive, visual workflow of active learning using the MNIST dataset.
cloudera/CML_AMP_AI_Text_Summarization_with_Amazon_Bedrock
cloudera/CML_AMP_Tensorboard_on_CML
Demonstration of how to use TensorBoard as a CML Application.
cloudera/Community-Applied-ML-Prototypes
cloudera/HuggingFace-Spaces
cloudera/LlamaIndex_IN_CML_AMP
cloudera/spiderman
A comprehensive, high quality, human-annotated plain-text dataset for SQL AI tasks across diverse domains and complexity levels.
cloudera/CML_AMP_Summarization_with_Vertex_AI_Gemini
This AMP allows users to summarize documents and text using Google's Gemini models from the Vertex AI Model Garden. It provides two summarization modes: text-based and document-based, with document summarization supported through LlamaIndex as the vector store.
cloudera/CML_GOV_AMP_Intelligent_Writing_Assistance
cloudera/DiM
cloudera/GovCloud-Applied-ML-Prototypes