Weaviate in a nutshell:
- Weaviate is an open-source vector search engine: a database purpose-built for vector search.
- Weaviate allows you to store JSON documents in a class property-like fashion while attaching machine learning vectors to these documents to represent them in vector space.
- Weaviate can be used stand-alone (i.e., bring your own vectors), or with a wide variety of modules that can do the vectorization for you or extend the core capabilities.
- Weaviate has a GraphQL API to access your data easily.
- We aim to bring your vector search setup to production, where queries run in mere milliseconds (check our open-source benchmarks to see if Weaviate fits your use case).
- Get to know Weaviate with the getting-started guide, which covers the basics in under five minutes.
Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers semantic search, question-answer extraction, classification, customizable models (PyTorch/TensorFlow/Keras), and more. Built from scratch in Go, Weaviate stores both objects and vectors, allowing you to combine vector search with structured filtering and the fault tolerance of a cloud-native database, all accessible through GraphQL, REST, and client libraries for various programming languages.
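To give a feel for what that GraphQL access looks like, here is a sketch of a semantic search query; the `Article` class and its properties are hypothetical placeholders, and the `nearText` operator assumes a text vectorizer module is enabled:

```graphql
{
  Get {
    # "Article" is a hypothetical class; replace with one from your own schema
    Article(
      nearText: { concepts: ["urban housing prices"] }
      limit: 3
    ) {
      title
      summary
      # _additional exposes query metadata such as the match certainty
      _additional { certainty }
    }
  }
}
```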
- Software Engineers (docs), who use Weaviate as an ML-first database for their applications.
- Out-of-the-box modules for: NLP/semantic search, automatic classification and image similarity search.
- Easy to integrate into your current architecture, with full CRUD support like you're used to from other OSS databases.
- Cloud-native, distributed, runs well on Kubernetes and scales with your workloads.
- Data Engineers (docs), who use Weaviate as a vector database built from the ground up with ANN at its core, and with the same UX they love from Lucene-based search engines.
- Weaviate has a modular setup that allows you to use your own ML models inside Weaviate, but you can also use out-of-the-box ML models (e.g., SBERT, ResNet, fastText, etc.).
- Weaviate takes care of the scalability, so that you don't have to.
- Deploy and maintain ML models in production reliably and efficiently.
- Data Scientists (docs), who use Weaviate for a seamless handover of their machine-learning models to MLOps.
- Deploy and maintain your ML models in production reliably and efficiently.
- Weaviate's modular design allows you to easily package any custom trained model you want.
- Smooth and accelerated handover of your Machine Learning models to engineers.
Weaviate GraphQL demo on a news article dataset, covering: the Transformers module, GraphQL usage, semantic search, _additional{} features, Q&A, and the Aggregate{} function. You can try the demo on this dataset in the GUI here: semantic search, Q&A, Aggregate.
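As a hedged sketch, a Q&A query against such a news dataset could look like the one below; it assumes an `Article` class and an enabled Q&A module providing the `ask` argument (exact field names may vary by module and version):

```graphql
{
  Get {
    Article(
      # ask is provided by a Q&A module such as qna-transformers
      ask: { question: "Who is the mayor of Rotterdam?" }
      limit: 1
    ) {
      title
      _additional {
        # the extracted answer span and the model's confidence
        answer { result certainty }
      }
    }
  }
}
```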
Weaviate makes it easy to use state-of-the-art ML models while giving you the scalability, ease of use, safety and cost-effectiveness of a purpose-built vector database. Most notably:
- Fast queries: Weaviate typically performs a 10-NN (nearest-neighbor) search over millions of objects in considerably less than 100 ms.
- Any media type with Weaviate Modules: Use state-of-the-art ML model inference (e.g., Transformers) for text, images, etc. at search and query time to let Weaviate manage vectorizing your data for you - or import your own vectors.
- Combine vector and scalar search: Weaviate allows for efficient combined vector and scalar searches, e.g., "articles related to the COVID-19 pandemic published within the past 7 days". Weaviate stores both your objects and their vectors and makes sure that retrieving both is always efficient. There is no need for third-party object storage.
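A sketch of such a combined query, assuming a hypothetical `Article` class with a `publicationDate` property and a text vectorizer module enabled for `nearText`:

```graphql
{
  Get {
    Article(
      # vector part: semantic similarity to a concept
      nearText: { concepts: ["COVID-19 pandemic"] }
      # scalar part: structured filter on a date property
      where: {
        path: ["publicationDate"]
        operator: GreaterThan
        valueDate: "2021-05-20T00:00:00Z"
      }
      limit: 5
    ) {
      title
      publicationDate
    }
  }
}
```

Both conditions are evaluated in a single request, so there is no need to post-filter vector search results in your application.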
- Real-time and persistent: Weaviate lets you search through your data even while it is being imported or updated. In addition, every write goes to a Write-Ahead Log (WAL), so writes are persisted immediately - even if a crash occurs.
- Horizontal scalability: Scale Weaviate for your exact needs, e.g., maximum ingestion rate, largest possible dataset size, maximum queries per second, etc.
- High availability: On our roadmap; planned for release later this year.
- Cost-effectiveness: Very large datasets do not need to be kept entirely in memory in Weaviate. At the same time, available memory can be used to speed up queries. This allows for a conscious speed/cost trade-off to suit every use case.
- Graph-like connections between objects: Make arbitrary connections between your objects, in a graph-like fashion, to resemble real-life connections between your data points. Traverse those connections using GraphQL.
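For example, assuming a schema in which a hypothetical `Article` class holds a cross-reference property `inPublication` pointing to a `Publication` class (names are illustrative), such a traversal could be sketched as:

```graphql
{
  Get {
    Article {
      title
      # follow the cross-reference from Article to Publication
      inPublication {
        ... on Publication {
          name
        }
      }
    }
  }
}
```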
You can find detailed documentation in the developers section of our website, or go directly to one of the docs using the links in the list below.
- Weaviate is an open-source search engine powered by ML, vectors, graphs, and GraphQL (ZDNet)
- Weaviate, an ANN Database with CRUD support (DB-Engines.com)
- A sub-50ms neural search with DistilBERT and Weaviate (Towards Data Science)
- Getting Started with Weaviate Python Library (Towards Data Science)
You can find code examples here.