Introduction of HugeGraph
Please visit the contribution doc to get start, include theme/website description & settings~
Summary
HugeGraph is an easy-to-use, efficient, general-purpose open source graph database system(Graph Database, GitHub project address), implemented the Apache TinkerPop3 framework and is fully compatible with the Gremlin query language. With complete toolchain components, it helps users to easily build applications and products based on graph databases. HugeGraph supports fast import of more than 10 billion vertices and edges, and provides millisecond-level relational query capability (OLTP). It supports large-scale distributed graph processing (OLAP).
Typical application scenarios of HugeGraph include deep relationship exploration, association analysis, path search, feature extraction, data clustering, community detection, knowledge graph, etc., and are applicable to business fields such as network security, telecommunication fraud, financial risk control, advertising recommendation, social network and intelligence Robots etc.
Features
HugeGraph supports graph operations in online and offline environments, supports batch import of data, supports efficient complex relationship analysis, and can be seamlessly integrated with big data platforms. HugeGraph supports multi-user parallel operations. Users can enter Gremlin query statements and get graph query results in time. They can also call HugeGraph API in user programs for graph analysis or query.
This system has the following features:
- Ease of use: HugeGraph supports Gremlin graph query language and RESTful API, provides common interfaces for graph retrieval, and has peripheral tools with complete functions to easily implement various graph-based query and analysis operations.
- Efficiency: HugeGraph has been deeply optimized in graph storage and graph computing, and provides a variety of batch import tools, which can easily complete the rapid import of tens of billions of data, and achieve millisecond-level response for graph retrieval through optimized queries. Supports simultaneous online real-time operations of thousands of users.
- Universal: HugeGraph supports the Apache Gremlin standard graph query language and the Property Graph standard graph modeling method, and supports graph-based OLTP and OLAP schemes. Integrate Apache Hadoop and Apache Spark big data platform.
- Scalable: supports distributed storage, multiple copies of data and horizontal expansion, built-in multiple back-end storage engines, and can easily expand the back-end storage engine through plug-ins.
- Open: HugeGraph code is open source (Apache 2 License), customers can modify and customize independently, and selectively give back to the open source community.
The functions of this system include but are not limited to:
- Supports batch import of data from multiple data sources (including local files, HDFS files, MySQL databases and other data sources), and supports import of multiple file formats (including TXT, CSV, JSON and other formats)
- With a visual operation interface, it can be used for operation, analysis and display diagrams, reducing the threshold for users to use
- Optimized graph interface: shortest path (Shortest Path), K-step connected subgraph (K-neighbor), K-step to reach the adjacent point (K-out), personalized recommendation algorithm PersonalRank, etc.
- Implemented based on Apache TinkerPop3 framework, supports Gremlin graph query language
- Support attribute graph, attributes can be added to vertices and edges, and support rich attribute types
- Has independent schema metadata information, has powerful graph modeling capabilities, and facilitates third-party system integration
- Support multi-vertex ID strategy: support primary key ID, support automatic ID generation, support user-defined string ID, support user-defined digital ID
- The attributes of edges and vertices can be indexed to support precise query, range query, and full-text search
- The storage system adopts plug-in mode, supporting RocksDB, Cassandra, ScyllaDB, HBase, MySQL, PostgreSQL, Palo, and InMemory, etc.
- Integrate with big data systems such as Hadoop and Spark GraphX, and support Bulk Load operations
- Support high availability HA, multiple copies of data, backup recovery, monitoring, etc.
Modules
- HugeGraph-Server: HugeGraph-Server is the core part of the HugeGraph project, including sub-modules such as Core, Backend, and API;
- Core: Graph engine implementation, connecting the Backend module downward and supporting the API module upward;
- Backend: Realize the storage of graph data to the backend. The supported backends include: Memory, Cassandra, ScyllaDB, RocksDB, HBase, MySQL and PostgreSQL. Users can choose one according to the actual situation;
- API: Built-in REST Server, provides RESTful API to users, and is fully compatible with Gremlin query.
- HugeGraph-Client: HugeGraph-Client provides a RESTful API client for connecting to HugeGraph-Server. Currently, only Java version is implemented. Users of other languages can implement it by themselves;
- HugeGraph-Loader: HugeGraph-Loader is a data import tool based on HugeGraph-Client, which converts ordinary text data into graph vertices and edges and inserts them into graph database;
- HugeGraph-Computer: HugeGraph-Computer is a distributed graph processing system for HugeGraph (OLAP). It is an implementation of Pregel. It runs on Kubernetes framework;
- HugeGraph-Hubble: HugeGraph-Hubble is HugeGraph's web visualization management platform, a one-stop visual analysis platform. The platform covers the whole process from data modeling, to rapid data import, to online and offline analysis of data, and unified management of graphs;
- HugeGraph-Tools: HugeGraph-Tools is HugeGraph's deployment and management tools, including functions such as managing graphs, backup/restore, Gremlin execution, etc.
Subscribe the mailing list
HugeGraph offers an email list for development and user discussions.
- hugegraph-dev: dev@hugegraph.apache.org for both development and users discussions.
Subscribe to the mailing list by following steps:
- Email dev-subscribe@hugegraph.apache.org through your email account, and then you will receive a confirmation email.
- Reply to the confirmation email to confirm your subscription. Then, you will receive another confirmation email.
- Now you are a subscriber of the mailing list. If you have more questions, just email the mailing list and someone will reply to you soon.
- If you want to unsubscribe from the mailing list, just email dev-unsubscribe@hugegraph.apache.org and follow the steps in the confirmation email.
You can subscribe to the mailing list anytime you want. Additionally, you can check historical emails / all emails easily (even if you are not subscribing to the list).
Some notes:
- If you don't receive the confirmation email, please send it after 24 hours later.
- Don't email to dev until you subscribe to the mailing list successfully (otherwise the mail will be banned).
More information on mailing subscribe can be found at:
Contact Us
- GitHub Issues: Feedback on usage issues and functional requirements (quick response)
- Feedback Email: dev@hugegraph.apache.org (subscriber only)
- WeChat public account: Apache HugeGraph, welcome to scan this QR code to follow us.