cncf/sandbox

[Sandbox] openGemini


Application contact emails

xiangyu9@huawei.com
xuran215@huawei.com

Project Summary

The open-source, cloud-native, distributed time series database

Project Description

What Is Time Series Data?

Time series data, also called temporal data, is a sequence of data points collected at intervals over time, which lets us track how something changes. The intervals can range from milliseconds to days or even years. For example, industrial sensor readings and microservice logs, traces, and metrics are all time series data. openGemini is an open-source time series database that focuses on storing and analyzing this kind of data.

What are the advantages of openGemini?

Fields such as the Internet of Things (IoT) and cloud computing produce large amounts of time series data. Data can be written at gigabytes per second, which corresponds to more than 10 million metrics per second. Queries are generally expected to return with millisecond-level latency, a requirement that most open-source time series databases cannot meet. openGemini focuses on the storage and analysis of massive time series data.
Advantages:

  1. Open source
  2. Distributed architecture
  3. Compared with other open-source time series databases, openGemini provides higher write and query performance, better memory resource control, and a higher data compression rate.

Org repo URL (provide if all repos under the org are in scope of the application)

https://github.com/openGemini

Project repo URL in scope of application

https://github.com/openGemini/openGemini

Additional repos in scope of the application

https://github.com/openGemini/openGemini-operator
https://github.com/openGemini/openGemini.github.io
https://github.com/openGemini/website
https://github.com/openGemini/opengemini-client-go
https://github.com/openGemini/data-migration-tools
https://github.com/openGemini/grafana-opengemini-datasource
https://github.com/openGemini/openGemini-dashboard
https://github.com/openGemini/gemix
https://github.com/openGemini/opengemini-client-java

Website URL

https://opengemini.org

Roadmap

https://github.com/openGemini/openGemini/blob/main/ROADMAP.md

Contributing Guide

https://github.com/openGemini/openGemini/blob/main/CONTRIBUTION.md

Code of Conduct (CoC)

https://github.com/openGemini/openGemini/blob/main/CODE_OF_CONDUCT.md

Adopters

https://github.com/openGemini/openGemini/blob/main/ADOPTERS.md

Contributing or Sponsoring Org

Huawei

Maintainers file

https://github.com/openGemini/openGemini/blob/main/MAINTAINERS.md

IP Policy

  • If the project is accepted, I agree the project will follow the CNCF IP Policy

Trademark and accounts

  • If the project is accepted, I agree to donate all project trademarks and accounts to the CNCF

Why CNCF?

CNCF's founding and philosophy are closely tied to the open-source spirit: it is dedicated to building and promoting open source cloud native technologies and ecosystems. Before choosing CNCF, we noted the following:

  1. Prometheus is designed primarily as a monitoring system. Its built-in time series database cannot store large amounts of time series data, so it generally relies on a third-party time series database for storage.
  2. KubeEdge plans to store edge device data inside the platform, so it has a long-term need for a time series database.
  3. OpenTelemetry is setting the standard for observability. However, OpenTelemetry does not have a unified storage backend. Metrics, logs, and traces are stored in different systems, so correlation analysis across these data types cannot be done inside the database.

These projects need time series databases, and we believe that adding openGemini can fill this gap in the ecosystem. As a time series database, openGemini will provide the functionality, performance, and scalability to better meet the needs of these projects. openGemini uses an MPP (massively parallel processing) distributed architecture and delivers outstanding write and query performance in DevOps and IoT scenarios. Compared with similar open-source time series databases, such as InfluxDB, IoTDB, and OpenTSDB, openGemini has better performance and lower data storage costs. Compared with traditional relational databases, its data storage cost is only 1/20, and in massive time series data scenarios its write and query performance is more than 10 times higher. As a result, it is used by more than 20 community users in areas such as power, IoT, industrial manufacturing, observability, and O&M monitoring.

Our goal is to promote technological innovation in cloud-native open-source time series databases, reduce storage costs of massive time series data, simplify system architecture, improve storage and analysis efficiency of time series data, and strengthen integration with other cloud-native projects to make time-series data storage and analysis more convenient.

We hope to join the CNCF community and become a part of the global cloud-native developer community. Given CNCF's broad user base, using CNCF's platform will enable openGemini to benefit more and more organizations and companies.

Benefit to the Landscape

According to DB-Engines data, time series databases have become the fastest-growing database type in the past few years. Due to the rapid development of 5G, Internet of Things (IoT), and cloud native technologies, a large number of time series data storage requirements have arisen.

The benefits could be:

  1. The participation of openGemini enriches the database types in the CNCF landscape, attracting a broader community of developers and users.
  2. openGemini promotes closer integration of time series databases with CNCF projects such as Kubernetes, Prometheus, KubeEdge, and OpenTelemetry, and promotes further development in multiple fields.
  3. openGemini has excellent read and write performance, which can meet the requirements of many application scenarios. It also handles the high-cardinality problem of time series data well.

Cloud Native 'Fit'

Landscape: Databases
As a time series database, openGemini stores time series data and provides cloud-native features such as high performance, high reliability, scalability, and observability, so it fits in "Databases".

TAGs: TAG Storage
The participation of openGemini in TAG Storage will prompt discussions about integration with K8s, KubeEdge, Prometheus, and OpenTelemetry. Drawing on our experience with time series databases, we will also share and discuss their basic characteristics in terms of availability, scalability, performance, durability, consistency, ease of use, cost, and operational complexity.

Cloud Native 'Integration'

N/A

Cloud Native Overlap

N/A

Similar projects

InfluxDB
Apache IoTDB
TimescaleDB
OpenTSDB

Landscape

Yes

Business Product or Service to Project separation

N/A

Project presentations

N/A

Project champions

N/A

Additional information

No response

Thanks @xiangyu5632 and Ran Xu for presenting openGemini at TAG Storage meeting today! Here's the meeting recording: https://www.youtube.com/watch?v=7MB170knbqs. Here are the slides: https://drive.google.com/file/d/1KqKCdrD0P9BlugUONnj9dZOiHw_AQoxr/view?usp=sharing
cc @chira001 @Raffaele Spazzoli

Yeah, I'm very glad to have this opportunity to introduce our project to everyone.

TAG-CS review, this project has:

  • a fairly solid Contribution document
  • no documented governance
  • 5 documented maintainers from Huawei

Hi @jberkus,
We have a governance document in openGemini's community repository, and we have updated it based on the CNCF governance-maintainer.md template.
You can see it here: https://github.com/openGemini/community/blob/main/GOVERNANCE.md

Update:

  • now has written governance based on CNCF's Maintainer Council Template

Follow-up from today's sandbox review: openGemini will be moved to a vote 👍
Just an FYI though: there may be a follow-up regarding the project's name and trademark concerns.
/vote

Vote created

@mrbobbytables has called for a vote on [Sandbox] openGemini (#82).

The members of the following teams have binding votes:

Team
@cncf/cncf-toc

Non-binding votes are also appreciated as a sign of support!

How to vote

You can cast your vote by reacting to this comment. The following reactions are supported:

In favor | Against | Abstain
👍 | 👎 | 👀

Please note that voting for multiple options is not allowed and those votes won't be counted.

The vote will be open for 2months 30days 2h 52m 48s. It will pass if at least 66% of the users with binding votes vote In favor 👍. Once it's closed, results will be published here as a new comment.

I will be abstaining due to a conflict of interest.

/check-vote

Vote status

So far 18.18% of the users with binding vote are in favor (passing threshold: 66%).

Summary

In favor | Against | Abstain | Not voted
2        | 0       | 0       | 9

Binding votes (2)

User | Vote | Timestamp
rochaporto | In favor | 2024-06-12 9:11:47.0 +00:00:00
TheFoxAtWork | In favor | 2024-06-12 21:00:46.0 +00:00:00
@dims | Pending |
@angellk | Pending |
@mauilion | Pending |
@linsun | Pending |
@dzolotusky | Pending |
@kevin-wangzefeng | Pending |
@cathyhongzhang | Pending |
@nikhita | Pending |
@kgamanji | Pending |

Non-binding votes (3)

User | Vote | Timestamp
huang-feiteng | In favor | 2024-06-12 2:53:32.0 +00:00:00
pacoxu | In favor | 2024-06-12 9:36:03.0 +00:00:00
chira001 | In favor | 2024-06-12 14:25:09.0 +00:00:00

/check-vote

Vote status

So far 90.91% of the users with binding vote are in favor (passing threshold: 66%).

Summary

In favor | Against | Abstain | Not voted
10       | 0       | 0       | 1

Binding votes (10)

User | Vote | Timestamp
kevin-wangzefeng | In favor | 2024-06-18 4:10:45.0 +00:00:00
dims | In favor | 2024-06-18 14:04:42.0 +00:00:00
cathyhongzhang | In favor | 2024-06-17 18:31:01.0 +00:00:00
TheFoxAtWork | In favor | 2024-06-12 21:00:46.0 +00:00:00
nikhita | In favor | 2024-06-18 4:34:02.0 +00:00:00
rochaporto | In favor | 2024-06-12 9:11:47.0 +00:00:00
linsun | In favor | 2024-06-18 15:21:23.0 +00:00:00
kgamanji | In favor | 2024-06-18 6:39:51.0 +00:00:00
dzolotusky | In favor | 2024-06-18 4:09:53.0 +00:00:00
angellk | In favor | 2024-06-18 13:10:33.0 +00:00:00
@mauilion | Pending |

Non-binding votes (3)

User | Vote | Timestamp
huang-feiteng | In favor | 2024-06-12 2:53:32.0 +00:00:00
pacoxu | In favor | 2024-06-12 9:36:03.0 +00:00:00
chira001 | In favor | 2024-06-12 14:25:09.0 +00:00:00

/check-vote

Votes can only be checked once a day.

Vote closed

The vote passed! 🎉

81.82% of the users with binding vote were in favor (passing threshold: 66%).

Summary

In favor | Against | Abstain | Not voted
9        | 0       | 1       | 1

Binding votes (10)

User | Vote | Timestamp
@cathyhongzhang | In favor | 2024-06-17 18:31:01.0 +00:00:00
@nikhita | In favor | 2024-06-18 4:34:02.0 +00:00:00
@kgamanji | In favor | 2024-06-18 6:39:51.0 +00:00:00
@TheFoxAtWork | In favor | 2024-06-12 21:00:46.0 +00:00:00
@angellk | In favor | 2024-06-18 13:10:33.0 +00:00:00
@dzolotusky | In favor | 2024-06-18 4:09:53.0 +00:00:00
@kevin-wangzefeng | Abstain | 2024-06-19 3:36:45.0 +00:00:00
@linsun | In favor | 2024-06-18 15:21:23.0 +00:00:00
@dims | In favor | 2024-06-18 14:04:42.0 +00:00:00
@rochaporto | In favor | 2024-06-12 9:11:47.0 +00:00:00

Non-binding votes (3)

User | Vote | Timestamp
@huang-feiteng | In favor | 2024-06-12 2:53:32.0 +00:00:00
@pacoxu | In favor | 2024-06-12 9:36:03.0 +00:00:00
@chira001 | In favor | 2024-06-12 14:25:09.0 +00:00:00

Hello and congrats on being accepted as a CNCF Sandbox project!

Here is the link to your onboarding task list: cncf/toc#1373

Feel free to reach out with any questions you might have!