Cloudberry Database - Next generation unified database for Analytics and AI

Cloudberry Database

Cloudberry Database (CBDB) is shipped with PostgreSQL 14.4 as its kernel and is forked from Greenplum Database 6, which serves as our code base.

Features

Cloudberry Database is 100% compatible with Greenplum and provides all the Greenplum features you need. In addition, Cloudberry Database offers some features that Greenplum currently lacks or does not support. See the feature comparison doc for details.

Code layout

The directory layout of the repository follows the same general layout as upstream PostgreSQL. There are changes compared to PostgreSQL throughout the codebase, but a few larger additions are worth noting:

  • gpMgmt/ : Contains Cloudberry-specific command-line tools for managing the cluster. Scripts like gpinit, gpstart, and gpstop live here. They are mostly written in Python.

  • gpAux/ : Contains Cloudberry-specific release management scripts, and vendored dependencies. Some additional directories are submodules and will be made available over time.

  • gpcontrib/ : Much like the PostgreSQL contrib/ directory, this directory contains Cloudberry-specific extensions such as gpfdist, PXF, and gpmapreduce.

  • doc/ : In PostgreSQL, the user manual lives here. In Cloudberry Database, the user manual is maintained separately at Cloudberry Database Website Repo.

  • hd-ci/ : Contains configuration files for the CBDB continuous integration system.

  • src/

    • src/backend/cdb/ : Contains larger Cloudberry-specific backend modules, for example communication between segments, turning plans into parallelizable plans, mirroring, and distributed transaction and snapshot management. cdb stands for Cluster Database, a working name used in the early days. That name is no longer used, but the cdb prefix remains.

    • src/backend/gpopt/ : Contains the so-called translator library, for using the GPORCA optimizer with Cloudberry. The translator library is written in C++ and contains glue code for translating plans and queries between the DXL format used by GPORCA and the PostgreSQL internal representation.

    • src/backend/gporca/ : Contains the GPORCA optimizer code and tests. This is written in C++. See README.md for more information and how to unit-test GPORCA.

    • src/backend/fts/ : FTS (the fault tolerance service) is a process that runs on the coordinator node and periodically polls the segments to maintain the status of each segment.

Documentation

For Cloudberry Database documentation, please check the documentation website. Our documentation is still under construction, and help is welcome. If you are interested in contributing to the documentation, you can submit a pull request here.

We also recommend using the PostgreSQL Documentation and Greenplum Documentation as quick references.

Contribution

Cloudberry Database is actively maintained by a community of database experts, both individuals and companies. We believe in the Apache Way "Community Over Code" and we want to make Cloudberry Database a community-driven project.

Contributions can be diverse, such as code enhancements, bug fixes, feature proposals, documents, marketing and so on. No contribution is too small; we encourage all types of contributions. We hope you enjoy it here.

We assume you are familiar with collaborative development; if not, please learn more about Git and GitHub. For coding guidelines, we try to follow the PostgreSQL Coding Conventions.

If the change you're working on touches functionality that is common between PostgreSQL and Cloudberry Database, you may be asked to forward-port it to PostgreSQL. This is not only so that we keep reducing the delta between the two projects, but also so that any change that is relevant to PostgreSQL can benefit from a much broader review of the upstream PostgreSQL community. In general, keep both code bases handy so you can be sure whether your changes need to be forward-ported.

Before you commit your changes, please run the following command to configure the commit message template for your Git setup: git config --global commit.template .gitmessage
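
To see what this setting does without touching your existing Git configuration, you can try it in a throwaway repository first. The script below is only a sketch under a few assumptions: it uses --local instead of --global (a choice made here for illustration, not something the project prescribes), and it writes a placeholder .gitmessage, since the real template is expected to come from the Cloudberry Database repository itself.

```shell
#!/bin/sh
# Sketch: try the commit template setting in a throwaway repository so that
# no existing Git configuration is modified.
set -eu

scratch_dir=$(mktemp -d)
trap 'rm -rf "$scratch_dir"' EXIT
cd "$scratch_dir"

git init --quiet .

# Placeholder template for illustration only; it is not the project's template.
printf 'Short summary of the change\n\nWhy the change is needed.\n' > .gitmessage

# --local writes to this repository's .git/config instead of ~/.gitconfig.
git config --local commit.template .gitmessage

# Read the value back to confirm it was stored.
configured_template=$(git config --local --get commit.template)
echo "commit.template=$configured_template"
```

Once the template is configured, running git commit without -m opens your editor pre-filled with the template text.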

Community

We have many channels for community members to discuss, ask for help, give feedback, and chat:

  • GitHub Discussions: We use GitHub Discussions to broadcast news, answer questions, and share ideas. You can start a new discussion under different categories, such as "Announcements", "Ideas / Feature Requests", "Proposal" and "Q&A".

  • GitHub Issues: You can report bugs and issues with code in Cloudberry Database core.

  • Slack: Slack is used for real-time chat, including QA, Dev, Events and more.

When you participate, please follow our community Code of Conduct to help create a safe space for everyone.

Acknowledgment

Thanks to PostgreSQL, Greenplum Database, and other great open source projects for giving Cloudberry Database a sound foundation.

License

Cloudberry Database is released under the Apache License, Version 2.0.