PostgreSQL Data Computing Platform (PgDCP)

PgDCP is Netspective's approach for using PostgreSQL to host tables, views, and polyglot stored routines and surface them as GraphQL and REST using Postgraphile, Hasura, and PostgREST (or pREST). Using PostgreSQL whenever possible is what we call our Zero Middleware automated backend as a service (AutoBaaS) strategy. AutoBaaS helps us eliminate significant amounts of GraphQL and REST boilerplate code and reduces the number of microservices we need to deploy and manage.

Zero Middleware is first and foremost a technology strategy, but it also comes with a framework of code generators and reusable components that reduce data-centric code surfaces to just PostgreSQL with auto-generated, database-first secure GraphQL and REST endpoints. For use cases where higher performance or tighter security is necessary, direct access to tables, views, and stored routines using the PostgreSQL wire protocol is encouraged.

We use psql and pure-Postgres migrations as Database as Code (“DaC”) for DDL, DQL, DML, etc. whenever possible. Instead of relying on 3rd party schema migration tools, we use PostgreSQL-first stored routines along with psql to manage idempotent migrations.
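
As an illustration, a minimal sketch of a PostgreSQL-first idempotent migration (the dcp_lifecycle schema and migrate_v001 names are hypothetical, not part of PgDCP) might look like:

```sql
-- Hypothetical sketch of an idempotent, PostgreSQL-first migration routine;
-- every DDL statement is guarded so re-running it via psql is harmless.
CREATE SCHEMA IF NOT EXISTS dcp_lifecycle;

CREATE OR REPLACE PROCEDURE dcp_lifecycle.migrate_v001()
LANGUAGE plpgsql AS $$
BEGIN
    CREATE TABLE IF NOT EXISTS customer (
        id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        name text NOT NULL
    );
    IF NOT EXISTS (
        SELECT 1 FROM information_schema.columns
         WHERE table_name = 'customer' AND column_name = 'email'
    ) THEN
        ALTER TABLE customer ADD COLUMN email text;
    END IF;
END;
$$;

-- Typically invoked from psql as part of Database as Code:
CALL dcp_lifecycle.migrate_v001();
```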

The overall objective for PgDCP is to reduce the number of tools that a developer needs to know down to just PostgreSQL and psql along with SQL and PL/* hosted languages for all services.

Fear the Movement of Data! (FMD!)

The more we move data from one system to another (e.g. traditional ETL or its cooler cousin, ELT), the more likely we are to run into scalability problems. Almost all modern databases are good at querying and can optimize query performance through vertical scaling or, in advanced cases, horizontal scaling. However, no database is good at moving data at scale. The reason we should fear the movement of data is that moving data from one system to another is almost always the most time-consuming part of a data computing stack.

Whenever dealing with data at scale, bring "compute to data" rather than "moving data to the compute clusters". PgDCP as an approach tries to use foreign data wrappers (FDWs) to leave data in its original location when possible and only move the specific data that cannot be processed in place. If it cannot eliminate data movement from source systems, PgDCP tries to reduce data movement to a single move into the PgDCP environment and then uses schemas, views, materialized views, etc. to transform data virtually instead of physically.
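
For example, a minimal sketch using the postgres_fdw extension (server, schema, and table names are illustrative) leaves rows at their source and transforms them virtually with a view:

```sql
-- Illustrative sketch: query remote data in place via a foreign data wrapper
-- rather than copying it into the local database.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SCHEMA IF NOT EXISTS staging;

CREATE SERVER source_db FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'source.example.com', dbname 'sales');

CREATE USER MAPPING FOR CURRENT_USER SERVER source_db
    OPTIONS (user 'reporting', password 'secret');

-- Import table definitions, not rows
IMPORT FOREIGN SCHEMA public LIMIT TO (orders)
    FROM SERVER source_db INTO staging;

-- Transform virtually instead of physically
CREATE VIEW recent_orders AS
    SELECT * FROM staging.orders
     WHERE ordered_at > now() - interval '30 days';
```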

Even when applying Unified Star Schemas (USS), Dimensional Modeling, Data Vault 2.0, or other techniques, we try to use late-binding rather than early-binding when possible.

Polyglot but SQL-first

Our first choice of languages for writing data-centric micro services should be SQL and PL/pgSQL.

In case SQL or PL/pgSQL is not appropriate:

  • PL/Rust, PL/Java, PL/Go, PL/Deno, or another type-safe PostgreSQL-hosted language should be prioritized.
  • If type-safety is not possible, or access to 3rd party libraries is more important than type-safety, then PL/Python, PL/Perl, and other languages should be considered (see the sketch after this list).
    • When using PL/Python or other languages with package managers, consider guidance such as programmatic access to PIP modules so that PostgreSQL server admins are not required to administer module dependencies.
  • The choice of language should depend on how easily the functionality can be expressed using Postgraphile Schema Design.
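
As one hypothetical example of the polyglot approach, a PL/Python routine can pull in the Python ecosystem while still living entirely inside PostgreSQL:

```sql
-- Hypothetical sketch: a PostgreSQL-hosted routine in PL/Python, for cases
-- where library access matters more than type-safety.
CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE OR REPLACE FUNCTION slugify(title text) RETURNS text AS $$
    import re
    return re.sub(r'[^a-z0-9]+', '-', title.lower()).strip('-')
$$ LANGUAGE plpython3u;

SELECT slugify('Hello, PgDCP World!');  -- returns: hello-pgdcp-world
```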

Cut Out the Middle Tier: Generating JSON Directly from Postgres is a good HN discussion about the benefits and pitfalls of the PgDCP approach.
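
For instance, a single SQL statement can produce the JSON an endpoint would return (table and column names here are illustrative):

```sql
-- Sketch: generate the API's JSON payload directly in PostgreSQL,
-- cutting out middle-tier serialization code.
SELECT json_agg(c)
  FROM (SELECT id, name, email FROM customer ORDER BY name) AS c;

-- Nested documents are just as direct:
SELECT jsonb_build_object(
    'id',    o.id,
    'lines', (SELECT jsonb_agg(l) FROM order_line l WHERE l.order_id = o.id)
) AS order_doc
  FROM orders o;
```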

Information Assurance and Security in the Database

PgDCP requires database-first security, which means PostgreSQL schemas, users, roles, permissions, and row-level security (RLS) should drive all data security requirements. Role-based access control (RBAC) and attribute-based access control (ABAC) should be implemented in PostgreSQL stored routines. If necessary, ldap2pg can be used to synchronize roles with LDAP.
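
A minimal sketch of what database-first security looks like (tenant, role, and table names are illustrative):

```sql
-- Sketch: RLS plus RBAC enforced inside PostgreSQL, so every API layer
-- inherits the same security decisions.
ALTER TABLE customer ENABLE ROW LEVEL SECURITY;

-- Each session sees only its own tenant's rows
CREATE POLICY tenant_isolation ON customer
    USING (tenant_id = current_setting('app.tenant_id')::bigint);

-- RBAC: the API role gets only what it needs
CREATE ROLE api_reader NOLOGIN;
GRANT SELECT ON customer TO api_reader;
```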

Because all of our API functionality (except serving the endpoints) will also live in the database, we want to secure views, stored procedures, and stored functions as if they were the API endpoints themselves. The OWASP API Security Project provides some great advice.

Zero Trust SQL (ztSQL) for Zero Trust Data Access (ZTDA)

Zero Trust is a generally accepted cybersecurity approach that eliminates implicit trust in favor of continuously validating each stage of digital interactions. PgDCP encourages the same “never trust, always verify” posture with Zero Trust SQL (ztSQL). ztSQL is designed to protect database environments and enable faster development by allowing anyone to run any SQL while leveraging row-level security, attribute-based access control, role-based access control, and data segmentation within the database. Granular, “least access” policies should be implemented within the database so that Zero Trust Data Access (ZTDA) is possible.
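
A sketch of “least access” defaults in practice (schema, role, and view names are illustrative): start from deny-all, then grant narrowly.

```sql
-- Sketch: default-deny, then grant the minimum needed for ZTDA.
REVOKE ALL ON SCHEMA sales FROM PUBLIC;
REVOKE ALL ON ALL TABLES IN SCHEMA sales FROM PUBLIC;

CREATE ROLE analyst NOLOGIN;
GRANT USAGE ON SCHEMA sales TO analyst;
GRANT SELECT ON sales.recent_orders TO analyst;  -- a view, not the base tables
```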

Securing Access to the Database

If all access management is in the database, then securing access to the database is paramount. To that end, see:

Deliberate and Disciplined Change Management in the Database

In PgDCP, PostgreSQL is not treated as a "bit bucket" where you just store data for an application. It's the center and most important part of our services' universe and requires a deliberate, disciplined, database-first change management approach. While there are many database migration tools like LiquiBase, Flyway, and Sqitch, they all assume that the problem should be solved in a database-independent manner outside of PostgreSQL. Since the PgDCP approach is to double down on PostgreSQL and not worry about database portability, we want to perform PostgreSQL-first database change management.

First review Martin Fowler's Evolutionary Database Design article and then see tools like:

  • Zero-downtime schema migrations in Postgres using views
  • postgresql-dbpatch, which supports conducting database changes and deploying them in a robust and automated way through the database instead of external tools
  • Metagration, a PostgreSQL migration tool written in PostgreSQL
  • Konfigraf, a Postgres extension that allows you to store and manipulate data in Git repositories stored in tables within the database. This is designed to be used for storage of configuration data for applications and services.

Integrated Observability for Health, Metrics and Traceability in the Database

Observability of the database is important for forensics and quality assurance. Try to use simplified auditing based on SQL logging and FDWs to import logs instead of writing brittle custom triggers on each table/object. Separate observability into DDL changes, which can be alerted upon, and DML change logs, which can be used for forensics. Wrap the observability of logs into a metrics table that can be scraped by Prometheus and used for alerting (e.g. whenever DDL changes occur, send alerts to anyone who needs to be informed).
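
For the DDL side, one illustrative option (the observability schema and function names are hypothetical) is a single database-wide event trigger feeding a scrape-able metrics table, avoiding per-table triggers:

```sql
-- Hypothetical sketch: record every DDL change in one place so it can be
-- scraped by Prometheus and alerted upon; no per-table triggers required.
CREATE SCHEMA IF NOT EXISTS observability;

CREATE TABLE observability.ddl_activity (
    occurred_at timestamptz NOT NULL DEFAULT now(),
    command_tag text NOT NULL,
    object_identity text
);

CREATE OR REPLACE FUNCTION observability.record_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE cmd record;
BEGIN
    FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO observability.ddl_activity (command_tag, object_identity)
        VALUES (cmd.command_tag, cmd.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER ddl_audit ON ddl_command_end
    EXECUTE FUNCTION observability.record_ddl();
```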

Especially important is integrating OpenTelemetry trace IDs into each DDL and DML statement so that end-to-end traceability becomes native to the database. Being able to track the context and propagation of SQL through service layers will be critical to maintaining high quality and reliability.

See the semantic conventions for database client calls for how to provide this traceability and integrate W3C's Trace Context.
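
A sketch of what carrying the trace context into SQL can look like (the sqlcommenter-style comment and the app.traceparent setting name are assumptions, not PgDCP requirements):

```sql
-- Assumption: the application appends its W3C traceparent to each statement
-- as a comment (sqlcommenter style), making traces visible in SQL logs.
UPDATE orders SET status = 'shipped' WHERE id = 42
/*traceparent='00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01'*/;

-- Alternatively, stash it in a session setting so stored routines and
-- audit rows can record it alongside their own work:
BEGIN;
SET LOCAL app.traceparent = '00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01';
SELECT current_setting('app.traceparent', true);
COMMIT;
```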

Client Application-friendly Type-safety in the Database

Modern applications demand type-safety (which is why PgDCP recommends TypeScript or Rust for applications). Since applications should be type-safe, we want our data models and database objects to also be type-safe. To enhance type-safety, create custom domains, custom enumerations or lookup tables based on inheritance, business types, and inheritable transaction tables ("table types"). Once you're using table inheritance you can use table inheritance wrapper functions.
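
A brief sketch of these type-safety building blocks (domain and table names are illustrative):

```sql
-- Sketch: custom domains and lookup tables narrow what a column may hold,
-- giving type-safe clients a type-safe database to match.
CREATE DOMAIN email_address AS text
    CHECK (VALUE ~ '^[^@]+@[^@]+$');

CREATE TABLE party_type (          -- enumeration as a lookup table
    code text PRIMARY KEY,
    description text NOT NULL
);

CREATE TABLE party (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    party_type_code text NOT NULL REFERENCES party_type (code),
    email email_address NOT NULL
);
```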

GraphQL-first but REST-capable in the Database

All micro service code in PostgreSQL tables, views, functions, and stored procedures will be surfaced through Postgraphile GraphQL first, but our AutoBaaS requirements dictate that all services also be exposed through safe and secure REST interfaces via PostgREST (or pREST) as a fallback for non-GraphQL clients. We favor Postgraphile's GraphQL API because it generates code which honors PostgreSQL security, roles, and unique features more faithfully than other utilities such as Hasura.
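
One way the database itself can shape the generated GraphQL is Postgraphile's smart comments; a minimal sketch (table and column names are illustrative):

```sql
-- Sketch: smart comments let PostgreSQL drive the GraphQL schema.
COMMENT ON TABLE customer IS E'@name client';  -- rename the GraphQL type
COMMENT ON COLUMN customer.ssn IS E'@omit';    -- keep the column out of the API
```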

Friendlier and Safer Experimentation

Promote the use of Database Lab Engine (DLE) to "provision independent non-production environments with multi-terabyte PostgreSQL databases in a few seconds." Every developer should be able to have their own development instances that are easy to create and tear down, for experimentation and for making sure scripts are idempotent and repeatable. If a database cannot be easily torn down and recreated in development and quality assurance environments, scripting becomes harder.

Part of the DLE is "Joe", which should be used by engineering and QA teams:

“Joe Bot”, a virtual DBA for SQL optimization, is a revolutionary new way to troubleshoot and optimize PostgreSQL query performance. Instead of running EXPLAIN or EXPLAIN (ANALYZE, BUFFERS) directly in production, users send queries for troubleshooting to Joe Bot. Joe Bot uses the Database Lab Engine (DLE) to:

  • Generate a fresh thin clone
  • Execute the query on the clone
  • Return the resulting execution plan to the user
The returned plan is identical to production in terms of structure and data volumes – this is achieved thanks to two factors: thin clones have the same data and statistics as production (at a specified point in time), and the PostgreSQL planner configuration on clones matches the production configuration.

Assurance as Code in the Database

All code in PostgreSQL should be tested, or assured, with pgTAP code. All Assurance Engineering Cases (AECs) should be written code-first, not human-first (what we call Assurance as Code). If Assurance is done within the database then tools like PostgreSQL Anonymizer should be used to help build test-first databases from production data when appropriate (see this PDF for further elaboration).
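
A minimal pgTAP sketch (the table name is illustrative); the test runs inside a transaction that is rolled back, so the database is left untouched:

```sql
-- Sketch of Assurance as Code with pgTAP; run via pg_prove or psql.
BEGIN;
SELECT plan(2);

SELECT has_table('customer', 'customer table should exist');
SELECT ok((SELECT count(*) FROM customer) >= 0,
          'customer table should be readable');

SELECT * FROM finish();
ROLLBACK;
```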

Chaos in the Database

Because our response to database bugs that might lead to crashes is only as good as the number of times we see such crashes, we should use tools like pg_crash, a "Chaos Monkey" extension for PostgreSQL databases. Per their repo: "pg_crash is an extension to PostgreSQL, which allows you to periodically or randomly crash your database infrastructure by sending kill (or other) signals to your DB processes and make them fail. It is ideal for HA and failover testing."

Microsoft Excel-first UX but client-independent

Microsoft Excel should be the first UI that all data access is designed for, outside of developer-centric PgDCP use cases. If Excel can properly show your data in a safe, secure, and performant way, then every other client can do so as well. Excel-first UX should target "live ODBC" use cases where the database is accessed directly using the PostgreSQL binary protocol.

Primary Keys vs. Surrogate Keys for external consumption

Good security practice for modern apps that allow record IDs to be shared externally is to use UUID or shortkey (see below) non-serial primary keys. If you use a serial-type primary key, never send the PK out for external consumption – always use surrogate keys via uuid-ossp or similar. If you use a serial PK and share the ID externally, it will be possible for external users to "guess" IDs, since the PKs would be adjacent numerically.
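
A sketch of the recommended pattern (the table name is illustrative): keep the serial key internal and expose only a UUID surrogate.

```sql
-- Sketch: a numeric PK for joins, a UUID surrogate for external consumption.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE invoice (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,          -- internal only
    external_id uuid NOT NULL UNIQUE DEFAULT uuid_generate_v4()  -- safe to share
);
```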

Real-time Information via PostgreSQL Notify

Postgres Notify for Real Time Dashboards provides a great description of how to "push" updates to clients.
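
A minimal sketch of the pattern (channel, function, and table names are illustrative):

```sql
-- Sketch: push row changes to LISTENing dashboard clients via pg_notify.
CREATE OR REPLACE FUNCTION notify_order_change() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM pg_notify('order_changes', row_to_json(NEW)::text);
    RETURN NEW;
END;
$$;

CREATE TRIGGER order_change_notify
    AFTER INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION notify_order_change();

-- Clients subscribe with: LISTEN order_changes;
```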

PgDCP Engineering Resources

Platform and site reliability engineers should review:

Engineers writing stored routines (functions, SPs) should review:

Engineers writing applications should consider these PostgreSQL-native libraries:

Engineers writing SQL-first code should use the following tools:

Engineers needing to instrument PostgreSQL:

Machine Learning without leaving PostgreSQL:

Content engineers who need datasets:

Precision Knowledge:

Stateless, non-data-centric services are out of scope

If a custom micro service is completely stateless and has nothing to do with reading or writing structured data, it should be written in TypeScript hosted on Deno, or as another kind of micro service using a Microsoft Dapr sidecar.