/glaredb-hw

Primary LanguageDockerfileMIT LicenseMIT

Glaredb playground

This is a place to tinker with query federation OSS software GlareDB

Sample Internet queries

NYC real estate transactions from 2022

SELECT *
FROM read_parquet('https://github.com/GlareDB/tutorial_data/raw/main/quickstart_data/nyc_sales-2022_01.parquet');

Query data from a local file

Manually download the file and put in the ./data/input directory

-- Relative path to the same directory from which you started the GlareDB CLI
SELECT * FROM read_parquet('./data/input/nyc_sales-2022_02.parquet');

Postgres customer data

SELECT * FROM read_postgres(
'postgresql://demo:demo@pg.demo.glaredb.com:5432/postgres', --connection
'public', --schema name
'customer' --table name
);

JOIN Postgres data with remote file to get number of sales per NYC borough

SELECT
    COUNT(sales.sale_date),
    lookup.borough_name
FROM
     read_parquet('https://github.com/GlareDB/tutorial_data/raw/main/quickstart_data/nyc_sales-2022_01.parquet') sales
JOIN
    read_postgres(
        'postgresql://demo:demo@pg.demo.glaredb.com:5432/postgres',
        'public',
        'borough_lookup'
    ) lookup
ON sales.borough = lookup.borough_id
GROUP BY lookup.borough_name;

JOIN Postgres data with local file to get number of sales per NYC borough

SELECT
    COUNT(sales.sale_date),
    lookup.borough_name
FROM
     read_parquet('./data/input/nyc_sales-2022_02.parquet') sales
JOIN
    read_postgres(
        'postgresql://demo:demo@pg.demo.glaredb.com:5432/postgres',
        'public',
        'borough_lookup'
    ) lookup
ON sales.borough = lookup.borough_id
GROUP BY lookup.borough_name;

Export the JOIN to a local table in your GlareDB deployment

CREATE TABLE sales_aggregate_by_borough AS
    SELECT
        COUNT(sales.sale_date),
        lookup.borough_name
    FROM
        read_parquet('https://github.com/GlareDB/tutorial_data/raw/main/quickstart_data/nyc_sales-2022_01.parquet') sales
    JOIN
        read_postgres(
            'postgresql://demo:demo@pg.demo.glaredb.com:5432/postgres',
            'public',
            'borough_lookup'
        ) lookup
    ON sales.borough = lookup.borough_id
    GROUP BY lookup.borough_name;

SELECT *
FROM sales_aggregate_by_borough;

Export the JOIN to a Parquet file

COPY (
    SELECT
        COUNT(sales.sale_date),
        lookup.borough_name
    FROM
        read_parquet('https://github.com/GlareDB/tutorial_data/raw/main/quickstart_data/nyc_sales-2022_01.parquet') sales
    JOIN
        read_postgres(
            'postgresql://demo:demo@pg.demo.glaredb.com:5432/postgres',
            'public',
            'borough_lookup'
        ) lookup
    ON sales.borough = lookup.borough_id
    GROUP BY lookup.borough_name
) TO './data/input/sales_aggregate_by_borough.parquet';

SELECT *
FROM './data/input/sales_aggregate_by_borough.parquet';

Resources

Python Mac versions

dev container spec

GlareDB docs

GlareDB website

Rust releases

License

This project is licensed under the MIT License

Contributing

Disclaimer: Unmaintained and Untested Code

Please note that this program is not actively maintained or tested. While it may work as intended, it's possible that it will break or behave unexpectedly due to changes in dependencies, environments, or other factors.

Use this program at your own risk, and be aware that:

  1. Bugs may not be fixed
  2. Compatibility issues may arise
  3. Security vulnerabilities may exist

If you encounter any issues or have concerns, feel free to open an issue or submit a pull request.