kristiandupont/kanel

Column type in view is unknown

gerdemb opened this issue ยท 16 comments

Kanel is not identifying the column type for one of my views. The view:

beanpost=# \d+ account_posting_hierarchy
                  View "public.account_posting_hierarchy"
 Column  |  Type  | Collation | Nullable | Default | Storage  | Description 
---------+--------+-----------+----------+---------+----------+-------------
 date    | date   |           |          |         | plain    | 
 account | text[] |           |          |         | extended | 
 amount  | amount |           |          |         | extended | 
View definition:
 WITH RECURSIVE hierarchy_cte AS (
         SELECT posting.date,
            account.name AS account,
            posting.amount
           FROM posting
             JOIN account ON account.id = posting.account_id
        UNION ALL
         SELECT hierarchy_cte_1.date,
            trim_array(hierarchy_cte_1.account, 1) AS account,
            hierarchy_cte_1.amount
           FROM hierarchy_cte hierarchy_cte_1
          WHERE array_length(hierarchy_cte_1.account, 1) > 1
        )
 SELECT hierarchy_cte.date,
    hierarchy_cte.account,
    hierarchy_cte.amount
   FROM hierarchy_cte;

And the output from Kanel for the view:

// @generated
// This file is automatically generated by Kanel. Do not modify manually.

import type { ColumnType, Selectable } from 'kysely';

/** Represents the view public.account_posting_hierarchy */
export default interface AccountPostingHierarchyTable {
  date: ColumnType<unknown, never, never>;

  account: ColumnType<unknown[], never, never>;

  amount: ColumnType<unknown, never, never>;
}

export type AccountPostingHierarchy = Selectable<AccountPostingHierarchyTable>;

Notice that all the column types are unknown. amount is a composite type that I have defined.

beanpost=# \d+ amount
                        Composite type "public.amount"
  Column  |  Type   | Collation | Nullable | Default | Storage  | Description 
----------+---------+-----------+----------+---------+----------+-------------
 number   | numeric |           |          |         | main     | 
 currency | text    |           |          |         | extended | 

It is working correctly in other views. For example with this view:

beanpost=# \d+ account_balance
                             View "public.account_balance"
      Column      |   Type   | Collation | Nullable | Default | Storage  | Description 
------------------+----------+-----------+----------+---------+----------+-------------
 id               | integer  |           |          |         | plain    | 
 name             | text[]   |           |          |         | extended | 
 open_date        | date     |           |          |         | plain    | 
 close_date       | date     |           |          |         | plain    | 
 currencies       | text[]   |           |          |         | extended | 
 balance          | amount[] |           |          |         | extended | 
 unbalanced_count | integer  |           |          |         | plain    | 
View definition:
 SELECT account.id,
    account.name,
    account.open_date,
    account.close_date,
    account.currencies,
    COALESCE(account_balance_subquery.balance, ARRAY[]::amount[]) AS balance,
    (( SELECT COALESCE(sum(
                CASE
                    WHEN balance_check.is_balanced = false THEN 1
                    ELSE 0
                END), 0::bigint) AS "coalesce"
           FROM balance_check
          WHERE balance_check.account_id = account.id))::integer AS unbalanced_count
   FROM account
     LEFT JOIN LATERAL ( SELECT posting_balance.id,
            posting_balance.date,
            posting_balance.account_id,
            posting_balance.transaction_id,
            posting_balance.flag,
            posting_balance.amount,
            posting_balance.price,
            posting_balance.cost,
            posting_balance.balance
           FROM posting_balance
          WHERE posting_balance.account_id = account.id
          ORDER BY posting_balance.date DESC
         LIMIT 1) account_balance_subquery ON true;

Kanel correctly generates this schema:

// @generated
// This file is automatically generated by Kanel. Do not modify manually.

import type { AccountId } from './Account';
import type Amount from './Amount';
import type { ColumnType, Selectable } from 'kysely';

/** Represents the view public.account_balance */
export default interface AccountBalanceTable {
  id: ColumnType<AccountId, never, never>;

  name: ColumnType<string[], never, never>;

  open_date: ColumnType<Date, never, never>;

  close_date: ColumnType<Date | null, never, never>;

  currencies: ColumnType<string[] | null, never, never>;

  balance: ColumnType<Amount[], never, never>;

  unbalanced_count: ColumnType<number, never, never>;
}

export type AccountBalance = Selectable<AccountBalanceTable>;

Notice how the balance column has the correct type Amount[]. This is my .kanel.cjs config file:

const path = require('path');
const { makeKyselyHook } = require("kanel-kysely");

/** @type {import('kanel').Config} */
module.exports = {
  connection: process.env.DATABASE_URL,

  preDeleteOutputFolder: true,
  outputPath: './app/database',
  preRenderHooks: [makeKyselyHook()],
  resolveViews: true
};

I have tried setting resolveViews to both false and true as well as removing the makeKyselyHook() but the results are the same. I'm not sure what is different about this view account_posting_hierarchy that Kanel connect identify the column types. Perhaps it's because of the WITH RECURSIVE or UNION ALL. It is my only view that is using those statements.

To be clear, here is what I think is the correct output for the view account_posting_hierarchy with the column types correctly identified.

// @generated
// This file is automatically generated by Kanel. Do not modify manually.

import type { ColumnType, Selectable } from 'kysely';
import { Amount } from './Amount';

/** Represents the view public.account_posting_hierarchy */
export default interface AccountPostingHierarchyTable {
  date: ColumnType<Date, never, never>;

  account: ColumnType<string[], never, never>;

  amount: ColumnType<Amount, never, never>;
}

export type AccountPostingHierarchy = Selectable<AccountPostingHierarchyTable>;

Version information:

    "kanel": "^3.5.5",
    "kanel-kysely": "^0.2.1",
% psql --version
psql (PostgreSQL) 15.4

Hm, that's a bug for sure, but it's not clear to me where it resides :-) I will try to look into this but a temporary workaround for you is probably to create a custom hook that patches this particular view..

Let me know if this is fixed now!

Let me know if this is fixed now!

Thanks for looking into this! Is the fix in the latest version of kanel? It's still not working for me, and I'm getting an error when I run the kanel command.

    "kanel": "^3.6.0",
    "kanel-kysely": "^0.2.3",
% npm run kanel

> kanel
> kanel --c .kanelrc.cjs

Kanel
Using config file: /Users/bengerdemann/Projects/beanpost/.kanelrc.cjs
 โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘ 21% | ETA: 2s | 5/23Error parsing view definition for "price_inverted". Falling back to raw data
Could not resolve source { schema: 'public', table: 'hierarchy_cte', column: 'date' }
Could not resolve source { schema: 'public', table: 'hierarchy_cte', column: 'account' }
Could not resolve source { schema: 'public', table: 'hierarchy_cte', column: 'amount' }
Clearing old files in ./app/database/schemas
 - app/database/schemas/public/Posting.ts
 - app/database/schemas/public/Account.ts
 - app/database/schemas/public/Balance.ts
 - app/database/schemas/public/Currency.ts
 - app/database/schemas/public/Price.ts
 - app/database/schemas/public/Transaction.ts
 - app/database/schemas/public/PostingBalance.ts
 - app/database/schemas/public/BalanceCheck.ts
 - app/database/schemas/public/AccountBalance.ts
 - app/database/schemas/public/TransactionBalance.ts
 - app/database/schemas/public/AccountPostingHierarchy.ts
 - app/database/schemas/public/PriceInverted.ts
 - app/database/schemas/public/Amount.ts
 - app/database/schemas/beancount/Entry.ts
 - app/database/schemas/beancount/TransactionsDetail.ts
 - app/database/schemas/beancount/OpenDetail.ts
 - app/database/schemas/beancount/CloseDetail.ts
 - app/database/schemas/beancount/PadDetail.ts
 - app/database/schemas/beancount/BalanceDetail.ts
 - app/database/schemas/beancount/NoteDetail.ts
 - app/database/schemas/beancount/PriceDetail.ts
 - app/database/schemas/beancount/DocumentDetail.ts
 - app/database/schemas/beancount/Postings.ts
 - app/database/schemas/public/PublicSchema.ts
 - app/database/schemas/beancount/BeancountSchema.ts
 - app/database/schemas/Database.ts

Output in AccountPostingHierarchy.ts is still missing column types.

// @generated
// This file is automatically generated by Kanel. Do not modify manually.

import type { ColumnType, Selectable } from 'kysely';

/** Represents the view public.account_posting_hierarchy */
export default interface AccountPostingHierarchyTable {
  date: ColumnType<unknown, never, never>;

  account: ColumnType<unknown[], never, never>;

  amount: ColumnType<unknown, never, never>;
}

export type AccountPostingHierarchy = Selectable<AccountPostingHierarchyTable>;

Please let me know if I'm doing something wrong or if there's any additional information I can provide that would help you investigate this problem.

Hm ok, I was hoping it would be fixed in this version. I will try to see what else could be wrong.

Does your table definition still look like the first message in this thread?

Does your table definition still look like the first message in this thread?

Yes.

That's so weird, I made an exact unit test for that case! Maybe something fails later in the chain..

In case it helps, I also get unknowns for just this one view:

db=# \d+ v_price_history_with_returns;
                           View "public.v_price_history_with_returns"
   Column    |           Type           | Collation | Nullable | Default | Storage | Description
-------------+--------------------------+-----------+----------+---------+---------+-------------
 id          | integer                  |           |          |         | plain   |
 asset_id    | integer                  |           |          |         | plain   |
 observed_at | timestamp with time zone |           |          |         | plain   |
 current_usd | numeric(32,12)           |           |          |         | main    |
 usd_24h_ago | numeric(32,12)           |           |          |         | main    |
 return_24h  | numeric                  |           |          |         | main    |
 usd_1h_ago  | numeric(32,12)           |           |          |         | main    |
 return_1h   | numeric                  |           |          |         | main    |
View definition:
 WITH previous_prices AS (
         SELECT ph.id,
            ph.asset_id,
            ph.usd AS current_usd,
            ph.observed_at,
            COALESCE(ph24.usd, ph.usd) AS usd_24h_ago,
            COALESCE(ph1h.usd, ph.usd) AS usd_1h_ago
           FROM price_history ph
             LEFT JOIN LATERAL ( SELECT price_history.usd
                   FROM price_history
                  WHERE price_history.asset_id = ph.asset_id AND price_history.observed_at <= (ph.observed_at - '24:00:00'::interval)
                  ORDER BY price_history.observed_at DESC
                 LIMIT 1) ph24 ON true
             LEFT JOIN LATERAL ( SELECT price_history.usd
                   FROM price_history
                  WHERE price_history.asset_id = ph.asset_id AND price_history.observed_at <= (ph.observed_at - '01:00:00'::interval)
                  ORDER BY price_history.observed_at DESC
                 LIMIT 1) ph1h ON true
        )
 SELECT previous_prices.id,
    previous_prices.asset_id,
    previous_prices.observed_at,
    previous_prices.current_usd,
    previous_prices.usd_24h_ago,
    (previous_prices.current_usd - previous_prices.usd_24h_ago) / previous_prices.usd_24h_ago AS return_24h,
    previous_prices.usd_1h_ago,
    (previous_prices.current_usd - previous_prices.usd_1h_ago) / previous_prices.usd_1h_ago AS return_1h
   FROM previous_prices;

And the generated typescript is:

/** Represents the view public.v_price_history_with_returns */
export default interface VPriceHistoryWithReturnsTable {
  id: ColumnType<unknown, never, never>;

  asset_id: ColumnType<unknown, never, never>;

  observed_at: ColumnType<unknown, never, never>;

  current_usd: ColumnType<unknown, never, never>;

  usd_24h_ago: ColumnType<string, never, never>;

  return_24h: ColumnType<string, never, never>;

  usd_1h_ago: ColumnType<string, never, never>;

  return_1h: ColumnType<string, never, never>;
}

Of course, this view can't easily be reproduced. But I thought it might help shed light in case there's a clue with these lateral joins.

Versions:

    "kanel": "3.4.3",
    "kanel-kysely": "0.1.0"

Thank you. I'll investigate further when I have time :-)

I found a very focused scenario to help isolate the problem. Basically, cross-schema Views has issues knowing the source column. @gerdemb 's example is actually going between two schemas.

You can reproduce this easily. Create two views in two separate schemas.

CREATE OR REPLACE VIEW public.kanel_testing AS
    SELECT 1 as id; -- Change this to anything

CREATE OR REPLACE VIEW api.kanel_testing AS
    SELECT * FROM public.kanel_testing;

You get the following output:

public/KanelTesting.ts

/** Represents the view public.kanel_testing */
export default interface KanelTesting {
  id: number;
}

api/KanelTesting.ts

/** Represents the view api.kanel_testing */
export default interface KanelTesting {
  id: unknown;
}

This scenario can easily come up in e.g. Supabase or other providers where it is recommended to have Views intended for API use in a separate schema from your core data schema. This limits exposure to data, and makes it clear what is exposed. So there are lots of these slim Views that simply mirror the underlying data source.

If you can point me to the right area to assist (or workaround), I'd appreciate it (or of course if you immediately know where the problem is, that's probably faster haha).

Ok, so what I found was a different problem ๐Ÿ˜…
https://github.com/behaview/kanel/pull/2
Basically when you have identical View names in public and another schema, it cannot resolve due to the logic. This would be a bandaid though.

I think the source of the problem is that extract-pg-schema cannot identify the source table. In my outcome, it sees the source schema as the same schema (e.g. api), rather than seeing it as public. The self-loop causes it to just go to unknown. If it could find the source properly, my PR becomes irrelevant too.

Ah, that's a good point. I always use schemas for complete encapsulation, so I haven't had things referencing each other across that line. But it's obviously possible and I can see good use cases for it, so Kanel should support it as well.

I believe I am experiencing this issue (postgrest instead of supabase)...

I think the source of the problem is that extract-pg-schema cannot identify the source table. In my outcome, it sees the source schema as the same schema (e.g. api), rather than seeing it as public. The self-loop causes it to just go to unknown. If it could find the source properly, my PR becomes irrelevant too.

I just started using this library today, so I am not too familiar with the codebase, but it seems like this would be the area in which to incorporate a solution?

const source = (c as ViewColumn | MaterializedViewColumn).source;
let target: TableDetails | ViewDetails | MaterializedViewDetails =
config.schemas[source.schema].tables.find((t) => t.name === source.table);
if (!target) {
target = config.schemas[source.schema].views.find(
(v) =>
v.name === source.table &&
v.name !== (d as ViewDetails).informationSchemaValue.table_name,
);
}
if (!target) {
target = config.schemas[source.schema].materializedViews.find(
(v) => v.name === source.table,
);
}
if (!target) {
target = config.schemas["public"]?.tables?.find(
(t) => t.name === source.table,
);
}
if (!target) {
target = config.schemas["public"]?.views?.find(
(v) =>
v.name === source.table &&
v.name !== (d as ViewDetails).informationSchemaValue.table_name,
);
}
if (!target) {
target = config.schemas["public"]?.materializedViews?.find(
(v) => v.name === source.table,
);
}
if (!target) {
console.warn("Could not resolve source", source);
// return to prevent error: cannot read property of undefined (reading columns)
return "unknown";
}
const column = (
target.columns as Array<TableColumn | ViewColumn | MaterializedViewColumn>
).find((c) => c.name === source.column);
if (column) {
return resolveType(column, target, config);
}

The funny thing is, it looks like the zod schemas via generateZodSchemas are the correct column types, so I figure the property metadata must be right somewhere.

I am thinking of just writing a script to strip out the types and wrapping the schema with z.infer. I think that would work as a temp workaround, but perhaps an actual fix would be faster....

Yes this fixed it for me...
main...simian-loco:kanel:patch-1

I imagine this would break some other things but I don't see a testing script in package.json to check. If not though let me know you want the pull request and I can submit it.

Thank you for the great tool! Saving time is the best.

Just updated the patch-1 mentioned above to be more pull-worthy. Let me know if you would like me to create the PR as that unblocked me. Before that change I was seeing "unknown" for pretty much everything (schemas build for postgrest). Deferring the PR since I do not know what "source" might be used for later down the chain... I am assuming that deleting it might be problematic for other people, but I am not someone who should determine that.

Also added #559 and associated PR #558 to address a separate but related issue with the generateIndexFile hook. I think this one is good to go unless using the path to derive the schema name / path is bad form (as mentioned in issue #559).

@kristiandupont - Received your message on the PR. I am using a local branch, so no rush here. Thank you for the ack!