To get list of columns with transformation formula
AzatYakupov opened this issue · 3 comments
Hi there!
I am using pglast version 3.3
Could you give me a hint on how to get all the columns included in every SelectStmt object, together with the formula that produces each one, if there is one?
Just an example:
SELECT t1.name AS name ,
t1.name||' '||t2.name AS new_name
FROM t1 inner join t2 on t1.id = t2.par_id;
and I need to get the following columns:
name: t1.name
new_name: t1.name||' '||t2.name
thanks!
The following script does that:
from pglast.parser import parse_sql
from pglast.stream import RawStream
from pglast.visitors import Visitor


class TargetColumnNames(Visitor):
    def __call__(self, node):
        self.tcnames = {}
        super().__call__(node)
        return self.tcnames

    def visit_ResTarget(self, ancestors, node):
        self.tcnames[node.name] = RawStream()(node.val)


def target_columns(stmt):
    return TargetColumnNames()(parse_sql(stmt) if isinstance(stmt, str) else stmt)


print(target_columns("SELECT t1.name AS name, t1.name||' '||t2.name AS new_name"
                     " FROM t1 inner join t2 on t1.id = t2.par_id"))
As always, details matter, and the real problem is to define clearer rules describing what you mean by "list of columns": for example, what should happen if the statement has subselects?
Thanks @lelit, that's really cool!
Regarding your comment, yes, you are right: I need to get the ETL transformation for every column, and a SQL statement can contain several transformations for one column (through CTEs and subqueries/subselects).
So, for now, I am going to build on your code to get the chain of transformations for a single column.
I think there's nothing more to be done here, right? Otherwise, feel free to reopen the issue!