DB API cursor.description information is incorrect
Opened this issue · 1 comments
padhia commented
Python DB API standards defines here what values a DB API compliant driver can return via cursor.description
.
From what I observed:
precision
andscale
are wrong. The actualscale
value is stored inprecision
andprecision
value is stored ininternal_size
.internal_size
attribute on other databases returns the amount of internal storage used by the column. So for example, column ofint
type would return 4,bigint
8 and so on.internal_size
actually has the values that are more appropriate fordisplay_size
attribute (which is alwaysnull
)- For SQL interval types, attribute
type_code
fromcursor.description
returnsstr
but in reality an instance ofteradata.datatypes.Interval
is returned. - For JSON types, again
str
is stored intype_code
, but the actual values are instances of pythondict
orlist
- When using
tdrest
module, none of the metadata exceptname
andtype_code
are returned. It would be nice to have access to the same information thattdodbc
module provides tdrest
module doesn't supportxml
type at all whereastdodbc
has no problems returningxml
as pythonstr
Here is a small Python program to reproduce above observations:
import teradata
from collections import OrderedDict
td_types = OrderedDict([
( "Char5" , "cast('abc' as char(5))" ),
( "VarChar5" , "cast('abc' as varchar(5))" ),
( "Int" , "cast(123456 as integer)" ),
( "BigInt" , "cast(1234567890123 as bigint)" ),
( "SmallInt" , "cast(1234 as smallint)" ),
( "ByteInt" , "cast(123 as byteint)" ),
( "Dec52" , "cast(123.56 as decimal(5,2))" ),
( "Dec384" , "cast(123.56 as decimal(38,4))" ),
( "Date" , "current_date" ),
( "time" , "current_time" ),
( "timestamp0" , "Current_timestamp(0)" ),
( "timestamp6" , "Current_timestamp(6)" ),
( "HourToMin" , "interval '05:06' hour to minute" ),
( "DayToMin" , "interval '03 05' day to hour" ),
( "YearToMonth", "interval '05-11' year to month" ),
( "JsonObj" , "cast('{\"ABC\":[1,2]}' as json)" ),
( "JsonArray" , "cast('[1,2]' as json)" ),
( "xml" , "cast('<a/>' as xml)" )])
# with teradata.tdrest.connect(host='xxxx', system='xxxx', username='xxxx', password='xxxx') as conn:
with teradata.tdodbc.connect(system='xxxx', username='xxxx', password='xxxx') as conn:
with conn.cursor() as csr:
fmt = '{:<20}|{:<38}|{:<28}|{:>5}|{:>10}|{:>4}|{:>5}|{:^4}'
print(fmt.format('name','py type','csr type','dsize','isize','prec','scale','null'))
print(fmt.replace('{:','{:-').format(*['']*8))
for name, expr in td_types.items():
csr.execute('SELECT {} AS Col_{}'.format(expr,name))
desc = csr.description[0]
col = csr.fetchone()[0]
line = [desc[0], type(col)] + list(desc[1:])
print(fmt.format(*[str(c) for c in line]))
padhia commented
I should add that the integral SQL data types are better represented by python int
(also long
in python2) and python float
to represent SQL real
data types instead of current implementation that uses decimal.Decimal
for all numeric data types.