ibmdbanalytics/ibmdbpy

Using IdaDataBase.as_idadataframe with clear_existing=False on existing remote database

Closed this issue · 3 comments

I have a pandas DataFrame that I want to save to an existing table on dashDB. However, I cannot use IdaDataBase.as_idadataframe(myPandasDF, tablename="myexistingSchema.myexistingTable", clear_existing=False). This is because of

https://github.com/ibmdbanalytics/ibmdbpy/blob/master/ibmdbpy/base.py#L832-L841

If the table already exists and clear_existing=False, it raises a NameError.

Was this intentional?

As a workaround I use

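from ibmdbpy import IdaDataBase, IdaDataFrame

dashdb = IdaDataBase("jdbcstring")
# Open the existing remote table, then append the local rows to it
dashIdaDF = IdaDataFrame(dashdb, 'myschema.mytable')
dashdb.append(dashIdaDF, myPandasDF)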

Hi,
Yes, this is intentional. IdaDataBase.as_idadataframe is meant for creating a new table that contains the given pandas DataFrame.
Table names must be unique within a schema. As a consequence, clear_existing defaults to False, just to make sure you don't drop a table unintentionally.

However, if the names, number, and data types of the columns match, you can use IdaDataBase.append
http://pythonhosted.org/ibmdbpy/base.html#append
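
To make the difference between the two operations concrete, here is a toy in-memory sketch. ToyDatabase, create_table and the table layout are hypothetical stand-ins invented for this illustration, not ibmdbpy classes or methods, and the real append may check more than this sketch does (it only compares column names, not data types).

```python
class ToyDatabase:
    """In-memory stand-in for a remote database (hypothetical, not ibmdbpy)."""

    def __init__(self):
        # Maps a table name to (column names, list of row tuples)
        self.tables = {}

    def create_table(self, name, columns, rows, clear_existing=False):
        # Refuse to overwrite an existing table unless the caller opts in,
        # which is the behaviour described for as_idadataframe.
        if name in self.tables:
            if not clear_existing:
                raise NameError("Table %s already exists." % name)
            del self.tables[name]
        self.tables[name] = (list(columns), [tuple(r) for r in rows])

    def append(self, name, columns, rows):
        # Appending requires an existing table with the same column names.
        existing_columns, existing_rows = self.tables[name]
        if list(columns) != existing_columns:
            raise ValueError("Column names do not match the existing table.")
        existing_rows.extend(tuple(r) for r in rows)


db = ToyDatabase()
db.create_table("MYSCHEMA.MYTABLE", ["ID", "VALUE"], [(1, "a")])

# Creating the same table again without clear_existing=True fails.
try:
    db.create_table("MYSCHEMA.MYTABLE", ["ID", "VALUE"], [(2, "b")])
    create_failed = False
except NameError:
    create_failed = True

# Appending to the existing table keeps the old rows and adds the new one.
db.append("MYSCHEMA.MYTABLE", ["ID", "VALUE"], [(2, "b")])
rows_after_append = list(db.tables["MYSCHEMA.MYTABLE"][1])

# Opting in with clear_existing=True drops the old table and replaces it.
db.create_table("MYSCHEMA.MYTABLE", ["ID", "VALUE"], [(3, "c")], clear_existing=True)
rows_after_replace = list(db.tables["MYSCHEMA.MYTABLE"][1])
```

In this sketch, creating is "replace or fail" while appending is "extend if compatible", which is why the two are kept as separate calls.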

Your workaround is actually the way it is meant to be.
Do you find it unintuitive?
How would you imagine this feature?

It is always great to have user feedback. :)

Cheers,
Edouard

Yeah -- this makes sense. If as_idadataframe worked as I was trying to make it work, then you'd have a local pandas df that was inconsistent with the returned IdaDataFrame.

The presence of the clear_existing option confused me though. I'm not sure what you could do other than put a note in the docstring for as_idadataframe, near the description of the clear_existing argument, saying that this method cannot be used to append or insert data into an existing remote table, and pointing the reader to append.

Alright. Thanks for your comment!
This will be added to the new doc version.