CartoDB/cartoframes

Uploading a GeoDataFrame with geometry and the_geom columns to CARTO fails

Closed this issue · 4 comments

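import pandas as pd
from geopandas import GeoDataFrame
from cartoframes import to_carto
from cartoframes.utils import decode_geometry

# Load a CSV whose 'the_geom' column holds hex-encoded geometries.
stores_df = pd.read_csv('http://libs.cartocdn.com/cartoframes/files/starbucks_brooklyn_geocoded.csv')
# Decode 'the_geom' into the 'geometry' column; the original 'the_geom' column is kept.
stores_gdf = GeoDataFrame(stores_df, geometry=decode_geometry(stores_df['the_geom']))
# Fails: the GeoDataFrame now has both 'geometry' and 'the_geom' columns.
to_carto(stores_gdf, 'stores', if_exists='replace')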
TypeError: Input must be valid geometry objects: ['0101000020E61000005EA27A6B607D52C01956F146E6554440'
 <shapely.geometry.point.Point object at 0x125415d68>]

The problem is that geopandas' rename_geometry function does not account for an existing column with the same name. In this case, we end up with two the_geom columns during to_carto. However, the fix in cartoframes is pretty straightforward.
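To illustrate the duplicate-column behavior described above, here is a minimal sketch using plain pandas only. The column values are placeholders, not real geometries, and the snippet does not call geopandas or cartoframes; it only shows what DataFrame.rename does when the target name is already taken.

```python
import pandas as pd

# Placeholder values standing in for the encoded and decoded geometries.
df = pd.DataFrame({
    'the_geom': ['encoded-geometry-placeholder'],
    'geometry': ['decoded-geometry-placeholder'],
})

# Renaming 'geometry' to a name that already exists does not raise;
# pandas simply produces two columns with the same label.
renamed = df.rename(columns={'geometry': 'the_geom'})

print(list(renamed.columns))               # ['the_geom', 'the_geom']
# Selecting the duplicated label returns a two-column DataFrame, not a Series.
print(type(renamed['the_geom']).__name__)  # DataFrame
```

Each row of renamed['the_geom'] then carries both the encoded and the decoded value, which seems consistent with the TypeError above listing a hex string next to a Point object.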

Note: a cleaner solution to offer users is to upload the DataFrame directly, using the geom_col param:

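import pandas as pd
from cartoframes import to_carto

stores_df = pd.read_csv('http://libs.cartocdn.com/cartoframes/files/starbucks_brooklyn_geocoded.csv')
# creds is assumed to be a cartoframes.auth.Credentials instance defined earlier.
# geom_col tells to_carto which column already holds the geometry.
to_carto(stores_df, 'stores', creds, if_exists='replace', geom_col='the_geom')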

> The problem is that the rename_geometry function of gpd does not take into account that a column already exists with the same name

It should probably check for that (currently, geopandas relies on pandas' DataFrame.rename behavior, which just renames and creates a duplicate column name). Do you want to open an issue on the geopandas repo for this?
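As a rough sketch of the kind of check suggested here, the helper below refuses to rename a column onto a name that is already taken. safe_rename_column is a hypothetical name used only for illustration; it is not part of geopandas, pandas, or cartoframes, and it is not necessarily how either library addressed the problem.

```python
import pandas as pd


def safe_rename_column(df, old, new):
    """Rename column `old` to `new`, refusing to create a duplicate label.

    Hypothetical helper for illustration only.
    """
    if new != old and new in df.columns:
        raise ValueError(f"column {new!r} already exists")
    return df.rename(columns={old: new})


# Placeholder values standing in for the encoded and decoded geometries.
df = pd.DataFrame({'the_geom': ['encoded'], 'geometry': ['decoded']})

try:
    safe_rename_column(df, 'geometry', 'the_geom')
except ValueError as err:
    print(err)  # column 'the_geom' already exists

# Dropping the stale column first makes the rename unambiguous.
fixed = safe_rename_column(df.drop(columns=['the_geom']), 'geometry', 'the_geom')
print(list(fixed.columns))  # ['the_geom']
```

Raising an error keeps the decision with the caller; silently dropping or overwriting the existing column would be an alternative design with different trade-offs.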

Fixed. It will be available in the next stable release, 1.0.5.