Define SFrame columns type on new object
regina-grg opened this issue · 3 comments
When creating an SFrame, there is no way to pre-define the column types. This is a problem when creating an empty SFrame from a dictionary: every column defaults to type 'float', and the only way to change a type is column by column, using SArray.astype(str).
I was hoping for an option to define the types when constructing the SFrame, like the column_type_hints option in read_csv:
SFrame.read_csv(url, delimiter=',', header=True, error_bad_lines=False, comment_char='', escape_char='', double_quote=True, quote_char='"', skip_initial_space=True, column_type_hints=None, na_values=['NA'], line_terminator='\n', usecols=[], nrows=None, skiprows=0, verbose=True, **kwargs)
Thanks!
Hi,
I've found that the right types are usually inferred when passing a dictionary to create an SFrame. Less type inference is needed here than for CSV, since the dictionary starts in Python and its values already have types, which is why only read_csv has the column_type_hints option. Can you provide an example that creates an SFrame with all float columns when you wouldn't want that?
Hi,
If you want to create a typed, empty SFrame, my advice is simply to cheat.
The code below first demonstrates your point (an SFrame created over empty data has SArrays of type float).
Then it creates an SFrame with several typed columns and selects 0 rows of it. You are left with an empty, but still well-typed, SFrame.
In [1]: import graphlab as gl
In [2]: sf1 = gl.SFrame({"x":[]})
In [3]: sf1.column_types()
Out[3]: [float]
In [4]: sf2 = gl.SFrame({"x":[str()], "y":[float()], "z":[int()], "l":[list()], "d":[dict()]})
In [5]: sf2
Out[5]:
Columns:
d dict
l list
x str
y float
z int
Rows: 1
Data:
+----+----+---+-----+---+
| d  | l  | x |  y  | z |
+----+----+---+-----+---+
| {} | [] |   | 0.0 | 0 |
+----+----+---+-----+---+
[1 rows x 5 columns]
In [6]: sf2 = sf2[:0]
In [7]: sf2
Out[7]:
Columns:
d dict
l list
x str
y float
z int
Rows: 0
Data:
[]
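If you need this for more than one schema, the same trick can be wrapped in a small helper. This is only a rough sketch: seed_columns and empty_typed_sframe are names I made up for illustration, not part of GraphLab Create, and it assumes every column type can be called with no arguments to produce a default value (true for str, float, int, list and dict).

```python
def seed_columns(schema):
    """Build a one-row column dict: each column holds its type's default value."""
    # e.g. str() -> "", float() -> 0.0, list() -> []
    return {name: [col_type()] for name, col_type in schema.items()}


def empty_typed_sframe(schema):
    """Hypothetical helper: build a one-row SFrame, then keep zero rows of it."""
    # Imported lazily so seed_columns() can be used without GraphLab Create installed.
    import graphlab as gl
    one_row = gl.SFrame(seed_columns(schema))
    return one_row[:0]  # same slice as in the session above: types stay, row goes


schema = {"x": str, "y": float, "z": int, "l": list, "d": dict}
seed = seed_columns(schema)
```

Calling empty_typed_sframe(schema) should then give the same result as In [6] above, under the same assumption that the column types are inferred from the seed row.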
But is there a way to add an empty column with a particular data type to an existing SFrame?
For example, if I have an SFrame with columns A, B and C, all containing data, and I want to add an empty column D of type float, how do I do that?