No possibility of np.nan or None for int type, but possible for float type - inconsistency?

Question

username-still-not-available opened this issue 7 years ago · 2 comments

username-still-not-available commented 7 years ago

Hello,

If I want to insert None or np.nan into an integer column, it is not possible, but it is possible for a float column, where it becomes an np.nan value:

  Scenario: integer dtypes
    Given a gherkin table as input
      | int | int |
      |     |     |
      | 1   |     |
      | 2   |     |
    When attempting to convert to a data frame using 0 row as column names and 0 column as index
    Then it raises a ValueError exception

  Scenario: float dtypes
    Given a gherkin table as input
      | float | float |
      |       |       |
      | 4.1   |       |
      | 5.2   |       |
    When converted to a data frame using 0 row as column names and 0 column as index
    Then it matches a manually created data frame with null float data

What do you think about it?

Answer 1 · 2018-06-05T13:58:32.000Z

I realized that pandas itself behaves this way: a float column can contain None (stored as NaN), but an integer column cannot. Sorry for the false issue. Pandas automatically changes the dtype from int to float in this case.
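
For anyone landing here later, a minimal sketch of that behaviour in plain pandas, independent of this library. The variable names are only illustrative, and the exact integer width is assumed to be the usual 64-bit default:

```python
import numpy as np
import pandas as pd

# A column of plain integers gets an integer dtype.
ints = pd.Series([1, 2])
print(ints.dtype)

# Adding a missing value makes pandas fall back to float64,
# because NaN is a floating-point value.
with_missing = pd.Series([None, 1, 2])
print(with_missing.dtype)
print(with_missing.isna().tolist())

# Forcing the column back to an integer dtype fails while NaN is present.
try:
    with_missing.astype("int64")
    cast_failed = False
except ValueError as exc:
    cast_failed = True
    print("cast failed:", exc)
```

This is presumably why the integer scenario ends in a ValueError while the float scenario succeeds, although the library's own conversion code may take a different path.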

Answer 2 · 2018-06-05T23:10:32.000Z

Yep, this is unfortunately one of the most common pandas problems... It is possible to store ints in object columns to work around this, but it is not recommended. The same goes for boolean columns, by the way.
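
A rough sketch of the object-column workaround, again in plain pandas with illustrative names. It keeps the Python ints and None side by side, at the cost of the column no longer being a native numeric dtype:

```python
import pandas as pd

# Explicitly asking for dtype=object keeps the original Python objects.
ints_obj = pd.Series([None, 1, 2], dtype=object)
print(ints_obj.dtype)           # object
print(type(ints_obj.iloc[1]))   # the value is still a Python int

# Booleans behave the same way: with a missing value present,
# the column is no longer a bool column.
bools = pd.Series([None, True, False])
print(bools.dtype)              # object

# Missing values are still detected in object columns.
print(ints_obj.isna().sum())
```

Object columns hold generic Python objects, so numeric operations on them are generally slower than on int64 or float64 columns, which is one reason the workaround is discouraged.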