snowflakedb/snowpark-python

SNOW-1447522: Session.create_dataframe fails in local testing.

Zedarflight opened this issue · 3 comments

  1. What version of Python are you using?

    Python 3.11.9

  2. What operating system and processor architecture are you using?

    Linux-4.18.0-372.64.1.el8_6.x86_64-x86_64-with-glibc2.31

  3. What are the component versions in the environment (pip freeze)?

    Snowpark version 1.17.0

  4. What did you do?
    I'm attempting to set up unit tests with pytest for a project I'm working on (following these docs), and have found that Session.create_dataframe doesn't work because of how TableEmulator.__init__ behaves.

Easily reproducible example:

from snowflake.snowpark import Session
session = Session.builder.config('local_testing', True).create()

df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"]) # Create a Snowpark dataframe

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[7], line 4
      1 from snowflake.snowpark import Session
      2 session = Session.builder.config('local_testing', True).create()
----> 4 df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"]) # Create a Snowpark dataframe
...
snip
...
File /redacted/lib/python3.11/site-packages/snowflake/snowpark/mock/_snowflake_data_type.py:255, in TableEmulator.__init__(self, sf_types, sf_types_by_col_index, *args, **kwargs)
    248 def __init__(
    249     self,
    250     *args,
   (...)
    253     **kwargs,
    254 ) -> None:
--> 255     super().__init__(*args, **kwargs)
    256     self.sf_types = {} if not sf_types else sf_types
    257     # TODO: SNOW-976145, move to index based approach to store col type mapping

TypeError: object.__init__() takes exactly one argument (the instance to initialize)
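
For readers unfamiliar with this message: Python raises it when super().__init__ resolves to object.__init__ and still receives extra arguments. The sketch below reproduces that mechanism in isolation. It is not the Snowpark source; Emulator, Base, try_build, and the module name a_module_that_is_not_installed are hypothetical names chosen for illustration, and the idea that the base class falls back to object when an optional dependency cannot be imported is an assumption about how such an error could arise, not a confirmed diagnosis of this issue.

```python
# Hypothetical stand-in, not the Snowpark source: the base class is chosen at
# import time depending on whether an optional dependency could be imported.
try:
    import a_module_that_is_not_installed as optional_dep  # hypothetical name
    Base = optional_dep.DataFrame
except ImportError:
    Base = object  # fallback used when the optional dependency is missing


class Emulator(Base):
    def __init__(self, *args, sf_types=None, **kwargs):
        # With Base == object, forwarding any extra positional or keyword
        # argument here raises TypeError.
        super().__init__(*args, **kwargs)
        self.sf_types = {} if not sf_types else sf_types


def try_build(*args, **kwargs):
    """Return the TypeError message, or None if construction succeeds."""
    try:
        Emulator(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


print(try_build())                  # no extra arguments: construction succeeds
print(try_build([[1, 2], [3, 4]]))  # the data argument reaches object.__init__
```

The second call fails because nothing between Emulator and object consumes the data argument, which is the same shape as the failing frame in the traceback above.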

Other reference used - create_dataframe usage + syntax was pulled from the create_dataframe page.

Some context of the actual use case:

from functools import partial

import snowflake.snowpark

mocked_session = snowflake.snowpark.Session.builder.config('local_testing', True).create()

# Mock up specific SQL queries
def mock_sql(session, query):  # patch for SQL operations
    if query == "SHOW GRANTS TO USER testuser":
        return session.create_dataframe([snowflake.snowpark.Row(role='ROLENAMEHERE', granted_to="USER", grantee_name="testuser", granted_by="conftest_setup")])
    else:
        raise RuntimeError(f"Unexpected query execution: {query}")

# `mocker` is the pytest-mock fixture, available inside a test or fixture function
mocker.patch.object(mocked_session, 'sql', wraps=partial(mock_sql, mocked_session))  # apply patch for SQL operations

  5. What did you expect to see?

    I expected to be able to use Session.create_dataframe in local testing, since Session.create_dataframe is on the list of supported APIs.

  6. Can you set logging to DEBUG and collect the logs?

2024-05-24 23:12:54,567 - MainThread connection.py:399 - __init__() - INFO - Snowflake Connector for Python Version: 3.10.1, Python Version: 3.11.9, Platform: Linux-4.18.0-372.64.1.el8_6.x86_64-x86_64-with-glibc2.31
2024-05-24 23:12:54,571 - MainThread session.py:506 - __init__() - INFO - Snowpark Session information: 
"version" : 1.17.0,
"python.version" : 3.11.9,
"python.connector.version" : 3.10.1,
"python.connector.session.id" : 1,
"os.name" : Linux

stacktrace of error, see 4.

Hey @Zedarflight, local testing requires pandas as a dependency. Have you installed pandas in your env?

Hello @Zedarflight ,

Thanks for raising the issue.

As my colleague said, local testing requires the pandas package to be installed locally. I tested the code and it's working fine.

from snowflake.snowpark.session import Session
from snowflake.snowpark import functions as F
from snowflake.snowpark.types import *

import pandas as pd
import numpy as np
session = Session.builder.config('local_testing', True).create()

df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"]) # Create a Snowpark dataframe
df.show()

Output:

-------------
|"A"  |"B"  |
-------------
|1    |2    |
|3    |4    |
-------------

This is not an issue with Snowflake; it's a configuration issue.

Regards,
Sujan

Pandas was present in the environment. Upgrading to 1.18.0 has fixed the issue.
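
For anyone landing here with the same error, a quick way to confirm what is actually installed in the interpreter running the tests is sketched below. It uses only the standard library; installed_version and report are hypothetical helper names, and the distribution names are the PyPI names for Snowpark and pandas.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_names=("snowflake-snowpark-python", "pandas")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


for name, version in report().items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the same environment as pytest shows whether pandas is visible there and which Snowpark version is in use, which covers both the pandas question raised above and the 1.17.0 versus 1.18.0 difference.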