tsrobinson/SyGNet

Simple test per basic_example fails

Closed this issue · 5 comments

ayn2 commented

Epoch: 100%|██████████| 1/1 [00:08<00:00, 8.85s/it]
Traceback (most recent call last):
  File "C:/Users/Artem/PythonProjects/Public_SyGNet/src/sygnet/test123.py", line 32, in <module>
    synth_data1 = model.sample(nobs = 1000)
  File "C:\Users\Artem\PythonProjects\Public_SyGNet\src\sygnet\sygnet_interface.py", line 309, in sample
    n_cat_vars = self.data_encoders[0].n_features_in_
AttributeError: 'OneHotEncoder' object has no attribute 'n_features_in_'

===
Code:
rng = default_rng(seed=100)
manual_seed(100)

def gen_sim_data(rng, n=100000):
    x1 = rng.uniform(low=0, high=1, size=n)
    x2 = rng.uniform(low=0, high=1, size=n)
    x3 = rng.normal(loc=x1 + x2, scale=0.1)
    y = rng.normal(loc=3 * x1 + 2 * x2 + 1, scale=1)

    sim_data = pd.DataFrame({
        'x1': x1,
        'x2': x2,
        'x3': x3,
        'y': y
    })

    return sim_data

train_data = gen_sim_data(rng)
train_data.head()

model = SygnetModel(mode = "wgan")
model.fit(data = train_data, epochs = 1)
synth_data1 = model.sample(nobs = 1000)

synth_data1.head()

Thanks @ayn2! I presume this error doesn't happen when we run SygnetModel(..., mixed_activation=False)?

ayn2 commented

Nope, it still happens.

Not using mixed activation function -- generated data may not conform to real data if it contains categorical columns.
Epoch: 100%|██████████| 1/1 [00:09<00:00, 9.36s/it]
Traceback (most recent call last):
  File "C:/Users/Artem/PythonProjects/Public_SyGNet/src/sygnet/test123.py", line 32, in <module>
    synth_data1 = model.sample(nobs = 1000)
  File "C:\Users\Artem\PythonProjects\Public_SyGNet\src\sygnet\sygnet_interface.py", line 309, in sample
    n_cat_vars = self.data_encoders[0].n_features_in_
AttributeError: 'OneHotEncoder' object has no attribute 'n_features_in_'

ayn2 commented

It does complete the training though, including completing >1 epochs if needed.

No problem, I'll figure it out! Thanks for checking

Hmmm, I'm not able to replicate this one using the code above, and the following imports:

import numpy as np
import pandas as pd
from numpy.random import default_rng
from sygnet import SygnetModel

It may be that we've just overlapped on versions. If you're calling this internally, I'm happy to check again with the full dependency list etc. (just reopen this issue).
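For anyone who hits the same AttributeError, the version guess above could be checked with a rough sketch like the one below. It rests on an unconfirmed assumption: that the installed scikit-learn is old enough that a fitted OneHotEncoder does not yet expose n_features_in_. The helper names installed_versions and count_encoded_features are hypothetical and not part of SyGNet. The package list is also only a guess at the relevant dependencies.

```python
from importlib.metadata import PackageNotFoundError, version
from types import SimpleNamespace


def installed_versions(packages=("scikit-learn", "numpy", "pandas", "torch", "sygnet")):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def count_encoded_features(encoder):
    """Number of input columns a fitted encoder saw.

    Uses n_features_in_ when the encoder has it; otherwise falls back to
    len(encoder.categories_), which holds one entry per input column on a
    fitted OneHotEncoder.
    """
    n_features = getattr(encoder, "n_features_in_", None)
    if n_features is None:
        return len(encoder.categories_)
    return n_features


if __name__ == "__main__":
    # Dependency list to paste into the issue when reporting.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")

    # Stand-in object (not a real scikit-learn encoder) that mimics a fitted
    # encoder lacking n_features_in_, to show the fallback path.
    old_style = SimpleNamespace(categories_=[["a", "b"], ["x", "y"]])
    print("has n_features_in_:", hasattr(old_style, "n_features_in_"))
    print("input columns:", count_encoded_features(old_style))
```

With a fitted model, the same check can be pointed at model.data_encoders[0] (the attribute shown in the traceback). If hasattr(..., "n_features_in_") comes back False there, upgrading scikit-learn is the likely fix, and the version listing gives the dependency list to post back on the issue.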