Diyago/Tabular-data-generation

TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'

seanko29 opened this issue · 1 comment

Describe the bug


TypeError Traceback (most recent call last)
Input In [1], in <cell line: 11>()
8 test = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))
10 # generate data
---> 11 new_train1, new_target1 = OriginalGenerator().generate_data_pipe(train, target, test, )
12 new_train2, new_target2 = GANGenerator().generate_data_pipe(train, target, test, )
14 # example with all params defined

File ~/.conda/envs/medicine/lib/python3.8/site-packages/tabgan/abc_sampler.py:77, in SampleData.generate_data_pipe(self, train_df, target, test_df, deep_copy, only_adversarial, use_adversarial, only_generated_data)
75 if use_adversarial:
76 logging.info("Applying adversarial filtering")
---> 77 new_train, new_target = generator.adversarial_filtering(
78 new_train, new_target, test_df
79 )
80 gc.collect()
82 logging.info("Total finishing, returning data")

File ~/.conda/envs/medicine/lib/python3.8/site-packages/tabgan/sampler.py:204, in SamplerOriginal.adversarial_filtering(self, train_df, target, test_df)
202 self._validate_data(train_df, target, test_df)
203 train_df[self.TEMP_TARGET] = target
--> 204 ad_model.adversarial_test(test_df, train_df.drop(self.TEMP_TARGET, axis=1))
206 train_df["test_similarity"] = ad_model.trained_model.predict(
207 train_df.drop(self.TEMP_TARGET, axis=1)
208 )
209 train_df.sort_values("test_similarity", ascending=False, inplace=True)

File ~/.conda/envs/medicine/lib/python3.8/site-packages/tabgan/adversarial_model.py:63, in AdversarialModel.adversarial_test(self, left_df, right_df)
55 concated = pd.concat([left_df, right_df])
56 lgb_model = Model(
57 cat_validation=self.cat_validation,
58 encoders_names=self.encoders_names,
(...)
61 model_params=self.model_params,
62 )
---> 63 train_score, val_score, avg_num_trees = lgb_model.fit(
64 concated.drop("gt", axis=1), concated["gt"]
65 )
66 self.metrics = {"train_score": train_score,
67 "val_score": val_score,
68 "avg_num_trees": avg_num_trees}
69 self.trained_model = lgb_model

File ~/.conda/envs/medicine/lib/python3.8/site-packages/tabgan/adversarial_model.py:176, in Model.fit(self, X, y)
174 mean_score_train = np.mean(self.scores_list_train)
175 mean_score_val = np.mean(self.scores_list_val)
--> 176 avg_num_trees = int(np.mean(self.models_trees))
178 return mean_score_train, mean_score_val, avg_num_trees

File <array_function internals>:180, in mean(*args, **kwargs)

File ~/.conda/envs/medicine/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3474, in mean(a, axis, dtype, out, keepdims, where)
3471 else:
3472 return mean(axis=axis, dtype=dtype, out=out, **kwargs)
-> 3474 return _methods._mean(a, axis=axis, dtype=dtype,
3475 out=out, **kwargs)

File ~/.conda/envs/medicine/lib/python3.8/site-packages/numpy/core/_methods.py:179, in _mean(a, axis, dtype, out, keepdims, where)
176 dtype = mu.dtype('f4')
177 is_float16_result = True
--> 179 ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
180 if isinstance(ret, mu.ndarray):
181 ret = um.true_divide(
182 ret, rcount, out=ret, casting='unsafe', subok=False)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'

Desktop:

  • OS: Ubuntu/Linux server, Python 3.8, Jupyter notebook

Additional context

I tried running tabgan on your example with random data, but it fails with the error above.
I also tried it with my own tabular data (no categorical variables and no NaNs, although it does contain zeros),
but that does not work either and prints the same result.

Do you have any idea why it is not working?

Thank you in advance.

Could you please try the new version: pip install tabgan==1.2.1
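
In a Jupyter notebook the kernel can run in a different environment from the shell where pip was run, so it may be worth confirming which tabgan version the kernel actually sees. A minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run this inside the notebook kernel, before and after upgrading.
print("tabgan:", installed_version("tabgan"))
```

If this prints a version other than the one just installed, the notebook kernel is probably using a different environment, and restarting the kernel or installing from within the notebook may be needed.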

Could you try to reproduce it in Colab? I just tried, and it worked smoothly.
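
For anyone hitting the same traceback: the last tabgan frame is `int(np.mean(self.models_trees))`, so the error is consistent with `self.models_trees` containing `None` entries. What puts the `None` values there is not visible in the traceback. The sketch below only reproduces the NumPy side of the failure and shows one defensive way to average such a list; `mean_tree_count` is a hypothetical helper, not part of tabgan.

```python
import numpy as np


def mean_tree_count(tree_counts):
    """Average per-fold tree counts, skipping folds that reported None.

    Hypothetical helper for illustration; returns None when no fold
    reported a usable count.
    """
    valid = [n for n in tree_counts if n is not None]
    if not valid:
        return None
    return int(np.mean(valid))


# np.mean over a list of None builds an object array and sums it with +,
# which None does not support, hence the TypeError in the report.
try:
    np.mean([None, None])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(mean_tree_count([None, None]))
print(mean_tree_count([120, None, 80]))
```

Skipping `None` entries only hides the symptom. If every fold reports `None`, the underlying cause (for example a dependency version mismatch) still needs fixing, which is why upgrading tabgan is the first thing to try.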