Get errors in quickstart
f-hafner opened this issue · 2 comments
f-hafner commented
When following the steps in the quickstart, I get some errors.
To reproduce:
- OS: ubuntu 20
git clone git@github.com:alan-turing-institute/privacy-sdg-toolbox.git
cd privacy-sdg-toolbox
poetry install
- Python 3.9
- Then create a new notebook, run it with the project .venv (I use VS Code), and follow the steps in the quickstart.
The specific errors I get
First, there is something wrong in this cell:
from sklearn.ensemble import RandomForestClassifier
attacker = tapas.attacks.ShadowModellingAtack(
FeatureBasedSetClassifier(
tapas.attacks.NaiveSetFeature() + tapas.attacks.HistSetFeature() + tapas.attacks.CorrSetFeature(),
RandomForestClassifier(n_estimators = 100)
),
label = "Groundhog"
)
- RandomForestClassifier vs FeatureBasedSetClassifier? (Why do we do the import first?)
- FeatureBasedSetClassifier is never imported.
- Either way, when running the cell, I get AttributeError: module 'tapas.attacks' has no attribute 'ShadowModellingAtack'
Second, when training the Groundhog attack:
attacker = tapas.attacks.GroundhogAttack()
attacker.train(threat_model, num_samples=1000)
I get a value error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[19], line 3
1 attacker = tapas.attacks.GroundhogAttack()
----> 3 attacker.train(threat_model, num_samples=1000)
File ~/repositories/projects/GANS/privacy-sdg-toolbox/tapas/attacks/shadow_modelling.py:87, in ShadowModellingAttack.train(self, threat_model, num_samples)
84 synthetic_datasets, labels = threat_model.generate_training_samples(num_samples)
86 # Fit the classifier to the data.
---> 87 self.classifier.fit(synthetic_datasets, labels)
88 self.trained = True
File ~/repositories/projects/GANS/privacy-sdg-toolbox/tapas/attacks/set_classifiers.py:144, in FeatureBasedSetClassifier.fit(self, datasets, labels)
143 def fit(self, datasets: list[Dataset], labels: list[int]):
--> 144 self.classifier.fit(self.features(datasets), labels)
File ~/repositories/projects/GANS/privacy-sdg-toolbox/tapas/attacks/set_classifiers.py:85, in SetFeature.__call__(self, *args, **kwargs)
84 def __call__(self, *args, **kwargs):
---> 85 return self.extract(*args, **kwargs)
File ~/repositories/projects/GANS/privacy-sdg-toolbox/tapas/attacks/set_classifiers.py:108, in CombinedSetFeatures.extract(self, dataset)
107 def extract(self, dataset: Dataset) -> np.array:
--> 108 return np.concatenate([f.extract(dataset) for f in self.features], axis=1)
File ~/repositories/projects/GANS/privacy-sdg-toolbox/tapas/attacks/set_classifiers.py:108, in <listcomp>(.0)
...
--> 101 cidx = [categories.index(c) for c in col_data]
102 col_data_onehot[np.arange(len(col_data)), cidx] = 1
104 return col_data_onehot
ValueError: nan is not in list
fhoussiau commented
Hi, sorry for the late reply.
- You are correct, the quickstart is missing tapas.attacks. before FeatureBasedSetClassifier.
- There is also a typo (missing a t in ShadowModellingAttack)!
These will be fixed shortly.
I am a bit surprised by the second error. What dataset are you using? (We do not have support for NaNs at this point.)
f-hafner commented
Hi, thanks! I just found that the second error was a mistake on my part from preparing the UK census data. Sorry!
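For anyone who hits the same ValueError: the traceback fails at categories.index(c), which raises when a value in the column is NaN, because NaN never compares equal to any listed category. The following is a minimal standard-library sketch of that failure and of a pre-check for missing values. The helper names one_hot_indices and find_missing and the toy columns are hypothetical illustrations, not part of the tapas API.

```python
import math


def one_hot_indices(col_data, categories):
    """Mimic the lookup from the traceback: map each value to its category index."""
    return [categories.index(c) for c in col_data]


def find_missing(col_data):
    """Return the positions of None or NaN entries in a column (hypothetical helper)."""
    return [
        i
        for i, v in enumerate(col_data)
        if v is None or (isinstance(v, float) and math.isnan(v))
    ]


categories = ["A", "B", "C"]
clean_column = ["A", "C", "B"]
dirty_column = ["A", float("nan"), "B"]  # one missing value

# A clean column maps to indices without trouble.
clean_indices = one_hot_indices(clean_column, categories)

# A column containing NaN fails, because NaN is not one of the categories.
try:
    one_hot_indices(dirty_column, categories)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Checking for missing values first shows which rows need cleaning.
missing_positions = find_missing(dirty_column)
print(clean_indices, error_message, missing_positions)
```

If the data is held in a pandas DataFrame, DataFrame.isna() gives the equivalent check before the dataset is passed to the toolbox.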