pytorch/captum

Can’t create tensors as shown in first Captum Titanic tutorial?!

Closed this issue · 0 comments

🐛 Bug

Basic steps in the Titanic tutorial to load CSV to tensors don't work?

To Reproduce

I'm stumped by the simplest part of the most basic "Titanic Basic" captum tutorial: converting the data into tensors?!

After getting the data and performing the first preprocessing steps,
converting to numpy arrays and separating out train and test sets works fine:

	data = titanic_data.to_numpy()

	train_indices = np.random.choice(len(labels), int(0.7*len(labels)), replace=False)
	test_indices = list(set(range(len(labels))) - set(train_indices))
	train_features = data[train_indices]
	train_labels = labels[train_indices]
	test_features = data[test_indices]
	test_labels = labels[test_indices]

but converting to tensors doesn't work:

	File ".../Titanic_Basic_Interpret.py", line 139, in <module>
	input_tensor = torch.as_tensor(train_features,dtype=torch.float32)
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.

Maybe the problem is "no more magic, convert_objects has been deprecated in pandas 0.17" ? It seems the tutorial was added back in 2019, but other issues seem to have used it more recently?

I've tried some of the suggestions there (building a separate dictionary of data types and then data = data.astype(dtype=dtypeDict), converting each column separately:

	for c in titanic_data.columns:
	    titanic_data[c] = pd.to_numeric(titanic_data[c])

but these don't go thru either. What could the issue be?!

Expected behavior

Environment

Describe the environment used for Captum


 - Captum / PyTorch Version: 2.1.0.post100 / 0.7.0
 - OS: OSX 14.5
 - How you installed Captum / PyTorch (`conda`, `pip`, source): pip
 - Python version: 3.11.7