Warning: The given NumPy array is not writeable
fisheggg opened this issue · 1 comment
fisheggg commented
Hi,
Thanks for making this tool!
I've got a warning message from PyTorch when loading tfrecords using MultiTFRecordDataset:
UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:143.)
return default_collate([torch.as_tensor(b) for b in batch])
Here's how I write and load the .tfrecord shards:
# writing to .tfrecord shards
out_f = tf.io.TFRecordWriter(output_path)
# feature_sliced is a 3-dim np.array with type np.float32
features = {
    "fbank": tf.train.Feature(bytes_list=tf.train.BytesList(value=[feature_sliced[:, :, slice_idx].tobytes()])),
}
example = tf.train.Example(features=tf.train.Features(feature=features))
out_f.write(example.SerializeToString())
# loading from .tfrecord shards
description = {
    "fbank": "byte",
}

def transform(features):
    features["fbank"] = np.frombuffer(bytes(bytearray(features["fbank"])), dtype=np.float32).T.reshape(-1, 128)
    features["song_title"] = bytes(bytearray(features["song_title"])).decode("utf-8")
    return features

train_set = MultiTFRecordDataset(
    tfrecord_pattern,
    index_pattern,
    splits_train,
    description=description,
    transform=transform,
    infinite=False
)
Package versions I'm using:
torch==1.8.1+cu101
numpy==1.21.0
tfrecord==1.14.1
Thanks for looking into this!
Best,
Arthur
luisfmnunes commented
I know this is a late response, but the warning comes from np.frombuffer: when it is given an immutable bytes object, it returns a READ-ONLY NumPy array that shares memory with that buffer instead of copying it. PyTorch wraps the array's memory without copying, so it warns that the tensor could write into data that is supposed to be read-only. The fix is to copy the buffer so the array owns writable memory: just add .copy() at the end of the np.frombuffer(...) expression in your transform.
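To illustrate the suggestion, here is a minimal sketch of the transform with .copy() applied. It runs outside the tfrecord pipeline: the record dict and its contents are made-up stand-ins rather than data from the issue, the song_title field is left out, and the .T from the original is dropped because it has no effect on a 1-D array.

```python
import numpy as np


def transform(features):
    # np.frombuffer on an immutable bytes object gives a read-only view;
    # .copy() allocates a new array that owns writable memory.
    raw = bytes(bytearray(features["fbank"]))
    features["fbank"] = (
        np.frombuffer(raw, dtype=np.float32).reshape(-1, 128).copy()
    )
    return features


# Hypothetical record standing in for one decoded tfrecord example.
record = {"fbank": np.arange(256, dtype=np.float32).tobytes()}

# Without the copy, the array is a read-only view over the bytes object.
read_only = np.frombuffer(record["fbank"], dtype=np.float32)
print(read_only.flags.writeable)

# With the copy, the array is writable and can be modified in place.
out = transform(dict(record))
print(out["fbank"].flags.writeable)
out["fbank"][0, 0] = 1.0
```

The copy costs one extra allocation per record. The result should be safe to pass to torch.as_tensor without the warning, since the array is no longer read-only.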