"ValueError: 'a' cannot be empty unless no samples are taken" in preparation step
cjw531 opened this issue · 4 comments
Hi,
I was trying to follow the first step here to get BRDF priors, but I am getting the following error:
$ REPO_DIR="$repo_dir" "$repo_dir/nerfactor/trainvali_run.sh" "$gpus" --config='brdf.ini' --config_override="data_root=$data_root,outroot=$outroot,viewer_prefix=$viewer_prefix"
2021-08-18 12:21:52.126973: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-08-18 12:21:52.165148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-08-18 12:21:52.165428: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-08-18 12:21:52.171059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-08-18 12:21:52.173348: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-08-18 12:21:52.173750: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-08-18 12:21:52.176352: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-08-18 12:21:52.186737: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-08-18 12:21:52.192472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-08-18 12:21:52.194714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-08-18 12:21:52.195193: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-08-18 12:21:52.203031: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3299990000 Hz
2021-08-18 12:21:52.203763: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f68dc000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-18 12:21:52.203782: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-08-18 12:21:52.290736: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x564ac481a330 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-08-18 12:21:52.290795: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2021-08-18 12:21:52.292470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:68:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2021-08-18 12:21:52.292553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-08-18 12:21:52.292584: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-08-18 12:21:52.292611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-08-18 12:21:52.292637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-08-18 12:21:52.292663: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-08-18 12:21:52.292690: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-08-18 12:21:52.292717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-08-18 12:21:52.294454: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-08-18 12:21:52.294508: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-08-18 12:21:52.295132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-18 12:21:52.295142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2021-08-18 12:21:52.295148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2021-08-18 12:21:52.296183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10150 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:68:00.0, compute capability: 7.5)
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0818 12:21:52.299249 140098729092928 mirrored_strategy.py:500] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
[util/io] Output directory already exisits:
/home/jiwonchoi/nerfactor/output/train/merl/lr1e-2
[util/io] Overwrite is off, so doing nothing
[trainvali] For results, see:
/home/jiwonchoi/nerfactor/output/train/merl/lr1e-2
Traceback (most recent call last):
File "/home/jiwonchoi/nerfactor/nerfactor/trainvali.py", line 341, in <module>
app.run(main)
File "/home/jiwonchoi/.conda/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/jiwonchoi/.conda/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/home/jiwonchoi/nerfactor/nerfactor/trainvali.py", line 81, in main
dataset_train = Dataset(config, 'train', debug=FLAGS.debug)
File "/home/jiwonchoi/nerfactor/nerfactor/datasets/brdf_merl.py", line 52, in __init__
mats = np.random.choice(self.brdf_names, n_iden, replace=False)
File "mtrand.pyx", line 908, in numpy.random.mtrand.RandomState.choice
ValueError: 'a' cannot be empty unless no samples are taken
I double-checked my paths, but I am not sure where this error originates.
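For what it's worth, this is the error `np.random.choice` raises whenever it is asked to draw samples from an empty array, so `self.brdf_names` in `brdf_merl.py` is presumably empty, i.e. no BRDF files were found under `data_root`. A minimal standalone sketch (not repo code) reproducing the message:

```python
# np.random.choice raises this exact error when the candidate array is empty
# but samples are still requested.
import numpy as np

np.random.choice(np.array([]), 3, replace=False)
# ValueError: 'a' cannot be empty unless no samples are taken
```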
I tried to recreate the `*.npz` dataset with the train/val/test split. However, it only creates `test.npz` and gives the following reshape error:
$ REPO_DIR="$repo_dir" "$repo_dir"/data_gen/merl/make_dataset_run.sh "$indir" "$ims" "$outdir"
Training & Validation: 0%| | 0/5 [00:00<?, ?it/s]Loading MERL-BRDF: /Users/jchoi/workspace/nerfactor/data/brdf_merl/Copyright_Notice.txt
Training & Validation: 0%| | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/Users/jchoi/workspace/nerfactor/data_gen/merl/make_dataset.py", line 144, in <module>
app.run(main)
File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/Users/jchoi/workspace/nerfactor/data_gen/merl/make_dataset.py", line 75, in main
brdf = MERL(path=path)
File "/Users/jchoi/workspace/nerfactor/brdf/merl/merl.py", line 31, in __init__
cube_rgb = merl.readMERLBRDF(path) # (phi_d, theta_h, theta_d, ch)
File "/Users/jchoi/workspace/nerfactor/third_party/nielsen2015on/merlFunctions.py", line 19, in readMERLBRDF
BRDFVals = np.swapaxes(np.reshape(vals,(dims[2], dims[1], dims[0], 3),'F'),1,2)
File "<__array_function__ internals>", line 6, in reshape
File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 299, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "/Users/jchoi/miniconda3/envs/nerfactor/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
return bound(*args, **kwds)
ValueError: cannot reshape array of size 144 into shape (808591476,1751607666,2037411651,3)
I believe the directory where I saved the downloaded BRDF dataset is fine, since the test numpy array does get created. It seems like someone opened the same issue before, but I am not sure how they resolved it by deleting the readme inside the downloaded BRDF folder.
UPDATE: I got this message, but neither `train.npz` nor `validation.npz` was created:
Training & Validation: 0it [00:00, ?it/s]
Hi, I suspect the BRDF data were not even successfully loaded. Could you try inserting a breakpoint right before `nerfactor/third_party/nielsen2015on/merlFunctions.py` L19? What is `vals` at that point? If `vals` is empty there, then that explains the errors you had.
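Not the repo's own tooling, but a quick standalone way to see what `readMERLBRDF` is actually being handed is to read the three int32 header values at the start of each file: for a genuine MERL `*.binary` they are small dimensions, while for a text file such as `Copyright_Notice.txt` they come out as huge garbage numbers like the ones in the reshape error above. A sketch, with illustrative paths:

```python
# Sanity check (a sketch, not part of nerfactor): a MERL *.binary file starts
# with three int32 dimensions; a stray text file does not.
import numpy as np

def looks_like_merl(path):
    dims = np.fromfile(path, dtype=np.int32, count=3)
    print(path, dims)
    # At the standard MERL resolution the three dims multiply to 1,458,000.
    return len(dims) == 3 and 0 < np.prod(dims.astype(np.int64)) < 10_000_000

looks_like_merl("data/brdf_merl/Copyright_Notice.txt")      # garbage dims
looks_like_merl("data/brdf_merl/brdfs/alum-bronze.binary")  # sensible dims
```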
Fixed this issue by re-checking the `*.binary` MERL dataset. Below is how I solved it and got the train/val/test splits to load properly.
TL;DR: The path given in the README is incorrect if you use the downloaded MERL dataset directly, so you have to modify it. The data path you put in the `indir` variable should contain only the actual MERL dataset, i.e. the `*.binary` files, and nothing else.
Download Dataset
When you download the MERL BRDF dataset, the directory structure will be as follows:
brdf_merl/
├── Readme.txt
├── Copyright_Notice.txt
├── brdfs
│ ├── Readme.txt
│ └── *.binary (<--100 of them are here, omitted)
└── code
└── BRDFRead.cpp
Modify the path
In the README, the data path is set as follows:
indir="$proj_root/data/brdf_merl"
However, if you use this `brdf_merl/` directory directly here, it will give you the same error I got, because the data-generation code tries to read invalid folders/files such as `Readme.txt`, `Copyright_Notice.txt`, and `code/`. Your folder path should contain only the `*.binary` files.
Also, the `brdfs/` folder has another `Readme.txt` inside it, so remember to remove it before running the code.
Therefore, changing the `indir` variable to:
indir="$proj_root/data/brdf_merl/brdfs"
will resolve the issue, because this `brdfs/` folder contains only the actual dataset.
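Alternatively, if you would rather keep `indir` pointing at the unmodified download, a one-off filter along these lines should also work (my own sketch under that assumption, not how `make_dataset.py` currently selects files):

```python
# Sketch: collect only the MERL *.binary files, ignoring Readme.txt,
# Copyright_Notice.txt, code/, etc. Paths are illustrative.
from glob import glob
from os.path import join

indir = "data/brdf_merl"  # the unmodified download directory
binary_paths = sorted(glob(join(indir, "**", "*.binary"), recursive=True))
print(f"Found {len(binary_paths)} MERL BRDFs")  # should be 100
```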
*Side note: I also set `ims='512'` instead of 256, because in the later step where you train the BRDF priors, the `data_root` variable seems to use the 512-resolution version of this dataset.*