tobiascz/VideoPose3D

About making output video with VideoPose3d

KevinWisbay opened this issue · 15 comments

Hi,
I appreciate your work very much; I have been looking for something like it for several months.
I tried the original VideoPose3D from Facebook and found that it needs 2D keypoints as input, which is exactly what you have added in this project!
I have run into some problems and hope you can help.

  1. In step 3, I use your infer_simple.py directly, and the outputs are .pdf files. (Should they be .jpg files?)
  2. In the final step, I can't find the d-pt-243.bin file in the checkpoint folder of the original VideoPose3D project, and the link to it seems to be unavailable now. Can you give a new link, or tell me how to create the file?
  3. In the final command, I don't see any argument that points to the 2D-keypoint pictures.

Thank you very much!

Hi,

I am sorry to hear that you had problems running the code.

1. I use the infer_simple.py that is provided in the detectron_tools folder inside my repository ([updated infer_simple](https://github.com/tobiascz/VideoPose3D/blob/master/detectron_tools/infer_simple.py)). You are right that the unmodified infer_simple.py creates a .pdf file for each frame as output, but my infer_simple.py also adds the [export of the 2D poses](https://github.com/tobiascz/VideoPose3D/blob/master/detectron_tools/infer_simple.py#L195). So the results are the .pdf files plus a file called "data_2d_detections.npz", which contains all 2D keypoints.

2. For the d-pt-243.bin file, check out this [issue](https://github.com/tobiascz/VideoPose3D/issues/1).

3. You don't give it the 2D-keypoint pictures but rather the video: `--viz-video "InTheWildData/out_cutted.mp4"`

I see that I have to clear up some things, which I will probably do next month. I hope you can get the code running until then!

greetings

Thank you for your answers. Now the .npz file is created when Detectron computes the 2D keypoints. But, interestingly, I got an error when I used my own .npz file, like this:
(I don't know whether something goes wrong because I use Detectron with Python 2.7.)

Loading dataset...
Preparing data...
Loading 2D detections...
data/data_2d_detections.npz
Traceback (most recent call last):
  File "/home/kevin/.conda/envs/fb3d/lib/python3.6/site-packages/numpy/lib/format.py", line 693, in read_array
    array = pickle.load(fp, **pickle_kwargs)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 0: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_wild.py", line 73, in <module>
    keypoints = keypoints['positions_2d'].item()
  File "/home/kevin/.conda/envs/fb3d/lib/python3.6/site-packages/numpy/lib/npyio.py", line 258, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/home/kevin/.conda/envs/fb3d/lib/python3.6/site-packages/numpy/lib/format.py", line 699, in read_array
    "to numpy.load" % (err,))

Then I tried your .npz file from the data folder. It didn't give any errors, but the resulting video looks like the one in #3. I checked the pictures produced by your infer_simple.py, and the 2D keypoints are correct.
output0017.png.pdf

Yes, there are some differences in the format between Python 3+ and 2.7. I used Python 3+ with Detectron.
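
For anyone hitting the same `UnicodeDecodeError`: the error text itself points at the `encoding=` option of `numpy.load`. Below is a minimal sketch, assuming the `.npz` was pickled under Python 2.7 and is being read under Python 3. The helper name `load_positions_2d` is made up for illustration and is not part of run_wild.py.

```python
import numpy as np


def load_positions_2d(path, key="positions_2d"):
    """Load a pickled object array from an .npz, tolerating Python 2 pickles.

    Hypothetical helper, not part of run_wild.py.
    """
    try:
        # np.load on an .npz is lazy: unpickling (and the error) happens
        # when the key is accessed, so the access must sit inside the try.
        with np.load(path, allow_pickle=True) as archive:
            return archive[key].item()
    except UnicodeError:
        # 'latin1' maps every byte to a code point, so decoding the
        # Python 2 byte strings cannot fail with this encoding.
        with np.load(path, allow_pickle=True, encoding="latin1") as archive:
            return archive[key].item()
```

Regenerating the file with Detectron running under Python 3 avoids the issue entirely; the fallback above is only a workaround for files that already exist.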

I did a small study comparing the outputs of VideoPose3D with Kinect, and for each video you see I created a 2D keypoint .npz file. I think the one currently in the repository is from "Action 3" in my study and not from the skating girl video. So the result in #3 is correct; it just doesn't match the video...

I will try to restore the original data_2d_detections.npz file for the skater in the repository.

Could you try that data_2d_detections.npz? It is from an older commit and I hope it is the right one.
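
One way to notice a mismatch like this before rendering is to compare the number of frames stored in the `.npz` with the number of frames in the video. A rough sketch follows, assuming `positions_2d` is a nested dict/list of arrays whose first axis is the frame index (the layout is an assumption, and `summarize_keypoints` is a hypothetical helper):

```python
import numpy as np


def summarize_keypoints(node, prefix=""):
    """Yield (path, shape) for every array found in a nested dict/list."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from summarize_keypoints(value, f"{prefix}/{key}")
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            yield from summarize_keypoints(value, f"{prefix}[{index}]")
    else:
        # Leaf: report the array shape, e.g. (frames, joints, 2).
        yield prefix, np.asarray(node).shape
```

If the first dimension of the reported shape is far from the frame count of the video passed to `--viz-video`, the keypoints probably belong to a different clip.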

I've tried your new .npz file. I don't think it's the right one.

Loading dataset...
Preparing data...
Loading 2D detections...
data/data_2d_detections.npz
Traceback (most recent call last):
  File "/home/kevin/.conda/envs/fb3d/lib/python3.6/site-packages/numpy/lib/npyio.py", line 447, in load
    return pickle.load(fid, **pickle_kwargs)
_pickle.UnpicklingError: invalid load key, '\x0a'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_wild.py", line 67, in <module>
    keypoints = np.load('data/data_2d_' + args.keypoints + '.npz')
  File "/home/kevin/.conda/envs/fb3d/lib/python3.6/site-packages/numpy/lib/npyio.py", line 450, in load
    "Failed to interpret file %s as a pickle" % repr(file))
OSError: Failed to interpret file 'data/data_2d_detections.npz' as a pickle
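
`numpy.load` only falls back to unpickling when the file starts with neither the zip signature (used by `.npz`) nor the `.npy` magic string, so `invalid load key` usually means the file on disk is not a NumPy archive at all. One possible cause, which is an assumption and not confirmed in this thread, is that something other than the raw binary was saved, for example an HTML page. A quick check, with a made-up helper name:

```python
import zipfile


def looks_like_npz(path):
    """Return (is_zip, first_bytes) for a file that is supposed to be an .npz."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    # A real .npz is a zip archive, so its first two bytes are b"PK".
    return zipfile.is_zipfile(path), head
```

If `is_zip` is false, printing `first_bytes` normally shows what was actually downloaded.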

I wonder how you use Detectron with Python 3.x, since I was told that Detectron can only be used with Python 2.7: https://github.com/facebookresearch/Detectron/blob/master/INSTALL.md. In fact, I would rather make the .npz file on my own. Can you tell me your environment?

I pushed the correct data_2d_detections.npz file to the master branch. This file should match the skater video. It should work now.

I am using Python 3.6.8 to run Detectron! The output of `conda env export > environment.yaml` is:

name: torch
channels:
  - pytorch
  - defaults
dependencies:
  - blas=1.0=mkl
  - ca-certificates=2018.03.07=0
  - certifi=2018.11.29=py36_0
  - cffi=1.11.5=py36he75722e_1
  - intel-openmp=2019.1=144
  - libedit=3.1.20170329=h6b74fdf_2
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=8.2.0=hdf63c60_1
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libprotobuf=3.6.1=hd408876_0
  - libstdcxx-ng=8.2.0=hdf63c60_1
  - mkl=2019.1=144
  - mkl_fft=1.0.10=py36ha843d7b_0
  - mkl_random=1.0.2=py36hd81dba3_0
  - ncurses=6.1=he6710b0_1
  - ninja=1.8.2=py36h6bb024c_1
  - numpy=1.15.4=py36h7e9f1db_0
  - numpy-base=1.15.4=py36hde5b4d6_0
  - openssl=1.1.1a=h7b6447c_0
  - pip=18.1=py36_0
  - protobuf=3.6.1=py36he6710b0_0
  - pycparser=2.19=py36_0
  - python=3.6.8=h0371630_0
  - readline=7.0=h7b6447c_5
  - setuptools=40.6.3=py36_0
  - six=1.12.0=py36_0
  - sqlite=3.26.0=h7b6447c_0
  - tk=8.6.8=hbc83047_0
  - wheel=0.32.3=py36_0
  - xz=5.2.4=h14c3975_4
  - zlib=1.2.11=h7b6447c_3
  - pytorch-nightly=1.0.0.dev20190118=py3.6_cuda9.0.176_cudnn7.4.1_0
  - pip:
    - cycler==0.10.0
    - cython==0.29.2
    - detectron==0.0.0
    - future==0.17.1
    - kiwisolver==1.0.1
    - matplotlib==3.0.2
    - mock==2.0.0
    - opencv-python==4.0.0.21
    - pbr==5.1.1
    - pycocotools==2.0.0
    - pyparsing==2.3.1
    - python-dateutil==2.7.5
    - pyyaml==3.13
    - scipy==1.2.0
    - torch==1.0.0.dev20190118
prefix: /home/narvis/miniconda3/envs/torch

Oh, yes! It works much better now. The 2d-keypoints match the video.
But the 3d-pose-baseline video seems not to be synchronized with the 2d-keypoints video.
I put my output video here.
output_scater.zip
Can you give me some ideas?

I don't think it matters that I use Python 2.7 with Detectron, because I use your .npz file; I actually start from step 4. Since I use Python 3.6 with VideoPose3D in Anaconda, I don't think it is an environment problem.
My env is:

name: fb3d
channels:
  - defaults
dependencies:
  - blas=1.0=mkl
  - ca-certificates=2019.1.23=0
  - certifi=2019.3.9=py36_0
  - cffi=1.12.2=py36h2e261b9_1
  - cudatoolkit=9.0=h13b8566_0
  - cudnn=7.3.1=cuda9.0_0
  - intel-openmp=2019.1=144
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=8.2.0=hdf63c60_1
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libstdcxx-ng=8.2.0=hdf63c60_1
  - mkl=2019.1=144
  - mkl_fft=1.0.10=py36ha843d7b_0
  - mkl_random=1.0.2=py36hd81dba3_0
  - ncurses=6.1=he6710b0_1
  - ninja=1.8.2=py36h6bb024c_1
  - numpy=1.16.2=py36h7e9f1db_0
  - numpy-base=1.16.2=py36hde5b4d6_0
  - openssl=1.1.1b=h7b6447c_1
  - pip=19.0.3=py36_0
  - pycparser=2.19=py36_0
  - python=3.6.8=h0371630_0
  - pytorch=1.0.1=cuda90py36h8b0c50b_0
  - readline=7.0=h7b6447c_5
  - setuptools=40.8.0=py36_0
  - sqlite=3.27.2=h7b6447c_0
  - tk=8.6.8=hbc83047_0
  - wheel=0.33.1=py36_0
  - xz=5.2.4=h14c3975_4
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - beautifulsoup4==4.7.1
    - cycler==0.10.0
    - future==0.17.1
    - google==2.0.2
    - h5py==2.9.0
    - kiwisolver==1.0.1
    - matplotlib==3.0.3
    - opencv-python==4.0.0.21
    - protobuf==3.7.0
    - pyparsing==2.3.1
    - python-dateutil==2.8.0
    - six==1.12.0
    - soupsieve==1.8
    - torch==1.0.1
prefix: /home/kevin/.conda/envs/fb3d

I wonder if the problem is the checkpoint file. I don't know how the checkpoint was created, because I just downloaded it from your link.

I read your comments in VideoPose3D. There should be some differences between the COCO keypoints and the H36M keypoints.

When I print keypoints_symmetry in run_wild.py, it gives:
keypoints_symmetry = [[4, 5, 6, 11, 12, 13], [1, 2, 3, 14, 15, 16]]
And you said keypoints_symmetry should look like this:
keypoints_symmetry = [[1, 3, 5, 7, 9, 11, 13, 15], [2, 4, 6, 8, 10, 12, 14, 16]]

Now I think the problem might be keypoints_symmetry. Could you check your keypoints setting in run_wild.py?
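
The second list can be derived from the standard COCO 17-keypoint order, in which left and right joints alternate after the nose. The sketch below shows that derivation and how such a symmetry list is typically used to mirror a pose for flip augmentation. `flip_keypoints` is an illustrative helper written for this note, not the code in run_wild.py, and it assumes x has already been normalized so that the image centre is at 0.

```python
import numpy as np

# COCO 17-keypoint order (the order Detectron's keypoint models output).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

kps_left = [i for i, n in enumerate(COCO_KEYPOINTS) if n.startswith("left_")]
kps_right = [i for i, n in enumerate(COCO_KEYPOINTS) if n.startswith("right_")]
keypoints_symmetry = [kps_left, kps_right]


def flip_keypoints(kps):
    """Mirror normalized 2D keypoints of shape (frames, 17, 2) horizontally."""
    flipped = kps.copy()
    flipped[..., 0] *= -1  # mirror x around the image centre
    # Swap left and right joints so the labels stay anatomically correct.
    flipped[:, kps_left + kps_right] = flipped[:, kps_right + kps_left]
    return flipped
```

With the H36M index lists applied to COCO-ordered detections, the swap would exchange unrelated joints, which is consistent with the distorted output described above.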

@tobiascz
Thanks for your update! The output video is quite nice now.
The next step is to use my own video and generate 2D keypoints for it.
I do have Detectron in my Anaconda env with Caffe2 (compiled with Python 2.7; it is said that Caffe2 has to be compiled with Python 2.7).
My situation is that Detectron says it can't find the Caffe2 libs when I switch to the Python 3.x env to use VideoPose3D, and I know this is caused by the Python 2.7 env.
I see that your PyTorch is not compiled. Is that okay for Detectron to generate 2D keypoints? If Detectron works well with a compiled PyTorch (Caffe2), I will try again from the very beginning.

I did not compile PyTorch from source! I installed Caffe2 with CUDA support via conda:
conda install pytorch-nightly -c pytorch
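
To see whether the active environment can find the packages Detectron needs, a small check like the following can be run inside that env. This is a sketch: the module names `torch`, `caffe2`, and `detectron` are assumed to be the import names of the packages discussed here.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("python:", sys.version.split()[0])
    for name in ("torch", "caffe2", "detectron"):
        status = "found" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```

A "MISSING" for `caffe2` under Python 3.x would match the "can't find the Caffe2 libs" message described above, since packages built for a Python 2.7 env are not visible from a Python 3 env.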

Thanks for the reply. May I ask which GPU you use? I wonder whether the output accuracy depends on the performance of the GPU.
And what do you use to cut the video into a square?

The GPU does not affect the accuracy of the model! It only speeds up training and testing. I used an Nvidia Titan XPS.

It is not required to cut the video into squares - it should work fine on rectangular videos. I cut it using ffmpeg.
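
The exact ffmpeg options are not given in this thread. As a sketch, a centred square crop can be expressed with ffmpeg's `crop=out_w:out_h:x:y` filter; the input file name and the 1920x1080 size below are placeholders.

```python
def square_crop_command(src, dst, width, height):
    """Build an ffmpeg command that crops a centred square from the video."""
    side = min(width, height)
    x = (width - side) // 2   # left edge of the crop window
    y = (height - side) // 2  # top edge of the crop window
    # crop=out_w:out_h:x:y is ffmpeg's crop filter syntax.
    return ["ffmpeg", "-i", src, "-vf", f"crop={side}:{side}:{x}:{y}", dst]


if __name__ == "__main__":
    # Placeholder names and size; pass the list to subprocess.run to execute it.
    print(" ".join(square_crop_command("input.mp4", "out_cutted.mp4", 1920, 1080)))
```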

I just fixed my Detectron environment with PyTorch and Python 3.6. Now I can generate my own 2D keypoints file (.npz), and the 3D baseline video looks nice. But the input video with the 2D keypoints looks weird, because the 2D keypoints don't match it as well as they do in yours. The output pictures are correct. I use exactly your yaml file and weight file.
And why do you add --viz-skip 9 to the command:

python run_wild.py \
  -k detections \
  -arc 3,3,3,3,3 \
  -c checkpoint \
  --evaluate d-pt-243.bin \
  --render \
  --viz-subject S1 \
  --viz-action Directions \
  --viz-video InTheWildData/out_cutted.mp4 \
  --viz-camera 0 \
  --viz-output output_scater.mp4 \
  --viz-size 5 \
  --viz-downsample 1 \
  --viz-skip 9

EDIT:
What do you think about the difference between our GPUs? I noticed that the GPU was working hard while Detectron computed the 2D keypoints.

There is an image width and height in the code; you should adjust them to the size of your video. See the original documentation about --viz-skip. I set it to 9 because the video was not matching perfectly, but set it to 0 for your try. There is also a parameter for the frame rate of your video.
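
Why the width and height matter: the pixel coordinates of the 2D keypoints are rescaled using the frame size before they are used. The sketch below follows the normalization as found in the original VideoPose3D code to the best of my knowledge (x mapped to [-1, 1], aspect ratio preserved); treat the exact form as an assumption and check it against the repository.

```python
import numpy as np


def normalize_screen_coordinates(points, w, h):
    """Map pixel x in [0, w] to [-1, 1], scaling y by the same factor."""
    assert points.shape[-1] == 2
    return points / w * 2 - np.array([1, h / w])


if __name__ == "__main__":
    centre = np.array([[960.0, 540.0]])  # centre of a 1920x1080 frame
    print(normalize_screen_coordinates(centre, 1920, 1080))  # correct size
    print(normalize_screen_coordinates(centre, 1000, 1000))  # wrong size
```

With the correct size the frame centre maps to the origin; with a wrong size the whole pose is shifted and scaled, which shows up as keypoints that do not line up with the video.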

I had already set viz-skip to 0, and it worked better than 9. I will try different frame-rate values and other GPUs, and I will let you know if anything interesting turns up.
Thanks again for this repository. It helps me a lot!!