Extending 3d-pose-baseline to python-posenet?

Question

Extending 3d-pose-baseline to python-posenet?

basicvisual opened this issue 5 years ago · 3 comments

Hi, I am currently trying to extend 3d-pose-baseline so that it works with python-posenet. The GitHub repo is below:

https://github.com/rwightman/posenet-python

In the file webcam_demo.py it returns the 17 keypoints via the array keypoint_coords, with X and Y coordinates for each. I am having a hard time converting this into the JSON file structure that 3d-pose-baseline takes as input, which expects the following format:

{"people": [{"pose_keypoints_2d": [443, 151, 443, 175, 419, 172, 376, 154, 395, 99, 470, 175, 474, 201, 0.0, 0.0, 415, 266, 419, 342, 419, 397, 455, 271, 439, 342, 427, 399, 435, 146, 451, 146, 427, 146, 463, 146]}]}
which is the 36 input values (18 points), padded with zeros. I was wondering how the variable keypoint_coords can be converted into this JSON format. A sample of the keypoints returned is as follows:

 [[[112.7389984  435.22939381]
  [105.58624109 442.22496105]
  [105.40821313 428.48843217]
  [111.54160163 455.5956664 ]
  [111.73701629 422.6852232 ]
  [157.57940773 468.98824122]
  [152.69650377 418.14950772]
  [216.09577598 494.46142991]
  [216.24983383 398.55651579]
  [187.72555422 470.64766974]
  [195.02623991 401.41814955]
  [250.42257518 456.73815289]
  [254.02587076 414.79482611]
  [342.11166065 465.78956559]
  [337.49021309 398.70648202]
  [420.26860336 483.16769808]
  [413.13895387 400.20868124]]
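A minimal sketch of one possible conversion, under several assumptions that should be checked against both repositories. First, the sample JSON above appears to follow OpenPose's 18-keypoint COCO order (nose, neck, right arm, left arm, right leg, left leg, eyes, ears), while PoseNet returns 17 keypoints in a different order and has no neck. Second, the first column of the sample output grows from nose to ankle, which suggests that posenet-python returns (y, x) rather than (x, y). The helper name `posenet_to_openpose` and the neck-as-shoulder-midpoint approximation are illustrative choices, not part of either project.

```python
import numpy as np

# PoseNet keypoint order (17 parts).
POSENET_PARTS = [
    "nose", "leftEye", "rightEye", "leftEar", "rightEar",
    "leftShoulder", "rightShoulder", "leftElbow", "rightElbow",
    "leftWrist", "rightWrist", "leftHip", "rightHip",
    "leftKnee", "rightKnee", "leftAnkle", "rightAnkle",
]

# OpenPose COCO keypoint order (18 parts); "neck" has no PoseNet counterpart.
OPENPOSE_PARTS = [
    "nose", "neck", "rightShoulder", "rightElbow", "rightWrist",
    "leftShoulder", "leftElbow", "leftWrist", "rightHip", "rightKnee",
    "rightAnkle", "leftHip", "leftKnee", "leftAnkle",
    "rightEye", "leftEye", "rightEar", "leftEar",
]


def posenet_to_openpose(coords_yx):
    """Hypothetical helper: (17, 2) PoseNet array -> OpenPose-style dict."""
    coords_yx = np.asarray(coords_yx, dtype=float)
    if coords_yx.shape != (17, 2):
        raise ValueError("expected a (17, 2) array for a single pose")
    # Assumption: input columns are (y, x); swap them to (x, y).
    xy = coords_yx[:, ::-1]
    by_name = dict(zip(POSENET_PARTS, xy))
    # Assumption: approximate the neck as the midpoint of the shoulders.
    by_name["neck"] = (by_name["leftShoulder"] + by_name["rightShoulder"]) / 2.0
    flat = []
    for name in OPENPOSE_PARTS:
        x, y = by_name[name]
        flat.extend([float(x), float(y)])
    return {"people": [{"pose_keypoints_2d": flat}]}
```

For a single detected pose this would be called as `posenet_to_openpose(keypoint_coords[0])`, and the returned dict can be passed directly to `json.dump`.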

Answer 1 · 2019-04-17T19:36:37.000Z

Hi @una-dinosauria, Arash,
I have a follow-up question on the implementation, as per my previous question. I have now managed to get the keypoints written in the following format:

{'people': [{'pose_keypoints_2d': array([[140.59414005, 309.6962719 , 132.67413235, 316.36242962,
        132.39204025, 306.64450312, 133.98109341, 324.79055738,
        133.36865807, 294.94057655, 161.00686073, 327.11033916,
        159.45068073, 289.2108078 , 188.53337383, 367.71925831,
        188.15708065, 254.20604885, 228.49087334, 368.40724435,
        227.49163294, 249.81577826, 217.79596138, 319.821661  ,
        217.73108721, 309.10136986, 248.52326822, 365.44466782,
        244.54695415, 378.99501705, 291.03269768, 366.29990757,
        289.71720028, 365.28365541,   0.        ,   0.        ]])}]}

I have modified the code in https://github.com/rwightman/posenet-python/blob/master/posenet/decode.py as follows:

padding_matrix = np.zeros((18, 2))
padding_matrix[:instance_keypoint_coords.shape[0], :instance_keypoint_coords.shape[1]] = instance_keypoint_coords
reshape_matrix = np.reshape(padding_matrix, (1, 36))

# create the dictionary structure
my_dict = {}
new_dict = {}
my_dict["pose_keypoints_2d"] = reshape_matrix
new_dict['people'] = my_dict

# write the json directory and dump it to the file
json_data = json.dumps(new_dict['people']["pose_keypoints_2d"].tolist())
# print("------------printing json data------------------")
# print(json_data)

with open('000000000000_keypoints', 'w') as outfile:
    json.dump(json_data, outfile)

If I understand correctly, the keypoints need to be in a specific format for the real-time 3d-pose-baseline script to work. I am just trying to make openpose_3dpose_sandbox_realtime.py accept those keypoints, and I am not sure what the solution is.

When I run openpose_3dpose_sandbox_realtime.py I get the following error message:
string indices must be integers

I am currently clueless as to what can be done to make the 3D pose estimator accept the 2D detections.
Any help would be greatly appreciated.
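One plausible cause of this message, sketched below: the snippet calls `json.dumps` and then passes the resulting string to `json.dump`, so the file holds a single JSON string literal instead of an object. It also serializes only the inner array, so the "people" key never reaches the file. If the reader script does something like `data["people"]` after `json.load` (an assumption, since its source is not shown in this thread), indexing the loaded string would raise exactly this kind of TypeError. The file path and payload here are stand-ins for illustration.

```python
import json
import os
import tempfile

# Stand-in payload with the structure shown in the original question.
payload = {"people": [{"pose_keypoints_2d": [0.0] * 36}]}
path = os.path.join(tempfile.mkdtemp(), "000000000000_keypoints.json")

# Double encoding: json.dumps() returns a str, and json.dump() then writes
# that str as one JSON string literal rather than as an object.
with open(path, "w") as outfile:
    json.dump(json.dumps(payload), outfile)
with open(path) as infile:
    double_loaded = json.load(infile)
print(type(double_loaded).__name__)  # str

try:
    double_loaded["people"]
except TypeError as exc:
    print("TypeError:", exc)

# Single encoding: pass the dict straight to json.dump().
with open(path, "w") as outfile:
    json.dump(payload, outfile)
with open(path) as infile:
    single_loaded = json.load(infile)
print(len(single_loaded["people"][0]["pose_keypoints_2d"]))  # 36
```

Note that "people" is a list of dicts in the expected format, whereas the snippet above assigns a dict to it directly.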

Answer 2 · 2019-04-25T19:02:49.000Z

Hello, I think I have managed to port the posenet-python output to 3d-pose-baseline. However, I have not yet checked the results.

Diff

diff --git a/webcam_demo.py b/webcam_demo.py
index b9f1b19..0530407 100644
--- a/webcam_demo.py
+++ b/webcam_demo.py
@@ -2,6 +2,10 @@ import tensorflow as tf
 import cv2
 import time
 import argparse
+import json
+import os
+import errno
+import numpy as np
 
 import posenet
 
@@ -20,7 +24,7 @@ def main():
         model_cfg, model_outputs = posenet.load_model(args.model, sess)
         output_stride = model_cfg['output_stride']
 
-        cap = cv2.VideoCapture(args.cam_id)
+        cap = cv2.VideoCapture('Squat.mp4')
         cap.set(3, args.cam_width)
         cap.set(4, args.cam_height)
 
@@ -43,20 +47,48 @@ def main():
                 output_stride=output_stride,
                 max_pose_detections=10,
                 min_pose_score=0.15)
+                
+            #Making folder 
+            try:
+                os.makedirs('output_json_dir')
+            except OSError as e:
+                if e.errno != errno.EEXIST:
+                    raise
+            
 
             keypoint_coords *= output_scale
-
+            #reshaping from 10x17x2 to 17x2
+            keypoint_coords_17_reshape = keypoint_coords[0, :, :]
+            #print(frame_count)
+            #print(keypoint_coords_17_reshape)
+            #keypoint_coords_17 = keypoint_coords_17_reshape[frame_count]
+            # reshaping from 17 x2 to 16x2 
+            #keypoint_coords_16_reshape = np.delete(keypoint_coords_17_reshape, 0, axis = 0)
+            padding_matrix = np.zeros((18, 2))
+            padding_matrix[:keypoint_coords_17_reshape.shape[0],:keypoint_coords_17_reshape.shape[1]] = keypoint_coords_17_reshape
+            reshape_matrix = np.reshape(padding_matrix,(1,36))
+            #print (reshape_matrix)  
+            print(frame_count)
+            dc = {"people":[]}
+            dc["people"].append({"pose_keypoints_2d" : reshape_matrix.tolist()})
+            if 'output_json_dir':
+            #Writing perframe 
+                with open(os.path.join('output_json_dir', '{0}_keypoints.json'.format(str(frame_count).zfill(12))), 'w') as outfile:
+                    json.dump(dc, outfile)
+            
+

Sample output

{"people": [{"pose_keypoints_2d": [[33.84212351733853, 228.12424132733196, 28.958085101583727, 232.82001895970538, 29.633724733909464, 224.2092626466058, 32.63927270166622, 237.56183020631335, 36.50913925644774, 220.6442365019379, 55.02919830890916, 241.53741533451014, 55.772777818004535, 211.75396898791038, 85.56947032857386, 248.50610967655908, 84.85119363536005, 214.25719557501452, 115.81537329632303, 251.13081590197078, 111.24475798991895, 202.06230670110577, 110.00015685276955, 237.04345998549542, 103.92863113687645, 217.5115226313317, 155.1378996475883, 240.40812478510978, 147.22127641950334, 206.64174837884605, 194.14900773800676, 250.57566848345692, 187.6395447061669, 203.63684815931484, 0.0, 0.0]]}]}
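The sample output above holds the 36 values inside an extra list level, because reshaping to (1, 36) and calling `tolist()` produces a nested list, whereas the format in the original question is a flat list. Whether the 3D script tolerates the nesting is not confirmed in this thread. A hedged sketch of a per-frame writer that emits the flat form follows; `write_frame_keypoints` is a hypothetical helper, and `os.makedirs(..., exist_ok=True)` assumes Python 3.2 or newer.

```python
import json
import os
import tempfile

import numpy as np


def write_frame_keypoints(out_dir, frame_index, keypoints_17x2):
    """Hypothetical helper: pad to 18 points and write one flat JSON file."""
    os.makedirs(out_dir, exist_ok=True)  # replaces the errno.EEXIST check
    padded = np.zeros((18, 2))
    padded[:17, :] = np.asarray(keypoints_17x2, dtype=float)
    flat = padded.reshape(-1).tolist()  # 36 numbers, no extra nesting
    name = "{0}_keypoints.json".format(str(frame_index).zfill(12))
    path = os.path.join(out_dir, name)
    with open(path, "w") as outfile:
        json.dump({"people": [{"pose_keypoints_2d": flat}]}, outfile)
    return path


# Usage with stand-in data in a temporary directory.
demo_dir = os.path.join(tempfile.mkdtemp(), "output_json_dir")
demo_path = write_frame_keypoints(demo_dir, 7, np.ones((17, 2)))
with open(demo_path) as infile:
    demo = json.load(infile)
print(os.path.basename(demo_path))  # 000000000007_keypoints.json
print(len(demo["people"][0]["pose_keypoints_2d"]))  # 36
```

This keeps the zero padding from the diff as-is; it does not reorder keypoints or add a neck point.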

Answer 3 · 2019-05-23T19:29:52.000Z

Closing for lack of activity. Please reopen if the issue is still ongoing.