lambdaloop/aniposelib

Calibration detects 0 boards

medaigle opened this issue · 14 comments

Hi,

I have a 5-camera DLC project with a separate network trained for each camera so I came here to calibrate and triangulate them. I am running aniposelib 0.3.9 on Windows 10 as shown in the Aniposelib Tutorial:

vidnames = [['cam1-14-59-59-calibration.avi'],
            ['cam2-14-59-59-calibration.avi'],
            ['cam3-14-59-59-calibration.avi'],
            ['cam4-14-59-59-calibration.avi'],
            ['cam5-14-59-59-calibration.avi']]

cam_names = ['1', '2', '3', '4', '5']

n_cams = 5

board = Checkerboard(7, 9, square_length=1)

cgroup = CameraGroup.from_names(cam_names)

cgroup.calibrate_videos(vidnames, board)

cgroup.dump('calibration.toml')

Each video appears to be fully processed, but the output is "0 boards detected" for each video. This is not a very helpful error... Am I calling the function incorrectly? Are my videos bad? (I've tried .mp4's as well with the same result). Is there a way that I can provide some intermediate supervision? The DLC 3D calibration process, for instance, seems much more involved, but only supports 2 cameras. Attached are a few of the calibration videos.

Thanks!
Maria

cam3-calibration.zip

cam2-calibration.zip

cam1-calibration.zip

Hi @medaigle ,
Thank you for taking the time to write this issue!

I checked your videos and the checkerboards are actually 6 x 8, counting by the inner corners of the board.

If you would like to check the points detected, you can enable manually_verify=True when declaring the checkerboard:

board = Checkerboard(6, 8, square_length=1, manually_verify=True)

I suppose we have not documented this clearly enough. I'll update the aniposelib tutorial to clarify that the checkerboard refers to the inner corners, and to advertise the manually verify feature.

Hi @lambdaloop, thanks for the quick response! Yes, counting "inner corners" rather than counting "squares" would be more clear.

I'm back with lots of questions and suggestions... so I'll summarize at the beginning:

Main question:
Why doesn't my 3D visualization output look like a hand?

Suggestions:

  • a back or undo button for manual verification
  • an example or explanation of what should be expected for the output of calibration
  • documentation for filters
  • further explanation of functions (i.e., what it does conceptually and when to use it)

Here are the details:
I got through the calibration and triangulation without filtering (though I had used a DLC median filter) and my 3D visualization output makes no sense. The x, y, z coordinates of several joints look unlikely as well (a few enormous jumps up to 10000 while the rest of the points are in the [-10, 10] range), although a few seem reasonable (entirely in the [-20, 20] range). So I'm trying to figure out where this may have gone wrong. Here's an example frame from all 5 viewpoints and the 3D result.
[Attached images: example frames from cam1 through cam5, and the 3D result (frame706-3D)]

When calibrating with manual verification, I accidentally accepted about 5 samples that were drastically incorrect (see example board detection).
[Attached image: Calibration-error, an example of an incorrect board detection]
There's no back button or undo option, which really slowed me down in trying to make sure that no erroneous boards slipped through.
Would these bad detections be enough to significantly throw off calibration? I'm not sure exactly what the output of calibration is supposed to look like or what a "good" error and optimality would be, but I noticed that in the tutorial the output of calibration said "ftol termination condition is satisfied" whereas mine did not in the last iteration. Here's the end of my output -- does this look reasonable?

The maximum number of function evaluations is exceeded.
Function evaluations 200, initial cost 2.5066e+10, final cost 4.1465e+07, first-order optimality 9.94e+07.
{(0, 2): (144, array([ 91.46602327, 214.93043634])),
 (0, 3): (1000, array([ 75.76532831, 172.58085768])),
 (0, 4): (336, array([339.64797281, 789.84527614])),
 (1, 2): (1000, array([ 22.75265184, 133.37567802])),
 (1, 4): (1000, array([17.68108142, 98.94017355])),
 (2, 3): (1000, array([ 3.11245682, 30.47149031])),
 (2, 4): (1000, array([ 4.67273327, 25.29479874]))}
error:  48.566364261641965

I also wanted to let you know that after calibration finished, the manual verification window did not close automatically -- it stopped responding and I had to force it to close which killed my kernel.

Would running this with filtering help here? I do have a lot of NaNs in the points variable (I've tried a few different score thresholds). How do you use filters? A Ctrl-F search in the API reference for "filter" finds nothing. What about the other types of triangulation? triangulate_possible sounds potentially useful, but the one sentence written about it is not really enough for me to understand what it does or when I would want to use it. I tried running it and got this error:
[Attached image: the error from triangulate_possible]

All my code is pulled straight from the tutorial, but I'll share it here in case it's another simple mistake.

import numpy as np

from aniposelib.boards import Checkerboard
from aniposelib.cameras import CameraGroup
from aniposelib.utils import load_pose2d_fnames

board = Checkerboard(6, 8, square_length=1, manually_verify=True)
cgroup = CameraGroup.from_names(cam_names)
cgroup.calibrate_videos(vidnames, board) 
cgroup.dump('calibration.toml')

cgroup = CameraGroup.load('calibration.toml')
fname_dict = {
    '1': 'cam1-11-45-03DLC_resnet50_it1_filtered.h5',
    '2': 'cam2-11-45-03DLC_resnet50_it1_filtered.h5',
    '3': 'cam3-11-45-03DLC_resnet50_it1_filtered.h5',
    '4': 'cam4-11-45-03DLC_resnet50_it1_filtered.h5',
    '5': 'cam5-11-45-03DLC_resnet50_it1_filtered.h5',
}

d = load_pose2d_fnames(fname_dict, cam_names=cgroup.get_names())

score_threshold = 0.8

n_cams, n_points, n_joints, _ = d['points'].shape
points = d['points']
scores = d['scores']

bodyparts = d['bodyparts']

points[scores < score_threshold] = np.nan

points_flat = points.reshape(n_cams, -1, 2)
scores_flat = scores.reshape(n_cams, -1)

p3ds_flat = cgroup.triangulate(points_flat, progress=True)
reprojerr_flat = cgroup.reprojection_error(p3ds_flat, points_flat, mean=True)

p3ds = p3ds_flat.reshape(n_points, n_joints, 3)
reprojerr = reprojerr_flat.reshape(n_points, n_joints)
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.pyplot import get_cmap
import matplotlib.pyplot as plt
%matplotlib notebook

def connect(ax, points, bps, bp_dict, color):
    ixs = [bp_dict[bp] for bp in bps]
    return ax.plot(points[ixs, 0], points[ixs, 1], points[ixs, 2], color=color)

def connect_all(ax, points, scheme, bodyparts, cmap=None):
    if cmap is None:
        cmap = get_cmap('tab10')
    bp_dict = dict(zip(bodyparts, range(len(bodyparts))))
    lines = []
    for i, bps in enumerate(scheme):
        line = connect(ax, points, bps, bp_dict, color=cmap(i)[:3])
        lines.append(line)
    return lines

scheme = [
   ["ThumbTop", "ThumbTip"],
   ["IndexTop", "IndexTip"], 
   ["MiddleTop", "MiddleTip"],
   ["RingTop", "RingTip"],
   ["PinkyTop", "PinkyTip"],
   ["ThumbTop", "IndexTop", "MiddleTop", "RingTop", "PinkyTop"] 
]

framenum = 2350
p3d = p3ds[framenum]

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(p3d[:,0], p3d[:,1], p3d[:,2], c='black', s=100)
connect_all(ax, p3d, scheme, bodyparts)

Also, I can't find your preprint but I'd like to read it. Where can I get it?

Thanks!!

Hi @lambdaloop,

I think I might have discovered the source of this issue. I wasn't sure whether it still counts as a calibration problem or whether I should open a new issue for it. I found your preprint, and now I better understand how you perform the calibration.

When I run the calibration function, I get this output:

defaultdict(<class 'int'>,
            {('1', '3'): 3,
             ('1', '4'): 72,
             ('1', '5'): 7,
             ('2', '3'): 423,
             ('2', '5'): 33,
             ('3', '1'): 3,
             ('3', '2'): 423,
             ('3', '4'): 33,
             ('3', '5'): 1136,
             ('4', '1'): 72,
             ('4', '3'): 33,
             ('5', '1'): 7,
             ('5', '2'): 33,
             ('5', '3'): 1136})
error:  36.35285375837613
n_samples: 100
{(0, 2): (144, array([27.28000866, 57.95695531])),
 (0, 3): (1000, array([ 4.52688032, 66.4938051 ])),
 (0, 4): (336, array([ 22.40800078, 166.34894105])),
 (1, 2): (1000, array([  7.92038916, 211.14570104])),
 (1, 4): (1000, array([ 49.40075883, 155.65871001])),
 (2, 3): (1000, array([0.64240532, 7.12293657])),
 (2, 4): (1000, array([ 7.6097086 , 74.21564821]))}
error: 35.97, mu: 49.4, ratio: 0.565

I am assuming the defaultdict contains the number of boards detected for each pair of cameras; is this correct? With this assumption, I replicated your Figure 8 for my camera setup to see if it ends up as a fully connected graph and I believe it does not.

[Attached image: calibration graph for my camera setup]

In my setup, the 4 outside cameras are on the same plane, while the middle camera is a view from above. With these numbers, is it possible to create a fully connected minimal graph? Or is my camera setup fundamentally inadequate for this calibration method?

Thanks!

Hello @medaigle ,

Thank you again for your interest and for braving through the code.
I apologize for my slow response. I hope that it hasn't caused you too much frustration...
I have struggled with timely responses. I will respond within a day from now on.

So, to go back to your first question (why does the hand look wrong?): as you realized in your follow-up posts, there seems to be some problem with your calibration. Specifically, the calibration error in all the examples you've shown is much too high (48.6 pixels in one and 36 pixels in another, whereas you should be getting under 3 pixels).

Now, why are you not getting a good calibration? I will try to go through each of the possible reasons you speculated.

  1. Could it be that there are some bad board detections? Could it be that the bad board detections you passed through using manual verification caused some trouble?
    They generally should not. You have a lot of detected images, so a few bad detections should not affect the calibration. In fact, the calibration procedure was specifically designed to minimize the influence of bad board detections, since we ran into a few of those ourselves.

  2. Could it be that the setup does not form a connected graph?
    Your setup does form a connected graph. We should clarify in the Anipose preprint that a "connected" graph means you can reach any node from any other node, not that every pair of nodes is directly connected. This is only used to initialize the camera parameters for the optimization, and judging from your output, that step succeeded well enough. (See the sketch right after this list for a quick way to check connectivity from the detection counts.)
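As a quick illustration, here is a rough sketch of checking that reachability directly from the pair counts you posted earlier (assuming, as you guessed, that the dictionary maps camera-name pairs to the number of frames in which both cameras saw the board):

from collections import defaultdict

# Shared board detections per camera pair, copied from the output above
# (only one direction of each pair is needed)
pair_counts = {('1', '3'): 3, ('1', '4'): 72, ('1', '5'): 7,
               ('2', '3'): 423, ('2', '5'): 33,
               ('3', '4'): 33, ('3', '5'): 1136}

# Build an undirected adjacency list from every pair with at least one detection
graph = defaultdict(set)
for (a, b), count in pair_counts.items():
    if count > 0:
        graph[a].add(b)
        graph[b].add(a)

# Simple traversal from one camera; the graph is "connected" in the sense used
# for initialization if every camera is reachable from the starting camera
cams = ['1', '2', '3', '4', '5']
seen, stack = {cams[0]}, [cams[0]]
while stack:
    node = stack.pop()
    for neighbor in graph[node]:
        if neighbor not in seen:
            seen.add(neighbor)
            stack.append(neighbor)

print('connected' if seen == set(cams) else 'not connected')  # prints: connected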

Okay, so what could it be? I'm not exactly sure but you may want to try playing around a bit with the calibration parameters to see if you can get it to converge better.

So in the line:

cgroup.calibrate_videos(vidnames, board) 

You can add the following parameters (along with default values):

cgroup.calibrate_videos(vidnames, board, 
                        n_iters=10, start_mu=15, end_mu=1,
                        max_nfev=200, n_samp_iter=100, n_samp_full=1000,
                        error_threshold=0.3,) 

As described in the Anipose preprint, the calibration procedure runs a series of iterative thresholding steps to select points to calibrate and then performs an optimization on those points:

  • start_mu / end_mu: the thresholds go from start_mu down to end_mu
  • n_iters: the number of thresholding iterations
  • max_nfev: the maximum number of optimization function evaluations at each thresholding step
  • n_samp_iter: the number of points sampled at each iteration
  • n_samp_full: at the very end, one final optimization is performed on this many randomly sampled points

It may be easier to play around with the parameters if you split the board detection step and the calibration step.
You can do this by replacing calibrate_videos with calibrate_rows in this way:

# get detections
all_rows = cgroup.get_rows_videos(vidnames, board)
# update initial parameters based on camera sizes
cgroup.set_camera_sizes_videos(vidnames)
# run calibration
cgroup.calibrate_rows(all_rows, board,
                      n_iters=10, start_mu=15, end_mu=1,
                      max_nfev=200, n_samp_iter=100, n_samp_full=1000,
                      error_threshold=0.3,
                      verbose=True)

As another possible thing to try, the calibration code within Anipose is actually:

# do iterative thresholding only 2 times to get in rough position
cgroup.calibrate_rows(all_rows, board,
                      init_extrinsics=True, init_intrinsics=True,
                      max_nfev=100, n_iters=2,
                      n_samp_iter=100, n_samp_full=300,
                      verbose=True)
# perform longer optimization (restarting from start_mu=15 pixels)
cgroup.calibrate_rows(all_rows, board,
                      init_intrinsics=False, init_extrinsics=False,
                      max_nfev=100, n_iters=10,
                      n_samp_iter=100, n_samp_full=1000,
                      verbose=True)

I hope that this is enough to resolve your calibration. If not, please let me know and we can try to figure out the root cause and update aniposelib to be more robust.

You also had some questions about filtering and triangulation.

The simplest thing to try could be to run RANSAC triangulation, which you can call by replacing triangulate with triangulate_ransac.
triangulate_possible is a more general implementation of RANSAC, which allows you to pass a number of options for each 2D point. Hence, it needs a different array shape than triangulate and triangulate_ransac.
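For instance, here is a rough sketch of dropping it into your snippet from earlier (reusing your points_flat, n_points, and n_joints; depending on the aniposelib version, triangulate_ransac may return extra diagnostics alongside the Nx3 points, so check what it actually gives back):

# RANSAC triangulation in place of the plain call; unpack defensively in case
# the function returns a tuple of (points, diagnostics, ...) in your version
out = cgroup.triangulate_ransac(points_flat, progress=True)
p3ds_flat = out[0] if isinstance(out, tuple) else out

reprojerr_flat = cgroup.reprojection_error(p3ds_flat, points_flat, mean=True)
p3ds = p3ds_flat.reshape(n_points, n_joints, 3)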

triangulate_optim implements the novel "spatiotemporal constraints" filters that we describe in the paper. If you're interested, I can also write up a description of how to use that as well.

Hi @lambdaloop,

Thank you! It is helpful to understand more of the details and very helpful to be able to split up the board detection and calibration. I've been playing around with a couple parameters, but I have been getting the opposite of convergence... the best error I got out of any attempt was 86. Am I still doing something wrong?

Note, these are new calibration videos with a slightly rearranged camera setup that theoretically should calibrate better. Also, I will definitely be interested in using your spatiotemporal constraints, assuming we can get calibration to work.

I'll give you a few examples of what I tried (in order, in case that matters) and the beginning and end of the output; let me know what further details would be helpful. My assumption is that setting init_extrinsics=True and init_intrinsics=True, or not specifying these inputs at all, resets the calibration procedure, so it doesn't matter what was run before it. Is this correct? If so, why doesn't it always start with the same error? If not, how do I reset it?

# get detections
all_rows = cgroup.get_rows_videos(vidnames, board)
# update initial parameters based on camera sizes
cgroup.set_camera_sizes_videos(vidnames)

# do iterative thresholding only 3 times to get in rough position
cgroup.calibrate_rows(all_rows, board,
                      init_extrinsics=True, init_intrinsics=True,
                      max_nfev=100, n_iters=3,
                      n_samp_iter=100, n_samp_full=300,
                      verbose=True)
# perform longer optimization (restarting from start_mu=15 pixels)
cgroup.calibrate_rows(all_rows, board,
                      init_intrinsics=False, init_extrinsics=False,
                      max_nfev=100, n_iters=16, start_mu=15, end_mu=1,
                      n_samp_iter=100, n_samp_full=1000,
                      verbose=True)
defaultdict(<class 'int'>,
            {('1', '2'): 2457,
             ('1', '3'): 1581,
             ('1', '4'): 1148,
             ('1', '5'): 3672,
             ('2', '1'): 2457,
             ('2', '5'): 3753,
             ('3', '1'): 1581,
             ('3', '4'): 2856,
             ('4', '1'): 1148,
             ('4', '3'): 2856,
             ('5', '1'): 3672,
             ('5', '2'): 3753})
error:  110.07027105864645
n_samples: 100
{(0, 1): (899, array([ 46.19590113, 144.59861022])),
 (0, 2): (897, array([ 68.71313085, 206.48215016])),
 (0, 3): (897, array([ 52.84082868, 191.75753193])),
 (0, 4): (899, array([ 41.55542612, 144.63040256])),
 (1, 4): (899, array([ 47.85093761, 144.23976387])),
 (2, 3): (897, array([ 72.70425431, 199.69245853]))}
error: 110.99, mu: 72.7, ratio: 0.243
......
   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality   
       0              1         1.8654e+08                                    1.91e+13    
       1              6         1.5866e+08      2.79e+07       2.10e+02       1.08e+12    
       2              7         1.2230e+08      3.64e+07       4.52e+02       6.91e+11    
.....
      36             55         5.0892e+07      5.31e+04       3.50e+00       1.69e+15    
      37             59         5.0892e+07      8.66e+02       1.10e-01       1.71e+15    
`ftol` termination condition is satisfied.
Function evaluations 59, initial cost 1.8654e+08, final cost 5.0892e+07, first-order optimality 1.71e+15.
{(0, 1): (2974, array([ 17.99840724, 114.35370159])),
 (0, 2): (2954, array([16.54396642, 93.90562056])),
 (0, 3): (2954, array([ 15.93440584, 118.33532541])),
 (0, 4): (2974, array([ 67.35782504, 241.82841156])),
 (1, 4): (2974, array([ 50.04745623, 194.33636545])),
 (2, 3): (2954, array([ 25.72342869, 116.13343729]))}
error:  86.48824696151351
 

Here I saw the error had decreased, so I tried to just do some more iterations to see if it keeps decreasing.

cgroup.calibrate_rows(all_rows, board,
                      init_intrinsics=False, init_extrinsics=False,
                      max_nfev=100, n_iters=16, start_mu=15, end_mu=1,
                      n_samp_iter=100, n_samp_full=1000,
                      verbose=True)

This started at error: 88.3947735953344; it decreased for a few iterations, then started increasing.
It ended up at error: 172.60254741695982

So I tried to reset it:

# update initial parameters based on camera sizes
cgroup.set_camera_sizes_videos(vidnames)
# run calibration
cgroup.calibrate_rows(all_rows, board,
                      n_iters=10, start_mu=15, end_mu=1,
                      max_nfev=200, n_samp_iter=500, n_samp_full=1000,
                      error_threshold=0.3,
                      verbose=True)

But the initial error this time was 6201.335403373839. Several iterations later... I stopped it before it finished because it was clearly not going to converge.


       0              1         3.4114e+42                                    3.16e+53    
       1              9         2.5575e+42      8.54e+41       2.56e+11       2.05e+53    
       2             11         2.5086e+42      4.90e+40       2.37e+07       1.99e+53    
       3             18         2.5084e+42      1.62e+38       1.18e+04       1.99e+53    
Both `ftol` and `xtol` termination conditions are satisfied.
Function evaluations 18, initial cost 3.4114e+42, final cost 2.5084e+42, first-order optimality 1.99e+53.
{(0, 1): (2975, array([1.20252896e+19, 7.72675498e+19])),
 (0, 2): (2953, array([2.37743135e+09, 1.96664434e+10])),
 (0, 3): (2953, array([253.56159618, 447.39217155])),
 (0, 4): (2975, array([1.55241521e+19, 2.42940287e+21])),
 (1, 4): (2975, array([6.64844894e+19, 2.57802270e+21])),
 (2, 3): (2953, array([2.37743145e+09, 1.96664436e+10]))}
error: 2920919970681102848.00, mu: 66484489380833034240.0, ratio: 0.615

I'd love to work with you to try and figure this out. Please let me know what you think and what should be our next steps.
Thanks in advance for your help!!

Hmm yeah clearly something is wrong...
Right now my best guess is that perhaps something is up with the checkerboard detection.
Looking back at your videos, your checkerboard looks the same when rotated by 180 degrees. It could be that it's being detected in different orientations in different cameras, which then makes the calibration struggle.

One simple suggestion would be to try a checkerboard whose inner-corner grid is asymmetric, for example 9x6 (one odd dimension and one even), so that it does not look the same when rotated 180 degrees. Alternatively, you could record videos with a charuco board, where the orientation is uniquely determined by the aruco markers in between the squares.
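As a rough sketch of what the checkerboard declaration for the first option could look like (square_length here is a placeholder; use the real printed square size):

from aniposelib.boards import Checkerboard

# 9 x 6 inner corners: one odd and one even dimension, so the corner ordering
# is unambiguous under a 180-degree rotation
board = Checkerboard(9, 6, square_length=1, manually_verify=True)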

If possible, would you be able to share some calibration videos so I could try to reproduce this issue as well, and see if anything else may be wrong?

Definitely! I was thinking of trying a charuco board also. I might not get to it for about a week, though.

I had also tried scaling back to 3 cameras to see if that would calibrate better. I tried several combinations of 3 cameras out of these 5, and could not get any good calibrations that way either. Then I set up 3 cameras to mimic your tutorial setup as closely as possible and tried to calibrate that. It took me a couple tries but I was able to eventually calibrate it with 1.6 pixel error. The perspectives between cameras in that case were certainly less extreme than my 5 camera setups. So, I think your guess makes sense.

Here's some videos, thanks!
cam1-11-16-29.zip

cam2-11-16-29.zip

cam3-11-16-29.zip

cam4-11-16-29.zip

cam5-11-16-29.zip

Hi @medaigle ,

I spent some time looking into your calibration videos.

I think there are 2 main issues that I see.

  1. I plotted the calibration board detections with plain opencv, and they are not in a consistent orientation, even (in some cases) across frames from the same camera. Here are some examples from cam3.
    [Attached images: mpv-shot0001, mpv-shot0002 (example board detections from cam3)]

  2. Another issue is that for some cameras the board is not fully visible most of the time and thus does not get detected. For instance, in cam1 there are a lot of frames where the board is just barely not fully in the frame, as in the example picture.
    [Attached image: mpv-shot0005 (a cam1 frame with the board partly out of view)]

I think maybe the best bet is to try using a charuco board instead of a checkerboard, as it would help with both of these issues.

Hi @lambdaloop,

Thank you for taking the time to do that! I am trying to use a charuco board now and I'm back to the original problem in this thread -- 0 boards detected.

I assume this is again a simple mistake in defining the board object but I've tried many variations on the parameters and I still haven't figured it out. Here's what I'm running:

vidnames = [['charuco-cam1-16-23-23.mp4'],
            ['charuco-cam4-16-23-23.mp4'],
            ['charuco-cam5-16-23-23.mp4']]

cam_names = ['1', '4', '5']

n_cams = 3

board = CharucoBoard(8, 10, square_length=2, marker_length=1,
                     marker_bits=4, dict_size=50)

cgroup = CameraGroup.from_names(cam_names)

cgroup.calibrate_videos(vidnames, board)

I assume based on the tutorial that the definition of squaresX and squaresY is different for checkerboards and charuco boards -- that is, here I should count the number of squares and not the number of inside corners. Is that correct? Either way, I still get 0 boards detected, so I must be missing something else.

I'm not sure what the dict_size should be, but I experimented with your tutorial files and still got board detections with different dict sizes, so I assume it is not a super important parameter.

What else am I missing?

Here's the board I'm using.

[Attached image: the charuco board used for the camera calibration]

Thanks!!

Hi @medaigle ,

Your code seems correct for the charuco board that you attached.
I'm not sure where you got the charuco board from, but most of them are just made with dict_size = 50. (It's the size of the set of possible aruco markers.) And yes, for a charuco board the first two arguments do count the number of squares, not the number of inner corners. Now that I think about it, it is strange to have this discrepancy...

So if the code is correct, I'm guessing there is just an issue with the board itself. In particular, with an 8x10 board the aruco markers may be too small to detect reliably. If possible, perhaps you could try a 6x8 or even 6x6 board of the same overall size? (A rough sketch of such a declaration is below.)
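For example, a rough sketch assuming the same marker dictionary as your current board and roughly doubled square/marker sizes (the lengths are placeholders; use whatever you actually print):

from aniposelib.boards import CharucoBoard

# 6 x 6 squares at the same overall board size means larger squares and larger
# aruco markers; marker_bits and dict_size must match the printed board
board = CharucoBoard(6, 6, square_length=3, marker_length=1.5,
                     marker_bits=4, dict_size=50, manually_verify=True)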

Hi @lambdaloop,

I discovered the problem with detecting the boards -- I had to flip my videos in the x direction and now it is detecting the expected number of boards. However, I'm still having issues with the calibration.
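(For reference, a rough sketch of one way to do that horizontal flip with OpenCV; the filenames are placeholders, and any other flipping tool should work the same way:)

import cv2

# Mirror a calibration video around the vertical axis before board detection
cap = cv2.VideoCapture('charuco-cam1-16-23-23.mp4')
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter('charuco-cam1-16-23-23-flipped.mp4',
                      cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
while True:
    ok, frame = cap.read()
    if not ok:
        break
    out.write(cv2.flip(frame, 1))  # flipCode=1 flips left-right (x direction)
cap.release()
out.release()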

For my 3-camera setup that worked with the checkerboard, the calibration drops down to under 3 pixel error within a couple of iterations but then it climbs back up into the hundreds. After about 15 iterations I end up with errors in the quintillions and then I get nans.

For my 5-camera setup I can't even perform the calibration calculations; it appears to have some invalid values. Here's the specific error that I get:
[Attached images: Error-5cam-charuco-calib, Error-5cam-charuco-calib2]

I checked out the detections using the manually verify feature and in both cases it seems like the detection of the markers is poor quality most of the time. I'm surprised, as these are 1.6MP cameras and it doesn't seem to matter how much of the frame is taken up by the board.

I've been experimenting a bit with lighting and file types, with some improvement, but in any case not all markers are identified and a few IDs show up where there is no marker. Is this expected behavior, or how would you suggest improving it? A board with larger markers is still on my to-do list.

Here are some example videos as well.
3-cam-charuco.zip

Thanks!!

Hey @lambdaloop,

I have a small update. It seems these problems are mostly specific to these videos, as I now have one set of calibration videos with 4 cameras (same board) that behaves pretty normally -- the error hovers around 4.6 no matter how many iterations I run. It ends up with 3.3 pixel error if I run calibrate_videos, or 4.6 pixel error with calibrate_rows even with lots of iterations. Although, looking at the detections with manually verify I'm still not impressed; only a small fraction of markers are usually detected. This leads me to a new question: is a pixel error of less than 3 always enough to determine that a calibration is accurate?

This calibration does blow up, though, if I run calibrate_videos followed by get_rows_videos, set_camera_sizes_videos, and calibrate_rows in the same session or if I use another set of videos and switch to this set of videos in the same session.

I went back to the 3-camera and 5-camera sets I mentioned in the previous post and verified that the problems are not just because of the order I was running functions.

Not sure what this means besides keep trying to make better quality videos. What do you think?

Thanks!

Hi again @lambdaloop,

It's the same story with a 6x8 calibration board that has markers about double the size. But now I am extra confused, because the results of calibration seem random.

I ran the exact same code on the same set of videos a bunch of times; twice it calibrated properly and ended up with an error of about 3.5 and the other times (at least 4, I don't remember how many times I ran it because I was confused...) it blew up into nans.
I tried this replication experiment because it worked the first time, yesterday, and then today I wanted to go back and try triangulation as well. But when I ran the calibration today, it didn't work. So I ran it several more times trying to figure out what was going wrong and it continued to end up with nans, until it worked once more.

The triangulation with the good calibration also doesn't make any sense -- it seems like it's predicting the same point for every frame (the x, y, z coordinate plot is just 3 flat lines). It is not a DLC problem -- I verified that the h5 files are not flat lines.

So, I'm very confused and I have no idea what to do now. Any thoughts? I'll email you all the relevant files complete with the full outputs so you can hopefully see what's going on.

Thanks!!