umyelab/LabGym

Issue about Detector precision

Closed this issue · 68 comments

Hi,

I am trying to improve my detector accuracy, but even after annotating 160 image examples, the detector still only scores ~70% mean average precision. I looked at the annotation results from testing the detector and found that most errors probably come from redundant detections of the same object. For example, in the attached detector test result image, the head of the mouse is annotated twice with different confidence scores. The same happens for the annotation of feces in the same test example.

Would anyone know how to solve this issue?

Thanks in advance,
Albert


Hi,

First, I need some more information about how you trained the Detector:

  1. What is the inferencing frame size? And what is the original frame size of your video? Did you do any resizing?
  2. What is the iteration number?
  3. When you generated the dataset in Roboflow, did you apply any preprocessing steps or augmentation methods? If so, which ones?

The head is hard to distinguish from the body because there's no clear boundary between them. That means there are no obvious visual features for the Detector to learn which portion is the head and which is the body.

You can probably refine the labeling as shown in the attached image. I guess you annotated the body and head as in the top panel, right? If so, try labeling them as in the bottom panel. Making the two parts more separate might increase the chance for the Detector to learn to distinguish them.

Hi Yujia,

Thank you for your help. Here is more info about what I did:

  1. The inferencing frame size is 640. The original frame size is 640X480. I did not do any resizing.
  2. The iteration number is 200.
  3. I did not preprocess in Roboflow. I did follow the recommendation to augment in Roboflow by flipping both horizontally and vertically and by increasing and decreasing brightness by 10%.

You are right, I do have overlap between head and body annotations. Thanks, I will try your way of annotating.

By the way, what is considered an acceptable mAP? Should I aim for >90%?

Thanks a lot,
Albert

Hi Albert,

All the settings are good except for the iteration number. In your case, more iterations are required for good accuracy. You can train a new Detector at 640 inferencing frame size and 5,000 iterations. If the illumination doesn't change much across your videos, you don't need the "brightness" augmentation; instead, add "90 degree rotation". The "flipping" and "rotation" maximally mimic the scenarios in which the mouse appears in different locations in a frame. Roboflow roughly doubles the original number of images each time it augments. So, to maximize the number of augmented images, don't use the two augmentation methods in combination; use them one after another. For example, first use "flipping" to generate a dataset, export it, and re-upload it to create a new project, which gives you 2X the original amount. Then use "rotation" to generate a dataset again, and you will end up with 4X the original amount.
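As a quick sanity check of that arithmetic (a trivial sketch, using the 160 annotated images from this thread as the starting point):

    # Sequential augmentation in Roboflow, as described above: each pass roughly
    # doubles the dataset, so two passes give ~4X the original number of images.
    original_images = 160
    after_flipping = original_images * 2     # export, re-upload as a new project
    after_rotation = after_flipping * 2      # augment the re-uploaded dataset again
    print(after_flipping, after_rotation)    # 320 640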

The mAP is just a number. For a typical scenario, over 80% should be good. In your case, I think it's more important to see whether the Detector can distinguish the head from the body than how precise the boundaries between them are (there are actually no clear visible boundaries anyway). So I suggest you just go to the "Analyze Behavior" module, select "Not categorize behavior, just track animals", choose the trained Detector, and run it on your videos. It will output annotated videos in which every detected part is annotated. Watch those videos to get an idea of whether the Detector is able, or has the potential, to distinguish the head from the body.

Hi Yujia,

Thanks for the advice. I retrained a detector with the same old segmentation but with 5000 iterations.
Now I seem to have a better detector, judging from the automatic segmentation results (like the one attached).
But my mAP is just 67.9%. You said it's just a number, so in my case, I shouldn't be concerned about it?

Thanks,
Albert


Hi Albert,

The mAP measures how precise the bounding boxes are. For example, if the body and head are detected but the detected boundary between them is a little off from what you annotated, the mAP will be low. This is tricky because there aren't clear, visible boundaries between the head and the remaining body; they look uniform. If my understanding is correct, when you annotated them, you arbitrarily defined a boundary between them.

In this image, the detected body includes the entire body, and I guess when you annotated the body, you only annotated part of it, right? If so, this explains why the mAP is low. But your goal is to distinguish the head from the body, right? If so, you can just check whether the Detector can distinguish the head from the body in all your testing images, and probably also in the videos you will analyze, rather than just trying to improve the mAP value.

My previous suggestion on annotation was to make the "arbitrary boundary" between the head and the remaining body more distinct, to help the Detector learn better.
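To make the mAP point concrete, here is a minimal sketch (not LabGym or detectron2 code; the boxes and numbers are made up for illustration) of how intersection-over-union drops when a detected boundary is only slightly off from an arbitrarily chosen annotated one, which in turn drags down mAP at the stricter IoU thresholds:

    def iou(box_a, box_b):
        """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
        x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    annotated_head = (100, 100, 160, 150)   # the arbitrarily chosen head boundary in the annotation
    detected_head = (100, 100, 180, 150)    # the detection extends a bit into the body
    print(iou(annotated_head, detected_head))  # 0.75: counted as a miss at IoU thresholds above 0.75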

Thanks Yujia!

I actually overlay the "head" annotation on top of the "body" annotation, and the "body" overlays the "mouse" annotation. So yes, they are overlapping. Kindly let me know if this is not a good approach to segmentation. From the detector testing results, it looks good to me now; it is just that the mAP is low.

And yes, I am modifying the segmentation in my training images according to your advice and will train another detector.

I would like to ask another question if you don't mind. How can I export pose data from LabGym after I have trained a good detector? Or can LabGym only be used for behaviour annotation but not pose tracking?

Thanks a lot,
Albert

Yes, if the annotations in the testing images look good to you, you don't need to worry too much about the mAP value. And if the overlapping annotation works for your purpose, you don't need to follow my suggestion. I didn't know your analysis goals, and I haven't tried the overlapping annotation myself, which doesn't mean it won't work well.

LabGym doesn't output pose data because there are already several existing tools specialized for this, like DLC and SLEAP. LabGym does what they don't do: categorizing behaviors and calculating quantitative measurements for each behavior. The Detector in LabGym only outputs masks of the detected objects for behavior categorization.

Hi Yujia,

Thanks for your explanation all along. I have another question, about the bt tag in the video file name when I generate behaviour examples. I notice the generation doesn't start exactly at the time I specify, but 1.3 seconds earlier. For example, if I input XX_b2.5.avi to generate behaviour examples, the first behaviour example is actually generated at 1.2 seconds into the video.

Maybe I have done something wrong? Would you have any idea about this?

Thanks again,
Albert

The start time in both generating examples and analyzing behaviors is always "xx frames" earlier than what you set, where "xx frames" is exactly the duration of your behavior examples. In your case, the duration of a behavior example is 1.3 seconds, according to the video fps and the "xx frames" you set.

The start time, especially during behavioral analysis, is set this way because the identification of a behavior event always happens after LabGym has acquired the "xx frames". The "xx frames" is the duration of a behavior example, and it is also the sliding time window in which LabGym acquires frames and performs behavioral identification during analysis. For example, if you want the analysis to start at the 20th frame, meaning the categorization of the first behavior event happens at the 20th frame, and the duration of behavior examples (the sliding time window of behavior categorization) is 10 frames, then LabGym needs to acquire 10 frames prior to the target start time (the 20th frame) to categorize the first event. So the actual analysis start time is the 20-10=10th frame.

The start time for generating behavior examples is set the same way, just to be consistent with the analysis. So if you generate behavior examples and want the generation to start exactly at the 2.5th second, and the duration of behavior examples is 1.3 seconds, you can set "_b3.8".
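A minimal sketch of that start-time arithmetic (plain Python, not LabGym's internal code; the numbers follow the example in this thread):

    example_duration = 1.3   # seconds; = frames per behavior example / video fps
    desired_start = 2.5      # second at which you want generation/analysis to actually begin

    # LabGym starts acquiring frames one example-duration before the time you set,
    # so the value to put after "_b" is the desired start plus the example duration:
    print(f"_b{desired_start + example_duration:.1f}")   # _b3.8

    # Conversely, if the file is labeled _b2.5, generation actually begins 1.3 s earlier:
    print(f"{2.5 - example_duration:.1f}")   # 1.2 s, matching the observation above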

Thanks Yujia. I understand now.

So in my case, let's say I have a 16-second video (VidXX.avi) which contains 3 episodes of the mouse "standing up". The first episode starts at 2 seconds and lasts 0.5 second; the second starts at 7 seconds and lasts 1 second; the third starts at 10 seconds and lasts 2 seconds. To generate these three examples in LabGym, is it correct that the most efficient way is to cut out the parts of the video so that each short extract fully contains the behaviour episode I am interested in?

Or should I triplicate the video VidXX.avi and label each copy with the bt that corresponds to the start time of one of the three episodes? And set the generation duration in LabGym to 2 seconds (the longest of the three episodes)? In that case, since the durations of the three episodes differ, should the number of frames for example generation also match the longest episode?

Hope my question makes sense. This is important because I need to know what to put in the _bt label when I generate behaviour examples from a large batch of videos (the target number of examples for each behaviour of interest is at least 100, right?).

Thanks,
Albert

Typically, to build a "good" behavior example dataset, you can:

  1. Select several videos that contain ALL behaviors that may occur during the analysis (you can directly use the videos you will analyze), and observe one episode of every behavior, not only the behaviors of interest but also "background" behaviors. An "episode" is the shortest time needed to identify a behavior. So in your case, one episode of "standing up" is 0.5 second, and a standing up that lasts 2 seconds actually contains multiple episodes.

  2. To train a good Categorizer in LabGym, you need to sort the entire ethogram. You can choose the duration of one episode of the longest behavior as the "duration" of the behavior examples to generate. For example, if there are 3 behaviors in total and one episode of A is 0.5 second, one episode of B is 0.7 second, and one episode of C is 1 second, you use 1 second as the duration of a behavior example.

  3. Set a start time and an interval (how many frames to skip between two consecutively generated examples), generate the behavior examples, and test different settings, such as whether to include body parts or what STD value to set, to make the behavior examples most distinguishable between categories.

You don't need to find the time window of each behavior to generate examples. Instead, generate examples from the beginning and watch them to select the best ones, i.e. those that cover an entire episode of a behavior. If you are worried that you may miss an occurrence of a behavior, for example the second standing up, just reduce the interval. In the extreme, you can set the interval to 1, and a behavior example will be generated at every frame; you will never miss an occurrence, but you will have a lot of examples that share overlapping frames.

So back to your 16-second video: you can either use the "preprocessing module" in LabGym to select the three time windows and merge them into one clip so that the clip is full of "standing up", or just generate from the beginning with a small interval, like skipping 5 frames, and watch the examples to select the most appropriate ones.
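A rough sketch of how that schedule plays out for the 16-second video (plain Python for illustration only; the exact indexing in LabGym may differ):

    fps = 30
    example_length = 30      # frames per behavior example (1 second here)
    start_frame = 0          # generate from the beginning
    interval = 5             # frames between the starts of consecutive examples
    total_frames = 16 * fps  # the 16-second video

    windows = []
    frame = start_frame
    while frame + example_length <= total_frames:
        windows.append((frame, frame + example_length))  # (start frame, end frame) of one example, end exclusive
        frame += interval
    print(len(windows), windows[:3])  # 91 overlapping 1-second examples to watch and sort afterwards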

Hopefully this explanation addresses your questions.

Hi Yujia,

Thanks for the clear explanation. I understand now. I previously thought I had to specify the time window of every behaviour episode in each video, at least in my head. But I get it now: it is more about fragmenting long behaviours and sorting the examples afterwards in LabGym.

While doing that I encountered another issue, sorry for so many questions. For example, I have a video named "2_b5.06_n1.avi", from which I generated behaviour example pairs (.jpg and .avi). The generated example pairs are automatically named "2_b5.06_n1_Whole mouse_0_50_len30_std50.jpg" and "2_b5.06_n1_Whole mouse_0_50_len30_std50.avi". However, when I start to sort them, the following error appears:

[ WARN:0@102939.825] global loadsave.cpp:248 findDecoder imread_('/Users/albertfok/Desktop/18att M22664 Open Field image examples source videos/LabGym Generated Behaviour examples/Generated Rearing examples/2_b5.06_n1/Whole mouse_0/2_b5.jpg'): can't open/read file: check file path/integrity
Traceback (most recent call last):
 File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/behavior_examples.py", line 1183, in sort_behaviors
  pattern_image = cv2.resize(
cv2.error: OpenCV(4.9.0) /Users/runner/work/opencv-python/opencv-python/opencv/modules/imgproc/src/resize.cpp:4152: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

It appears that LabGym stops reading the file name at the first dot/decimal point. It works after I rename the file to remove the decimal point in "5.06", but it would be troublesome for me to rename all the behaviour examples. Or should I not use decimal points in the source video names when I generate behaviour examples? The example given in the user instructions file uses a decimal point, though.
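The "2_b5.jpg" path in the error message looks like the name is being split at the first dot rather than at the file extension; a minimal illustration of the difference (not LabGym's actual code):

    import os

    name = "2_b5.06_n1_Whole mouse_0_50_len30_std50.jpg"

    # Splitting at the FIRST dot truncates everything after "5", which would
    # produce the "2_b5.jpg" path seen in the warning above:
    print(name.split('.')[0])          # 2_b5

    # Splitting at the LAST dot (the extension) keeps the full basename intact:
    print(os.path.splitext(name)[0])   # 2_b5.06_n1_Whole mouse_0_50_len30_std50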


Thanks a lot,
Albert

Which version of LabGym did you use? This issue appeared in an old version and I thought it was fixed by @rohansatapathy. Can you take a look at this @rohansatapathy? Thanks!

Also, using the GUI for sorting is just one way. Another way, which I actually prefer, is to view the behavior examples as large icons in the folder that stores them, watch them, and manually drag them into new folders named after the behaviors. This way, you don't need to watch every example one by one in the GUI; you can quickly scan them (by their .jpgs) and sort them into the new folders.
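If there are many examples to sort, a small helper script along these lines (entirely hypothetical, not part of LabGym; the paths and the list of selected basenames are placeholders) can move each .jpg/.avi pair into a behavior-named folder in one go:

    import os
    import shutil

    examples_dir = "/path/to/generated/examples"              # folder holding the generated .jpg/.avi pairs
    rearing = ["2_b5.06_n1_Whole mouse_0_50_len30_std50"]      # basenames you judged to be "Rearing"

    target_dir = os.path.join(examples_dir, "Rearing")
    os.makedirs(target_dir, exist_ok=True)
    for base in rearing:
        for ext in (".jpg", ".avi"):
            src = os.path.join(examples_dir, base + ext)
            if os.path.exists(src):
                shutil.move(src, os.path.join(target_dir, base + ext))  # keep the pair together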

But we’ll definitely fix the issue you encountered in the next update if it’s not fixed already.

Yes, I remember fixing this issue in a previous version of LabGym. @AlbertFok, if you're not using the most recent version of LabGym, please try upgrading and see if that resolves the issue. If that doesn't work, I'll start troubleshooting to see if I can reproduce the issue on my end.

Thanks to both of you for helping. I followed this document to do the installation roughly a month ago: https://github.com/yujiahu415/LabGym/blob/master/LabGym%20user%20guide_v2.2.pdf
So mine should be LabGym v2.2?

I will try to do it without the GUI first. Thanks again.

You can confirm the version number by looking at the title bar in the main LabGym window, which should say something like "LabGym v2.x.y".

On another note, based on the error message you provided earlier, it seems like you installed LabGym using pipx, which means that you must have followed the instructions here.
Therefore, you should be able to upgrade by using the command pipx upgrade LabGym. Please let me know if that helps resolve the issue.

Hi Rohan,

Thanks! The version I have been using is v2.3.4, as shown on the title bar of the GUI.

Albert

Thanks for the info! The bug you’re facing was fixed in version 2.3.5, so if you upgrade to the latest version using the command I sent earlier, you should no longer have that issue. Hope this helps!

Hi again,

I moved on from the behaviour sorting step but unfortunately encountered another error when I started to train a categorizer:

2024-04-10 15:59:42.126 Python[27840:1224985] +[CATransaction synchronize] called within transaction
2024-04-10 15:59:42.286 Python[27840:1224985] +[CATransaction synchronize] called within transaction
Training the Categorizer w/ only Pattern Recognizer using the behavior examples in: /Users/albertfok/Desktop/18att M22664 Open Field image examples source videos/Categorizer prepared examples
Found behavior names: ['Grooming' 'Immobility' 'Rearing' 'back' 'left' 'movement' 'right' 'straight']
Perform augmentation for the behavior examples... This might take hours or days, depending on the capacity of your computer.
2024-04-10 16:00:05.588410 Start to augment training examples...
Traceback (most recent call last):
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/categorizers.py", line 607, in train_categorizer
    CA.train_pattern_recognizer(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 975, in train_pattern_recognizer
    _, trainX, trainY = self.build_data(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 468, in build_data
    pattern_image += beta
numpy.core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('uint8') with casting rule 'same_kind'

What could be the problem? My behaviour labels are actually "moving_straight, grooming, immobility, rearing, turning_left, turning_right, mild_movement, step_back", but some seem to be truncated in the trace. Should I change my labels?

Thanks a lot,
Albert

Hi Albert,

This seems to be a bug in the new version, sorry about that. @rohansatapathy, can you take a look and fix this? Thanks!
In the meantime, you can downgrade LabGym to its stable version v2.2.2 with python3 -m pip install LabGym==2.2.2 or py -m pip install LabGym==2.2.2, which should not have this bug.

LabGym has been undergoing code refactoring to make the code more understandable for future developers. Currently, the functions in the newer versions are basically the same as in v2.2.2, and a Categorizer trained in v2.2.2 or older can also be used in newer versions. A major difference between the newer versions and v2.2.2 is that in newer versions you just type LabGym in the terminal to start the user interface, whereas in v2.2.2 you first start Python3 in the terminal by typing python3 or py, then type from LabGym import gui, and then gui.gui() to start the user interface.

It seems you used pipx to install LabGym, so you can downgrade LabGym to v2.2.2 with pipx install LabGym==2.2.2.

Thanks for the prompt reply. I just tried, but after installing v2.2.2, I cannot initiate the GUI, and the following error occurs:

(base) Alberts-MacBook-Pro:~ albertfok$ python3
Python 3.11.1 (v3.11.1:a7a450f84a, Dec  6 2022, 15:24:06) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from LabGym import gui
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'LabGym'

I think I have successfully installed v2.2.2 though:
(base) Alberts-MacBook-Pro:~ albertfok$ pipx install LabGym==2.2.2 --force --include-deps
Installing to existing venv 'labgym'
⚠️ Note: wheel, f2py, normalizer, markdown_py, pygmentize, fonttools, pyftmerge, pyftsubset, ttx, vba_extract.py, imageio_download_bin, imageio_remove_bin, lsm2bin, tiff2fsspec, tiffcomment, tifffile, and isympy were already on your PATH at /Users/albertfok/opt/anaconda3/bin/
installed package labgym 2.2.2, installed using Python 3.10.13
These apps are now globally available: LabGym, convert-caffe2-to-onnx, convert-onnx-to-caffe2, f2py, fonttools, helpviewer, imageio_download_bin, imageio_remove_bin, img2png, img2py, img2xpm, import_pb_to_tensorboard, isympy, lsm2bin, markdown-it, markdown_py, normalizer, pycrust, pyftmerge, pyftsubset, pygmentize, pyshell, pyslices, pyslicesshell, pywxrc, saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, tiff2fsspec, tiffcomment, tifffile, toco, toco_from_protos, torchrun, ttx, vba_extract.py, wheel, wxdemo, wxdocs, wxget
These manual pages are now globally available: man1/isympy.1, man1/ttx.1
done! ✨ 🌟 ✨

Albert

I reinstalled using pip instead of pipx and it works now.

Albert

Sorry for so many messages, but I encountered the same error with v2.2.2:

File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/LabGym/categorizers.py", line 353, in build_data
    pattern_image+=beta
numpy.core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('uint8') with casting rule 'same_kind'

Thanks,
Albert

I see. Then I'll fix the bug shortly.
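For reference, the traceback is consistent with an in-place add of a float offset to a uint8 image array; a minimal reproduction and one common way around it (not necessarily how it was fixed in LabGym):

    import numpy as np

    pattern_image = np.zeros((32, 32, 3), dtype=np.uint8)  # images read by OpenCV are uint8
    beta = 25.0                                             # a float brightness offset

    # pattern_image += beta  # raises: Cannot cast ufunc 'add' output from float64 to uint8

    # One common fix: do the arithmetic in float, clip to the valid range, and convert back.
    pattern_image = np.clip(pattern_image.astype(np.float64) + beta, 0, 255).astype(np.uint8)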

Hi Albert, I just fixed the bug and uploaded a new version v2.3.4. You can run pipx upgrade LabGym to upgrade, and let me know if it solves the issue. Thanks!

Thanks for the efficient response!
I don't encounter the same error anymore, but I have another one now...
Traceback (most recent call last):
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/categorizers.py", line 607, in train_categorizer
    CA.train_pattern_recognizer(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 1060, in train_pattern_recognizer
    cp = ModelCheckpoint(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/callbacks/model_checkpoint.py", line 191, in __init__
    raise ValueError(
ValueError: The filepath provided must end in .keras (Keras model format). Received: filepath=/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenField_cat_v1

I gather that it's the naming of my categoriser?

Thanks a lot truly,
Albert

I tried adjusting the naming of the categoriser and it seems to be working now! Never mind.

Ah no, it just ran into another error:

Epoch 1/1000000
971/972 ━━━━━━━━━━━━━━━━━━━━ 0s 44ms/step - accuracy: 0.2111 - loss: 2.6325
Epoch 1: val_loss improved from inf to 1.53999, saving model to /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenFieldv1.keras
Traceback (most recent call last):
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/categorizers.py", line 607, in train_categorizer
    CA.train_pattern_recognizer(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 1087, in train_pattern_recognizer
    H = model.fit(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/callbacks/model_checkpoint.py", line 291, in _save_model
    raise IOError(
OSError: Please specify a non-directory filepath for ModelCheckpoint. Filepath used is an existing directory: /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenFieldv1.keras

This might be caused by changes in a newer version of keras. What versions of tensorflow and keras did you use, and what operating system are you on? Windows or Linux?

Name: tensorflow
Version: 2.16.1

>>> import keras
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'keras'

I think I haven't installed keras then?

I am using a Mac.

Thanks,
Albert

I am not sure how to check my keras version. I have already run pip install keras --upgrade, but when I import it I still don't have the module.

You can downgrade tensorflow to 2.15.0 with pipx install --python python3.10 tensorflow==2.15.0 and see if you still get this error; if so, try downgrading further to 2.13.0. Let me know if this addresses the issue. No need to worry about keras, as it uses the tensorflow backend.
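To confirm which TensorFlow/Keras the LabGym environment is actually using, you can run a quick check with the Python interpreter of that same pipx venv (a generic check, not a LabGym command; the venv path below is taken from the tracebacks in this thread):

    # Run with the interpreter of the LabGym pipx venv, e.g.
    # "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/bin/python"
    import tensorflow as tf

    print(tf.__version__)         # e.g. 2.15.0 after downgrading
    print(tf.keras.__version__)   # the Keras bundled with that TensorFlow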

Thanks for the suggestion! I have tried upgrading tensorflow and I still have this error:
File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/callbacks/model_checkpoint.py", line 291, in _save_model
    raise IOError(
OSError: Please specify a non-directory filepath for ModelCheckpoint. Filepath used is an existing directory: /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenFIeld.keras

But I will try downgrading and see.

Albert

Hi again,

It doesn't work with either tensorflow 2.15.0 or 2.13.0. Both encounter the same error after the first epoch:

Epoch 1/1000000
972/972 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step - accuracy: 0.2522 - loss: 2.4780
Epoch 1: val_loss improved from inf to 1.26555, saving model to /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenField.keras
Traceback (most recent call last):
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/categorizers.py", line 607, in train_categorizer
    CA.train_pattern_recognizer(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 1087, in train_pattern_recognizer
    H = model.fit(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/callbacks/model_checkpoint.py", line 291, in _save_model
    raise IOError(
OSError: Please specify a non-directory filepath for ModelCheckpoint. Filepath used is an existing directory: /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenField.keras

Thanks,
Albert

This is strange. Would it be possible for you to send me the sorted behavior examples you used to train the Categorizer when you encountered this error? And also let me know all the settings in the GUI for training. I would like to reproduce this error to figure out the cause. Thanks!

Thanks for helping! Here is the google drive link to my behaviour examples: https://drive.google.com/file/d/166rd5e6P7yUKSaHBUDtol7hmKnq6j61M/view?usp=drive_link

I had actually tried different training settings and none of them worked anyway. But the latest time I was using:
static image (non-interactive)
complexity level 3
input shape 32, greyscale
no need to specify number of frames
default augmentation, no augmentation for validation

Best,
Albert

Hi Albert,

I tried both tensorflow==2.16.1 and 2.15.0. 2.16.1 gave me the same error as you had, but 2.15.0 worked fine and the training completed.

Can you make sure that the tensorflow in your pipx environment is 2.15.0? You can use pipx list to list the python packages in the pipx environment.

Some suggestions on the training after seeing your sorted examples and the settings for training (not related to the error message):

  1. You generated the behavior examples using "non-interactive" mode, but when you trained the Categorizer, you used "static image" mode, which is not correct. When you train a Categorizer, you need to select exactly the same mode that the behavior examples were generated with.

  2. You can select more augmentation methods. As you can see from the training result, the "back" behavior only has 1 sample for validation and its precision/recall is 0. You need more examples.

  3. The Categorizer will have difficulty distinguishing "left" from "right", especially since you selected the "random rotation" and "random flipping" augmentation methods (which are included in the default). These augmentation methods make the Categorizer insensitive to the location and direction of the mouse.

  4. With the current sorting, I'm afraid the Categorizer cannot distinguish the behaviors well. One reason is that many examples look identical but were sorted into different categories; for instance, the attached examples were sorted into different categories yet look identical. To an expert's eye there might be subtle differences, but LabGym can only learn to distinguish them from these images (especially when you choose "pattern recognizer only" for the Categorizer).
    (attached examples: Movement 26_movement, Immobility 7_Immobility, Movement 49_movement, Immobility 52_Immobility, Grooming 28_Grooming)
    You can probably include the "Animation Analyzer" to analyze the "animations", which may help the Categorizer acquire additional information to distinguish the behaviors. And you probably don't need to include body parts in the pattern images, because they introduce too many unnecessary details.

Hope this helps!

Hi Yujia,

Thanks for the advice! I was out of town for a conference, so I put things on hold for the past week or so. I will revisit the similar categories and try the Animation Analyzer. I have just made sure that tensorflow 2.15.0 is installed and tried training again; however, the following error is generated:
File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/sklearn/preprocessing/_label.py", line 303, in fit
    raise ValueError("y has 0 samples: %r" % y)
ValueError: y has 0 samples: array([], dtype=float64)

What does it mean?

Thanks,
Albert

This was probably because you selected an empty folder for the prepared training examples, or because the path to that folder contained special characters or something else that prevented it from being read. Can you double-check that and let me know? Thanks!

Hi Yujia,

I still have this error just after the first epoch:
Epoch 1/1000000
689/689 ━━━━━━━━━━━━━━━━━━━━ 0s 4s/step - accuracy: 0.2865 - loss: 2.4630
Epoch 1: val_loss improved from inf to 1.17261, saving model to /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenField.keras
Traceback (most recent call last):
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/categorizers.py", line 627, in train_categorizer
    CA.train_combnet(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 1599, in train_combnet
    H = model.fit(
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/callbacks/model_checkpoint.py", line 291, in _save_model
    raise IOError(
OSError: Please specify a non-directory filepath for ModelCheckpoint. Filepath used is an existing directory: /Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/OpenField.keras

And I think my tensorflow is really 2.15.0 already:
package tensorflow 2.15.0, installed using Python 3.10.13
  - estimator_ckpt_converter
  - import_pb_to_tensorboard
  - saved_model_cli
  - tensorboard
  - tf_upgrade_v2
  - tflite_convert
  - toco
  - toco_from_protos

Hi Yujia,

I have decided to try another computer in the lab for this work. But to use that machine, I would need to operate LabGym from the terminal instead of the GUI. I was looking at the instructions on GitHub but couldn't find a specific page about using LabGym from the terminal. Could you point it out to me?

Thanks,
Albert

Hi Albert,
Sorry for the late response. LabGym can run from the command line without the GUI. For now, you need to look at its source code and understand what each function does, but detailed instructions and a user guide on how to import and use each function from the command line will be included later this year.
As for the tensorflow issue, you can try this: pipx runpip LabGym install tensorflow==2.15.0
Let me know whether it solves the issue. Thanks!

Hi Yujia,

I actually managed to train my categorizer on another machine, using LabGym through the terminal. So for now I will move on and test the trained categorizer. However, I encountered another error and would appreciate your help:

Testing the selected Categorizer...
The behavior mode of the Categorizer: Non-interactive.
The type of the Categorizer: Animation Analyzer (Lv 3; Shape 32 X 32 X 1) + Pattern Recognizer (Lv 3; Shape 32 X 32 X 3).
The length of a behavior example in the Categorizer: 30 frames.
The Categorizer includes body parts in analysis with STD = 0.
The Categorizer does not include background in analysis.
Behavior names in the Categorizer: ['Grooming', 'Immobility', 'Rearing', 'back', 'left', 'movement', 'right', 'straight']
Traceback (most recent call last):
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/gui/training/categorizers.py", line 773, in test_categorizer
    CA.test_categorizer(self.file_path, self.path_to_categorizer, result_path=self.out_path)
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/categorizers.py", line 1831, in test_categorizer
    model = load_model(model_path)
  File "/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/keras/src/saving/saving_api.py", line 191, in load_model
    raise ValueError(
ValueError: File format not supported: filepath=/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/m1. Keras 3 only supports V3 .keras files and legacy H5 format files (.h5 extension). Note that the legacy SavedModel format is not supported by load_model() in Keras 3. In order to reload a TensorFlow SavedModel as an inference-only layer in Keras 3, use keras.layers.TFSMLayer('/Users/albertfok/Library/Application Support/pipx/venvs/labgym/lib/python3.10/site-packages/LabGym/models/m1', call_endpoint='serving_default') (note that your call_endpoint might have a different name).

Thanks,
Albert

Oh never mind. I re-installed tensorflow following your advice and now the test categorizer step works.

I actually have another question after testing my categorizer. My test result is much worse than the training report I got right after training the categorizer. I can't provide the reports immediately because I didn't save the training report. Would you have any idea what makes such a difference? Should I retrain my categorizer with the sorted examples I used for testing and then re-test with extra examples?
Thanks,
Albert

Several possibilities:

  1. The training examples did not cover the scenarios in the testing examples, which means the training examples are too different from the testing ones.
  2. The Categorizer is overfitting, which means it was not trained well. It didn't correctly learn the most important features that distinguish the behaviors, but instead picked up features that only distinguish the behaviors in that particular training dataset and don't generalize to a new dataset.
  3. The number of training or testing examples is not large enough to draw statistically reliable conclusions.

These are just some possible reasons. I cannot be sure which one, or whether something else, caused this problem without additional information, like the sample numbers of each category in training and testing, or what the training and testing examples look like.

I see. If you don't mind, maybe I should ask some more basic questions about my attempt after giving you some context.
So I am trying to use LabGym to categorize behaviours of mice in an open field. The mouse has a miniscope attached to record calcium activity.
I defined 8 behaviours in the open field, as you saw earlier when you kindly helped me try training a categorizer. The numbers of examples I input to train the categorizer for the 8 defined behaviours are 81, 205, 198, 52, 27, 214, 118, and 32, respectively.
Regardless of the behaviour type, I set each behaviour duration to 30 frames (1 second in real life) and skip 15 frames to generate examples from randomly selected video extracts of the open field recordings.
So would it be that I don't have enough examples for training?

I don't think the training examples are very different from the testing examples. Even though the mice differ in size/shape among themselves, what they can do in an open field can't be too crazy...

And lastly, would it help to increase the complexity of the training network?

Thanks a lot!
Albert

I see. Generally, 52, 27, and 32 are on the low side in terms of example amount. As I mentioned previously, such low numbers of training examples may not be representative enough to cover the majority of scenarios in these categories. This is probably the reason the training report was good but the trained Categorizer could not generalize well to new data. With such low amounts, you may need to choose more augmentation methods. What augmentation methods did you select?

Considering that the differences among some of the behavior categories you defined are very subtle, for example "turning left" vs "turning right", I think you need more training examples. And honestly, I don't think the Categorizers are very sensitive at distinguishing turning directions. So I suggest you increase the example amount for each category to around 200 pairs of non-redundant examples and see if this improves the performance.

Networks of higher complexity typically need more training examples to train well; otherwise, they tend to overfit. For now, I don't think you need to increase the complexity. In fact, networks of lower complexity typically generalize better than those of higher complexity. By the way, what are the complexity levels and the input shapes of your Categorizer?

I see! I have tried complexity levels 3 and 5, and shape 32.

You may start with level 2 and shape 16 for the Animation Analyzer and level 2 and shape 32 for the Pattern Recognizer, then increase the level for both by 1 and the shape for both by 16 each time, until the level reaches 4 and the shape reaches 64, and see which Categorizer gives the best performance.
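One way to keep track of that sweep (just one reading of the advice above; the exact combinations you try can differ):

    # Sketch of the suggested progression: start small and step level/shape up together,
    # training one Categorizer per row and comparing the training reports.
    sweep = [
        {"level": 2, "animation_shape": 16, "pattern_shape": 32},
        {"level": 3, "animation_shape": 32, "pattern_shape": 48},
        {"level": 4, "animation_shape": 48, "pattern_shape": 64},
    ]
    for s in sweep:
        print(f"Categorizer: complexity level {s['level']}, "
              f"Animation Analyzer shape {s['animation_shape']}, "
              f"Pattern Recognizer shape {s['pattern_shape']}")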

If the performance is still not good, increase the number of training examples to around 200 pairs per category.

Thanks Yujia. I am trying what you suggested. But while generating more ground truths for testing, I found that the detector I trained actually missegments quite a lot of frames. As my recording sometimes shows the cable and the commutator that connect to the miniscope on the mouse, my detector recognizes the chunk of cable as a mouse (e.g. the attached examples). Would you have any suggestions for improving my detector? Would it be enough to include more of the frames that my current detector missegments in my Roboflow annotation for training a better detector?
(attached: behaviour examples FRE0_b7 9_e16_Whole mouse_0_45_len30_std50 and FRE0_b7 9_e16_Whole mouse_0_60_len30_std50, with screenshots of the missegmented frames)

Thanks,
Albert

Yes, including more of the frames that your current detector missegments would definitely help. And what were the settings when you trained the Detector? For example, did you skip the preprocessing steps in Roboflow when generating the training dataset? What augmentation methods in Roboflow did you use? What were the inferencing frame size and iteration number when you trained the Detector? And what is the frame size of your videos during analysis? If possible, you may share the Roboflow dataset with me so I can take a look.

I just added some more examples to train another detector.
Yes, I skipped preprocessing in Roboflow.
I do flipping first, then use the output to augment again with brightness change and 90 degree rotation.
The inferencing frame size is 640, given my original frame size is 640*480.
Iteration number: 5000

Here's my updated Roboflow dataset: https://drive.google.com/file/d/15TOfZ5qiDORHLOdQDkrSrIDbGkFD1q3I/view?usp=sharing

Thanks,
Albert

I am also trying to use a faster machine to train my detectors, which requires me to use LabGym from the terminal. However, I couldn't find the documentation for the module that contains the traindetector function in gui_detectors... Could you point me to the documentation so I can call the traindetector function from the terminal? Thanks!

Your training settings are all good, so you can just include more of the frames where your current Detector missegmented. When you do augmentation, you probably don't need the 90 degree rotation, because the video frame is 640 X 480; if you rotate by 90 degrees, the augmented images will be 480 X 640. I suppose all your videos to analyze are 640 X 480, right? When you do augmentation, you only need to mimic the real scenarios.

To train a Detector from the terminal, you can find 'detector.py' under the LabGym folder. You can specify the output folder in the 'Detector' class, use its 'train' function to train a Detector, and use the 'test' function to test the Detector.

I see! Yes, all of my videos are 640 X 480. Since I have included the 90 degree rotation in my current detector training, would it be harmful, or is it just neutral to include it?

On GitHub, under the LabGym folder there is only a detector folder, which only contains __init__.py but no detector.py

I think I understand what you mean and can still try.

Which version of LabGym did you use? In previous versions, there is a function traindetector(path_to_annotation, path_to_trainingimages, path_to_detector, iteration_num, inference_size) in gui_detectors.py, which you can import and use to train the Detector (https://github.com/yujiahu415/LabGym/blob/master/LabGym/gui_detectors.py). In newer versions of LabGym, this function has been refactored into the 'train' function of the 'Detector' class. It is the same as the previous 'traindetector' except that you specify the 'path_to_detector' in the 'Detector' class when you instantiate it.

Arguments:
path_to_annotation: the path to the annotation .json file
path_to_trainingimages: the path to the folder storing all training images
path_to_detector: the path storing the trained Detector
iteration_num: the iteration number
inference_size: the inferencing frame size

And for your previous question: it might be harmful to include the 90 degree rotation, because the scenarios in the rotated frames don't exist in the real analysis.

I am using LabGym 2.4.3. I understand that I have to input these arguments.

    def train(
        self,
        annotation_path: str,
        training_images_path: str,
        max_num_iterations: int,
        inference_size: int,
    ) -> None:
        """Train this Detector.

        Args:
            annotation_path:
                The path to the .json annotation file in COCO format.
            training_images_path:
                The path to the folder containing the training images.
            max_num_iterations:
                The maximum number of iterations to do while training.
            inference_size:
                ???
        """

But the argument names are different from what you said in my version of LabGym? Right now I have a problem with defining self, though. It raises this error:
(env39) falbert@geometry:/usr/local/data/falbert/labgym/detectors$ python traindetector.py
You need to install Detectron2 to use the Detector module in LabGym: https://detectron2.readthedocs.io/en/latest/tutorials/install.html
Traceback (most recent call last):
  File "/usr/local/data/falbert/labgym/detectors/traindetector.py", line 7, in <module>
    Detector.train(self,
  File "/usr/local/data/tabish/test/env39/lib/python3.9/site-packages/LabGym/detector.py", line 168, in train
    if "LabGym_detector_train" in DatasetCatalog.list():
NameError: name 'DatasetCatalog' is not defined

You don't need to define 'self'. After you start python3, just run from LabGym.detector import Detector, then d=Detector(path_to_your_detector), and then d.train(annotation_path, training_images_path, max_num_iterations, inference_size). If this doesn't work for v2.4.3, upgrade LabGym to its latest v2.4.5.
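Put together, a minimal sketch of the terminal workflow (the train() argument order follows the signature pasted above; exactly how the output path is passed to Detector() is an assumption, so check detector.py in your installed version):

    # Run inside the Python environment where LabGym and Detectron2 are installed.
    from LabGym.detector import Detector

    d = Detector("/path/to/output/detector")  # where the trained Detector will be stored (assumed constructor argument)
    d.train(
        "/path/to/annotations.json",   # COCO-format annotation file exported from Roboflow
        "/path/to/training/images",    # folder containing the training images
        5000,                          # max_num_iterations
        640,                           # inference_size (inferencing frame size)
    )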

But before that, make sure you have installed Detectron2 on that computer, since from the error message it seems Detectron2 hasn't been successfully installed. Please refer to issue #147 for installing the correct version of Detectron2, as the newest commit of Detectron2 seemed not to work.

Hi Yujia,

I think I have installed the right Detectron2 and upgraded LabGym to v2.4.5:
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/data/tabish/test/env39/lib/python3.9/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<2,>=0.5->tensorboard->detectron2==0.6) (3.2.2)

But I still receive the same error trying to train a detector:
You need to install Detectron2 to use the Detector module in LabGym: https://detectron2.readthedocs.io/en/latest/tutorials/install.html
Traceback (most recent call last):
  File "/usr/local/data/falbert/labgym/detectors/traindetector.py", line 7, in <module>
    d.train(training_images_path,
  File "/usr/local/data/tabish/test/env39/lib/python3.9/site-packages/LabGym/detector.py", line 168, in train
    if "LabGym_detector_train" in DatasetCatalog.list():
NameError: name 'DatasetCatalog' is not defined

Just to show my Detectron2 version:

(env39) falbert@geometry:/usr/local/data/falbert/labgym/detectors$ python -c "import detectron2; print(detectron2.__version__)"
0.6

Where is your detectron2 installed? Is it in the same environment as LabGym? You can use pip show detectron2 and pip show LabGym to verify.

Or you may try to execute the following imports one by one to see at which step it fails:

    from detectron2 import model_zoo
    from detectron2.checkpoint import DetectionCheckpointer
    from detectron2.config import get_cfg
    from detectron2.data import (
        DatasetCatalog,
        MetadataCatalog,
        build_detection_test_loader,
    )
    from detectron2.data.datasets import register_coco_instances
    from detectron2.engine import DefaultPredictor, DefaultTrainer
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset
    from detectron2.modeling import build_model
    from detectron2.utils.visualizer import Visualizer

Hi Yujia,
I have tried several categorizers on the behaviours I defined in the open field (listed in the attached screenshot).

One issue I notice is that one behavior category is over-represented: in the analysis result, a lot of frames are categorized as, e.g., moving straight when they are actually something else. In this case, should I go through my training examples for moving straight to narrow down their variability?

Another issue is the fragmentation of the categorization/annotation. All my training examples are 30 frames long, which converts to 1 second in real life, because I observed that few of the behaviors I defined would be shorter than 1 second. However, my analysis output has a lot of switching between categories that is faster than 1 second. This leads to a lot of "very thin lines" in the attached raster plot, which is not ideal/realistic. What should I do to correct this?

Thanks!
Albert

Sorry, I would like to add another question:
The third issue is that the first 30 frames are always classified as NA. I can guess that it's because my training examples are all 30 frames long, so the categorizer isn't able to "recognize" frames 0-29. However, I would expect that frames 1-31 could then be annotated, but that's not the case. Is there a fix?

Thanks
Albert

Hi,
The fragmentation of the raster plot may result from two causes:

  1. The way LabGym categorizes behavior is that it gets a set of frames, for example 30 frames in your case, assesses these frames, and determines a behavior category at the end of the 30 frames. The behavior categorization is "frame-wise", so there is always a delay (depending on the duration of one behavior example) in behavior categorization. This is also why the first 30 frames are always NA. And if the behavior lasts several episodes (30 frames being one episode), you will see continuous categorizations, like the orange bar between 14.3-15.5 seconds.
  2. The Categorizer is not trained well (or the behaviors are too subtle for the Categorizer to distinguish). So it struggles in some frames during a behavior episode and categorizes them as another category, which inserts many "thin" bars of other color in one behavior episode. By your description, the Categorizer is too sensitive to "move straight" and caused a lot of false positives. You probably can discard some ambiguous / subtle examples in move straight and only keep those are definitely move straight. Meanwhile, you probably also need to refine other categories. For example, the two examples in "grooming" look very different from each other, and the right one looks very similar to a "turning". The rearing also looks similar the right example of turning. But all my judgements are only based on the "pattern images". If there are distinct features in the "animations" of these behaviors that can help to distinguish them, I suggest you put more weight on "Animation Analyzer" than "Pattern Recognizer" when training a Categorizer.