License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause)

YOLO-anime-hands

Example with YOLOv8x: [image: yolo]

A model trained only on gwern's data seems to struggle with gloves, handshakes, and more complex hands, so I added some custom data. I also tried training nano- and medium-sized YOLO models, but those had severe accuracy problems.

There is also adetailer, which has multiple models for this task, but those models usually produce low-confidence detections on drawn images, sometimes below 50%, and are prone to misdetection.

Training code reference:

from ultralytics import YOLO

model = YOLO('model.pt')

# training will abort early due to early stopping
results = model.train(data='coco128.yaml', epochs=10000, imgsz=640, batch=20, amp=True)
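
The `data` argument points at a dataset description file; `coco128.yaml` above is a stand-in. Below is a minimal sketch of writing such a file for a single-class hand dataset, assuming the usual Ultralytics layout (`path`, `train`, `val`, `names`). The folder names, the `hands.yaml` filename, and the class name `hand` are hypothetical and not taken from this repository.

```python
import tempfile
from pathlib import Path


def write_dataset_yaml(dataset_root, out_path):
    """Write a one-class dataset description and return its text.

    All paths and the class name are illustrative assumptions.
    """
    text = (
        f"path: {dataset_root}\n"  # dataset root directory
        "train: images/train\n"    # training images, relative to path
        "val: images/val\n"        # validation images, relative to path
        "names:\n"
        "  0: hand\n"              # single class
    )
    Path(out_path).write_text(text, encoding="utf-8")
    return text


# Written to the temp directory here only to keep the example side-effect free.
out_file = Path(tempfile.gettempdir()) / "hands.yaml"
yaml_text = write_dataset_yaml("datasets/anime-hands", out_file)
print(yaml_text)
```

A file like this could then be passed as `data=...` to `model.train(...)`. How long training waits without improvement before stopping early is controlled by the `patience` argument of `train`.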

Usage example:

import cv2
from ultralytics import YOLO

model = YOLO('model.pt')

# optionally pass conf=0.5 to drop low-confidence detections
results = model('test.jpg')

for r in results:
    im_array = r.plot()  # annotated image as a BGR numpy array
    cv2.imwrite("test_output.jpg", im_array)  # cv2.imwrite expects BGR
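
If the raw detections are needed rather than a plotted image, the boxes and confidences can be read from each result. This is a minimal sketch, assuming the `boxes.xyxy` and `boxes.conf` attributes of Ultralytics results. The helper names `filter_detections` and `detect_hands` and the 0.5 default are illustrative choices, not part of this repository.

```python
def filter_detections(boxes, confs, min_conf=0.5):
    """Keep (box, confidence) pairs whose confidence is at least min_conf."""
    return [(b, c) for b, c in zip(boxes, confs) if c >= min_conf]


def detect_hands(model_path, image_path, min_conf=0.5):
    """Run the model on one image and return the filtered detections.

    Hypothetical helper; ultralytics is imported lazily so that
    filter_detections can be used without it installed.
    """
    from ultralytics import YOLO

    model = YOLO(model_path)
    detections = []
    for r in model(image_path):
        boxes = r.boxes.xyxy.tolist()  # [x1, y1, x2, y2] per detection
        confs = r.boxes.conf.tolist()  # confidence per detection
        detections.extend(filter_detections(boxes, confs, min_conf))
    return detections


# Example with made-up values (no model needed):
sample = filter_detections(
    [[10, 20, 50, 80], [5, 5, 15, 15]],
    [0.91, 0.32],
)
print(sample)
```

With a model file at hand, `detect_hands('model.pt', 'test.jpg')` would return the same kind of list. Passing `conf=0.5` directly to the model call applies the threshold inside the predictor instead.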

To process a folder with images:

import os
import cv2
from ultralytics import YOLO
from tqdm import tqdm

model = YOLO('model.pt')

input_folder = '/'
output_folder = '/'

# create the output folder if it does not exist yet
os.makedirs(output_folder, exist_ok=True)

for file_name in tqdm(os.listdir(input_folder)):
    if file_name.lower().endswith(('.jpg', '.png', '.webp')):
        image_path = os.path.join(input_folder, file_name)
        results = model(image_path)
        output_path = os.path.join(output_folder, file_name)

        for r in results:
            im_array = r.plot()  # annotated image as a BGR numpy array
            cv2.imwrite(output_path, im_array)  # cv2.imwrite expects BGR

Graphs

Trained on an RTX 4090 with the Prodigy optimizer set to 1. I used ultralytics/ultralytics commit db2af70d3910f168a62ecaae4d920e1440f08c7e because newer versions seem to have convergence problems and train much slower, possibly due to unsuitable defaults.

YOLOv8x_*: best, last, best onnx, best dynamic onnx, last onnx, last dynamic onnx, csv

  • 992 epochs
  • batch 30 (?)
  • dataset:

results

YOLOv8x_*_finetuned: best, last, best onnx, best dynamic onnx, last onnx, last dynamic onnx, csv

  • used gwern trained YOLOv8x as pretrain
  • 704 epochs
  • batch 30 (?)
  • dataset:
    • own custom data (924 images)

results

YOLOv9e_*: best, last, best onnx, best dynamic onnx, last onnx, last dynamic onnx, csv

  • 1181 epochs
  • 53.533 hours
  • batch 14
  • dataset:

results

YOLOv9e_*_finetuned: best, last, best onnx, best dynamic onnx, last onnx, last dynamic onnx, csv

  • used gwern trained YOLOv9e as pretrain
  • 725 epochs
  • 6.852 hours
  • batch 14
  • dataset:
    • own custom data (1069 images)

results

YOLOv9e_gwern+own_*: best, last, best onnx, best dynamic onnx, last onnx, last dynamic onnx, csv

  • 1177 epochs
  • 56.442 hours
  • batch 14
  • dataset (6440 images):
    • gwern (5371 images)
    • own data (1069 images)

results

YOLOv9e_all_*: best, last, best onnx, best dynamic onnx, last onnx, last dynamic onnx, csv

results
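
The csv entries listed for each model are training logs. Below is a sketch of finding the epoch with the highest validation mAP in such a log, assuming it has `epoch` and `metrics/mAP50-95(B)` columns (header names may be space-padded, so they are stripped). The column names and the sample rows are assumptions for illustration, not values from these runs.

```python
import csv
import io


def best_epoch(csv_text, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) of the row with the highest metric value."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [h.strip() for h in next(reader)]  # drop padding around names
    epoch_idx = header.index("epoch")
    metric_idx = header.index(metric)
    best = None
    for row in reader:
        if not row:
            continue  # skip blank lines
        epoch = int(float(row[epoch_idx]))
        value = float(row[metric_idx])
        if best is None or value > best[1]:
            best = (epoch, value)
    return best


# Made-up rows, only to show the call:
sample_csv = (
    "   epoch,   metrics/mAP50-95(B)\n"
    "       1,   0.41\n"
    "       2,   0.57\n"
    "       3,   0.52\n"
)
print(best_epoch(sample_csv))
```

For a real log, the file contents can be read with `open(path).read()` and passed in as `csv_text`.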

Dataset graphs:

Gwern's dataset (5371 images):

labels

My custom dataset (1069 images):

  • own data (1069 images)

labels

Gwern's + own data (6440 images):

  • gwern (5371 images)
  • own data (1069 images)

labels

All combined (17392 images):

labels