/yolo2video


Applying ddkit to the YOLO model


Usage

Use --help to see usage of yolo_video.py:

usage: yolo_video.py [-h] [--model MODEL] [--anchors ANCHORS]
                     [--classes CLASSES] [--gpu_num GPU_NUM] [--image]
                     [--input] [--output]

positional arguments:
  --input        Video input path
  --output       Video output path

optional arguments:
  -h, --help         show this help message and exit
  --model MODEL      path to model weight file, default model_data/yolo.h5
  --anchors ANCHORS  path to anchor definitions, default
                     model_data/yolo_anchors.txt
  --classes CLASSES  path to class definitions, default
                     model_data/coco_classes.txt
  --gpu_num GPU_NUM  Number of GPU to use, default 1
  --image            Image detection mode, will ignore all positional arguments
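The flags above can be mirrored in a small argparse sketch to see how the defaults resolve. This is an illustration rebuilt from the help text only, not the actual parser in yolo_video.py; the function name build_parser and the file names in.mp4 and out.mp4 are hypothetical, and the None defaults for --input and --output are an assumption.

```python
import argparse


def build_parser():
    """Rebuild the documented flags (a sketch, not the real yolo_video.py parser)."""
    parser = argparse.ArgumentParser(prog="yolo_video.py")
    parser.add_argument("--model", default="model_data/yolo.h5",
                        help="path to model weight file")
    parser.add_argument("--anchors", default="model_data/yolo_anchors.txt",
                        help="path to anchor definitions")
    parser.add_argument("--classes", default="model_data/coco_classes.txt",
                        help="path to class definitions")
    parser.add_argument("--gpu_num", type=int, default=1,
                        help="Number of GPU to use")
    parser.add_argument("--image", action="store_true",
                        help="Image detection mode")
    # Assumption: the real script may use different defaults for these two.
    parser.add_argument("--input", nargs="?", default=None,
                        help="Video input path")
    parser.add_argument("--output", nargs="?", default=None,
                        help="Video output path")
    return parser


# Hypothetical invocation: only the video paths are given, the rest use defaults.
args = build_parser().parse_args(["--input", "in.mp4", "--output", "out.mp4"])
print(args.model, args.gpu_num, args.input, args.output)
```

Passing an explicit argument list to parse_args keeps the sketch independent of the real command line.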

Modifying the config

Edit yolo.py as shown below to apply the settings you want:

_defaults = {
    "model_path": '[path to model file]',
    "anchors_path": 'model_data/yolo_anchors.txt',
    "classes_path": '[class list]',
    "score" : [minimum confidence value],
    "iou" : [minimum IoU],
    "model_image_size" : (416, 416),
    "gpu_num" : 1,
    # "selected_objects": [2, 5, 7]
}
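A minimal sketch of how a _defaults dictionary like this is commonly applied, assuming the class in yolo.py copies the defaults onto the instance and lets keyword arguments override them. The class name Detector is hypothetical, and the score and iou values (0.3 and 0.45) are illustrative placeholders, not recommendations from this repository.

```python
class Detector:
    """Hypothetical stand-in for the detector class in yolo.py."""

    _defaults = {
        "model_path": "model_data/yolo.h5",
        "anchors_path": "model_data/yolo_anchors.txt",
        "classes_path": "model_data/coco_classes.txt",
        "score": 0.3,    # illustrative minimum confidence
        "iou": 0.45,     # illustrative IoU threshold
        "model_image_size": (416, 416),
        "gpu_num": 1,
    }

    def __init__(self, **overrides):
        # Reject misspelled keys early instead of silently ignoring them.
        unknown = set(overrides) - set(self._defaults)
        if unknown:
            raise ValueError("unknown settings: %s" % sorted(unknown))
        # Start from a copy of the defaults, then apply per-instance overrides.
        settings = dict(self._defaults)
        settings.update(overrides)
        for name, value in settings.items():
            setattr(self, name, value)


# Override one setting for this instance; the class-level defaults stay untouched.
strict = Detector(score=0.5)
print(strict.score, strict.iou, strict.model_image_size)
```

Editing _defaults changes the settings for every instance, while keyword overrides affect only the instance being built.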

Some issues to know

  1. The test environment is

    • Python 3.5.2
    • Keras 2.1.5
    • tensorflow 1.6.0
  2. Default anchors are used. If you use your own anchors, some changes are probably needed.

  3. The inference result is not exactly the same as Darknet's, but the difference is small.

  4. Inference is slower than Darknet. Replacing PIL with OpenCV may help a little.

  5. Always load pretrained weights and freeze layers in the first stage of training. Alternatively, try Darknet training. A mismatch warning is OK.

  6. The training strategy is for reference only. Adjust it according to your dataset and your goal, and add further strategies if needed.

  7. To speed up training with frozen layers, train_bottleneck.py can be used. It first computes the bottleneck features of the frozen model and then trains only the last layers. This makes training on a CPU possible in a reasonable time. See this for more information on bottleneck features.
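The bottleneck idea in item 7 can be illustrated with a toy sketch in plain Python. It is not the code in train_bottleneck.py: frozen_backbone stands in for the frozen layers, the trainable head is a single linear unit, and all function names and data are hypothetical. The point is only that the frozen part runs once per sample, after which training touches the cached features alone.

```python
import math


def frozen_backbone(x):
    """Stand-in for the frozen layers: fixed weights, never updated."""
    return [math.tanh(x[0] + 0.5 * x[1]), math.tanh(x[0] - x[1])]


def precompute_bottlenecks(inputs):
    """Run the frozen part once per sample and cache the outputs."""
    return [frozen_backbone(x) for x in inputs]


def mse(features, targets, w, b):
    """Mean squared error of the linear head on cached features."""
    total = 0.0
    for f, t in zip(features, targets):
        pred = sum(wi * fi for wi, fi in zip(w, f)) + b
        total += (pred - t) ** 2
    return total / len(features)


def train_head(features, targets, epochs=200, lr=0.1):
    """Train only the last (linear) layer with SGD on the cached features."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, t in zip(features, targets):
            err = sum(wi * fi for wi, fi in zip(w, f)) + b - t
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b


# Toy data: targets are a linear function of the bottleneck features.
inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-1.0, 0.5)]
features = precompute_bottlenecks(inputs)   # the backbone runs only here
targets = [2.0 * f[0] - f[1] + 0.5 for f in features]
w, b = train_head(features, targets)        # the backbone is never called again
print(mse(features, targets, w, b))
```

Because the expensive forward pass through the frozen part happens once rather than once per epoch, each training epoch only costs as much as the small trainable head.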