
FastSegFormer-QTUI && Jetson Nano Deployment


FastSegFormer-pyqt

中文 (Chinese version)

Video detection UI for the navel orange defect segmentation model.

Update

  • Created a PyQt interface for navel orange defect segmentation. (May/10/2023)
  • Produced a navel orange assembly-line simulation video at 30 frames per second. (May/13/2023)
  • Added support for ONNX-format models in video detection. (May/14/2023)
  • Added multi-threaded processing: the main thread updates the UI while sub-threads process the video frames, improving the FPS to 48~60 (see the sketch after this list). (May/25/2023)
  • Deployed on the Jetson Nano (4 GB), an edge computing device, with ONNXRuntime and TensorRT. (May/30/2023)
  • Accelerated with the DeepStream framework on the Jetson Nano (4 GB). (June/16/2023)
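
As a rough illustration of this threading model (the class, signal, and file names below are assumptions, not the project's actual identifiers), a worker QThread can decode frames and run ONNX inference while the GUI thread only receives the results:

```python
# Minimal sketch: the worker thread decodes frames and runs ONNX inference,
# the GUI thread only receives predicted masks via a Qt signal.
# Class, signal and file names are illustrative, not the project's identifiers.
import cv2
import numpy as np
import onnxruntime as ort
from PyQt5.QtCore import QThread, pyqtSignal

class InferenceWorker(QThread):
    frame_ready = pyqtSignal(np.ndarray)   # emits the predicted mask to the UI thread

    def __init__(self, video_path, onnx_path="FastSegFormer.onnx"):
        super().__init__()
        self.video_path = video_path
        self.session = ort.InferenceSession(
            onnx_path, providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
        self.input_name = self.session.get_inputs()[0].name

    def run(self):
        cap = cv2.VideoCapture(self.video_path)
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # Pre-processing: resize to the 224x224 inference size, normalize, HWC -> NCHW.
            blob = cv2.resize(frame, (224, 224)).astype(np.float32) / 255.0
            blob = blob.transpose(2, 0, 1)[None]
            # Inference; the output is assumed to be an N x C x 224 x 224 score map.
            scores = self.session.run(None, {self.input_name: blob})[0]
            # Post-processing: per-pixel class ids, resized back to the original frame size.
            mask = scores.argmax(axis=1).squeeze().astype(np.uint8)
            mask = cv2.resize(mask, (frame.shape[1], frame.shape[0]),
                              interpolation=cv2.INTER_NEAREST)
            self.frame_ready.emit(mask)
        cap.release()
```

On the UI side, `frame_ready` would be connected to a slot that converts the mask (or the blended frame) to a QImage and updates a QLabel, so no heavy work runs in the GUI thread.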

Demo

[Demo: navel orange simulation line detection video]

Usage

  • Environment Configuration:
$ conda activate 'your anaconda environment'
$ pip install -r requirements.txt 
  • Run the project:
python run_gui.py
  • Jetson Nano Deployment: Usage

Testing performance comparison

All of the following tests include not only network inference but also pre-processing and post-processing; the handling of video frames may differ between methods.
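
As a minimal sketch of what these timings cover (the helper names are placeholders, not the project's functions), the per-frame loop wraps pre-processing, inference, and post-processing in a single timed region:

```python
# Sketch of the FPS measurement: pre-processing, inference and post-processing
# are all inside the timed region, not just the forward pass.
import time

import cv2

def average_fps(video_path, preprocess, infer, postprocess):
    cap = cv2.VideoCapture(video_path)
    frames = 0
    start = time.perf_counter()
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        x = preprocess(frame)       # e.g. resize to 224x224, normalize, HWC -> NCHW
        y = infer(x)                # PyTorch / ONNXRuntime / TensorRT forward pass
        postprocess(y, frame)       # e.g. argmax, resize mask back, blend with the frame
        frames += 1
    cap.release()
    return frames / (time.perf_counter() - start)
```

The same loop shape applies to each framework; only the `infer` callable changes.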

  • System: Windows 10, CPU: Intel(R) Core(TM) i5-10500 @ 3.10 GHz, GPU: NVIDIA GeForce RTX 3060 (12 GB)
FastSegFormer-pyqt

| Task | Video input | Inference input | Inference framework | GPU computing capability | Quantization | Video processing | Average FPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Video Detection | $512\times 512$ | $224\times 224$ | PyTorch | 12.74 TFLOPS | FP32 | Single frame | 32.62 |
|  |  |  | ONNXRuntime |  | FP32 | Single frame | 32.64 |
|  |  |  | PyTorch |  | FP16 | Single frame | 32.24 |
|  |  |  | ONNXRuntime |  | FP16 | Single frame | 32.66 |
|  |  |  | PyTorch |  |  | Multi-thread | 46.94 |
|  |  |  | ONNXRuntime |  |  | Multi-thread | 46.81 |

Conclusions:

  1. In GPU inference mode, there is almost no difference in inference time between PyTorch and ONNXRuntime.
  2. On a card whose FP32 and FP16 compute throughput are the same, there is almost no difference in GPU inference time between the two precisions.
  3. Having the main thread handle video input and output while a secondary thread handles single-frame inference greatly improves video detection performance.
  • System: Ubuntu 18.04, CPU: ARM Cortex-A57 @ 1.43 GHz, GPU: NVIDIA Maxwell @ 921 MHz
Jetson-FastSegFormer

| Task | Video/Stream input | Inference input | Inference framework | GPU computing capability | Quantization | Video processing | Average FPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Video Detection | $512\times 512$ | $224\times 224$ | ONNXRuntime | 0.4716 TFLOPS | FP16 | Single frame | 10 |
|  |  |  | TensorRT |  |  | Single frame | 15 |
|  |  |  | ONNXRuntime |  |  | Multi-thread | ~ |
|  |  |  | TensorRT |  |  | Multi-thread | ~ |
|  |  |  | TensorRT |  |  | DeepStream | 23 |
| CSI Camera Detection | $1280\times 720$ |  | ONNXRuntime |  |  | Single frame | 8 |
|  |  |  | TensorRT |  |  | Single frame | 12 |
|  |  |  | ONNXRuntime |  |  | Multi-thread | ~ |
|  |  |  | TensorRT |  |  | Multi-thread | ~ |
|  |  |  | TensorRT |  |  | DeepStream | 20 |
~ : dual-threaded acceleration cannot be run on the Jetson Nano (4 GB) because of insufficient memory.
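
For reference, single-frame TensorRT inference on the Jetson can look roughly like the sketch below. It assumes the TensorRT 8 Python API with PyCUDA, an FP16 engine built beforehand from the exported ONNX model (for example with `trtexec --onnx=model.onnx --saveEngine=model.engine --fp16`), and placeholder file names and output shapes; none of these come from the project itself.

```python
# Minimal sketch of single-frame TensorRT inference (TensorRT 8 + PyCUDA assumed).
# Engine path, output shape and class count are placeholders.
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def load_engine(path):
    with open(path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

engine = load_engine("fastsegformer_fp16.engine")   # hypothetical engine file
context = engine.create_execution_context()

# Page-locked host buffers and device buffers for a 1x3x224x224 input
# and a 1xCx224x224 output (C = number of classes, assumed here to be 4).
h_input = cuda.pagelocked_empty((1, 3, 224, 224), dtype=np.float32)
h_output = cuda.pagelocked_empty((1, 4, 224, 224), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)
stream = cuda.Stream()

def infer(frame_chw):
    """Run one frame (already pre-processed to 1x3x224x224 float32) through the engine."""
    np.copyto(h_input, frame_chw)
    cuda.memcpy_htod_async(d_input, h_input, stream)
    context.execute_async_v2(bindings=[int(d_input), int(d_output)],
                             stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    stream.synchronize()
    return h_output.argmax(axis=1)                   # per-pixel class ids
```

The DeepStream rows in the table correspond to running the same engine inside a DeepStream/GStreamer pipeline rather than a hand-written Python loop, which is where the additional speed-up comes from.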