laugh12321/TensorRT-YOLO

[Help]: How to pass a cv2 numpy array directly instead of an image path or directory

woshiagan opened this issue · 7 comments

My WGC screen capture already returns a numpy array, but there is no way to feed it into your code directly. I tried modifying the batch image-preprocessing code but didn't succeed. Is there a way to pass numpy data directly instead of a complete image file, or will there be an update for this later?

The provided Python demo is only a rough demonstration; what TRTYOLO.infer accepts is already a numpy array.

The demo uses ImageBatcher to read images into numpy arrays, runs the preprocessing, and then passes the result to TRTYOLO.infer. You can check the source code for the details.
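In other words, if you already have a frame as a numpy array (say, from a screen capture), you only need to replicate the demo's preprocessing yourself before calling TRTYOLO.infer. Below is a minimal sketch, assuming a 640x640 model input, [0, 1] float scaling, and the infer(batch, [(width, height)]) call shape used in this thread; check the ImageBatcher source for the exact preprocessing:

import cv2
import numpy as np
from tensorrt_yolo.infer import TRTYOLO

model = TRTYOLO('model.engine')
model.warmup()

frame = cv2.imread('1.jpg')  # stand-in for any captured frame (HWC, BGR, uint8)

# NOTE: a plain resize distorts the aspect ratio -- exactly the issue discussed
# further down in this thread; ImageBatcher letterboxes instead of resizing.
blob = cv2.resize(frame, (640, 640)).astype(np.float32) / 255.0  # assumed [0, 1] scaling
blob = np.expand_dims(np.transpose(blob, (2, 0, 1)), axis=0)     # HWC -> NCHW, (1, 3, 640, 640)

detections = model.infer(blob, [(frame.shape[1], frame.shape[0])])  # batch + original (width, height)
print(detections)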

import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import numpy as np
import rich_click as click
from loguru import logger
from rich.progress import track

from tensorrt_yolo import export
from tensorrt_yolo.infer import TRTYOLO, ImageBatcher, generate_labels_with_colors, visualize_detections
from torchvision.ops import nms


# @click.command("YOLO Series Inference Script.")
# @click.option('-e', '--engine', default='model.engine', required=True, type=str, help='The serialized TensorRT engine.')
# @click.option('-i', '--input', default='img', required=True, type=str, help="Path to the image or directory to process.")
# @click.option('-o', '--output', type=str, default='output', help='Directory where to save the visualization results.')
# @click.option("-l", "--labels", default="./labels.txt", help="File to use for reading the class labels from, default: ./labels.txt")
class TRTYOLOInference:
    def __init__(self, engine, labels):
        self.engine = engine
        self.labels = generate_labels_with_colors(labels)  # label file -> (name, color) pairs
        self.model = TRTYOLO(self.engine)
        self.model.warmup()

    def trt(self, img):
        """YOLO Series Inference Script."""
        for i in range(100):  # repeat 100x for a rough timing run
            detections = self.model.infer(img, [(img.shape[1], img.shape[0])])
            print(detections)


import cv2
from time import perf_counter as time  # shadows the `time` module imported above

if __name__ == '__main__':
    ab = TRTYOLOInference(engine='model.engine', labels="./labels.txt")
    az = cv2.imread("1.jpg")
    az = cv2.resize(az, (640, 640))                    # plain resize to the model input size
    input_image = np.transpose(az, (2, 0, 1))          # HWC -> CHW
    input_image = np.expand_dims(input_image, axis=0)  # add batch dim -> (1, 3, 640, 640)
    da = time()
    ab.trt(input_image)
    print(round((time() - da) * 1_000, 6), 'ms')
This is my code. It runs inference successfully, but the results don't seem correct:

[DetectInfo(num=100, boxes=array([[ 0, 0.017889, 0, 0.01875],
[ 1, 0.01778, 1, 0.01875],
[ 0.92656, 0.018146, 1, 0.01875],
[ 0.96289, 0.018146, 1, 0.01875],
[ 0, 0.015912, 0, 0.01875],
[ 1, 0.018292, 1, 0.01875],
[ 0.17949, 0.018338, 0.26094, 0.01875],
[ 1, 0.017505, 1, 0.01875],
[ 0, 0.017981, 0, 0.01875],
[ 0, 0.0029022, 0, 0.020435],
[ 0, 0.015912, 0, 0.01875],
[ 0, 0.018283, 0.040039, 0.01875],
[ 1, 0.017889, 1, 0.01875],
[ 0, 0.018283, 0, 0.01875],
[ 1, 0.017175, 1, 0.01875],
[ 1, 0.018338, 1, 0.01875],
[ 1, 0.017981, 1, 0.01875],
[ 0.33125, 0.01843, 0.41094, 0.01875],
[ 0.55273, 0.018283, 0.63711, 0.01875],
[ 1, 0.018292, 1, 0.01875],
[ 0, 0.11133, 0.18125, 0.13184],
[ 1, 0.018073, 1, 0.01875],
[ 1, 0.017889, 1, 0.01875],
[ 0.14258, 0.018219, 0.22344, 0.01875],
[ 1, 0.018073, 1, 0.01875],
[ 0.74023, 0.018219, 0.82461, 0.01875],
[ 1, 0.017175, 1, 0.01875],
[ 0.40391, 0.018384, 0.48359, 0.01875],
[ 0, 0.018073, 0, 0.01875],
[ 1, 0.017889, 1, 0.01875],
[ 0, 0.017651, 0, 0.01875],
[ 0.44258, 0.018283, 0.52227, 0.01875],
[ 0, 0.018219, 0, 0.01875],
[ 0, 0.018073, 0, 0.01875],
[ 1, 0.01778, 1, 0.01875],
[ 0, 0.0077179, 0, 0.018787],
[ 1, 0.015179, 1, 0.01875],
[ 0, 0.00867, 0, 0.01875],
[ 0, 0.018219, 0, 0.01875],
[ 0.06875, 0.018338, 0.14609, 0.01875],
[ 0.51758, 0.018219, 0.59727, 0.01875],
[ 1, 0.017651, 1, 0.01875],
[ 0, 0.017889, 0, 0.01875],
[ 0, 0.12949, 0, 0.13125],
[ 0.21406, 0.018384, 0.29609, 0.01875],
[ 0, 0.017981, 0, 0.01875],
[ 0, 0, 0, 0.091919],
[ 0, 0.017981, 0, 0.01875],
[ 0.19707, 0.18867, 0.28789, 0.2499],
[ 0.19707, 0.18867, 0.28789, 0.2499],
[ 0, 0.01778, 0, 0.01875],
[ 1, 0.01778, 1, 0.01875],
[ 0.89141, 0.018338, 0.97344, 0.01875],
[ 1, 0.018292, 1, 0.01875],
[ 1, 0.018073, 1, 0.01875],
[ 0.10391, 0.018283, 0.18594, 0.01875],
[ 1, 0.018219, 1, 0.01875],
[ 0.030664, 0.018283, 0.11035, 0.01875],
[ 0, 0.018155, 0, 0.01875],
[ 0.36758, 0.01843, 0.44961, 0.01875],
[ 1, 0.018292, 1, 0.01875],
[ 1, 0.12708, 1, 0.13191],
[ 0, 0.0104, 0, 0.01972],
[ 0, 0.0104, 0, 0.01972],
[ 0.97227, 0.11367, 1, 0.1314],
[ 0, 0.018219, 0, 0.01875],
[ 0.29258, 0.018384, 0.37227, 0.01875],
[ 0, 0.018155, 0.074023, 0.01875],
[ 0, 0.017651, 0, 0.01875],
[ 0.48125, 0.018283, 0.56094, 0.01875],
[ 0.66289, 0.018292, 0.7543, 0.01875],
[ 0, 0.018146, 0, 0.01875],
[ 1, 0.017889, 1, 0.01875],
[ 0, 0.12678, 0, 0.13132],
[ 0, 0.098291, 0, 0.13125],
[ 0.25391, 0.018338, 0.33359, 0.01875],
[ 0.62773, 0.018338, 0.71445, 0.01875],
[ 1, 0.018219, 1, 0.01875],
[ 0, 0.0077179, 0, 0.01897],
[ 1, 0.12576, 1, 0.13132],
[ 0, 0.0077179, 0, 0.01897],
[ 0.85273, 0.018338, 0.93477, 0.01875],
[ 0.70273, 0.018283, 0.78477, 0.01875],
[ 0, 0.0104, 0, 0.018951],
[ 0, 0.096387, 0, 0.13125],
[ 0, 0.12385, 0, 0.13132],
[ 0, 0.0104, 0, 0.018951],
[ 0.19707, 0.18867, 0.28789, 0.2499],
[ 1, 0.17402, 1, 0.24375],
[ 0, 0.1262, 0, 0.13132],
[ 1, 0.12869, 1, 0.13147],
[ 0.19707, 0.18867, 0.28789, 0.2499],
[ 0.51992, 0.10532, 0.63242, 0.13125],
[ 0.59141, 0.018292, 0.67344, 0.01875],
[ 0.77891, 0.018283, 0.86094, 0.01875],
[ 1, 0.12869, 1, 0.13132],
[ 1, 0.12722, 1, 0.13132],
[ 0, 0.12026, 0, 0.13184],
[ 0, 0.0104, 0, 0.018951]], dtype=float32),
scores=array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=float32),
classes=array([49, 64, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 32, 49, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45,
49, 45, 45, 45, 45, 32, 45, 45, 45, 32, 45, 45, 45, 47, 45, 45, 45, 45, 67, 45, 45, 49]))]

You can't resize directly like that; it changes the aspect ratio.
For your resize step, use ImageBatcher._preprocess_image directly instead.
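
For reference, aspect-ratio-preserving ("letterbox") preprocessing generally looks like the sketch below. This is a generic illustration, not the actual body of ImageBatcher._preprocess_image; the helper name, the top-left paste, and the 114 padding value are assumptions:

import cv2
import numpy as np

def letterbox(image, input_w, input_h, pad_value=114):
    # Resize with a single uniform scale so the aspect ratio is preserved,
    # then pad the remainder of the model-input canvas.
    h, w = image.shape[:2]
    scale = min(input_w / w, input_h / h)
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    resized = cv2.resize(image, (new_w, new_h))
    canvas = np.full((input_h, input_w, 3), pad_value, dtype=image.dtype)
    canvas[:new_h, :new_w] = resized  # paste top-left, pad the rest
    return canvas, scale

Some implementations center the resized image and pad both sides instead; in that case the padding offsets also have to be subtracted when mapping boxes back to the original image.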

OK, I tried swapping that in and it works now. It returns data like [DetectInfo(num=2, boxes=array([[ 0, 0.39727, 1, 2.9297],
[ 0.046484, 1.3242, 0.33008, 2.5406]], dtype=float32), scores=array([ 0.89941, 0.73145], dtype=float32), classes=array([ 0, 27]))], so a bit of arithmetic should get me the final result. Thank you.

It still doesn't seem completely right: some of the returned coordinate values exceed 1, and those can't be mapped back...
[[ 0, 0.39727, 1, 2.9297],
[ 0.046484, 1.3242, 0.33008, 2.5406]]
For example the 1.3242 and the trailing 2.5406 here.

The width and height that the letterbox in _preprocess_image takes are not the original image's width and height but the model input's width and height. Check whether that's where your problem is.
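
If the boxes then come back relative to the model input rather than the original image, they have to be mapped back with the same scale the letterbox used. A rough sketch of that inverse mapping, assuming normalized [0, 1] outputs relative to the model input and the top-left letterbox sketched above (the names boxes and scale are illustrative):

import numpy as np

def boxes_to_original(boxes, scale, input_w, input_h):
    # boxes: (N, 4) array of normalized [x1, y1, x2, y2] in model-input space.
    pixels = boxes * np.array([input_w, input_h, input_w, input_h], dtype=np.float32)
    return pixels / scale  # undo the uniform letterbox scale -> original-image pixels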

Yes, that solved it. I'm getting the correct coordinates now. Everything is OK on my end; all that's left is optimization.
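
For future readers, a rough end-to-end recap of what this thread converges on: letterbox the numpy frame against the model input size, then hand TRTYOLO.infer the batch together with the original size. The 640x640 input, [0, 1] scaling, and 114 padding value remain assumptions carried over from the sketches above, so verify against the repo's demo and the ImageBatcher source:

import cv2
import numpy as np
from tensorrt_yolo.infer import TRTYOLO

INPUT_W = INPUT_H = 640  # assumed model input size

model = TRTYOLO('model.engine')
model.warmup()

frame = cv2.imread('1.jpg')  # or any captured numpy frame (HWC, BGR, uint8)
h, w = frame.shape[:2]

# Letterbox against the MODEL INPUT size, not the original image size.
scale = min(INPUT_W / w, INPUT_H / h)
new_w, new_h = int(round(w * scale)), int(round(h * scale))
canvas = np.full((INPUT_H, INPUT_W, 3), 114, dtype=frame.dtype)
canvas[:new_h, :new_w] = cv2.resize(frame, (new_w, new_h))

blob = np.expand_dims(np.transpose(canvas.astype(np.float32) / 255.0, (2, 0, 1)), axis=0)
detections = model.infer(blob, [(w, h)])  # original (width, height) alongside the batch
print(detections)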