davisking/dlib

[Bug]: Failure to allocate cuda resources when using face_recognition library with "cnn" on Jetson Nano

marcjasner opened this issue · 6 comments

What Operating System(s) are you seeing this problem on?

Linux (aarch64)

dlib version

19.24

Python version

3.6.9

Compiler

gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)

Expected Behavior

I'm testing face detection and landmark generation on a Jetson Nano (running Ubuntu 18.04 and JetPack 4.6) using dlib (via the face_recognition library), with this simple Python script to measure timings:

#!/usr/bin/python3
import cv2
import numpy as np
import face_recognition as faceRegLib
import time

def current_milli_time():
    return round(time.time() * 1000)

img_bgr = faceRegLib.load_image_file('willsmith.jpg')
img_rgb = cv2.cvtColor(img_bgr,cv2.COLOR_BGR2RGB)

for i in range(10):
  startTime=current_milli_time()
  face = faceRegLib.face_locations(img_rgb, model="cnn")[0]
  locTime=current_milli_time()-startTime
  demo_encode = faceRegLib.face_encodings(img_rgb)[0]
  elapsedTime = current_milli_time()-startTime
  print("Elapsed Time = {}ms    detection Time = {}ms" .format(elapsedTime, locTime))

When face_locations() is called with model="hog", everything works, taking about 2.5 seconds on average for face detection and landmark generation.
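For comparing the "hog" and "cnn" timings, it may help to time the two stages separately and to leave the first iteration out of the average, since a first GPU call usually includes one-time setup. The following is only a sketch: time_stages and its parameters are hypothetical names, not part of dlib or face_recognition, and the detect/encode callables are passed in so the helper does not depend on either library. As far as I know, face_recognition.load_image_file() already returns an RGB array by default, so the cv2.cvtColor() call in the script above is probably unnecessary.

```python
import time
from statistics import mean


def time_stages(detect, encode, image, runs=10, warmup=1):
    """Time detect(image) and encode(image) separately, in milliseconds.

    The first `warmup` iterations are executed but not recorded.
    `runs` must be at least 1.
    """
    detect_ms, encode_ms = [], []
    for i in range(warmup + runs):
        t0 = time.perf_counter()
        detect(image)
        t1 = time.perf_counter()
        encode(image)
        t2 = time.perf_counter()
        if i >= warmup:
            detect_ms.append((t1 - t0) * 1000.0)
            encode_ms.append((t2 - t1) * 1000.0)
    return {
        "detect_ms": mean(detect_ms),
        "encode_ms": mean(encode_ms),
        "runs": runs,
    }


# Hypothetical usage with the script above (requires face_recognition):
#   img = faceRegLib.load_image_file('willsmith.jpg')
#   stats = time_stages(
#       lambda im: faceRegLib.face_locations(im, model="hog"),
#       faceRegLib.face_encodings,
#       img,
#   )
#   print(stats)
```

time.perf_counter() is used instead of time.time() because it is monotonic and intended for measuring short intervals.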

When I change the argument to model="cnn", which calls

cnn_face_detector = dlib.cnn_face_detection_model_v1(cnn_face_detection_model)

I expect it to work just as well, only much faster, because dlib (the latest GitHub source, 19.24) is compiled with CUDA support enabled.

Current Behavior

Running the script with model="cnn" results in the following errors:

Traceback (most recent call last):
  File "./faceboxes.py", line 16, in <module>
    face = faceRegLib.face_locations(img_rgb, model="cnn")[0]
  File "/home/marc/.local/lib/python3.6/site-packages/face_recognition/api.py", line 119, in face_locations
    return [_trim_css_to_bounds(_rect_to_css(face.rect), img.shape) for face in _raw_face_locations(img, number_of_times_to_upsample, "cnn")]
  File "/home/marc/.local/lib/python3.6/site-packages/face_recognition/api.py", line 103, in _raw_face_locations
    return cnn_face_detector(img, number_of_times_to_upsample)
RuntimeError: Error while calling cudnnFindConvolutionForwardAlgorithm( context(), descriptor(data), (const cudnnFilterDescriptor_t)filter_handle, (const cudnnConvolutionDescriptor_t)conv_handle, descriptor(dest_desc), num_possible_algorithms, &num_algorithms, perf_results.data()) in file /home/marc/src/dlib_github/dlib/dlib/cuda/cudnn_dlibapi.cpp:827. code: 2, reason: CUDA Resources could not be allocated.
cudaStreamDestroy() failed. Reason: the launch timed out and was terminated
cudaFree() failed. Reason: the launch timed out and was terminated
cudaFreeHost() failed. Reason: the launch timed out and was terminated

The cudaStreamDestroy(), cudaFree() and cudaFreeHost() errors repeat many times. Syslog contains a lot of messages like:

Apr 17 20:41:54 elroy kernel: [ 737.438691] 504-gm20b, pid 7436, refs 4:
Apr 17 20:41:54 elroy kernel: [ 737.438693] channel status: not in use pending busy
Apr 17 20:41:54 elroy kernel: [ 737.438699] RAMFC : TOP: 8000001f0005f8c0 PUT: 00000001001dbb3c GET: 00000001001dbb28 FETCH: 00000201001dbb3c

Steps to Reproduce

On a Jetson Nano (4 GB), clone the latest dlib repo with git, then build and install it using the following steps:

  1. cd dlib/
  2. sed -i 's,forward_algo = forward_best_algo;,//forward_algo = forward_best_algo;,g' dlib/cuda/cudnn_dlibapi.cpp
  3. mkdir build
  4. cd build
  5. cmake .. -DDLIB_USE_CUDA=1
  6. cmake --build .
  7. cd ..
  8. sudo python3 setup.py install
  9. pip3 install face_recognition

Step 2 fixes a known issue on Jetson Nano devices.

Then run the Python code above.

Anything else?

No response

Those cmake commands have no effect on the resulting Python install. The cmake outputs are not used when running setup.py. I would not run that sed command either; it is not recommended.

Anyway, your CUDA install is probably not done correctly. The CUDA toolkit is very particular about being installed just right. If you don't follow exactly the instructions NVIDIA gives for your platform, it will often fail like this. This is a common problem for people and not something dlib has any control over.
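Since the standalone cmake build is not what setup.py installs, one way to see what the installed Python module was actually built with is to ask dlib itself. This is a minimal sketch, assuming a dlib build that exposes the DLIB_USE_CUDA attribute and dlib.cuda.get_num_devices(); dlib_cuda_report is a hypothetical helper name, and the function falls back to None values when dlib is not importable.

```python
def dlib_cuda_report():
    """Return what the installed dlib Python module reports about CUDA."""
    try:
        import dlib
    except ImportError:
        # dlib is not installed for this interpreter.
        return {"installed": False, "version": None, "use_cuda": None, "devices": None}

    report = {
        "installed": True,
        "version": getattr(dlib, "__version__", None),
        # True only if the Python module itself was compiled with CUDA.
        "use_cuda": getattr(dlib, "DLIB_USE_CUDA", None),
        "devices": None,
    }
    if report["use_cuda"]:
        try:
            report["devices"] = dlib.cuda.get_num_devices()
        except Exception as exc:
            # A broken CUDA setup can fail here; keep the message for diagnosis.
            report["devices"] = "error: {}".format(exc)
    return report


if __name__ == "__main__":
    print(dlib_cuda_report())
```

If use_cuda comes back False, the "cnn" model would run on the CPU regardless of how the separate cmake build was configured.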

Thanks for the reply. I'll rebuild it without the sed change.

CUDA was pre-installed by NVIDIA in the Jetson Nano OS image. I'll double-check the installation and see if anything sticks out, though. Thanks again.

In 2022 I used the Jetson Nano 4GB. As I remember, cuDNN was not installed correctly in the image provided by NVIDIA, and I had problems compiling and running C++ programs. After I reinstalled cuDNN, everything worked perfectly. I think you need to reinstall cuDNN.
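Before reinstalling, a quick way to check whether a cuDNN shared library can be loaded at all is to call its cudnnGetVersion() function through ctypes. This is only a sketch: cudnn_runtime_version is a hypothetical helper name, it relies on the system's library search finding libcudnn, and it says nothing about whether the library matches the installed CUDA toolkit or the version dlib was compiled against.

```python
import ctypes
import ctypes.util


def cudnn_runtime_version():
    """Return the integer reported by cudnnGetVersion(), or None if
    libcudnn cannot be found or loaded on this machine."""
    name = ctypes.util.find_library("cudnn")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
        get_version = lib.cudnnGetVersion
    except (OSError, AttributeError):
        # Library present but not loadable, or the symbol is missing.
        return None
    get_version.restype = ctypes.c_size_t
    get_version.argtypes = []
    return int(get_version())


if __name__ == "__main__":
    print("cuDNN runtime version:", cudnn_runtime_version())
```

A None result suggests the loader cannot see cuDNN, which would be consistent with the incomplete install described above.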

Warning: this issue has been inactive for 37 days and will be automatically closed on 2024-06-28 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

Notice: this issue has been closed because it has been inactive for 45 days. You may reopen this issue if it has been closed in error.