shreyashampali/HOnnotate

ResourceExhaustedError: OutOfMemory error

Closed this issue · 6 comments

Hi @shreyashampali ,

Thank you for the code. I am trying to run it on the test folder you provided.

System Configuration: GTX 1050Ti 4GB RAM

When I run the command "objectTrackingSingleFrame.py --seq 'test'", it reports a GPU out-of-memory error. Is there a way to run it on a 4 GB graphics card?

Are there any flags that I could modify to reduce the amount of data?

Thank you for your suggestions.

Error cleared

How was this error resolved? I am encountering the very same error right now, and I am using the following two GPUs:
RTX 3090
Quadro K2200

I specify the two GPUs as follows:
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0, 1"
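
One thing that may be worth trying, as a sketch and not a confirmed fix, is to expose only one GPU to the process. With CUDA_DEVICE_ORDER set to PCI_BUS_ID, the full log further down lists the RTX 3090 as device 0 (pciBusID 0000:02:00.0), so the index 0 should select it. The helper name `restrict_to_gpu` is hypothetical and not part of HOnnotate.

```python
import os


def restrict_to_gpu(index):
    """Expose a single GPU to CUDA, selected by its PCI-bus-ordered index.

    Hypothetical helper for illustration; not part of HOnnotate.
    """
    # Order devices by PCI bus ID so the index matches what nvidia-smi shows.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # A single index (no spaces) hides every other GPU from this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Call this before `import tensorflow`, since CUDA reads the variables
# once, when it is first initialized.
visible = restrict_to_gpu(0)
print("CUDA_VISIBLE_DEVICES =", visible)
```

Whether hiding the Quadro K2200 changes the outcome here is untested; it only removes the second card as a variable.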

I took the following steps to address the GPU memory error:
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.5
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
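
As a rough check that these options take effect, the 0.5 fraction can be compared with the device sizes TensorFlow reports in the full log below (totalMemory of 23.70GiB and 3.95GiB; devices created with 12132 MB and 2021 MB). The sketch only does that arithmetic. The function name is made up for illustration, and the small differences are assumed to come from rounding in the printed totals.

```python
def expected_limit_mib(total_gib, fraction):
    """Memory cap in MiB implied by per_process_gpu_memory_fraction."""
    return total_gib * 1024 * fraction


# name -> (totalMemory in GiB from the log, MB the device was created with)
devices = {
    "RTX 3090": (23.70, 12132),
    "Quadro K2200": (3.95, 2021),
}

for name, (total_gib, logged_mb) in devices.items():
    expected = expected_limit_mib(total_gib, 0.5)
    print("{}: expected ~{:.0f} MiB, log shows {} MB".format(
        name, expected, logged_mb))
```

Both logged values land within a few MiB of half the card, which suggests the 0.5 cap was applied on both GPUs.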

However, a memory error still occurs. This is a typical error message:
2024-01-05 15:52:04.495217: E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED

How did it work in your case?

This is the full text of the error:

(HOnnotate) initial@d3050eb4fa4f:~/workspace/HOnnotate/optimization$ python objectTrackingSingleFrame.py --seq 'test'
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/IPython/lib/pretty.py:91: DeprecationWarning: IPython.utils.signatures backport for Python 2 is deprecated in IPython 6, which only supports Python 3
from IPython.utils.signatures import signature
0-0
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
return _inspect.getargspec(target)
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
return _inspect.getargspec(target)
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
return _inspect.getargspec(target)
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
return _inspect.getargspec(target)
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
return _inspect.getargspec(target)
WARNING:tensorflow:From /home/initial/workspace/HOnnotate/optimization/ghope/loss.py:225: Print (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2018-08-20.
Instructions for updating:
Use tf.print instead of tf.Print. Note that tf.print returns a no-output operator that directly prints the output. Outside of defuns or eager mode, this operator will not be executed unless it is directly specified in session.run or used as a control dependency for other operators. This is only a concern in graph mode. Below is an example of how to ensure tf.print executes in graph mode:

    sess = tf.Session()
    with sess.as_default():
        tensor = tf.range(10)
        print_op = tf.print(tensor)
        with tf.control_dependencies([print_op]):
          out = tf.add(tensor, tensor)
        sess.run(out)
    ```
Additionally, to use tf.print in python 2.7, users must make sure to import
the following:

  `from __future__ import print_function`

/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/chumpy/ch.py:1203: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  want_out = 'out' in inspect.getargspec(func).args
/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/numpy/matrixlib/defmatrix.py:68: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
  return matrix(data, dtype=dtype, copy=False)
/home/initial/workspace/HOnnotate/optimization/object/batch_object.py:65: RuntimeWarning: invalid value encountered in true_divide
  vn = mesh.vn / np.expand_dims(np.linalg.norm(mesh.vn, ord=2, axis=1), 1) # normalize to unit vec
/home/initial/workspace/HOnnotate/optimization/ghope/loss.py:106: DeprecationWarning: Both axis > a.ndim and axis < -a.ndim - 1 are deprecated and will raise an AxisError in the future.
  gaussKernel = np.expand_dims(gaussKernel, 3)
/home/initial/workspace/HOnnotate/optimization/ghope/loss.py:106: DeprecationWarning: Both axis > a.ndim and axis < -a.ndim - 1 are deprecated and will raise an AxisError in the future.
  gaussKernel = np.expand_dims(gaussKernel, 3)
/home/initial/workspace/HOnnotate/optimization/ghope/loss.py:106: DeprecationWarning: Both axis > a.ndim and axis < -a.ndim - 1 are deprecated and will raise an AxisError in the future.
  gaussKernel = np.expand_dims(gaussKernel, 3)
/home/initial/workspace/HOnnotate/optimization/ghope/loss.py:106: DeprecationWarning: Both axis > a.ndim and axis < -a.ndim - 1 are deprecated and will raise an AxisError in the future.
  gaussKernel = np.expand_dims(gaussKernel, 3)
WARNING:tensorflow:From /home/initial/workspace/HOnnotate/optimization/ghope/icp.py:112: arg_min (from tensorflow.python.ops.gen_math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `argmin` instead
2024-01-05 16:08:16.293375: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2024-01-05 16:08:16.509855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: NVIDIA GeForce RTX 3090 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:02:00.0
totalMemory: 23.70GiB freeMemory: 22.74GiB
2024-01-05 16:08:16.603334: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 1 with properties: 
name: Quadro K2200 major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:03:00.0
totalMemory: 3.95GiB freeMemory: 3.91GiB
2024-01-05 16:08:16.603402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1
2024-01-05 16:17:48.391952: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-01-05 16:17:48.392019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 1 
2024-01-05 16:17:48.392030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N N 
2024-01-05 16:17:48.392037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1:   N N 
2024-01-05 16:17:48.429163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 12132 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:02:00.0, compute capability: 8.6)
2024-01-05 16:17:48.429938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 2021 MB memory) -> physical GPU (device: 1, name: Quadro K2200, pci bus id: 0000:03:00.0, compute capability: 5.0)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
maskPC for Image test/0/00002 is 0.132587
[Loading New frame ][0/00002]
2024-01-05 16:19:51.326590: E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed : a.shape=[1,262146,4], b.shape=[1,4,4], m=262146, n=4, k=4
         [[{{node obj6/MatMul_1}} = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](obj6/Tile, obj6/MatMul)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "objectTrackingSingleFrame.py", line 356, in <module>
    objectTracker(w, h, paramInit, camProp, mesh, out_dir, configData)
  File "objectTrackingSingleFrame.py", line 154, in objectTracker
    opti1.runOptimization(session, 1, {loadData:True})
  File "/home/initial/workspace/HOnnotate/optimization/ghope/utils.py", line 159, in timed
    result = method(*args, **kw)
  File "/home/initial/workspace/HOnnotate/optimization/ghope/optimization.py", line 68, in runOptimization
    session.run(self.optOp, feed_dict=feedDict)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed : a.shape=[1,262146,4], b.shape=[1,4,4], m=262146, n=4, k=4
         [[node obj6/MatMul_1 (defined at /home/initial/workspace/HOnnotate/optimization/object/batch_object.py:51)  = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](obj6/Tile, obj6/MatMul)]]

Caused by op 'obj6/MatMul_1', defined at:
  File "objectTrackingSingleFrame.py", line 356, in <module>
    objectTracker(w, h, paramInit, camProp, mesh, out_dir, configData)
  File "objectTrackingSingleFrame.py", line 73, in objectTracker
    finalMesh = scene.getFinalMesh()
  File "/home/initial/workspace/HOnnotate/optimization/ghope/scene.py", line 718, in getFinalMesh
    objMesh = self.objModel(prop.mesh, rotObj, transObj, prop.segColor, name=prop.ID)
  File "/home/initial/workspace/HOnnotate/optimization/object/batch_object.py", line 51, in __call__
    verts = tf.matmul(vertices, self.objPoseMat) # NxMx4
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 2019, in matmul
    a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1245, in batch_mat_mul
    "BatchMatMul", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InternalError (see above for traceback): Blas xGEMM launch failed : a.shape=[1,262146,4], b.shape=[1,4,4], m=262146, n=4, k=4
         [[node obj6/MatMul_1 (defined at /home/initial/workspace/HOnnotate/optimization/object/batch_object.py:51)  = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](obj6/Tile, obj6/MatMul)]]
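
For scale, here is a back-of-the-envelope size check on the failing op, using only the shapes printed in the error (a.shape=[1,262146,4], b.shape=[1,4,4], T=DT_FLOAT). It is a sketch, and `tensor_bytes` is a hypothetical helper name.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # DT_FLOAT is a 32-bit float


def tensor_bytes(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed to hold a dense tensor of the given shape."""
    return reduce(mul, shape, 1) * itemsize


a_bytes = tensor_bytes([1, 262146, 4])    # left operand
b_bytes = tensor_bytes([1, 4, 4])         # right operand
out_bytes = tensor_bytes([1, 262146, 4])  # result has the same shape as a

total_mib = (a_bytes + b_bytes + out_bytes) / (1024.0 * 1024.0)
print("operands + output: {:.2f} MiB".format(total_mib))
```

That is about 8 MiB against the roughly 12 GB limit TensorFlow reports for GPU:0, so this multiply on its own is unlikely to exhaust memory. One possibility, stated as an assumption and not verified here, is that the TensorFlow 1.x build in this environment was compiled against a CUDA version that predates the RTX 3090 (compute capability 8.6), in which case cuBLAS calls can fail regardless of the memory settings.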

I have the same error. Have you resolved it?

Error cleared

Hello brother, I have the same problem as you. Have you solved it now?

    feed_dict_tensor, options, run_metadata)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMM launch failed : a.shape=[1,262146,4], b.shape=[1,4,4], m=262146, n=4, k=4
         [[node obj6/MatMul_1 (defined at /home/initial/workspace/HOnnotate/optimization/object/batch_object.py:51)  = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](obj6/Tile, obj6/MatMul)]]

Caused by op 'obj6/MatMul_1', defined at:
  File "objectTrackingSingleFrame.py", line 356, in <module>
    objectTracker(w, h, paramInit, camProp, mesh, out_dir, configData)
  File "objectTrackingSingleFrame.py", line 73, in objectTracker
    finalMesh = scene.getFinalMesh()
  File "/home/initial/workspace/HOnnotate/optimization/ghope/scene.py", line 718, in getFinalMesh
    objMesh = self.objModel(prop.mesh, rotObj, transObj, prop.segColor, name=prop.ID)
  File "/home/initial/workspace/HOnnotate/optimization/object/batch_object.py", line 51, in __call__
    verts = tf.matmul(vertices, self.objPoseMat) # NxMx4
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 2019, in matmul
    a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1245, in batch_mat_mul
    "BatchMatMul", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/initial/miniconda3/envs/HOnnotate/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InternalError (see above for traceback): Blas xGEMM launch failed : a.shape=[1,262146,4], b.shape=[1,4,4], m=262146, n=4, k=4
         [[node obj6/MatMul_1 (defined at /home/initial/workspace/HOnnotate/optimization/object/batch_object.py:51)  = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](obj6/Tile, obj6/MatMul)]]
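
A note on the shapes in this error: the failing BatchMatMul is small, so a plain out-of-memory condition looks unlikely here. A rough check, assuming float32 (the message says DT_FLOAT, 4 bytes per element) and taking the shapes straight from the message above. The helper name tensor_bytes is made up for illustration and is not part of HOnnotate or TensorFlow.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # DT_FLOAT in the error message is a 32-bit float


def tensor_bytes(shape, itemsize=FLOAT32_BYTES):
    """Return the size in bytes of a dense tensor with the given shape."""
    return reduce(mul, shape, 1) * itemsize


# Shapes copied from the "Blas xGEMM launch failed" message.
a_shape = [1, 262146, 4]
b_shape = [1, 4, 4]
# Output of a batched (m x k) @ (k x n) product with m=262146, n=4, k=4.
out_shape = [1, 262146, 4]

total = tensor_bytes(a_shape) + tensor_bytes(b_shape) + tensor_bytes(out_shape)
print("a:     %d bytes" % tensor_bytes(a_shape))
print("b:     %d bytes" % tensor_bytes(b_shape))
print("total: %.2f MiB" % (total / 2.0 ** 20))
```

That comes to roughly 8 MiB for both inputs and the output combined, while the log reports 12132 MB reserved on GPU:0. So the cuBLAS failure on this op probably has a cause other than the size of the data.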

I have the same error. Have you resolved it?

Hello brother, I have the same problem as you. Have you solved it now?
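
Not a confirmed fix, but one thing worth checking for anyone hitting this on a newer card: the log above reports compute capability 8.6 for the RTX 3090, and CUDA toolkits older than 11.1 cannot build native kernels for that architecture. The CUDA version of this environment is not shown in the log, so the "9.0" below is only an assumption for illustration. The roughly nine-minute pause between "Adding visible gpu devices" (16:08:16) and the next log line (16:17:48) is often a sign that the driver is JIT-compiling kernels for a GPU the binary was not built for. A small sketch that pulls the capability out of the "Created TensorFlow device" lines and compares it against the toolkit limit; unsupported_devices is a hypothetical helper, not a TensorFlow API.

```python
import re

# Highest compute capability each CUDA toolkit release can target natively.
MAX_CAPABILITY = {
    "9.0": (7, 0),
    "10.0": (7, 5),
    "11.0": (8, 0),
    "11.1": (8, 6),
}

# Matches the tail of a "Created TensorFlow device" log line.
DEVICE_RE = re.compile(
    r"name: (?P<name>.+?), pci bus id: \S+, "
    r"compute capability: (?P<major>\d+)\.(?P<minor>\d+)\)"
)


def unsupported_devices(log_text, cuda_version):
    """List (name, capability) for GPUs newer than the toolkit supports."""
    limit = MAX_CAPABILITY[cuda_version]
    found = []
    for match in DEVICE_RE.finditer(log_text):
        cap = (int(match.group("major")), int(match.group("minor")))
        if cap > limit:
            found.append((match.group("name"), cap))
    return found


# Device descriptions copied from the log in this thread.
log = """
(device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:02:00.0, compute capability: 8.6)
(device: 1, name: Quadro K2200, pci bus id: 0000:03:00.0, compute capability: 5.0)
"""

# "9.0" is an assumed toolkit version; substitute the one actually installed.
print(unsupported_devices(log, "9.0"))
```

If the 3090 shows up in that list for your installed toolkit, the options would be a TensorFlow build made against CUDA 11.1 or newer, or running on a GPU the existing build does support. I have not verified either against this repository.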