uncomplicate/neanderthal

Call to 'native_sin' is ambiguous

pfeodrippe opened this issue · 6 comments

I'm on a mid-2012 MacBook Pro with an Intel HD 4000 graphics card that supports OpenCL 1.2.

I'm trying to run this code:

(ns playground
  (:use [uncomplicate.neanderthal core native])
  (:require [uncomplicate.clojurecl.info :as cl-info]
            [uncomplicate.clojurecl.core :as cl]
            [uncomplicate.commons.core :refer [with-release let-release]]
            [uncomplicate.neanderthal
             [native :refer [fv fge native-float]]
             [core :refer [submatrix native asum]]
             [math :refer [sqrt log sin pi sqr]]
             [opencl :refer [clv clge] :as opencl]
             [random :refer [rand-normal! rand-uniform! rng-state]]]))

(cl/with-platform (first (cl/platforms))
  (let [dev (first (cl/sort-by-cl-version (cl/devices :gpu)))]
    (cl/with-context (cl/context [dev])
      (cl/with-queue (cl/command-queue-1 dev)
        (opencl/with-default-engine
          23)))))

The error is:

{:name "CL_BUILD_PROGRAM_FAILURE",
 :code -11,
 :type :opencl-error,
 :details
 ({:build-status :error,
   :build-options
   "-DREAL=float -DREAL4=float4 -DWGS=512 -I /var/folders/xl/0gx4mcfd1qv1wvcxvcntqzfw0000gn/T/uncomplicate_10950292996318470328/",
   :build-log "REDACTED",
   :binary-type :none,
   :global-variable-total-size "CL_INVALID_OPERATION"})}

It appears that some errors happen when compiling "random.cl"; you can see the build log at https://gist.github.com/pfeodrippe/3e093390c5a93d59cc55077edb2011e9.

Well, I've pulled Neanderthal locally and replaced all the doubles with floats in random.cl, and it works now. I've just discovered that my GPU doesn't support double precision.
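For reference, here's roughly how I confirmed that (a sketch; I'm assuming ClojureCL's info lookup returns the device's extensions string, and supports-doubles? is just a helper name I made up):

;; Sketch: report which GPUs expose the cl_khr_fp64 extension.
;; Devices without it (like my HD 4000) can't build kernels that
;; use double-precision types or literals.
(defn supports-doubles? [dev]
  (boolean (re-find #"cl_khr_fp64"
                    (str (cl-info/info dev :extensions)))))

(cl/with-platform (first (cl/platforms))
  (mapv (juxt #(cl-info/info % :name) supports-doubles?)
        (cl/devices :gpu)))
;; => e.g. [["HD Graphics 4000" false]] on a device without fp64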

If this is something you'd like to solve, maybe it would be better to have two .cl files: one for older-generation GPUs like mine, and the current one. I could tackle this soon if you don't have the time or an old GPU o/

In case performance on this GPU interests you: generating one hundred million random numbers takes ~200 ms. When I try to generate one billion, it blows up with an out-of-memory error.
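Here's roughly the timing harness I used (a sketch; I'm assuming ClojureCL's with-default sets up the default platform/context/queue, and that finish! blocks until the queue drains so the kernel run is included in the measurement):

;; Sketch: time the generation of 10^8 uniform floats on the GPU.
(cl/with-default
  (opencl/with-default-engine
    (with-release [gpu-x (clv (* 100 1000 1000))]
      (time
        (do (rand-uniform! gpu-x)
            (cl/finish!)))))) ; wait for the GPU before stopping the clock
;; "Elapsed time: ~200 msecs" on this HD 4000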

Thanks for your great library, it's awesome!

On the other hand, why use a slow and old GPU in this case when Neanderthal's CPU engine will do the same task faster, more reliably, and without any setup/context?
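For comparison, the CPU-engine version of the same measurement is just this (a sketch, same hundred-million size):

;; Sketch: the same generation on the native CPU engine;
;; no OpenCL platform/context/queue setup is needed.
(with-release [x (fv (* 100 1000 1000))]
  (time (rand-uniform! x)))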

Good question. I'm introducing myself to finite element analysis using Neanderthal, which has been a wonderful experience, and I'd like to speed up numerical differentiation/integration using the GPU.

I'll test and develop everything first on my old MacBook Pro (Apple products are actually very expensive here in Brazil), and then maybe rent some Amazon machines to see, among other things, the real performance gain.

Does that make sense to you?

But the code you'd write for fast Nvidia or AMD GPUs will be different from the code you'd write for Intel's low-end integrated card. Nvidia and AMD have similar architectures, but Intel's HD is quite different.

Hmm, understood. I'll start reading "OpenCL in Action" to learn more about it.

Thank you @blueberry, I'll be closing this Issue o/

And as you've said, based on your examples at https://dragan.rocks/articles/19/Billion-random-numbers-blink-eye-Clojure, the native code is only ~1.5x slower than this GPU at the scale of one hundred million random numbers, so it doesn't seem worth supporting.