mn416/QPULib

Proposal: Document for Code Usage Notes

Opened this issue · 9 comments

I would appreciate an overall document that lists the special attributes of the library. The goal is to make the code more understandable and to get any potential users up to speed more quickly.

@mn416 I would like to hear if you think this is a good idea, and if the format is OK. If so, I'll flesh it out with the items I have pending for it.


Code Usage Notes

This document contains specific things to know, gotchas and limitations of the QPULib library. By being aware of these things, it is hoped that the code will be easier to use for your own purposes.

Function compile() is not Thread-Safe

Function compile() is used to compile a kernel from a class generator definition into a format that is runnable on a QPU. This uses global heaps internally for e.g. generating the AST and for storing the resulting statements.

Because the heaps are global, running compile() in parallel on different threads will lead to problems. The result of the compile, however, should be fine, so it's possible to have multiple kernel instances on different threads.

As long as you run compile() on a single thread at a time, you're OK.
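One way to guarantee this is to route every compilation through a single lock. The following is a minimal sketch only: compileStandIn() is a hypothetical placeholder for the real compile() call, and the counter merely plays the role of the global heaps, so the example runs without the library.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a call to compile(). It touches global state
// without any synchronization, as the global heaps do.
static int g_compileCalls = 0;

void compileStandIn() {
  ++g_compileCalls;
}

// One process-wide lock that every compilation must take.
static std::mutex g_compileMutex;

void compileSerialized() {
  std::lock_guard<std::mutex> lock(g_compileMutex);
  compileStandIn();  // only one thread at a time gets here
}

// Starts numThreads threads that each compile once, waits for all of
// them, and returns the number of compilations that were counted.
int compileOnThreads(int numThreads) {
  g_compileCalls = 0;
  std::vector<std::thread> threads;
  for (int i = 0; i < numThreads; ++i) {
    threads.emplace_back(compileSerialized);
  }
  for (auto& t : threads) {
    t.join();
  }
  return g_compileCalls;
}
```

The lock only needs to cover the compile step itself; the compiled kernels can then be used on their own threads.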

....<more to come>...

mn416 commented

Hi @wimrijnders,

I agree this is a good thing to document. "Code Usage Notes" sounds a bit vague to me. The particular example you give seems to fit better into a "Known Limitations" document. With more examples of the things you want to include, we might come up with a better name, or even decide that multiple documents (or sections) are required, rather than one.

OK. I have more things to document over various subjects and I was looking for a general title. This particular thing was indeed a 'known limitation', but I also want to add parts on, for example, the code generators and the workings of the QPU, as far as I understand them so far.

If you have a better term, please let me know. 'Developer Notes' perhaps?

How about I start it out with a 'Known Limitations' page? And see where it leads from there?

This would be another item I would like to add to 'Known Limitations'.


Float multiplication on the QPU always rounds downwards

Most CPUs make an effort to round up or down to the value nearest to the actual result of a multiplication. The ARM is one of those. The QPUs of the VideoCore, however, do not make such an effort: they always round downward.

This means that there will be small differences in the outputs of the exact same calculation on the CPU and a QPU; at first only in the least significant bits, but if you continue calculating, the differences will accumulate.

Expect results to differ between CPU and QPU calculations.

Of special note, the results of the QPULib interpreter and of the actual VideoCore hardware will likely be different.
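In practice this means that CPU and QPU outputs should be compared with a tolerance rather than for exact equality. A minimal sketch, assuming a relative tolerance is acceptable for your data; nearlyEqual() is a hypothetical helper, not part of QPULib:

```cpp
#include <algorithm>
#include <cmath>

// Returns true when a and b differ by at most relTol times the larger
// of their magnitudes. Hypothetical helper for comparing a CPU result
// with a QPU result.
bool nearlyEqual(float a, float b, float relTol) {
  float diff  = std::fabs(a - b);
  float scale = std::max(std::fabs(a), std::fabs(b));
  return diff <= relTol * scale;
}
```

The tolerance has to grow with the length of the calculation, because the rounding differences accumulate.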

Another chapter which I find useful, in light of recent experiences.


Handling privileges

In order to use the VideoCore, special privileges are required to access certain device files. The default way is to run the applications with sudo.

You might run into the following situation (e.g.):

> obj-qpu/bin/detectPlatform 
Detected platform: Raspberry Pi 2 Model B Rev 1.1
Can't open device file: /dev/vcio
Try creating a device file with: sudo mknod /dev/vcio c 100 0

The solution for this is to become a member of group video:

> sudo usermod -a -G video <user>

Where you fill in a relevant user name for <user>. To enable this, log out and log in again, or start a new shell.

Unfortunately, this solution will not work for access to /dev/mem. You will still need to run with sudo for any application that uses the VideoCore hardware.
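An application can check up front whether it has the access it needs, and print a clearer message than a failed open would. A minimal sketch using the POSIX access() call; canReadWrite() is a hypothetical helper, not part of QPULib:

```cpp
#include <unistd.h>

// Returns true when the current user may both read and write the given
// path. Returns false as well when the path does not exist.
bool canReadWrite(const char* path) {
  return access(path, R_OK | W_OK) == 0;
}
```

For example, canReadWrite("/dev/vcio") tells you whether the group membership has taken effect, and canReadWrite("/dev/mem") whether the application still needs to be run with sudo.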

@mn416 Note the edits in previous comment "Handling privileges".

Another chapter. Putting it here to get rid of the Post-its littering my desk.


Heap sizes are fixed

Heaps are used in several locations of the QPULib code. The heaps are a library-specific implementation; their main objective is to limit the amount of memory used during execution.

The heaps have an upper limit in size; if your application is complex enough, you are likely to run into it. Error messages will be shown that indicate the problem. Heap sizes can be adjusted easily enough by changing the heap size in the code and recompiling.
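To illustrate the behaviour, here is a minimal sketch of a fixed-capacity heap. FixedHeap is a hypothetical class written for this note; it is not the QPULib implementation and it ignores alignment.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical fixed-size heap: all memory is reserved up front and
// allocations fail once the capacity is used up.
class FixedHeap {
public:
  explicit FixedHeap(std::size_t capacity) : m_buffer(capacity), m_used(0) {}

  // Returns a pointer to size bytes, or nullptr when the heap is full.
  void* alloc(std::size_t size) {
    if (size > m_buffer.size() - m_used) {
      std::fprintf(stderr, "FixedHeap: out of memory (capacity %zu bytes)\n",
                   m_buffer.size());
      return nullptr;
    }
    void* p = m_buffer.data() + m_used;
    m_used += size;
    return p;
  }

  std::size_t used() const { return m_used; }

private:
  std::vector<unsigned char> m_buffer;  // the reserved memory
  std::size_t m_used;                   // number of bytes handed out so far
};
```

With this design the only remedy for a full heap is a larger capacity, which is why a recompile is needed.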

New chapter, based on recent insights.


Known limitations for old distributions

The following is known to occur with Raspbian wheezy.

Certain expected functions are not defined yet

Notably, missing in /opt/vc/include/bcm_host.h:

  • bcm_host_get_peripheral_address()
  • bcm_host_get_peripheral_size()

There is a check on the presence of these function definitions in the Makefile; if not detected, drop-in local functions will be used instead.

However, the detection is not foolproof. If you get compilation errors anyway due to absence of these bcm functions, force the following makefile variable to 'no':

#USE_BCM_HEADERS:= $(shell grep bcm_host_get_peripheral_address /opt/vc/include/bcm_host.h && echo "no" || echo "yes")
USE_BCM_HEADERS:=no       # <---- use this instead

Known limitations with compiler

Raspbian wheezy uses gcc version (@mn416 please supply your gcc version here!).
At the time of writing, the code is compiled with -std=c++0x.

  • This gcc version does not compile inline initialization of class variables (C++11 standard):
class Klass {
  Klass(): m_value(0) {}    // <-- Use this instead

  int m_value{0};           // <-- This won't compile
};
  • Some function definitions, which are found automatically in C++11, need explicit includes.

Known cases (there may be more):

Function        Needs include
exit(int)       #include <stdlib.h>
errno()         #include <errno.h>
printf() etc    #include <stdio.h>
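A minimal sketch of code that spells out all three includes, so that it does not depend on what the compiler pulls in by itself. The functions openErrorFor() and requireFile() are hypothetical examples, not part of QPULib:

```cpp
#include <errno.h>   // errno
#include <stdio.h>   // printf(), fopen(), fclose()
#include <stdlib.h>  // exit()

// Tries to open path for reading. Returns 0 on success, or the errno
// value that fopen() set on failure.
int openErrorFor(const char* path) {
  FILE* f = fopen(path, "r");
  if (f == NULL) {
    int err = errno;  // save it first: printf() may change errno
    printf("Cannot open %s (errno %d)\n", path, err);
    return err;
  }
  fclose(f);
  return 0;
}

// Ends the program when path cannot be opened.
void requireFile(const char* path) {
  if (openErrorFor(path) != 0) {
    exit(EXIT_FAILURE);
  }
}
```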

@mn416 Can you agree that a 'Known Limitations' document is a good idea? Can I add it to Docs?