Ricks-Lab/gpu-utils

User Guide - Contributors Needed

Ricks-Lab opened this issue · 12 comments

I have started a new markdown file as a User Guide. If you would like to contribute, edit USER_GUIDE.md and open a pull request. If you are not that familiar with markdown, don't worry too much about formatting; I will tune the look and feel of the document. Thanks!

@csecht I have made some progress in writing a user guide. It would be great if you could look it over and make any modifications you think appropriate. You should be able to fork it, make edits, and open a pull request.

Yes, that worked. Something new for me! I only made a few edits, mostly typographical. In the amdgpu-pac section, I feel something should be said about how, in the current version, fan speed will decrease with each PAC Save, although there are stable values. I don't know whether that's specific to my system or general to other AMD cards, drivers, etc.

I saw the same effect for fans on the Radeon VII, so I think it is a good idea to mention it.

Have you opened a pull request yet? It would be good to make sure we can merge your changes before we get too far.

Yes, I have a pull request open under my csecht fork.

I think that pull request only applies to your fork. You need to push to mine. I think the easiest approach is to click the edit icon at the top right when viewing the file in my repository. This will create a fork and put you in edit mode, with a button at the bottom to open a pull request.

Here is an article on how to push a change from a clone of your fork to the original repository. I think that is the best approach. fork

@csecht
I was considering adding two more sections to the guide:

  1. GPU Type - a discussion of the characteristics of pre- and post-Radeon VII GPUs and how amdgpu-utils handles them.
  2. A description of how to make changes take effect at boot.

Number requires some investigation.

In the Getting Started section of the User Guide, it says to check that an amdgpu driver package is installed with this command:

dpkg -l amdgpu-core amdgpu amdgpu-pro

On my system, Ubuntu 18.04.3, it reported:

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                     Version           Architecture      Description
+++-========================-=================-=================-=====================================================
ii  amdgpu-core              19.20-812932      all               Core meta package for unified amdgpu driver.
dpkg-query: no packages found matching amdgpu
dpkg-query: no packages found matching amdgpu-pro

which is actually sufficient, but only so long as the OpenCL drivers have been installed from within an amdgpu-pro-19.20-* package directory using this command, for example:

<user>:~/amdgpu-pro-19.20-812932-ubuntu-18.04$ ./amdgpu-pro-install --opencl=legacy --headless

For Polaris and earlier AMD GPUs, the 5.0.0 Linux kernel has the necessary amdgpu drivers, so no additional amdgpu driver installation is needed, just the OpenCL components from the amdgpu-pro package. (I'm assuming that for Vega cards the amdgpu-pro driver stack does need to be installed, but I am unsure.)
In fact, when I install the 19.20 amdgpu drivers from a downloaded AMD package, something in that installation prevents Ubuntu Desktop from loading after a reboot, resulting in a login screen loop. (It is fixable, but it is a bit of a pain and somewhat panic-inducing for inexperienced users.)
I don't know how it works with other AMD GPU configurations, but it seems like more information or clarification is needed concerning the dpkg check for amdgpu drivers. For example, when it reports "no packages found matching amdgpu", folks shouldn't think they need to install amdgpu or amdgpu-pro when amdgpu-core alone is sufficient. I'm not sure how this should be worded in the User Guide. Is it okay if only one of the three packages is reported as installed?
Also, perhaps users should check whether OpenCL is installed? I know it is needed for Einstein@Home crunching, but it is not required for amdgpu-utils. This may be too much detail for the User Guide, but on the other hand that section is about amdgpu installation.
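
As a rough illustration of what such an OpenCL check might look like, here is a minimal Python sketch. It is not something amdgpu-utils does. It assumes the standard Linux OpenCL ICD loader convention, in which each installed implementation registers a `.icd` file under `/etc/OpenCL/vendors`; the function name `opencl_icd_files` is hypothetical. Finding a file only suggests that an OpenCL implementation is registered, not that it works with a given GPU; running `clinfo`, where installed, gives a fuller picture.

```python
from pathlib import Path


def opencl_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return the sorted names of .icd files registered with the ICD loader.

    Assumption: the system follows the usual Linux ICD loader layout.
    An empty list means no OpenCL implementation appears to be registered.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        # No vendors directory at all: nothing is registered.
        return []
    return sorted(path.name for path in directory.glob("*.icd"))


def opencl_appears_installed(vendors_dir="/etc/OpenCL/vendors"):
    """True if at least one ICD file is present (a hint, not a guarantee)."""
    return bool(opencl_icd_files(vendors_dir))
```

Calling `opencl_icd_files()` with no arguments inspects the default location; passing another directory is mainly useful for testing.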

Good point. The original instruction to check all three packages reflects how the check is implemented in the code: it checks all three, one at a time, and verifies that at least one is valid. I have modified the User Guide to use the following command for the check instead:

dpkg -l 'amdgpu*'
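
The "at least one of the three packages" rule described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the actual amdgpu-utils code: the helper names `installed_packages`, `has_amdgpu_driver`, and `query_dpkg` are hypothetical, and the parser simply treats any `dpkg -l` row whose status column is `ii` as installed.

```python
import subprocess

# The three package names mentioned in the thread.
CANDIDATES = ("amdgpu-core", "amdgpu", "amdgpu-pro")


def installed_packages(dpkg_output):
    """Return the set of package names that `dpkg -l` lists with status 'ii'."""
    found = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Header rows and "dpkg-query: no packages found" messages
        # never start with the 'ii' status code, so they are skipped.
        if len(fields) >= 2 and fields[0] == "ii":
            # Drop an architecture qualifier such as ":amd64" if present.
            found.add(fields[1].split(":")[0])
    return found


def has_amdgpu_driver(dpkg_output, candidates=CANDIDATES):
    """True if at least one candidate package is reported as installed."""
    installed = installed_packages(dpkg_output)
    return any(name in installed for name in candidates)


def query_dpkg(packages=CANDIDATES):
    """Run `dpkg -l` for the given packages and return its standard output."""
    result = subprocess.run(
        ["dpkg", "-l", *packages],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # "no packages found" messages go to stderr
        universal_newlines=True,
    )
    return result.stdout


# Output similar to the Ubuntu 18.04.3 report quoted earlier in the thread.
SAMPLE = """\
||/ Name          Version       Architecture  Description
+++-=============-=============-=============-==========================
ii  amdgpu-core   19.20-812932  all           Core meta package for unified amdgpu driver.
dpkg-query: no packages found matching amdgpu
dpkg-query: no packages found matching amdgpu-pro
"""

print(has_amdgpu_driver(SAMPLE))  # True: amdgpu-core alone satisfies the check
```

On a Debian-based system, `has_amdgpu_driver(query_dpkg())` would apply the same rule to the live package database.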