Gridengine
This repository contains a fork of the Son of Grid Engine project, together with some documentation and fixes to get Grid Engine working on more recent Linux systems.
Building
Before building, make sure you are relaxed and your cup of coffee (or beverage of your choice) is full and at the right temperature.
Then take a deep breath and be prepared for turbulence.
In general, the build process consists of the following steps:
- Build the dependency tool and create dependencies with
aimk
- Compile with
aimk
The suggested (working) build options are: aimk -no-herd -no-java
openSUSE Leap 15.0
Instructions updated on 10.01.2019
Install Requirements with
zypper install java-1_8_0-openjdk java-1_8_0-openjdk-devel gcc ant automake hwloc-devel libopenssl-1_0_0-devel libdb-4_8-devel pam-devel libXt-devel motif-devel xorg-x11-devel
# Notes:
* For OpenJDK you can also use a more recent version.
* libopenssl-1_0_0-devel is required, and the version 1.1 package that is installed by default has to be uninstalled first.
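To check which OpenSSL development headers the build would pick up, something like the following may help. It is only a sketch: `openssl_is_1_0` is a hypothetical helper name, and the usage lines assume that `pkg-config` is installed and knows about the `openssl` module.

```shell
# Hypothetical helper: succeeds if the given version string is a 1.0.x release.
openssl_is_1_0() {
    case "$1" in
        1.0.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage (assumes pkg-config can find the OpenSSL development package):
#   if openssl_is_1_0 "$(pkg-config --modversion openssl)"; then
#       echo "OpenSSL 1.0 headers found"
#   else
#       echo "Install libopenssl-1_0_0-devel first" >&2
#   fi
```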
Prepare the environment by executing the bootstrap.sh script:
cd sge-8.1.9/source
./scripts/bootstrap.sh
Then build the SGE using
./aimk -no-herd -no-java
# No HADOOP support and no Java support
# Note Java is not needed for qmon!
The build process takes some time. The generated binaries end up (in my case) in the LINUXAMD64 folder inside the source directory.
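Since the folder name depends on the platform, a small helper can locate it instead of guessing. This is a sketch under an assumption: it expects the build to place the `sge_qmaster` binary directly in the architecture folder, and `find_sge_build_dirs` is a hypothetical name, not part of SGE.

```shell
# Hypothetical helper: print every directory directly below the source tree ($1)
# that contains a built sge_qmaster binary.
find_sge_build_dirs() {
    for dir in "$1"/*/; do
        if [ -f "${dir}sge_qmaster" ]; then
            # Strip the trailing slash before printing.
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Usage, from sge-8.1.9/source:
#   find_sge_build_dirs .
```

If nothing is printed, the build most likely did not finish, or the binaries live somewhere else on your platform.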
Now install the binaries to SGE_ROOT:
export SGE_ROOT="/opt/sge/" # Or wherever you want to install the grid engine to
scripts/distinst -local -allall -noexit # asks for confirmation
cd $SGE_ROOT
./inst_sge -m -x -csp # or run ./start_gui_installer
Done
CentOS 7 (1810)
Instructions updated on 18.01.2019
IMPORTANT: Do not build SGE as root! I encountered some cryptic linker errors as root that disappeared when building as an unprivileged user. Besides (shame on me!), you should never build as root anyway.
Install Requirements with
yum install csh java-1.8.0-openjdk java-1.8.0-openjdk-devel gcc ant automake hwloc-devel openssl-devel libdb-devel pam-devel libXt-devel motif-devel ncurses-libs ncurses-devel
Then, as an unprivileged user, go into a tmux or screen session and start the build process with
cd sge-8.1.9/source
./scripts/bootstrap.sh
./aimk -no-herd -no-java
# No HADOOP support and no Java support
# Note Java is not needed for qmon!
If you encounter cryptic linker errors (undefined reference to tputs, tgoto, etc.), make sure you are building as an unprivileged user!
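To avoid starting a long build as root by accident, a guard like the following can be put in front of the aimk call. It is a minimal sketch; `require_unprivileged` is a hypothetical helper name and not part of the SGE scripts.

```shell
# Hypothetical guard: fail when the given numeric user ID is 0 (root).
require_unprivileged() {
    if [ "$1" -eq 0 ]; then
        echo "Refusing to build as root; switch to a regular user." >&2
        return 1
    fi
    return 0
}

# Usage, from sge-8.1.9/source:
#   require_unprivileged "$(id -u)" && ./aimk -no-herd -no-java
```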
The build process takes some time. The generated binaries end up (in my case) in the LINUXAMD64 folder inside the source directory.
Now install the binaries to SGE_ROOT:
export SGE_ROOT="/opt/sge/" # Or wherever you want to install the grid engine to
scripts/distinst -local -allall -noexit # asks for confirmation
cd $SGE_ROOT
./inst_sge -m -x -csp # or run ./start_gui_installer
Done.
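After the installer has finished, a quick sanity check might look like the sketch below. It assumes the installer was run with the default cell name `default`, so that the settings file lives under `$SGE_ROOT/default/common/`; `sge_settings_file` is a hypothetical helper name.

```shell
# Hypothetical helper: print the path of the settings file for an installation.
# $1 = SGE_ROOT, $2 = cell name (assumed to be "default" if omitted).
sge_settings_file() {
    root=${1%/}   # drop one trailing slash, e.g. "/opt/sge/" -> "/opt/sge"
    printf '%s/%s/common/settings.sh\n' "$root" "${2:-default}"
}

# Usage:
#   . "$(sge_settings_file /opt/sge/)"
#   qhost        # lists the execution hosts known to the qmaster
#   qconf -sh    # lists the administrative hosts
```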
Build with Java
The graphical installer requires running aimk with Java support, which needs the following additional dependencies:
yum install ant-junit junit javacc
Then building should work with
./scripts/bootstrap.sh
./aimk -no-herd
If you get Java version errors, adjust build.properties to your needs.
Firewall
For SGE to run, you will need to open the following ports:
firewall-cmd --add-port=992/udp --permanent
firewall-cmd --add-port=6444/tcp --permanent
firewall-cmd --add-port=6445/tcp --permanent
firewall-cmd --reload
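When the same ports have to be opened on several hosts, it can help to generate the commands and review them before running anything. The sketch below only prints the firewall-cmd calls; `print_firewall_cmds` is a hypothetical helper name.

```shell
# Print (do not run) the firewall-cmd calls for a list of port/protocol pairs.
print_firewall_cmds() {
    for port in "$@"; do
        printf 'firewall-cmd --add-port=%s --permanent\n' "$port"
    done
    printf 'firewall-cmd --reload\n'
}

# Usage: review the output, then pipe it to a root shell if it looks right.
#   print_firewall_cmds 992/udp 6444/tcp 6445/tcp
#   print_firewall_cmds 992/udp 6444/tcp 6445/tcp | sudo sh
# Afterwards, "firewall-cmd --list-ports" shows the ports that are open.
```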
Configuration
OpenMPI
If you want to use OpenMPI, make sure to compile it with --with-sge support.
If you are using Spack, build OpenMPI with the schedulers="sge" variant:
spack install openmpi%gcc@8.2.0 schedulers="sge"
You will also need to set control_slaves and job_is_first_task to TRUE in the parallel environment:
$ qconf -sp openmpi
pe_name openmpi
slots 1024
user_lists NONE
xuser_lists NONE
start_proc_args NONE
stop_proc_args NONE
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task TRUE
urgency_slots min
accounting_summary FALSE
qsort_args NONE
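If the openmpi parallel environment does not exist yet, it can be created from a file rather than through the interactive editor. The sketch below writes the definition shown above to a file; `write_openmpi_pe` is a hypothetical helper name, and the queue name `all.q` in the usage lines is an assumption about your setup.

```shell
# Hypothetical helper: write the parallel environment definition to file $1.
# The quoted 'EOF' keeps the shell from expanding $fill_up.
write_openmpi_pe() {
    cat > "$1" <<'EOF'
pe_name            openmpi
slots              1024
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
qsort_args         NONE
EOF
}

# Usage (run as an SGE manager; "all.q" is an assumed queue name):
#   write_openmpi_pe /tmp/openmpi.pe
#   qconf -Ap /tmp/openmpi.pe                 # add the PE from the file
#   qconf -aattr queue pe_list openmpi all.q  # attach it to the queue
#   qconf -sp openmpi                         # show the result
```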
Known issues
storage size of ‘w’ isn’t known
../sh.proc.c:153:16: error: storage size of ‘w’ isn’t known
union wait w;
This error was the whole reason for forking the repository. Comment out line 51 in sge-8.1.9/source/3rdparty/qtcsh/sh.proc.c as follows:
50: #if defined(_BSD) || (defined(IRIS4D) && __STDC__) || defined(__lucid) || defined(linux) || defined(__GNU__) || defined(__GLIBC__)
51: //# define BSDWAIT
52: #endif /* _BSD || (IRIS4D && __STDC__) || __lucid || glibc */
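The same change can be scripted, for example when the build runs in a fresh checkout. This is a sketch that assumes the define sits on line 51, as in the excerpt above; it only touches that line, and only if it really is the BSDWAIT define. `comment_out_bsdwait` is a hypothetical helper name.

```shell
# Hypothetical helper: comment out "# define BSDWAIT" on line 51 of file $1.
# A backup of the original is kept as $1.orig.
comment_out_bsdwait() {
    cp "$1" "$1.orig" || return 1
    sed -E '51s|^(#[[:space:]]*define[[:space:]]+BSDWAIT)|//\1|' "$1.orig" > "$1"
}

# Usage:
#   comment_out_bsdwait sge-8.1.9/source/3rdparty/qtcsh/sh.proc.c
```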
undefined reference to tputs, tgoto, etc.
Linker errors: I encountered this error when building as root. Try building as an unprivileged user (which you should do anyway!).
Java version errors
I ran into some odd "Java version not supported" errors when building on openSUSE Leap 15. Edit the file build.properties and set a more recent Java version there, for example:
# sge-8.1.9/source/build.properties
javac.debug=true
javac.deprecated=true
default.sge.javac.source=1.6
default.sge.javac.target=1.6
jgdi.javac.source=1.6
jgdi.javac.target=1.6
jjsv.javac.source=1.6
jjsv.javac.target=1.6
hadoop.javac.source=1.6
hadoop.javac.target=1.6
That should fix the issue.
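Instead of editing each key by hand, the version can be set in one pass. The sketch below rewrites every key ending in `javac.source` or `javac.target` and leaves other keys alone; `set_javac_version` is a hypothetical helper name, and the version to pass depends on your JDK.

```shell
# Hypothetical helper: set every "*.javac.source" and "*.javac.target" key
# in properties file $1 to Java version $2. A backup is kept as $1.orig.
set_javac_version() {
    cp "$1" "$1.orig" || return 1
    sed -E "s/^([A-Za-z.]*javac\.(source|target))=.*/\1=$2/" "$1.orig" > "$1"
}

# Usage:
#   set_javac_version sge-8.1.9/source/build.properties 1.6
```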