pbj4/polycubes

Windows setup guide, Windows & Linux Compile scripts and RAM drive optimisation

HakMe2Deth opened this issue · 6 comments

Not actually issues. Just proposing some enhancements...

Following on from Mike Pound's Opencube project, and expanding on your RUST work as well as Loïc Damien's RUST compile shell script, I have created a few updates for your review...

Project References

  1. Original video - Mike's Cube Code - Computerphile - https://www.youtube.com/watch?v=g9n0a0644B4
  2. Mike Pound's original attempt in python - https://github.com/mikepound/cubes/
    02.01 Mike references other projects
  3. Loïc Damien converts to RUST with optimisations - https://gitlab.com/dzamlo/polycubes2
    03.01 Loïc References other projects
    03.02 Includes shell script for processor optimisation compilation
  4. pbj4 converts to RUST and modifies and optimises calculations
    04.01 pbj4 references other contributors' projects

Below are 4 sections...

  1. Windows Install Guide for RUST
  2. RUST Compile script for Windows
  3. RUST Compile script for Linux
  4. RUST optimisation for working with a RAM drive

Windows Install Guide for RUST

  1. Install Visual Studio Build Tools - https://visualstudio.microsoft.com/downloads/
    01.01 Scroll down, click Tools for Visual Studio, Download Build Tools for Visual Studio
    01.02 Install
    01.03 Once complete, Installer will give options for other components
    01.04 Choose Desktop development with C++ (over 6GB download by default)
    01.05 Install
    01.06 (Optional) - Install Visual Studio Code and RUST addons, or install Notepad++
    01.07 Close installer
    01.08 In the start menu run the new shortcut - "x64 Native Tools Command Prompt for VS 2022" - standard user. No need for admin
    01.09 Minimise window for now

  2. Download RUST for Windows - https://www.rust-lang.org/tools/install
    02.01 Restore the previous development command prompt window
    02.02 At the visual studio prompt run the rustup-init.exe
    (usually) c:\users\<username>[.domain]\Downloads\rustup-init.exe
    02.03 If you have followed the above it should detect the VS path and build environment
    (if it does not, re-check the Build Tools install from step 1, or search for tutorials on the error shown)
    02.04 Choose option 1 to Proceed with installation. Rust and components will download and install
    02.05 Close the command shell as recommended after install then re-open from the start menu
    02.06 type "rustup --version" to test if environment is working
    02.07 Add the llvm components for optimisation of the code later - type "rustup component add llvm-tools-preview"
    02.08 Once complete, the llvm tools need to be added to the path. First find where they were installed by searching with this command...
    dir %USERPROFILE% /s /b | find "llvm-profdata.exe"
    should be something like "c:\Users\<username>[.domain]\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\bin"
    02.09 Now either add this at a system level or type the command manually (may also want to create a batch script for this - say in "%USERPROFILE%\.rustup"). %USERPROFILE% is used here because it includes the drive letter, which %HOMEPATH% does not
    set PATH=%PATH%;%USERPROFILE%\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\bin\

RUST compile script for Windows - Windows batch file

```bat
@echo off
cls

REM debug switch: set to 1 to echo the variables and pause after each step
set scriptdebug=0

REM ---
REM ensure llvm tools are in the path -
REM    set PATH=%PATH%;%USERPROFILE%\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\x86_64-pc-windows-msvc\bin
REM ---

REM Setup the CPU optimisation variables (-C is the documented form of the rustc codegen flag)

REM set EXTRA_RUSTFLAGS=-C target-cpu=native -C opt-level=3 -C lto=yes -C codegen-units=1 -C embed-bitcode=yes
set EXTRA_RUSTFLAGS=-C target-cpu=native -C opt-level=3 -C lto=yes -C embed-bitcode=y -C codegen-units=1 -C code-model=small -C debuginfo=0

REM Setup the compile working area - a prof-work folder under the current directory

set tmpdir=%cd%\prof-work
if EXIST "%tmpdir%" del /s /q "%tmpdir%"

REM Point RUST profile output to the working area

set RUSTFLAGS=-Cprofile-generate=%tmpdir% %EXTRA_RUSTFLAGS%

REM debug check variables
IF %scriptdebug%==0 goto skipdebug1
echo Working Area : %tmpdir%
echo Compile Flags: %EXTRA_RUSTFLAGS%
echo Build Mode   : %RUSTFLAGS%

pause

:skipdebug1

REM compile the instrumented code

cargo build --release --target=x86_64-pc-windows-msvc
IF %scriptdebug%==0 goto skipdebug2
pause
:skipdebug2

REM Now run the instrumented build to generate the raw profile data for this workload on this CPU

%CD%\target\x86_64-pc-windows-msvc\release\polycubes 12
IF %scriptdebug%==0 goto skipdebug3
pause
:skipdebug3

REM merge/convert the raw profile to optimisation format

llvm-profdata merge -o %tmpdir%\merged.profdata %tmpdir%
IF %scriptdebug%==0 goto skipdebug4
pause
:skipdebug4

REM set the compile optimiser to use the formatted profile data

set RUSTFLAGS=-Cprofile-use=%tmpdir%\merged.profdata %EXTRA_RUSTFLAGS%

REM Rebuild the code for final use

cargo build --release --target=x86_64-pc-windows-msvc

IF %scriptdebug%==0 goto skipdebug5
pause
:skipdebug5

REM clean the working area
del /s /q "%tmpdir%"
```

RUST compile script for Linux - save as something like "optimize_v3.sh"

```sh
#!/bin/sh
set -e

# CPU optimisation flags (-C is the documented form of the rustc codegen flag)
EXTRA_RUSTFLAGS="-C target-cpu=native -C opt-level=3 -C lto=yes -C embed-bitcode=y -C codegen-units=1 -C code-model=small -C debuginfo=0"

# Working area for the profile data
tmpdir=$(mktemp -d)
rm -rf /tmp/pgo-data

# Build an instrumented binary that writes raw profiles into $tmpdir
RUSTFLAGS="-Cprofile-generate=$tmpdir $EXTRA_RUSTFLAGS" cargo build --release --target=x86_64-unknown-linux-gnu

# Run the instrumented binary a few times to collect profile data
./target/x86_64-unknown-linux-gnu/release/polycubes 11 8
./target/x86_64-unknown-linux-gnu/release/polycubes 11
./target/x86_64-unknown-linux-gnu/release/polycubes 11 1
./target/x86_64-unknown-linux-gnu/release/polycubes 12 8
./target/x86_64-unknown-linux-gnu/release/polycubes 13

# Merge/convert the raw profiles to the format the compiler reads
llvm-profdata merge -o "$tmpdir/merged.profdata" "$tmpdir"

# Rebuild for final use, guided by the merged profile
RUSTFLAGS="-Cprofile-use=$tmpdir/merged.profdata $EXTRA_RUSTFLAGS" cargo build --release --target=x86_64-unknown-linux-gnu

# Clean the working area
rm -rf "$tmpdir"
```
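The Linux script calls llvm-profdata, so that tool has to be on the PATH here as well. Below is a minimal sketch, assuming it was installed with "rustup component add llvm-tools-preview" as in the Windows guide and that it sits in the toolchain's lib/rustlib/<host>/bin folder (the Linux counterpart of the Windows path above). The helper name llvm_tools_dir is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the folder where rustup places llvm-profdata.
# $1 = toolchain sysroot, $2 = host triple (e.g. x86_64-unknown-linux-gnu)
llvm_tools_dir() {
    printf '%s/lib/rustlib/%s/bin\n' "$1" "$2"
}

# Only touch PATH when rustc is actually installed.
if command -v rustc >/dev/null 2>&1; then
    sysroot=$(rustc --print sysroot)
    host=$(rustc -vV | sed -n 's/^host: //p')
    PATH="$PATH:$(llvm_tools_dir "$sysroot" "$host")"
    export PATH
fi
```

These lines could sit near the top of the compile script, before the llvm-profdata call, so the change only applies while the script runs.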

RUST optimisation for working with a RAM disk on LINUX

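Ensure there is enough free memory first - the script below creates a 4GB RAM disk, so the machine needs at least that much to spare on top of what the compile itself uses...
If the scripts have been run before, clean out the "target" data... just type "cargo clean" to do that.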
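To check the memory before going further, something like the following could be used. It is only a sketch: it assumes a Linux kernel that reports MemAvailable in /proc/meminfo, and the helper name mem_available_mib and the 4096 MiB threshold are illustrative choices matching a 4GB RAM disk.

```shell
#!/bin/sh
# Hypothetical helper: print MemAvailable in MiB from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo, which reports values in kB)
mem_available_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

required_mib=4096   # assumed size of the RAM disk (4GB)
if [ -r /proc/meminfo ]; then
    avail=$(mem_available_mib)
    if [ "${avail:-0}" -lt "$required_mib" ]; then
        echo "Only ${avail:-0} MiB available; a 4GB RAM disk may not fit" >&2
    fi
fi
```

The check only prints a warning; it does not stop anything from running.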

Next, create a new shell script in the project folder to create the RAM disk (mounting needs root, so run it with sudo)...

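```sh
#!/bin/sh

# Create the mount point if it does not exist yet
if [ ! -d /tmp/ramdisk ]; then
    mkdir /tmp/ramdisk
    chmod 777 /tmp/ramdisk
fi

# Mount a 4G tmpfs unless something is already mounted at /tmp/ramdisk
mount | awk '$3 == "/tmp/ramdisk" { found = 1 } END { exit !found }' || mount -t tmpfs -o size=4G myramdisk /tmp/ramdisk

# Copy the project (without old build output) onto the RAM disk
rsync -av --exclude 'target' * /tmp/ramdisk
```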

Next change to the ramdisk folder - "cd /tmp/ramdisk"
then compile the RUST code with "./optimize_v3.sh" from the previous step

You can then run the compiled exe with "/tmp/ramdisk/target/x86_64-unknown-linux-gnu/release/polycubes 12", changing the 12 to the number of cube iterations you want
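A tmpfs RAM disk is emptied when it is unmounted or the machine reboots, so it may be worth copying the compiled binary back to normal storage afterwards. A minimal sketch, assuming the paths used above; the function name save_binary and the destination folder are placeholders, not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: copy the optimised binary out of the RAM disk.
# $1 = RAM disk root, $2 = destination folder on persistent storage
save_binary() {
    src="$1/target/x86_64-unknown-linux-gnu/release/polycubes"
    if [ ! -f "$src" ]; then
        echo "no binary at $src" >&2
        return 1
    fi
    mkdir -p "$2" && cp -p "$src" "$2/"
}

# Example (commented out): save the binary, then free the memory.
# umount needs root, and the destination folder is an assumption.
# save_binary /tmp/ramdisk "$HOME/polycubes-bin" && umount /tmp/ramdisk
```

Keeping the copy and the unmount in one command line means the RAM disk is only released once the binary has been saved.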

pbj4 commented

@HakMe2Deth

This is great! Sorry I couldn't get back to you sooner, I wasn't expecting to get any activity on this project after a while.

I'm getting between a 15 and 20% performance improvement, which is pretty significant compared to the 5% improvements I was chasing towards the end of this project. If I had had this earlier, I could have shaved 3 days off my 17 day run, so this is essential for anyone who wants to try n = 20.

I've made this PR (#2) using your guide to create 2 build scripts, similar to how @dzamlo has it in his repo. They're mostly the same as your linux optimize_v3.sh, but I made some small modifications and set it up to compile the client as well since that's needed for the distributed system. Unfortunately, I don't have a windows machine on hand to test the windows version, but I've added a link to your guide in case anyone wants to use this on windows.

Sorry again for keeping you waiting for so long. I'll merge the PR in a few days, so let me know if there are any problems!

@pbj4

No need to apologize! And thanks for giving me the credit. I have only expanded on the excellent and interesting work done by yourself and @dzamlo

I came to this project quite late in the day and wasn't expecting much, if any, movement. I only picked up on it to start to learn some of the newer programming languages like Python and Rust.

I really appreciate you picking it up and then expanding on it even more.

Once I get my head around your distributed system processes, I will see if I can bundle in any optimizations from using a RAM disk for the client and/or server components (if possible), as well as try to understand the multi-threading you have set up.

Not that there is anything wrong with your code. Exactly the opposite. Your code and this project as a whole is how I am learning.