x86optimiser: A Lua repository from KTH Theoretical Computer Science Department

Project Part 1: Super_Optimization Algorithm

Step 1. You will need Docker installed on a Linux system such as Ubuntu. The latest version can be found here: https://www.docker.com/get-started

Step 2. You will need to SSH into a container running the latest image of the STOKE project. To do this, first pull the image from the server:

 sudo docker pull stanfordpl/stoke:latest

Step 3. The command above pulls the latest published image of the stanfordpl project from the server. Next, give your container a name and start it:

 sudo docker run -d -P --name yourownname stanfordpl/stoke:latest

Step 4. Find the host port that is mapped to the container's SSH port (22). The following command prints the port number XXXXX:

sudo docker port yourownname 22

Step 5. Then you can SSH to the container using that port:

   ssh -pXXXXX stoke@127.0.0.1

Note: Password is stoke
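
Steps 2 to 5 can be chained in a small shell script. The following is a minimal sketch, not part of the original project: the helper names extract_port and connect_to_stoke are hypothetical, and it assumes that docker port prints lines of the form 0.0.0.0:XXXXX.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of STOKE or Docker.

# Print the host port from the first line of `docker port` output,
# e.g. "0.0.0.0:32768" -> "32768" (assumed output format).
extract_port() {
    first_line=$(printf '%s\n' "$1" | head -n 1)
    printf '%s\n' "${first_line##*:}"
}

# Pull the image, start a named container, look up its SSH port, and log in.
connect_to_stoke() {
    name=$1
    sudo docker pull stanfordpl/stoke:latest
    sudo docker run -d -P --name "$name" stanfordpl/stoke:latest
    port=$(extract_port "$(sudo docker port "$name" 22)")
    ssh -p "$port" stoke@127.0.0.1   # password: stoke
}

# Usage (not executed here): connect_to_stoke yourownname
```

Only the functions are defined when the script is sourced; nothing touches Docker until connect_to_stoke is called explicitly.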

Step 6 (in case you get an error message that your Docker container is already running). To stop and remove the Docker container, follow the steps below:

     docker system prune
     docker system prune --volumes
     sudo docker container ls -a

Look up the number (container ID) of your container in the listing, then stop it using

   docker stop (number)

And finally remove it using

   docker container rm (number)

Alternatively you can also make sure that you prune all the volumes by:

docker system prune --volumes

Examples of assembly code

https://www.nasm.us/doc/nasmdoc2.html#section-2.1.23

https://www3.nd.edu/~dthain/courses/cse40243/fall2015/intel-intro.html

https://www.imada.sdu.dk/~kslarsen/dm546/Material/IntelnATT.html

https://github.com/Dman95/SASM/issues

www.rosettacode.org

Some Instructions on assembly codes

Step 1: Please use any of the tools mentioned above to assemble your .s files. I would suggest nasm, since it is simple to work with on Linux-based systems and is used in a wide range of applications.

To assemble your code, create an .asm file and run:

nasm -f elf myfile.asm

This assembles the code into an ELF object file (myfile.o).

To produce a flat binary file instead, run:

nasm -f bin myfile.asm -o myfile.com

The output file can be in any of the formats supported by nasm. The complete list of command-line options can be found at this link:

https://www.nasm.us/doc/nasmdoc2.html#section-2.1.23
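
As an illustration of the assembly step, the sketch below writes a minimal 32-bit Linux program in NASM syntax and assembles it when the tools are available. The file name hello.asm and the helper write_hello_asm are hypothetical; the link step with ld -m elf_i386 is an assumption about the local toolchain and is not taken from this project.

```shell
#!/bin/sh
# Sketch only: file names and the helper name are hypothetical.

# Write a minimal 32-bit Linux "hello" program in NASM syntax to the given file.
write_hello_asm() {
    cat > "$1" <<'EOF'
section .data
msg:    db "hello", 10
len:    equ $ - msg

section .text
global _start
_start:
    mov eax, 4          ; sys_write
    mov ebx, 1          ; file descriptor 1 (stdout)
    mov ecx, msg        ; address of the message
    mov edx, len        ; message length in bytes
    int 0x80
    mov eax, 1          ; sys_exit
    xor ebx, ebx        ; exit status 0
    int 0x80
EOF
}

write_hello_asm hello.asm

# Assemble and link only if both tools are installed.
if command -v nasm >/dev/null 2>&1 && command -v ld >/dev/null 2>&1; then
    nasm -f elf hello.asm -o hello.o &&   # 32-bit ELF object file
    ld -m elf_i386 hello.o -o hello       # link as a 32-bit executable
fi
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding the $ used in the NASM source.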

If you happen to have gcc and its accompanying binutils installed and want to inspect your assembled object file, run:

nm hello.o

This lists the symbols (for example labels and functions) contained in the object file.

The output for the Stoke search Result sample

Things to note: The search prints two types of results, the statistics update and the progress update. The progress update gives you the lowest-cost result, whereas the statistics update gives you the result table. Finally, run make check and then time ./a.out to see whether the optimisation was actually successful.

The Result is stored in resulttext.txt

Explanation is stored in explanation.txt

OPTIMISATIONS

1. -O0 This level (that is, the letter "O" followed by a zero) turns off optimization entirely and is the default if no -O level is specified in CFLAGS or CXXFLAGS. This reduces compilation time and can improve debugging info, but some applications will not work properly without optimization enabled. This option is not recommended except for debugging purposes.

2. -O1 The most basic optimization level. The compiler will try to produce faster, smaller code without taking much compilation time. It is basic, but it should get the job done all the time.

3. -O2 A step up from -O1. The recommended level of optimization unless the system has special needs. -O2 will activate a few more flags in addition to the ones activated by -O1. With -O2, the compiler will attempt to increase code performance without compromising on size, and without taking too much compilation time. SSE or AVX may be utilized at this level, but no YMM registers will be used unless -ftree-vectorize is also enabled.

4. -O3 The highest level of optimization possible. It enables optimizations that are expensive in terms of compile time and memory usage. Compiling with -O3 is not a guaranteed way to improve performance, and in many cases it can slow down a system due to larger binaries and increased memory usage. -O3 is also known to break several packages, so using it is not recommended. However, it also enables -ftree-vectorize, so that loops in the code get vectorized and will use AVX YMM registers.

5. -O4 Implemented in Binaryen: github.com/binaryen
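
To compare the gcc levels -O0 to -O3 on a single program, something like the following sketch could be used. The source file prog.c and the helper compile_cmd are hypothetical and not part of this project; the loop only builds and times the program when gcc and prog.c are actually present.

```shell
#!/bin/sh
# Sketch only: prog.c and the helper name are hypothetical.

# Print the gcc command line for one optimization level, e.g. O2.
compile_cmd() {
    printf 'gcc -%s -o prog_%s %s\n' "$1" "$1" "$2"
}

# Show the command for each level; build and time it only if gcc and the source exist.
for level in O0 O1 O2 O3; do
    cmd=$(compile_cmd "$level" prog.c)
    echo "$cmd"
    if command -v gcc >/dev/null 2>&1 && [ -f prog.c ]; then
        $cmd && time "./prog_$level"
    fi
done
```

Keeping one binary per level (prog_O0, prog_O1, and so on) makes it easy to rerun time on each of them later.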
