Introduction

This is the final project for a university Systems Programming course. According to the professor's directions, it should be a client-server system for remote file ciphering (and/or deciphering), modeled on the notorious class of malware known as ransomware. Once the server is installed on the attacked machine, the client is what steers the attack. The application needs to support both Linux and Windows, operating exactly the same way on both.

Name

The name was not imposed by the professor; it was chosen for two fundamental reasons:

it is a pun on cryptolocker, one of the most famous ransomware attacks since 2013;
it gives the project a playful tone, fitting for software developed for university-related educational purposes, and underlines the simplicity of the alphabetic encryption system used.

Ramification

The project contains two main elements:

src (folder): contains the source code for both the Linux and Windows platforms;
Makefile (regular file): contains all the rules needed to automate the build process.

Structure

The physical layout of files and folders is designed to keep every module separated by communication edge and by host platform:

communication edge:
- server
- client
host platform:
- Linux
- Windows

The communication edge distinction is made through the client and server folders (directly under src); a common folder holds the code used by both edges.

Each of these containers (server and client) in turn makes the host platform distinction through linux and windows folders and, where needed, a common folder of its own (like the one already mentioned, it holds cross-platform code).

Code

server module

Specifications

As per the professor's directions, the server has to meet several requirements:

a TCP socket must be instantiated, listening on every network interface on the default port 8888 (unless otherwise specified with the -p flag);
the handling of each request is delegated to a thread. The n threads (4 by default, or as specified with the -n flag) are managed by a threadpool (more details below);
server operations are limited to the elements (recursively) contained in the folder that must be specified with the -c flag;
the supported operations are the following:
1. LSTF
  
  Lists the files contained in the folder, along with their size in bytes;
2. LSTR
  
  Recursively lists the files contained in the folder, along with their size in bytes;
3. ENCR seed file
  
  Ciphers the content of file into file_enc, with a key generated from seed, and then removes file;
4. DECR seed file_enc
  
  Deciphers the content of file_enc into file, with a key generated from seed, and then removes file_enc;
a configuration file can be specified with the -f flag, holding the values corresponding to the -p and -n flags.
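As an illustration of the four operations listed above, the following is a minimal sketch of how a request could be split into a command, a seed and a file name. It assumes requests travel as plain text in exactly the form shown in the list (for example "ENCR 42 notes.txt"); the `request` structure and the `parse_request()` function are hypothetical names that do not come from the project.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical request representation; none of these names come from the project. */
enum cmd_kind { CMD_LSTF, CMD_LSTR, CMD_ENCR, CMD_DECR, CMD_INVALID };

struct request {
    enum cmd_kind kind;
    unsigned int seed;   /* only meaningful for ENCR/DECR */
    char path[256];      /* only meaningful for ENCR/DECR */
};

/* Parse one request such as "LSTF" or "ENCR 42 notes.txt". */
static struct request parse_request(const char *line) {
    struct request req = { CMD_INVALID, 0, "" };
    char verb[8] = "";
    /* %7s and %255s keep the copies within the buffer sizes above. */
    int fields = sscanf(line, "%7s %u %255s", verb, &req.seed, req.path);

    if (fields == 1 && strcmp(verb, "LSTF") == 0) {
        req.kind = CMD_LSTF;
    } else if (fields == 1 && strcmp(verb, "LSTR") == 0) {
        req.kind = CMD_LSTR;
    } else if (fields == 3 && strcmp(verb, "ENCR") == 0) {
        req.kind = CMD_ENCR;
    } else if (fields == 3 && strcmp(verb, "DECR") == 0) {
        req.kind = CMD_DECR;
    }
    return req;
}
```

Anything that does not match one of the four shapes is reported as CMD_INVALID. File names containing spaces are not handled by this sketch.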

The server implementation follows a few simple steps, explained below.

Arguments parsing

The first operation is scanning and parsing the input arguments. This is mainly done with getopt, chosen because it makes it simple to associate flags with values and to report optional and mandatory argument combinations. The parameters are then validated:

the existence of the configuration file: if it exists, its values are parsed;
the existence of the folder to which the application's execution is confined;
the validity of the specified port number;
the validity of the specified maximum number of threads for the threadpool.

Instantiation and configuration of data structures and global variables

Once the user input has been validated and handled, the application configures itself:

the threadpool_init() function is called to set the maximum number of threads (more details below);
a WSADATA structure is instantiated: it is used by the socket handling procedures (on the Windows platform only);
a passive TCP socket is instantiated on the specified port and set to listen for incoming connections.

Connection management

Once an active socket is generated, a pointer to it is passed as a parameter to the handle_connection() function, which manages the server-client conversation for its whole duration. Inside a for loop, this function reads and reacts to every command requested by the other side of the communication:

LSTF/LSTR: these two commands use the same recursive list(char *ret_out, int recursive) function (which in turn calls the list_opt(char *ret_out, int recursive, char *folder, char *folder_suffix) function): when calling it, LSTF sets the recursive flag (an integer used as a boolean) to 0, while LSTR sets it to 1. In both cases, the *folder parameter matches the *arg_folder pointer (the folder within which the application is running) and *folder_suffix is NULL. While scanning the content of *folder, if another folder is found and recursive is true, the function calls itself again, setting *folder_suffix to the suffix that must be appended to the initial folder to build the path of the folder just found.
ENCR/DECR: the implementation of the two commands has been unified by exploiting a property of ciphering with the exclusive disjunction (XOR) operator: given a key k, ciphering a sequence of characters by applying the bitwise XOR with k, and then applying the same operation again with the same k, yields the initial sequence. The two commands therefore execute the same procedure (described in more detail below), differing only in the configuration of the input/output files.
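The XOR property that lets ENCR and DECR share one procedure can be shown with a few lines of C. This is a minimal sketch working on an in-memory buffer; the `xor_buffer()` name is hypothetical, and the project itself works on memory-mapped files rather than on a plain array.

```c
#include <stddef.h>

/* XOR every byte of buf with the matching byte of key, in place.
 * Calling it twice with the same key restores the original content. */
static void xor_buffer(unsigned char *buf, const unsigned char *key, size_t len) {
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[i];
    }
}
```

Because (x ^ k) ^ k equals x for every byte, the first call plays the role of ENCR and the second one the role of DECR.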

threadpool module

threadpool uses custom data structures to reduce both the complexity of the problems it handles and the effort of supporting multiple platforms. job_t is the most basic structure: it contains all the information about a task that needs to be executed within the threadpool, namely pointers to a function and to its arguments, and a pointer to the next job_t. The threadpool_t structure gathers the information needed for the threadpool to work correctly, such as the maximum number of threads, the list of threads itself, and the mutex and condition variable used to regulate access to the internal fields of the structure.

There are several ways to interact with the threadpool from the outside:

threadpool_init()

This function initializes the data structures used by the module and sets the maximum number of threads usable by the threadpool.
threadpool_add_job()

This function adds new operations, which are marked as pending, to the threadpool queue.
threadpool_bye()

Finally, this function performs all the memory cleanup operations before stopping the threadpool.

Every thread is configured to execute the module's static function thread_boot, which does nothing but execute a task, or rather a job_t, currently pending on the threadpool queue. It does so once it has acquired the lock on the mutex, so that it can update the information about the next task and the number of pending tasks.
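The linked structure described above can be sketched as follows. This is a deliberately reduced, single-threaded illustration: it models only the pending queue, leaves out the threads, the mutex and the condition variable, and assumes jobs run in the order they were added. The field names and the `queue_*` functions are hypothetical; only `job_t` and its three members (function pointer, argument pointer, next pointer) come from the description.

```c
#include <stdlib.h>

/* A job: what to run, with which arguments, and which job follows it. */
typedef struct job {
    void (*function)(void *);
    void *args;
    struct job *next;
} job_t;

/* Reduced stand-in for threadpool_t: only the pending queue is modeled. */
typedef struct {
    job_t *head;
    job_t *tail;
    int pending;
} job_queue_t;

/* Append a job at the tail; returns 0 on success, -1 if allocation fails. */
static int queue_add_job(job_queue_t *q, void (*function)(void *), void *args) {
    job_t *job = malloc(sizeof *job);
    if (job == NULL) return -1;
    job->function = function;
    job->args = args;
    job->next = NULL;
    if (q->tail != NULL) q->tail->next = job; else q->head = job;
    q->tail = job;
    q->pending++;
    return 0;
}

/* What a worker would do while holding the mutex: detach the next job. */
static job_t *queue_take_job(job_queue_t *q) {
    job_t *job = q->head;
    if (job == NULL) return NULL;
    q->head = job->next;
    if (q->head == NULL) q->tail = NULL;
    q->pending--;
    return job;
}

/* Run every pending job in order, freeing each one afterwards. */
static void queue_drain(job_queue_t *q) {
    job_t *job;
    while ((job = queue_take_job(q)) != NULL) {
        job->function(job->args);
        free(job);
    }
}

/* Two tiny example jobs operating on an int passed through args. */
static void increment(void *args) { *(int *)args += 1; }
static void double_it(void *args) { *(int *)args *= 2; }
```

In a real pool, queue_add_job() and queue_take_job() would both run with the mutex held, and a worker finding the queue empty would wait on the condition variable instead of returning.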

cipher module

The cipher module has a very simple structure, consisting of a single cipher() function that takes several arguments: two char pointers, which are the paths of the input and output files of the ciphering procedure, and an unsigned int used as the seed for key generation. The body of this function has three steps:

initialization of the file descriptors (on the Linux platform) or HANDLEs (on the Windows platform) and of the memory maps of the input/output files, once the lock on the input file has been obtained;
actual ciphering of the input file into the output file;
closing of the file descriptors (on the Linux platform) or HANDLEs (on the Windows platform) and of the memory maps of the input/output files, once the lock on the input file has been released.

Parallelization

For the parallelization problem, use of the OpenMP API was allowed. This choice is motivated by two fundamental reasons:

it's a multi-platform API, so it won't need any code difference between the systems;
it's extremely simple to use.

According to the rules set by the professor, parallelization must be applied only if the file to be ciphered is larger than 256 Kbyte. For this reason the process works with two nested for loops. The outer one is parallelized with OpenMP and iterates over every 256 Kbyte portion of the file. This way, if the file is smaller than 256 Kbyte, the outer loop has a single iteration, so it is effectively executed sequentially. The inner loop iterates over every 4 bytes (the size of an integer) that make up the 256 Kbyte block, and the ciphering is computed for each of them.
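The two nested loops can be sketched as below. The sketch assumes the data is already available as an array of 32-bit integers together with a key array of the same length, and it ignores any trailing bytes when the size is not a multiple of 4. The names `cipher_words()` and `chunk_count()` are hypothetical. Compiled without OpenMP support the pragma is ignored and the code simply runs sequentially.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_BYTES (256 * 1024)                         /* one 256 Kbyte block */
#define WORDS_PER_CHUNK (CHUNK_BYTES / sizeof(uint32_t)) /* 4-byte integers per block */

/* Number of 256 Kbyte blocks needed to cover `words` integers (rounded up). */
static size_t chunk_count(size_t words) {
    return (words + WORDS_PER_CHUNK - 1) / WORDS_PER_CHUNK;
}

/* XOR `words` integers of `in` with `keys`, writing the result to `out`. */
static void cipher_words(const uint32_t *in, const uint32_t *keys,
                         uint32_t *out, size_t words) {
    long chunks = (long)chunk_count(words);

    /* Outer loop: one iteration per block, shared among the OpenMP threads.
     * With a single block there is only one iteration to run. */
    #pragma omp parallel for
    for (long c = 0; c < chunks; c++) {
        size_t begin = (size_t)c * WORDS_PER_CHUNK;
        size_t end = begin + WORDS_PER_CHUNK;
        if (end > words) end = words; /* the last block may be shorter */

        /* Inner loop: one 4-byte integer at a time within the block. */
        for (size_t i = begin; i < end; i++) {
            out[i] = in[i] ^ keys[i];
        }
    }
}
```

Each outer iteration touches a disjoint range of indices, so the threads never write to the same element.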

one-time pad and parallelization problem

One of the problems encountered was finding the simplest way to let the one-time pad ciphering method coexist with parallelization on files larger than 256 Kbyte. Although partitioning the file into 256 Kbyte blocks simplified the parallelization problem, parallelization itself introduced a new problem, because it has no deterministic execution order. In a sequential scenario, starting from the same seed, the same key is always generated for every iteration element. In a parallelized scenario this is no longer guaranteed, as there is no way to foresee the execution order of the for loop. To solve this problem, an additional memory map is used to generate the ciphering keys sequentially and in advance. Before executing the two nested for loops that perform the actual ciphering, a new memory map (of the same size as the input file) is instantiated and populated, with a separate for loop, with the keys generated using rand_r() (or cipher_rand(), on the Windows platform). In the nested for loops that follow, instead of generating the keys dynamically, the key item of the memory map corresponding to the iteration element is used.
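The idea of generating the keys first and ciphering afterwards can be sketched in two small functions. To keep the example portable, a simple linear congruential generator (`next_key()`, an assumption of this sketch) stands in for rand_r() and cipher_rand(), and a plain array stands in for the extra memory map. The names `fill_keys()` and `apply_keys()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in generator, NOT the project's: it only plays the role of rand_r().
 * The constants are the well-known Numerical Recipes LCG parameters. */
static uint32_t next_key(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Step 1, sequential: key i depends only on the seed and on its position i. */
static void fill_keys(uint32_t *keys, size_t words, uint32_t seed) {
    uint32_t state = seed;
    for (size_t i = 0; i < words; i++) {
        keys[i] = next_key(&state);
    }
}

/* Step 2, parallel: every iteration reads its own precomputed key,
 * so the order in which the threads run no longer matters. */
static void apply_keys(uint32_t *data, const uint32_t *keys, size_t words) {
    #pragma omp parallel for
    for (long i = 0; i < (long)words; i++) {
        data[i] ^= keys[i];
    }
}
```

The cost of this approach is memory: the key buffer is as large as the data being ciphered.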

`cipher_rand()` implementation on Windows platform

As can easily be seen by reading the Windows variant of the module's code, it contains a function that is not present in the Linux implementation of the same module: cipher_rand(). It is a pseudo-random number generator, used to generate the key from the seed. It exists because the Windows platform lacks a system implementation of such a function. So, in this case, the function has been provided, taken from the implementation of rand_r() offered by MinGW (more details below).

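/* Park-Miller "minimal standard" generator: state = (16807 * state) % 2147483647 */
static int cipher_rand(unsigned int *seed) {
        long k;
        long s = (long)(*seed);
        /* a zero state would never change, so replace it with a fixed value */
        if (s == 0) {
                s = 0x12345987;
        }
        /* Schrage's method: computes the product modulo 2147483647 without overflow */
        k = s / 127773;
        s = 16807 * (s - k * 127773) - 2836 * k;
        if (s < 0) {
                s += 2147483647;
        }
        /* store the new state so the next call continues the sequence */
        (*seed) = (unsigned int)s;
        return (int)(s & RAND_MAX);
}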

client module

client is the simplest code portion of the project. It's made of three parts:

arguments parsing

Unlike the server implementation, no external auxiliary library has been used here to handle argument parsing; a hand-written method tailored to the needs of the case has been preferred instead. This choice was driven by the ambiguity between input flags that take no arguments and those that do, and between flags that indicate analogous commands or that need arguments on the server side.
creation of socket, needed to connect to a server

The socket management is the most redundant portion of code when compared to the server module; the only difference is that the server socket waits for connections, while the client socket requests a connection to one previously allocated by the server.
back-and-forth with server

This phase follows a simple scheme:
- translation of the client input flags into server supported commands (e.g., client flag -l gets translated into server command LSTF);
- sending command to server via socket;
- receiving a reply from server and printing the reply itself.
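The translation step of this scheme could be sketched as follows. Only the -l to LSTF mapping is stated explicitly in this document; the other three (-R to LSTR, -e to ENCR, -d to DECR) are inferred from the usage lines and should be read as assumptions, as should the `build_command()` name and the plain-text command format.

```c
#include <stdio.h>
#include <string.h>

/* Write the server command matching a client flag into out.
 * seed and file are only used by -e and -d. Returns 0 on success, -1 otherwise. */
static int build_command(const char *flag, const char *seed, const char *file,
                         char *out, size_t out_len) {
    int written;

    if (strcmp(flag, "-l") == 0) {
        written = snprintf(out, out_len, "LSTF");
    } else if (strcmp(flag, "-R") == 0) {
        written = snprintf(out, out_len, "LSTR");
    } else if (strcmp(flag, "-e") == 0 && seed != NULL && file != NULL) {
        written = snprintf(out, out_len, "ENCR %s %s", seed, file);
    } else if (strcmp(flag, "-d") == 0 && seed != NULL && file != NULL) {
        written = snprintf(out, out_len, "DECR %s %s", seed, file);
    } else {
        return -1; /* unknown flag, or a missing seed/file */
    }
    /* Treat a truncated command as an error rather than sending it. */
    return (written < 0 || (size_t)written >= out_len) ? -1 : 0;
}
```

The resulting string is what would then be sent over the socket, with the reply printed as received.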

How to use

Compilation

Linux

Configuring the development environment and compiling on the Linux platform is relatively simple, as most Linux distributions provide a base package group for development. On the environment where the code was written:

# eopkg it -c system.devel

Now, just a make is enough:

# make [server|client]

Tested on

This software has been written and tested on the following environment:

Linux	4.9.45 x86_64
Distribution	Solus Project
RAM	20 GB
CPU	Intel Core i7-4770k Haswell
Type	Physical machine

Windows

Configuring and compiling the project on a Windows environment is a little more complicated. Auxiliary tools have been used to simplify the compilation phase and to reduce to a minimum the differences between the compilation templates listed in the Makefile: this is why msys2 and mingw-w64 have been adopted. The configuration proceeds as follows:

msys2 installation via the official site: http://www.msys2.org
Application launch and subsequent update of the package database: # pacman -Syu
Actual package update: # pacman -Su
mingw-w64 installation: # pacman -S mingw-w64-x86_64-gcc (on a 32-bit architecture, install mingw-w64-i686-gcc instead)
Base development dependencies installation: # pacman -S base-devel
optional: in order to use the packages installed above from PowerShell as well, the paths of the binary files must be added to the global Windows PATH variable: C:\path\to\msys2\usr\bin and C:\path\to\msys2\mingw64\bin.

Tested on

This software has been written and tested on the following environment:

Windows	10 Pro x86_64
RAM	4096 MB
CPU	Intel Core i7-4770k Haswell
Type	Virtual machine

How to use

Server

The server can be used in the following ways (assuming the working directory is the root folder of the project, once it has been compiled):

# ./bin/cryptoloackerd -c /path [-n max-threads -p port]

You can specify the /path and the port number in a configuration file and have the server load those values from it, passing the file as a parameter:

# ./bin/cryptoloackerd -f /path/file.txt [-n max-threads]

In this case, the file must follow this template:

# cat /path/file.txt
folder = /path
port = 8888
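Reading a file in the template above comes down to splitting each line into a key and a value. The following is a minimal sketch of that step using `sscanf()`; the `parse_config_line()` name and the buffer sizes are assumptions of the example, not the project's code.

```c
#include <stdio.h>
#include <string.h>

/* Split a "key = value" line into its two parts.
 * key must hold at least 32 bytes and value at least 256.
 * Returns 1 if both parts were found, 0 otherwise. */
static int parse_config_line(const char *line, char *key, char *value) {
    /* " %31[^= \t]" skips leading blanks and reads the key up to a blank or '=';
     * " = " matches the separator with optional blanks around it;
     * "%255[^\n]" takes the rest of the line as the value. */
    return sscanf(line, " %31[^= \t] = %255[^\n]", key, value) == 2;
}
```

A caller would then compare key against the names shown in the template (folder, port) and convert the value as needed. Trailing blanks in the value are kept as they are.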

Client

The client can be used in the following ways (assuming the working directory is the root folder of the project, once it has been compiled):

To execute LSTF or LSTR:

# ./bin/cryptoloacker -h server-ip -p port [-l|-R]
To execute ENCR or DECR:

# ./bin/cryptoloacker -h server-ip -p port [-e|-d] seed /path/file
