This file describes how to compile and run findingaleph to solve the problem described in [article]. If you are interested only in reading the resulting files, check readpfile.m. 0 - Files in this package ------------------------- Included in this package are the following files and file structure: - LICENSE file - README.txt: this file - cleanpfile.py: python script used to clean up the pefiles (when the optimization solver failed to solve a given problem, the result is written in the pefile as '**********', which is not read properly by MATLAB. We preprocessed these files using python to replace these characters for 'Inf'. (The original files are backed up) - compile.sh: bash script used to compile the whole findingaleph program. - findingaleph.f90: main program. - pmodule.f90: module for reading/writing input and output files. - readbacterial.m: MATLAB script for manipulating the output of the optimization solver on bacterial data. - readpfile.m: MATLAB script for manipulating the output of the optimization solver on random data - runall.sh: bash script used to run the optimization solver on a set of problems defined by random data (pefiles) - tests.m: dummy MATLAB script where statistical tests can be executed for a batch of outputs (treats a chunk of one pefile at a time) The input and results files can be found as follows: * input: folder containing the following files: *- matrizes01.txt -- matrizes10.txt, each containing the random data generated by randprobvector.m (original code) *- pairs.mat, pairs.txt, containing the original bacterial data * results: folder containing the following files: *- pefile01.txt -- pefile10.txt, each containing the result of the optimization problem solved by findingaleph. These results are stored as lines of 20 numbers in f10.7 format delimited by whitespaces (more details in readpfile.m). Each pefile has 10.000.000 (10 million) lines. *- pefilepairs.txt, containing the results of the same optimization problem applied to the bacterial data read from pairs.txt (stored in the same format as the pefiles above) 1 - Requirements ---------------- This software was used on Ubuntu Linux 64-bit, with gcc/gfortran version 5.4.0, and Galahad version 2.60004 (21/October/2015). To run it, you need gcc/gfortran >= 4.9.0. 1.1 - Galahad ------------- Before compiling, it is necessary to install Galahad, a "thread-safe library of Fortran 2003 packages for solving nonlinear optimization problems" [1]. Galahad is available at http://www.galahad.rl.ac.uk/ and it is free of charge for academic purposes (but the user must register before downloading). Galahad comes with a few auxiliary packages, but we only need to install ARCHDefs. SIFDecode and CUTEst are not needed to run the Galahad solver on a problem written in Fortran (our case). Before installing, check the README files for instructions. 1.2 - HSL-Galahad ----------------- A library of performance-improving HSL codes are also needed for our package. This is available at http://www.hsl.rl.ac.uk/download/hsl-galahad/2.60000.2/d/ and, again, one must register before downloading it. To install it, follow the instructions provided by the Galahad documentation. 2 - Installation ---------------- After installing Galahad and the HSL libraries, you can use the bash script compile.sh (you might have to give it execution permissions) to compile findingaleph: $ ./compile.sh This includes compiling and linking the module pmodule.f90, responsible for reading and writing the input and output files, and linking the correct libraries for compilation, including BLAS, Lapack, Metis and HSL-Galahad. 3 - Usage --------- Given some 4x4 real matrix P, we must find some real vector x = (x(1), x(2), x(3), x(4)) and some real 4x4 matrix Epsilon such that x and Epsilon solve the following optimization problem: minimize sum(Epsilon(i,j)^2) subject to: x(4) = 1; diag(x)*(P+Epsilon) is persymmetric. In our code, x is a 20-element vector with Epsilon stored in x as follows: x(4+j+(i-1)*4) = Epsilon(i,j), for all i=1,...,4, j=1,...,4 and such that Epsilon is stored by rows: x(5) x(6) x(7) x(8) x(9) x(10) x(11) x(12) x(13) x(14) x(15) x(16) x(17) x(18) x(19) x(20) The constraints of the optimization problem above are implemented accordingly. 3.1 - Bacterial data -------------------- If you want to run the optimization solver for the bacterial data, contained in file pairs.txt, you must, after compiling, create an empty file called pefilepairs.txt and execute the following command: $ ./findingaleph pairs There are 1428 problems. The result is written in file pefilepairs.txt, with each line containing x(1)--x(4) in the first four entries, and Epsilon in the last 16 entries, such that for each line in the file, for i = 1:4 for j = 1:4 Epsilon(i,j) = line(4+j+(i-1)*4) end end 3.2 - Random data ----------------- If you want to run the optimization solver for the random data, contained in files matrizes01.txt through matrizes10.txt, you must, after compiling, create an empty file called pefileXX.txt (where XX is the 01 -- 10) and execute $ ./findingaleph pmatf There are 100 million problems. Results are written in file pefileXX.txt, with each line containing x(1)--x(4) in the first four entries, and Epsilon in the last 16 entries, such that for each line in the file, for i = 1:4 for j = 1:4 Epsilon(i,j) = line(4+j+(i-1)*4) end end 4 - Results ----------- To read the results and produce any statistic or analysis, one can run readpfile.m or readbacterial.m. This reads the relevant results files, and performs whatever operations the user sets up in tests.m Example: >>> [meannormdif, meanerrors, outC, inC] = readpfile(pefile01.txt, matrizes01.txt, 100000) where outC and inC are cells produced by MATLAB's textscan function. References ---------- [1] N. I. M. Gould, D. Orban, and Ph. L. Toint, "GALAHAD, a library of thread-safe Fortran 90 packages for large-scale nonlinear optimization", ACM Trans. Math. Software 29(4):353-372, 2004. Download link: http://www.galahad.rl.ac.uk/doc/galahad.pdf (included in the docs directory) [2] Documentation for Lancelot-Simple. Download link: http://www.galahad.rl.ac.uk/doc/lancelot_simple.pdf (included in the docs directory) -------------------------- Revised December 9, 2018. Melissa Weber Mendonca melissa.mendonca@ufsc.br
melissawm/findingaleph
Research project involving biomathematics and an optimization problem.
FortranMIT