JuliaSmoothOptimizers/HSL.jl

Compile HSL with -O3 optimization

Closed · 1 comment

After discussing performance with @geoffroyleconte, I benchmarked the OpenBLAS32 and MKL backends, each against both the current default build of HSL and a build compiled with -O3.
Based on the timings below, we should add -O3 after gfortran and gcc by default.

const HSL_FC = haskey(ENV, "HSL_FC") ? ENV["HSL_FC"] : "gfortran -O3"
const HSL_F77 = haskey(ENV, "HSL_F77") ? ENV["HSL_F77"] : HSL_FC
const HSL_CC = haskey(ENV, "HSL_CC") ? ENV["HSL_CC"] : "gcc -O3"
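
The fallback pattern in these three lines can be sketched in isolation as follows. This is only an illustration: `compiler_command` is a hypothetical helper, not part of HSL.jl, and the `ifort -O2` value is made up. It shows that a user-supplied variable replaces the whole default string, so the -O3 flag would only apply when the variable is unset.

```julia
# Hypothetical helper (not part of HSL.jl) mirroring the
# `haskey(ENV, k) ? ENV[k] : default` pattern with `get`.
# `env` is a keyword argument so the sketch can run without touching the real ENV.
compiler_command(var, default; env=ENV) = get(env, var, default)

# Variable unset: the default, including -O3, is used.
unset = compiler_command("HSL_FC", "gfortran -O3"; env=Dict{String,String}())

# Variable set (made-up value): it overrides the default entirely,
# so -O3 is not appended to the user's choice.
set = compiler_command("HSL_FC", "gfortran -O3"; env=Dict("HSL_FC" => "ifort -O2"))

println(unset)
println(set)
```

If -O3 should also apply to user-chosen compilers, it would have to be added somewhere other than the default string.
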
# Current version of HSL
using HSL, MatrixMarket, SuiteSparseMatrixCollection
using LinearAlgebra, Printf, BenchmarkTools

ssmc = ssmc_db(verbose=false)
matrix = ssmc_matrices(ssmc, "Boeing", "pwtk")
path = fetch_ssmc(matrix, format="MM")

n = matrix.nrows[1]
A = MatrixMarket.mmread(joinpath(path[1], "$(matrix.name[1]).mtx"))
b = ones(n)
b_norm = norm(b)

# Solve Ax = b.
LDL = @btime Ma57($A)           #  7.566 s (36 allocations: 343.44 MiB)
@btime ma57_factorize($LDL)     # 39.155 s (2 allocations: 851.30 KiB)
@btime ma57_solve($LDL, $b)     # 497.909 ms (6 allocations: 4.16 MiB)

import LinearAlgebra, MKL_jll
LinearAlgebra.BLAS.lbt_forward(MKL_jll.libmkl_rt_path, clear=true, verbose=true)

# Solve Ax = b.
LDL = @btime Ma57($A)          #  7.466 s (36 allocations: 343.44 MiB)
@btime ma57_factorize($LDL)    # 25.038 s (2 allocations: 851.30 KiB)
@btime ma57_solve($LDL, $b)    # 230.605 ms (6 allocations: 4.16 MiB)
# HSL compiled with -O3
using HSL, MatrixMarket, SuiteSparseMatrixCollection
using LinearAlgebra, Printf, BenchmarkTools

ssmc = ssmc_db(verbose=false)
matrix = ssmc_matrices(ssmc, "Boeing", "pwtk")
path = fetch_ssmc(matrix, format="MM")

n = matrix.nrows[1]
A = MatrixMarket.mmread(joinpath(path[1], "$(matrix.name[1]).mtx"))
b = ones(n)
b_norm = norm(b)

# Solve Ax = b.
LDL = @btime Ma57($A)           #  3.123 s (36 allocations: 343.44 MiB)
@btime ma57_factorize($LDL)     # 14.857 s (2 allocations: 851.30 KiB)
@btime ma57_solve($LDL, $b)     # 314.188 ms (6 allocations: 4.16 MiB)

import LinearAlgebra, MKL_jll
LinearAlgebra.BLAS.lbt_forward(MKL_jll.libmkl_rt_path, clear=true, verbose=true)

# Solve Ax = b.
LDL = @btime Ma57($A)          # 3.345 s (36 allocations: 343.44 MiB)
@btime ma57_factorize($LDL)    # 9.488 s (2 allocations: 851.30 KiB)
@btime ma57_solve($LDL, $b)    # 186.227 ms (6 allocations: 4.16 MiB)
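
To make the comparison easier to read, the `ma57_factorize` timings reported above can be turned into speedup ratios. This is a small sketch using only the numbers from the benchmarks. It assumes the runs before the `lbt_forward` call use OpenBLAS32, and the names `factorize_time` and `speedup` are introduced here for illustration.

```julia
# ma57_factorize times in seconds, copied from the benchmark comments above.
# Assumption: the runs before `lbt_forward` use the OpenBLAS32 backend.
factorize_time = Dict(
    ("OpenBLAS32", "default") => 39.155,
    ("OpenBLAS32", "-O3")     => 14.857,
    ("MKL", "default")        => 25.038,
    ("MKL", "-O3")            => 9.488,
)

# Speedup of the -O3 build over the default build for a given BLAS backend.
speedup(backend) = factorize_time[(backend, "default")] / factorize_time[(backend, "-O3")]

for backend in ("OpenBLAS32", "MKL")
    println(backend, ": ", round(speedup(backend), digits=2), "x faster ma57_factorize with -O3")
end
```

On these numbers the factorization is roughly 2.6 times faster with -O3 on either backend, which suggests the gain from the compiler flag is largely independent of the BLAS library used.
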
dpo commented

We could add -O3 directly to the build_*.jl files, couldn't we?