Using Matrix Market routines for distributed matrix reading across MPI processes
allaffa opened this issue · 1 comments
Hello,
I am trying to read a matrix from a file in Matrix Market (MM) format in a distributed setting using MPI.
I familiarized myself with the approach used in the following example:
https://github.com/ddemidov/amgcl/blob/master/examples/mpi/mpi_solver.cpp
Following that example, I implemented the function below, which is supposed to read the matrix in parallel, distributing it row-wise in chunks across the MPI processes:
ptrdiff_t read_matrix_market(
        amgcl::mpi::communicator comm,
        const std::string &A_file,
        const std::string &rhs_file,
        int block_size,
        std::vector<ptrdiff_t> &ptr,
        std::vector<ptrdiff_t> &col,
        std::vector<double> &val,
        std::vector<double> &rhs
        )
{
    amgcl::io::mm_reader A_mm(A_file);
    ptrdiff_t n = A_mm.rows();

    // Rows per process, rounded up to a multiple of block_size.
    ptrdiff_t chunk = (n + comm.size - 1) / comm.size;
    if (chunk % block_size != 0) {
        chunk += block_size - chunk % block_size;
    }

    // Half-open row range [row_beg, row_end) owned by this process.
    ptrdiff_t row_beg = std::min(n, chunk * comm.rank);
    ptrdiff_t row_end = std::min(n, row_beg + chunk);
    std::cout << "Rank: " << comm.rank << " - Min row: " << row_beg << std::endl;
    std::cout << "Rank: " << comm.rank << " - Max row: " << row_end << std::endl;

    // Actual number of local rows (may be smaller on the last ranks).
    chunk = row_end - row_beg;

    // Read only the local strip of the matrix in CRS form.
    A_mm(ptr, col, val, row_beg, row_end);

    try {
        amgcl::io::mm_reader rhs_mm(rhs_file);
        rhs_mm(rhs, row_beg, row_end);
    } catch (std::exception const &e) {
        // Fall back to a right-hand side of all ones if the RHS file cannot be read.
        if (comm.rank == 0) {
            std::cout << "Following error occurred opening file " << rhs_file
                      << " for RHS: " << e.what() << std::endl;
            std::cout << "The code will continue setting all entries of RHS to 1.0" << std::endl;
        }
        rhs.resize(chunk);
        std::fill(rhs.begin(), rhs.end(), 1.0);
    }

    return chunk;
}
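The partitioning arithmetic in this function can be checked in isolation, without MPI or amgcl. The following is a minimal sketch, assuming a hypothetical helper named row_range (it is not part of amgcl) that takes the process count and rank as plain integers and returns the half-open row range a given rank would own:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of amgcl): returns the half-open row range
// [row_beg, row_end) owned by `rank` when `n` rows are split across `size`
// processes in chunks rounded up to a multiple of `block_size`.
std::pair<std::ptrdiff_t, std::ptrdiff_t>
row_range(std::ptrdiff_t n, int size, int rank, int block_size) {
    assert(n >= 0 && size > 0 && rank >= 0 && rank < size && block_size > 0);

    // ceil(n / size) rows per process.
    std::ptrdiff_t chunk = (n + size - 1) / size;

    // Round the chunk up so that it never splits a block.
    if (chunk % block_size != 0) {
        chunk += block_size - chunk % block_size;
    }

    // Clamp to n so that trailing ranks get a short or empty range.
    std::ptrdiff_t row_beg = std::min(n, chunk * rank);
    std::ptrdiff_t row_end = std::min(n, row_beg + chunk);
    return {row_beg, row_end};
}
```

For n = 10 rows, 4 processes, and block_size = 2, ranks 0 to 3 own [0, 4), [4, 8), [8, 10), and the empty range [10, 10). The split itself is balanced up to block rounding, so it does not by itself explain read times that grow with the process count.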
Later, in the main program, I call read_matrix_market as follows:
using amgcl::prof;
int block_size = prm.get("precond.coarsening.aggr.block_size", 1);
prof.tic("read problem");
std::vector<ptrdiff_t> ptr;
std::vector<ptrdiff_t> col;
std::vector<double> val;
std::vector<double> rhs;
ptrdiff_t chunk = read_matrix_market(
        world, A_file, rhs_file, block_size, ptr, col, val, rhs
        );
prof.toc("read problem");
However, I notice that the time spent reading the matrix from the MM file increases with the number of processes, rather than decreasing.
Am I doing something wrong? If you are interested, I can also share the entire main.cpp file with you.
Thank you very much in advance for your attention to this issue.
My guess is that having many processes read a single file stresses your I/O subsystem. I prefer to convert the matrix and the RHS to a binary format, which significantly decreases the read times.
You can convert the system with:
./build/examples/mm2bin -i A.mtx -o A.bin
./build/examples/mm2bin -i b.mtx -o b.bin
An example of reading the matrix in this format can be found here: https://amgcl.readthedocs.io/en/latest/tutorial/poisson3DbMPI.html
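
One plausible reason a binary format helps, beyond avoiding text parsing: with fixed-size records a process can seek straight to its own range, whereas a text file with variable-length lines generally has to be scanned from the start. The sketch below illustrates this with a hypothetical headerless file of raw doubles. This layout is an assumption made for illustration and is not amgcl's binary format, and write_raw and read_raw_range are invented names, not amgcl functions.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical layout for illustration only (NOT amgcl's binary format):
// the file is a raw array of doubles with no header.
void write_raw(const std::string &path, const std::vector<double> &v) {
    std::ofstream f(path, std::ios::binary);
    f.write(reinterpret_cast<const char*>(v.data()),
            static_cast<std::streamsize>(v.size() * sizeof(double)));
}

// Reads entries [beg, end) by seeking directly to the byte offset of `beg`,
// so the cost depends on the size of the range, not on the size of the file.
std::vector<double> read_raw_range(const std::string &path,
                                   std::ptrdiff_t beg, std::ptrdiff_t end) {
    assert(0 <= beg && beg <= end);

    std::vector<double> out(static_cast<std::size_t>(end - beg));
    std::ifstream f(path, std::ios::binary);
    if (!f) throw std::runtime_error("cannot open " + path);

    // Fixed-size records make the offset of entry `beg` a simple product.
    f.seekg(static_cast<std::streamoff>(beg * sizeof(double)));
    f.read(reinterpret_cast<char*>(out.data()),
           static_cast<std::streamsize>(out.size() * sizeof(double)));
    if (!f) throw std::runtime_error("file is shorter than the requested range");

    return out;
}
```

In an MPI program, each rank would call such a reader with its own row_beg and row_end and touch only its own bytes. The actual amgcl readers for the mm2bin output are shown in the tutorial linked above.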