Using Matrix Market routines for distributed matrix reading across MPI processes

Question

allaffa opened this issue 2 years ago · 1 comment

Hello,

I am trying to read a matrix from a file in Matrix Market (MM) format within a distributed computing framework using MPI.
I familiarized myself with the approach shown in the following example:
https://github.com/ddemidov/amgcl/blob/master/examples/mpi/mpi_solver.cpp

Following that example, I implemented the function below, which is supposed to read the matrix in parallel and distribute it row-wise in chunks across the MPI processes:

ptrdiff_t read_matrix_market(
        amgcl::mpi::communicator comm,
        const std::string &A_file,
        const std::string &rhs_file,
        int block_size,
        std::vector<ptrdiff_t> &ptr,
        std::vector<ptrdiff_t> &col,
        std::vector<double>    &val,
        std::vector<double>    &rhs
        )
{
    amgcl::io::mm_reader A_mm(A_file);
    ptrdiff_t n = A_mm.rows();

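    // Rows per process, rounded up so that every row is assigned.
    ptrdiff_t chunk = (n + comm.size - 1) / comm.size;
    // Keep chunk boundaries aligned with the block size.
    if (chunk % block_size != 0) {
        chunk += block_size - chunk % block_size;
    }

    // Half-open row range [row_beg, row_end) owned by this rank.
    ptrdiff_t row_beg = std::min(n, chunk * comm.rank);
    ptrdiff_t row_end = std::min(n, row_beg + chunk);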

    std::cout<<"Rank: "<<comm.rank<<" - Min row: "<<row_beg<<std::endl;
    std::cout<<"Rank: "<<comm.rank<<" - Max row: "<<row_end<<std::endl;

    chunk = row_end - row_beg;

    A_mm(ptr, col, val, row_beg, row_end);

    try{
        amgcl::io::mm_reader rhs_mm(rhs_file);
        rhs_mm(rhs, row_beg, row_end);
    }
    catch(std::exception const& e){
        if(comm.rank == 0){
            std::cout<<"Following error occurred opening file "<<rhs_file<<" for RHS: "<<e.what()<<std::endl;
            std::cout<<"The code will continue setting all entries of RHS to 1.0"<<std::endl;
        }
        rhs.resize(chunk);
        std::fill(rhs.begin(), rhs.end(), 1.0);
    }

    return chunk;
}
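
The row partitioning in this function can be checked in isolation, without MPI or file I/O. The sketch below restates the same arithmetic as a standalone helper. The name `row_range` and its signature are hypothetical, introduced here only for illustration, and are not part of amgcl.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of amgcl): returns the half-open row
// range [row_beg, row_end) owned by `rank` when `n` rows are split
// across `nprocs` processes, with the chunk size rounded up to a
// multiple of `block_size`.
std::pair<std::ptrdiff_t, std::ptrdiff_t>
row_range(std::ptrdiff_t n, int nprocs, int rank, int block_size)
{
    assert(nprocs > 0 && block_size > 0 && rank >= 0 && rank < nprocs);

    // Rows per process, rounded up so that every row is assigned.
    std::ptrdiff_t chunk = (n + nprocs - 1) / nprocs;

    // Align the chunk size with the block size.
    if (chunk % block_size != 0) {
        chunk += block_size - chunk % block_size;
    }

    // Clamp to n so that trailing ranks may receive a short or empty range.
    std::ptrdiff_t row_beg = std::min(n, chunk * rank);
    std::ptrdiff_t row_end = std::min(n, row_beg + chunk);
    return {row_beg, row_end};
}
```

With n = 10, four processes, and block_size = 2, the chunk is rounded up from 3 to 4, so ranks 0 through 3 own [0, 4), [4, 8), [8, 10), and the empty range [10, 10). The partitioning itself therefore shrinks each rank's share as processes are added; it does not by itself explain a read time that grows with the process count.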

Later, in the main program, I call the function as follows:

    using amgcl::prof;

    int block_size = prm.get("precond.coarsening.aggr.block_size", 1);

    prof.tic("read problem");
    std::vector<ptrdiff_t> ptr;
    std::vector<ptrdiff_t> col;
    std::vector<double>    val;
    std::vector<double>    rhs;

    ptrdiff_t chunk = read_matrix_market(
            world, A_file, rhs_file, block_size, ptr, col, val, rhs
            );

    prof.toc("read problem");

However, I notice that the time spent reading the matrix from the MM file increases with the number of MPI processes rather than decreasing.
Am I doing something wrong? If you are interested, I can also share the entire main.cpp file with you.

Thank you very much in advance for your attention to this issue.

Answer 1 · 2023-01-25T18:43:31.000Z

My guess is that having many processes read the same file at once stresses your I/O subsystem. I prefer to convert the matrix and the RHS to a binary format, which significantly decreases the read times.

You can convert the system with

./build/examples/mm2bin -i A.mtx -o A.bin
./build/examples/mm2bin -i b.mtx -o b.bin

An example of reading a matrix in this format can be found here: https://amgcl.readthedocs.io/en/latest/tutorial/poisson3DbMPI.html
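
One general reason a binary layout helps with partial reads: in a text format, a process cannot tell where its rows begin without scanning the file, whereas with fixed-size binary records it can seek straight to its slice. The sketch below illustrates this with a deliberately simplified layout, a headerless array of doubles. This is an assumption made for the example and is not the actual amgcl binary format; `write_raw` and `read_raw_slice` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical layout for illustration only: a raw array of doubles
// with no header. This is NOT the amgcl binary format.
void write_raw(const std::string &path, const std::vector<double> &v)
{
    std::ofstream f(path, std::ios::binary);
    if (!f) throw std::runtime_error("cannot open " + path);
    f.write(reinterpret_cast<const char*>(v.data()),
            static_cast<std::streamsize>(v.size() * sizeof(double)));
}

// Reads entries [beg, end) by seeking directly to their byte offset,
// so the cost depends on the slice size rather than on the file size.
std::vector<double> read_raw_slice(
        const std::string &path, std::ptrdiff_t beg, std::ptrdiff_t end)
{
    assert(0 <= beg && beg <= end);
    std::vector<double> out(static_cast<std::size_t>(end - beg));

    std::ifstream f(path, std::ios::binary);
    if (!f) throw std::runtime_error("cannot open " + path);

    f.seekg(static_cast<std::streamoff>(beg) *
            static_cast<std::streamoff>(sizeof(double)));
    f.read(reinterpret_cast<char*>(out.data()),
           static_cast<std::streamsize>(out.size() * sizeof(double)));
    if (!f) throw std::runtime_error("short read from " + path);
    return out;
}
```

In an MPI setting, each rank would call something like `read_raw_slice` with its own `row_beg` and `row_end`, touching only its part of the file. For real amgcl binary files, use the reading routines shown in the linked tutorial rather than this sketch.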