Information

Author(s) Erik-Jan Lingen, Matthias Möller
Deadline No deadline
Submission limit No limit

[Threads] Parallel sparse matrix-vector multiplication

This material is part of the course Practical Introduction to Parallel Programming and meant for educational purposes only.


Implement a program that reads a sparse matrix \(\mathbf{A}\) from a file, stores it in Compressed Sparse Row (CSR) format, and performs a series of parallel sparse matrix-vector products of the form

\begin{equation*} \mathbf{x}_{i} = \mathbf{A} \, \mathbf{x}_{i-1} \end{equation*}

The first vector \(\mathbf{x}_{0}\) is generated randomly.
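As a reminder of how CSR storage and the repeated product fit together, here is a minimal sequential sketch. It is not the code provided with the course: the type name `csr_t`, its field names, and the functions `csr_spmv` and `csr_iterate` are assumptions made for illustration, and the struct provided with the course may be organised differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR layout (names are illustrative assumptions). */
typedef struct {
    size_t  n_rows;
    size_t  n_cols;
    size_t *row_ptr;  /* length n_rows + 1; row i owns entries
                         row_ptr[i] .. row_ptr[i + 1] - 1 */
    size_t *col_idx;  /* length nnz; column index of each stored entry */
    double *values;   /* length nnz; value of each stored entry */
} csr_t;

/* Sequential product y = A * x. */
void csr_spmv(const csr_t *A, const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < A->n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->values[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}

/* Repeated products x_i = A * x_{i-1} for a square matrix.
 * x and tmp are two buffers of length n_rows that are swapped after
 * every product; the returned pointer is the buffer holding the
 * final vector. */
double *csr_iterate(const csr_t *A, double *x, double *tmp, int n_iter)
{
    assert(A->n_rows == A->n_cols);
    for (int it = 0; it < n_iter; ++it) {
        csr_spmv(A, x, tmp);
        double *swap = x;
        x = tmp;
        tmp = swap;
    }
    return x;
}
```

Swapping the two buffers avoids copying the vector after every product; the input of one product must never be overwritten while that product is still being computed.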

The code given below implements the sequential version of the program. It defines a struct sparse_matrix_t that represents a sparse matrix.

The program reads a sparse matrix from a file named matrix.dat.

You should (try to) implement three parallel versions of the sparse matrix-vector product: one with MPI, one with OpenMP, and one with threads.
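To give an idea of what the OpenMP version could look like, the sketch below parallelises the loop over the rows. The function name `spmv_omp` and the use of plain CSR arrays (row pointers, column indices, values) instead of the course's struct are assumptions made for illustration.

```c
#include <stddef.h>

/* y = A * x with A given as CSR arrays (hypothetical interface).
 * Each y[i] is written by exactly one iteration, so the rows can be
 * distributed over the threads without any synchronisation. */
void spmv_omp(size_t n_rows,
              const size_t *row_ptr, const size_t *col_idx,
              const double *values, const double *x, double *y)
{
    /* A signed loop variable keeps the loop valid for older OpenMP
     * implementations that do not accept unsigned loop counters. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += values[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```

With GCC or Clang this needs the `-fopenmp` flag; without it the pragma is ignored and the loop runs sequentially. The number of threads is taken from the `OMP_NUM_THREADS` environment variable. A static schedule gives each thread a contiguous block of rows, which may be unbalanced if the number of non-zeros per row varies strongly.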

Hints

Partition the matrix row-wise and assign each batch of rows to a different thread or process.

Read the matrix on the master thread/process.

Start with the parallel implementation based on OpenMP as this is the most straightforward.

Compare the parallel performance of each implementation. Is there a significant difference?
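The row-wise partitioning from the first hint can be sketched with POSIX threads as follows. This is one possible approach under stated assumptions, not the reference solution: the names `row_range`, `spmv_task_t`, `spmv_worker`, `spmv_threads`, and `MAX_THREADS` are hypothetical, the matrix is passed as plain CSR arrays, and the course may expect a different threading library.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { MAX_THREADS = 64 };  /* illustrative upper bound */

/* Half-open row range [*begin, *end) of worker `rank` out of
 * `n_workers`. The first (n_rows % n_workers) workers get one extra
 * row, so the block sizes differ by at most one. */
void row_range(size_t n_rows, size_t n_workers, size_t rank,
               size_t *begin, size_t *end)
{
    size_t base  = n_rows / n_workers;
    size_t extra = n_rows % n_workers;
    *begin = rank * base + (rank < extra ? rank : extra);
    *end   = *begin + base + (rank < extra ? 1 : 0);
}

/* Work description handed to one thread. */
typedef struct {
    size_t begin, end;
    const size_t *row_ptr, *col_idx;
    const double *values, *x;
    double *y;
} spmv_task_t;

/* Computes y[i] for the rows owned by this thread. */
static void *spmv_worker(void *arg)
{
    const spmv_task_t *t = arg;
    for (size_t i = t->begin; i < t->end; ++i) {
        double sum = 0.0;
        for (size_t k = t->row_ptr[i]; k < t->row_ptr[i + 1]; ++k)
            sum += t->values[k] * t->x[t->col_idx[k]];
        t->y[i] = sum;
    }
    return NULL;
}

/* y = A * x using n_threads threads. Returns 0 on success and -1 if
 * a thread could not be created (y is then only partially filled). */
int spmv_threads(size_t n_rows,
                 const size_t *row_ptr, const size_t *col_idx,
                 const double *values, const double *x, double *y,
                 size_t n_threads)
{
    assert(n_threads >= 1 && n_threads <= MAX_THREADS);
    pthread_t   threads[MAX_THREADS];
    spmv_task_t tasks[MAX_THREADS];
    size_t started = 0;
    int status = 0;

    for (size_t r = 0; r < n_threads; ++r) {
        tasks[r] = (spmv_task_t){ .row_ptr = row_ptr, .col_idx = col_idx,
                                  .values = values, .x = x, .y = y };
        row_range(n_rows, n_threads, r, &tasks[r].begin, &tasks[r].end);
        if (pthread_create(&threads[r], NULL, spmv_worker, &tasks[r]) != 0) {
            status = -1;
            break;
        }
        ++started;
    }
    /* Wait for every thread that was actually started. */
    for (size_t r = 0; r < started; ++r)
        pthread_join(threads[r], NULL);
    return status;
}
```

This sketch creates and joins the threads for every product, which keeps it short but adds overhead when many products are performed in a row. Keeping the threads alive across products and synchronising them between iterations is likely to perform better and is worth including in the performance comparison.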


Question 1: Sparse matrix-vector multiplication
Question 2: Number of MPI processes

Specify the number of MPI processes (1-64) that should be used for running your program (mpirun -np <nproc>).

Question 3: OpenMP environment variables

Specify the OpenMP environment variables that should be used for running your program (mpirun -x OMP_NUM_THREADS=1 -x ...). Use semicolons to separate multiple variables.

Question 4: Command Line Argument

Specify the command line arguments that should be passed to your program when it is run. Arguments must be given in quotes and separated by commas. Leave this field empty if you do not want to specify command line arguments.

Example: "1","int","arg=2" is interpreted as three arguments 1, int, and arg=2.