The Block Vecchia Algorithm is a computational method for efficient spatial statistics calculations, designed for high-performance computing environments. This project includes implementations for both estimation and prediction tasks, leveraging GPU acceleration and optimized libraries.
Ensure you have the following libraries installed and their paths included in your system's PATH and LD_LIBRARY_PATH:
- GCC 10.2.0 or later
- CUDA 11.4 or later
- Intel MKL 2022.2.1 or later
- MAGMA 2.7.0 or later
- GSL 2.6 or later
- NLopt v2.7.1 or later
- OpenMP 4.1.0 or later
- BLAS
On Ubuntu or Debian-based systems, you can install some of these dependencies with (prediction only):
sudo apt-get install libgsl-dev libblas-dev libopenmpi-dev
To compile the project, navigate to the project root directory and run:
make
This will create the necessary executables in the bin
directory.
The main executable for the Block Vecchia Algorithm estimation is test_dvecchia_batch
. Here are some example use cases:
- View Help Information:
cd estimation
./bin/test_dvecchia_batch --help
- Performance Test:
./bin/test_dvecchia_batch --ikernel 1.5:0.1:0.5 --kernel univariate_matern_stationary_no_nugget --num_loc 20000 --perf --vecchia_cs 300 --vecchia_bc 1500 --knn --seed 0
- Simulated Data or Real Dataset:
./bin/test_dvecchia_batch --ikernel ?:?:? --vecchia_bc 300 --kernel univariate_matern_stationary_no_nugget --num_loc 20000 --vecchia_cs 150 --knn --xy_path /path/to/locations --obs_path /path/to/observations
Results are stored in the ./log
directory after execution.
This project is licensed under the terms of the Apache License 2.0. See the LICENSE file for details.
The following scripts and notebooks are used to generate the corresponding figures. In our experiments, we use 40 CPU cores (Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz) and NVIDIA V100 GPU 32GB.
- Step 0: choose estimation or prediction, and download the bash and jupyter notebooks.
- Step 1: use the bash script to run the experiments.
- Step 2: use the notebook to process the results and generate the figures.
Note:
- The bash scripts and jupyter notebooks should be in the same folder with
estimation
orprediction
. - If there is no
log
andfig
folder in the current directory, please create one. - The well-trained model will be saved in the
log
folder, you can put them in correspondinglog
folders and visualize, see here.
For example of estimation
cd estimation
wget -O exe-estimation.zip https://www.dropbox.com/scl/fo/tkwnos3to038z945ys64u/ABqbicVRNfrXwaCu9N7CnsY?rlkey=mjytdqyadqzbksrkq86uyrr99&st=6620p253&dl=0
unzip exe-estimation.zip -d ./
rm exe-estimation.zip
mv exe-estimation/* ./
bash test_kl.sh
Lastly, please use the ./test_kl_plot.ipynb
to visualize the results. Prediction bash scripts are the same as the estimation and its link is here.
blockVecchia
is used for block Vecchia estimation.
Script/Notebook | Corresponding Figure(s) | Training Time | Comments |
---|---|---|---|
cluster.sh / cluster-reordering.ipynb |
Figure 2 | Instantly | |
memory.ipynb |
Figure 4 | Instantly | |
test_kl.sh / test_kl_plot.ipynb |
Figure 5(a), Figures S1, S2, S3 | 30 mins | |
test_kl.sh / test_kl_plot_blockcount_cs.ipynb |
Figure 5(b) | 30 mins | |
test_kl.sh / test_kl_plot.ipynb |
Figure 6 | 30 mins | |
test_kl.sh / test_kl_plot_time.ipynb |
Figure 7, Figure S4 | 30 mins | |
test_simulation.sh / test_simulation.ipynb |
Figure 8, Figures S6, S7, S8 | 28 hours | |
real_soil.sh / real_wind.sh / real_dataset_soil_wind.ipynb |
Figures 11, 12 | 3 hours | set folder in Jupyter notebook for soil/wind results |
real_windspeed_3D.sh / real_windspeed_3D.ipynb |
Figure 13 (a), (b), (c) | 12 hours | |
test_kl_largeN.sh / test_kl_largeN.ipynb / test_kl_largeN_time.ipynb |
Figure S5 | 15 mins | |
real_soil_full.sh / real_dataset_soil-2m.ipynb |
Figure S11 | 24 hours |
Script/Notebook | Corresponding Figure(s) | Training Time |
---|---|---|
pred_interval_simu.sh / plot_simu_pred.ipynb |
Figure 9, Figure S9 | 3.5 hours |
prediction_3d.sh / plot_realdata_pred.ipynb |
Figure 13 (d) | 10 hours |
prediction_2d.sh / plot_realdata_pred.ipynb |
Figures S12, S13 | 7.5 hours |