This repository contains the official authors' implementation associated with the paper ThermalNeRF: Thermal Radiance Fields.
Abstract: Thermal imaging has a variety of applications, from agricultural monitoring to building inspection to imaging under poor visibility, such as in low light, fog, and rain. However, reconstructing thermal scenes in 3D presents several challenges due to the comparatively lower resolution and limited features present in long-wave infrared (LWIR) images. To overcome these challenges, we propose a unified framework for scene reconstruction from a set of LWIR and RGB images, using a multispectral radiance field to represent a scene viewed by both visible and infrared cameras, thus leveraging information across both spectra. We calibrate the RGB and infrared cameras with respect to each other, as a preprocessing step using a simple calibration target. We demonstrate our method on real-world sets of RGB and LWIR photographs captured from a handheld thermal camera, showing the effectiveness of our method at scene representation across the visible and infrared spectra. We show that our method is capable of thermal super-resolution, as well as visually removing obstacles to reveal objects that are occluded in either the RGB or thermal channels.
This work's codebase is built on top of the Nerfstudio project. As a result, the official Nerfstudio documentation might also be a helpful resource for any additional questions unanswered by this document, which also adapts parts of the Nerfstudio README.
You must have an NVIDIA video card with CUDA installed on the system. This project has been tested with version 11.8 of CUDA.
This project requires Python >= 3.8. We recommend using conda to manage dependencies.
```bash
conda create --name thermalnerf -y python=3.8
conda activate thermalnerf
python -m pip install --upgrade pip
```
Install PyTorch with CUDA (this repo has been tested with CUDA 11.8) and tiny-cuda-nn; `cuda-toolkit` is required for building `tiny-cuda-nn`.
For CUDA 11.8:
```bash
pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
```
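Optionally, sanity-check that the CUDA-enabled PyTorch build can see your GPU (this should print `True`):
```bash
python -c "import torch; print(torch.cuda.is_available())"
```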
Additionally, install COLMAP:
```bash
conda install -c conda-forge colmap
```
Then clone and install this repository:
```bash
git clone git@github.com:yvette256/nerfstudio-thermal.git
cd nerfstudio-thermal
pip install --upgrade pip setuptools
pip install -e .
```
To process FLIR images (in a directory `FLIR_DATA_PATH`) for training, run the following command. This command computes the RGB and thermal camera intrinsics/extrinsics using the calibration images in `CALIBRATION_DATA_PATH` and populates the directory `DATA_PATH`.
```bash
python ns-process-data rgbt --data FLIR_DATA_PATH --output-dir DATA_PATH --calibration-data CALIBRATION_DATA_PATH
```
IMPORTANT: This script currently hard-codes some assumptions. Making these configurable is on our to-do list, but in the meantime, users should be aware of the following:
- The calibration pattern of the target we used (a 4 x 11 asymmetric grid of circular cutouts, each 15 mm in diameter with a 38 mm center-to-center spacing) is hard-coded in this function (see the detection sketch after this list). If you wish to use your own custom calibration pattern, you will need to edit the code in that function.
- We hard-code the assumption that the 3rd and 4th (in lexicographic order) images in `FLIR_DATA_PATH` are taken from camera positions 1 m apart here. This is to resolve the global scale ambiguity in the output of COLMAP, which is used to estimate RGB pose. Precisely, it suffices to know the distance between the camera positions of any two images, so this can be edited to reflect any two images and any distance when using custom data; see the sketch after this list.
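For reference, here is a minimal sketch of how a 4 x 11 asymmetric circle grid like ours can be detected with OpenCV. This is illustrative only: our actual calibration code may differ, and the mapping of the 38 mm spacing onto grid coordinates is an assumption here.
```python
import cv2
import numpy as np

SPACING_M = 0.038  # 38 mm center-to-center spacing of the target described above

# 3D object points for a 4 x 11 asymmetric circle grid on the z = 0 plane.
# NOTE: the exact coordinate convention for the spacing is an assumption.
objp = np.array(
    [[(2 * j + i % 2) * SPACING_M, i * SPACING_M, 0.0]
     for i in range(11) for j in range(4)],
    dtype=np.float32,
)

# Illustrative input image; in practice this loops over all calibration views.
img = cv2.imread("calibration_image.png", cv2.IMREAD_GRAYSCALE)
found, centers = cv2.findCirclesGrid(img, (4, 11), None, cv2.CALIB_CB_ASYMMETRIC_GRID)
# With detections from several views, the (objp, centers) pairs can be fed to
# cv2.calibrateCamera to recover intrinsics for each camera.
```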
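Similarly, a sketch of the scale-resolution step (hypothetical names, not the repo's actual code): given the COLMAP camera centers of the two reference images and their known real-world separation, every camera translation is rescaled by a single factor.
```python
import numpy as np

def metric_scale(center_a, center_b, known_distance_m=1.0):
    """Factor mapping COLMAP's arbitrary units to meters."""
    return known_distance_m / np.linalg.norm(center_a - center_b)

# Example with made-up camera centers for the 3rd and 4th images:
scale = metric_scale(np.array([0.0, 0.0, 0.0]), np.array([0.5, 0.0, 0.0]))
# Multiplying every camera translation by `scale` yields metrically scaled poses.
```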
Use the `--help` flag to see the full list of configuration options. This includes additional options inherited from the original Nerfstudio image-processing scripts on which our script is built; most of these should still work, but they have not been extensively tested.
```bash
python ns-process-data rgbt --help
```
To train the thermal-nerfacto model on the (processed) thermal data, run
```bash
python ns-train thermal-nerfacto --data DATA_PATH
```
Configuration options for thermal-nerfacto include:
- How to treat density between RGB/T (`rgb_only` reconstructs only the RGB field).
- Density loss (L1 norm of `<rgb density> - <thermal density>`) multiplier.
- Relative influence on RGB density in the L1 density loss (applied on top of `density_loss_mult`).
- Cross-channel gradient loss multiplier.
- Thermal pixel-wise reconstruction loss multiplier.
- Pixel-wise thermal TV loss multiplier.
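As a rough illustration of how two of these multipliers enter the objective, here is a sketch with hypothetical tensor names; it is not the repo's actual implementation.
```python
import torch

def density_l1_loss(rgb_density, thermal_density, density_loss_mult):
    # L1 norm of <rgb density> - <thermal density>, scaled by its multiplier.
    return density_loss_mult * (rgb_density - thermal_density).abs().mean()

def thermal_tv_loss(thermal_image, tv_loss_mult):
    # Pixel-wise total variation on an (H, W) thermal rendering.
    dh = (thermal_image[1:, :] - thermal_image[:-1, :]).abs().mean()
    dw = (thermal_image[:, 1:] - thermal_image[:, :-1]).abs().mean()
    return tv_loss_mult * (dh + dw)
```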
We support four different methods to track training progress: the viewer, tensorboard, Weights and Biases, and Comet. You can specify which visualizer(s) to use by appending `--vis {viewer, tensorboard, wandb, comet, viewer+wandb, viewer+tensorboard, viewer+comet}` to the training command.
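For example, to train while streaming to the viewer and logging to tensorboard:
```bash
python ns-train thermal-nerfacto --data DATA_PATH --vis viewer+tensorboard
```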
It is possible to load a pretrained model by running
```bash
python ns-train nerfacto --data DATA_PATH --load-dir MODEL_PATH
```
Use the `--help` flag to see the full list of configuration options. This includes additional options inherited from the original Nerfstudio nerfacto model on which thermal-nerfacto is built; most of these should still work, but they have not been extensively tested with thermal-nerfacto.
```bash
python ns-train thermal-nerfacto --help
```
When training, navigating to the link at the end of the terminal will load the webviewer. Otherwise, given a pretrained model checkpoint, you can start the viewer by running
```bash
python ns-viewer --load-config {outputs/.../config.yml}
```
In the viewer, the thermal outputs are named like the RGB outputs but with `_thermal` appended: `rgb` is the RGB view while `rgb_thermal` is the thermal view, `depth` is the RGB depth while `depth_thermal` is the thermal depth, etc. (We know this results in some unintuitive names, sorry about that; it may be changed in the future.)
To render outputs from train/test views, run
```bash
python ns-render dataset --load-config {outputs/.../config.yml}
```
Command-line arguments:
- Split to render (default: `test`).
- Name of the renderer outputs to use. As described previously, the thermal outputs are named like the RGB outputs but with `_thermal` appended: `rgb` is the RGB view while `rgb_thermal` is the thermal view, `depth` is the RGB depth while `depth_thermal` is the thermal depth, etc.
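For example, to render the thermal view over the test split (assuming the split argument above is spelled `--split`; the config path is a placeholder):
```bash
python ns-render dataset --load-config {outputs/.../config.yml} --split test --rendered-output-names rgb_thermal
```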
Removing hidden objects

We can remove occluding objects from RGB or thermal views, thus revealing objects hidden behind other objects, by rendering only the parts of the scene whose RGB and thermal densities are sufficiently similar to each other. To do this, use `--rendered-output-names removal` to render hidden RGB objects and/or `--rendered-output-names removal_thermal` to render hidden thermal objects. If desired, use `--removal_min_density_diff FLOAT` to specify the minimum difference between the RGB and thermal densities allowed for removal rendering.
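Conceptually, the removal output can be thought of as masking out samples whose RGB and thermal densities disagree by more than the threshold. A sketch with hypothetical tensor names, not the repo's actual implementation:
```python
import torch

def removal_density(rgb_density, thermal_density, removal_min_density_diff):
    # Keep only samples whose RGB and thermal densities agree to within the
    # threshold; occluders present in one spectrum but not the other drop out.
    agree = (rgb_density - thermal_density).abs() < removal_min_density_diff
    return torch.where(agree, rgb_density, torch.zeros_like(rgb_density))
```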
This material is based upon work supported by the National Science Foundation under award number 2303178 to SFK. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
If you find this repo useful for your projects, please consider citing:
```bibtex
@inproceedings{lin2024thermalnerf,
  title        = {{ThermalNeRF}: Thermal Radiance Fields},
  author       = {Lin, Yvette Y and Pan, Xin-Yi and Fridovich-Keil, Sara and Wetzstein, Gordon},
  year         = {2024},
  booktitle    = {IEEE International Conference on Computational Photography (ICCP)},
  organization = {IEEE}
}
```