This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark [CVPR 2024]

Project Page Paper PDF

We present the Real Acoustic Fields (RAF) dataset that captures real acoustic room data from multiple modalities. The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms.

Updates

  • 2024/08/29: Uploaded details of the data split we used for experiments reported in the paper.
  • 2024/08/29: The visual data is now integrated into nerfstudio.

Dataset

The Real Acoustic Fields dataset is hosted on AWS S3. We recommend downloading it with the AWS command line interface (AWS CLI); see the AWS CLI installation instructions.

RIR

The RIR data is organized as follows:

    ├───data
    │   ├───000000
    │          rir.wav
    │          rx_pos.txt
    │          tx_pos.txt
    │   ├───000001
    :   :
    ├───metadata
    │      all_rx_pos.txt
    │      all_tx_pos.txt
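
After extraction, the layout above can be sanity-checked with a few lines of Python. This is a minimal sketch, not part of the RAF tooling: the function name `find_incomplete_samples` and its `root` argument (the directory that contains `data`) are hypothetical, and the only assumption is that every sample folder should hold the three files shown in the tree.

```python
from pathlib import Path

# File names taken from the directory tree above.
EXPECTED_FILES = ("rir.wav", "rx_pos.txt", "tx_pos.txt")


def find_incomplete_samples(root):
    """Return {folder_name: [missing file names]} for folders under root/data."""
    missing = {}
    data_dir = Path(root) / "data"
    for sample_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        absent = [name for name in EXPECTED_FILES
                  if not (sample_dir / name).is_file()]
        if absent:
            missing[sample_dir.name] = absent
    return missing
```

An empty dictionary means every sample folder contains all three files.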

The rx_pos.txt file contains the receiver's 3D location (xyz) in the room. As we used omni-directional microphones, the orientation isn't required.

The tx_pos.txt file contains the transmitter's (loudspeaker's) 3D orientation and 3D location (xyz) in the room. The orientation is given as a quaternion in scalar-last (real part last) order: x, y, z, w. The first four values in the file are the loudspeaker's orientation, and the last three values are its 3D location.
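
The seven values in tx_pos.txt can be split as described above with a short parser. The sketch below is illustrative only and rests on several assumptions that the README does not confirm: the values may be separated by commas or whitespace (the parser accepts both), the quaternion follows the usual Hamilton convention, and the loudspeaker's local forward axis is +X. The function names are hypothetical.

```python
import math
import re


def parse_floats(text):
    """Split on commas and/or whitespace; the delimiter is an assumption."""
    return [float(tok) for tok in re.split(r"[,\s]+", text.strip()) if tok]


def parse_tx_pos(text):
    """Return (quaternion_xyzw, position_xyz) from the contents of tx_pos.txt."""
    values = parse_floats(text)
    if len(values) != 7:
        raise ValueError(f"expected 7 values, got {len(values)}")
    return tuple(values[:4]), tuple(values[4:])


def rotate(quat_xyzw, v):
    """Rotate the 3-vector v by a scalar-last quaternion (normalized first)."""
    x, y, z, w = quat_xyzw
    n = math.sqrt(x * x + y * y + z * z + w * w)
    x, y, z, w = x / n, y / n, z / n, w / n
    # t = 2 * cross(q_vec, v)
    tx = 2 * (y * v[2] - z * v[1])
    ty = 2 * (z * v[0] - x * v[2])
    tz = 2 * (x * v[1] - y * v[0])
    # v' = v + w * t + cross(q_vec, t)
    return (v[0] + w * tx + (y * tz - z * ty),
            v[1] + w * ty + (z * tx - x * tz),
            v[2] + w * tz + (x * ty - y * tx))
```

For example, `quat, pos = parse_tx_pos(open("tx_pos.txt").read())` followed by `rotate(quat, (1.0, 0.0, 0.0))` gives the facing direction in world coordinates, provided the +X forward assumption holds for the loudspeaker.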

The metadata subfolder consolidates all the receiver and transmitter data into two CSV files, all_rx_pos.txt and all_tx_pos.txt. Each row in these files corresponds to a subfolder in the data section, counting rows from zero: the first row (row 0) corresponds to the folder data\000000, and row 4568 corresponds to the folder data\004568.
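
The row-to-folder mapping can be sketched as follows. This assumes rows are indexed from zero (consistent with the first row mapping to 000000 and row 4568 to 004568) and that folder names are zero-padded to six digits, as in the tree above. The delimiter handling and the function names are hypothetical.

```python
import re


def folder_name(row_index):
    """Zero-based row index to six-digit folder name, e.g. 4568 -> '004568'."""
    return f"{row_index:06d}"


def load_rows(text):
    """Parse metadata text into one list of floats per non-empty row."""
    return [[float(tok) for tok in re.split(r"[,\s]+", line.strip()) if tok]
            for line in text.splitlines() if line.strip()]


def index_by_folder(text):
    """Map each folder name to the values on the matching metadata row."""
    return {folder_name(i): row for i, row in enumerate(load_rows(text))}
```

With the contents of all_rx_pos.txt as `text`, `index_by_folder(text)["004568"]` would return the receiver position stored for that sample.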

All 3D positions are in meters. World coordinate system: X front, Y up, Z left; y = 0 is the ground plane.
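
Because positions are metric and y = 0 is the floor, simple geometric quantities follow directly from the coordinates. The sketch below is an illustration, not part of the dataset tooling: the function names are hypothetical, and the speed of sound (343 m/s, roughly air at 20 °C) is an assumption rather than a value given by the dataset.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value, not specified by the dataset


def source_receiver_distance(tx_xyz, rx_xyz):
    """Euclidean distance in meters between transmitter and receiver."""
    return math.dist(tx_xyz, rx_xyz)


def height_above_ground(xyz):
    """Height in meters; Y is up and y = 0 is the ground plane."""
    return xyz[1]


def direct_path_delay(tx_xyz, rx_xyz, speed=SPEED_OF_SOUND_M_S):
    """Approximate arrival time in seconds of the direct sound."""
    return source_receiver_distance(tx_xyz, rx_xyz) / speed
```

The direct-path delay gives a rough idea of where the first peak of an impulse response should appear for a given emitter and listener pair.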

🔽 Download Room Impulse Response (RIR) Data

Preview link

Step 1. Increase the number of concurrent requests from 10 to 100 to download smaller files faster.

$ aws configure set default.s3.max_concurrent_requests 100

Step 2. Download the entire RAF RIR dataset (~21.6 GB).

$ mkdir raf_dataset && cd raf_dataset
$ aws s3 sync --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/real_acoustic_fields/rir .

Step 3. Use the zip command to combine the split zip files into a single zip archive.

$ zip -F raf_emptyroom.zip --out single-archive_emptyroom.zip
$ zip -F raf_furnishedroom.zip --out single-archive_furnishedroom.zip

Step 4. Use unzip to extract the combined archives.

$ unzip single-archive_emptyroom.zip
$ unzip single-archive_furnishedroom.zip

Step 5. Clean up by deleting the original split archive files.

$ rm raf_*room.z??

Visual Data

We have released the visual data from "Emptyroom" and "Furnishedroom" as part of another dataset release. Please refer to the Eyeful dataset for more details.

Visual Data Organization

Visual Data Download Instructions

3D Models (Reconstructed 3D Mesh)

Textured mesh in OBJ format, exported from Metashape and created from the full-resolution JPEG images. World coordinate system: X front, Y up, Z left; y = 0 is the ground plane. Units are in meters.

You can download the OBJ files of the rooms here: Empty Room and Furnished Room.
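
Since the meshes share the world coordinate system and are in meters, a quick way to inspect a downloaded mesh is to compute its axis-aligned bounding box from the OBJ vertex records. This is a minimal sketch that reads only `v` lines and ignores everything else in the file. The function name `obj_bounds` is hypothetical.

```python
def obj_bounds(lines):
    """Return ((min_x, min_y, min_z), (max_x, max_y, max_z)) over 'v' records."""
    lo = [float("inf")] * 3
    hi = [float("-inf")] * 3
    found = False
    for line in lines:
        parts = line.split()
        # Geometric vertices start with exactly 'v'; 'vt' and 'vn' are skipped.
        if len(parts) >= 4 and parts[0] == "v":
            found = True
            for i in range(3):
                coord = float(parts[i + 1])
                lo[i] = min(lo[i], coord)
                hi[i] = max(hi[i], coord)
    if not found:
        raise ValueError("no vertex records found")
    return tuple(lo), tuple(hi)
```

Passing an open file object works because the function only iterates over lines. If the mesh follows the stated convention, the minimum y value should sit close to 0 (the floor) and the y extent should approximate the room height in meters.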

Citation

If you find this repository and dataset useful in your research, please consider giving it a star ⭐ and citing our CVPR 2024 paper using the following BibTeX entry.

@inproceedings{chen2024RAF,
      author    = { Chen, Ziyang and
                    Gebru, Israel D. and
                    Richardt, Christian and
                    Kumar, Anurag and
                    Laney, William and
                    Owens, Andrew and
                    Richard, Alexander},
      title     = {Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark},
      booktitle = {The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR)},
      year      = {2024},
    }

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, as found in the LICENSE file.