This repository contains the Birds-to-Words dataset, a collection of paragraph-length descriptions of the differences between pairs of iNaturalist bird photographs.
The Birds-to-Words dataset was introduced in the paper:
Neural Naturalist: Generating Fine-Grained Image Comparisons
Maxwell Forbes, Christine Kaeser-Chen, Piyush Sharma, Serge Belongie
EMNLP 2019
Please see the Neural Naturalist project page for an overview of the research project and publication.
The data is provided in the file birds-to-words-v1.0.tsv in this repository.
Animal 1 | Animal 2 |
---|---|
photo: John Ratzlaff (CC BY-NC-ND 4.0) | photo: Jessica (CC BY-NC 4.0) |
Comparative Descriptions (four different writers):

- Animal 1 is brown and white with a squatty body with a light brown head. Animal 2 is multi-colored with a light blue and black head.
- Animal 1 has a brown head and wings, with a pale breast. The breast also has darker brown speckles on it. Animal 2 has a bright blue area around its eye, with a black patch right along the eye. Animal 2 also has a darker brown breast and greenish wings and back of its head.
- Animal 1 has a brown and white face, animal 2 has a black and bright blue face. Animal 1 has a white breast with black spots, animal 2 has a brown breast. Animal 1 has brown wings, animal 2 has green wings.
- Animal 1 is much smaller and shorter. Animal 2 has a larger head and longer tail feathers. Animal 1 has extensive spotting on the neck, chest, and belly. Animal 2 has turquoise head patches and brown coloring on the chest and belly.
The TSV file is tab-separated and contains the following eleven columns:
Column | Name | Type | Description |
---|---|---|---|
1 | img1ObservationURL | string | URL of the iNaturalist photo record (including metadata) corresponding to the left image in the pair |
2 | img1ImgURL | string | URL of the left image itself |
3 | img1Species | string | Scientific species name for the animal in the left image |
4 | img1Selection | string | How the left image was selected in the "pivot-branch" stratified sampling procedure described in the paper. Value is one of: {base, visual, sameSpecies, sameGenus, sameFamily, sameOrder, sameClass} |
5 | img2ObservationURL | string | URL of the iNaturalist photo record (including metadata) corresponding to the right image in the pair |
6 | img2ImgURL | string | URL of the right image itself |
7 | img2Species | string | Scientific species name for the animal in the right image |
8 | img2Selection | string | How the right image was selected in the "pivot-branch" stratified sampling procedure described in the paper. Value is one of: {base, visual, sameSpecies, sameGenus, sameFamily, sameOrder, sameClass} |
9 | split | string | Split for training models and reporting results. One of: {train, val, test} |
10 | annN | int | We collect up to five annotations of each image pair. This is the annotation number of this instance. Value is one of: {1,2,3,4,5} |
11 | description | string | A natural language paragraph describing the differences between the animals in the two photographs |
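
As a rough illustration of how the columns listed above might be read, here is a minimal Python sketch using only the standard library. It is not part of the official release. The helper names (`parse_rows`, `group_by_pair`, `load_dataset`) are hypothetical, and the sketch makes several assumptions that should be checked against the actual file: that fields are not quoted, that a header row (if one exists) starts with `img1ObservationURL`, and that the two image URLs together identify an image pair.

```python
import csv
from collections import defaultdict

# Column names in the order given in the column listing.
COLUMNS = [
    "img1ObservationURL", "img1ImgURL", "img1Species", "img1Selection",
    "img2ObservationURL", "img2ImgURL", "img2Species", "img2Selection",
    "split", "annN", "description",
]


def parse_rows(lines):
    """Yield one dict per annotation from an iterable of tab-separated lines."""
    # Assumption: fields are not wrapped in quotes, so quote characters
    # inside descriptions are kept literally.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if fields[0] == COLUMNS[0]:
            continue  # skip a header row, if the file has one (assumption)
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        row = dict(zip(COLUMNS, fields))
        row["annN"] = int(row["annN"])  # annotation number, 1 through 5
        yield row


def group_by_pair(rows):
    """Collect the descriptions written for each image pair.

    Assumption: (img1ImgURL, img2ImgURL) identifies an image pair.
    """
    pairs = defaultdict(list)
    for row in rows:
        key = (row["img1ImgURL"], row["img2ImgURL"])
        pairs[key].append(row["description"])
    return dict(pairs)


def load_dataset(path):
    """Read every annotation from the TSV file at `path` into a list."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(parse_rows(f))
```

With the data file in the working directory, `load_dataset("birds-to-words-v1.0.tsv")` would return the annotations as a list of dicts, which can then be filtered on `row["split"]` to separate train, val, and test instances.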
The Birds-to-Words dataset is released under the Creative Commons Attribution-ShareAlike 4.0 International License. For the full license, see LICENSE.txt.