Vision science - particularly machine vision - is being
revolutionized by large-scale datasets. State-of-the-art artificial
vision models critically depend on large-scale datasets to achieve high
performance. In contrast, although large-scale learning models (e.g., AlexNet) have
been applied to human neuroimaging data, the stimuli for such
neuroimaging experiments include significantly fewer images. The small size of these stimulus sets also translates to limited image
diversity. Here we dramatically increase the stimulus set size deployed
in an fMRI study of visual scene processing. We scanned
four participants in a slow event-related design that incorporated
4,916 unique scenes. Data were collected over 16 sessions, 15 of which
were task-related sessions, plus an additional session for acquiring
high-resolution anatomical scans. In 8 of the 15 task-related sessions, a
functional localizer was run in order to independently define
scene-selective cortex. In each scanning session, participants filled
out a questionnaire (Daily Intake) about their daily routine, including:
current status regarding food and beverage intake, sleep, exercise, ibuprofen,
and comfort in the scanner. During BOLD scanning, physiological data
(heart rate and respiration) were also acquired.
The experiment included 4,803 images presented on a single trial,
112 images repeated four times, and one image repeated three times,
yielding a total of 5,254 stimulus trials. The stimuli were drawn from
three datasets: 1)
1,000 images from Scene Images (250 scene categories, based on SUN
categories, with four exemplars each); 2) 2,000 images from the COCO
dataset; and 3) 1,916 images from the ImageNet dataset. In the
experiment, images were presented for 1 second, with 9 seconds of
fixation between trials. Participants were asked to judge whether they
liked, disliked, or were neutral about the image.
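The counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, using only the numbers stated in this description; the variable names are illustrative, and the total presentation time is a derived nominal figure (it ignores localizer runs, anatomical scans, and any per-run overhead), not a value reported by the authors.

```python
# Image counts exactly as stated in the description above.
IMAGES_SHOWN_ONCE = 4803
IMAGES_SHOWN_FOUR_TIMES = 112
IMAGES_SHOWN_THREE_TIMES = 1

# Source breakdown as stated in the description above.
SOURCE_COUNTS = {"Scene Images": 1000, "COCO": 2000, "ImageNet": 1916}

# Trial timing as stated in the description above, in seconds.
STIMULUS_S = 1
FIXATION_S = 9

# Every image counts once toward the unique-image total.
unique_images = (
    IMAGES_SHOWN_ONCE + IMAGES_SHOWN_FOUR_TIMES + IMAGES_SHOWN_THREE_TIMES
)

# Repeated images contribute one trial per presentation.
total_trials = (
    IMAGES_SHOWN_ONCE * 1
    + IMAGES_SHOWN_FOUR_TIMES * 4
    + IMAGES_SHOWN_THREE_TIMES * 3
)

# One trial is the stimulus followed by the fixation interval.
trial_duration_s = STIMULUS_S + FIXATION_S

# Derived, nominal figure (an assumption of this sketch, not a reported value).
nominal_trial_time_h = total_trials * trial_duration_s / 3600

print(f"unique images: {unique_images}")
print(f"images across sources: {sum(SOURCE_COUNTS.values())}")
print(f"total trials: {total_trials}")
print(f"trial duration (s): {trial_duration_s}")
print(f"nominal trial time (h): {nominal_trial_time_h:.1f}")
```

The unique-image total and the sum over the three source datasets should agree, which is a quick consistency check on the stimulus breakdown.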
In sum, our dataset is unique in three ways: it 1) is an order of
magnitude larger than existing slow event-related neural datasets, 2) is
extremely diverse in stimuli, and 3) overlaps considerably with existing
computer vision datasets. Our large-scale dataset enables novel neural
network training and novel exploration of benchmark computer vision datasets through
neuroscience. Finally, the scale advantage of our dataset and the use
of a slow event-related design enable, for the first time, joint
computer vision and fMRI analyses that span a significant and diverse
region of image space using high-performing models.
Please refer to our website for more details and future news and releases: BOLD5000.org
Corresponding paper published in Scientific Data:
Chang, N., Pyles, J., Marcus, A., Gupta, A., Tarr, M., Aminoff, E. (2019). BOLD5000, a public fMRI dataset while viewing 5000 visual images. Scientific Data, 6:49. https://doi.org/10.1038/s41597-019-0052-3 arXiv preprint: https://arxiv.org/abs/1809.01281
v2: Added BOLD5000_ROIs.zip (9/7/18)
v3: Added BOLD5000_MRI-Protocols.zip (9/11/18)
v4: Added Austin Marcus as author and image stimuli files moved to a different location (see bold5000.org)