4  Raw MRI Data Starter Kit

This page describes how to work with the raw MRI data from A2CPS. For background on collection and availability, see the “Imaging Data” section of a2cps.org. The data are provided in a format called the Brain Imaging Data Structure (BIDS). The raw data are most useful when you need to run custom preprocessing pipelines. If you do not need custom pipelines, please review the preface for a list of imaging products that have already been calculated.

4.1 On Starting Project

Once downloaded, the data will be stored deep inside the image03 folder.

$ ls image03/corral-secure/projects/A2CPS/products/consortium-data/pre-surgery/mris/bids/ | head
dataset_description.json
participants.json
participants.tsv
README
scans.json
sessions.json
sub-10003
sub-10005
sub-10008
sub-10010

4.1.1 Extract Data

The raw MRI data are organized according to v1.9.0 of the BIDS standard.

In BIDS, data for individual subjects are stored in folders named “sub-[record_id]”. For an example A2CPS session, this format results in the structure shown below. (The tree command is used here only to display the folder structure; you do not need to have it installed.)

$ tree sub-10003
sub-10003
├── ses-V1
│   ├── anat
│   │   ├── sub-10003_ses-V1_T1w.json
│   │   └── sub-10003_ses-V1_T1w.nii.gz
│   ├── dwi
│   │   ├── sub-10003_ses-V1_dwi.bval
│   │   ├── sub-10003_ses-V1_dwi.bvec
│   │   ├── sub-10003_ses-V1_dwi.json
│   │   └── sub-10003_ses-V1_dwi.nii.gz
│   ├── fmap
│   │   ├── sub-10003_ses-V1_acq-dwib0_dir-AP_epi.json
│   │   ├── sub-10003_ses-V1_acq-dwib0_dir-AP_epi.nii.gz
│   │   ├── sub-10003_ses-V1_acq-dwib0_dir-PA_epi.json
│   │   ├── sub-10003_ses-V1_acq-dwib0_dir-PA_epi.nii.gz
│   │   ├── sub-10003_ses-V1_acq-dwib0_epi.bval
│   │   ├── sub-10003_ses-V1_acq-dwib0_epi.bvec
│   │   ├── sub-10003_ses-V1_acq-fmrib0_dir-AP_epi.json
│   │   ├── sub-10003_ses-V1_acq-fmrib0_dir-AP_epi.nii.gz
│   │   ├── sub-10003_ses-V1_acq-fmrib0_dir-PA_epi.json
│   │   └── sub-10003_ses-V1_acq-fmrib0_dir-PA_epi.nii.gz
│   ├── func
│   │   ├── sub-10003_ses-V1_task-cuff_run-01_bold.json
│   │   ├── sub-10003_ses-V1_task-cuff_run-01_bold.nii.gz
│   │   ├── sub-10003_ses-V1_task-cuff_run-01_events.tsv
│   │   ├── sub-10003_ses-V1_task-rest_run-01_bold.json
│   │   ├── sub-10003_ses-V1_task-rest_run-01_bold.nii.gz
│   │   ├── sub-10003_ses-V1_task-rest_run-02_bold.json
│   │   └── sub-10003_ses-V1_task-rest_run-02_bold.nii.gz
│   └── sub-10003_ses-V1_scans.tsv
└── sub-10003_sessions.tsv
5 directories, 25 files
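
The file names in this listing follow the BIDS pattern of underscore-separated key-value pairs, followed by a suffix and an extension. As a minimal sketch (the helper name parse_bids_name is hypothetical and not part of any A2CPS or BIDS tool), the pieces can be pulled apart with the Python standard library:

```python
from pathlib import Path


def parse_bids_name(path):
    """Split a BIDS file name into key-value entities, suffix, and extension."""
    name = Path(path).name
    # Everything after the first dot is treated as the extension (e.g. ".nii.gz").
    stem, dot, ext = name.partition(".")
    parts = stem.split("_")
    # Chunks such as "sub-10003" are key-value pairs; the last chunk is the suffix.
    entities = dict(part.split("-", 1) for part in parts[:-1])
    return {"entities": entities, "suffix": parts[-1], "extension": dot + ext}


info = parse_bids_name("sub-10003_ses-V1_task-rest_run-02_bold.nii.gz")
print(info["entities"], info["suffix"], info["extension"])
```

This sketch assumes every chunk before the suffix contains a hyphen, which holds for the file names shown above. For anything beyond quick scripting, one of the dedicated BIDS packages is likely a better choice.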

The raw imaging data are in the (gzip compressed) NIfTI files, and each image is associated with a .json sidecar containing metadata about the scan and acquisition parameters (metadata that subsumes the NIfTI header).
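
Because each image and its sidecar share a file name up to the extension, the sidecar can be located from the image path alone. The following is a minimal sketch using only the standard library; the function names sidecar_path and load_sidecar are hypothetical helpers introduced for illustration.

```python
import json
from pathlib import Path


def sidecar_path(nifti_path):
    """Return the path of the .json sidecar that sits next to a .nii.gz image."""
    nifti_path = Path(nifti_path)
    if not nifti_path.name.endswith(".nii.gz"):
        raise ValueError(f"expected a .nii.gz file, got {nifti_path.name}")
    return nifti_path.with_name(nifti_path.name[: -len(".nii.gz")] + ".json")


def load_sidecar(nifti_path):
    """Read the sidecar metadata for an image into a dict."""
    with open(sidecar_path(nifti_path), encoding="utf-8") as f:
        return json.load(f)


# Example (requires the downloaded data):
# meta = load_sidecar("sub-10003/ses-V1/anat/sub-10003_ses-V1_T1w.nii.gz")
```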

The “session” refers to the visit. Currently, only the baseline (pre-surgery) data are included, so all sessions have the label “V1”. Information about each participant’s visit is stored in the sub-[record_id]/sub-[record_id]_sessions.tsv file.

The pressures that were applied during the CUFF scans are recorded in the sub-[record_id]_ses-[ses_id]_task-cuff_run-[run_id]_events.tsv tables.

BIDS requires that the diffusion gradient information is stored according to the FSL format, in *bval and *bvec files in the /dwi directory. For details see: Magnetic Resonance Imaging - Brain Imaging Data Structure.
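
In the FSL format, a *bval file holds a single row of b-values (one per volume) and a *bvec file holds three rows (the x, y, and z components), with one column per volume. A minimal sketch of reading both, assuming whitespace-separated plain text and using toy values that are not taken from the A2CPS data:

```python
def parse_bval(text):
    """Parse FSL-style .bval text: one row of b-values, one per volume."""
    return [float(value) for value in text.split()]


def parse_bvec(text):
    """Parse FSL-style .bvec text: three rows (x, y, z), one column per volume."""
    rows = [
        [float(value) for value in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]
    if len(rows) != 3 or len({len(row) for row in rows}) != 1:
        raise ValueError("expected three rows of equal length")
    # Transpose so that each item is the (x, y, z) direction of one volume.
    return list(zip(*rows))


# Toy inputs for illustration only.
bvals = parse_bval("0 1000 1000\n")
bvecs = parse_bvec("0 1 0\n0 0 1\n0 0 0\n")
assert len(bvals) == len(bvecs)  # one gradient direction per volume
```

To read real files, pass the file contents, for example parse_bval(open(path).read()).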

Files at the top level of the bids folder contain information that applies to multiple participants, including the data dictionaries for the per-participant tables. The files and their documentation are listed below:

File                                                                                Documentation
README (A2CPS file)                                                                 modality agnostic files
participants.tsv (dictionary)                                                       participants file
sub-[record_id]_ses-[ses_id]_scans.tsv tables (dictionary)                          scans file
sub-[record_id]_sessions.tsv tables (dictionary)                                    sessions file
sub-[record_id]_ses-[ses_id]_task-cuff_run-[run_id]_events.tsv tables (dictionary)  task events
dataset_description.json (A2CPS file)                                               dataset descriptions

4.1.2 Data Quality

All raw data have undergone quality review. For details on the review process see A2CPS Imaging Quality Assurance.

The resulting overall quality ratings (red/yellow/green) are included in the *scans.tsv files. For example:

$ cat sub-10003/ses-V1/sub-10003_ses-V1_scans.tsv 
filename    rating
func/sub-10003_ses-V1_task-cuff_run-01_bold.nii.gz  green
dwi/sub-10003_ses-V1_dwi.nii.gz green
func/sub-10003_ses-V1_task-rest_run-01_bold.nii.gz  green
func/sub-10003_ses-V1_task-rest_run-02_bold.nii.gz  green
anat/sub-10003_ses-V1_T1w.nii.gz    green
fmap/sub-10003_ses-V1_acq-fmrib0_dir-AP_epi.nii.gz  n/a
fmap/sub-10003_ses-V1_acq-fmrib0_dir-PA_epi.nii.gz  n/a
fmap/sub-10003_ses-V1_acq-dwib0_dir-AP_epi.nii.gz   n/a
fmap/sub-10003_ses-V1_acq-dwib0_dir-PA_epi.nii.gz   n/a

For a description of the reviews, see the scans.json data dictionary. Re-evaluation of quality and inclusion/exclusion criteria in the context of a specific analysis is always reasonable, but we would advise against including scans labeled “red” in most analyses.
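
As one way to apply this advice, the ratings can be used to build a list of scans to keep. The sketch below assumes the two tab-separated columns shown above (filename and rating); the function name usable_scans is hypothetical, and the “red” row in the example is made up for illustration.

```python
import csv


def usable_scans(lines, exclude=("red",)):
    """Return filenames from a *_scans.tsv whose rating is not in `exclude`."""
    reader = csv.DictReader(lines, delimiter="\t")
    return [row["filename"] for row in reader if row["rating"] not in exclude]


example = [
    "filename\trating\n",
    "anat/sub-10003_ses-V1_T1w.nii.gz\tgreen\n",
    "fmap/sub-10003_ses-V1_acq-dwib0_dir-AP_epi.nii.gz\tn/a\n",
    "func/sub-00000_ses-V1_task-rest_run-01_bold.nii.gz\tred\n",  # made-up row
]
kept = usable_scans(example)
print(kept)
```

The function accepts any iterable of lines, so an open file handle works as well: with open(path, newline="") as f: usable_scans(f). Scans rated “n/a” (the field maps) are kept by default; pass a different exclude tuple to change the criteria for a specific analysis.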

Note that some “cuff” scans were collected without cuff inflation. These scans can be identified by a value of 0 in the “applied_pressure” column of the *events.tsv file. For example:

$ grep -b1 -HP "0\t450\t0" sub-*/ses-V1/func/*events.tsv | head -n 9
sub-10077/ses-V1/func/sub-10077_ses-V1_task-cuff_run-01_events.tsv-0-onset  duration    applied_pressure
sub-10077/ses-V1/func/sub-10077_ses-V1_task-cuff_run-01_events.tsv:32:0 450 0
--
sub-10077/ses-V1/func/sub-10077_ses-V1_task-cuff_run-02_events.tsv-0-onset  duration    applied_pressure
sub-10077/ses-V1/func/sub-10077_ses-V1_task-cuff_run-02_events.tsv:32:0 450 0
--
sub-10103/ses-V1/func/sub-10103_ses-V1_task-cuff_run-01_events.tsv-0-onset  duration    applied_pressure
sub-10103/ses-V1/func/sub-10103_ses-V1_task-cuff_run-01_events.tsv:32:0 450 0
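
The same check can be sketched in Python for use inside an analysis script. This assumes the three tab-separated columns shown in the output above and that applied_pressure always holds a number; the function names are hypothetical, and the nonzero pressure in the second example is made up.

```python
import csv


def applied_pressures(lines):
    """Return the applied_pressure values listed in a cuff *_events.tsv."""
    reader = csv.DictReader(lines, delimiter="\t")
    return [float(row["applied_pressure"]) for row in reader]


def is_uninflated(lines):
    """True when the run has events and all of them report zero pressure."""
    pressures = applied_pressures(lines)
    return bool(pressures) and all(p == 0 for p in pressures)


# Rows as shown in the grep output above.
no_inflation = ["onset\tduration\tapplied_pressure\n", "0\t450\t0\n"]
# Made-up pressure value, for contrast only.
inflated = ["onset\tduration\tapplied_pressure\n", "0\t450\t120\n"]

print(is_uninflated(no_inflation), is_uninflated(inflated))
```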

4.2 Considerations While Working on Project

4.2.1 Data Generation

Each scanner’s manufacturer-specific algorithms and software are used for the reconstruction of image data from k-space (k-space data were not saved). The reconstructed images are exported in the Digital Imaging and Communications in Medicine (DICOM) format and sent electronically from participating sites to the Texas Advanced Computing Center (TACC). Following the Organization for Human Brain Mapping Committee on Best Practices in Data Analysis and Sharing report (Nichols et al., 2017), the DICOM files are converted into the Neuroimaging Informatics Technology Initiative (NIfTI) 1 format (Cox et al., 2004) and organized according to the BIDS specification (Gorgolewski et al., 2016). For this conversion, the pipeline uses HeuDiConv (Halchenko et al., 2024), which in turn uses dcm2niix (Li et al., 2016). Conversion relies on the ReproIn heuristic file (Visconti di Oleggio Castello et al., 2023), modified to encode the sequence and file naming conventions developed for the A2CPS project. The original DICOMs are stored for posterity but are not included in data releases. For additional details, see Sadil et al. (2024).

Nichols, T. E., Das, S., Eickhoff, S. B., Evans, A. C., Glatard, T., Hanke, M., Kriegeskorte, N., Milham, M. P., Poldrack, R. A., Poline, J.-B., et al. (2017). Best practices in data analysis and sharing in neuroimaging using MRI. Nature Neuroscience, 20(3), 299–303. https://doi.org/10.1038/nn.4500
Cox, R. W., Ashburner, J., Breman, H., Fissell, K., Haselgrove, C., Holmes, C. J., Lancaster, J. L., Rex, D. E., Smith, S. M., Woodward, J. B., et al. (2004). A (sort of) new image data format standard: NiFTI-1. 10th Annual Meeting of the Organization for Human Brain Mapping, 22, 01. https://nifti.nimh.nih.gov/pub/dist/doc/hbm_nifti_2004.pdf
Gorgolewski, K. J., Auer, T., Calhoun, V. D., Craddock, R. C., Das, S., Duff, E. P., Flandin, G., Ghosh, S. S., Glatard, T., Halchenko, Y. O., et al. (2016). The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Scientific Data, 3(1), 1–9. https://doi.org/10.1038/sdata.2016.44
Halchenko, Y. O., Goncalves, M., Ghosh, S., Velasco, P., Oleggio Castello, M. V. di, Salo, T., Wodder, J. T., Hanke, M., Sadil, P., Gorgolewski, K. J., et al. (2024). HeuDiConv—flexible DICOM conversion into structured directory layouts. Journal of Open Source Software, 9(99), 5839. https://doi.org/10.21105/joss.05839
Li, X., Morgan, P. S., Ashburner, J., Smith, J., & Rorden, C. (2016). The first step for neuroimaging data analysis: DICOM to NIfTI conversion. Journal of Neuroscience Methods, 264, 47–56. https://doi.org/10.1016/j.jneumeth.2016.03.001
Visconti di Oleggio Castello, M., Dobson, J. E., Sackett, T., Kodiweera, C., Haxby, J. V., Goncalves, M., Ghosh, S., & Halchenko, Y. O. (2023). ReproNim/reproin: 0.11.6.2. Zenodo. https://doi.org/10.5281/ZENODO.7975330

All MRI pipeline code is available through GitHub: https://github.com/a2cps/mri_imaging_pipeline. Note that the code used to generate the BIDS derivatives has been updated several times (e.g., the same version of dcm2niix was not used to convert all participants). The specific version of the conversion code is stored in the GeneratedBy field of the dataset_description.json, with entries like the following:

{
  "CodeUrl": "https://github.com/a2cps/mri_imaging_pipeline",
  "Container": {
    "Tag": "221121",
    "Type": "Docker",
    "URI": "docker://psadil/heudiconv:221121"
  },
  "Description": "WS20207V1",
  "Name": "heudiconv_app",
  "Version": "urrutia-heudiconv-0.11.6"
} 

This indicates that the pre-surgery scan session for participant 20207 (who happens to have been scanned at Wayne State) was converted with version 0.11.6 of heudiconv_app. That version is recorded here. This version of the app used the Docker image psadil/heudiconv:221121.
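
To look up the conversion record for a given participant programmatically, the GeneratedBy entries can be filtered by their Description. This is a minimal sketch: it assumes GeneratedBy is a list of entries like the one above and that each Description embeds the record ID, which is inferred from this single example. A plain substring match could also over-match, so treat the result as a starting point. The function name conversions_for is hypothetical.

```python
import json


def conversions_for(description, record_id):
    """Return GeneratedBy entries whose Description mentions the record ID."""
    return [
        entry
        for entry in description.get("GeneratedBy", [])
        if str(record_id) in entry.get("Description", "")
    ]


# The entry shown above, wrapped in a GeneratedBy list for illustration.
description = json.loads("""
{"GeneratedBy": [{
  "CodeUrl": "https://github.com/a2cps/mri_imaging_pipeline",
  "Container": {"Tag": "221121", "Type": "Docker",
                "URI": "docker://psadil/heudiconv:221121"},
  "Description": "WS20207V1",
  "Name": "heudiconv_app",
  "Version": "urrutia-heudiconv-0.11.6"
}]}
""")

for entry in conversions_for(description, 20207):
    print(entry["Version"], entry["Container"]["URI"])
```

With the real data, description would instead come from json.load on the dataset_description.json file at the top of the bids folder.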

4.2.2 Skull-Stripping for Privacy

Prior to release, all T1w images are skull-stripped using brain masks from the fMRIPrep pipeline. All DIRC processing is carried out using unmodified raw images, which is recommended for many measures like cortical thickness.

4.2.3 Acquisition Protocol Variation

Over the course of the project, individual scanners have occasionally needed upgrading. For a timeline of acquisition changes, see: Timeline of Imaging Acquisition Changes.

Occasionally, scans were collected with acquisition parameters that differed from the standards listed in the A2CPS Tech Manual. For a review of the kinds of deviations that have been observed, see Imaging Log - Problem Cases. In general, preprocessing pipelines should not assume that there is consistency in the acquisition parameters across participants (even in participants from the same site), and in all cases, the metadata within the json sidecars should be referenced as the best source of information on these parameters.
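
One way to act on this advice is to tabulate a parameter across sidecars before running a pipeline. The sketch below groups files by the value they report for one metadata key. RepetitionTime is a standard BIDS metadata field, but the helper names, the file names, and the values in the toy example are made up and do not describe the A2CPS protocol.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_sidecars(bids_root, pattern):
    """Map each matching sidecar path (as a string) to its parsed metadata."""
    found = {}
    for path in sorted(Path(bids_root).glob(pattern)):
        with open(path, encoding="utf-8") as f:
            found[str(path)] = json.load(f)
    return found


def group_by_parameter(sidecars, key):
    """Group file names by the value their metadata reports for `key`."""
    groups = defaultdict(list)
    for name, meta in sidecars.items():
        # Files that lack the key are grouped under None rather than dropped.
        groups[meta.get(key)].append(name)
    return dict(groups)


# Made-up metadata for illustration; these are not A2CPS parameters.
toy = {
    "sub-A_bold.json": {"RepetitionTime": 1.0},
    "sub-B_bold.json": {"RepetitionTime": 1.0},
    "sub-C_bold.json": {"RepetitionTime": 2.0},
}
groups = group_by_parameter(toy, "RepetitionTime")
print(groups)
```

With the downloaded data, the input could come from something like load_sidecars("bids", "sub-*/ses-*/func/*_task-rest_*_bold.json"). More than one group in the result signals that the parameter varies across scans. The grouping only works for scalar values; list-valued fields would need to be converted to tuples first.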

4.2.4 Software Packages for Working with BIDS

Several packages in multiple languages have been developed to simplify interacting with BIDS. For a complete list, see: Benefits. A subset of that list is copied below:

  • bids-matlab: MATLAB/Octave tools to interact with datasets conforming to the BIDS format
  • bidser: Working with Brain Imaging Data Structure in R
  • PyBIDS: Python package to quickly parse / search the components of a BIDS dataset. It also contains functionality for running analyses on your data

4.2.5 Joining with Other Modalities

A2CPS relies on several different schemes for labeling and describing metadata entities. For example, across modalities, participants are identified by Record ID, Subject GUID, Individual ID, and NDA GUID. For more details on how these identifiers relate, see Appendix B.

4.2.6 Citations

If you use the imaging data in your research, please cite the imaging pipeline preprint: Sadil et al. (2024).

Sadil, P., Arfanakis, K., Bhuiyan, E. H., Caffo, B., Calhoun, V. D., Clauw, D. J., DeLano, M. C., Ford, J. C., Gattu, R., Guo, X., Harris, R. E., Ichesco, E., Johnson, M. A., Jung, H., Kahn, A. B., Kaplan, C. M., Leloudas, N., Lindquist, M. A., Luo, Q., … The Acute to Chronic Pain Signatures Consortium. (2024). Image processing in the acute to chronic pain signatures (A2CPS) project. bioRxiv. https://doi.org/10.1101/2024.12.19.627509

In publications or presentations including data from A2CPS, please include the following statement as attribution:

Data were provided [in part] by the A2CPS Consortium funded by the National Institutes of Health (NIH) Common Fund, which is managed by the Office of the Director (OD)/ Office of Strategic Coordination (OSC). Consortium components and their associated funding sources include Clinical Coordinating Center (U24NS112873), Data Integration and Resource Center (U54DA049110), Omics Data Generation Centers (U54DA049116, U54DA049115, U54DA049113), Multi-site Clinical Center 1 (MCC1) (UM1NS112874), and Multi-site Clinical Center 2 (MCC2) (UM1NS118922).

Note

The following published papers should be cited when referring to the A2CPS Protocol and Biomarkers: Sluka et al. (2023) and Berardi et al. (2022).

Berardi, G., Frey-Law, L., Sluka, K. A., Bayman, E. O., Coffey, C. S., Ecklund, D., Vance, C. G. T., Dailey, D. L., Burns, J., Buvanendran, A., McCarthy, R. J., Jacobs, J., Zhou, X. J., Wixson, R., Balach, T., Brummett, C. M., Clauw, D., Colquhoun, D., Harte, S. E., … Wandner, L. D. (2022). Multi-site observational study to assess biomarkers for susceptibility or resilience to chronic pain: The acute to chronic pain signatures (A2CPS) study protocol. Frontiers in Medicine, 9. https://doi.org/10.3389/fmed.2022.849214
Sluka, K. A., Wager, T. D., Sutherland, S. P., Labosky, P. A., Balach, T., Bayman, E. O., Berardi, G., Brummett, C. M., Burns, J., Buvanendran, A., et al. (2023). Predicting chronic postsurgical pain: Current evidence and a novel program to develop predictive biomarker signatures. Pain, 164(9), 1912–1926. https://doi.org/10.1097/j.pain.0000000000002938