12  Prior Neural Pain Signature Responses

12.1 Starting Project

12.1.1 Locate data

On TACC, the neuroimaging data are stored underneath the releases. For example, data release v2.#.# is underneath

pre-surgery/mris

The signature responses are underneath derivatives/signatures

$ ls derivatives/signatures/
cleaned    confounds.json      signatures-by-part-diff       signatures-by-part.json  signatures-by-run-diff       signatures-by-run.json  signatures-by-tr-diff       signatures-by-tr.json
confounds  signatures-by-part  signatures-by-part-diff.json  signatures-by-run        signatures-by-run-diff.json  signatures-by-tr        signatures-by-tr-diff.json

Signature responses are stored either “by-run” (that is, one response per scan), “by-part” (three responses per run corresponding to the three parts for which participants provided pain ratings), or “by-tr” (one response for every volume). The biomarker corresponds to the values that are “by-run”. Additionally, responses may be calculated with only the data from a single run (e.g., a “by-run” response for REST1, CUFF1, CUFF2, and REST2), or they may be calculated as a difference (“diff”) between one of the CUFF scans and one of the REST scans.
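The naming scheme described above can be sketched as a small helper. The function name `signature_paths` and its `root` default are illustrative assumptions, not part of the release; only the folder and sidecar names come from the listing above.

```python
from pathlib import PurePosixPath

# The three granularities described in the text.
GRANULARITIES = ("run", "part", "tr")


def signature_paths(granularity, diff=False, root="derivatives/signatures"):
    """Return (table folder, JSON sidecar) for one kind of signature response.

    Hypothetical helper: it only assembles names that follow the pattern
    seen in the directory listing (signatures-by-<granularity>[-diff]).
    """
    if granularity not in GRANULARITIES:
        raise ValueError(f"granularity must be one of {GRANULARITIES}")
    stem = f"signatures-by-{granularity}" + ("-diff" if diff else "")
    base = PurePosixPath(root)
    return base / stem, base / f"{stem}.json"


table, sidecar = signature_paths("run")
print(table)    # derivatives/signatures/signatures-by-run
print(sidecar)  # derivatives/signatures/signatures-by-run.json
```

Passing `diff=True` selects the CUFF-versus-REST difference tables instead of the single-run tables.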

Each signature response folder contains a table of values, and *.json sidecars are data dictionaries that conform to BIDS. The data dictionary for responses “by-run” is copied below.

{
    "signature": {
        "LongName": "Signature",
        "Description": "Index for of Signature. See signature_labels.json"
    },
    "correlation": {
        "LongName": "Correlation",
        "Description": "Signature as Estimated by Correlation"
    },
    "dot": {
        "LongName": "Dot Product",
        "Description": "Signature as Estimated by Dot Product"
    },
    "cosine": {
        "LongName": "Cosine Similarity",
        "Description": "Signature as Estimated by Cosine Similarity"
    },
    "sub": {
        "LongName": "Subject",
        "Description": "Study Participant, BIDS Subject ID",
        "TermURL": "https://bids-specification.readthedocs.io/en/v1.9.0/appendices/entities.html#sub"
    },
    "ses": {
        "LongName": "Session",
        "Description": "Visit, Protocol, BIDS Session ID",
        "Levels": {
            "V1": "baseline_visit",
            "V3": "3mo_postop"
        },
        "TermURL": "https://bids-specification.readthedocs.io/en/v1.9.0/appendices/entities.html#ses"
    },
    "task": {
        "LongName": "Task",
        "Description": "Functional MRI Task, BIDS Task ID",
        "Levels": {
            "cuff": "cuff pressure scan",
            "rest": "resting state scan"
        },
        "TermURL": "https://bids-specification.readthedocs.io/en/v1.9.0/appendices/entities.html#task"
    },
    "run": {
        "LongName": "Run",
        "Description": "Task Run Number, BIDS Run ID",
        "TermURL": "https://bids-specification.readthedocs.io/en/v1.9.0/appendices/entities.html#run"
    }
}

Note: the table mentions “session”, but in this release only V1 (baseline) results are available.
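Because the sidecars are plain JSON, they can be inspected with the standard library. The sketch below embeds a two-column excerpt of the dictionary shown above so that it runs on its own; the helper name `describe_columns` is an assumption made for illustration. With the real data, the text would instead be read from the sidecar file (for example with `json.load`).

```python
import json

# Excerpt of the "by-run" data dictionary shown above.
SIDECAR_TEXT = """
{
    "correlation": {
        "LongName": "Correlation",
        "Description": "Signature as Estimated by Correlation"
    },
    "task": {
        "LongName": "Task",
        "Description": "Functional MRI Task, BIDS Task ID",
        "Levels": {
            "cuff": "cuff pressure scan",
            "rest": "resting state scan"
        }
    }
}
"""


def describe_columns(sidecar):
    """Map each column name to a one-line summary, listing coded levels if any."""
    summary = {}
    for column, meta in sidecar.items():
        line = f'{meta["LongName"]}: {meta["Description"]}'
        levels = meta.get("Levels")
        if levels:
            line += " (levels: " + ", ".join(sorted(levels)) + ")"
        summary[column] = line
    return summary


columns = describe_columns(json.loads(SIDECAR_TEXT))
for name, line in columns.items():
    print(f"{name:12s} {line}")
```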

12.1.2 Extract data

The tabular data comprise parquet files that are partitioned hive-style. That is, subfolder names contain column information: in this case subject ID (REDCap Record ID), session, task, and run.

$ tree signatures-by-run
signatures-by-run
├── sub=10003
│  └── ses=V1
│     ├── task=cuff
│     │  └── run=1
│     │     └── part-0.parquet
│     └── task=rest
│        ├── run=1
│        │  └── part-0.parquet
│        └── run=2
│           └── part-0.parquet
├── sub=10008
│  └── ses=V1
│     ├── task=cuff
│     │  └── run=1
│     │     └── part-0.parquet
│     └── task=rest
│        ├── run=1
│        │  └── part-0.parquet
│        └── run=2

The biomarker is based on the CUFF1 (task=cuff/run=1) scan. The other scans are available for secondary analyses, but please note that not all participants have all tasks and runs available.
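To make the partitioning concrete, the following sketch recovers column values from hive-style paths and tabulates which scans each participant has. It uses only the standard library; the helper name `parse_hive_path` is hypothetical, and the example paths are a subset of those in the listing above, written relative to the dataset folder. Tools such as arrow do this parsing automatically and also infer column types, whereas this sketch leaves values as strings unless cast.

```python
from pathlib import PurePosixPath


def parse_hive_path(path):
    """Collect the key=value folder names in a hive-partitioned path."""
    columns = {}
    for part in PurePosixPath(path).parts:
        key, sep, value = part.partition("=")
        if sep:  # skip components without "=", such as the file name
            columns[key] = value
    return columns


# Paths relative to the dataset folder, taken from the listing above.
paths = [
    "sub=10003/ses=V1/task=cuff/run=1/part-0.parquet",
    "sub=10003/ses=V1/task=rest/run=1/part-0.parquet",
    "sub=10003/ses=V1/task=rest/run=2/part-0.parquet",
    "sub=10008/ses=V1/task=cuff/run=1/part-0.parquet",
    "sub=10008/ses=V1/task=rest/run=1/part-0.parquet",
]

# Group the available (task, run) pairs by participant.
available = {}
for p in paths:
    cols = parse_hive_path(p)
    available.setdefault(cols["sub"], set()).add((cols["task"], int(cols["run"])))

# Participants with the CUFF1 scan that the biomarker is based on.
has_cuff1 = sorted(sub for sub, scans in available.items() if ("cuff", 1) in scans)
print(has_cuff1)
```

On the real data, the list of paths could be gathered by searching the dataset folder recursively for `*.parquet` files.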

To load the whole dataset, the parquet files may be read individually or using a tool that is aware of the hive-partitioning structure. In R, a good choice is the arrow library.

library(arrow)
library(dplyr)
library(tidyr)

open_dataset("data/signatures-by-run") |>
  filter(signature %in% c("grouppred_cvpcr", "137subjmap_weighted_mean")) |>
  filter(task == "cuff", run == 1) |>
  select(signature, value = correlation, sub, ses) |>
  collect() |>
  pivot_wider(names_from = signature) |>
  rename(
    SIIPS1 = `137subjmap_weighted_mean`,
    NPS = `grouppred_cvpcr`
  )
# A tibble: 716 × 4
     sub ses     SIIPS1      NPS
   <int> <chr>    <dbl>    <dbl>
 1 10003 V1    -0.00162  0.0331 
 2 10010 V1    -0.0611  -0.0200 
 3 10011 V1    -0.00652 -0.0740 
 4 10008 V1    -0.0407  -0.0581 
 5 10013 V1    -0.0279   0.0234 
 6 10014 V1    -0.00120 -0.0146 
 7 10017 V1    -0.0165   0.0263 
 8 10015 V1    -0.0576  -0.0178 
 9 10020 V1    -0.0103  -0.0530 
10 10023 V1     0.0116  -0.00789
# ℹ 706 more rows

In Python, a good choice is the polars library.

import polars as pl

pl.read_parquet("data/signatures-by-run").filter(
    pl.col("signature").is_in(["grouppred_cvpcr", "137subjmap_weighted_mean"])
).filter(pl.col("task") == "cuff", pl.col("run") == 1).rename(
    {"correlation": "value"}
).select(
    "signature",
    "value",
    "sub",
    "ses",
).pivot(
    on="signature", index=["sub", "ses"]
).rename(
    {"grouppred_cvpcr": "NPS", "137subjmap_weighted_mean": "SIIPS1"}
)
shape: (716, 4)
┌───────┬─────┬───────────┬───────────┐
│ sub   ┆ ses ┆ SIIPS1    ┆ NPS       │
│ ---   ┆ --- ┆ ---       ┆ ---       │
│ i64   ┆ str ┆ f64       ┆ f64       │
╞═══════╪═════╪═══════════╪═══════════╡
│ 10003 ┆ V1  ┆ -0.001619 ┆ 0.033091  │
│ 10008 ┆ V1  ┆ -0.040698 ┆ -0.058134 │
│ 10010 ┆ V1  ┆ -0.061107 ┆ -0.019997 │
│ 10011 ┆ V1  ┆ -0.006518 ┆ -0.074017 │
│ 10013 ┆ V1  ┆ -0.027933 ┆ 0.023413  │
│ …     ┆ …   ┆ …         ┆ …         │
│ 25266 ┆ V1  ┆ -0.058184 ┆ -0.032597 │
│ 25271 ┆ V1  ┆ -0.039983 ┆ -0.092517 │
│ 25273 ┆ V1  ┆ -0.038248 ┆ -0.02306  │
│ 25275 ┆ V1  ┆ -0.052387 ┆ -0.06732  │
│ 25277 ┆ V1  ┆ 0.023143  ┆ -0.063882 │
└───────┴─────┴───────────┴───────────┘
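For readers unfamiliar with the pivot step in the examples above, the sketch below performs the same long-to-wide reshaping by hand on four rows. The values are copied from the first two participants in the output above; the variable names are illustrative only, and the arrow or polars code remains the practical way to do this on the full dataset.

```python
# Long-format rows (signature, value, sub, ses), one row per signature.
long_rows = [
    ("137subjmap_weighted_mean", -0.001619, 10003, "V1"),
    ("grouppred_cvpcr", 0.033091, 10003, "V1"),
    ("137subjmap_weighted_mean", -0.040698, 10008, "V1"),
    ("grouppred_cvpcr", -0.058134, 10008, "V1"),
]

# Same renaming as in the examples above.
NAMES = {"137subjmap_weighted_mean": "SIIPS1", "grouppred_cvpcr": "NPS"}

# Wide format: one entry per (sub, ses), one key per signature.
wide = {}
for signature, value, sub, ses in long_rows:
    wide.setdefault((sub, ses), {})[NAMES[signature]] = value

for (sub, ses), values in sorted(wide.items()):
    print(sub, ses, values["SIIPS1"], values["NPS"])
```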

12.2 Considerations While Working on the Project

12.2.1 Variability Across Scanners

Many MRI biomarkers vary across scanners, which may confound some analyses. For an up-to-date assessment of the issue and an overview of current thinking, please see Confluence.

12.2.2 Data Quality

As with any MRI derivative, all pipeline outputs have been included regardless of their quality, so some products may have been generated from images that are known to have poor quality (rated “red”, or incomparable). For details on the ratings and how to exclude them, see Appendix A. Additionally, extensive QC has not yet been performed on the derivatives themselves, so there may be cases where pipelines produced atypical outputs. For an overview of planned checks, see Confluence.

12.2.3 Signature Response Measure

The signature responses were extracted using best practices, but the imaging DIRC is currently exploring alternative ways of calculating signature responses in the CUFF tasks.
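The data dictionary lists three summaries of a signature response: correlation, dot product, and cosine similarity. As a rough illustration of how these differ, the sketch below applies the textbook definitions to two short vectors. It assumes a response is computed between a flattened image and a flattened signature weight map; the toy numbers and function names are invented for illustration, and this is not the pipeline's actual implementation.

```python
import math


def dot(x, w):
    """Dot product: sensitive to the overall scale of the image."""
    return sum(a * b for a, b in zip(x, w))


def cosine(x, w):
    """Cosine similarity: dot product after normalizing both vectors."""
    return dot(x, w) / (math.sqrt(dot(x, x)) * math.sqrt(dot(w, w)))


def correlation(x, w):
    """Pearson correlation: cosine similarity after mean-centering."""
    mx = sum(x) / len(x)
    mw = sum(w) / len(w)
    return cosine([a - mx for a in x], [b - mw for b in w])


# Toy vectors standing in for an image and a signature weight map.
image = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, 0.0, -0.5, 1.0]

scaled = [2.0 * v for v in image]    # same pattern, doubled amplitude
shifted = [v + 10.0 for v in image]  # same pattern, constant offset

print(dot(image, weights), dot(scaled, weights))  # 3.0 6.0
print(math.isclose(cosine(image, weights), cosine(scaled, weights)))
print(math.isclose(correlation(image, weights), correlation(shifted, weights)))
```

Under these definitions the dot product changes with image amplitude, cosine similarity does not, and correlation additionally ignores a constant offset.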

12.2.4 Intermediate Outputs

The other folders contain intermediate outputs that may be useful for digging into a participant’s results:

  • confounds
    • The nuisance timeseries that were used during denoising
  • cleaned
    • The NIfTI files of functional MRI after denoising (e.g., temporal filtering, nuisance regression)

12.3 Citations

In publications or presentations including data from A2CPS, please include the following statement as attribution:

Data were provided [in part] by the A2CPS Consortium funded by the National Institutes of Health (NIH) Common Fund, which is managed by the Office of the Director (OD)/ Office of Strategic Coordination (OSC). Consortium components and their associated funding sources include Clinical Coordinating Center (U24NS112873), Data Integration and Resource Center (U54DA049110), Omics Data Generation Centers (U54DA049116, U54DA049115, U54DA049113), Multi-site Clinical Center 1 (MCC1) (UM1NS112874), and Multi-site Clinical Center 2 (MCC2) (UM1NS118922).

Note

The following published papers should be cited when referring to the A2CPS protocol and biomarkers: Sluka et al. (2023) and Berardi et al. (2022).

Berardi, G., Frey-Law, L., Sluka, K. A., Bayman, E. O., Coffey, C. S., Ecklund, D., Vance, C. G. T., Dailey, D. L., Burns, J., Buvanendran, A., McCarthy, R. J., Jacobs, J., Zhou, X. J., Wixson, R., Balach, T., Brummett, C. M., Clauw, D., Colquhoun, D., Harte, S. E., … Wandner, L. D. (2022). Multi-site observational study to assess biomarkers for susceptibility or resilience to chronic pain: The acute to chronic pain signatures (A2CPS) study protocol. Frontiers in Medicine, 9. https://doi.org/10.3389/fmed.2022.849214
Sluka, K. A., Wager, T. D., Sutherland, S. P., Labosky, P. A., Balach, T., Bayman, E. O., Berardi, G., Brummett, C. M., Burns, J., Buvanendran, A., et al. (2023). Predicting chronic postsurgical pain: Current evidence and a novel program to develop predictive biomarker signatures. Pain, 164(9), 1912–1926. https://doi.org/10.1097/j.pain.0000000000002938