Using CoSMoMVPA measures

Rationale

Previous exercises discussed how to compute both split-half correlations (Split-half correlation-based MVPA with group analysis) and classification accuracies using cross-validation (Classification analysis with cross-validation). Both types of analysis share a common input and output pattern:

  • The input is a dataset struct (with at least .sa.chunks and .sa.targets properly set)

  • The output consists of one or more values, which can be stored into a new output dataset

  • Optionally, there may be additional options that specify details of the operation:

    • for split-half correlations, these could specify how to normalize the correlations (the default is a Fisher transform using the atanh function).

    • for cross-validation, these are the partitioning scheme (e.g. odd-even or n-fold) and the classifier to use (e.g. LDA, SVM, or Naive Bayes).

This pattern is captured by the CoSMoMVPA dataset measure concept. An important reason for using measures is that it allows for a flexible implementation of searchlights, which involve the repeated application of the same measure to subsets of features.

The measure concept

A dataset measure is a function with the following signature:

output = dataset_measure(dataset, args)

where:

  • dataset is a dataset struct.

  • args are options specific for that measure.

  • output must be a struct with a field .samples (in column vector format) and optionally a field .sa.

    • it should not have a field .fa.

    • usually it has no field .a (except for some complicated cases, where it can have an .a.sdim field if the measure returns data in a dimensional format).
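
To make this concrete, below is a minimal sketch of a custom measure. The function name my_mean_measure and what it computes (the mean sample value per unique target) are purely illustrative, not part of CoSMoMVPA:

function output=my_mean_measure(dataset, unused_args)
% Illustrative measure: returns the mean sample value for each unique
% target, as a column vector. The second argument is only present to
% match the measure signature; it is not used here.

    targets=unique(dataset.sa.targets);
    n=numel(targets);

    samples=zeros(n,1);               % .samples must be a column vector
    for k=1:n
        msk=dataset.sa.targets==targets(k);
        samples(k)=mean(mean(dataset.samples(msk,:)));
    end

    output=struct();
    output.samples=samples;           % required field
    output.sa.targets=targets;        % optional sample attributes
    % note: no .fa field, and (usually) no .a field

Because this function has the measure signature, it could be used anywhere CoSMoMVPA expects a measure, including searchlights.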

Split-half correlations using a measure

Before starting this exercise, please make sure you have read about the measure concept (above) and the earlier split-half correlation exercise (Split-half correlation-based MVPA with group analysis).

As a first exercise, load two datasets using subject s01’s T-statistics for the odd and even runs (glm_T_stats_even.nii and glm_T_stats_odd.nii) and the VT mask. Assign targets and chunks, then join the two datasets using cosmo_stack. Then use cosmo_correlation_measure to:

  • compute the correlation information (the Fisher-transformed average of on-diagonal versus off-diagonal elements).

  • compute the raw correlation matrix.

Then compute the correlation information for each subject, and perform a t-test against zero over subjects.
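
A minimal sketch of the single-subject part, assuming the tutorial data layout with six conditions; the filenames (including 'vt_mask.nii') are placeholders for your local setup:

% load both halves, masked by the VT mask
ds_even=cosmo_fmri_dataset('glm_T_stats_even.nii','mask','vt_mask.nii');
ds_odd=cosmo_fmri_dataset('glm_T_stats_odd.nii','mask','vt_mask.nii');

% assign targets (one per condition) and chunks (one per half)
ds_even.sa.targets=(1:6)';
ds_even.sa.chunks=ones(6,1);
ds_odd.sa.targets=(1:6)';
ds_odd.sa.chunks=2*ones(6,1);

ds=cosmo_stack({ds_even,ds_odd});

% correlation information: Fisher-transformed average of on-diagonal
% versus off-diagonal elements
corr_info=cosmo_correlation_measure(ds);

% raw correlation matrix, returned in flattened dataset form
args=struct();
args.output='correlation';
corr_ds=cosmo_correlation_measure(ds,args);
matrix=cosmo_unflatten(corr_ds,1);    % back to square matrix form

For the group analysis, collect corr_info.samples from every subject into a vector and run a one-sample t-test against zero (for example with ttest from the Matlab statistics toolbox).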

Template: run_correlation_measure_skl

Check your answers here: run_correlation_measure / Matlab output: run_correlation_measure

Classifier with cross-validation using a measure

As a second exercise, load a dataset using subject s01’s T-statistics for every run (glm_T_stats_perrun.nii) and the VT mask.

Assign targets and chunks, then compute classification accuracy with n-fold cross-validation using cosmo_crossvalidation_measure, with the LDA classifier (cosmo_classify_lda) and n-fold partitioning (cosmo_nfold_partitioner).

Then compute confusion matrices using different classifiers, such as cosmo_classify_lda, cosmo_classify_nn, and cosmo_classify_naive_bayes. If LIBSVM or the Matlab statistics toolbox is available, you can also use cosmo_classify_svm.
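
A minimal sketch, again assuming the tutorial layout with ten runs of six conditions each (stored run after run) and placeholder filenames:

ds=cosmo_fmri_dataset('glm_T_stats_perrun.nii','mask','vt_mask.nii');

% assign targets and chunks: 10 runs x 6 conditions
ds.sa.targets=repmat((1:6)',10,1);
ds.sa.chunks=floor(((1:60)'-1)/6)+1;

args=struct();
args.classifier=@cosmo_classify_lda;
args.partitions=cosmo_nfold_partitioner(ds);

% classification accuracy with n-fold cross-validation
acc_ds=cosmo_crossvalidation_measure(ds,args);
fprintf('LDA accuracy: %.3f\n',acc_ds.samples);

% confusion matrix: return predictions instead of accuracy,
% then count (target, prediction) pairs
args.output='predictions';
pred_ds=cosmo_crossvalidation_measure(ds,args);
confusion=cosmo_confusion_matrix(pred_ds);

To try other classifiers, swap in a different function handle for args.classifier, such as @cosmo_classify_nn or @cosmo_classify_naive_bayes.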

Template: run_crossvalidation_measure_skl

Check your answers here: run_crossvalidation_measure / Matlab output: run_crossvalidation_measure