.. #   For CoSMoMVPA's license terms and conditions, see   #
   #   the COPYING file distributed with CoSMoMVPA         #

.. _`ex_measures`:

Using CoSMoMVPA *measures*
==========================

Rationale
+++++++++
Previous exercises discussed how to compute both split-half correlations (:ref:`ex_splithalf_correlations`) and classification accuracies using cross-validation (:ref:`ex_nfold_crossvalidation`). Both types of analysis share a common input and output pattern:

- The input is a dataset struct (with at least ``.sa.chunks`` and ``.sa.targets`` properly set).
- The output consists of one or more values, which can be stored in a new output dataset.
- Optionally, there may be options that specify how the operation is carried out.

  + for split-half correlations, these could specify how to normalize the correlations (the default is a Fisher transform using the ``atanh`` function).
  + for cross-validation, these are the partitioning scheme (e.g. odd-even or n-fold) and the classifier to use (e.g. LDA, SVM, or Naive Bayes).

This pattern is captured by the CoSMoMVPA dataset :ref:`measure <cosmomvpa_measure>` concept. An important reason for using measures is that they allow for a flexible implementation of searchlights, which involve the repeated application of the same measure to subsets of features.

The measure concept
+++++++++++++++++++
A dataset measure is a function with the following signature:

.. code-block:: matlab

    output = dataset_measure(dataset, args)

where:

- ``dataset`` is a dataset struct.
- ``args`` are options specific to that measure.
- ``output`` must be a ``struct`` with a field ``.samples`` (in column vector format) and optionally a field ``.sa``.

  + it should not have a field ``.fa``.
  + usually it has no field ``.a`` (except for some complicated cases where it can have an ``.a.sdim`` field, if the measure returns data in a dimensional format).
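
To make the signature above concrete, here is a minimal sketch of a toy measure. The function name ``toy_target_mean_measure`` and the option ``args.scale`` are hypothetical and are not part of CoSMoMVPA; the sketch only illustrates the input and output conventions described above (a dataset struct goes in, a struct with a column vector in ``.samples`` and an optional ``.sa`` comes out), and assumes the code is saved as ``toy_target_mean_measure.m``.

.. code-block:: matlab

    function output = toy_target_mean_measure(dataset, args)
    % Hypothetical toy measure (NOT part of CoSMoMVPA): returns the mean
    % response, over samples and features, for each unique target.
    %
    % Inputs:
    %   dataset   struct with .samples (PxQ) and .sa.targets (Px1)
    %   args      struct with optional field .scale (hypothetical option,
    %             default 1) by which each mean is multiplied
    %
    % Output:
    %   output    struct with .samples (Nx1 column vector, one value per
    %             unique target) and .sa.targets (Nx1); no .fa field
    %
    % Example:
    %   ds = struct();
    %   ds.samples = [1 2; 3 4; 5 6; 7 8];
    %   ds.sa.targets = [1; 2; 1; 2];
    %   ds.sa.chunks = [1; 1; 2; 2];
    %   output = toy_target_mean_measure(ds, struct('scale', 1));
    %   % output.samples is [3.5; 5.5], output.sa.targets is [1; 2]

        % fill in defaults for missing options
        if nargin < 2
            args = struct();
        end
        if ~isfield(args, 'scale')
            args.scale = 1;
        end

        unq_targets = unique(dataset.sa.targets);
        n_targets = numel(unq_targets);

        % one output value per unique target, stored as a column vector
        samples = zeros(n_targets, 1);
        for k = 1:n_targets
            msk = dataset.sa.targets == unq_targets(k);
            values = dataset.samples(msk, :);
            samples(k) = args.scale * mean(values(:));
        end

        % the output has .samples and .sa, but no .fa
        output = struct();
        output.samples = samples;
        output.sa.targets = unq_targets;
    end

Because the function follows the ``output = dataset_measure(dataset, args)`` pattern, it can be passed around as a function handle (``@toy_target_mean_measure``) in the same way as the measures used in the exercises on this page.
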

Split-half correlations using a measure
+++++++++++++++++++++++++++++++++++++++
Before starting this exercise, please make sure you have read about:

- :ref:`matlab_octave_function_handles`
- :ref:`cosmomvpa_measure`

As a first exercise, load two datasets using subject ``s01``'s T-statistics for odd and even runs (``glm_T_stats_even.nii`` and ``glm_T_stats_odd.nii``) and the VT mask. Assign targets and chunks, then join the two datasets using ``cosmo_stack``. Then use :ref:`cosmo_correlation_measure` to compute:

- the correlation information (Fisher-transformed average of on-diagonal versus off-diagonal elements).
- the raw correlation matrix.

Then compute the correlation information for each subject, and perform a t-test against zero over subjects.

Template: :ref:`run_correlation_measure_skl`

Check your answers here: :ref:`run_correlation_measure` / :pb:`correlation_measure`

.. _`ex_measures_crossvalidation`:

Classifier with cross-validation using a measure
++++++++++++++++++++++++++++++++++++++++++++++++

As a second exercise, load a dataset using subject ``s01``'s T-statistics for every run (``glm_T_stats_perrun.nii``) and the VT mask.

Assign targets and chunks, then use the LDA classifier (:ref:`cosmo_classify_lda`) and n-fold partitioning (:ref:`cosmo_nfold_partitioner`) to compute classification accuracy with n-fold cross-validation, using :ref:`cosmo_crossvalidation_measure`.

Then compute confusion matrices using different classifiers, such as :ref:`cosmo_classify_lda`, :ref:`cosmo_classify_nn`, and :ref:`cosmo_classify_naive_bayes`. If LIBSVM or the `Matlab statistics` toolbox is available, you can also use :ref:`cosmo_classify_svm`.

Template: :ref:`run_crossvalidation_measure_skl`

Check your answers here: :ref:`run_crossvalidation_measure` / :pb:`crossvalidation_measure`

.. include:: links.txt