cosmo stat hdrΒΆ

function stat_ds=cosmo_stat(ds, stat_name, output_stat_name)
% compute t-test or F-test (ANOVA) statistic
%
% stat_ds=cosmo_stats(ds, stat_name[, output_stat_name])
%
% Inputs:
%   ds                dataset struct with
%                       .samples PxQ, for P observations on Q features
%                       .sa.targets Px1 observation conditions (classes)
%                       .sa.chunks  Px1 observation chunks (e.g. subjects).
%   stat_name         One of [*]:
%                     't' : one-sample t-test against zero (nclasses==1),
%                           or paired t-test (nclasses==2)
%                     't2': two-sample t-test with equal variance,
%                           (nclasses==2) contrasting samples with unq(1)
%                           minus unq(2) where unq=unique(ds.sa.targets)
%                     'F' : one-way ANOVA or repeated measures ANOVA
%                           (nclasses>=2)
%                     [*] nclasses is the number of unique values in
%                         ds.sa.targets
%   output_stat_name  (optional) 'left', 'right', 'both', 'z', 'p', or
%                      the empty string '' (default).
%                     - 'left', 'right', and 'both' return a p-value with
%                        the specified tail.
%                     - 'p' returns a p-value, with tail='right' if
%                        stat_name='F' and tail='both' otherwise.
%                     - 'z' returns a z-score corresponding to the p-value
%                     - '' (empty) returns the t or F statistic.
%                     Missing values can be indicated by NaNs; if these are
%                     present and a p-value or z-score is returned, then
%                     these values are computed with a possible variable
%                     number of degrees of freedom across features.
%
% Returns:
%   stat_ds          dataset struct with fields:
%     .samples       1xQ statistic value, or (if output_stat_name is
%                    non-empty) z-score or p-value. See the Notes below
%                    for interpreting p-values.
%     .sa            struct with field:
%        .stats      One of 'Ftest(df1,df2)', 'Ttest(df)', 'Zscore()', or
%                    'Pval()', where df* are the degrees of freedom
%     .[f]a          identical to the input ds.[f]a, if present.
%
% Examples:
%     % one-sample t-test
%     % make a simple dataset
%     ds=struct();
%     ds.samples=reshape(mod(1:7:(12*3*7),13)',[],3)-3;
%     ds.sa.targets=ones(12,1);
%     ds.sa.chunks=(1:12)';
%     cosmo_disp(ds.samples);
%     %||   [ -2         4        -3
%     %||      5        -2         4
%     %||     -1         5        -2
%     %||      :         :         :
%     %||      9         2         8
%     %||      3         9         2
%     %||     -3         3         9 ]@12x3
%     %
%     % run one-sample t-test
%     s=cosmo_stat(ds,'t');
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 2.49      3.36      2.55 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Ttest(11)' }
%     %
%     % compute z-score of t-test
%     s=cosmo_stat(ds,'t','z');
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 2.17      2.73      2.21 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Zscore()' }
%     %
%     % compute (two-tailed) p-value of t-test
%     s=cosmo_stat(ds,'t','p');
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 0.03   0.00633    0.0268 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Pval()' }
%     %
%     % compute left-tailed p-value of t-test
%     s=cosmo_stat(ds,'t','left');
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 0.985     0.997     0.987 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Pval()' }
%
%     % one-way ANOVA
%     % each observation is independent and thus each chunk is unique;
%     % there are three conditions with four observations per condition
%     ds=struct();
%     ds.samples=reshape(mod(1:7:(12*3*7),13)',[],3)-3;
%     ds.sa.targets=repmat(1:3,1,4)';
%     ds.sa.chunks=(1:12)';
%     s=cosmo_stat(ds,'F');
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 0.472    0.0638      0.05 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Ftest(2,9)' }
%     % compute z-score
%     s=cosmo_stat(ds,'F','z'); % convert to z-score
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ -0.354     -1.54     -1.66 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Zscore()' }
%
%
%     % two-sample t-test
%     % each observation is independent and thus each chunk is unique;
%     % there are two conditions with four observations per condition
%     ds=struct();
%     ds.samples=reshape(mod(1:7:(12*3*7),13)',[],3)-3;
%     ds.sa.targets=repmat(1:2,1,6)';
%     ds.sa.chunks=(1:12)';
%     s=cosmo_stat(ds,'t2','p'); % return p-value
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 0.0307  0.000242  7.07e-05 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Pval()' }
%     %
%     % for illustration, this test gives the same p-values as a
%     % repeated measures ANOVA
%     s=cosmo_stat(ds,'F','p');
%     cosmo_disp(s);
%     %|| .samples
%     %||   [ 0.0307  0.000242  7.07e-05 ]
%     %|| .sa
%     %||   .stats
%     %||     { 'Pval()' }
%
% Notes:
%  - If output_stat_name is not provided or empty, then this function runs
%    considerably faster than the builtin matlab functions
%    (ttest, ttest2, or anova1).
%  - When output_stat_name=='p' then the p-values returned are the same as
%    the builtin matlab functions anova1, ttest, and ttest2 with the
%    default tails.
%  - To run a one-sample t-tests against x (if x~=0), one has to
%    subtract x from ds.samples before using ds as input to this function
%  - The .sa.chunks and .sa.targets determine which test is performed:
%    * statname=='t': all chunks are unique => one-sample t-test
%                   : each chunk present twice => paired-sample t-test
%    * statname=='F': all chunks are unique => one-way ANOVA
%                   : each chunk present N times => repeated measures ANOVA
%    See cosmo_montecarlo_cluster_stat for examples on how .sa.targets and
%    .sa.chunks should be set for different statistics.
%  - Missing values can be indicated by NaNs, and if the output is a
%    p-value (or a z-score based on the p-value), then the output is
%    computed for different features using varying degrees of freedom.
%    For example, if the dataset has 10 samples and a one-sample t-test is
%    used, z-scores for samples with no NaN values is based on the
%    t-statistic with df=9, but those with two missing values (NaN values)
%    are based on the t-statistic with df=7. A use case is computing fMRI
%    group statistics where overlap across brains is not perfect at the
%    voxel-by-voxel level, in as imilar approach as AFNI's 3dttest++ with
%    the '-toz' option.
%  - This function computes feature-wise statistics that are not corrected
%    for multiple comparisons. To correct for multiple comparisons, see
%    cosmo_montecarlo_cluster_stat.
%
% See also: anova1, ttest, ttest2, cosmo_montecarlo_cluster_stat
%
% #   For CoSMoMVPA's copyright information and license terms,   #
% #   see the COPYING file distributed with CoSMoMVPA.           #