function stat_ds=cosmo_stat(ds, stat_name, output_stat_name)
% compute t-test or F-test (ANOVA) statistic
%
% stat_ds=cosmo_stats(ds, stat_name[, output_stat_name])
%
% Inputs:
% ds dataset struct with
% .samples PxQ, for P observations on Q features
% .sa.targets Px1 observation conditions (classes)
% .sa.chunks Px1 observation chunks (e.g. subjects).
% stat_name One of [*]:
% 't' : one-sample t-test against zero (nclasses==1),
% or paired t-test (nclasses==2)
% 't2': two-sample t-test with equal variance,
% (nclasses==2)
% 'F' : one-way ANOVA or repeated measures ANOVA
% (nclasses>=2)
% [*] nclasses is the number of unique values in
% ds.sa.targets
% [**] if nclasses==2, the constrast of unq(1) minus
% unq(2) is returned, where
% unq=unique(ds.sa.targets);
% Because unique returns the results in sorted
% order from small to large, this means that
% the contrast is computed of the smaller value
% of targets minus the larger. This may be somewhat
% confusing, in the sense that the output statistics
% (t- or z-score) gives a positive value if samples
% associated with the lower .sa.targets value have a
% higher mean than those associated with the
% higher .sa.targets value.
% output_stat_name (optional) 'left', 'right', 'both', 'z', 'p', or
% the empty string '' (default).
% - 'left', 'right', and 'both' return a p-value with
% the specified tail.
% - 'p' returns a p-value, with tail='right' if
% stat_name='F' and tail='both' otherwise.
% - 'z' returns a z-score corresponding to the p-value
% - '' (empty) returns the t or F statistic.
% Missing values can be indicated by NaNs; if these are
% present and a p-value or z-score is returned, then
% these values are computed with a possible variable
% number of degrees of freedom across features.
%
% Returns:
% stat_ds dataset struct with fields:
% .samples 1xQ statistic value, or (if output_stat_name is
% non-empty) z-score or p-value. See the Notes below
% for interpreting p-values.
% .sa struct with field:
% .stats One of 'Ftest(df1,df2)', 'Ttest(df)', 'Zscore()', or
% 'Pval()', where df* are the degrees of freedom
% .[f]a identical to the input ds.[f]a, if present.
%
% Examples:
% % one-sample t-test
% % make a simple dataset
% ds=struct();
% ds.samples=reshape(mod(1:7:(12*3*7),13)',[],3)-3;
% ds.sa.targets=ones(12,1);
% ds.sa.chunks=(1:12)';
% cosmo_disp(ds.samples);
% %|| [ -2 4 -3
% %|| 5 -2 4
% %|| -1 5 -2
% %|| : : :
% %|| 9 2 8
% %|| 3 9 2
% %|| -3 3 9 ]@12x3
% %
% % run one-sample t-test
% s=cosmo_stat(ds,'t');
% cosmo_disp(s);
% %|| .samples
% %|| [ 2.49 3.36 2.55 ]
% %|| .sa
% %|| .stats
% %|| { 'Ttest(11)' }
% %
% % compute z-score of t-test
% s=cosmo_stat(ds,'t','z');
% cosmo_disp(s);
% %|| .samples
% %|| [ 2.17 2.73 2.21 ]
% %|| .sa
% %|| .stats
% %|| { 'Zscore()' }
% %
% % compute (two-tailed) p-value of t-test
% s=cosmo_stat(ds,'t','p');
% cosmo_disp(s);
% %|| .samples
% %|| [ 0.03 0.00633 0.0268 ]
% %|| .sa
% %|| .stats
% %|| { 'Pval()' }
% %
% % compute left-tailed p-value of t-test
% s=cosmo_stat(ds,'t','left');
% cosmo_disp(s);
% %|| .samples
% %|| [ 0.985 0.997 0.987 ]
% %|| .sa
% %|| .stats
% %|| { 'Pval()' }
%
% % one-way ANOVA
% % each observation is independent and thus each chunk is unique;
% % there are three conditions with four observations per condition
% ds=struct();
% ds.samples=reshape(mod(1:7:(12*3*7),13)',[],3)-3;
% ds.sa.targets=repmat(1:3,1,4)';
% ds.sa.chunks=(1:12)';
% s=cosmo_stat(ds,'F');
% cosmo_disp(s);
% %|| .samples
% %|| [ 0.472 0.0638 0.05 ]
% %|| .sa
% %|| .stats
% %|| { 'Ftest(2,9)' }
% % compute z-score
% s=cosmo_stat(ds,'F','z'); % convert to z-score
% cosmo_disp(s);
% %|| .samples
% %|| [ -0.354 -1.54 -1.66 ]
% %|| .sa
% %|| .stats
% %|| { 'Zscore()' }
%
%
% % two-sample t-test
% % each observation is independent and thus each chunk is unique;
% % there are two conditions with four observations per condition
% ds=struct();
% ds.samples=reshape(mod(1:7:(12*3*7),13)',[],3)-3;
% ds.sa.targets=repmat(1:2,1,6)';
% ds.sa.chunks=(1:12)';
% s=cosmo_stat(ds,'t2','p'); % return p-value
% cosmo_disp(s);
% %|| .samples
% %|| [ 0.0307 0.000242 7.07e-05 ]
% %|| .sa
% %|| .stats
% %|| { 'Pval()' }
% %
% % for illustration, this test gives the same p-values as a
% % repeated measures ANOVA
% s=cosmo_stat(ds,'F','p');
% cosmo_disp(s);
% %|| .samples
% %|| [ 0.0307 0.000242 7.07e-05 ]
% %|| .sa
% %|| .stats
% %|| { 'Pval()' }
%
% Notes:
% - If output_stat_name is not provided or empty, then this function runs
% considerably faster than the builtin matlab functions
% (ttest, ttest2, or anova1).
% - In the case of two classes, the sign of the result may seem counter-
% intuitive (see the [**]) comment above).
% - When output_stat_name=='p' then the p-values returned are the same as
% the builtin matlab functions anova1, ttest, and ttest2 with the
% default tails.
% - To run a one-sample t-tests against x (if x~=0), one has to
% subtract x from ds.samples before using ds as input to this function
% - The .sa.chunks and .sa.targets determine which test is performed:
% * statname=='t': all chunks are unique => one-sample t-test
% : each chunk present twice => paired-sample t-test
% * statname=='F': all chunks are unique => one-way ANOVA
% : each chunk present N times => repeated measures ANOVA
% See cosmo_montecarlo_cluster_stat for examples on how .sa.targets and
% .sa.chunks should be set for different statistics.
% - Missing values can be indicated by NaNs, and if the output is a
% p-value (or a z-score based on the p-value), then the output is
% computed for different features using varying degrees of freedom.
% For example, if the dataset has 10 samples and a one-sample t-test is
% used, z-scores for samples with no NaN values is based on the
% t-statistic with df=9, but those with two missing values (NaN values)
% are based on the t-statistic with df=7. A use case is computing fMRI
% group statistics where overlap across brains is not perfect at the
% voxel-by-voxel level, in as imilar approach as AFNI's 3dttest++ with
% the '-toz' option.
% - This function computes feature-wise statistics that are not corrected
% for multiple comparisons. To correct for multiple comparisons, see
% cosmo_montecarlo_cluster_stat.
%
% See also: anova1, ttest, ttest2, cosmo_montecarlo_cluster_stat
%
% # For CoSMoMVPA's copyright information and license terms, #
% # see the COPYING file distributed with CoSMoMVPA. #