function ds=cosmo_fmri_dataset(filename, varargin)
% load an fmri volumetric dataset
%
% ds = cosmo_fmri_dataset(filename, [,'mask',mask],...
% ['targets',targets],...
% ['chunks',chunks])
%
% Inputs:
% filename One of:
% * filename of fMRI dataset, it should end with one of:
% .nii, .nii.gz NIfTI
% .hdr, .img ANALYZE
% +{orig,tlrc}.{HEAD,BRIK[.gz]} AFNI
% .vmr, .vmp, .vtc, .glm, .msk BrainVoyager
% .mat SPM (SPM.mat)
% .mat:beta SPM beta
% .mat:con SPM contrast
% .mat:spm SPM stats
% .mat Matlab file with
% CoSMoMVPA or PyMVPA
% dataset.
% * xff structure (from neuroelf's xff)
% * nifti structure (from load_untouch_nii)
% * FieldTrip source MEEG structure
% * SPM structure
% * CoSMoMVPA fMRI or MEEG source dataset structure
% * PyMVPA fMRI dataset structure (exported using PyMVPA's
% cosmo.map2cosmo function)
% 'mask', m Any input as for filename (in which case the output must
% contain a single volume), or one of:
% '-all' exclude features where all values are
% zero or NaN
% '-any' exclude features where any value is
% zero or NaN
% '-auto' require that '-all' and '-any' exclude the
% same features, and exclude the
% corresponding features
% true equivalent to '-auto'
% false do not apply a mask
% The mask must have voxels at the same coordinates as the
% data indicated by filename, although it may
% have a different orientation (e.g. RAI, LPI, AIR).
% Only voxels that are non-zero and not NaN are selected
% from the data indicated by filename.
% If 'mask' is not given, then no mask is applied and a
% warning message (suggesting to use a mask) is printed if
% at least 5% of the values are non{zero,finite}.
% 'targets', t optional Tx1 numeric labels of experimental
% conditions (where T is the number of samples (volumes)
% in the dataset)
% 'chunks, c optional Tx1 numeric labels of chunks, typically indices
% of runs of data acquisition
% 'volumes', v optional vector with indices of volumes to load. If
% empty or not provided, then all volumes are loaded.
% 'block_size', b optional block size by which data is read (if
% supported by the format; currently NIfTI, ANALYZE, AFNI
% and SPM. If this option is provided *and* a mask is
% provided, then data is loaded in chunks (subsets of
% volumes) that contain at most block_size elements each;
% only data that survives the mask is then selected before
% the next block is loaded.
% The default value is 20,000,000, corresponding to ~160
% megabytes of memory required for a block (using
% numbers with double (64 bit) precsision).
% The rationale for this option is to reduce memory
% requirements, at the expensive of a possible increase of
% duration of disk reading operations.
%
%
% Returns:
% ds dataset struct with the following fields:
% .samples NxM matrix containing the data loaded from filename, for
% N samples (observations, volumes) and M features (spatial
% locations, voxels).
% If the original nifti file contained data with X,Y,Z,T
% elements in the three spatial and one temporal dimension
% and no mask was applied, then .samples will have
% dimensions N x M, where N = T and M = X*Y*Z. If a mask
% was applied then M (M<=X*Y*Z) is the number of non-zero
% voxels in the mask input dataset.
% .a struct with dataset-relevent data.
% .a.fdim.labels dimension labels, set to {'i','j','k'}
% .a.fdim.values dimension values, set to {1:X, 1:Y, 1:Z}
% .a.vol.dim 1x3 vector indicating the number of voxels in the 3
% spatial dimensions.
% .a.vol.mat 4x4 voxel-to-world transformation matrix (base-1).
% .a.vol.dim 1x3 number of voxels in each spatial dimension
% .sa struct for holding sample attributes (e.g., sa.targets,
% sa.chunks)
% .fa struct for holding feature attributes
% .fa.{i,j,k} indices of voxels (in voxel space).
%
% Notes:
% - Most MVPA applications require that .sa.targets (experimental
% condition of each sample) and .sa.chunks (partitioning of the samples
% in independent sets) are set, either by using this function or
% manually afterwards.
% - Data can be mapped to the volume using cosmo_map2fmri
% - SPM data can also be specified as filename:format, where format
% can be 'beta', 'con' or 'spm' (e.g. 'SPM.mat:beta', 'SPM.mat:con', or
% 'SPM.mat:spm') to load beta, contrast, or statistic images,
% respectively. When using 'beta', estimates for motion parameters and
% intercepts (which in most cases are estimates of no interest) are
% not returned. If format is omitted it is set to 'beta'.
% - If SPM data contains a field .Sess (session) then .sa.chunks are set
% according to its contents
% - If a mask is supplied, then all features that are in the mask are
% returned, even if some voxels contain NaN. To remove such features,
% consider applying cosmo_remove_useless_data to the output of this
% function.
%
% Dependencies:
% - for NIfTI, analyze (.hdr/.img) and SPM.mat files, it requires the
% NIfTI toolbox by Jimmy Shen
% (note that his toolbox is included in CoSMoMVPA in /externals)
% - for Brainvoyager files (.vmp, .vtc, .msk, .glm), it requires the
% NeuroElf toolbox, available from: http://neuroelf.net
% - for AFNI files (+{orig,tlrc}.{HEAD,BRIK[.gz]}) it requires the AFNI
% Matlab toolbox, available from: https://github.com/afni/AFNI
%
% Examples:
% % load nifti file
% ds=cosmo_fmri_dataset('mydata.nii');
%
% % load gzipped nifti file
% ds=cosmo_fmri_dataset('mydata.nii.gz');
%
% % load ANALYZE file and apply brain mask
% ds=cosmo_fmri_dataset('mydata.hdr','mask','brain_mask.hdr');
%
% % load AFNI file with 6 'bricks' (values per voxel, e.g. beta
% % values); set chunks (e.g. runs) and targets (experimental
% % conditions); use a mask
% ds=cosmo_fmri_dataset('mydata+tlrc', 'chunks', [1 1 1 2 2 2]', ...
% 'targets', [1 2 3 1 2 3]', ...
% 'mask', 'masks/brain_mask+tlrc);
%
% % load BrainVoyager VMR file in directory 'mydata', and apply an
% % automask that removes all features (voxels) that are zero or
% % non-finite for all samples
% ds=cosmo_fmri_dataset('mydata/mydata.vmr', 'mask', true);
%
% % load two datasets, one for odd runs, the other for even runs, and
% % combine them into one dataset. Note that the chunks are set here,
% % but the targets are not - for subsequent analyses this may have to
% % be done manually
% ds_even=cosmo_fmri_dataset('data_even_runs.glm','chunks',1);
% ds_odd=cosmo_fmri_dataset('data_odd_runs.glm','chunks',2);
% ds=cosmo_stack({ds_even,ds_odd});
%
% % load beta values from SPM GLM analysis stored
% % in a file SPM.mat.
% % If SPM.mat contains a field .Sess (sessions) then .sa.chunks
% % is set according to the contents of .Sess.
% ds=cosmo_fmri_dataset('path/to/SPM.mat');
%
% % as above, and apply an automask to remove voxels that
% % are zero or non-finite in all samples.
% ds=cosmo_fmri_dataset('path/to/SPM.mat','mask',true);
%
% % load contrast beta values from SPM GLM file SPM.mat
% ds=cosmo_fmri_dataset('path/to/SPM.mat:con');
%
% % load contrast statistic values from SPM GLM file SPM.mat
% ds=cosmo_fmri_dataset('path/to/SPM.mat:spm');
%
% % load PyMVPA dataset 'pymvpa_ds' that was exported using
% % PyMVPA's cosmo.map2cosmo function in Python with:
% %
% % >>> from mvpa2.datasets import cosmo
% % >>> cosmo.map2cosmo(pymvpa_ds,'data.mat')
% %
% cosmo_ds=cosmo_fmri_dataset('data.mat')
%
% See also: cosmo_map2fmri
%
% part of the NIfTI code is based on code by Robert W Cox, 2003,
% dedicated to the public domain.
%
% # For CoSMoMVPA's copyright information and license terms, #
% # see the COPYING file distributed with CoSMoMVPA. #