function ds=cosmo_meeg_dataset(filename, varargin)
% Returns a dataset structure based on MEEG data
%
% ds=cosmo_meeg_dataset(filename, varargin)
%
% Inputs:
% filename filename of MEEG data to be loaded. Currently
% supported are files with extensions:
% .mat : FieldTrip time-locked or
% time-frequency data at either the
% sensor or source level.
% .txt : EEGLAB export with time-locked
% data.
% .daterp time-locked }
% .icaerp ICA time-locked } EEGLAB
% .dattimef time-freq }
% .icatimef ICA time-freq }
% .datitc inter-trial coherence }
% .icaitc ICA inter-trial coherence }
% .datersp ERSP data }
% .icaersp ICA ERSP data }
% Alternatively it can be a FieldTrip or EEGLAB struct
% with time-locked or time-frequency data
% 'targets', t Px1 targets for P samples; these will be stored in
% the output as ds.sa.targets (optional)
% 'chunks', c Px1 chunks for P samples; these will be stored in the
% output as ds.sa.chunks (optional)
% 'data_field', f - For FieldTrip MEEG source dataset with multiple
% data fields (such as 'pow' and 'mom'), this sets
% which data is returned. (only for source data)
% - For EEGLAB 'ersp' data
% f='ersp' the original data is returned
% (without baseline correction),
% with data for each channel,
% frequency and time point.
% Based on datasets generated by
% EEGLAB that were inspected when
% writing this function, it seems
% that this data represents raw power
% values.
% f='erspbase' the baseline is returned, with data
% for each channel and frequency
% (but not time point)
% Note: this function currently does not support
% returning baseline-corrected data.
% 'trials', idx Mx1 array with indices of trials to load (optional).
% If not provided then all trials are loaded. The
% output has a .samples field with the number of rows
% equal to numel(idx).
%
% Returns:
% ds dataset struct with the following fields
% .samples PxQ for P samples and Q features.
% .sa.targets Px1 sample targets (if provided)
% .sa.chunks Px1 sample chunks (if provided)
% .a
% .meeg
% .sample_field name of sample field. One of 'fourierspctrm',
% 'powspctrm', or 'trial'.
% .samples_type 'timelock' or 'timefreq'.
% .samples_label Usually 'rpt'; or the first field of .dimord
% for FieldTrip data
% .fdim
% .labels 1xS cell struct with labels for the feature
% dimensions of the input. Usually this is
% {'chan','time'} or {'chan','freq','time'}.
% .values 1xS cell struct with values associated with .labels.
% If the K-th value has N_K values, this means that
% the feature dimension .labels{K} takes the
% values in .values{K}. For example, if
% .labels{1}=='chan', then .values{1} contains the
% channel labels.
% .fa
% .{D} if D==a.fdim.labels{K} is the label for the K-th
% feature dimension, then .{D} contains the
% indices referencing a.fdim.values. Thus, all values in
% .{D} are in the range 1:N_K if a.fdim.values{K} has
% N_K values, and the J-th feature has dimension value
% a.fdim.values{K}(.{D}(J)) in the K-th dimension.
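%
% As a minimal sketch of how .fa and .a.fdim relate (assuming ds was
% returned by this function and has a 'chan' feature dimension), the
% channel of the J-th feature could be looked up as follows:
%
%     J=1;                                      % feature of interest
%     K=find(strcmp(ds.a.fdim.labels,'chan'));  % position of 'chan'
%     % value of feature J in dimension K; this is a 1x1 cell if the
%     % channel labels are stored in a cell array
%     chan_of_J=ds.a.fdim.values{K}(ds.fa.chan(J));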
%
% Notes:
% - The resulting dataset can be mapped back to MEEG format using
% cosmo_map2meeg
% - If the input contains data from a single sample (such as an average),
% the .sample_field is set to 'trial', and mapping back to MEEG format
% adds a singleton dimension to the .trial data output field.
% - For single-subject MVPA of single trials using data preprocessed with
% FieldTrip, consider setting, depending on the data type:
% * timelock (ft_timelockanalysis): cfg.keeptrials = 'yes'
% * timefreq (ft_freqanalysis): cfg.keeptrials = 'yes'
% * source (ft_sourceanalysis) : cfg.keeptrials = 'yes' *and*
% cfg.rawtrials = 'yes'
% - Most MVPA applications require that .sa.targets (experimental
% condition of each sample) and .sa.chunks (partitioning of the samples
% in independent sets) are set, either by using this function or
% manually afterwards.
% - If the input is a FieldTrip struct with a field .trialinfo, then this
% field is stored in .sa.trialinfo. Depending on the contents of
% .trialinfo, this could be used to specify conditions in each trial.
% For example, if the third column of .trialinfo contains an integer
% specifying the condition of each trial, after running this function
% one can do
%
% ds.sa.targets=ds.sa.trialinfo(:,3)
%
% to set the trial conditions.
% - Implementation note: when loading EEGLAB data from a file, using the
% 'trials' option means that data from different channels are loaded
% through different 'load' commands. When loading a subset of all
% trials, the advantage of this implementation is that significantly
% less memory is needed compared to an alternative implementation in
% which the full dataset is loaded and then the trials of interest
% are selected through slicing. The disadvantage is that loading may
% take longer, because the file is opened and closed multiple times. Yet
% this approach allows one to load subsets of trials from data files
% that are larger than the available RAM.
% Such memory reductions are currently not available for FieldTrip
% data, as FieldTrip's data structures do not store data for different
% channels in different variables.
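%
% Examples:
%     % The snippets below are illustrative sketches of the options
%     % described above, not tested output. The filenames and the
%     % variables targets, chunks, ft_struct and data_preproc are
%     % hypothetical placeholders.
%
%     % load FieldTrip time-locked data from a file and set targets and
%     % chunks (each Px1 for P samples)
%     ds=cosmo_meeg_dataset('subj01_tl.mat','targets',targets,...
%                                             'chunks',chunks);
%
%     % load only the first ten trials from an EEGLAB file; the
%     % resulting ds.samples has ten rows
%     ds=cosmo_meeg_dataset('subj01.daterp','trials',(1:10)');
%
%     % keep single trials when computing time-locked data in FieldTrip,
%     % then convert the resulting struct
%     cfg=struct();
%     cfg.keeptrials='yes';
%     tl=ft_timelockanalysis(cfg,data_preproc);
%     ds=cosmo_meeg_dataset(tl);
%
%     % convert a struct that is already in memory, then map the
%     % dataset back to MEEG format
%     ds=cosmo_meeg_dataset(ft_struct);
%     meeg_struct=cosmo_map2meeg(ds);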
%
% See also: cosmo_map2meeg
%
% # For CoSMoMVPA's copyright information and license terms, #
% # see the COPYING file distributed with CoSMoMVPA. #