function nbrhood=cosmo_cluster_neighborhood(ds,varargin)
% define neighborhood suitable for cluster-based analysis
%
% nbrhood=cosmo_cluster_neighborhood(ds,...)
%
% Inputs:
% ds dataset struct
% 'fmri',fnn Optional connectivity for voxels, if ds is an
% fmri dataset. Use fnn=1, 2 or 3 to let two voxels be
% neighbors if they share at least a a side, edge
% or vertex (respectively).
% Default: 3
% 'source',snn Optional connectivity for grid positions, if ds is an
% meeg source dataset. Use snn=1, 2 or 3 to let two
% source positions be neighbors if they share at least a,
% side, edge or vertex (respectively). If ds.fa has a
% field 'inside', then any feature that is false for
% ds.fa.inside will not have any neighbors, nor will it
% be the neighbor of any other feature.
% Default: 3
% 'chan',c Optional connectivity for channels, if ds is an MEEG
% dataset with channels. c=true makes channels neighbors
% if their Delaunay triangulation (as computed by
% cosmo_meeg_chan_neighors) has them share an edge.
% c=false means that a channel is only neighbor of
% itself.
% Default: true
% 'surface',s Optional connectivity for nodes, if ds is a
% surface-based dataset. s=true makes channels neighbors
% if they share an edge on the surface.
% c=false means that a channel is only neighbor of itself
% Default: true
% 'vertices',v } vertices (Px3) and faces (Qx3) of the surface;
% 'faces',f } only required for a surface-based dataset
% k,t Any other feature dimension and its connectivity, e.g.
% k can be 'chan' or 'freq'. t indicates whether
% neighboring points (in time, freq, ...) are neighbors.
% Default: true (for any dimension not mentioned above)
%
% Output:
% nbrhood Neighborhood struct
% .neighbors .neighbors{k}==idxs means that feature k in ds has
% neighbors with feature indices idxs
% .fa identical to ds.fa, except that a field .sizes
% is added indicating the size of each feature
% - for surfaces, this is the area of each node
% - in all other cases, it is set to a vector of ones.
% .a identical to ds.a
%
% Examples:
% % get neighbors in very tiny synthetic fmri dataset (only 6 voxels)
% ds_fmri=cosmo_synthetic_dataset('type','fmri');
% % by default use NN=3 connectivity (voxels sharing a vertex is
% % sufficient to be neighbors)
% nh_fmri=cosmo_cluster_neighborhood(ds_fmri,'progress',false);
% % each voxel has 4 or 6 neighbors
% cosmo_disp(nh_fmri.neighbors);
% %|| { [ 1 2 4 5 ]
% %|| [ 1 2 3 4 5 6 ]
% %|| [ 2 3 5 6 ]
% %|| [ 1 2 4 5 ]
% %|| [ 1 2 3 4 5 6 ]
% %|| [ 2 3 5 6 ] }
%
% % (This example requires FieldTrip)
% cosmo_skip_test_if_no_external('fieldtrip');
% %
% % get neighbors in time-lock MEEG dataset from the neuromag306
% % system (subset of channels), with combined planar and
% % axial channels
% ds_meg=cosmo_synthetic_dataset('type','meeg',...
% 'size','normal',...
% 'sens','neuromag306_planar_combined+axial');
% nh_meg=cosmo_cluster_neighborhood(ds_meg,'progress',false);
% % neighbors are seperate for axial channels (odd features)
% % and planar_combined channels (even features)
% cosmo_disp(nh_meg.neighbors)
% %|| { [ 1 3 4 6 ]
% %|| [ 2 5 ]
% %|| [ 1 3 4 6 ]
% %|| [ 1 3 4 6 ]
% %|| [ 2 5 ]
% %|| [ 1 3 4 6 ] }
%
% % (This example requires FieldTrip)
% cosmo_skip_test_if_no_external('fieldtrip');
% %
% % get neighbors in EEG dataset, either with clustering over all
% % feature dimensions (channels x time x freq) or with all feature
% % dimensions except for channels (i.e., time x freq)
% ds_eeg=cosmo_synthetic_dataset('type','timefreq',...
% 'size','normal',...
% 'sens','eeg1005');
% % neighborhood with clustering over chan x time x freq
% nh_eeg_full=cosmo_cluster_neighborhood(ds_eeg,'progress',false);
% % each feature has up to 18 neighbors
% cosmo_disp(nh_eeg_full.neighbors)
% %|| { [ 1 2 3 ... 10 11 12 ]@1x12
% %|| [ 1 2 3 ... 10 11 12 ]@1x12
% %|| [ 1 2 3 ... 10 11 12 ]@1x12
% %|| :
% %|| [ 19 20 21 ... 28 29 30 ]@1x12
% %|| [ 19 20 21 ... 28 29 30 ]@1x12
% %|| [ 19 20 21 ... 28 29 30 ]@1x12 }@30x1
% %
% % neighborhood with clustering over time x freq (not over chan)
% nh_eeg_tf=cosmo_cluster_neighborhood(ds_eeg,'progress',false,...
% 'chan',false);
% % each feature has at most 6 neighbors
% cosmo_disp(nh_eeg_tf.neighbors)
% %|| { [ 1 4 7 10 ]
% %|| [ 2 5 8 11 ]
% %|| [ 3 6 9 12 ]
% %|| :
% %|| [ 19 22 25 28 ]
% %|| [ 20 23 26 29 ]
% %|| [ 21 24 27 30 ] }@30x1
%
%
% Notes:
% - This function uses cosmo_cross_neighborhoods internally so that
% clusters can be formed across different dimensions.
% - The output from this function can be used with
% cosmo_montecarlo_cluster_stat for multiple comparison correction
% - each dimension argument (such as 'fmri', 'source', 'chan', 'freq',
% 'time', 'surface') can be followed by a custom neighborhood, if it is
% desired to override the neighborhoods generated by this function.
% For most use cases this is recommended; it should only be used if you
% really know what you are doing.
% - To avoid making clusters along a particular dimension d, use
% cosmo_cluster_neighborhood (...,d,false). For example, an MEEG
% time-by-chan dataset ds could be clustered using:
% 1a) nh1a=cosmo_cluster_neighborhood(ds);
% 1b) nh1b=cosmo_cluster_neighborhood(ds,'time',true);
% 2) nh2 =cosmo_cluster_neighborhood(ds,'time',false);
% where 1a and 1b are equivalent, and both different from 2.
% When used with cosmo_montecarlo_cluster_stat to correct for multiple
% comparisons, the interpretation of any surviving cluster depends on
% which approach was used:
% * In approach 2, clusters are not connected by neighboring time
% points; each cluster spans a single time point. For example, if a
% cluster is found at 200ms relative to stimulus onset, one can
% infer a significant effect at 200 ms.
% * In approach 1, clusters are connected over neighboring time points,
% and each cluster can span multiple time points. If a cluster is
% found spanning a time interval between 200 and 300ms, one *cannot*
% infer a significant effect at 200 ms. One *can* infer a significant
% effect for the time interval between 200 and 300 ms.
% (Note that in both approaches clusters, are connected over
% neighboring channels; spatial inferences over individual channels
% cannot be made in either approach).
% To summarize: in approach 1, the threshold to pass significance is
% lower, but less strong inferences can be made than with approach 2.
%
% See also: cosmo_montecarlo_cluster_stat
%
% # For CoSMoMVPA's copyright information and license terms, #
% # see the COPYING file distributed with CoSMoMVPA. #