Adaptive Independent Vector Analysis for Multi-Subject Complex-Valued fMRI Data

Background Complex-valued fMRI data can provide additional insights beyond magnitude-only data. However, independent vector analysis (IVA), which has exhibited great potential for group analysis of magnitude-only fMRI data, has rarely been applied to complex-valued fMRI data. The main challenges in this application include the extremely noisy nature and large variability of the source component vector (SCV) distribution.


Introduction
Blind source separation (BSS) has been widely applied to multi-subject functional magnitude resonance imaging (fMRI) data analysis because of little requirement on prior information about the data.The resulting spatial maps (SMs) and time courses (TCs), being common components across all subjects or subject-specific, are vital for studying brain function.Among BSS techniques, tensor decomposition generates common SM and TC components across multiple subjects, but also demonstrates subjectspecific intensity.Andersen and Rayens (2004) utilized canonical polyadic decomposition (CPD), which is a general tensor model that separates multi-trial fMRI data.Beckmann and Smith (2005) extended CPD to multi-subject fMRI data by combining it with ICA to deal with inter-subject SM variability.Since different subjects may generate different responses due to variation in response time or in their hemodynamic delay, inter-subject TC variability naturally also exists for multi-subject fMRI data.Kuang et al. (2015) proposed a solution by combining shift-invariant CPD (Mørup et al., 2008) and ICA to simultaneously consider inter-subject SM and TC variability.Compared to tensor decomposition, ICAbased analysis extracts subject-specific TCs and/or SMs for emphasizing inter-subject variability.Two such approaches are group ICA (Calhoun et al., 2001(Calhoun et al., , 2008;;Guo and Pagnonib, 2008;Erhardt et al., 2011;Calhoun and Adali, 2012b;Eloyan et al., 2013;Afshin-Pour et al., 2014) and independent vector analysis (IVA, a kind of joint ICA) (Lee et al., 2008a;Dea et al., 2011;Michael et al., 2014;Ma et al., 2014;Laney et al., 2015aLaney et al., , 2015b;;Gopal et al., 2015Gopal et al., , 2016;;Adali et al., 2015).While group ICA provides individual TCs or SMs via ICA of temporally or spatially concatenated multi-subject fMRI datasets, IVA generates individual TCs and SMs via joint ICA of multi-subject fMRI datasets where similar SMs among different subjects were concatenated as source component vectors (SCVs).Multiple previous publications have shown that IVA outperforms group ICA for capturing inter-subject variability by exploiting both the dependence of similar SMs across multi-subject fMRI datasets and the independence of distinct SMs (Dea et al., 2011;Michael et al., 2014;Laney et al., 2015aLaney et al., , 2015b)).Therefore, we focus on IVA methods in this study.
Real-valued IVA algorithms have been well developed and widely applied to magnitude-only multisubject fMRI datasets.The first IVA algorithm for analyzing real-valued fMRI data was presented by Lee et al. (2007a) using a multivariate Laplace distribution (Lee et al., 2007a(Lee et al., , 2008a) ) called IVA-L.With the development of IVA-G using multivariate Gaussian distribution (Anderson et al., 2012a), an IVA algorithm called IVA-GL was implemented by utilizing IVA-G to initialize the de-mixing matrix and IVA-L to perform the subsequent separation.IVA-GL emphasizes both second-order and higher-order statistics (Anderson et al., 2012a), and thus tends to be more efficient than IVA-L and IVA-G for fMRI analysis.IVA-GL was first tested using simulated fMRI data (Dea et al., 2011;Michael et al., 2014), and then applied to real-valued fMRI data for diverse applications: e.g., producing discriminative features for quantifying motor recovery after stroke (Laney et al., 2015a(Laney et al., , 2015b)); finding dynamic changes in spatial functional network connectivity in healthy individuals and schizophrenic patients (Ma et al., 2014;Calhoun and Adali, 2016); showing the spatial variation in fMRI brain networks (Gopal et al., 2015(Gopal et al., , 2016)); fusing multimodal data (Adali et al., 2015); and removing the gradient artifact in concurrently collected electroencephalogram (EEG) and fMRI data (Acharjee et al., 2015).In addition to IVA-GL, there are also other real-valued IVA algorithms that are promising for analyzing magnitude-only fMRI data, such as IVA with the Kotz family of distribution (Anderson et al., 2013), and IVA using an adaptive multivariate generalized Gaussian distribution (MGGD) (Boukouvalas et al., 2015).
However, there is a shortage of complex-valued IVA algorithms suitable for analyzing complex-valued fMRI data.Although magnitude-only fMRI data are extensively studied, fMRI data are initially acquired as complex-valued image pairs including magnitude and phase information (Adali and Calhoun, 2007;Calhoun andAdali, 2012a, 2012b).Phase fMRI data contain useful and unique information such as blood oxygenation levels during functional activation (Hoogenraad et al., 1998;Arja et al., 2010), the effects of macro-and micro-vessels (Menon, 2002;Tomasi and Caparelli, 2007), and the orientation of large blood vessels (Klassen and Menon, 2005).Analysis of complex-valued fMRI data provides additional insights beyond magnitude-only fMRI data (Rowe, 2005;Adali and Calhoun, 2007;Calhoun andAdali, 2012a, 2012b;Rodriguez et al., 2011Rodriguez et al., , 2012;;Li et al., 2010Li et al., , 2011;;Yu et al., 2015).The complex-valued method with pre-ICA de-noising (using observed phase images to identify and remove noisy voxels in original fMRI data) achieved higher sensitivity and specificity than the magnitude-only method (Rodriguez et al., 2011(Rodriguez et al., , 2012;;Li et al., 2011).By using post-ICA de-noising (using SM phase information to identify and remove noisy voxels in ICA estimates), the complex-valued method extracts more contiguous and reasonable activations than the magnitude-only method (Yu et al., 2015).This supports the potential of identifying useful brain information from complex-valued fMRI data beyond magnitude-only fMRI data.
Although some complex-valued IVA algorithms exist, they are unsuitable for the analysis of complexvalued fMRI data for the following two reasons.
First, a good IVA application requires an appropriate multivariate probability density function to match SCVs distribution.Most existing algorithms were originally proposed and tuned to separate frequencydomain speech (Kim et al., 2006(Kim et al., , 2007;;Lee et al., 2007bLee et al., , 2008b;;Liang et al., 2014).Typical algorithms include the following: fast fixed-point IVA (FIVA) employing a spherically symmetric, exponential norm distribution (SEND), or spherically symmetric Laplace (SSL) distribution (Lee et al., 2007b); IVA assuming MGGD with a fixed shape parameter (Liang et al., 2014); and an adaptive IVA algorithm using a multivariate Gaussian mixture model for separating mixed speech and music signals (Lee et al., 2008b).These complex-valued IVA algorithms exhibited obvious degradation when applied to fMRI analysis due to different distributions between frequency-domain speech and fMRI data.
Secondly, there is large distribution variability for such a large number of fMRI SCVs (usually more than 40 SCVs).This study focuses on the analysis of full complex-valued fMRI data (without pre-ICA denoising).In this case, many complicated noise components may be involved (e.g., head movement, respiration, cardiac pulsation) in addition to brain function-related components such as task-related components, transiently task-related components, and the default mode network (DMN).These brain function-related components generally exhibit a super-Gaussian distribution, while some noise components tend to have a sub-Gaussian distribution.Therefore, IVA using a fixed source distribution may not perform well due to mismatching distributions.The noncircular FIVA (non-FIVA) (Zhang et al., 2012) and complex-valued IVA-G (Anderson et al., 2012b) algorithms, which were proposed for noncircular sources by utilizing the information of a pseudo-covariance matrix, are candidates for extracting complex-valued fMRI sources with noncircular characteristics (Schreier et al., 2009;Schreier and Scharf, 2010;Lin et al., 2011).
This study proposes an adaptive IVA algorithm for group analysis of full complex-valued fMRI data.
Preliminary results can be found in Kuang et al. (2016).We utilized MGGD, which contains multivariate super-Gaussian and sub-Gaussian distributions, and adaptively learned the MGGD shape parameter to match changing SCV distributions.Specifically, we derived a nonlinearity from an MGGD-based SCV distribution and estimated the shape parameter using maximum likelihood estimation (MLE).
Furthermore, to decrease the noise effect, we adopted a subspace de-noising strategy (Na et al., 2013), and updated the MGGD-based nonlinearity in the dominant SCV subspace.Furthermore, we utilized a post-IVA phase de-noising strategy to remove noisy voxels from the SM estimates.Finally, we explicitly employed the noncircular characteristics of fMRI source signals (i.e., complex-valued SMs) by incorporating the fMRI data pseudo-covariance matrix into the IVA algorithm.Simulated and experimental fMRI data were then utilized to evaluate the proposed algorithm.

IVA Model and Cost Function
Given K subjects of complex-valued fMRI data ̃( ) ( ) K, where J is the number of time points, , and is the total number of in-brain voxels obtained by flattening the volume image data.We first perform PCA and whitening preprocessing on ̃( ) ( ) for each subject to eliminate noise effects and to reduce dimensionality.Assume that ( ) ( ) is the matrix for reducing and whitening, we obtain a PCA-reduced and whitened mixture vector as ( ) ( ) ( ) ̃( ) ( ) , and ( ) ( ) , ( ) ( ) ( ) ( )-, where the superscript " " denotes transpose.
The IVA model is as follows: where ( )  is a mixing matrix used to recover TCs; where ( ) is the th column of ( ) , and the superscript " " denotes conjugate transpose.
The goal of IVA is to minimize the mutual information among estimated SCVs, and thus its cost function is defined as follows (Lee et al., 2007a(Lee et al., , 2008a)): where ( ) is a multivariate probability density distribution for matching SCVs for , " " and "| |" denote the determinant and modulus operation, respectively.We employed fixed-point learning in the IVA algorithm (Bingham et al., 2000;Lee et al., 2007b), in which the de-mixing matrices ) are constrained to be orthonormal in each iteration, and thus the second term of Eq.
(4) clearly exhibits the function of IVA that simultaneously minimizes the entropy of all N components and maximizes the mutual information within each SCV (i.e., ( )) (Anderson et al., 2012b).
For convenience, we directly related the SCV distribution ( ) to a real-valued nonlinear function ( ) as follows: (5) As mentioned earlier, we selected MGGD for matching the various SCV distributions in our complexvalued fMRI data analysis.Next, we derived an MGGD-based nonlinearity ( ).

MGGD-based Nonlinear Function
The MGGD-based SCV distribution is defined as follows (Gómez et al., 1998;Pascal et al., 2013): where , ( ) is gamma function; is the shape parameter, 1 is the multivariate Gaussian distribution, 0.5 corresponds to the multivariate Laplace distribution, and is the symmetric positive definite matrix.From Eq. ( 6), we have Omitting the first 2 terms of Eq. ( 7), we define a nonlinear function as:

Subspace De-noising
For each SCV , the covariance matrix of magnitude is denoted as The first eigenvalue of the covariance matrix is much larger than the other eigenvalues.As such, the nonlinearity (| | ) can be learned within the 1-dimensional subspace spanned by the dominant eigenvector (Na et al., 2013), resulting in a de-noised nonlinear function.Denoting the dominant eigenvalue as and the eigenvector , -, | | in the dominant subspace is expressed as . From this we have the following MGGD-based nonlinear function ( ): Connecting Eqs. ( 10) and ( 8) produces From Eq. ( 11), we can determine a new for incorporating subspace characteristics into the MGGDbased SCV distribution as follows: The derivation of is provided in Appendix A.

Shape Parameter Updating
We updated the shape parameter to match the SCV distribution.More precisely, we estimated the using the MLE method with Newton-Raphson optimization (Pascal et al., 2013) at each iteration as follows: where the log-likelihood function of is given by . ( 14)

Incorporating Noncircularity in Estimating Demixing Matrix
We used the deflation-based method to update ( ) , i.e., we separately updated each column ( ) of the demixing matrix ( ) , .Motivated by explicitly utilizing the noncircularity of source signals in a learning rule for the demixing matrix (Zhang et al., 2012), we incorporated the pseudo-covariance matrix { ( ) ( ( ) ) } into a fixed-point learning rule: where the superscript " " denotes complex conjugation, ( ) and ( ) are the first and second derivatives of the nonlinear function ( ): , and ( 16) A decorrelation step is employed following ( ) updating:

Proposed Algorithm
For the IVA-generated SMs and TCs, we employed post-IVA de-noising based on SM phase information to remove noisy voxels introduced by phase fMRI data (Yu et al., 2015).More precisely, we first used the phase de-ambiguity approach based on TC estimates to eliminate the phase ambiguity of subject-specific SMs and TCs, and then retained only voxels with phase values within ⁄ .The final SMs were zscored and thresholded at |Z| ≥ 0.5 for both simulated and experimental fMRI data.
We have presented the main idea and learning rules of our approach.A detailed procedure for the proposed algorithm is included below in Algorithm 1.Note the convergence condition of Algorithm 1 is reaching either the maximum iteration or the minimum relative error of cost function ϵ.Here, 1000 and =10 -6 .
Algorithm 1: Adaptive complex-valued IVA algorithm step 1: Input K subjects of complex-valued fMRI data ̃( ) ( ) and the number of components N.
step 2: Perform PCA and whitening procedure on the fMRI data of each subject, and obtain the reducing and whitening matrix ( ) and ( ) ( ) ̃( ) , .
step 4: Compute the SM estimate using Eq. ( 2) and the covariance matrix with Eq. ( 9) for all N SCVs.
step 5: Perform PCA on to obtain the dominant eigenvalue and its eigenvector , then, using Eq. ( 10), calculate the MGGD-based nonlinear function ( ) for each SCV.
step 8: If convergence, go to step 9; otherwise, go to step 4.
step 10: Perform phase de-ambiguity on SM and TC estimates and post-IVA phase de-noising on SM estimates (retaining voxels within ⁄ ).De-noised SM estimates were finally zscored and thresholded at |Z| ≥ 0.5.
step 11: Output the de-noised SM estimates and the phase-corrected TC estimates.

Experimental FMRI Data
The experimental fMRI datasets were obtained from 16 subjects performing a finger-tapping motor task while receiving auditory instructions.IRB-approved informed consent at the University of New Mexico was obtained from all participants.The paradigm included a block design with alternating periods of 30 seconds on (finger tapping) and 30 seconds off (rest).Each participant's data were collected with 165 whole-head fMRI images.Experiments were performed with a 3T Siemens TIM Trio system with a 12channel radio frequency (RF) coil.The fMRI experiment used a standard Siemens gradient-echo echoplanar imaging (EPI) sequence modified to store real and imaginary data separately.We collected data from the 12-channel RF coil and combined them (internally by Siemens) in an optimal manner based on coil sensitivity profiles (Feng et al., 2009).The following parameters were used: field-of-view = 24 cm,

Simulated FMRI Data
Ten subjects of simulated complex-valued fMRI datasets were generated based on the extracted components by the proposed method from the experimental fMRI data.We selected 12 components which had higher mean correlations with their magnitude SM references (Yu et al., 2015;Smith et al., 2009) for the ten subjects.These components included the task-related component, DMN, supplementary motor area, medial and occipital pole visual areas, auditory cortex, cerebellum, left and right frontoparietal areas, posterior cingulate cortex, inferior parietal lobule (IPL), and a noise component with a higher shape parameter than the other 11 components.Among the 12 components, the SCV shape parameters ranged from 0.368 to 0.533 (see Fig. 3); the mean correlations between SM components within an SCV varied from 0.468~0.589(see Fig. 1(1)), and those between their corresponding TCs ranged from 0.088 to 0.320 (see Fig. 1(2)).The smaller the mean correlations between SM components within an SCV or between their corresponding TCs, the larger the inter-subject spatial or temporal Each component consisted of an SM and a corresponding TC; each SM contained 53 × 63 × 46 voxels while non-brain voxels were all zeroes; and each TC included 165 time points with TR = 2 seconds.In the brain, these selected SMs and TCs were all phase corrected using the phase de-ambiguity approach based on TC estimates (Yu et al., 2015), thus the blood oxygenation-level dependent related voxels had phase values within [-π/4, π/4].We computed the mean magnitude and phase images across all subjects of complex-valued fMRI data to generate baseline images, and then generated no-noise simulated fMRI datasets by adding the product of SMs and TCs to the baseline images for each subject.Subsequently, we spatially smoothed both the real and imaginary images of simulated fMRI datasets with a 10 × 10 × 10 mm 3 FWHM Gaussian kernel.
After removing the non-brain voxels and flattening the volume image data, we obtained 59510× 165 × 10 simulated fMRI data.Rician noise (Gudbjartsson and Patz, 1995) was then added to investigate the noise effect.We generated simulated fMRI data with 9 noise levels of contrast to noise ratio (CNR), from -10 dB to 10 dB (with 2.5 dB interval).Since we have ground truth sources for all components, we evaluated all components based on the simulated fMRI data.

Performance Index
We used the following performance indices to evaluate the proposed IVA algorithm.

Error Rate
The error rate is defined as the ratio of the number of wrong components within an SCV to the number of subjects (i.e., K).Taking a specific SCV consisting of component N* as an example, a wrong component (corresponding to a subject) is determined if its index for the maximal Pearson correlation with its ground truth component is inconsistent with index N*.Assuming the error rate for the nth SCV is , , we define an average error rate across all N SCVs as ̅ ( ) ⁄ .The lower error rate, the better the SCV quality.

Joint Pearson Correlation Coefficient
We first computed K Pearson correlation coefficients between each component within an estimated SCV and the reference, and then obtained an averaged correlation coefficient across K subjects, which was called a joint Pearson correlation coefficient (JPCC) for an SCV estimate.For simulated fMRI data, references were considered to be ground truth sources.Therefore, we calculated JPCC for SM magnitude, SM phase, TC magnitude, and TC phase.Since we create differences between 2 phase value cases, i.e., within or outside a specific range (e.g., ⁄ ), we set the phase values within ⁄ as 1, and the others as 0 for both the estimated SM phase and the ground truth.We denoted JPCC for SM magnitude, SM phase, TC magnitude, and TC phase as ( ), ( ), ( ), and ( ), respectively.We also averaged JPCC across N SCVs for simulated fMRI data, denoting as ̅ , ̅ , ̅ , and ̅ , accordingly.
Regarding the task-related and DMN components for experimental fMRI data, we used the prior GLM and Smith DMN masks (Smith et al., 2009;Yu et al., 2015) as SM magnitude references, and constructed their phase masks by setting 1 for the activated voxels in the SM reference and 0 for the others.( ) and ( ) were then computed.Additionally, we computed and for task-related TC estimates using the model TC (obtained by convolving stimuli with canonical SPM hemodynamic response functions) as the reference because the phase of task-related TC estimates also had high correlation with the model TC (Li et al., 2011).

Significance via Paired t-Test
We conducted paired t-tests between the above performance indices for our method and those for non-FIVAs, FIVAs, IVA-GL, non-FIVA, and FIVA, respectively, to show the significant difference between our method and these other 5 algorithms.

Effects of Noise
Fig. 2 shows the noise influence (CNR = -10 dB to 10 dB) on the proposed algorithm, non-FIVAs, FIVAs, IVA-GL, non-FIVA, and FIVA.We computed ̅ , ̅ , ̅ , ̅ , ̅ and their standard deviations for the simulated fMRI data.The true number of components N = 12 was used.Our proposed method achieved the lowest ̅ , the highest ̅ , ̅ , ̅ , ̅ , and the smallest standard deviations for all cases, followed generally by non-FIVAs, FIVAs, non-FIVA, FIVA, and IVA-GL.We performed paired ttest to access the difference between our method and the other 5 IVA algorithms, as shown in Table 1.
We also found that non-FIVA was better than FIVA by virtue of using noncircular characteristics, while non-FIVAs and FIVAs were better than non-FIVA and FIVA due to using subspace de-noising (Fig. 2).
Among all indices, ̅ and ̅ for SM magnitudes and phases were more sensitive to noise than the other indices for all 6 algorithms.Furthermore, SM phases illustrated the highest noise sensitivity.As CNR increased, ̅ showed a nearly linear increase.This conforms to existing findings about fMRI phase characteristics (Menon, 2002;Zhao et al., 2007;Tomasi and Caparelli, 2007;Hagberg et al., 2008;Yu et al., 2015).

Shape Parameters Estimation
Fig. 3 illustrates average shape parameters estimated by our proposed approach (over 20 runs) from simulated fMRI data with CNR = -5 dB, 0 dB, and 5 dB.The true number of components N = 12 of the simulated fMRI data was utilized.The varying shape parameters were well learned and followed (e.g., see shape parameters for components 8, 9 and 12 in Fig. 3) for different noise levels of simulated fMRI data.
However, the estimated shape parameters were more accurate for higher CNR data (i.e., 5 dB and 0 dB) than for lower CNR data (-5 dB).We next utilized 2 example cases, one with higher CNR level (CNR = 5dB), and the other with lower CNR level (CNR = -5dB), to compare our proposed method with the other 5 algorithms in terms of , , , , and for estimating all 12 components (Fig. 4).We observed that the proposed approach generated better estimates than the other 5 methods, especially for higher noise levels and larger spatial and temporal changes (Components 4, 6, 8, and 10).In addition, Components 6, 8 and 10 exhibited larger deviations from the fixed shape parameter 0.5 (0.5 was the shape parameter used by non-FIVAs, FIVAs, non-FIVA, FIVA, and IVA-GL), which was another cause for the obvious degradation of these 5 algorithms for estimating the three components.As a result, the proposed method yielded the most stable and lowest error rate when estimating all 12 components.Furthermore, the proposed method had the smallest standard deviation for the 2 cases (maximums: 0.000, 0.084), as compared to non-FIVAs (0.178, 0.223), FIVAs (0.197, 0.154), non-FIVA (0.177, 0.182), FIVA (0.168, 0.204), and IVA-GL (0.232, 0.220).This suggests the robustness of our proposed method to noise and larger spatial and temporal changes.

Effects of the Number of Components
The true number of components N (here N = 12) is unknown in real situations.We investigated its impact on these IVA algorithms.More precisely, we changed N from 10 to 15 for different CNRs.Fig. 5 displays the 2 aforementioned example cases: higher noise level CNR = 5dB and lower noise level CNR = -5dB.
The 5 performance indices ̅ , ̅ , ̅ , ̅ , and ̅ were then computed.Non-FIVAs, FIVAs, non-FIVA, FIVA, and IVA-GL showed slightly higher sensitivity to the number of components than the proposed method.When examining ̅ , ̅ , ̅ , ̅ , and ̅ for the 6 approaches, we observed that these algorithms achieved better performance for N = 12, which is the true number of components.
The proposed algorithm yielded the lowest standard deviations of all performance indices for the 2 cases.

Impact of the Number of Components
Since the true number of components was unknown for experimental fMRI data, we first tested the effects of the number of components on the IVA algorithms.We changed the number of components N from 30 to 60 (reasonable ranges for extracting larger networks), and took the task-related component and DMN as 2 example components.Fig. 6 shows the results of , , and for the SM estimates of the task-related component (Fig. 6A) and the DMN (Fig. 6B) component, and Fig. 7 illustrates and for the task-related TCs. and for the task-related and DMN components, and for the task-related TC, changed with the number of components.The minimal and maximal and were mostly reached at N = 50 for all IVA algorithms.In contrast, the phase components represented by and illustrated stability with a changing number of components.As such, we used N = 50 for the following analyses.Overall, the proposed method was least influenced by the number of components for estimating the task-related and DMN components, and achieved much better performance than non-FIVAs, FIVAs, IVA-GL, non-FIVA, and FIVA in terms of , , , and their standard deviations.
We also performed paired t-test for the performance indices displayed in Fig. 6 and Fig. 7, to evaluate the difference between the proposed approach and the other algorithms.For the task-related component (Fig. 6A and Fig. 7), the smallest difference was t ≥ 4.24, p < 0.006 while the largest difference was t ≥ 55.08, p < 3 × 10 -9 .For the DMN component (Fig. 6B), the smallest difference was t ≥ 4.83, p < 0.003 while the largest difference was t ≥ 17.40, p < 3 × 10 -6 .As expected, the proposed method showed significant improvements over non-FIVAs, FIVAs, non-FIVA, and FIVA for estimating both task-related and DMN components.

Spatial Map Results
We performed one-sample t tests on each SCV estimate by the 6 IVA algorithms to find significant and consistent SM activation across 16 subjects.Fig. 8 displays means and standard deviations of and (over 20 runs) for the task-related and DMN t-maps, and example magnitudes and phases t-maps (close to the average result) for each algorithm.The t-maps were thresholded at p < 0.001 (t = 4.073; df = 15).The proposed approach showed the best performance, including the highest and , and the lowest standard deviations for both task-related and DMN t-maps.By virtue of subspace de-noising, non-FIVAs and FIVAs yielded better performance than non-FIVA and FIVA, while noncircular algorithms were better than circular ones (non-FIVAs vs. FIVAs, non-FIVA vs. FIVA).In contrast, IVA-GL obtained the lowest and , and the largest standard deviations for the task-related and DMN t-maps.When examining the example t-maps for the 2 components, as shown in Fig. 8C and 8D, we observed that the proposed approach not only extracted more contiguous and reasonable activation, but also had higher activation intensity than the other algorithms.

Shape Parameters Estimation
Fig. 9 shows a single run of estimated shape parameters for all 50 SCVs by the proposed approach from experimental fMRI data.Note for the estimates of the task-related component and DMN, the standard deviations were 0.034 and 0.007, and the t-values (p-values) were 53.54 (3.42 × 10 -22 ) and 267.09 (1.98 ×   10 -35 ).The maximum, minimum, and mean for the estimates of all 50 SCVs were 0.465, 0.266, and 0.323, respectively, and the largest SCV shape parameter change was 0.2.This confirms that SCV distributions vary for experimental fMRI data.The change is so large that a fixed shape parameter is hard to represent all SCV distributions.Therefore, it is essential for an IVA algorithm to learn shape parameters for matching each SCV distribution.

Comparison with Magnitude-only IVA-GL
To investigate whether complex-valued fMRI data can provide additional brain information beyond magnitude-only fMRI data, we compared the proposed method with the widely used real-valued IVA-GL algorithm for analyzing magnitude-only fMRI data from the same 16 subjects.We tested the performance of the real-valued IVA-GL with different numbers of components (from 20 to 60), and obtained the best results for N = 40.Hence, we selected N = 40 for magnitude-only fMRI data.The resulting SMs were zscored and thresholded at 2.5 (a typical threshold for reliably removing noisy voxels in SMs for magnitude-only methods).For our proposed method, we used |Z| ≥ 0.5 as we rely on the SM phase for removing noisy voxels in SM estimates.Fig. 10 shows the SM results for task-related and DMN components.Specifically, with values, we both showed magnitudes of averaged SM estimates over 16 subjects (i.e., averaging within an SCV) for our proposed method and averaged SM estimates across 16 subjects for magnitude-only IVA-GL.Subfigures (a), (b), and (c), display the total number of voxels, the number of voxels within and outside the GLM mask, and the number of voxels within and outside the DMN mask, respectively.
For the task-related component (Fig. 10A), our proposed approach obtained higher than magnitude-only IVA-GL (0.585 vs. 0.498), and generated many more activated voxels, not only within the GLM mask (2499 vs. 934), but also outside the mask (2888 vs. 159).Furthermore, the detected voxels outside the GLM mask by the proposed method fell still within the expected motor cortical regions, including the left and right primary motor areas, and supplementary motor areas (Fig. 10A1(c)).In contrast, the magnitude-only IVA-GL detected only a very small number of voxels outside the mask.
The results for the DMN component were similar to those of the task-related component.The proposed approach achieved much higher than the magnitude-only IVA-GL (0.882 vs. 0.682), and extracted many more activated voxels within (3999 vs. 1358) and outside (1678 vs. 58) the DMN mask but still within DMN-related regions (Fig. 10B1).In addition to the posterior cingulate cortex (PCC) (Li et al., 2013), which was detected by both methods, our proposed method extracted activation from the dorsal/ventral medial prefrontal cortex and left/right inferior parietal lobule (Li et al., 2013).Details are displayed in Fig. 10B1(b) and (c).

Discussion
Our experimental results show that SCV shape parameters frequently changed with large amplitude for experimental fMRI data.These changes can be caused by the intrinsic differences between different SCVs, by noise, or by inter-subject spatial variability for a single SCV.As such, the proposed approach was more robust to the inter-subject variability and noise effect than non-FIVAs, FIVAs, non-FIVA, FIVA, and IVA-GL.Among these 5 IVA algorithm, non-FIVAs utilized = 0.5 and achieved the best separation performance by using subspace de-noising and noncircularity.One may question whether a fixed parameter set closer to ground truth than = 0.5 could improve these results.To answer this question, we fixed the shape parameter for our algorithm to = 0.4, since the shape parameters for the 2 components of interest, i.e., the task-related component and DMN, were 0.419 and 0.426, respectively, and compared this to non-FIVAs (Fig. 11).The proposed method with = 0.4 performed slightly better than non-FIVAs (β = 0.5) in terms of , , , and , while the proposed algorithm with adaptive shape parameters yielded much lower and higher , , and , compared to that using the fixed shape parameter = 0.4.Collectively, these results suggest that the closer to ground truth the shape parameters, the better the IVA performance.A smaller change of shape parameter (0.4 vs. 0.419 and 0.426) could cause significantly degraded estimates (e.g., ) for some subjects, resulting in greatly enlarged error rates.Therefore, it is necessary to learn SCV shape parameters to obtain efficient analyses for experimental complex-valued fMRI data.
Ideally, an SCV consists of identical components from each subject, e.g., the same task-related components for 16 subjects without any inter-subject spatial variability.In this case, there is only 1 nonzero eigenvalue for the covariance matrix given in Eq. ( 9).Along this line, subspace de-noising actually removes the negative effect of inter-subject spatial variability, which can also be regarded as a kind of noise, on the nonlinear function.For experimental fMRI data, there is natural spatial variability among multiple subjects, and a dominant eigenvalue (much larger than other eigenvalues) exists for the correlation matrix of each SCV.Therefore, updating the nonlinearity in the dominant subspace not only represents SCV characteristics, but is also robust to large inter-subject variability.
The noncircular characteristics of source signals for complex-valued fMRI data have been documented in multiple previous publications (Schreier et al., 2009;Schreier and Scharf, 2010;Lin et al., 2011), with consistency between ICA performance and the degree of noncircularity for fMRI sources (Lin et al., 2011), i.e., the better the ICA estimates, the higher their degrees of noncircularity.This suggests that IVA algorithms that incorporate noncircularity for complex-valued fMRI sources could improve IVA performance.Our results for experimental fMRI data also verified the advantages of incorporating noncircularity into IVA: IVA algorithms using noncircularity were better than those considering only circularity (e.g., non-FIVAs vs. FIVAs, non-FIVA vs. FIVA).Since noisy voxels degrade the noncircularity of an SM component, the proposed method promises to extract SM components with the least noise.
Due to using adaptive shape parameters, a subspace de-noising strategy, and the noncircularity of fMRI sources, the proposed complex-valued IVA algorithm detected more contiguous and reasonable activations than the magnitude-only method for task-related (393%) and default mode (301%) SMs at |Z| ≥ 2.5.This confirms the fact that phase fMRI data also include useful and unique brain information (Adali and Calhoun, 2007;Calhoun andAdali, 2012a, 2012b;Rodriguez et al., 2011Rodriguez et al., , 2012;;Li et al., 2010Li et al., , 2011;;Yu et al., 2015), and that complex-valued fMRI data can provide additional insights beyond magnitudeonly data.Furthermore, phase fMRI data have higher noise levels than magnitude-only fMRI data, and thus a de-noising stage is required for IVA of complex-valued fMRI data.We connected the phase denoising method (Yu et al., 2015) with our method to remove noisy voxels in SM estimates post IVA.
Other candidates to accomplish this are pre-ICA de-noising methods such as quality map phase denoising (QMPD) (Rodriguez et al., 2011(Rodriguez et al., , 2012) ) or fast Fourier transform (FFT) filtering (Cong et al., 2014;Kuang et al., 2016).When omitting the post-IVA phase de-noising, all values of the IVA algorithms decreased, while the proposed method still provided the best performance for both simulated and experimental fMRI data.Therefore, the proposed method achieved superior separation performance compared to the other 5 IVA methods, regardless of whether or not they employed de-noising (pre-or post IVA).
We applied magnitude computed motion correction and spatial normalization to the phase images for experimental fMRI data.An alternative approach is applying magnitude computed motion correction and spatial normalization to the real and imaginary images followed by recalculation of the magnitude and phase images.It should be noticed that, for echo planar imaging utilized in fMRI experiments, some physically principled techniques (Dymerska et al., 2016, Hahn et al., 2009) have been developed to address inhomogeneity in the static magnetic field, and to improve the quality of complex-valued fMRI data by reducing the noise effects from such as motion and respiration.These techniques deserve future works to connect with our method for further improvements.In addition, we focused on task fMRI data in this study.Recently, IVA has found promising applications for resting-state fMRI data due to superior performance in capturing subject variability over group ICA (Ma et al., 2014;Laney et al., 2015aLaney et al., , 2015b;;Gopal et al., 2015Gopal et al., , 2016;;Calhoun and Adali, 2016).In the future, we will extend the proposed method to analyze resting-state complex-valued fMRI data.Based on the capability of extracting additional brain information as compared to the magnitude-only approach, we believe that the proposed complex-valued IVA method could be better than real-valued IVA in capturing subject variability.

Conclusions
This study proposed a new adaptive complex-valued IVA algorithm to analyze multi-subject complexvalued fMRI data.We proposed to use an MGGD-based nonlinear function with adaptively learned shape parameters to match varying SCV distributions, update the nonlinearity in the dominant SCV subspace to remove the effect of inter-subject spatial variability, and incorporate the pseudo-covariance matrix of fMRI data into the learning rule to emphasize the noncircularity of complex-valued fMRI sources.By utilizing a post-IVA phase de-noising method, we demonstrated, through experimental results of both simulated and experimental fMRI data, that the proposed approach was significantly better than typical complex-valued IVA algorithms (non-FIVAs, FIVAs, IVA-GL, non-FIVA, and FIVA).Our proposed method also extracted more contiguous and reasonable activations than the magnitude-only method IVA-GL for task-related (393%) and default mode (301%) SMs.
slice thickness = 3.5 mm, slice gap = 1 mm, number of slices = 32, matrix size = 64 × 64, TE = 29 ms, TR = 2 s, and flip angle = 70 degrees.Data preprocessing was performed using the Statistical Parametric Mapping (SPM) software package.Magnitude data were coregistered to compensate for movement in fMRI time series images.Images were then spatially normalized into standard Montreal Neurological Institute space.Following spatial normalization, the data (real and imaginary) were slightly sub-sampled, resulting in 53 × 63 × 46 voxels.Motion correction and spatial normalization parameters were calculated from the magnitude data and applied to the phase data.Both the real and imaginary images were then spatially smoothed with a 10 × 10 × 10 mm 3 full-width-at-half-maximum (FWHM) Gaussian kernel.The 3-way experimental fMRI data (59610 × 165 × 16) were formed after removing the non-brain voxels and flattening the volume image data.
variability.Component 4 (cerebellum), Component 6 (left frontoparietal area), Component 8 (supplementary motor area), and Component 10 (IPL) exhibited smaller mean correlations within their corresponding SCVs than the other components, indicating their larger inter-subject spatial variability.In addition, Components 4, 6, and 8 with lower mean correlations for TCs possessed larger inter-subject temporal variability.

Figure 1 .
Figure 1.The mean correlations between SM components within an SCV (1) and between their

Figure 3 .
Figure 3. Average shape parameters estimated by our proposed approach (over 20 runs) for simulated

Figure 6 .
Figure6.Effects of the number of components N (from 30 to 60) on the proposed approach, non-FIVAs,

Figure 7 .
Figure 7. Effects of the number of components N (from 30 to 60) on the proposed approach, non-FIVAs,

Figure 8 .
Figure 8. Means and standard deviations of and (over 20 runs) for the task-related and

Figure 9 .
Figure 9.An example of estimated shape parameters for all 50 SCVs by the proposed approach from

Figure 10 .
Figure 10.Comparison of estimated magnitude SMs averaged across 16 subjects by the proposed

Figure 11 .
Figure 11.Comparison of the proposed approach with adaptive shape parameters, the proposed approach Figure 1 Figure 5 . The minimal and maximal t-values and p-values were showed in boldface.