Dynamic Functional Connectivity in the Musical Brain

Musical training causes structural and functional changes in the brain due to its sensory-motor demands. This leads to differences in how musicians perceive and process music as compared to non-musicians, thereby providing insights into brain adaptations and plasticity. Correlational studies and network analysis investigations have indicated the presence of large-scale brain networks involved in the processing of music and have highlighted the differences between musicians and non-musicians. However, studies on functional connectivity in the brain during music listening tasks have thus far focused solely on static network analysis. Dynamic Functional Connectivity (DFC) studies have lately been found useful in unearthing meaningful, time-varying functional connectivity information in both resting-state and task-based experimental settings. In this study, we examine DFC in fMRI data obtained from two groups of participants, 18 musicians and 18 non-musicians, while they listened to a musical stimulus in a naturalistic setting. We utilize spatial Group Independent Component Analysis (ICA), sliding time window correlations, and a deterministic agglomerative clustering of windowed correlation matrices to identify quasi-stable Functional Connectivity (FC) states in the two groups. To compute cluster centroids that represent FC states, we devise and present a method that primarily utilizes windowed correlation matrices occurring repeatedly over time and across participants, while excluding matrices corresponding to spontaneous fluctuations. Preliminary analysis indicates states with greater visuo-sensorimotor integration in musicians, a larger presence of DMN states in non-musicians, and variability in the states found in musicians due to differences in training and prior experiences.


Introduction
Professional musicians typically undergo an intensive formal training period that lasts several years. The training is followed by consistent practice and performance, often running into several hours per week. This intensive sensory-motor training causes structural [10,12] and functional [4,9] changes in the brain. Musicians also have different cerebral characteristics which correlate with the age of commencement of training and also the intensity/frequency of training [13]. This makes music a great tool to study brain adaptation, and musicians an ideal group to study brain changes driven by experience, especially when contrasted with a non-musician group.
Moreover, since music is inherently multidimensional in nature, there has been an increased focus on the use of naturalistic stimuli in continuous music listening settings (emulating real-life listening experiences). These investigations have been shown to present a more holistic picture of the neural underpinnings of music processing than those performed in controlled auditory settings, where musical features are often presented in isolation and manipulated artificially. Correlational studies following this paradigm have indicated the presence of large-scale brain networks in musicians (involving the recruitment of cognitive areas of the cerebellum, sensory and DMN cerebrocortical areas, and motor and emotion-related circuits) that are involved in the processing of musical features like timbre, rhythm and tone [3,6,18]. Furthermore, network studies have also been conducted to highlight functional networks and key hubs recruited during music listening [19,14]. In a study which is more in line with our work, static whole-brain functional connectivity analyses revealed group differences between musicians and non-musicians, with the primary hubs of the musicians consisting of the cerebral and cerebellar sensorimotor regions, and those of the non-musicians consisting of DMN-related regions [2]. Network investigations in this domain have thus far been restricted to static Functional Connectivity (FC) analysis.
However, assessment of FC in these studies has largely been limited by an assumption of spatial and temporal stationarity throughout the fMRI scan period. While this presents a simple template for static whole-brain connectivity analysis, it comes at the cost of an inability to study FC patterns across scan timecourses. To enable dynamic temporal analysis, researchers have suggested various methods to identify and characterize FC states, leading to interesting findings in FC patterns over time in task-based and resting-state analyses [17,11,5]. In this study, we utilize and extend the theoretical model and framework proposed by Allen et al. [1] on fMRI data obtained in a task-free, continuous music listening setting.
We begin by identifying Intrinsic Connectivity Networks (ICNs) using a group-level (musicians and non-musicians) ICA analysis on the fMRI data. We perform sliding window correlation computations on the time courses of the back-reconstructed ICNs. Finally, to identify quasi-stable states which repeat across participants in the group, we adopt an agglomerative clustering approach to cluster windowed correlation matrices across all participants of the group. In earlier work [1,8], all of the matrices were used in the computation of centroids. Deviating from this, we hypothesize that FC states are of two types: ones which recur over time and across participants, and ones which are spontaneous fluctuations that do not represent generalizable group characteristics. To account for this, we include a step to identify and select matrices which repeat over time and occur across participants, and exclude outliers corresponding to subject-specific activations and spontaneous fluctuations. We then find community structures in these FC states through the Louvain modularity-maximization method.

Fig. 1: An overview of our study.

Participants, Stimulus, fMRI data acquisition
The participant pool consisted of 18 musically trained (9 female, mean age: 28) and 18 untrained (10 female, mean age: 29) participants. Both groups were comparable with respect to cognitive measures (WAIS-WMS III scores) and socioeconomic status (Hollingshead's FFI). The total number of years of training for musicians was 16 ± 5.7 years. The number of hours spent practicing music on average per week was 16.6 ± 11. The data was collected as part of a broader project ("Tunteet") involving other tests (neuroimaging and neurophysiological measures). The study protocol was approved by the ethics committee of the Coordinating Board of the Helsinki and Uusimaa Hospital District. Written consent was obtained from all the participants. They were asked to listen to an instrumental nuevo tango piece, Adios Nonino by Astor Piazzolla. This piece contained a high amount of variation in acoustic features such as timbre, tonality and rhythm, and was 8 minutes in duration. Participants' brain responses were acquired while they listened to the musical stimulus. Their only task was to attentively listen to the music delivered via MR-compatible insert earphones. MRI data was collected at the Advanced Magnetic Imaging Centre, Aalto University, Finland, on a 3T Siemens Skyra, TR = 2 s, TE = 32 ms, whole brain, voxel size: 2 × 2 × 2 mm³, 33 slices, FoV: 192 mm (64 × 64 matrix), interslice skip = 0 mm. fMRI scans were preprocessed in Matlab using SPM8, VBM5 and custom scripts. Normalization to the MNI segmented tissue template was carried out. Head movement related components were regressed out, followed by spline interpolation and filtering. Then, the voxel time series were Z-scored.

Group ICA and Postprocessing
Functional data from both the groups were separately analyzed using spatial Group ICA (GICA) implemented in the GIFT toolbox [7]. We chose not to group data from both groups before GICA as that would result in reduced sensitivity to between-group differences [15], more so when finding statistically significant group differences is not the objective of our work. A subject-level PCA step was first used to reduce 232 time point data (464 seconds of music at TR = 2 s) into 180 dimensions. This data was concatenated across time (over subjects) and a group PCA step reduced this stacked matrix into 100 components. 100 independent components (aggregated across 10 runs) were obtained from the group PCA reduced matrix using the Infomax algorithm. Per-participant spatial maps (SMs) and time courses (TCs) were obtained using the spatiotemporal regression back-reconstruction approach [7]. Per-participant SMs and TCs underwent post-processing as described in [1]. ICNs were identified using thresholded one-sample t-test maps, resulting in C_mus = 42 and C_nmus = 43 ICNs (stability index Iq > 0.9) chosen out of the 100 independent components. The ICNs were grouped into 7 groups indicative of Subcortical (SC), Auditory (AU), Sensorimotor (SM), Visual (VI), Cognitive Control (CC), Default Mode Network (DM), and Cerebellar (CE) regions as shown in Fig. 2.
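The two-stage data reduction described above (subject-level PCA, temporal concatenation, group PCA) can be illustrated with a minimal NumPy sketch. It is not the GIFT implementation: the toy dimensions, the helper name `pca_reduce`, and the choice to remove each row's mean before the decomposition are all assumptions made for illustration, and the ICA step itself is not shown.

```python
import numpy as np

def pca_reduce(data, n_components):
    """Reduce the row (time) dimension of a (rows x voxels) matrix with PCA."""
    # Assumption: remove the mean of each row before the decomposition.
    centered = data - data.mean(axis=1, keepdims=True)
    # Left singular vectors span the principal directions along the row axis.
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components].T @ centered  # (n_components x voxels)

# Toy sizes (hypothetical); the study reports 232 time points, 180 and 100.
rng = np.random.default_rng(0)
n_subjects, n_timepoints, n_voxels = 4, 40, 500
subject_dims, group_dims = 20, 8

# Stage 1: reduce each subject's (time x voxels) data separately.
reduced = [
    pca_reduce(rng.standard_normal((n_timepoints, n_voxels)), subject_dims)
    for _ in range(n_subjects)
]

# Stage 2: concatenate over subjects along the reduced time axis,
# then reduce the stacked matrix again at the group level.
stacked = np.vstack(reduced)
group_reduced = pca_reduce(stacked, group_dims)

print(stacked.shape, group_reduced.shape)
```

In a full pipeline, the group-reduced matrix would then be passed to an ICA algorithm to estimate the independent components.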

DFC and Clustering
As in Allen et al. and Damaraju et al. [1,8], for each subject i = 1, ..., N, we estimate Dynamic FC using a sliding window approach, where covariance matrices are computed from windowed segments of R_i (Fig. 1). We utilize a tapered rectangular sliding window (Fig. 1) of 30 TRs, slid in steps of 1 TR, and convolved with a Gaussian of σ = 3 TRs (to obtain edge tapering), resulting in 202 windows per participant. Covariance was estimated from the regularized inverse covariance matrix (ICOV) using the graphical LASSO framework. An additional L1 norm constraint was imposed on the covariance matrix to enforce sparsity. After computing DFC values for each subject, these covariance values were Fisher-Z transformed. These matrices are henceforth referred to as correlation matrices.
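A minimal sketch of the tapered sliding-window computation follows. It deliberately swaps the graphical LASSO estimator for a plain weighted Pearson correlation, so it only illustrates the windowing and the Fisher-Z step, not the regularized estimation. The function names, the kernel truncation at 3σ, and the window count of T − width (chosen so that 232 time points give the 202 windows reported above) are assumptions.

```python
import numpy as np

def gaussian_kernel(sigma=3.0):
    """Normalized Gaussian kernel truncated at 3 sigma (assumption)."""
    half = int(3 * sigma)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    return kernel / kernel.sum()

def sliding_window_fc(tc, width=30, sigma=3.0):
    """tc: (n_components, n_timepoints) ICN time courses.

    Returns (n_windows, n_pairs) Fisher-Z transformed weighted correlations.
    """
    n_comp, n_time = tc.shape
    kernel = gaussian_kernel(sigma)
    upper = np.triu_indices(n_comp, k=1)  # unique ICN pairs
    rows = []
    for start in range(n_time - width):
        # Rectangular window convolved with a Gaussian to taper its edges.
        rect = np.zeros(n_time)
        rect[start:start + width] = 1.0
        w = np.convolve(rect, kernel, mode="same")
        w /= w.sum()
        # Weighted correlation (stand-in for the graphical LASSO estimate).
        centered = tc - (tc @ w)[:, None]
        cov = (centered * w) @ centered.T
        sd = np.sqrt(np.diag(cov))
        corr = cov / np.outer(sd, sd)
        # Fisher-Z transform; clip to keep arctanh finite.
        rows.append(np.arctanh(np.clip(corr[upper], -0.999999, 0.999999)))
    return np.array(rows)

rng = np.random.default_rng(0)
fc = sliding_window_fc(rng.standard_normal((5, 232)))
print(fc.shape)
```

Each row of `fc` is the vectorized upper triangle of one windowed matrix, which is the form used as input to the clustering step.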
As many DFC patterns recur within subjects across time, and also occur across subjects, we performed a group-level (musicians and non-musicians) clustering analysis to identify the states represented by these recurring patterns. Per group, we cluster all the 3636 (18 subjects × 202 matrices = 3636) correlation matrices computed earlier. We deviate from prior work with regard to the clustering technique and the distance metric used. We chose to adopt Agglomerative Clustering with complete linkage, using cosine distance. Agglomerative Clustering is a deterministic method which has the added advantage of providing a dendrogram to visualize cluster spreads and the hierarchy leading to the formation of FC states. Euclidean distance metrics do not lend themselves well to a sparse, high-dimensional setting [20] (in our setting, we have 3636 vectors, each of which is 903-dimensional, since $\binom{43}{2} = 903$). A cosine distance metric is better suited in such cases. We also wanted to capture similar states across the entire group, and not long chains of FC windows from individual participants (the chosen time step of 1 TR leads to high autocorrelation in the FC timeseries). Hence we utilized complete linkage as against single/average linkage (Ward's method is ruled out due to our choice of a cosine distance metric). The optimal number of clusters (k) was determined using the standard elbow criterion. We also validate our choice by visually inspecting the clustering using the dendrogram. At an optimal k, splitting the dendrogram at a lower position (greater k value) gives rise to repeated centroid states at the same level in the tree. Using this method we find 4 clusters in the musicians and 3 in the non-musicians.
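The clustering step can be sketched with SciPy's hierarchical clustering routines, using complete linkage and cosine distance as described above. The synthetic data, the prototype construction, and the choice of k = 3 are assumptions for illustration only; the merge heights stored in the linkage matrix are one quantity that could be inspected when applying an elbow criterion.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in for vectorized windowed correlation matrices:
# three hypothetical "states", each observed 40 times with small noise.
rng = np.random.default_rng(0)
n_dims = 50
prototypes = rng.standard_normal((3, n_dims))
windows = np.vstack(
    [p + 0.1 * rng.standard_normal((40, n_dims)) for p in prototypes]
)

# Deterministic agglomerative clustering: complete linkage, cosine distance.
tree = linkage(windows, method="complete", metric="cosine")

# Cut the dendrogram into k clusters (k = 3 is assumed here).
labels = fcluster(tree, t=3, criterion="maxclust")

# Heights of the last merges; a large jump suggests a natural cut point.
merge_heights = tree[:, 2]
print(np.unique(labels), merge_heights[-4:])
```

Because the linkage is computed from the full pairwise distance matrix without random initialization, repeated runs on the same data give the same tree, which is the determinism the text refers to.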
Each of these k clusters is composed of two broad types of correlation matrices: those which recur over time and across participants (these are precisely the matrices which are indicative of quasi-stable FC states and should be included in centroid computation) and those which correspond to subject-specific activations and spontaneous fluctuations (which should ideally be excluded from centroid computation). We perform the following steps:
1. At this stage, in each group, the windowed correlation matrix timeseries of each subject is composed of strips of contiguous matrices belonging to one of the k clusters. For subject i, we denote the 202 matrices as $m_{i,1}, m_{i,2}, \ldots, m_{i,202}$. We consider strips with at least 10 (chosen empirically) contiguous correlation matrices belonging to the same cluster and denote them per subject as $s_{i1}, \ldots, s_{iJ}$, where J such strips exist for participant i. We denote the median correlation matrix (ordered by time) of each such strip $s_{ij}$ as $med_{ij}$. For example, $m_{3,17}$ to $m_{3,37}$ could belong to cluster k = 2 for subject 3 in the musicians group. The median matrix $med_{3j}$ for this strip $s_{3j}$ would be $m_{3,27}$.
2. To get a better estimate of the cluster center for cluster k, we choose one median matrix per subject, such that the pairwise sum of the cosine distances between the chosen matrices is minimized. Formally, we choose median matrices $med_{1j}, \ldots, med_{N_{max}j}$, one per subject (for each subject i, $med_{ij}$ could belong to any chosen strip $s_{ij}$ containing ≥ 10 contiguous matrices), such that
$$\sum_{a=1}^{N_{max}-1} \sum_{b=a+1}^{N_{max}} \mathrm{CosineDistance}(med_{aj}, med_{bj})$$
is minimized. Here, $N_{max}$ is the number of subjects with at least one strip containing ≥ 10 contiguous matrices in cluster k. By considering the median (ordered by time) matrix, which corresponds to a window in time when the FC state was most stable, we ensure that we are closer to the true cluster center. By not weighting the medians with the length of the strips (number of matrices in the strip), we ensure that the objective function does not end up solely selecting small-length strips.
3. We compute the center of the cluster k as the mean of the chosen median matrices. We then sort all the matrices (from all subjects) belonging to cluster k based on their cosine distances from the computed center.
4. We consider an appropriate percentile (chosen based on the first derivative of the distance series, i.e. the rate of change of distance from the cluster center) of matrices from the above sorted order and use these windowed correlation matrices for centroid computation. The cluster centroids thus computed indicate quasi-stable FC states.
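The four steps above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the text does not say how the minimization in step 2 is carried out, so the sketch uses an exhaustive search that is only practical for small toy inputs, and it replaces the derivative-based percentile choice in step 4 with a fixed, hypothetical `keep_fraction` parameter. All function names are assumptions.

```python
import itertools
import numpy as np

def cosine_distance(a, b):
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def strip_medians(labels, cluster, min_len=10):
    """Step 1: temporal-middle window index of each run of >= min_len
    contiguous windows assigned to `cluster`."""
    medians, start = [], None
    for t, lab in enumerate(list(labels) + [None]):  # sentinel closes last run
        if lab == cluster and start is None:
            start = t
        elif lab != cluster and start is not None:
            if t - start >= min_len:
                medians.append(start + (t - start - 1) // 2)
            start = None
    return medians

def choose_medians(candidates):
    """Step 2: pick one candidate vector per subject so that the summed
    pairwise cosine distance is minimal (exhaustive search, toy sizes only)."""
    best, best_cost = None, np.inf
    for combo in itertools.product(*candidates):
        cost = sum(cosine_distance(a, b)
                   for a, b in itertools.combinations(combo, 2))
        if cost < best_cost:
            best, best_cost = combo, cost
    return np.array(best)

def refined_centroid(windows, chosen, keep_fraction=0.8):
    """Steps 3-4: center = mean of chosen medians; keep the closest windows
    (fixed fraction here, an assumption) and average them."""
    center = chosen.mean(axis=0)
    dist = np.array([cosine_distance(w, center) for w in windows])
    n_keep = max(1, int(round(keep_fraction * len(windows))))
    return windows[np.argsort(dist)[:n_keep]].mean(axis=0)

# The worked example from the text: windows 17..37 of one subject in cluster 2.
demo_medians = strip_medians([0] * 17 + [2] * 21 + [0] * 5, cluster=2)
print(demo_medians)
```

For realistic group sizes the exhaustive search grows exponentially with the number of subjects, so a heuristic or iterative scheme would be needed in practice.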
The centroids of clusters which contain most of the data (states 1 and 2 for both Mus and NMus) are fully reproducible across bootstrap resamples of participants. To find community structures, the cluster centroids, which are indicative of recurring DFC states, were partitioned into modules using multiple runs of the Louvain modularity-maximization algorithm.
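As a small illustration of the quantity that the Louvain algorithm seeks to maximize, the sketch below computes the Newman modularity Q of a given partition of a weighted, undirected graph. It is not a Louvain implementation, and it assumes non-negative weights; the text does not state how negative centroid correlations were handled, so that treatment is left out. The function name and the toy graph are assumptions.

```python
import numpy as np

def modularity(weights, labels):
    """Newman modularity Q of a partition of an undirected weighted graph.

    weights: symmetric (n x n) matrix with non-negative entries (assumption).
    labels:  length-n array of module assignments.
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    degree = weights.sum(axis=1)
    total = degree.sum()  # equals 2m, twice the total edge weight
    same_module = labels[:, None] == labels[None, :]
    expected = np.outer(degree, degree) / total
    return float(((weights - expected) * same_module).sum() / total)

# Toy graph: two disconnected triangles.
adjacency = np.zeros((6, 6))
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                adjacency[i, j] = 1.0

q_split = modularity(adjacency, [0, 0, 0, 1, 1, 1])   # one module per triangle
q_single = modularity(adjacency, [0, 0, 0, 0, 0, 0])  # everything in one module
print(q_split, q_single)
```

Running a modularity-maximization algorithm several times, as described above, guards against differences between runs that arise from its randomized node ordering.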

Results and Discussion
Preliminary analysis reveals that, overall, we observe more states in the musicians group, and a higher NSUB value (number of subjects who have at least one window in that cluster) on average for the non-musicians. This could plausibly be attributed to greater similarities in listening strategies among non-musicians (simpler sensory bottom-up listening), as against musicians, who utilize fine-tuned top-down analytic listening strategies depending on their varied training methods, leading to differences in underlying neural correlates.
In the most common state (state 1) of both the groups, for musicians we find a coupling between the visual and the sensorimotor regions (both are parts of module 2), while in non-musicians, the auditory regions and the sensorimotor regions lie in the same module (parts of module 2) with the visual regions lying in a separate module 3. This is in line with the action-perception coupling found in the musicians, wherein experience with a sensorimotor task such as instrument playing leads to a strong coupling of sensory (visual/auditory) and motor regions [16]. Studies have shown correlations in activations in sensorimotor regions and visual representation of music/instrument playing. On the other hand, coupling between the auditory and sensorimotor regions in non-musicians can be attributed to reactionary responses to acoustic features such as rhythm.

Fig. 4: Amplitude of FC oscillations. The higher end of the scale indicates greater variability in connectivity over the course of time between the corresponding ICNs.
For both groups, state 1 lacks coupling of the DMN regions, indicating that this state represents most of the active listening period with a high cognitive load.
For musicians, it can be suggested that state 3 is representative of the DMN-related default state (DMN regions occur in the same module 3), corresponding to times of low cognitive load. State 2 and state 4 could be hypothesized to correspond to other states of active music listening, where state 4 is observable in fewer subjects (training/prior experience dependent). State 2 presents greater integration of the auditory, visual and sensorimotor regions (module 1) as compared to state 1. State 4 shows a few differences from state 1: the subcortical regions, along with the putamen/angular gyri and the inferior/superior frontal regions, are grouped together (module 3).
For non-musicians, state 2 is primarily indicative of the grouping of DMN regions together (module 1), suggesting that this state corresponds to times of low cognitive load. The presence of this state for a reasonably large percentage of time, as against in musicians, indicates a greater tendency to fall back to the default state in non-musicians in times of low auditory cognitive load. State 3 exhibits a separate module for visual regions (module 3), and also coupling of DMN regions. This, along with small negative correlations of the visual regions with the other regions in the centroid FC matrix, calls for further investigation.
In terms of temporal variability, musicians showed a larger number of ICN pairs that were stable (less variable over the time course), primarily associated with the Subcortical, Auditory, DMN and Cerebellar regions. Non-musicians exhibited a large number of fluctuating (more variable) connections between ICN pairs, primarily associated with the DMN, Visual and Sensorimotor regions.
To conclude, we are the first to analyze DFC in musicians and non-musicians in fMRI data collected in a naturalistic setting. We utilize spatial GICA, sliding time window correlations, and a deterministic agglomerative clustering of windowed correlation matrices to identify quasi-stable FC states in the two groups.

Fig. 2: Lateral views of the left (at top) and right (at bottom) hemispheres for both the groups indicating the spatial maps of the ICNs grouped as indicated (views show the union of the ICNs for each grouping). The numbers indicate the number of selected ICNs which belong to that group. Most of the regions are common to both the Mus and NMus groups and are indicated in the middle row.

Fig. 3: Correlation heatmaps and Modularity partitions for the cluster centroids ordered by their percentage of occurrence. MCD values indicate the mean cosine distance of all the points in the cluster from the centroid. NSUB values indicate the number of subjects who have at least one window in that cluster. NMOD indicates the number of modules in the Modularity partition.
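The per-cluster summary values named in the caption (percentage of occurrence, MCD and NSUB) can be computed along the following lines. This is a hedged sketch on toy data: the function name and input layout are assumptions, and the centroid here is the plain mean of the cluster members rather than the refined centroid described in the clustering section.

```python
import numpy as np

def cluster_summary(windows, labels, subjects, cluster):
    """Summaries for one cluster.

    windows:  (n_windows, n_pairs) vectorized correlation matrices.
    labels:   cluster label of each window.
    subjects: subject index of each window.
    """
    mask = labels == cluster
    members = windows[mask]
    centroid = members.mean(axis=0)  # plain mean (assumption)
    cos = members @ centroid / (
        np.linalg.norm(members, axis=1) * np.linalg.norm(centroid)
    )
    return {
        "occurrence_pct": 100.0 * float(mask.mean()),
        "MCD": float(np.mean(1.0 - cos)),                # mean cosine distance
        "NSUB": int(np.unique(subjects[mask]).size),     # subjects represented
    }

# Toy data: 6 windows from 3 subjects, two clusters.
windows = np.array([[1.0, 0.1, 0.0],
                    [0.9, 0.2, 0.1],
                    [1.0, 0.0, 0.1],
                    [0.0, 1.0, 0.9],
                    [0.1, 0.9, 1.0],
                    [0.0, 1.1, 1.0]])
labels = np.array([1, 1, 1, 2, 2, 2])
subjects = np.array([0, 0, 1, 1, 2, 2])

summary = cluster_summary(windows, labels, subjects, cluster=1)
print(summary)
```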