\documentclass[12pt]{iopart}
%Uncomment next line if AMS fonts required
\usepackage{iopams}
\usepackage{amssymb}
\usepackage{amsfonts}
\usepackage{graphics}
\usepackage{amsthm}
\usepackage{graphicx}
\usepackage{pifont}
\usepackage[]{algorithm2e}
\usepackage{pdflscape}
\usepackage{floatpag,mwe}
\usepackage{changepage}
\usepackage{adjustbox}
\usepackage{parskip}
\usepackage{url}
\bibliographystyle{acm}
\begin{document}
\title[Sparse coupled logistic regression for dynamic FC mapping]{Sparse coupled logistic regression to estimate co-activation and modulatory influences of brain regions}
\author{Thomas A. W. Bolton$^{1,2}$ \& Dimitri Van De Ville$^{1,2}$}
\address{$^1$ Institute of Bioengineering, \'{E}cole Polytechnique F\'{e}d\'{e}rale de Lausanne (EPFL), Lausanne, Switzerland \\ $^2$ Department of Radiology and Medical Informatics, University of Geneva (UNIGE), Geneva, Switzerland}
\ead{thomas.bolton@epfl.ch}
\vspace{10pt}
%\begin{indented}
%\item[]November 2019
%\end{indented}
\begin{abstract}
Accurate mapping of the functional interactions between remote brain areas with resting-state functional magnetic resonance imaging requires the quantification of their underlying dynamics. In conventional methodological pipelines, a spatial scale of interest is first selected, and dynamic analysis then proceeds at this hypothesised level of complexity. If large-scale functional networks or states are studied, more local regional rearrangements are then not described, potentially missing important information.
Here, we propose a novel mathematical framework that jointly estimates resting-state functional networks and spatially more localised cross-regional modulations. To do so, the changes in activity of each brain region are modelled by a logistic regression including co-activation coefficients (reflective of network assignment, as they highlight simultaneous activations across areas) and causal interplays (denoting finer regional cross-talk, when one region active at time $t$ modulates the $t$ to $t+1$ transition likelihood of another). A two-parameter $\ell_1$ regularisation scheme is used to make these two sets of coefficients sparse: one parameter controls overall sparsity, while the other governs the trade-off between co-activations and causal interplays, enabling a proper fit to the data despite the a priori unknown balance between both types of couplings.
On simulated time courses, we show that the framework successfully retrieves the two types of cross-regional interactions at once. On experimental data, we reveal that the resting brain operates with a mix of both processes. Our methodological pipeline offers a conceptually elegant alternative for the assessment of functional brain dynamics, and can be downloaded at \url{https://c4science.ch/source/Regional_SCHMM.git}.
\end{abstract}
\vspace{2pc}
\noindent{\it Keywords}: dynamic functional connectivity, effective connectivity, logistic regression, $\ell_1$ regularisation
% Uncomment for Submitted to journal title message
\submitto{\JNE}
% Uncomment if a separate title page is required
%\maketitle
% For two-column output uncomment the next line and choose [10pt] rather than [12pt] in the \documentclass declaration
%\ioptwocol
\clearpage
%%%%%%%%%%% INTRODUCTION
\section{Introduction}
Understanding the structural wiring of the brain at its most global scale, and how information flows between remote processing centres, are essential questions to shed light on higher-order behaviours involving multi-modal integration and associated brain disorders. When it comes to functional magnetic resonance imaging (fMRI), the mapping of brain function is commonly performed from resting-state (RS) recordings through the computation of \textit{functional connectivity} (FC), that is, the statistical interdependence between different time courses reflective of regional activity~\cite{Friston1994b}, as can be assessed from an array of measures~\cite{Smith2010}. This approach has revealed the presence of a set of RS networks (RSNs)~\cite{Damoiseaux2006,Power2010,Yeo2011}, whose properties are critical landmarks of brain function and cognition~\cite{Bressler2010,VanDenHeuvel2010}.
Over the past decade, it has become increasingly clear that quantifying FC between two brain regions as one scalar for a full scanning session is an overly simplistic approach that does not characterise the numerous reconfigurations that occur at the time scale of seconds~\cite{Chang2010}. Accordingly, many methodological pipelines have been developed to dig into time-resolved FC, and map brain function dynamically (see~\cite{Preti2017,Lurie2018} for contemporary reviews).
The most prominent family of dynamic approaches simplifies the originally voxel-wise fMRI data into a state-level representation: first, FC is computed over successive temporal sub-windows, and the data concatenated across the full subject population at hand is subjected to hard clustering to yield a set of dynamic FC (dFC) states~\cite{Allen2014,Damaraju2014}. Because spatial Independent Component Analysis (ICA) is typically performed prior to clustering, each state stands for a set of RSNs showing specific correlational relationships.
In other analytical schemes, whole-brain voxelwise activity~\cite{Liu2013b}, or activity transients~\cite{Karahanoglu2015}, undergo clustering instead of FC patterns; in this case, each of the retrieved centroids directly stands for an RSN. If temporal ICA is applied after spatial ICA, temporally mutually independent RSNs are retrieved~\cite{Smith2012}. Finally, the use of a hidden Markov model (HMM) also enables the derivation of RSNs, represented in the form of (sparse) FC patterns~\cite{Eavani2013,Vidaurre2017} or vectors of activation~\cite{Chen2016d}.
In all the above cases, there is the underlying assumption that the raw fMRI data can be downscaled to a set of RSNs, and that the dynamics of brain function should be understood from this simplified starting point. Recent results, however, question the validity of this assumption: for instance, some brain regions do not remain attached to the same network throughout a scanning session, but instead adjust their modular allegiance over time in a way that relates to cognitive performance~\cite{Chen2016,Pedersen2018b}. In addition, brain regions or networks also morph spatially over time~\cite{Kiviniemi2011,Kottaram2018,Iraji2019}.
To capture these spatially more subtle reconfigurations, novel methodologies have attempted to operate at the regional scale, and the assessment of \textit{causal} relationships (\textit{i.e.}, from time $t$ to $t+1$) between distinct areas showed particular merits as an alternative conceptualisation of RS functional brain dynamics, be it through autoregressive models~\cite{Liegeois2017,Lennartz2018} or Ornstein-Uhlenbeck processes~\cite{Gilson2016}.
At present, there are thus two conceptually discrepant ways to view RS dFC: on the one hand, expressing it as sets of simultaneously activating regions that make networks, and on the other hand, viewing it as effective connectivity between individual areas. It remains to be determined which of these two viewpoints offers the best representation of RS dynamics, and whether they describe overlapping or distinct facets of the data.
In this work, we have attempted to progress in answering these questions by developing a novel methodological framework that jointly estimates sets of co-activations and causal couplings between individual brain regions. A dedicated parameter also modulates the trade-off in data fitting between these two viewpoints.
\clearpage
%%%%%%%%%%% METHODS
\section{Materials and Methods}
\subsection{Mathematical framework}
Let us denote the activity of a region $r$ (out of $R$ in total) at time $t$ as $h_t^{(r)}$. We hypothesise two possible states of activity: \textit{baseline} ($h_t^{(r)}=0$) or \textit{active} ($h_t^{(r)}=+1$). Each region may interact with all the other areas $s\neq r$ in two ways: (1) showing simultaneous activity (that is, episodes of co-activation), or (2) being causally modulated. To jointly describe these two phenomena, we characterise the probability of a region $r$ to switch between activity states as a logistic regression~\cite{Friedman2010}:
\begin{equation}
\left\{
\begin{array}{ll}
\mathcal{P}(h_{t+1}^{(r)}=+1 | h_{t}^{(r)}=0, \mathbf{h}_{t}^{\mathbf{(-r)}}, \mathbf{h}_{t+1}^{\mathbf{(-r)}}) = \frac{1}{1 + e^{-(\alpha_A^{(r)} + \boldsymbol{\gamma}_{A}^{(r)\top} \mathbf{h}_{t+1}^{\mathbf{(-r)}} + \boldsymbol{\beta}_{A}^{(r)\top} \mathbf{h}_{t}^{\mathbf{(-r)}})}}\\
\mathcal{P}(h_{t+1}^{(r)}=0 | h_{t}^{(r)}=+1, \mathbf{h}_{t}^{\mathbf{(-r)}}, \mathbf{h}_{t+1}^{\mathbf{(-r)}}) = \frac{1}{1 + e^{-(\alpha _D^{(r)}+ \boldsymbol{\gamma}_{D}^{(r)\top} \mathbf{h}_{t+1}^{\mathbf{(-r)}} + \boldsymbol{\beta}_{D}^{(r)\top} \mathbf{h}_{t}^{\mathbf{(-r)}})}}
\end{array}.
\right.
\end{equation}
The baseline-to-active transition is modelled by the first equation, while the return to baseline from an active state is governed by the second. Associated coefficients are respectively written with the $\cdot_{A}$ and $\cdot_{D}$ subscripts. In what follows, for the sake of clarity, we will omit these subscripts and only show one set of equations, as the formulations are strictly equivalent for both types of transitions.
If all other regions are at a baseline level of activity at the start ($\mathbf{h}_{t}^{\mathbf{(-r)}}=\mathbf{0}$) and end ($\mathbf{h}_{t+1}^{\mathbf{(-r)}}=\mathbf{0}$) of the transition, only the scalar coefficient $\alpha^{(r)}$ plays a role in shaping the transition likelihood. The vector $\boldsymbol{\gamma}^{(r)}\in\mathbb{R}^{R-1}$ contains the co-activation coefficients for all regions $s\neq r$: positive-valued coefficients will enhance the likelihood of the transition of interest if $h_{t+1}^{(s)}=+1$ (that is, if regions $r$ and $s$ are co-active at time $t+1$). Negative-valued coefficients will, likewise, reduce the transition probability. The reasoning is similar for the vector $\boldsymbol{\beta}^{(r)}\in\mathbb{R}^{R-1}$, except that a modulatory effect is then exerted if $h_t^{(s)}=+1$ (\textit{i.e.}, region $s$ is active before the transition, resulting in a causal modulation instead of a co-activation).
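As a concrete illustration, the transition probability of Equation (1) can be evaluated directly from a set of coefficients. The following Python sketch (function name and toy values are ours, not part of the released toolbox) mirrors the baseline-to-active case:

```python
import numpy as np

def transition_prob(alpha, gamma, beta, h_t_other, h_tp1_other):
    """Probability of an activity-state switch for region r (Eq. 1).

    alpha       : scalar intrinsic transition coefficient alpha^(r)
    gamma, beta : (R-1,) co-activation and causal coefficient vectors
    h_t_other   : (R-1,) binary activity of the other regions at time t
    h_tp1_other : (R-1,) binary activity of the other regions at time t+1
    """
    x = alpha + gamma @ h_tp1_other + beta @ h_t_other
    return 1.0 / (1.0 + np.exp(-x))

# With all other regions at baseline at both time points, only alpha
# shapes the transition likelihood:
p0 = transition_prob(-1.0, np.zeros(3), np.zeros(3), np.zeros(3), np.zeros(3))
```

A positive entry in `gamma` raises this probability whenever the corresponding region is active at $t+1$, and a positive entry in `beta` does so when that region is active at $t$, matching the interpretation given above.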
If the above pair of equations is considered for each brain region, the resulting coefficients can be arranged in two types of matrices, where the $r$\textsuperscript{th} column contains the influences onto region $r$ (diagonal elements are left empty): one type is reflective of co-activations, and will be termed $\mathbf{\Gamma}$; the other symbolises causal modulations, and will be referred to as $\mathbf{B}$. $\mathbf{\Gamma}$ and $\mathbf{B}$ can respectively be interpreted as equivalents of the functional connectome and the effective connectome. An overview of our framework is provided in Figure~\ref{JNE_F1}.
%%%%%%%%%%%%%
\begin{figure}[h!]
\centering
\includegraphics[width=1.0\textwidth]{Figures/JNE_FIG1.eps}
\caption[Overview of the framework]{\scriptsize\textbf{Overview of the framework.} \textbf{(A)} Example activity time courses for a set of 14 regions; each can transition between a baseline state of activity (symbolised by a grey circle) and an active state (red circle). The green, salmon and blue underlays highlight the regions that belong to the same RSN, and thus exhibit similar transition dynamics. Regions 10 to 12 evolve according to their own dynamics, independent of all the others. As for regions 13 and 14, they are \textit{hubs} that belong to two networks at a time (as rendered by the mixed colour underlay), and thus turn active as soon as one of their affiliated networks does so. \textbf{(B)} Coefficient matrices associated with the example presented in {(A)} for co-activations (top row) and causal modulations (bottom row). The left column pertains to the transition from the baseline to the active state: a positive-valued coefficient at $l,r$ means that when region $l$ is active, it enhances the likelihood of a transition for region $r$ at the same time point (for co-activations) or one time point later (for causal modulations). The middle column similarly characterises transitions from the active to the baseline state; thus, modulations that enhance the overall activity of an area are here reflected by negative-valued coefficients (\textit{i.e.}, the probability to go down in activity is lowered). The right column yields total influences summed across both transition types. \textbf{(C)} To solve the framework, optimal regularisation parameters $\lambda$ and $\xi$ are first determined by extracting local maxima of the full likelihood across regions and transition types (top box). Then, co-activation and causal coefficients are computed for each region $r$ and transition type (middle box). 
Finally, the likelihood to switch activity state can be compared with and without an external region's influence, to compute a pair-wise probabilistic modulation coefficient (bottom box). R: region. N: network.}
\label{JNE_F1}
\end{figure}
%%%%%%%%%%%%%
The concomitant modelling of co-activations and causal modulations enables the joint derivation of the two sets of coefficients. Given that the resting brain is often described as a series of RSNs~\cite{Damoiseaux2006,Power2010,Yeo2011}, we expect $\boldsymbol{\Gamma}$ to contain only a sparse subset of non-null entries. Similarly, only a restricted number of areas or networks are expected to causally modulate each other~\cite{Christoff2016,Bolton2017b}. To fit these neurobiological priors, while also enabling convergence of the framework with fewer data points, we appended an $\ell_1$ regularisation term:
\begin{equation}
\xi ||\boldsymbol{\gamma}^{(r)}||_1 + (1-\xi) ||\boldsymbol{\beta}^{(r)}||_1 < \rho \quad \forall \quad r = 1,...,R.
\end{equation}
In the above, the parameter $\rho$ controls the extent of regularisation cast on all coefficients (it is associated with an inversely proportional parameter $\lambda$ in the optimisation equation detailed below). The parameter $\xi$ balances the extent to which the co-activation and causal sets are regularised: if $\xi=0$, regularisation only operates on causal coefficients, while if $\xi=1$, only co-activation coefficients are made sparse. These two extremes respectively amount to descriptions of regional brain dynamics where co-activations, or causal influences, dominate.
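To make the role of $\xi$ concrete, the penalty of Equation (2) can be evaluated on toy coefficient vectors (a minimal sketch; the function name and values are ours):

```python
import numpy as np

def penalty(gamma, beta, xi):
    """Combined l1 penalty of Eq. (2): xi*||gamma||_1 + (1-xi)*||beta||_1."""
    return xi * np.abs(gamma).sum() + (1 - xi) * np.abs(beta).sum()

gamma = np.array([0.5, -0.2, 0.0])  # toy co-activation coefficients
beta = np.array([0.1, 0.0, -0.4])   # toy causal coefficients

penalty(gamma, beta, 0.0)  # only causal coefficients penalised -> 0.5
penalty(gamma, beta, 1.0)  # only co-activation coefficients penalised -> 0.7
penalty(gamma, beta, 0.5)  # equal trade-off -> 0.6
```

Sweeping $\xi$ thus shifts sparsity pressure from one coefficient set to the other without changing the overall budget $\rho$.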
\subsection{Implementation}
Solving the above set of coupled logistic regression equations requires that the activity levels of all regions be known. To binarise the input time courses, we z-score each of them individually, and set to 1/0 the time points with a value above/below 0. While binarisation may remove part of the insightful information from the original data, it has been used in recently developed methodological pipelines~\cite{Kang2019}. In the discussion, we touch upon possibilities to make the framework amenable to more than two states of activity.
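The binarisation step described above amounts to the following (a sketch; the function name is ours):

```python
import numpy as np

def binarise(X):
    """Z-score each regional time course (rows) and threshold at 0.

    X : (R, T) array of continuous activity time courses.
    Returns an (R, T) array with entries in {0, 1}.
    """
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    return (Z > 0).astype(int)
```

Each region is standardised with respect to its own mean and variance, so that "active" simply means above that region's session average.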
After defining the activation states, initial parameter estimates can be computed. Co-activation and modulatory coefficients are all set to 0, and intrinsic transition probabilities are estimated by a standard HMM approach~\cite{Rabiner1989}.
Following~\cite{Friedman2010}, in a regularised logistic regression, one attempts to solve the following:
\begin{equation}
\min_{\alpha^{(r)},\boldsymbol{\gamma}^{(r)},\boldsymbol{\beta}^{(r)}} -\mathcal{L}^{(r)}(\alpha^{(r)},\boldsymbol{\gamma}^{(r)},\boldsymbol{\beta}^{(r)}) + \lambda (\xi ||\boldsymbol{\gamma}^{(r)}||_1 + (1-\xi) ||\boldsymbol{\beta}^{(r)}||_1),
\end{equation}
where $r$ is the assessed region, and the log-likelihood is approximated as:
\begin{equation}
\mathcal{L}^{(r)}(\alpha^{(r)},\boldsymbol{\gamma}^{(r)},\boldsymbol{\beta}^{(r)}) = - \frac{1}{2|\mathcal{T}|} \sum_{t\in\mathcal{T}} \omega_{t} (z_{t} - \alpha^{(r)} - \boldsymbol{\gamma}^{(r)\top} \mathbf{h}_{t+1}^{\mathbf{(-r)}} - \boldsymbol{\beta}^{(r)\top} \mathbf{h}_{t}^{\mathbf{(-r)}})^2 + C.
\end{equation}
The ensemble $\mathcal{T}$ contains all the data points for which the probed region is in the start state of interest at time $t$ (\textit{e.g.}, baseline for the baseline-to-active transitions), and $C$ is a constant. If we denote the probability of the transition of interest as $p(\alpha^{(r)},\boldsymbol{\gamma}^{(r)},\boldsymbol{\beta}^{(r)},\mathbf{h}_{t}^{\mathbf{(-r)}}, \mathbf{h}_{t+1}^{\mathbf{(-r)}})$, the parameters $\omega_t$ and $z_t$ depend on the current estimates of the coefficients---which we denote with a tilde---as:
\begin{equation}
\left\{
\begin{array}{ll}
\omega_t = p(\tilde{\alpha}^{(r)},\boldsymbol{\tilde{\gamma}}^{(r)},\boldsymbol{\tilde{\beta}}^{(r)},\mathbf{h}_{t}^{\mathbf{(-r)}}, \mathbf{h}_{t+1}^{\mathbf{(-r)}})-p(\tilde{\alpha}^{(r)},\boldsymbol{\tilde{\gamma}}^{(r)},\boldsymbol{\tilde{\beta}}^{(r)},\mathbf{h}_{t}^{\mathbf{(-r)}}, \mathbf{h}_{t+1}^{\mathbf{(-r)}})^2 \\
z_{t} = \tilde{\alpha}^{(r)} + \boldsymbol{\tilde{\gamma}}^{(r)\top} \mathbf{h}_{t+1}^{\mathbf{(-r)}} + \boldsymbol{\tilde{\beta}}^{(r)\top} \mathbf{h}_{t}^{\mathbf{(-r)}} + \frac{y_t - p(\tilde{\alpha}^{(r)},\boldsymbol{\tilde{\gamma}}^{(r)},\boldsymbol{\tilde{\beta}}^{(r)},\mathbf{h}_{t}^{\mathbf{(-r)}}, \mathbf{h}_{t+1}^{\mathbf{(-r)}})}{\omega_t}
\end{array}.
\right.
\end{equation}
$y_t$ indicates whether there was a change in activity level from time $t$ to $t+1$ ($y_t = 1$) or not ($y_t = 0$). Coefficients are iteratively estimated by a coordinate-wise descent algorithm, following~\cite{Friedman2007}: the initial estimates outlined above are used at the maximal regularisation level $\lambda_{MAX}$, and individual coefficients are successively re-estimated in random order (note that for $\alpha^{(r)}$ coefficients, which do not enter the $\ell_1$ regularisation term, soft shrinkage is not required). The process continues until the change across two iterations becomes lower than a defined tolerance threshold $\epsilon$. The next regularisation level is then considered, using warm restarts to speed up computations (\textit{i.e.}, the estimates obtained at the end of a regularisation cycle are used as initial values for the following one).
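A single coordinate-wise update of the kind described above can be sketched as follows, in the spirit of Friedman and colleagues' penalised weighted least squares step (function names and the toy design matrix are ours; this is not the released implementation):

```python
import numpy as np

def soft_threshold(u, thr):
    """Soft-shrinkage operator used in the coordinate-wise updates."""
    return np.sign(u) * np.maximum(np.abs(u) - thr, 0.0)

def update_coefficient(j, coef, X, w, z, lam):
    """One coordinate-descent step on the quadratic approximation of the
    log-likelihood (a sketch after Friedman et al.).

    X   : (n, p) design matrix (columns = regressors built from h_t, h_{t+1})
    w   : (n,) IRLS weights omega_t
    z   : (n,) working responses z_t
    lam : effective penalty on this coefficient (0 for the intercept alpha)
    """
    # Partial residual with the j-th coefficient's contribution removed:
    r = z - X @ coef + X[:, j] * coef[j]
    num = soft_threshold(np.sum(w * X[:, j] * r) / len(z), lam)
    den = np.sum(w * X[:, j] ** 2) / len(z)
    return num / den
```

Cycling such updates over all coefficients until the change falls below $\epsilon$, then moving to the next $\lambda$ with the current estimates as warm start, reproduces the regularisation-path strategy described above.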
In all the analyses performed in this work, we considered a regularisation path with $\lambda\in[10000,0.02]$ (206 logarithmically distributed values), compared five levels of trade-off between co-activation and causal coefficients ($\xi=\{0,0.25,0.5,0.75,1\}$), and used a tolerance $\epsilon=10^{-40}$.
\subsection{Validation of the framework on simulated data}
We first sought to validate our pipeline on simulated data containing cross-regional causal modulations as well as co-activations. To do so, we considered parameters resembling those of the assessed experimental data (see the following section) as much as possible. We simulated activity time courses for $R=45$ regions, for a total of $S=135$ subjects and $T=1190$ time points per subject.
To design our simulations in accordance with the RS literature~\cite{Yeo2011}, we considered the presence of $N=7$ separate RSNs, each of which could contain between 4 and 7 areas. Time courses for all regions belonging to the same network were similar (prior to the addition of noise). In addition, we also included a set of areas evolving according to their own, independent dynamics; since in such a setting, no co-activation or causal coefficients should be retrieved, these regions can be regarded as a negative control. Furthermore, a few regions were also set as \textit{hubs} that jointly belong to two networks, and activate as soon as one of the networks turns on. Figure~\ref{JNE_F2}C (top left matrix) shows the ground truth co-activation relationships between the set of simulated regions.
Each simulated dynamics was associated with a probability of switching from the baseline to the active state, selected uniformly in the [0.2,0.5] interval. Similarly, the probability of transitioning from the active to the baseline state was selected uniformly in the [0.7,0.9] interval. Causal modulations were introduced between a subset of networks, as summarised in Figure~\ref{JNE_F2}C (top right matrix): when a modulating network turned active, it could enhance the activity of the modulated network (both by enhancing the likelihood of a 0 to +1 transition, and reducing that of a +1 to 0 one), as symbolised by a positive-valued causal coefficient, or decrease that activity, as reflected by a negative-valued element. We used a shift in transition probability $\Delta P=0.6$.
Finally, all time courses were corrupted with Gaussian noise of standard deviation $\sigma=2$; indicative time courses for a simulated subject are presented in Figure~\ref{JNE_F2}A, where the noise level is high enough that cross-regional relationships cannot be inferred by eye.
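The generative scheme above can be sketched for one pair of networks (a minimal illustration under our own naming; the actual simulation code may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_network(T, p_on, p_off, modulator=None, dP=0.6):
    """Simulate one network's binary activity time course.

    p_on / p_off : baseline-to-active and active-to-baseline probabilities
    modulator    : optional (T,) binary time course of a modulating network;
                   when active, it shifts transitions by dP (raising p_on and
                   lowering p_off, i.e. enhancing the modulated activity).
    """
    h = np.zeros(T, dtype=int)
    for t in range(T - 1):
        shift = dP * modulator[t] if modulator is not None else 0.0
        if h[t] == 0:
            h[t + 1] = int(rng.random() < np.clip(p_on + shift, 0, 1))
        else:
            h[t + 1] = int(rng.random() >= np.clip(p_off - shift, 0, 1))
    return h

# One modulating and one modulated network, plus Gaussian observation noise:
net_a = simulate_network(1190, 0.3, 0.8)
net_b = simulate_network(1190, 0.3, 0.8, modulator=net_a)
noisy = net_b + rng.normal(0.0, 2.0, size=1190)
```

A negative-valued modulation corresponds to flipping the sign of the shift, lowering `p_on` and raising `p_off` instead.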
To assess the ability of the framework to recover the ground truth, we computed Pearson's spatial correlation coefficient between ground truth and estimated coefficients, separately for the co-activation and causal sets, and contrasted these similarity measures to the evolution of the log-likelihood of the data. In addition, we examined whether the information contained in the co-activation coefficients was sufficient to re-order the regions into their underlying networks, by computing Ward's linkage on probabilistic co-activation values (see Figure~\ref{JNE_F1}C, bottom box).
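The two evaluation steps just described can be sketched with NumPy and SciPy's hierarchical clustering tools (function names are ours):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def similarity_to_ground_truth(est, truth):
    """Pearson correlation between estimated and ground-truth coefficient
    matrices, over off-diagonal entries only (diagonals are left empty)."""
    mask = ~np.eye(truth.shape[0], dtype=bool)
    return np.corrcoef(est[mask], truth[mask])[0, 1]

def cluster_regions(coact, n_networks):
    """Ward's linkage on probabilistic co-activation values; each row is one
    region's coupling profile. Returns a network label per region."""
    Z = linkage(coact, method='ward')
    return fcluster(Z, t=n_networks, criterion='maxclust')
```

Regions sharing similar co-activation profiles end up in the same cluster, which is how the dendrogram of Figure 2D is obtained.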
\subsection{Application of the framework to experimental fMRI data}
We applied our framework to experimental RS fMRI data from the \textit{Human Connectome Project}~\cite{VanEssen2013}. We considered one scanning session of $T=1190$ time points for each of $S=135$ subjects. The data was acquired at a fast TR of 720 ms, at a spatial resolution of 2 $\times$ 2 $\times$ 2 mm\textsuperscript{3}; additional acquisition details can be found elsewhere~\cite{Smith2013}.
We started from the publicly available minimally preprocessed data. Each voxel-wise time course was detrended, and constant, linear and quadratic trends were regressed out at the same time as a Discrete Cosine Transform basis (cutoff frequency: 0.01 Hz). We chose not to perform global signal regression, since it remains a debated preprocessing step~\cite{Murphy2017}, and in light of recent results showing extensive relationships between spatio-temporal motion patterns and human behaviour~\cite{Bolton2019c}, also decided not to include individual motion time course regressors (note that motion is handled by conservative scrubbing at a later stage of the pipeline---see below).
Voxel-wise time courses were averaged into 90 regions of interest defined from the AAL atlas~\cite{TzourioMazoyer2002}; although more accurate parcellation schemes have been introduced~\cite{Glasser2016,Schaefer2017}, they involve a larger number of brain regions and would thus require more data than available for accurate estimation. As the main goal of the present report is the introduction of our framework, rather than its application to neurobiologically relevant questions, we opted to operate at the coarser AAL scale.
Scrubbing was performed at a framewise displacement threshold~\cite{Power2012} of 0.3 mm, and discarded frames were re-estimated by cubic spline interpolation. As a final step, from the fully preprocessed data, we used a total variation-based denoising approach~\cite{Karahanoglu2013,Bolton2019d} to derive cleaned \textit{activity-inducing} signals freed from haemodynamic effects. We only included temporal regularisation in the process, without any spatial prior.
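The scrubbing and re-estimation step can be sketched as follows (a simplified illustration on one time course; names and the threshold default mirror the text, the rest is ours):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def scrub_and_interpolate(ts, fd, threshold=0.3):
    """Censor frames whose framewise displacement exceeds the threshold,
    then re-estimate them by cubic spline interpolation.

    ts : (T,) regional time course
    fd : (T,) framewise displacement trace
    """
    good = fd <= threshold
    t = np.arange(len(ts))
    spline = CubicSpline(t[good], ts[good])
    out = ts.copy()
    out[~good] = spline(t[~good])
    return out
```

The retained frames anchor the spline, so censored frames are replaced by values consistent with the surrounding clean signal rather than simply dropped.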
To assess the reproducibility of our findings, we separately applied our framework to the activity-inducing time courses obtained for each hemisphere of the brain; in both cases, co-activations and causal modulations were thus estimated between $R=45$ separate areas.
\clearpage
\section{Results}
\subsection{Validation of the framework on simulated data}
Figure~\ref{JNE_F2} displays the results of our simulations. Around the largest regularisation extents ($\lambda_1=9000$), the log-likelihood was low regardless of the balance between the regularisation of co-activation and causal coefficients, and this was associated with overall low similarity to the ground truth transition probability modulation patterns (Figure~\ref{JNE_F2}B), an unsurprising feature given that probabilistic modulation coefficients were then extremely sparse, or (for the less regularised set) randomly distributed (see $\lambda_1$ cases in Figure~\ref{JNE_F2}C).
When regularisation decreased (\textit{i.e.}, going to the left in Figure~\ref{JNE_F2}B plots), the log-likelihood remained low when regularisation was principally cast on causal modulations (see the orange and purple curves in the top plot); as seen in the associated coefficient matrices from Figure~\ref{JNE_F2}C, this is because many erroneous coefficients still populated the co-activation set, which is the dominating factor in the simulated data. Log-likelihood was higher for the schemes that favoured sparsity of co-activation coefficients (red and blue curves), or enabled an equal regularisation between both sets (green curve). At the global log-likelihood optimum ($\lambda_2 = 190, \xi=0.5$), co-activation probabilistic modulations were accurately retrieved in a majority (but not all) of cases, as well as for a still limited subset of causal relationships. This resulted in intermediate similarity to the ground truth.
When regularisation was further lowered, regardless of the $\xi$ parameter value, all curves converged towards a common, almost full representation that captured ground truth co-activation and causal influences with high fidelity: all regional similarity values exceeded 0.8 for co-activations, and for causal modulations, the majority exceeded 0.6. Only hub regions (for which underlying patterns are by construction more complex) and areas from network 3 (linked to a negative-valued modulation from network 7) showed slightly lower similarity values around 0.5, but the related patterns could still be captured in the associated coefficient matrices from Figure~\ref{JNE_F2}C.
%%%%%%%%%%%%%
\begin{figure}[p]
\centering
\includegraphics[width=1.0\textwidth]{Figures/JNE_FIG2.eps}
\caption[Results on simulated data]{\scriptsize\textbf{Results on simulated data.} \textbf{(A)} Example simulated time courses for $R=45$ regions, each displayed as one row for 250 samples. Colour coding denotes the network attribution of the regions ($N_1$ to $N_7$), as well as independent areas (in white) and hubs (in dark grey). \textbf{(B)} For the whole path of regularisation (strong to weak regularisation from the right to the left), whole log-likelihood of the data across brain regions and transition types (top plot), and associated similarity to the ground truth co-activation (middle plot) and causal (bottom plot) coefficients. The colour coding of the time courses respectively stands for the trade-off between the co-activation and causal set regularisations (top plot), and the network assignment of the regions (middle and bottom plot). The $\lambda_1$, $\lambda_2$ and $\lambda_3$ vertical lines highlight three indicative regularisation levels further detailed in \textbf{(C)}. \textbf{(C)} For co-activation (left column) and causal (right column) coefficients, ground truth values (top row), and probabilistic cross-regional influences for three co-activation/causal trade-off values ($\xi=1,0.5,0$) and overall regularisation levels ($\lambda_1 = 9000,\lambda_2 = 190,\lambda_3 = 8.4$). \textbf{(D)} Dendrogram for regional clustering from co-activation probabilistic influences, with the same regional colour coding as in \textbf{(A)}.}
\label{JNE_F2}
\end{figure}
%%%%%%%%%%%%%
Log-likelihood reached a local optimum at $\lambda_3 = 8.4$, which was very close to the global one. The slightly lower likelihood value despite the closer match to the ground truth is explained by the presence of a wide array of small noisy coefficients, seen as small negative-valued entries in the $\lambda_3$ matrices of Figure~\ref{JNE_F2}C.
To summarise, although the arrangement of regions into networks and their relationships could not be determined from inspecting the time courses (Figure~\ref{JNE_F2}A), they could be retrieved following the application of our framework. In addition, all regions could be correctly assigned to their associated network from co-activation probabilistic couplings (Figure~\ref{JNE_F2}D): following hierarchical clustering, 8 distinct groups could indeed be determined, including the 7 networks of interest and an extra cluster for independent regions. Note that hub areas were all assigned to one of the networks that they were linked to.
\subsection{Application of the framework to experimental fMRI data}
Figure~\ref{JNE_F3} shows the results on applying our framework to experimental fMRI data. We considered the data obtained by applying our framework to the right hemisphere of the brain as a main set (associated results are presented in the top row of Figure~\ref{JNE_F3}B and in Figure~\ref{JNE_F3}C), and assessed replicability by comparing the results to those obtained from the left hemisphere (bottom row in Figure~\ref{JNE_F3}B).
Log-likelihood was maximised for $\xi=0.5$, and $\lambda=750$ (right hemisphere) or $\lambda=950$ (left hemisphere). Clustering of the brain regions through Ward's linkage enabled segmentation of the data into 7 networks (Figure~\ref{JNE_F3}B, top right plot): network 1 contained 19 areas, including all the hubs from the default mode network~\cite{Buckner2008} (medial prefrontal cortex, posterior cingulate cortex, angular gyrus, [para]hippocampus), as well as the amygdala and thalamus. Other networks included a smaller set of areas, and often represented sensory brain systems (for instance, components from the visual circuitry in networks 6 and 7). Upon reordering of the regions, performed according to the similarity in co-activation coefficients, clear FC patterns could be seen, highlighting the fact that both measures capture similar information (see the left column in Figure~\ref{JNE_F3}B). FC patterns were almost identical across hemispheres, and the main features of co-activations were also retrieved, including in particular coefficients linking networks 2 and 3, 4 and 5, or 6 and 7.
Regarding causal modulations (Figure~\ref{JNE_F3}B, middle column for an overview, and Figure~\ref{JNE_F3}C for more detailed results), influences exerted by networks 2 to 7 were primarily positive-valued, as seen in both hemisphere cases; this means that when one of these networks turns active, it will tend to then entrain the others towards the active state as well. On the contrary, modulations emanating from network 1 were primarily negative-valued: thus, when it turns active, it will prevent other systems from becoming active themselves, and favour their transition to the baseline state.
%%%%%%%%%%%%%
\begin{figure}[p]
\centering
\includegraphics[width=1.0\textwidth]{Figures/JNE_F3.eps}
\caption[Results on experimental fMRI data]{\scriptsize\textbf{Results on experimental fMRI data.} \textbf{(A)} Example activity-inducing (AI) time courses for the 45 (unsorted) right hemisphere regions of an indicative subject, annotated according to their AAL atlas label. \textbf{(B)} For the right hemisphere (top row) or left hemisphere (bottom row) data, functional connectivity matrices (left column), causal modulations retrieved by our framework (middle column), and co-activations recovered by the pipeline (right column). Regions were sorted from co-activation coefficients obtained from the right hemisphere results (Ward's linkage, top right plot). A distance cutoff of $d_W = 0.9$ was chosen to separate the 45 regions into 7 networks (see the vertical orange line). Horizontal and vertical black lines separate the different networks in each matrix representation. Note that for comparison purposes, the reordering of left hemisphere regions is the same as that derived from the right hemisphere data. \textbf{(C)} For the 45 regions at hand and right hemisphere results, summary of network assignments (colour coding of the nodes) and cross-regional causal modulations (edges). Edges are proportional in width to the extent of the probabilistic coupling, and their colour highlights the type of modulation (red: enhancement of activity/blue: lowering of activity).}
\label{JNE_F3}
\end{figure}
%%%%%%%%%%%%%
\clearpage
\section{Discussion}
In this work, we introduced a novel mathematical framework that jointly derives the patterns of co-activation between brain regions, reflective of the brain's functional organisation as a set of RSNs~\cite{Damoiseaux2006,Yeo2011}, and additional cross-regional causal modulations that go beyond this network-level characterisation by also modelling more subtle cross-regional interplays. One can conceive our strategy as a joint recovery of FC (embedded in the $\boldsymbol{\Gamma}$ co-activation coefficients) and effective connectivity (in $\mathbf{B}$).
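Schematically, and only as an illustration of this dual characterisation (the exact parameterisation, including the transition-specific coefficient sets and intercept $\beta_{0,r}$, follows the model formulation; $\sigma$ denotes the logistic function and $x_q(t)\in\{0,1\}$ the activity state of region $q$), the transition probability of region $r$ can be written as
\[
\mathrm{P}\big[x_r(t)\rightarrow x_r(t+1)\big] \;=\; \sigma\Big(\beta_{0,r} \;+\; \sum_{q\neq r}\Gamma_{q,r}\,x_q(t+1) \;+\; \sum_{q\neq r}B_{q,r}\,x_q(t)\Big),
\]
where the co-activation coefficients $\boldsymbol{\Gamma}$ weigh simultaneous activations across areas, and the causal coefficients $\mathbf{B}$ capture how a region active at time $t$ modulates the $t$ to $t+1$ transition likelihood of region $r$.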
Our strategy is an improvement over previous work that also used a logistic regression characterisation to describe causal interactions between functional brain networks~\cite{Bolton2017b}: in this former methodology, however, network maps had to be computed in a separate analytical step, prior to establishing their causal interplays. As such, and much like the majority of other prominent dynamic FC approaches (see for instance~\cite{Liu2013,Allen2014,Karahanoglu2015,Vidaurre2017}), more subtle relationships at a smaller spatial scale than that of RSNs are then lost.
On simulated data, both co-activation and causal coefficient sets could be accurately retrieved by our framework despite marked noise. The optimal log-likelihood of the data was achieved in a weak regularisation setting, as we considered enough data points for accurate estimation of the full model: in total, we analysed 160650 time points for the estimation of $2(R+(R-1)R+(R-1)R)=8010$ coefficients (two sets of coefficients---one per type of transition---for individual regional dynamics, co-activation and causal links), that is, about 20 data points available per estimate. Regularisation is expected to become more useful when higher-dimensional problems are addressed with a similar amount of data: for example, it will be interesting to derive coefficients on an extended set of brain regions obtained with finer parcellations that do not rely solely on structural brain markers~\cite{Glasser2016,Schaefer2017}.
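The coefficient count and data-per-estimate figures quoted above can be verified with a few lines of arithmetic (a minimal sketch; variable names are illustrative only):

```python
R = 45           # brain regions per hemisphere
T = 160650       # total number of analysed time points
# Two coefficient sets (one per type of transition), each comprising R
# regional terms, (R-1)*R co-activation couplings and (R-1)*R causal
# couplings.
n_coeffs = 2 * (R + (R - 1) * R + (R - 1) * R)
print(n_coeffs)        # 8010
print(T // n_coeffs)   # 20
```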
As our simulations primarily included positive-valued coefficients, noisy coefficient estimates accompanying ground truth values were biased towards negative values (see the $\lambda_3$ settings in Figure~\ref{JNE_F2}C). This is why the simulated negative causal relationship between networks 7 and 3 was the least accurately captured one. At stronger regularisation levels, noisy coefficients disappeared, and a restricted subset of ground truth entries were recovered, owing to the $\ell_1$ norm properties~\cite{Tibshirani1996}.
Several strategies may be envisioned to further improve the accuracy of the results obtained with our framework. First, the purely $\ell_1$ regularisation strategy could be turned into an \textit{elastic net} mix between the $\ell_1$ and $\ell_2$ norms~\cite{Zou2005}, although at the cost of an extra free parameter to specify. Second, additional assumptions could be explicitly introduced into the model formulation, such as the symmetric and non-negative nature of $\boldsymbol{\Gamma}$. Third, as noise operates to counterbalance strong positive-valued coefficients along a given column of $\boldsymbol{\Gamma}$ or $\mathbf{B}$ (recall that coefficient estimates are obtained separately for each region $r$, standing as one matrix column), the framework could be extended to alternate between a column-wise solving step (as presently) and a row-wise one, in which we would instead estimate all the modulations emanating from a given region $r$ (instead of impinging on it). Each of these three options has merits, but comes at the expense of greater computational complexity and less streamlined modelling.
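For reference, the elastic net penalty mentioned in the first option combines both norms (written here for a generic coefficient vector $\boldsymbol{\beta}$; $\lambda_1$ and $\lambda_2$, or equivalently an overall strength and a mixing ratio, are the two free parameters alluded to):
\[
\lambda_1\,\|\boldsymbol{\beta}\|_1 \;+\; \lambda_2\,\|\boldsymbol{\beta}\|_2^2,
\]
which retains the sparsity-inducing behaviour of the $\ell_1$ term while the $\ell_2$ term stabilises estimates across correlated regressors~\cite{Zou2005}.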
On experimental fMRI data, the optimal trade-off yielding a maximised log-likelihood was found for $\xi=0.5$, meaning that both co-activations and causal interplays are required to accurately describe functional brain dynamics. This highlights the importance of developing methodological approaches that do not focus on one viewpoint only, but instead attempt to jointly capture network-level reconfigurations and more subtle cross-regional modulations.
An interesting development for future work could be to characterise, instead of the probability to transit from a given state of activity to another, the likelihood to show an activation \textit{transient} (that is, go up or down in activity regardless of the exact starting point). By this means, the current framework could seamlessly be generalised to more than two states of activity, which may better represent the dynamics of some brain regions. This information is already available (by comparison to phase-randomised null data) from the \textit{total activation} pipeline used in the deconvolution of the analysed fMRI data~\cite{Karahanoglu2013,Bolton2019d}.
A related benefit would then be the easier comparison of results obtained from datasets acquired at different TRs, so that the increasingly well understood specificities of fast TR datasets~\cite{Chen2019b,Power2019} can be better disentangled from more general effects. To do so, one could determine whether a transient has just occurred prior to the assessed time point by jointly examining a span of a few time points ($t-1$, $t-2$, \textit{etc.}).
Another point of interest is that our framework provides more information than was analysed in the present work: while we treated the $0\rightarrow+1$ and $+1\rightarrow0$ transitions as mirrors of each other (and subtracted both sets of probabilistic couplings to obtain the analysed outputs), more complex information may lie within the individual coefficient matrices. For example, it may be that a given region is only modulated by another at baseline, but not when it is active.
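In a purely illustrative notation (the transition-specific matrices are not separately named in this section; the superscripts below are introduced here for clarity only), the analysed outputs correspond to differences of the form
\[
\mathbf{B}_{\mathrm{net}} \;=\; \mathbf{B}^{(0\rightarrow+1)} \;-\; \mathbf{B}^{(+1\rightarrow0)},
\]
so that any asymmetry between the two individual matrices, such as a modulation present only at baseline, is folded into a single net coupling and would require inspection of the separate matrices to be uncovered.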
Finally, a few promising applications of our framework can be foreseen. First, it will be exciting to compare co-activation and causal coefficients across different subject populations (\textit{e.g.}, a set of healthy volunteers as opposed to a diseased population). To do so, bootstrapping could be conducted within each population, and statistical testing could then be performed for each coefficient of interest. The examination of subject-specific properties will, however, be more challenging to address, as only population-level estimates can be derived for typically available amounts of data. Second, another possible application lies in \textit{hyperscanning}~\cite{Montague2002}, where two subjects are scanned in parallel while they interact. Co-activations, or causal modulations, could be quantified across both subjects as a way to shed light on the functional underpinnings of cooperative processing.
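The population-comparison idea above can be sketched with a percentile bootstrap on a single coefficient. This is a simplified illustration, not the procedure used in this work: the helper name, the toy values, and the assumption that one per-subject (or per-bootstrap-fold) estimate of the coefficient is available in each population are all hypothetical.

```python
import random

def bootstrap_diff_ci(coeffs_a, coeffs_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the difference in
    mean coupling strength between two populations (hypothetical helper;
    each input is a list of estimates of one coefficient of interest)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # resample each population with replacement and record the
        # difference of resampled means
        mean_a = sum(rng.choice(coeffs_a) for _ in coeffs_a) / len(coeffs_a)
        mean_b = sum(rng.choice(coeffs_b) for _ in coeffs_b) / len(coeffs_b)
        diffs.append(mean_a - mean_b)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy example: a clear group difference yields an interval excluding zero.
healthy = [0.80, 0.90, 1.10, 1.00, 0.95, 1.05]
patients = [0.20, 0.30, 0.25, 0.35, 0.15, 0.30]
lo, hi = bootstrap_diff_ci(healthy, patients)
print(lo > 0)  # True: the interval lies above zero
```

A coefficient whose interval excludes zero (here, the group difference) would then be flagged as differing between populations, with an appropriate correction for the number of coefficients tested.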
\clearpage
\section{Acknowledgments}
The authors are grateful to the Bertarelli Foundation and the Vasco Sanz Fund for supporting the present research.
\clearpage
\bibliography{papers_library}
\end{document}
