For all these reasons, the EEG inverse problem is an underdetermined, ill-posed problem [1–4]. Solutions developed by this theory are stated in terms of a regularization function, which helps us to select, among the infinitely many solutions, the one that best fulfills a prescribed constraint.

To define the constraint, we can use mathematical restrictions (minimum norm estimates) or anatomical, physiological, and functional prior information. Some examples of useful neurophysiological information are [1, 6]: the irrotational character of the brain current sources, the smooth dynamics of the neural signals, the clusters formed by neighboring or functionally related BES, and the smoothness and focality of the electromagnetic fields generated and propagated within the volume conductor media (brain cortex, skull, and scalp). This regularization function usually induces solutions that spread over a considerable part of the brain.

Uutela et al. This penalty function promotes solutions that tend to be scattered around the true sources. These mixed norm approaches induce structured sparse solutions and depend on decomposing the BES signals as linear combinations of multiple basis functions. For a more detailed overview of inverse methods for EEG, see [2, 3, 13] and references therein. For a more detailed overview of regularization functions applied to structured sparsity problems, see [14–16] and references therein.

All of these regularizers try to induce neurophysiologically meaningful solutions, which take into account the smoothness and structured sparsity of the BES matrix: during a particular cognitive task, only the BES related to the brain area involved in that task will be active, and their corresponding time evolution will vary smoothly. That is, the BES matrix will have few nonzero rows and, in addition, its columns will vary smoothly. In this paper, we propose a regularizer that takes into account not only the smoothness and structured sparsity of the BES matrix but also its low rank, thereby capturing the linear relation between the active sources and their corresponding neighbors.

In order to do so, we propose a new method based on matrix factorization and regularization, with the aim of recovering the latent structure of the BES matrix. In our approach, the resulting optimization problem is nonsmooth and nonconvex.

A standard approach to deal with the nonsmoothness introduced by the nonsmooth regularizers mentioned above is to reformulate the regularization problem as a second-order cone programming (SOCP) problem [12] and use interior point-based solvers. However, interior point-based methods cannot handle large-scale problems, which is the case for large EEG inverse problems involving thousands of brain sources. Another approach is to try to solve the nonsmooth problem directly, using general nonsmooth optimization methods, for instance, the subgradient method [17]. This method can be used if a subgradient of the objective function can be computed efficiently [14].

In this paper, in order to tackle the nonsmoothness of the optimization problem, we depart from these optimization methods and use instead efficient first-order nonsmooth optimization methods [ 5 , 18 , 19 ]: forward-backward splitting methods. These methods are also called proximal splitting because the nonsmooth function is involved via its proximity operator.

Forward-backward splitting methods were first introduced in the EEG inverse problem by Gramfort et al. These methods have drawn increasing attention in the EEG, machine learning, and signal processing communities, especially because of their convergence rates and their ability to deal with large problems [19–21]. On the other hand, in order to handle the nonconvexity of the optimization problem, we use an iterative alternating minimization approach: minimizing over the coding matrix while keeping the latent source matrix fixed, and vice versa.

Both of these optimization problems are convex: the first one can be solved using proximal splitting methods, while the second one can be solved directly in terms of a matrix inversion.

The rest of the paper is organized as follows. In Section 2, we give an overview of the EEG inverse problem.

In Section 3, we present the mathematical background related to the proximal splitting methods. The resulting nonsmooth and nonconvex optimization problem is formally described in Section 4. In Section 5, we propose an alternating minimization algorithm, and its convergence analysis is presented in Section 6. The advantages of considering both characteristics in a single method, as in the proposed one, become clear in comparison with the independent use of the Group Lasso and Trace Norm regularizers.

Finally, conclusions are presented in Section 8.

The EEG signals represent the electrical activity of one or several assemblies of neurons [22]. The area of a neuron assembly is small compared to the distance to the observation point (the EEG sensors). Therefore, the electromagnetic fields produced by an active neuron assembly at the sensor level are very similar to the field produced by a current dipole [23]. This simplified model is known as the equivalent current dipole (ECD).

Due to the uniform spatial organization of their dendrites (perpendicular to the brain cortex), the pyramidal neurons are the only neurons that can generate a net current dipole over a piece of cortical surface, whose field is detectable on the scalp [3]. These voltages can be recorded by using different types of electrodes [22], such as disposable electrodes (gel-less and pre-gelled types), reusable disc electrodes (gold, silver, stainless steel, or tin), headbands and electrode caps, saline-based electrodes, and needle electrodes.

The $i$-th row of the matrix $Y$ represents the electrical activity recorded by the $i$-th EEG electrode during the observation time window. In the BES matrix $S$, each row represents the time evolution of one brain electrical source, and each column represents the activity of all the corresponding sources at a particular time instant. Finally, the forward operator $A$ summarizes the geometric and electric properties of the conducting media (brain, skull, and scalp) and establishes the link between the current sources and the EEG sensors ($A_{ij}$ tells us how the $j$-th BES influences the measurement obtained by the $i$-th electrode).

Following this notation, the EEG inverse problem can be stated as follows: Given a set of EEG signals $Y$ and a forward model $A$, estimate the current sources within the brain $S$ that produce these signals.
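The non-uniqueness behind this statement can be illustrated numerically. The following is a minimal sketch, not the paper's model: it assumes a noiseless linear relation `Y = A @ S`, uses a random matrix as a stand-in for the forward operator, and picks hypothetical sizes (8 sensors, 50 sources). It shows that two different source matrices can produce the same measurements when there are fewer sensors than sources.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_sources, n_times = 8, 50, 20   # hypothetical sizes

# Stand-in forward operator (assumption: a random matrix, not a head model).
A = rng.standard_normal((n_sensors, n_sources))
S_true = rng.standard_normal((n_sources, n_times))
Y = A @ S_true                               # noiseless measurements (assumption)

# The last rows of Vt span the null space of A (A has full row rank here).
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[n_sensors:].T                # shape (n_sources, n_sources - n_sensors)

# Adding any null-space component to S leaves the measurements unchanged.
coeffs = rng.standard_normal((null_basis.shape[1], n_times))
S_other = S_true + null_basis @ coeffs

same_data = np.allclose(A @ S_other, Y)
different_sources = not np.allclose(S_other, S_true)
print(same_data, different_sources)
```

Because infinitely many such `S_other` exist, some additional criterion (the regularization function) is needed to single out one solution.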


The proximity operator [19, 25] corresponding to a convex function $f$ is a mapping from $\mathbb{R}^n$ to itself and is defined as follows:

Note that the proximity operator is well defined, because the above minimum exists and is unique (the objective function is strongly convex). Proximal splitting methods are specifically tailored to solve an optimization problem of the form

This implies the following [18]:
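For some common nonsmooth penalties the proximity operator has a well-known closed form. The sketch below is illustrative only and is not taken from the paper: it implements the proximity operator of the scaled $\ell_1$ norm (entrywise soft-thresholding) and of the scaled row-wise group norm (the sum of the Euclidean norms of the rows). The function names and the threshold parameter `t` are hypothetical.

```python
import numpy as np

def prox_l1(x, t):
    """Proximity operator of t * ||x||_1: entrywise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_group_rows(X, t):
    """Proximity operator of t * sum_i ||X[i, :]||_2: shrink each row's norm by t."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Rows with norm below t are set to zero; the small floor avoids division by zero.
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return scale * X

x = np.array([3.0, -0.5, 1.0])
p = prox_l1(x, 1.0)

X = np.array([[3.0, 4.0],    # row norm 5.0, shrunk by factor 0.8
              [0.3, 0.4]])   # row norm 0.5 < t, set to zero
P = prox_group_rows(X, 1.0)
print(p, P, sep="\n")
```

The row-wise operator zeroes entire rows at once, which is why group penalties of this kind tend to produce matrices with few nonzero rows.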

In optimization, (7) is known as the forward-backward splitting process [19]. If we had a closed-form expression for such a proximity operator (or if we could approximate it efficiently, with the approximation errors decreasing at appropriate rates [27]), then we could efficiently solve (7).
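As a concrete illustration of a forward-backward iteration, the sketch below applies it to an $\ell_1$-regularized least-squares problem. This is an assumed example problem, not the one studied in the paper: the smooth term is $\frac{1}{2}\|Ax - y\|_2^2$, the nonsmooth term is $\lambda\|x\|_1$, and the step size is the reciprocal of the Lipschitz constant of the gradient. All names and sizes are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximity operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def forward_backward_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by forward-backward splitting."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth gradient
    gamma = 1.0 / L                      # step size
    x = np.zeros(A.shape[1])
    history = []
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                          # forward (gradient) step
        x = soft_threshold(x - gamma * grad, gamma * lam)  # backward (proximal) step
        history.append(0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x)))
    return x, history

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 60))
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = A @ x_true

x_hat, history = forward_backward_lasso(A, y, lam=0.1)
print(history[0], history[-1])
```

With this step size the objective value does not increase from one iteration to the next, which is a convenient sanity check when implementing such schemes.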

Furthermore, when $f$ has a Lipschitz continuous gradient, there are fast algorithms to solve (7).

The resulting optimization problem can be stated as follows:

Finally, the parameter $K$ encloses the low rank of $S$:

Hence, the proposed regularization framework takes into account all the prior knowledge about the structure of the target matrix $S$.

In this section, we address the issue of implementing the learning method (10) numerically. We propose the following reparameterization of (10):


The optimization problem (12) is a simultaneous minimization over matrices $B$ and $C$. On the other hand, for a fixed $B$, the minimum over $C$ can be solved directly in terms of a matrix inversion. These observations suggest an alternating minimization algorithm [15, 28]:

In order to obtain the initialization matrix $C_0$, we use an approach based on the singular value decomposition of $Y$. Without loss of generality, let us work with (9) in the noiseless case:

Then, given $C_0$, we can start iterating using (13) and the accompanying update. As we have seen in Section 3, this kind of problem can be efficiently handled using proximal splitting methods.
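The alternating scheme described above can be sketched as follows. This is only an illustration under stated assumptions, since the exact objective (12) is not reproduced here: it assumes the factorization $S = BC$, a row-wise group penalty on $B$ with weight `lam`, and a squared Frobenius penalty on $C$ with weight `rho`. The $B$-step uses forward-backward iterations, the $C$-step solves a linear system, and $C_0$ is taken from the leading right singular vectors of $Y$. All function names, penalties, and sizes are hypothetical.

```python
import numpy as np

def prox_group_rows(X, t):
    """Proximity operator of t * sum_i ||X[i, :]||_2."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0) * X

def objective(A, Y, B, C, lam, rho):
    """Assumed objective: data fit + group penalty on B + ridge penalty on C."""
    fit = 0.5 * np.linalg.norm(Y - A @ B @ C, "fro") ** 2
    return fit + lam * np.linalg.norm(B, axis=1).sum() + 0.5 * rho * np.linalg.norm(C, "fro") ** 2

def alternating_minimization(A, Y, K, lam=0.1, rho=0.1, n_outer=20, n_inner=50):
    # Initialization of C from the K leading right singular vectors of Y.
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    C = Vt[:K]
    B = np.zeros((A.shape[1], K))
    values = []
    for _ in range(n_outer):
        # B-step (C fixed): forward-backward iterations with step 1 / L.
        L = max((np.linalg.norm(A, 2) * np.linalg.norm(C, 2)) ** 2, 1e-12)
        for _ in range(n_inner):
            grad = A.T @ (A @ B @ C - Y) @ C.T
            B = prox_group_rows(B - grad / L, lam / L)
        # C-step (B fixed): closed form via a K x K linear system.
        M = A @ B
        C = np.linalg.solve(M.T @ M + rho * np.eye(K), M.T @ Y)
        values.append(objective(A, Y, B, C, lam, rho))
    return B, C, values

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 40))             # stand-in forward operator
B_true = np.zeros((40, 2))
B_true[[5, 6, 20]] = rng.standard_normal((3, 2))
Y = A @ B_true @ rng.standard_normal((2, 30))  # noiseless rank-2 data

B_hat, C_hat, values = alternating_minimization(A, Y, K=2)
S_hat = B_hat @ C_hat
print(values[0], values[-1])
```

Each half-step does not increase the assumed objective, so the recorded values form a nonincreasing sequence, and the estimate `S_hat` has rank at most `K` by construction.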

The gradient of the smooth function $F_B(B)$. In what follows, we show how the minimum over $C$ can be solved directly in terms of a matrix inversion:

We are going to analyze the convergence behavior of Algorithm 1 by using the global convergence theory of iterative algorithms developed by Zangwill [30]. The property of global convergence expresses, in a sense, the certainty that the algorithm converges to the solution set.

In order to use the global convergence theory of iterative algorithms, we need a formal definition of an iterative algorithm, as well as the definition of a set-valued mapping. Set-valued mapping. Iterative algorithm. Now that we know the main building blocks of the global convergence theory of iterative algorithms, we are in a position to state the convergence theorem related to Algorithm 1. Before proving this assertion, let us present some definitions and theorems used in the proof.

Compact set. A set X is said to be compact if any sequence or subsequence contains a convergent subsequence whose limit is in X.


Composite map. Closed map. Composition of closed maps. Weierstrass theorem. Theorem 6. Hence, using Definition 6. This concludes the proof of assumption 1. To do so, we are going to use Theorem 6.

In this section, we evaluate the performance of the matrix factorization approach and compare it with the Group Lasso regularizer:

In order to have a reproducible comparison of the different regularization approaches, we generated two synthetic scenarios:

The other sources are not active (zero electrical activity).

Therefore, in this scenario, the synthetic matrix $S$ is a structured sparse matrix with only 12 nonzero rows (the rows associated with the active sources).

Therefore, in this scenario, the synthetic matrix $S$ is a structured sparse matrix with only 40 nonzero rows (the rows associated with the active sources).


In both scenarios, the simulated electrical activity (simulated waveforms) associated with the four Main Active Sources (MAS) was obtained from a face perception-evoked potential study [35, 36]. To obtain the simulated electrical activity associated with each of the active neighbor sources, we simply set it as a scaled version of the electrical activity of its corresponding nearest MAS, with a scale factor equal to 0. Hence, there is a linear relation between the four MAS and their corresponding nearest neighbor sources; therefore, in both scenarios, the rank of the synthetic matrix $S$ is equal to 4.
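The structure of such a synthetic matrix can be reproduced in a few lines. The sketch below is a stand-in, not the paper's data: it uses sinusoids instead of the evoked-potential waveforms, picks the active rows at random instead of using spatial neighborhoods, and assumes a scale factor of 0.5 and hypothetical sizes (200 sources, 100 time samples). It builds a matrix with 4 main sources plus 2 scaled copies each, giving 12 nonzero rows and rank 4.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_times = 200, 100          # hypothetical sizes
n_mas, neighbors_per_mas = 4, 2        # 4 + 4 * 2 = 12 active rows
scale = 0.5                            # assumed scale factor (hypothetical value)

# Stand-in waveforms: one sinusoid per main active source (assumption).
t = np.linspace(0.0, 1.0, n_times)
waveforms = np.stack([np.sin(2 * np.pi * (k + 1) * t) for k in range(n_mas)])

# Choose the active rows at random (assumption: no spatial model is used here).
active = rng.choice(n_sources, size=n_mas * (1 + neighbors_per_mas), replace=False)
rows = active.reshape(n_mas, 1 + neighbors_per_mas)

S = np.zeros((n_sources, n_times))
for k in range(n_mas):
    S[rows[k, 0]] = waveforms[k]             # main active source
    S[rows[k, 1:]] = scale * waveforms[k]    # its neighbors: scaled copies

n_nonzero_rows = int(np.count_nonzero(np.linalg.norm(S, axis=1)))
rank = int(np.linalg.matrix_rank(S))
print(n_nonzero_rows, rank)
```

Because every neighbor row is a scalar multiple of a main-source row, the rank equals the number of linearly independent main waveforms, regardless of how many neighbors are added.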

As the forward model $A$, we used a three-shell concentric spherical head model. In this model, the inner sphere represents the brain, the intermediate layer represents the skull, and the outer layer represents the scalp [37].

To obtain the values of each of the components of the matrix $A$, we need to solve the EEG forward problem [38]: Given the electrical activity of the current sources within the brain and a model for the geometry of the conducting media (brain, skull, and scalp, with their corresponding electric properties), compute the resulting EEG signals.

This problem was solved by using the SPM software [39].

- The 3-D Coordinate System: In this section we will introduce the standard three-dimensional coordinate system as well as some common notation and concepts needed to work in three dimensions.
- Equations of Lines: In this section we will derive the vector form and parametric form for the equation of lines in three-dimensional space. We will also give the symmetric equations of lines in three-dimensional space. Note as well that these forms can also be useful for lines in two-dimensional space.
- Equations of Planes: In this section we will derive the vector and scalar equations of a plane. We also show how to write the equation of a plane from three points that lie in the plane.
- Quadric Surfaces: In this section we will be looking at some examples of quadric surfaces. Some examples of quadric surfaces are cones, cylinders, ellipsoids, and elliptic paraboloids.
- Functions of Several Variables: In this section we will give a quick review of some important topics about functions of several variables. In particular, we will discuss finding the domain of a function of several variables as well as level curves, level surfaces, and traces.
- Vector Functions: In this section we introduce the concept of vector functions, concentrating primarily on curves in three-dimensional space. We will, however, touch briefly on surfaces as well. We will illustrate how to find the domain of a vector function and how to graph a vector function. We will also show a simple relationship between vector functions and parametric equations that will be very useful at times.
- Calculus with Vector Functions: In this section we discuss how to do basic calculus with vector functions.
- Tangent, Normal and Binormal Vectors: In this section we will define the tangent, normal, and binormal vectors.
- Arc Length with Vector Functions: In this section we will extend the arc length formula we used earlier in the material to include finding the arc length of a vector function.
- Curvature: In this section we give two formulas for computing the curvature.
- Velocity and Acceleration: In this section we will revisit a standard application of derivatives, the velocity and acceleration of an object whose position function is given by a vector function. For the acceleration we give formulas for both the normal acceleration and the tangential acceleration.
- Cylindrical Coordinates: In this section we will define the cylindrical coordinate system, an alternate coordinate system for the three-dimensional coordinate system. As we will see, cylindrical coordinates are really nothing more than a very natural extension of polar coordinates into a three-dimensional setting.
- Spherical Coordinates: In this section we will define the spherical coordinate system, yet another alternate coordinate system for the three-dimensional coordinate system. This coordinate system is very useful for dealing with spherical objects. We will derive formulas to convert between cylindrical coordinates and spherical coordinates as well as between Cartesian and spherical coordinates (the more useful of the two).
- We will also see a fairly quick method that can be used, on occasion, for showing that some limits do not exist.
- Partial Derivatives: In this section we will look at the idea of partial derivatives. We will give the formal definition of the partial derivative as well as the standard notations and how to compute them in practice.


There is only one very important subtlety that you need to always keep in mind while computing partial derivatives.

- Interpretations of Partial Derivatives: In this section we will take a look at a couple of important interpretations of partial derivatives. First, the always important rate of change of the function. We will also see that partial derivatives give the slope of tangent lines to the traces of the function.
- Higher Order Partial Derivatives: In this section we will take a look at higher order partial derivatives. Unlike Calculus I, however, we will have multiple second order derivatives, multiple third order derivatives, etc.
- Differentials: In this section we extend the idea of differentials we first saw in Calculus I to functions of several variables.
- Chain Rule: In this section we extend the idea of the chain rule to functions of several variables. In particular, we will see that there are multiple variants of the chain rule here, all depending on how many variables our function is dependent on and how each of those variables can, in turn, be written in terms of different variables. We will also give a nice method for writing down the chain rule for pretty much any situation you might run into when dealing with functions of multiple variables. In addition, we will derive a very quick way of doing implicit differentiation so we no longer need to go through the process we first did back in Calculus I.
- Directional Derivatives: In this section we introduce the concept of directional derivatives. With directional derivatives we can now ask how a function is changing if we allow all the independent variables to change rather than holding all but one constant as we had to do with partial derivatives.