Electroencephalography robust statistical
linear modelling using a single weight per trial
Cyril Pernet1,2, Ignacio Suay Mas2, Guillaume Rousselet3, Ramon Martinez4, Rand Wilcox5 & Arnaud Delorme4,6,7
1 Neurobiology Research Unit, Copenhagen University Hospital, Rigshospitalet, Denmark, 2 Centre for Clinical Brain Sciences, University of Edinburgh, United Kingdom,
3 Centre for Cognitive Neuroimaging, Institute of Neuroscience and Psychology, University of Glasgow, United Kingdom, 4 Swartz Center for Computational Neurosciences,
University of California, San Diego, 5 Department of Psychology, University of Southern California, United States of America, 6 Centre de Recherche Cerveau et Cognition,
Université Toulouse III – Paul Sabatier, Toulouse, France, 7 Centre National de la Recherche Scientifi que, Centre de Recherche Cerveau et Cognition, Toulouse, France
Pernet etal. This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 IGO License, which permits the copy
and redistribution of the material in any medium or format provided the original work and author are properly credited. In any reproduction of this article there should not be any
suggestion that APERTURE NEURO or this article endorse any speci c organization or products. The use of the APERTURE NEURO logo is not permitted. This notice should be
preserved along with the article’s original URL. Open access logo and text by PLoS, under the Creative Commons Attribution-Share Alike 4.0 Unported license.
: 2022, Volume 2 - 1 - CC-BY: © Pernet etal.
: 2022, Volume 2 - 1 -
Being able to remove or weigh down the in uence of outlier data is desirable for any statistical model. While magnetic and electro-
encephalographic (MEEG) data are often averaged across trials per condition, it is becoming common practice to use information
from all trials to build statistical linear models. Individual trials can, however, have considerable weight and thus bias inferential
results (effect sizes as well as thresholded t/F/p maps). Here, rather than looking for univariate outliers, de ned independently at
each measurement point, we apply the principal component projection (PCP) method at each channel, deriving a single weight
per trial at each channel independently. Using both synthetic data and open electroencephalographic (EEG) data, we show (1) that
PCP is ef cient at detecting a large variety of outlying trials; (2) how PCP-based weights can be implemented in the context of the
general linear model (GLM) with accurate control of type 1 family-wise error rate; and (3) that our PCP-based weighted least squares
(WLS) approach increases the statistical power of group analyses as well as a much slower iterative reweighted least squares (IRLS)
approach, although the weighting scheme is markedly different. Together, our results show that WLS based on PCP weights derived
from whole trial pro les is an ef cient method to weigh down the in uence of outlier EEG data in linear models.
Keywords: Electroencephalography, single trials, statistical outliers, weighted least squares, general linear model
Data availability: All data used are publicly available (CC0); all codes (simulations and data analyses) are also available online in the LIMO MEEG GitHub repository (MIT license).
Date Received: May 03, 2021
Date Accepted: December 30, 2021
DOI: 10.52294/2e69f7cc-f061-40ad-ad77-017110464dfd
channel × time × trials and channel × frequency × time ×
trials. Several neuroimaging packages are dedicated to
the statistical analyses of such large multidimensional data,
often using linear methods. For instance, in the LIMO MEEG
toolbox (2), each channel, frequency and time frame are an-
alysed independently using the general linear model, an
approach referred to as mass-univariate analysis. Ordinary
least squares (OLS) approaches are used to  nd model pa-
rameters that minimize the error between the model and
the data. For least squares estimates to have good statistical
properties, it is however expected that the error covariance
off-diagonals are zeros, such that Cov(e) =
2l, l being the
identity matrix (3), assuming observations are independent
and identically distributed. It is well established that devia-
tions from that assumption lead to substantial power reduc-
tion and to an increase in the false-positive rate. When OLS
assumptions are violated, robust techniques offer reliable
solutions to restore power and control the false-positive
rate. Weighted least squares (WLS) approach is one such
Magnetic and electroencephalographic (MEEG) data
analysis can be subdivided into two main parts: prepro-
cessing and processing. Data preprocessing corresponds
to any manipulation and transformation of the data, such
as artefacts removal and attenuation or change in data
representations (spectral transformation, source model-
ling). Data processing refers to mathematical procedures
that do not change the data, i.e. statistical analysis and
statistical modelling (1). Here, we present a method that
improves data processing by weighting down trials that
have different dynamics from the bulk of the data, within
the context of an experimental design. We investigated
single-trial weighting for full scalp/channel statistical lin-
ear hierarchical modelling, providing a more robust sta-
tistical analysis method to the community.
After data preprocessing, MEEG data are typically ep-
oched to form three- or four-dimensional matrices of e.g.
: 2022, Volume 2 - 2 - CC-BY: © Pernet et al.
: 2022, Volume 2 - 2 -
robust method that uses different weights across trials, such
that Cov(e) =
V, with V a diagonal matrix:
y = X
e, E
= 0,
) =
with y an n-dimensional vector (number of trials), X the
n*p design matrix,
a p-dimensional vector (number of
predictors in X) and e the error vector of dimension n.
The WLS estimators can then be obtained using an OLS
on transformed data (equations 2 and 3):
Wy = WX
+ We, E(e) = 0, Cov(e) =
I (2)
= (XTWX)1XTWy (3)
with W a 1*n vector of weights.
When applied to MEEG data, a standard mass-univariate
WLS entails obtaining a weight for each trial but also
each dimension analysed, i.e. channels, frequencies and
time frames. Following such procedure, a trial could be
considered as an outlier or be assigned a low weight,
for a single frequency or time frame, which is implausi-
ble given the well-known correlations of MEEG data over
space, frequencies and time. We propose here that a
single or a few consecutive data points should never be
agged as outliers or weighted down, and that a single
weight per trial (and channel) should be derived instead,
with weights taking into account the whole temporal or
spectral pro le. In the following, we demonstrate how
the principal component projection (PCP) method (4) can
be used in this context, and how those weights can then
be used in the context of the general linear model (WLS),
applied here to event-related potentials. Such weight-
ing should not be taken to bypass data preprocessing.
Because measurement artefacts and physiological arte-
facts are typically several orders of magnitude stronger
than the signal of interest, and often span consecutive
trials, they require dedicated algorithms. After data
preprocessing, residual signal artefacts might however
remain. In addition, natural and unintended experimen-
tal variations might also exist among all trials, and the
weighting scheme aims to account for these.
Trial-based weighted least squares
The new WLS solution consists of computing  rst-level
general linear model (GLM) beta estimates using weights
from the PCP algorithm (4). As such, the estimation proce-
dure follows the usual steps of weighting schemes, like a
standard iterative reweighted least squares (IRLS) solution:
1. After the OLS solution is computed, an adjust-
ment is performed on residuals by multiplying them
1/ 1where h is a vector of leverage points
(i.e. the diagonal of the hat matrix H = X(X, X)1 X where
X is the design matrix re ecting experimental condi-
tions). This adjustment is necessary because leverage
points are the most in uential on the regression space,
i.e. they tend to have low residual values (5).
2. Residuals are then standardized using a robust esti-
mator of dispersion, the median absolute deviation
(MAD) to the median, and re-adjusted by the tun-
ing function. Here, we used the bisquare function. A
series of weights is next obtained either based on
distances (PCP method) or minimizing the error term
(IRLS) with, in both cases, high weights for data points
having high residuals (with a correction for leverage).
3. The solution is then computed following equation 3.
Unlike IRLS, WLS weights are not derived for each
time frames independently but by looking at the multi-
variate distances in the principal component space (step
2 above) and thus deriving a unique weight for all time
frames. An illustration of the method is shown in  gure 1.
Trial weights are computed as a distance among trials pro-
jected onto the main (≥99%) principal components space.
Here, the principal components computed over the f time
frames are those directions that maximize the variance
across trials for uncorrelated (orthogonal) time periods ( g-
ure 1B). Outlier trials are points in the f-dimensional space
that are far away from the bulk. By virtue of the principal
component analysis (PCA), these outlier trials become
more visible along the principal component axes than in
the original data space. Weights ( gure 1E) for each trial
are obtained using both the Euclidean norm ( gure 1C, dis-
tance location) and the kurtosis weighted Euclidean norm
( gure 1D, distance scatter) in this reduced PCA space (see
Ref. (4) for details). Weights are then applied to each trial
(at each data points) on the original data (i.e. there is no
data reduction). We exploit this simple technique because
it is computationally fast given the rich dimensional space
of EEG data and because it does not assume the data to
originate from a particular distribution. The PCP algorithm
is implemented in the limo_pcout.m function and the WLS
solution in the limo_WLS.m function, both distributed
with the LIMO MEEG toolbox (https://limo-eeg-toolbox.
Simulation-based analyses
A. Outlier detection and effectiveness of the weighting
Simulated event-related potentials (ERPs) were generated
to evaluate the classi cation accuracy of the PCP meth-
od and to estimate the robustness-to-outliers and signal-
to-noise ratio (SNR) of the WLS solution in comparison
to an OLS solution and a standard IRLS solution, which
minimizes residuals at each time frame separately (imple-
mented in limo_IRLS.m). To do so, we manipulated the
following: (1) the percentage of outliers, using 10%, 20%,
30%, 40% or 50% of outliers; (2) the SNR (de ned relative
to the mean over time of the background activity); and (3)
the type of outliers. The  rst set of outliers were de ned
: 2022, Volume 2 - 3 - CC-BY: © Pernet et al.
: 2022, Volume 2 - 3 -
and the truth while KS distances provide a fuller picture
of the overall differences in distributions.
The code used to generate the ERP and the results
are available at https://github.com/LIMO-EEG-Toolbox/
with TP the true positives, TN the true negatives, FP the
false positives and FN the false negatives.
B. Statistical inference
Accurate estimation of model parameters (i.e. beta esti-
mates in the GLM – equation 3) is particularly important
because it impacts group-level results. Inference at the
single-subject level may, however, also be performed
and accurate p-values need to be derived. Here, error
degrees of freedom are obtained using the Satterthwaite
approximation ((8) equation 5).
dfe = tr ([I H]T[I H]) (5)
with dfe, the degrees of freedom of the error, the identity
matrix, and the hat matrix.
based on the added noise: white noise, pink noise, alpha
oscillations and gamma oscillations following (6). In these
cases, the noise started with a P1 component and lasted
~200 ms (see below). The second set of outliers were de-
ned based on their amplitude or outlier-to-signal ratio
(0.5, 0.8, 1.2 and 1.5 times the true N1 amplitude).
Synthetic data were generated for one channel, using
the model developed by Yeung et al. (2018 — (7)). The
simulated signal corresponded to an event-related po-
tential with P1 and N1 components (100 ms long) added
to background activity with the same power spectrum
as human EEG, generating 200 trials of 500 ms dura-
tion with a 250-Hz sampling rate. Examples for each
type of simulation are shown in  gure 2 and results are
based, for each case, on a thousand random repeti-
tions. Performance of the PCP algorithm at detecting
outlying synthetic EEG trials was investigated by com-
puting the confusion matrix and mapping the true- and
false-positive rates in the receiver operating space,
and by computing the Matthew correlation coef cients
(MCCs – equation 4). The robustness, or effectiveness
of the weighting scheme, was examined by computing
the Pearson correlations and the Kolmogorov–Smirnov
(KS) distances between the ground truth mean and the
OLS, WLS and IRLS means. Pearson values allowed to es-
timate the linear relationships between estimated means
Fig. 1. Illustration of the PCP weighting scheme using trials for ‘famous faces’ of the OpenNeuro.org publicly available ds002718 dataset. Data are from subject 3, chan-
nel 34 (see section on empirical data analysis). Panel A shows the single-trial responses to all stimuli. The principal component analysis is computed over time, keeping
the components explaining the most variance and summing to at least 99% of explained variance (giving here 69 eigenvectors, i.e. independent time components from
the initial 176 time points). The data are then projected onto those axes (panel B). From the data projected onto the components, Euclidean distances for location and
scatter are computed (panels C and D – showing smooth histograms of weights) and combined to obtain a distance for each trial. That distance is either used as weights
in a linear model or used to determine outliers (panel E, with outliers identi ed for weights below ~0.27, shown in dark grey). At the bottom right, the mean ERP for
trials classi ed as good (red) vs. outliers (black) and the weighted mean (green) are shown (panels F and G). Shaded areas indicate the 95% highest-density percentile
bootstrap intervals.