📄 GBCIC 2026

Riemannian Geometry-Preserving Variational Autoencoder
for MI-BCI Data Augmentation

Viktorija Poļaka  ·  Ivo Pascal De Jong  ·  Andreea Ioana Sburlea

Faculty of Science and Engineering · University of Groningen · Groningen, The Netherlands

Abstract

This paper addresses the challenge of generating synthetic electroencephalogram (EEG) covariance matrices for motor imagery brain-computer interface (MI-BCI) applications. We aim to develop a generative model capable of producing high-fidelity synthetic covariance matrices while preserving their symmetric positive-definite (SPD) nature. We propose a Riemannian Geometry-Preserving Variational Autoencoder (RGP-VAE) integrating geometric mappings with a composite loss function combining Riemannian distance, tangent space reconstruction accuracy and generative diversity. The model generates valid, representative EEG covariance matrices while learning a subject-invariant latent space. Synthetic data proves practically useful for MI-BCI, with its impact depending on the paired classifier. This work introduces and validates the RGP-VAE as a geometry-preserving generative model for EEG covariance matrices, highlighting its potential for signal privacy, scalability and data augmentation.

Model Architecture

RGP-VAE Architecture Diagram

Figure 1. An overview of the proposed RGP-VAE. An input SPD matrix Xi is projected onto the tangent space at a reference point Pref via the logarithmic map (logPref), vectorized, and fed to the encoder. The encoder produces a latent distribution (μ, log σ²) from which a latent vector zi is sampled. The decoder reconstructs the tangent representation, which is mapped back onto the SPD manifold via the exponential map (expPref).
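The logarithmic and exponential maps in Figure 1 can be sketched with plain NumPy under the affine-invariant metric. This is a minimal illustration, not the paper's implementation: the 4-channel toy matrix and the identity reference point are placeholders.

```python
import numpy as np

def _sym_fn(M, fn):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.T

def log_map(X, P_ref):
    """Project the SPD matrix X onto the tangent space at P_ref."""
    P_isqrt = _sym_fn(P_ref, lambda w: w ** -0.5)
    P_sqrt = _sym_fn(P_ref, np.sqrt)
    return P_sqrt @ _sym_fn(P_isqrt @ X @ P_isqrt, np.log) @ P_sqrt

def exp_map(S, P_ref):
    """Map the tangent-space matrix S back onto the SPD manifold at P_ref."""
    P_isqrt = _sym_fn(P_ref, lambda w: w ** -0.5)
    P_sqrt = _sym_fn(P_ref, np.sqrt)
    return P_sqrt @ _sym_fn(P_isqrt @ S @ P_isqrt, np.exp) @ P_sqrt

# Round trip: mapping to the tangent space and back recovers the input matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4.0 * np.eye(4)      # toy SPD "covariance" matrix
P_ref = np.eye(4)                  # toy reference point
X_rec = exp_map(log_map(X, P_ref), P_ref)
print(np.allclose(X_rec, X))       # True
```

The tangent representation is then vectorized and fed to the encoder; the decoder's output is reshaped back into a symmetric matrix before the exponential map is applied.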

Quantitative Results

Average balanced accuracy (%) across 12 subjects. Significance assessed with a Wilcoxon signed-rank test under Bonferroni correction (α = 0.0083).

| Generator | Classifier | Baseline | Augmented | Δ Augmented | p-value | Synthetic-Only | Δ Synthetic | p-value |
|-----------|-----------|----------|-----------|-------------|---------|----------------|-------------|---------|
| Prior     | MDM | 59.5±5.5 | 58.9±5.4 | −0.59 | 0.092 | 58.4±5.0 | −1.16 | 0.043 |
| Prior     | KNN | 53.2±4.0 | 55.4±4.2 | +2.19 | 0.003 | 56.2±4.2 | +3.00 | 0.001 |
| Prior     | SVC | 60.7±5.3 | 57.4±6.3 | −3.24 | 0.016 | 56.8±6.4 | −3.92 | 0.002 |
| Posterior | MDM | 59.5±5.5 | 58.8±5.3 | −0.69 | 0.092 | 59.0±5.5 | −0.57 | 0.151 |
| Posterior | KNN | 53.2±4.0 | 55.6±4.1 | +2.45 | 0.002 | 56.7±4.1 | +3.49 | 0.002 |
| Posterior | SVC | 60.7±5.3 | 57.2±6.6 | −3.48 | 0.007 | 56.7±6.3 | −4.01 | 0.002 |

Key Findings

100% Valid SPD Matrices

Across all folds, every synthetic EEG covariance matrix from both the prior and posterior generators passed symmetry and positive-definiteness checks, in contrast to the 40% failure rate observed for a standard Euclidean VAE.
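A validity check of this kind can be implemented as a symmetry test plus a Cholesky factorisation, which succeeds exactly when a symmetric matrix is positive definite. This is a sketch; the function name and tolerance are our choices, not taken from the paper.

```python
import numpy as np

def is_valid_spd(M, tol=1e-10):
    """Return True iff M is symmetric and positive definite."""
    if not np.allclose(M, M.T, atol=tol):
        return False                       # not symmetric
    try:
        np.linalg.cholesky(M)              # fails iff M is not positive definite
        return True
    except np.linalg.LinAlgError:
        return False

A = np.random.default_rng(1).standard_normal((8, 8))
print(is_valid_spd(A @ A.T + 1e-3 * np.eye(8)))   # True: a genuine covariance
print(is_valid_spd(-np.eye(8)))                   # False: negative definite
```

Cholesky is a cheap and numerically robust positive-definiteness test, avoiding an explicit eigenvalue computation.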

KNN Benefits Significantly

Posterior sampling yielded the largest classification gain for KNN (+3.49 % on average, p=0.002), and prior sampling was similarly beneficial (+3.00 % on average, p=0.001). Subject-level gains reached up to +7.8 %.

Subject-Invariant Latent Space

UMAP visualization reveals that latent codes from different subjects are heavily intermingled, indicating that the model learns generalised cross-subject representations via parallel transport alignment.
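Under the affine-invariant metric, parallel transport alignment amounts to a congruence transform that moves each subject's covariances so that a common reference point sits at the identity. A minimal sketch follows; it uses the arithmetic mean as a cheap stand-in for the Riemannian mean, which the paper's actual alignment may compute differently.

```python
import numpy as np

def _sym_power(M, p):
    """Matrix power of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w ** p) @ V.T

def recenter_subject(covs):
    """Transport a subject's SPD matrices so their mean lands at the identity.

    NOTE: uses the arithmetic mean as an illustrative stand-in; a faithful
    implementation would use the Riemannian (geometric) mean instead.
    """
    W = _sym_power(covs.mean(axis=0), -0.5)   # whitening transform M^{-1/2}
    return np.stack([W @ C @ W for C in covs])

# After re-centering, each subject's mean covariance equals the identity,
# so matrices from different subjects share a common origin on the manifold.
rng = np.random.default_rng(2)
covs = []
for _ in range(20):
    B = rng.standard_normal((6, 6))
    covs.append(B @ B.T + np.eye(6))       # toy per-trial covariances
covs = np.stack(covs)
aligned = recenter_subject(covs)
print(np.allclose(aligned.mean(axis=0), np.eye(6)))   # True
```

Because each transform is a congruence by an SPD matrix, the aligned outputs remain symmetric positive-definite.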

Realistic Diversity

With a noise scaling factor of σi = 2.2 and γ = 0.035, the statistical variance and geometric diversity of the synthetic data closely match those of the original data, without distorting SPD properties.
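In a VAE with reparameterised sampling, a noise scaling factor of this kind simply inflates the standard deviation of the latent draw. The sketch below is our assumption about where σi enters; γ is a loss weight in the paper and does not appear here.

```python
import numpy as np

def sample_latent(mu, log_var, noise_scale=2.2, rng=None):
    """Reparameterisation trick with an inflated noise scale:
    z = mu + noise_scale * sigma * eps, with eps ~ N(0, I).
    A larger noise_scale spreads samples further from the mean,
    increasing the diversity of the generated tangent vectors."""
    rng = rng if rng is not None else np.random.default_rng()
    eps = rng.standard_normal(np.shape(mu))
    return mu + noise_scale * np.exp(0.5 * np.asarray(log_var)) * eps

# With noise_scale = 1 this is standard VAE sampling; 2.2 widens the spread.
mu, log_var = np.zeros(16), np.zeros(16)
z = sample_latent(mu, log_var, noise_scale=2.2, rng=np.random.default_rng(3))
```

Because the scaled sample still lives in the tangent space, mapping it back through the exponential map always yields a valid SPD matrix, which is why diversity can be increased without breaking SPD properties.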

Classifier-Dependent Utility

SVC performance significantly degraded with augmentation (up to −4.01 %, p=0.002), while MDM remained stable. Data augmentation utility is not universal; it depends on the classifier.

Privacy & Scalability

Beyond classification, synthetic EEG covariance generation enables privacy-preserving data sharing and pipeline scalability testing without requiring raw neural signal sharing.

Results

Prior sampling accuracy improvements

Figure 2. Prior Sampling Accuracy Improvements. Distribution of accuracy improvement for each classifier using the prior generator. The plot shows the percentage-point difference of the "Augmented" and "Synthetic-Only" conditions relative to the "Baseline" across all subjects. The red line marks the mean and the blue line the median.

Posterior sampling accuracy improvements

Figure 3. Posterior Sampling Accuracy Improvements. Distribution of accuracy improvement for each classifier using the posterior generator, showing similar trends to the prior generator but with more pronounced fluctuations.

BibTeX

@inproceedings{polaka2026rgpvae,
  title={Riemannian Geometry-Preserving Variational Autoencoder for {MI-BCI} Data Augmentation},
  author={Poļaka, Viktorija and de Jong, Ivo Pascal and Sburlea, Andreea Ioana},
  booktitle={Submitted to Graz Brain-Computer Interface Conference},
  volume={10},
  year={2026},
  url={https://641e16.github.io/RGP-VAE/}
}