# bfms — Brain Foundation Models
bfms is a unified PyTorch framework for multimodal physiological representation learning and real-time cognitive state estimation. It provides core model architectures and algorithms for brain foundation models (BFMs) in a standardized, modular manner.
Brain foundation models are large encoder networks pre-trained on vast unlabelled physiological recordings via self-supervised objectives, then fine-tuned for specific cognitive state estimation tasks with minimal labelled data. They tackle the core labelling bottleneck in psychophysiology: while unlabelled EEG/BVP/GSR recordings are increasingly abundant from clinical archives and wearable deployments, carefully annotated cognitive-state data remains scarce.
## Why bfms?
Collecting labelled cognitive-state data is expensive:

- Participants must complete long experiments with carefully controlled stimuli.
- Ground-truth labels require validated instruments (NASA-TLX, SART) administered repeatedly, interrupting experimental flow.
- A well-controlled study typically yields data from 20–50 participants — orders of magnitude fewer than the millions of samples used to train vision or language foundation models.
bfms addresses this by separating representation learning from task-specific prediction. A single pre-trained encoder backbone is shared across all downstream tasks, sensors, and populations.
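A minimal sketch of this separation, assuming hypothetical module and head names (bfms's actual API may differ): one pre-trained backbone produces embeddings, and small task-specific heads are fine-tuned on top of it with limited labelled data.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Illustrative pre-trained backbone shared by all downstream tasks."""
    def __init__(self, in_channels=32, dim=128):
        super().__init__()
        # Patchify the raw signal, then contextualize patches with a transformer.
        self.proj = nn.Conv1d(in_channels, dim, kernel_size=16, stride=16)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )

    def forward(self, x):                        # x: (batch, channels, time)
        tokens = self.proj(x).transpose(1, 2)    # (batch, patches, dim)
        return self.encoder(tokens).mean(dim=1)  # pooled window embedding

# One backbone, several lightweight task heads.
backbone = SharedEncoder()
heads = {
    "workload": nn.Linear(128, 3),  # e.g. low / medium / high
    "fatigue": nn.Linear(128, 2),
}
eeg = torch.randn(8, 32, 256)       # 8 windows, 32 channels, 256 samples
z = backbone(eeg)                   # (8, 128) shared representation
logits = heads["workload"](z)       # (8, 3) task-specific prediction
```

In this setup only the heads (and optionally a few encoder layers) need labelled data, which is what makes the 20–50-participant regime workable.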
## Design Principles
Our models aim to satisfy the following properties:
| Principle | Description |
|---|---|
| Task-Agnostic | Generalizes across tasks engaging different cognitive faculties (memory, attention, workload, spatial reasoning) |
| Subject-Agnostic | Robust to inter-subject variability; supports personalized adaptation (e.g. PULSE, TERSE) |
| Hardware-Agnostic | Works across devices, manufacturers, and sensor generations with varying sampling rates and channel counts (e.g. EEG-X) |
| Channel Topology-Agnostic | Handles variable channel sets with full permutation equivariance (e.g. DIVER-0, LUNA) |
| Sequence Length-Agnostic | Supports arbitrary-length recording durations |
| Privacy-Preserving | Resistant to biometric identity extraction from model weights or activations |
| Modality-Agnostic | Applicable to EEG, BVP/PPG, GSR, ECG, eye-tracking, and speech |
| Multi-Modal Fusion | Unified processing to identify modality-invariant and modality-specific features (e.g. MISA, PhysioOmni) |
| Asymmetric Cross-Modal Transfer | Leverages rich EEG supervisory signals to build quality encoders for data-scarce modalities |
## Supported Modalities
| Modality | Signal | Primary Cognitive Relevance |
|---|---|---|
| EEG | Electrical cortical activity | Workload, attention, emotion, fatigue |
| ECG | Cardiac electrical activity | Stress, autonomic arousal |
| PPG / BVP | Peripheral blood volume pulse | Heart rate variability → stress and load |
| Eye Gaze & Pupillometry | Gaze position, pupil diameter | Workload, situational awareness |
| Speech | Acoustic para-linguistic features | Stress, arousal, cognitive load |
## Installation
### From PyPI
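Assuming the package is published on PyPI under the name `bfms` (not verified here), installation would look like:

```shell
pip install bfms
```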
### From Source
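A typical editable source install; `<repo-url>` is a placeholder since the repository URL is not given here:

```shell
# Replace <repo-url> with the project's repository URL.
git clone <repo-url>
cd bfms
pip install -e .
```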
Requires Python 3.10+. Core dependencies include PyTorch 2.0+, MNE, snnTorch, SpikingJelly, and BindsNET.
## Architecture Overview
bfms follows the Masked Autoencoder (MAE) paradigm popularized by Meta, adapted for physiological time-series. The framework is heavily inspired by torch-brain and braindecode, and implements architectural components from state-of-the-art large EEG models including LaBraM, EEGPT, and CBraMod.
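The core MAE idea, condensed to its masking step, can be sketched as follows. This is illustrative only (function names are hypothetical, not bfms's API): a random majority of patch tokens is dropped, the encoder sees only the visible remainder, and a decoder is trained to reconstruct the masked signal.

```python
import torch

def random_masking(tokens, mask_ratio=0.75):
    """Keep a random subset of patch tokens, MAE-style.

    tokens: (batch, num_patches, dim) patch embeddings of a signal window.
    Returns the visible tokens and the indices that were kept.
    """
    b, n, d = tokens.shape
    keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n)
    ids = noise.argsort(dim=1)        # random permutation per sample
    keep_ids = ids[:, :keep]
    visible = torch.gather(
        tokens, 1, keep_ids.unsqueeze(-1).expand(-1, -1, d)
    )
    return visible, keep_ids

# Pre-training step: encode only visible patches; a decoder (omitted here)
# would reconstruct the original signal at the masked positions.
patches = torch.randn(4, 64, 128)     # (batch, patches, embed_dim)
visible, keep_ids = random_masking(patches)
print(visible.shape)                  # torch.Size([4, 16, 128])
```

Because the encoder processes only ~25% of the patches, pre-training is cheap relative to full-sequence objectives, which is one reason the MAE recipe scales to large unlabelled archives.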
Beyond transformers, we also implement alternative backbone families:
- Spiking Neural Networks (SNNs) — energy-efficient temporal coding with biological plausibility
- Continuous Thought Machines (CTMs) — dynamic recurrent processing for variable-length sequences
- State Space Models (SSMs) — linear recurrence for long-sequence modelling
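As one example of the backbone families above, the linear recurrence at the heart of a diagonal SSM layer can be sketched as below (a naive sequential scan for clarity; real SSM implementations use parallel scans or convolutions, and all names here are hypothetical):

```python
import torch

def ssm_scan(u, a, b, c):
    """Naive diagonal SSM: x_t = a * x_{t-1} + b * u_t,  y_t = c * x_t.

    u: (batch, seqlen, dim) input sequence; a, b, c: (dim,) per-channel params.
    """
    batch, seqlen, dim = u.shape
    x = torch.zeros(batch, dim)
    ys = []
    for t in range(seqlen):
        x = a * x + b * u[:, t]   # linear state update (no nonlinearity)
        ys.append(c * x)
    return torch.stack(ys, dim=1)

u = torch.randn(2, 1000, 16)      # a long physiological sequence
a = torch.full((16,), 0.99)       # decay near 1 retains long-range context
y = ssm_scan(u, a, torch.ones(16), torch.ones(16))
```

The linearity of the update is what allows long sequences to be processed in sub-quadratic time, which is the appeal of SSMs for hour-scale recordings.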
## Contributing
We welcome contributions from the community! Please read our Contributing Guide to get started, and review our Code of Conduct.