bfms is a unified PyTorch framework for multimodal physiological representation learning and real-time cognitive state estimation, providing the core model architectures and algorithms for brain foundation models (BFMs) in a standardized, modular way.
Brain foundation models are large encoder networks pre-trained on vast unlabelled physiological recordings via self-supervised objectives, then fine-tuned for specific cognitive state estimation tasks with minimal labelled data. They tackle the core labelling bottleneck in psychophysiology: while unlabelled EEG/BVP/GSR recordings are increasingly abundant from clinical archives and wearable deployments, carefully annotated cognitive-state data remains scarce.
Collecting labelled cognitive-state data is expensive:
- Participants must complete long experiments with carefully controlled stimuli.
- Ground-truth labels require validated instruments (NASA-TLX questionnaires, SART probes) administered repeatedly, interrupting experimental flow.
- A well-controlled study typically yields data from 20–50 participants — orders of magnitude fewer than the millions of samples used to train vision or language foundation models.
bfms addresses this by separating representation learning from task-specific prediction. A single pre-trained encoder backbone is shared across all downstream tasks, sensors, and populations.
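The pre-train-once, fine-tune-many pattern can be sketched as follows. This is an illustrative sketch in plain PyTorch, not the bfms API: `TinyEncoder` and the head sizes are placeholders standing in for a pre-trained backbone and downstream task heads.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Placeholder backbone mapping (batch, channels, time) -> pooled embeddings."""
    def __init__(self, in_channels=8, embed_dim=64):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, embed_dim, kernel_size=16, stride=8)

    def forward(self, x):
        h = self.conv(x)        # (batch, embed_dim, tokens)
        return h.mean(dim=-1)   # pooled embedding: (batch, embed_dim)

encoder = TinyEncoder()
for p in encoder.parameters():  # freeze the shared, pre-trained backbone
    p.requires_grad = False

# Lightweight task-specific heads share the same frozen encoder.
workload_head = nn.Linear(64, 3)  # e.g. low/medium/high workload
fatigue_head = nn.Linear(64, 1)   # e.g. scalar fatigue score

x = torch.randn(4, 8, 256)  # 4 windows, 8 channels, 256 samples
z = encoder(x)
print(workload_head(z).shape)  # torch.Size([4, 3])
print(fatigue_head(z).shape)   # torch.Size([4, 1])
```

Only the small heads are trained per task, which is why a handful of labelled sessions can suffice downstream.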
Our models aim to satisfy the following properties:
| Principle | Description |
|---|---|
| Task-Agnostic | Generalizes across tasks engaging different cognitive faculties (memory, attention, workload, spatial reasoning) |
| Subject-Agnostic | Robust to inter-subject variability; supports personalized adaptation (e.g. PULSE, TERSE) |
| Hardware-Agnostic | Works across devices, manufacturers, and sensor generations with varying sampling rates and channel counts (e.g. EEG-X) |
| Channel Topology-Agnostic | Handles variable channel sets with full permutation equivariance (e.g. DIVER-0, LUNA) |
| Sequence Length-Agnostic | Supports arbitrary-length recording durations |
| Privacy-Preserving | Resistant to biometric identity extraction from model weights or activations |
| Modality-Agnostic | Applicable to EEG, BVP/PPG, GSR, ECG, eye-tracking, and speech |
| Multi-Modal Fusion | Unified processing to identify modality-invariant and modality-specific features (e.g. MISA, PhysioOmni) |
| Asymmetric Cross-Modal Transfer | Leverages rich EEG supervisory signals to build quality encoders for data-scarce modalities |
| Modality | Signal | Primary Cognitive Relevance |
|---|---|---|
| EEG | Electrical cortical activity | Workload, attention, emotion, fatigue |
| ECG | Cardiac electrical activity | Stress, autonomic arousal |
| PPG / BVP | Peripheral blood volume pulse | Heart rate variability → stress and load |
| Eye Gaze & Pupillometry | Gaze position, pupil diameter | Workload, situational awareness |
| Speech | Acoustic para-linguistic features | Stress, arousal, cognitive load |
Install from PyPI:

```bash
pip install bfms
```

Or install from source:

```bash
git clone https://github.com/aether-sutd/bfms.git
cd bfms
pip install .
```

Requires Python 3.10+. Core dependencies include PyTorch 2.0+, MNE, snnTorch, SpikingJelly, and BindsNET.
bfms follows the Masked Autoencoder (MAE) paradigm popularized by Meta, adapted for physiological time-series. The framework is heavily inspired by torch-brain and braindecode, and implements architectural components from state-of-the-art large EEG models including LaBraM, EEGPT, and CBraMod.
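The MAE-style objective can be sketched as follows. This is a minimal illustration, not the bfms implementation: the patch length and mask ratio are assumed values, and the zero decoder output stands in for a real reconstruction head.

```python
import torch

def mask_patches(x, patch_len=32, mask_ratio=0.5):
    """Split (batch, channels, time) into patches and mask a random subset."""
    b, c, t = x.shape
    n = t // patch_len
    patches = x[..., : n * patch_len].reshape(b, c, n, patch_len)
    n_masked = int(n * mask_ratio)
    masked_idx = torch.randperm(n)[:n_masked]
    visible = patches.clone()
    visible[:, :, masked_idx] = 0.0  # hide the masked patches from the encoder
    return visible, patches, masked_idx

x = torch.randn(2, 8, 256)  # 2 windows, 8 channels, 256 samples
visible, target, masked_idx = mask_patches(x)

# The encoder sees `visible`; a reconstruction head predicts the hidden
# patches. MSE is computed on masked positions only.
pred = torch.zeros_like(target)  # stand-in for the decoder output
loss = ((pred - target)[:, :, masked_idx] ** 2).mean()
```

The self-supervised signal comes entirely from the recording itself, which is what lets pre-training scale to unlabelled archives.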
Beyond transformers, we also implement alternative backbone families:
- Spiking Neural Networks (SNNs) — energy-efficient temporal coding with biological plausibility
- Continuous Thought Machines (CTMs) — dynamic recurrent processing for variable-length sequences
- State Space Models (SSMs) — linear recurrence for long-sequence modelling
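The linear recurrence at the heart of the SSM backbones can be sketched as h_t = A·h_{t-1} + B·x_t, y_t = C·h_t. The sketch below uses fixed illustrative parameters (a diagonal, stable A), not a trained bfms model:

```python
import torch

def ssm_scan(x, A, B, C):
    """x: (time, d_in); A: (d_state,) diagonal; B: (d_state, d_in); C: (d_out, d_state)."""
    h = torch.zeros(A.shape[0])
    ys = []
    for x_t in x:            # one step per sample: linear in sequence length
        h = A * h + B @ x_t  # state update
        ys.append(C @ h)     # readout
    return torch.stack(ys)

T, d_in, d_state, d_out = 100, 4, 16, 2
x = torch.randn(T, d_in)
A = torch.full((d_state,), 0.9)    # decay < 1 keeps the state stable
B = torch.randn(d_state, d_in) * 0.1
C = torch.randn(d_out, d_state) * 0.1
y = ssm_scan(x, A, B, C)
print(y.shape)  # torch.Size([100, 2])
```

Because memory cost is constant in sequence length, this family pairs naturally with the sequence length-agnostic property above.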
- Neural Attention from LaBraM
- Neural Codebook + Normalized EMA Quantizer from LaBraM
- Criss-Cross Attention from CBraMod
- Gradient Reversal Layer
- Contrastive, regression, and variance losses
- Signal processing utilities (filtering, normalization, spectral features)
- Curriculum trainer
- Continuous Thought Machine (CTM)
- EEG-adapted CTM variant
- Spiking GRU / TCN / Spikeformer backbone families
- Synaptic SNN variants (mono- and tri-synaptic)
- Full LaBraM pre-training architecture
- Full CBraMod pre-training architecture
- MAE pre-training pipeline (masking, patch embedding, reconstruction head)
- JEPA-style pre-training pipeline
- Dataset loaders: HTC, MOCAS, N-Back, SENSE-42, UNIVERSE, WAUC
- PyTorch dataset classes for EEG classification and regression
- Raw and processed EEG dataset wrappers
- Unified streaming dataloader for large-scale pre-training
- Channel topology-agnostic encoding (variable channel permutation equivariance)
- Hardware/device-agnostic encoding across EEG generations
- Sequence length-agnostic temporal processing
- Multi-modal fusion architecture (EEG + PPG + ECG + eye-tracking)
- Asymmetric cross-modal knowledge transfer
- Differentially-private pre-training
- Subject-agnostic pre-training with inter-subject alignment
- Personalized fine-tuning interfaces (LoRA, adapter, prompt tuning)
- Out-of-the-box integration with PULSE / PhysioPFM-style adaptation
- LASTS-based surrogate explanation framework
- Counterfactual explanation utilities
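The gradient reversal layer listed among the building blocks can be sketched as follows: the forward pass is the identity, while the backward pass negates (and scales) the gradient, so an adversarial subject classifier pushes the encoder toward subject-invariant features. This is a generic sketch of the technique, not the bfms implementation:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity forward; negated, scaled gradient backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing into the encoder; no grad for lambd.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

x = torch.ones(3, requires_grad=True)
y = grad_reverse(x, lambd=0.5).sum()
y.backward()
print(x.grad)  # tensor([-0.5000, -0.5000, -0.5000])
```

In a subject-adversarial setup, `grad_reverse` sits between the encoder and the subject classifier, so minimizing the classifier's loss maximizes subject confusion in the encoder.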
```
bfms/
├── src/bfms/
│   ├── datasets/      # PyTorch dataset classes and ETL loaders
│   ├── losses/        # Contrastive, regression, and variance losses
│   ├── models/        # Full model implementations (LaBraM, CBraMod, CTM, SNN, …)
│   ├── nn/            # Reusable building blocks
│   │   ├── attentions/  # Neural, criss-cross, spike attention
│   │   ├── quantizers/  # Normalized EMA codebook
│   │   ├── embeddings/  # Patch and positional embeddings
│   │   ├── functional/  # Signal transforms and normalization
│   │   ├── snn/         # Spiking neural network layers
│   │   └── ctm/         # CTM helper modules
│   ├── processing/    # Signal preprocessing and feature extraction
│   ├── trainers/      # Training loop utilities
│   └── utils/         # Masking, sampling, ML utilities
└── docs/              # Documentation source
```
We welcome contributions from the community! Please read our Contributing Guide to get started, and review our Code of Conduct.