What is fastsem?
fastsem implements the RAM (Reticular Action Model) parameterization for Structural Equation Models. It reads lavaan-style model syntax from stdin, loads data from a CSV file, and fits the model using L-BFGS with analytical gradients. When an OpenCL-capable GPU is present the float32 objective is evaluated on-device while the float64 analytical gradient runs on CPU, eliminating the precision issues of finite-difference GPU optimisation.
```
# example.sem — 1-factor CFA with missing data and a definition variable
# /path/to/data.csv
eta =~ 1*x1 + data.t * x2 + x3
eta ~~ eta
x1 ~~ x1
x2 ~~ x2
x3 ~~ x3
```
Run with: ./fastsem < example.sem
Features
GPU Acceleration
OpenCL float32 objective evaluated on-device. Analytical float64 gradient on CPU. Degrades gracefully to multi-threaded CPU when no GPU is available.
FIML
Full Information Maximum Likelihood handles MCAR / MAR missing data without list-wise deletion. Parallel evaluation via the Weave runtime for small models (≤ 32 variables).
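The casewise FIML idea can be sketched in a few lines of NumPy. This is illustrative only (the function name and the plain Python loop are mine, not fastsem's Nim implementation): each row contributes a multivariate-normal term built from just its observed variables, so no row is ever dropped.

```python
import numpy as np

def fiml_neg2_loglik(X, mu, sigma):
    """Casewise -2 ln(L): each row uses only its observed variables.
    Illustrative sketch, not fastsem's actual (Nim, parallel) code."""
    total = 0.0
    for row in X:
        obs = ~np.isnan(row)              # observed-variable mask for this row
        k = int(obs.sum())
        if k == 0:
            continue                      # fully missing row contributes nothing
        s = sigma[np.ix_(obs, obs)]       # model-implied sub-covariance
        d = row[obs] - mu[obs]            # residual on the observed variables
        _, logdet = np.linalg.slogdet(s)
        total += k * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(s, d)
    return total
```

With no missing values this reduces to the ordinary multivariate-normal −2 ln(L), which is why list-wise deletion is never needed.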
Definition Variables
mxsem-style `data.col * var` syntax for per-observation parameter scaling, e.g. time-varying loadings in growth curve models.
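For instance, in a latent growth model the slope loadings can be tied to each person's measurement occasions stored in the data (the column names `t1`–`t3` here are hypothetical):

```
i =~ 1*y1 + 1*y2 + 1*y3
s =~ data.t1*y1 + data.t2*y2 + data.t3*y3
```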
Analytical Gradients
Exact ML and FIML gradients via the RAM weight matrix W = Σ⁻¹ − Σ⁻¹ S Σ⁻¹. No finite differences on the optimisation surface.
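The mechanism can be sketched in NumPy (illustrative, not fastsem's Nim internals): the ML discrepancy F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p has gradient ∂F/∂θ = tr(W ∂Σ/∂θ) with W as above, which a finite-difference check confirms on a toy parameterisation:

```python
import numpy as np

def ml_discrepancy(S, sigma):
    """ML fit function F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = S.shape[0]
    return (np.linalg.slogdet(sigma)[1] + np.trace(S @ np.linalg.inv(sigma))
            - np.linalg.slogdet(S)[1] - p)

def ml_gradient_wrt(S, sigma, dsigma):
    """dF/dtheta = tr(W dSigma/dtheta), W = Sigma^-1 - Sigma^-1 S Sigma^-1."""
    inv = np.linalg.inv(sigma)
    W = inv - inv @ S @ inv
    return np.trace(W @ dsigma)

# Verify the analytical gradient against a central finite difference
# for the toy parameterisation Sigma(theta) = theta * I.
S = np.array([[2.0, 0.5], [0.5, 1.0]])
theta, eps = 1.0, 1e-6
grad = ml_gradient_wrt(S, theta * np.eye(2), np.eye(2))
fd = (ml_discrepancy(S, (theta + eps) * np.eye(2))
      - ml_discrepancy(S, (theta - eps) * np.eye(2))) / (2 * eps)
```

Because the gradient is analytic, its accuracy does not depend on a step size, which is what makes the float32 GPU objective safe to pair with a float64 CPU gradient.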
lavaan Syntax
Model files use lavaan's `=~` (measurement), `~` (regression), and `~~` (variance / covariance) operators. Fixed loadings via `k*var`.
Standard Errors & Fit
Asymptotic SEs via Hessian inverse. Delta method for variance parameters. Reports −2ln(L), χ², p-value, AIC, and BIC.
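The information criteria follow the standard definitions: AIC = −2 ln(L) + 2k and BIC = −2 ln(L) + k ln(n), with k free parameters and n observations. A minimal sketch (function and argument names are illustrative, not fastsem's API):

```python
import math

def information_criteria(neg2lnl, n_free, n_obs):
    """Standard AIC/BIC from the fitted model's -2 ln(L).
    The model chi-square is the difference between this -2 ln(L)
    and that of the saturated model."""
    aic = neg2lnl + 2 * n_free
    bic = neg2lnl + n_free * math.log(n_obs)
    return aic, bic
```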
Dependencies
The distributed binaries are fully self-contained. The only runtime dependency is an OpenCL driver, which is typically already present on any machine with a modern GPU. If no GPU is available fastsem falls back to multi-threaded CPU automatically — no configuration required.
| Dependency | Where to get it | Notes |
|---|---|---|
| OpenCL runtime | Ships with your GPU driver (NVIDIA, AMD, Intel) | Optional — CPU fallback is automatic if absent |
Downloads
Download the binary, make it executable, and pipe your model file in:
```
chmod +x fastsem
./fastsem < model.sem
```
- GPU is used automatically if an OpenCL driver is present; pass `--no-gpu` to force CPU-only mode.
- Self-contained binary with OpenBLAS statically linked; no DLL dependencies beyond system libraries. Cross-compiled via `nimble windows`.
- Native build on macOS uses Accelerate (no OpenBLAS needed). Binary release planned once a macOS CI runner is configured.
What’s Next
- **macOS binary release**: Universal arm64 + x86-64 binary using Accelerate as the BLAS back-end, distributed via a CI macOS runner.
- **Additional fit indices**: RMSEA (with 90% CI), CFI, TLI, and SRMR to match the standard lavaan / Mplus output.
- **WLS / DWLS / GLS estimators**: Weighted and generalised least-squares estimators for non-normal and ordinal indicators.
- **R bindings via rnim**: Expose `parseLavaan` and `fitSem` directly inside an R session with native data-frame input.
- **Python bindings via nimpy**: A pip-installable wheel that lets Python users call fastsem with NumPy arrays and pandas DataFrames.
- **Multi-group models**: Simultaneous estimation across groups with configural, metric, and scalar invariance constraints.
- **GPU FIML analytical gradient**: Full per-observation GPU gradient for FIML (beyond the current fixed-Σ approximation), enabling exact GPU acceleration for missing-data models.