Shodh-MoE: Sparse Mixture-of-Experts Architecture for Multi-Physics Foundation Models
Shodh-MoE is a sparse-activated latent transformer architecture developed to mitigate negative transfer in multi-physics foundation models. Negative transfer arises when disparate partial differential equation (PDE) regimes, such as broadband open-channel fluid dynamics and boundary-dominated porous media flows, are trained together, producing gradient conflicts, unstable optimization, and loss of plasticity in dense neural operators. Shodh-MoE operates on compressed 16^3 physical latents produced by a physics-informed autoencoder, whose intra-tokenizer Helmholtz-style velocity parameterization constrains decoded states to a divergence-free velocity manifold. This yields mass conservation to near machine precision, with a velocity divergence of about 2.8 x 10^-10, addressing a key obstacle on the path from scientific machine learning (SciML) toward universal foundation models.
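The summary does not detail the exact parameterization used inside the tokenizer, but the general idea behind a Helmholtz-style (vector-potential) construction can be sketched: the decoder outputs a vector potential A and the velocity is taken as v = curl(A), which is divergence-free by construction, so any measured divergence reflects floating-point round-off only. The NumPy sketch below is a minimal illustration under those assumptions; the grid size, spectral derivatives, and the random stand-in for the decoder output are illustrative choices, not details from the paper.

```python
import numpy as np

# Illustrative sketch (not the paper's code): curl-based velocity parameterization
# on a periodic 16^3 grid. The decoder is assumed to output a vector potential A;
# v = curl(A) is divergence-free by construction, so the measured divergence is
# limited only by floating-point round-off.
N = 16
freq = 2 * np.pi * np.fft.fftfreq(N, d=1.0 / N)
freq[N // 2] = 0.0                       # drop the unresolved Nyquist mode for odd derivatives
kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")

rng = np.random.default_rng(0)
A = rng.standard_normal((3, N, N, N))    # stand-in for a decoder's vector-potential output
A_hat = np.fft.fftn(A, axes=(1, 2, 3))

# Spectral curl: v_hat = i k x A_hat
v_hat = np.stack([
    1j * (ky * A_hat[2] - kz * A_hat[1]),
    1j * (kz * A_hat[0] - kx * A_hat[2]),
    1j * (kx * A_hat[1] - ky * A_hat[0]),
])
v = np.fft.ifftn(v_hat, axes=(1, 2, 3)).real   # decoded, divergence-free velocity field

# Spectral divergence check: should sit near machine precision
w_hat = np.fft.fftn(v, axes=(1, 2, 3))
div = np.fft.ifftn(1j * (kx * w_hat[0] + ky * w_hat[1] + kz * w_hat[2])).real
print("max |div v| =", np.abs(div).max())      # ~1e-12, i.e. round-off only
```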
Key facts
- Shodh-MoE is a sparse-activated latent transformer architecture for multi-physics transport (see the routing sketch after this list).
- It addresses negative transfer that arises when co-training disparate PDE regimes.
- Operates on compressed 16^3 physical latents from a physics-informed autoencoder.
- Uses Helmholtz-style velocity parameterization to enforce divergence-free velocity manifolds.
- Achieves mass conservation to near machine precision (velocity divergence ~2.8 x 10^-10).
- Negative transfer causes gradient conflict, unstable optimization, and plasticity loss.
- Broadband open-channel fluid dynamics and porous media flows impose incompatible demands.
- Published on arXiv with ID 2605.15179.
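The routing scheme of Shodh-MoE itself is not specified in this summary. As a reference point for what "sparse-activated" typically means, the sketch below shows a generic top-k sparse mixture-of-experts feed-forward block: a router scores each latent token, only the top-k experts run per token, and their outputs are combined with renormalized gate weights. The expert count, top-k value, and layer sizes are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEBlock(nn.Module):
    """Generic top-k sparse mixture-of-experts feed-forward block (illustrative only)."""

    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2, expansion: int = 4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)        # scores each latent token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(dim, expansion * dim),
                nn.GELU(),
                nn.Linear(expansion * dim, dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, dim), e.g. 16^3 latent tokens flattened into a sequence
        scores = self.router(x)                          # (num_tokens, num_experts)
        top_scores, top_idx = scores.topk(self.top_k, dim=-1)
        gates = F.softmax(top_scores, dim=-1)            # renormalize over the selected experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                   # sparse activation: only top-k experts run per token
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += gates[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Example: route 16^3 latent tokens of width 64 through the block
tokens = torch.randn(16 ** 3, 64)
block = SparseMoEBlock(dim=64)
print(block(tokens).shape)                               # torch.Size([4096, 64])
```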
Entities
Institutions
- arXiv