FPMoE: Sparse MoE Model Boosts Functional Code Generation
Researchers have developed FPMoE, a lightweight open-source code generation model that uses a sparse Mixture-of-Experts (MoE) architecture to improve the performance of large language models (LLMs) on functional programming languages (FPLs) such as Haskell, OCaml, and Scala. Current LLMs, including leading models, struggle significantly with FPLs because they are trained primarily on imperative languages. The research team found that fine-tuning on each language separately fails to capture shared functional abstractions, while fine-tuning on all of the languages together causes cross-language interference. FPMoE addresses both problems with three language-specific routed experts (one each for Haskell, OCaml, and Scala) alongside a shared expert that captures cross-language functional patterns, including monadic reasoning and type-directed programming. The work is described in arXiv:2605.27849.
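The layout of three routed experts plus one shared expert can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: the summary does not say how FPMoE routes inputs or combines expert outputs, so the sketch assumes the language of each input is known in advance and that the routed and shared outputs are simply added. The names `SparseMoELayer`, `make_linear`, and `apply_linear` are hypothetical, and each "expert" is reduced to a single small random weight matrix.

```python
import random

LANGS = ("haskell", "ocaml", "scala")


def make_linear(dim, rng):
    """Return a dim x dim weight matrix with small random entries."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]


def apply_linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class SparseMoELayer:
    """Hypothetical layer: one routed expert per language plus a shared expert."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.routed = {lang: make_linear(dim, rng) for lang in LANGS}
        self.shared = make_linear(dim, rng)

    def forward(self, x, lang):
        # Sparse step: only the expert for `lang` runs; the other two are skipped.
        # Assumption: the language tag is given, not predicted by a learned router.
        routed_out = apply_linear(self.routed[lang], x)
        # The shared expert runs for every input, whatever the language.
        shared_out = apply_linear(self.shared, x)
        # Assumption: the two outputs are combined by element-wise addition.
        return [r + s for r, s in zip(routed_out, shared_out)]


layer = SparseMoELayer(dim=4)
token = [1.0, 0.5, -0.5, 2.0]
out_haskell = layer.forward(token, "haskell")
out_scala = layer.forward(token, "scala")
```

In this sketch the same input produces different outputs for Haskell and Scala because a different routed expert fires, while the shared expert contributes the same term in both cases. That mirrors the stated design goal: language-specific parameters are kept apart to limit interference, and common functional patterns are learned in one place.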
Key facts
- FPMoE is a sparse Mixture-of-Experts model for functional code generation.
- It targets Haskell, OCaml, and Scala.
- Existing LLMs struggle with functional programming languages because they are trained primarily on imperative languages.
- Per-language fine-tuning fails to capture shared functional abstractions.
- Multi-language fine-tuning causes cross-language interference.
- FPMoE has three language-specific routed experts and one shared expert.
- The shared expert captures monadic reasoning and type-directed programming.
- The model is open-source and described in arXiv:2605.27849.
Entities
Institutions
- arXiv