DeepInfra Joins Hugging Face Inference Providers
DeepInfra, a serverless AI inference platform, is now a supported Inference Provider on the Hugging Face Hub. It offers cost-effective per-token pricing and a catalog of more than 100 models spanning LLMs, text-to-image, text-to-video, and embeddings. On Hugging Face, DeepInfra initially supports conversational and text-generation tasks, with access to models such as DeepSeek V4, Kimi-K2.6, and GLM-5.1. Users can supply their own API keys or route requests through Hugging Face, and PRO users receive $2 in monthly inference credits. The integration works with the Hugging Face SDKs for Python and JavaScript, as well as several agent harnesses.
Key facts
- DeepInfra is now a supported Inference Provider on the Hugging Face Hub.
- DeepInfra offers serverless AI inference with cost-effective pricing per token.
- Catalog includes over 100 models: LLMs, text-to-image, text-to-video, embeddings.
- Initial support for conversational and text-generation tasks on Hugging Face.
- Models available include DeepSeek V4, Kimi-K2.6, GLM-5.1.
- Users can set custom API keys or route requests through Hugging Face.
- PRO users get $2 worth of Inference credits monthly.
- Integration works with Hugging Face SDKs (Python and JavaScript) and agent harnesses.
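The routing described above can be sketched with a plain HTTP call to Hugging Face's OpenAI-compatible router. This is a minimal stdlib-only sketch, not the official SDK usage: the router URL and the `model:provider` suffix convention follow Hugging Face's Inference Providers routing, and the model id and `HF_TOKEN` variable are illustrative assumptions.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible router endpoint for Inference Providers.
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"

def build_request(prompt: str, token: str,
                  model: str = "deepseek-ai/DeepSeek-V3:deepinfra"):
    """Build a chat-completion request routed to DeepInfra.

    The ":deepinfra" suffix on the model id is the assumed way to pin
    the request to the DeepInfra provider; the model id is illustrative.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # HF token or custom key
            "Content-Type": "application/json",
        },
    )

req = build_request("Say hello in one word.",
                    os.environ.get("HF_TOKEN", "hf_xxx"))  # placeholder token

# Only send the request when a real token is configured:
if os.environ.get("HF_TOKEN"):
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
```

With a custom DeepInfra API key instead of a Hugging Face token, the same request shape applies; billing then goes directly through DeepInfra rather than through Hugging Face.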
Entities
Institutions
- DeepInfra
- Hugging Face
- Hugging Face Hub
Technologies
- Hugging Face SDKs
- Python
- JavaScript
- Pi
- OpenCode
- Hermes Agents
- OpenClaw