DeepInfra Joins Hugging Face as Supported Inference Provider
DeepInfra is now a supported Inference Provider on the Hugging Face Hub, expanding serverless inference capabilities.
The Hugging Face Hub has welcomed DeepInfra as a new Inference Provider, significantly broadening the range of serverless inference options available directly on model pages. The integration also extends to the Hugging Face client SDKs for both JavaScript and Python, making it easy to use a wide variety of models with your preferred provider.

DeepInfra is a serverless AI inference platform with some of the industry's lowest per-token prices. Its catalog of over 100 models lets developers integrate a wide array of AI functionality into their applications with minimal setup. The platform supports many model types, including LLMs, text-to-image, text-to-video, embeddings, and more.

As part of this initial integration, DeepInfra is launching support for conversational and text-generation tasks on Hugging Face, giving users access to popular open-weight LLMs such as DeepSeek V4, Kimi-K2.6, and GLM-5.1. Additional task support, including text-to-image, text-to-video, and embeddings, is slated to roll out soon.

For those interested in using DeepInfra as an Inference Provider, detailed documentation is available on its dedicated page, along with a comprehensive list of supported models. Users can also follow DeepInfra on Hugging Face at https://huggingface.co/DeepInfra. DeepInfra is accessible through the Hugging Face SDKs: huggingface_hub (>= 1.11.2) for Python and @huggingface/inference for JavaScript.

Because Hugging Face Inference Providers are integrated with most agent harnesses, including Pi, OpenCode, Hermes Agents, and OpenClaw, DeepInfra-hosted models can be used directly within your favorite tools without any extra code.
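As a sketch of what SDK access might look like, the snippet below uses the `InferenceClient` from the Python `huggingface_hub` package with `provider="deepinfra"`. The helper name and the example model ID in the comment are illustrative assumptions, not part of this announcement; an actual model ID should come from DeepInfra's supported-models list, and a Hugging Face token must be configured.

```python
from huggingface_hub import InferenceClient


def chat_via_deepinfra(prompt: str, model: str) -> str:
    """Send one chat message to a DeepInfra-hosted model, routed
    through the Hugging Face Hub (requires a Hugging Face token)."""
    # provider="deepinfra" asks the client to route this request to DeepInfra.
    client = InferenceClient(provider="deepinfra")
    response = client.chat.completions.create(
        model=model,  # hypothetical example: a DeepSeek or Kimi ID from DeepInfra's catalog
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Routed requests like this one are billed at the provider's standard API rates, so the same code works whether you bring your own DeepInfra key or use Hugging Face routing.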
Note that billing for direct requests is handled by the provider itself, while requests routed through the Hugging Face Hub are charged at standard provider API rates with no markup.

Hugging Face PRO users receive $2 worth of Inference credits each month, usable across providers. Subscribing to the PRO plan also unlocks additional benefits, including ZeroGPU, Spaces Dev Mode, and higher limits. Free users have access to limited free inference, with the option to upgrade for more.
Source: Hugging Face