Ollama Turbo

Aug 5, 2025 - 21:15

Turbo (Preview)

Supercharge models with faster hardware

$20/mo

Upgrade to Turbo

Turbo lets you

  • Speed up model inference

    Run models using datacenter-grade hardware, returning responses much faster.

  • Run larger models

    Upgrade to the newest hardware, making it possible to run larger models.

  • Privacy first

    To ensure privacy and security, Ollama does not retain your data.

  • Save battery life

    Take the load of running models off your Mac, Windows, or Linux computer, freeing up performance for your other apps.

Frequently asked questions

  • What is Turbo?

    Turbo is a new way to run open models using datacenter-grade hardware. Many new models are too large to fit on widely available GPUs, or run very slowly. Ollama Turbo provides a way to run these models fast while using Ollama's App, CLI, and API.

  • Which models are available in Turbo?

    While in preview, the gpt-oss-20b and gpt-oss-120b models are available.

  • Does Turbo work with Ollama's CLI?

    Yes, Ollama's CLI works with Turbo mode. See the docs for more information.

  • Does Turbo work with Ollama's API and JavaScript/Python libraries?

    Yes, Ollama's API and JavaScript/Python libraries work with Turbo mode. See the docs for more information, and the sketch after this list for an example.

  • What data do you retain in Turbo mode?

    Ollama does not log or retain any queries made via Turbo mode.

  • Where is the hardware that powers Turbo located?

    All hardware is located in the United States.

  • What are the usage limits for Turbo?

    Turbo includes hourly and daily limits to avoid capacity issues. Usage-based pricing will soon be available to consume models in a metered fashion.
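
Below is a minimal sketch of calling a Turbo-hosted model through the Ollama Python library. The host URL (https://ollama.com), the OLLAMA_API_KEY environment variable, and the bare-API-key Authorization header are assumptions for illustration, not confirmed details; consult the docs for the exact endpoint, model tags, and authentication scheme.

    import os

    from ollama import Client

    # Hypothetical setup: point the client at the assumed Turbo endpoint and
    # pass an API key as an extra request header (exact scheme may differ).
    client = Client(
        host="https://ollama.com",
        headers={"Authorization": os.environ["OLLAMA_API_KEY"]},
    )

    # Chat with one of the models listed as available during the Turbo preview.
    response = client.chat(
        model="gpt-oss:120b",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )

    print(response["message"]["content"])

The JavaScript library and the raw HTTP API can, in principle, be pointed at the same host with the same headers; only the client configuration changes, not the request shape.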
