
Models

This page describes how to use the different model types supported by the Oblix server.

Model Format

When using the Oblix server, models are specified in the format provider:model_name, for example:

  • ollama:llama2 - Refers to the "llama2" model running on Ollama locally
  • openai:gpt-3.5-turbo - Refers to OpenAI's GPT-3.5 Turbo model
  • claude:claude-3-sonnet-20240229 - Refers to Anthropic's Claude 3 Sonnet model
  • auto - Special identifier that enables Oblix's intelligent orchestration

The "auto" Model

The most powerful way to use Oblix is with the special auto model, which enables Oblix's intelligent orchestration system:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:62549/v1",
    api_key="placeholder"  # Required by the OpenAI client but not used by Oblix
)

# Using the auto model for intelligent orchestration
response = client.chat.completions.create(
    model="auto",  # This enables Oblix's orchestration
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

When using auto, Oblix will:

  1. Evaluate system resource availability
  2. Check network connectivity status
  3. Select the optimal model based on these factors
  4. Fall back gracefully between cloud and edge as conditions change
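
The response reports which model actually served the request, so you can observe the routing decision. A minimal sketch, assuming the server populates the standard model field of the response with the provider:model_name it selected (an assumption, not confirmed on this page):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:62549/v1",
    api_key="placeholder"  # Required by the OpenAI client but not used by Oblix
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Assumption: the server echoes the selected provider:model_name here,
# e.g. "ollama:llama2" when the request was routed to the local model
print(response.model)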

Using Specific Models

While the auto model is recommended for most use cases, you can bypass orchestration by specifying an exact model, pinning the request to a single provider:

# Force execution on a specific local model
response = client.chat.completions.create(
    model="ollama:llama2",  # Specify the exact model to bypass orchestration
    messages=[{"role": "user", "content": "Hello!"}]
)

# Force execution on a specific cloud model
response = client.chat.completions.create(
    model="openai:gpt-3.5-turbo",  # Use this model specifically
    messages=[{"role": "user", "content": "Hello!"}]
)
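
Because a pinned model bypasses orchestration, there is no automatic fallback if that provider is unavailable. If you still want a simple fallback while controlling the exact ordering, you can catch the failure yourself. A sketch, assuming the server surfaces provider failures as standard OpenAI client exceptions:

from openai import OpenAI, APIError

client = OpenAI(base_url="http://localhost:62549/v1", api_key="placeholder")

try:
    # Try the local model first
    response = client.chat.completions.create(
        model="ollama:llama2",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except APIError:
    # Assumption: a failed local request raises APIError; retry on the cloud model
    response = client.chat.completions.create(
        model="openai:gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello!"}]
    )

In most cases the auto model gives you this behavior for free, so manual fallback is only worth the extra code when you need full control over the ordering.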

How Models Are Connected

Behind the scenes, the Oblix server connects to:

  • Local Ollama models via the Ollama API (typically at http://localhost:11434)
  • OpenAI models via the OpenAI API
  • Claude models via the Anthropic API
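
Before starting the server, it can be useful to confirm that Ollama is reachable and has the model you plan to reference. A quick check against a default Ollama install, using Ollama's /api/tags endpoint (which lists locally available models):

import requests

# List the models Ollama has available locally
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print(models)  # "llama2" must appear here for ollama:llama2 to resolve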

API keys for cloud providers are specified when starting the server:

oblix server \
  --local-model ollama:llama2 \
  --cloud-model openai:gpt-3.5-turbo \
  --cloud-api-key YOUR_API_KEY

or via environment variables:

export OPENAI_API_KEY=YOUR_API_KEY
export ANTHROPIC_API_KEY=YOUR_API_KEY

oblix server \
  --local-model ollama:llama2 \
  --cloud-model openai:gpt-3.5-turbo
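
If you launch the server from a script rather than a shell, the environment-variable route can be wrapped in a few lines of standard-library Python (assuming the oblix binary is on your PATH):

import os
import subprocess

# Pass the cloud API key via the environment rather than the command line
env = {**os.environ, "OPENAI_API_KEY": "YOUR_API_KEY"}

subprocess.run(
    ["oblix", "server",
     "--local-model", "ollama:llama2",
     "--cloud-model", "openai:gpt-3.5-turbo"],
    env=env,
    check=True,
)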

Model Parameters

When making API requests, you can specify model parameters using the same format as the OpenAI API:

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a short poem"}],
    temperature=0.7,
    max_tokens=150,
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0
)

These parameters are passed through to whichever model handles the request; support for each parameter depends on the underlying provider.
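
One way to check that a parameter such as max_tokens was honored is to inspect the usage block on the response. A sketch, assuming the server fills in the standard OpenAI usage fields (not confirmed on this page):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:62549/v1", api_key="placeholder")

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a short poem"}],
    max_tokens=150,
)

# Assumption: the server reports token counts in the standard usage object
print(response.usage.prompt_tokens, response.usage.completion_tokens)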