Run, Swap, and Scale LLM Backends. The Local MLOps Control Plane for AI.
Building and bench-testing AI is a pain. You're stuck wrestling a fragile stack of scripts, containers, and boilerplate just to test one prompt. Aleutian unifies your entire MLOps workflow—RAG, data, observability—and gives you dynamic control to swap between public APIs and private, fine-tuned models, all from a single control plane. It gives you the freedom to build, the control to deploy, and the sovereignty to own your data.
Before You Start: Prerequisites
- Podman Desktop installed, with the Podman machine running
- Ollama app installed and running, with a local model pulled (this model runs locally and is required for the stack to start)
- A Hugging Face token, required for embedding models
- An OpenAI API key, required by the orchestrator (or use 'dummy-key')
Ensure Podman & Ollama are running before you start.
Your App, Your Data, Your Choice
Aleutian is the vital link that "chains" your app to any backend, giving you 100% control over cost, privacy, and power.
Your Application (CLI, API, or UI)
    ↓
Aleutian Control Plane (RAG Engine, Privacy Scan, History)
    ↓
Swappable Backends:
- Ollama (Local): Free, Private, Fast
- OpenAI / Gemini: Max Power, Pay-as-you-go
- Private TGI/vLLM: Self-Hosted, VPC
A Familiar Workflow
No complex abstractions. Just simple, powerful commands that fit right into your workflow.
$ aleutian populate vectordb ./my-local-docs
INFO[0000] Starting recursive file walk of './my-local-docs'
INFO[0000] Found 24 files.
INFO[0000] Scanning 'annual-report-2024.pdf' for PII...
INFO[0000] > Found 3 potential PII matches (Credit Card).
INFO[0000] > User approved ingestion.
INFO[0000] Calling PDF Parser service...
INFO[0000] Embedding 120 chunks for 'annual-report-2024.pdf'
...
INFO[0002] Successfully populated 24 documents.
Who is Aleutian For?
Does This Sound Like You?
Aleutian isn't just a tool; it's a workflow. It's designed for specific, painful problems that AI/ML engineers face every day.
The Security-Conscious Engineer
The Pain
- You are banned from using Copilot, ChatGPT, or any cloud API on proprietary code.
- You are terrified of any sensitive data leaving your network, even by accident.
- Your "solution" is running a local Ollama model, but it's just an API. You have no RAG, no data pipelines, no observability, and no UI to make it useful.
- You've been told the "hybrid" approach is secure, but you know sending any code snippets to a third party still violates policy.
The Promise
You get a 100% air-gapped MLOps stack. Analyze all your private documents with a complete, production-grade RAG and observability system that runs entirely offline on your laptop.
The AI/ML Experimenter
The Pain
- Your Jupyter Notebook is 90% boilerplate: initializing Weaviate and Ollama clients and managing API keys.
- Debugging your multi-step agent is a nightmare of `print()` statements and sifting through terminal logs.
- Adding a new tool (like a stock ticker) or a new data store (like Postgres) means building another FastAPI server and figuring out how to deploy and manage it.
- Your setup is a fragile mess of scripts that you can't easily share with your co-workers.
The Promise
You get a stable, shareable MLOps runtime. All your tools, data connectors, and RAG pipelines run in one managed, observable stack that you can call in one line: `client = AleutianClient()`.
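A minimal sketch of that one-liner in practice, assuming the `aleutian-client` package imports as `aleutian_client` and exposes a query helper (the method name and arguments here are illustrative, not the published API):

from aleutian_client import AleutianClient  # assumed import path for the aleutian-client package

# One constructor call replaces the Weaviate/Ollama client boilerplate.
client = AleutianClient()

# Hypothetical helper: run a RAG query against the already-populated vector DB.
answer = client.query("What were the key findings in the 2024 annual report?")
print(answer)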
The Cost-Conscious Developer
The Pain
- Your OpenAI API bill is already $50/day... and you're just developing.
- You're hesitant to iterate and run tests because every single RAG query costs you money.
- You want to test 5 different local GGUF models (Llama3, Mistral, Phi-3), but manually downloading, converting, and configuring each one is a painful, time-sucking process.
The Promise
You get a 100% free local dev loop. The built-in `aleutian convert` command pulls any model from Hugging Face, quantizes it (`-q bf16`), and automatically registers it with Ollama in one step (`--register`). Have a fine-tuned model? Point the same command at your local directory (`--is-local-path`) to convert and load it. Test your entire app for free, then swap models by changing one line in your config.
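For example (a sketch: the flags are the ones described above, but the positional model argument and exact invocation shape are illustrative):

$ aleutian convert meta-llama/Meta-Llama-3-8B -q bf16 --register
$ aleutian convert ./my-finetuned-model --is-local-path -q bf16 --register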
The MLOps/DevOps Engineer
The Pain
- The ML team "throws a notebook over the wall" that "works on their machine".
- Your job is to re-architect this "throwaway" prototype from scratch for production.
- It has no health checks, no observability, no container definitions, and no clear CI/CD path.
The Promise
The local `podman-compose.yml` is the blueprint for production. Because the prototype was built on Aleutian, it's already containerized, observable, and configurable. You save months of refactoring.
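One hedged illustration of that handoff: Podman can generate Kubernetes manifests directly from the running local stack (the pod name `aleutian` here is hypothetical):

$ podman generate kube aleutian > aleutian-k8s.yaml

The generated manifests are a starting point for a cluster deployment, not a finished production config.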
Technical Features
The Complete MLOps Stack
Your "Who Is This For?" workflow is powered by these six core components, all running locally.
Secure Data Ingestion
Automate data processing and privacy-scanning before it hits your vector DB.
Flexible RAG Engine
Utilize multiple RAG strategies (like Re-ranking) out-of-the-box via a simple API.
Unified LLM Access
Seamlessly switch between local models (Ollama) and external APIs (OpenAI) with one config change.
Integrated Observability
Get immediate insights with a pre-configured OpenTelemetry, Jaeger, and Prometheus stack.
Modular & Extensible
Add custom containers and tools using standard `podman-compose.override.yml` practices.
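For instance (a sketch using standard compose-override merging; the service name and image are hypothetical), an override file can bolt a custom tool container onto the stack without touching the base definition:

# podman-compose.override.yml
services:
  stock-ticker-tool:          # hypothetical custom tool service
    image: localhost/stock-ticker:latest
    ports:
      - "8099:8099"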
Production Ready
Your local stack translates directly to Kubernetes or Docker Swarm for production.
Python SDK (aleutian-client)
Integrate Aleutian's power programmatically into your notebooks, scripts, or backend applications.
Powerful CLI
Manage the stack, ingest data, run queries, convert models, and administer Weaviate directly from your terminal.
Need help with integration?
Focus on your product, not your infrastructure. Our experts can deploy and configure a bespoke Aleutian stack for your specific needs.
Learn About Enterprise
Aleutian Enterprise
Get advanced privacy controls, enhanced security, and priority support. Tell us your needs.
Built for Developers, by Developers
AleutianLocal is free, open-source, and licensed under AGPLv3. We believe in empowering developers to own their entire stack, from laptop to production.