Add AI Brains to Any App Without API Keys or Sign-Ups
How we replaced OpenAI, Anthropic, and other APIs with one local agent that just works: no token counting, no billing, no headaches.
The problem: LLM integration is boring
Imagine you're building an application: you need text generation, data analysis, or just a "smart" backend. The first thing that comes to mind is grabbing an OpenAI API key and writing a wrapper.
Well, here's what actually happens in real life:
- Key management – where to store them, how to rotate them.
- Billing and limits – someone in the dev environment forgets about token limits, and the bill spikes.
- Proxies and blocks – in some countries, APIs aren't accessible without workarounds.
- Prompt formatting – each provider has their own quirks.
- Error handling – rate limits, timeouts, retry logic.
- Tooling – which tools to add, how to call them, how to combine them.
We ran into this when automating large-scale content generation; we needed to process thousands of terms, each requiring web search, HTML generation, and strict formatting rules. Using commercial APIs would mean either burning through the budget or building an abstraction layer thicker than the business logic itself. The task wasn't expensive or important enough to drag in a full agent SDK, but we wanted it to work 24/7.
The solution: use an open-source agent as a black box
We based our approach on OpenCode, a CLI agent that runs locally and can handle small, straightforward tasks. No registration or keys, only a binary that takes a text task and returns results.
The core approach: Instead of writing an HTTP client for an API, we run opencode as a subprocess, give it instructions via stdin/arguments, and read the result from a file or stdout. The agent decides which model to use, performs web searches if needed, and formats the output itself.
Key advantage: You don't pay for tokens. For tasks like ours (generating content following specific rules), this is the difference between "a few dollars per run" and "practically free." As long as opencode allows working with free models, it's great. If opencode ever restricts this, you can switch to free models from OpenRouter (which would require an OpenRouter key).
How it works under the hood
The architecture is dead simple:
Your app → subprocess(opencode run "...") → result file
Here's the minimal Python wrapper class we use in production:

```python
import subprocess
import uuid
import os


class OpencodeRunner:
    def __init__(self, prompt_file='prompt.md'):
        self.prompt_file = prompt_file

    def run_generation(self, keyword):
        output_path = f"/tmp/result_{uuid.uuid4()}.html"

        # Build the command: term + prompt file content + instruction to save to a file
        cmd = f'opencode run "Term: {keyword}. $(cat {self.prompt_file}). Save to {output_path}"'

        # Important: pass the config so that the agent doesn't ask for confirmations
        env = os.environ.copy()
        env['OPENCODE_CONFIG'] = '/app/opencode-config.json'

        process = subprocess.Popen(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            env=env
        )

        # Stream output in real time (useful for debugging)
        for line in process.stdout:
            print(line, end='', flush=True)
        process.wait()

        # Read the result
        with open(output_path, 'r') as f:
            content = f.read()
        os.remove(output_path)
        return content
```

The opencode-config.json configuration is minimal:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "permission": "allow"
}
```

The `permission: allow` setting is critically important because it tells the agent not to wait for user confirmation on every action. Without it, everything hangs in headless mode.
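With the config in place, each task boils down to one shell command. As a sketch, the command string the wrapper assembles can be factored out into a pure helper (build_opencode_cmd is our illustrative name, not part of opencode):

```python
def build_opencode_cmd(keyword, prompt_file, output_path):
    # $(cat ...) is expanded by the shell at run time, inlining the prompt file
    return f'opencode run "Term: {keyword}. $(cat {prompt_file}). Save to {output_path}"'


# Example: the exact string that would be handed to the shell
cmd = build_opencode_cmd("statistical significance", "prompt.md", "/tmp/out.html")
print(cmd)
```

Keeping command assembly in a separate function also makes it trivial to unit-test without spawning the agent.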
Docker deployment: just drop in the binary
To add AI brains to any service, just copy one file into the image:
```dockerfile
FROM python:3.11-slim

# Copy agent binary
COPY bin/opencode /usr/local/bin/opencode
RUN chmod +x /usr/local/bin/opencode

# Copy config with auto-allow
COPY opencode-config.json /app/opencode-config.json
ENV OPENCODE_CONFIG=/app/opencode-config.json

# Your application
COPY . /app
WORKDIR /app
CMD ["python", "main.py"]
```

❗️ Important note
The opencode binary is too large for GitHub, so it's better not to keep it in the repo. Before building, copy it from the system where it's installed:
```bash
mkdir -p bin && cp $(which opencode) bin/opencode
```

You can download the latest opencode during the Docker build, but if you're building on a machine where opencode is already installed, copying it locally is much faster and simpler.
Production patterns: what we learned in practice
1. Rate limiting and graceful degradation
Throughout our agent's operation, we haven't hit any limits. However, we noticed something interesting: at some point, opencode, with its free models, starts working very slowly, sometimes at a rate of 1 token per second or even less. Apparently, this is how they implement a queue concept: instead of hitting the agent with limits, they just put it in a batch queue. In our case, this is actually better; let it work slowly as long as it works 24/7.
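We haven't needed a hard guard in production, but if you want to cap how long a throttled run can stall a worker, a retry-with-timeout sketch (run_with_timeout is our hypothetical helper; the demo uses a stand-in command instead of a real opencode run):

```python
import subprocess


def run_with_timeout(cmd, timeout_s=1800, retries=2):
    # Retry a shell command, treating a run that exceeds timeout_s as a soft
    # failure; a slow batch queue often resolves itself on a later attempt.
    for attempt in range(retries + 1):
        try:
            proc = subprocess.run(
                cmd, shell=True, capture_output=True, text=True, timeout=timeout_s
            )
            if proc.returncode == 0:
                return proc.stdout
        except subprocess.TimeoutExpired:
            pass  # agent likely throttled to ~1 token/s; try again
    raise RuntimeError(f"command failed after {retries + 1} attempts")


# Demo with a stand-in command instead of a real opencode invocation
out = run_with_timeout("echo done", timeout_s=10)
print(out)
```

In our setup we deliberately skip the timeout and let slow runs finish, since nothing downstream is waiting on them.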
2. Streaming output
When the agent performs web searches or processes complex queries, it can work for minutes. There's no point waiting for full completion; stream stdout to logs. This helps you understand what's happening and debug hangs.
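The wrapper class above already streams, but adding timestamps makes hangs stand out as visible gaps in the logs. A sketch (stream_command is our name; the demo uses a stand-in command that emits two lines, as opencode does during a search):

```python
import subprocess
import time


def stream_command(cmd):
    # Stream a subprocess's stdout line by line, prefixing each line with a
    # timestamp so a stall shows up as a gap rather than a silent wait.
    proc = subprocess.Popen(
        cmd, shell=True, stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT, text=True
    )
    lines = []
    for line in proc.stdout:
        print(f"[{time.strftime('%H:%M:%S')}] {line}", end='', flush=True)
        lines.append(line)
    proc.wait()
    return lines


# Stand-in for a long-running opencode run
captured = stream_command("echo searching; echo writing")
```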
3. Prompt files instead of inline strings
Don't stuff prompts into code. Store them in .md files, use $(cat file.md) when calling. This lets you:
- Version prompts through Git
- Edit without rebuilding the container
- Use multi-line instructions with formatting
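If shell quoting inside $(cat ...) ever bites you, an alternative is to read the prompt file in Python and hand the agent a pre-assembled string. A sketch (compose_prompt is our hypothetical helper; the demo writes a throwaway prompt file):

```python
from pathlib import Path


def compose_prompt(keyword, prompt_path, output_path):
    # Read the prompt file in Python rather than via $(cat ...) shell
    # expansion, so quoting inside the markdown can't break the command.
    instructions = Path(prompt_path).read_text().strip()
    return f"Term: {keyword}. {instructions}. Save to {output_path}"


# Demo with a throwaway prompt file
Path("/tmp/term-prompt.md").write_text("Write a concise HTML description")
prompt = compose_prompt("p-value", "/tmp/term-prompt.md", "/tmp/p-value.html")
print(prompt)
```

The file still lives in Git and can be edited without rebuilding the container; only the inlining mechanism changes.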
When this approach works perfectly
This architecture isn't a silver bullet, but it's ideal for:
✅ Batch processing with fuzzy logic
- SEO content generation from templates
- Data classification and tagging
- Extracting structured information from unstructured sources
✅ Simple autonomous agents
- Systems that run 24/7 unattended, where the risk is low and the output isn't business-critical
- Processing task queues where there's no rush
- Services where latency isn't critical (not real-time chat)
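The batch-queue pattern above can be sketched in a few lines (process_batch is a hypothetical helper; in real use the handler would be the wrapper's run_generation, here replaced by a stand-in):

```python
import queue


def process_batch(keywords, handler):
    # Drain a work queue sequentially; a slow agent only stretches the wall
    # clock, which is fine for 24/7 batch jobs with no latency requirement.
    q = queue.Queue()
    for kw in keywords:
        q.put(kw)
    results = {}
    while not q.empty():
        kw = q.get()
        results[kw] = handler(kw)  # e.g. OpencodeRunner().run_generation(kw)
    return results


# Stand-in handler instead of a real agent call
batch = process_batch(["alpha", "beta"], lambda kw: f"<html>{kw}</html>")
print(batch)
```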
Example agent output
Below is the agent's reasoning for a term generation task by keyword. Notice how the agent makes web requests and decides on its own how many queries to make, where to send them, and what to search for.
```
Found 1 tasks. Processing batch...
Running command: opencode run "Term: statistical significance. $(cat docs/prompts/term-prompt-ext.md). Save the result to a file /tmp/term_9617574d-484d-4358-910a-c391d26624fb.html"

I'll help you create a factual description of "statistical significance" following your detailed instructions. Let me first search for current information about this term.

Search Web search: statistical significance definition context usage challenges 2025 2026

Now let me search for more specific information about common challenges and related topics for statistical significance.

Search Web search: "statistical significance" common challenges p-value limitations misuse research

Now let me search for related topics to complete the description.

Search Web search: statistical significance related topics hypothesis testing p-value confidence interval effect size

Write tmp/term_9617574d-484d-4358-910a-c391d26624fb.html

Based on the research I've conducted, I'll now create the factual description of "Statistical Significance" following your detailed instructions. Let me write the HTML file:

The HTML file has been created successfully at `/tmp/term_9617574d-484d-4358-910a-c391d26624fb.html`. The description follows all your specified formatting requirements, uses American English, stays within the 1500-character limit, and provides factual information about statistical significance based on reputable sources.
```

Bottom line: brains for your app in 5 minutes
We went from "we should integrate OpenAI" to a production system that generates hundreds of pages of content daily, without a single API key in the code, without sign-ups, and without hitting limits. The secret is to stop thinking about LLMs as external services with HTTP APIs and start thinking about them as local command-line tools; in other words, apply an agentic mindset.
Minimum starter kit:
- Download the opencode binary.
- Create opencode-config.json with "permission": "allow".
- Write a subprocess wrapper (the code above works).
- Add the binary to your Dockerfile.
- Profit.
No keys or sign-ups: only working AI in your application.
