One HTML file — ~730 KB, zero dependencies
Any model — cloud, local, or in-browser
No server — runs entirely in your browser
Privacy by architecture — not by policy
Free Personal Use  ·  Zero Backend  ·  v2.1

The AI OS
for your browser.

Nemilia turns any browser into a complete AI production environment — capture content from the web, run multi-agent workflows, search your documents, and deliver finished results. One file. Any model. No server.

Auto-detects Ollama, Jan, LM Studio — zero configuration for local models. Or connect any cloud provider with your existing API key. Switch mid-session without touching a single workflow.

⚡ New in v2.1 — Chrome extension with offline capture queue  ·  Captures inbox with vision analysis & triage  ·  Promote to Library  ·  Per-agent web search  ·  Perplexity provider  ·  Chrome AI (Gemini Nano)  ·  Auto-embed on ingest  ·  Compose attachment chips

↓ Download v2.1 on GitHub  ·  See how it works
No account · No install · No server · Any modern browser
✓ Chrome extension included  ·  ✓ Any AI provider  ·  ✓ Local models auto-detected  ·  ✓ Multi-agent DAG orchestration  ·  ✓ Document RAG & web search  ·  ✓ Offline-capable  ·  100% client-side  ·  AES-256-GCM encrypted keys  ·  Zero telemetry
Nemilia (neh-MEE-lee-ah): Nahuatl for “to think, to remember, to imagine”

Capture. Enrich. Think. Deliver.

From the web page you’re reading to a finished multi-agent deliverable — without leaving the browser. The Chrome extension is the input peripheral that completes the loop.

Step 01
Capture
Right-click any page. Send full text, a selection, a screenshot, or an image directly to Nemilia — or trigger a workflow on the spot.
Step 02
Captures Inbox
Items land in Captures with auto-tagging, vision analysis on images, and BM25 + vector embedding. Triage to Compose, Chat, Workflow, or Library.
Step 03
Multi-Agent Workflow
Agents run in DAG stages. Each output is scored 1–10 and retried automatically if it falls below the threshold. Cost and token usage are tracked live as outputs stream.
Step 04
Workflow Results
Finished output saved to Workflow Results. Export as Word, PDF, Markdown, or plain text. Re-inject into the next run.

Nemilia is infrastructure
you configure.

Seven specialist agents ship out of the box. But the real power is in what you build on top — agents with any role, any persona, any model, wired into pipelines that match exactly how you work.

Custom Agents
Any role. Any model.
Any domain.
Every agent is fully configurable — name, icon, color, role, system prompt, model override, temperature, and web search behavior. Build a legal reviewer, a financial analyst, a code auditor, or a brand voice editor. Wire them into any workflow.
🄯
Name & Role
Defines the agent's identity and task scope
💬
System Prompt
Full persona, instructions, and constraints
🔲
Model Override
Use a different model per agent — mix cloud and local
🌡️
Temperature
Per-agent control — precise for review, creative for writing
🌐
Web Search
Auto-detect, always on, or custom query per agent
👁
HITL Checkpoint
Pause and review before this agent's output continues
Custom Workflows
Drag. Wire. Run.
Iterate.
Drag agents to set order, toggle HITL checkpoints per step, and configure a default prompt template with placeholders. The orchestrator handles decomposition, parallel DAG execution, quality scoring, and synthesis automatically.
[Diagram: three-stage workflow. Stage 1, Stage 2 (parallel), Stage 3. Nodes: Your Agent (Researcher), Your Agent (Analyst), Your Agent (Writer), WEAVE (Synthesis)]
DAG Parallel  ·  HITL per step  ·  Auto-scoring  ·  Templates  ·  Quality retry  ·  Unlimited agents  ·  Mix cloud + local
Legal contract review  ·  Competitive intelligence  ·  Content production pipeline  ·  Technical architecture audit  ·  Investment due diligence  ·  Market research briefing  ·  Code review & documentation  ·  SEO content factory  ·  Financial analysis report  ·  Product spec generation  ·  Threat intelligence digest  ·  Academic literature review
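As a rough picture of what parallel DAG execution means, the sketch below groups agents into stages so that each agent only depends on agents in earlier stages. It is an illustration under assumptions, not Nemilia's orchestrator: the agent names, the `deps` mapping, and `plan_stages` are all hypothetical.

```python
from typing import Dict, List, Set


def plan_stages(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group agents into stages; agents in the same stage can run in parallel.

    `deps` maps each agent name to the set of agents whose output it needs.
    """
    remaining = {agent: set(needs) for agent, needs in deps.items()}
    done: Set[str] = set()
    stages: List[List[str]] = []
    while remaining:
        # An agent is ready once every one of its dependencies has finished.
        ready = sorted(a for a, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("unresolvable dependencies (cycle or unknown agent)")
        stages.append(ready)
        done.update(ready)
        for agent in ready:
            del remaining[agent]
    return stages


# Hypothetical workflow: one agent feeds two parallel agents, then a synthesis step.
workflow = {
    "researcher": set(),
    "analyst": {"researcher"},
    "writer": {"researcher"},
    "synthesis": {"analyst", "writer"},
}
stages = plan_stages(workflow)
```

Here `analyst` and `writer` land in the same stage because neither needs the other's output, which is what allows them to run side by side.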

One file.
No compromises.

No install wizard. No backend. No account. Every capability ships inside the single HTML file you download.

📷
Chrome Extension Capture
Right-click any page to capture text, selections, screenshots, or images. Offline queue holds items until Nemilia is open. Zero-config delivery straight to your Captures inbox.
📦
Captures Inbox
Every capture lands with auto-tagging, vision analysis for images, BM25 + vector embedding, and one-click triage: send to Compose, Chat, a Workflow, or promote to Library for RAG.
💬
Chat Mode
Persistent multi-turn conversations with streaming, document RAG context, memory injection, and live web search. Conversations auto-title and persist across sessions.
🔲
Multi-Agent DAG Orchestration
Seven built-in specialist agents — SCOUT, LENS, QUILL, FORGE, WEAVE, NEXUS, VISION — run in parallel pipeline stages. Outputs scored 1–10, retried automatically below threshold. Build unlimited custom agents for any domain.
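The score-and-retry behavior can be sketched as a small loop. This is a minimal illustration, assuming a threshold of 7 and a cap of three attempts; neither value comes from the text above, and `run_with_retry`, `run_agent`, and `score_output` are hypothetical names.

```python
from typing import Callable, Tuple


def run_with_retry(
    run_agent: Callable[[int], str],
    score_output: Callable[[str], int],
    threshold: int = 7,      # assumed value, not documented above
    max_attempts: int = 3,   # assumed value, not documented above
) -> Tuple[str, int, int]:
    """Return (best_output, best_score, attempts_used)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    best_output, best_score = "", 0
    for attempt in range(1, max_attempts + 1):
        output = run_agent(attempt)
        score = score_output(output)
        if not 1 <= score <= 10:
            raise ValueError("scores must be in the range 1-10")
        # Keep the best attempt so far in case no attempt reaches the threshold.
        if score > best_score:
            best_output, best_score = output, score
        if score >= threshold:
            return best_output, best_score, attempt
    return best_output, best_score, max_attempts


# Stub agent and scorer: each attempt produces a longer draft, and longer scores higher.
result = run_with_retry(
    run_agent=lambda attempt: "draft " * attempt,
    score_output=lambda text: min(10, 3 * len(text.split())),
)
```

Returning the best attempt rather than the last one is a design choice in this sketch, so a late regression does not discard an earlier, better output.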
📄
Library & Document RAG
Upload PDFs, DOCX, TXT, CSV, MD, and more. Hybrid semantic + BM25 search with 384-dimension vector embeddings. Bulk-import via folder scan. Promote captures directly to Library without re-embedding.
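To make "hybrid semantic + BM25 search" concrete, here is a minimal sketch that blends a BM25 keyword score with cosine similarity over embeddings. It is not Nemilia's implementation: the blend weight `alpha`, the min-max normalization, the whitespace tokenizer, and the toy 3-dimension vectors (standing in for 384-dimension embeddings) are all assumptions. The BM25 formula uses the common Lucene-style IDF.

```python
import math
from collections import Counter
from typing import List, Sequence


def bm25_scores(query: str, docs: Sequence[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = counts[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * counts[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize(values: Sequence[float]) -> List[float]:
    """Min-max scale to [0, 1] so the two score types are comparable."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def hybrid_rank(query, query_vec, docs, doc_vecs, alpha: float = 0.5) -> List[int]:
    """Return document indices, best match first."""
    if not docs:
        return []
    lexical = normalize(bm25_scores(query, docs))
    semantic = normalize([cosine(query_vec, v) for v in doc_vecs])
    combined = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


docs = [
    "quarterly revenue report for the finance team",
    "notes on browser vector embeddings",
    "recipe for tomato soup",
]
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.1, 0.9]]  # toy embeddings
ranking = hybrid_rank("vector embeddings", [0.2, 0.9, 0.0], docs, doc_vecs)
```

The keyword score catches exact terms, while the vector score can still surface a document that uses different wording for the same idea.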
🌐
Live Web Research
NEXUS searches in real time via Serper, Brave, Tavily, or SearXNG. Per-agent web search lets any agent query independently. Perplexity is available as a provider with built-in web results.
👁
Human-in-the-Loop (HITL)
Toggle checkpoints on any agent. The pipeline pauses, shows you the output, and waits for your approval, inline edit, or redirect. Optional audio alert when execution pauses.
🔄
Visual Workflow Builder
Drag agents to reorder. Toggle HITL per step. Live SVG DAG diagram renders inline before and during runs. Built-in Research Suite and Web Research Suite templates.
🧠
Persistent Agent Memory
Agents write MEMORY[key]: value to a persistent key-value store. Memory injects into every run automatically. Independent toggles for memory writes and injection per session.
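The `MEMORY[key]: value` convention lends itself to a simple parser. The sketch below assumes one entry per line and a plain dictionary as the store; the function names and the injected preamble wording are hypothetical, and the app's actual parsing rules may differ.

```python
import re
from typing import Dict

# Assumes each memory entry sits on its own line in the agent's output.
MEMORY_LINE = re.compile(r"^MEMORY\[([^\]]+)\]:[ \t]*(.+)$", re.MULTILINE)


def extract_memory(agent_output: str, store: Dict[str, str]) -> Dict[str, str]:
    """Copy every MEMORY[key]: value line into the store; later writes win."""
    for key, value in MEMORY_LINE.findall(agent_output):
        store[key.strip()] = value.strip()
    return store


def inject_memory(prompt: str, store: Dict[str, str]) -> str:
    """Prepend stored facts to a prompt (the preamble text is illustrative)."""
    if not store:
        return prompt
    lines = [f"- {key}: {value}" for key, value in sorted(store.items())]
    return "Known facts from earlier runs:\n" + "\n".join(lines) + "\n\n" + prompt


output = "Summary done.\nMEMORY[client_name]: Acme Corp\nMEMORY[tone]: formal\n"
store = extract_memory(output, {})
prompt = inject_memory("Draft the intro.", store)
```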
🎨
VISION — Charts, Diagrams & Images
Generates Chart.js charts, SVG diagrams, and HTML infographics using your LLM — no image API needed. Connect DALL·E 3, FLUX, xAI Aurora, Google Imagen 4, Stable Diffusion 3, Replicate, or Fal.ai for photo-quality output.
🔧
MCP Tool Execution
Agents call real tools via MCP servers — read files, query databases, run code, hit APIs. Supports streamableHttp and SSE transports. Works with Supergateway and any stdio MCP server. 100% client-side.
📁
Workspace & Obsidian Sync
Sync to a real folder on disk via the File System Access API. Every agent, prompt, workflow, and result becomes a plain file that you can edit in VS Code and version in Git. Full Obsidian vault support with wikilink stripping.
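One plausible reading of "wikilink stripping" is sketched below: Obsidian-style `[[Target]]` and `[[Target|Alias]]` links are reduced to their display text, and `![[embed]]` references are dropped. The exact rules Nemilia applies are not stated here, so treat both the regex and the embed handling as assumptions.

```python
import re

# Matches [[Target]], [[Target|Alias]], and the embed form ![[file]].
WIKILINK = re.compile(r"(!?)\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def strip_wikilinks(text: str) -> str:
    """Replace wikilinks with their visible text and remove embeds."""
    def replace(match: "re.Match[str]") -> str:
        is_embed, target, alias = match.group(1), match.group(2), match.group(3)
        if is_embed:
            return ""  # assumption: embedded files carry no readable text
        return (alias or target).strip()

    return WIKILINK.sub(replace, text)


cleaned = strip_wikilinks("See [[Project Plan|the plan]] and [[Budget]].")
```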

Your models.
Your rules.
Any provider.

Use whatever AI you already trust — or run everything locally with zero configuration. Nemilia is a production environment for your intelligence, not a gatekeeper to it. Switch providers mid-session without touching a single workflow.

☁️
Cloud API — Any Provider
Use your existing OpenAI, Anthropic, Groq, Gemini, DeepSeek, or OpenRouter API key. Your key is encrypted with AES-256-GCM and never leaves your browser except to call the provider directly.
✓ Works with existing keys  ·  ✓ Any model, any provider  ·  ✓ Keys stay encrypted locally
📴
In-Browser — WebLLM
Llama, Phi, Qwen, and Gemma run directly on your GPU via WebLLM. No server, no API key, no internet after first model download. Chrome AI (Gemini Nano) available for zero-download on-device chat.
✓ Zero internet after download  ·  ✓ GPU-accelerated inference  ·  ✓ Chrome AI: no download needed
OpenAI: GPT-4o, o4-mini, o3
Anthropic: Claude Opus & Sonnet 4
Groq: Llama 4, DeepSeek R1
Google Gemini: 2.5 Flash, 2.5 Pro
OpenRouter: Multi-model routing
DeepSeek: R1, V3
Mistral: Large, Codestral
xAI Grok: Grok 3, Grok 3 mini
Kimi (Moonshot): K2 1T MoE
Perplexity: Built-in web search
Ollama: 💻 Auto-detected
LM Studio: 💻 Auto-detected
Jan: 💻 Auto-detected
Llamafile: 💻 Mozilla · localhost
Custom API: 🔌 Any OpenAI-compatible
WebLLM: 📴 In-browser offline

ⓘ Chrome AI (Gemini Nano) also available — zero install, zero API key, on-device chat only. Requires Chrome 127+ with Gemini Nano enabled.
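Auto-detection of local models can be pictured as probing each runtime's local HTTP endpoint and keeping the ones that answer. The sketch below is an illustration only: the ports and paths are the commonly documented defaults for Ollama, LM Studio, and Jan, not values taken from Nemilia, and they may differ on your machine. The app itself runs in the browser; Python is used here purely to show the idea.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Dict, List

# Assumed default endpoints; adjust if your local runtime uses other ports.
LOCAL_PROVIDERS: Dict[str, str] = {
    "Ollama": "http://localhost:11434/api/tags",
    "LM Studio": "http://localhost:1234/v1/models",
    "Jan": "http://localhost:1337/v1/models",
}


def http_probe(url: str, timeout: float = 0.5) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a malformed reply.
        return False


def detect_local_providers(probe: Callable[[str], bool] = http_probe) -> List[str]:
    """List the providers whose endpoint responds; `probe` is injectable for testing."""
    return [name for name, url in LOCAL_PROVIDERS.items() if probe(url)]
```

Calling `detect_local_providers()` with no arguments performs the real probes; passing a stub such as `lambda url: ":11434" in url` exercises the logic without touching the network.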

Your data never touches
our servers.
Because we don’t have any.

Nemilia has no backend. No database. No analytics pipeline. It’s a file. Your documents, captures, prompts, agent memory, and API keys stay where they belong — on your machine. This isn’t a privacy policy. It’s a structural fact.

🔒
AES-256-GCM key encryption: API keys are encrypted with AES-256-GCM before storage, using an encryption key derived via PBKDF2 with 200,000 iterations. Never written as plaintext anywhere.
🚫
Absolute zero telemetry — No analytics, no tracking pixels, no usage beacons, no error reporting to any server. Verify it yourself — it’s one file.
💻
100% client-side processing — Document parsing, vector embeddings, agent memory, orchestration, vision analysis, chart generation. Everything runs locally in your browser tab.
📴
Fully air-gapped capable — Use Ollama, Jan, LM Studio, or WebLLM and Nemilia operates with no internet connection at all. Suitable for sensitive and regulated workloads.
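The key-derivation step behind the encryption claim above can be sketched with PBKDF2 from the standard library. The 200,000 iterations and the 256-bit key size come from the text; the SHA-256 hash, the 16-byte salt, and the passphrase source are assumptions. The AES-256-GCM step itself is omitted because Python's standard library does not include it.

```python
import hashlib
import os

ITERATIONS = 200_000  # stated in the text above
KEY_BYTES = 32        # 256 bits, the key size AES-256 expects


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch a passphrase into a 256-bit key with PBKDF2-HMAC-SHA256 (hash is assumed)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )


salt = os.urandom(16)  # a random salt, stored alongside the ciphertext
key = derive_key("correct horse battery staple", salt)
```

The high iteration count makes each passphrase guess expensive, which is the point of using PBKDF2 rather than hashing the passphrase once.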

No narration.
Just the app working.

Two videos. From first open to finished workflow result — real content, real agents, real output.

Part 01
App Tour & Workflow Run
Nav overview  ·  Chrome extension capture  ·  Captures inbox  ·  Compose  ·  Multi-agent pipeline executing live
1:52
Part 02
Results, Agents & Workflows
Workflow Results  ·  Agent editor  ·  Workflow builder  ·  Build section walkthrough
2:40

Download the file.
Open it. You’re running.

No waitlist. No signup. No install wizard. Download one HTML file, open it in your browser, connect any AI provider you already use — ready in under 60 seconds.

Questions? luis@nemilia.com — Luis reads every email personally.

Free for personal use  ·  Business Source License 1.1  ·  Commercial license available  ·  Converts to MIT 2030

✉️

Get in Touch

For press inquiries, partnership opportunities, or just to say hello — Luis reads every email personally.

✉ luis@nemilia.com
Response time typically within 24 hours