One HTML file — ~3.5 MB, zero runtime dependencies
Any model — cloud, local, or in-browser
No server — runs entirely in your browser
Privacy by architecture — not by policy
Free Personal Use  ·  Zero Backend  ·  v2.2

The AI OS
for your browser.

Nemilia turns any browser into a complete AI production environment — capture content from the web, run multi-agent workflows, search your documents, and deliver finished results. One file. Any model. No server.

Auto-detects Ollama, Jan, LM Studio — zero configuration for local models. Or connect any cloud provider with your existing API key. Switch mid-session without touching a single workflow.
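As an illustration of how local auto-detection can work in general, the sketch below probes common default localhost ports and keeps whichever respond. It is not Nemilia's actual implementation. The ports and endpoints (Ollama on 11434, LM Studio on 1234, Jan on 1337) are those tools' usual defaults and are assumed here, and `detectLocalProviders`, `Probe`, and `Fetcher` are hypothetical names.

```typescript
// Hypothetical probe table: ports and paths are assumed defaults,
// not values taken from Nemilia's source.
type Probe = { name: string; url: string };
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

const PROBES: Probe[] = [
  { name: "Ollama", url: "http://localhost:11434/api/tags" },
  { name: "LM Studio", url: "http://localhost:1234/v1/models" },
  { name: "Jan", url: "http://localhost:1337/v1/models" },
];

// Probe every endpoint concurrently. A rejected or non-OK request
// is treated as "provider not running".
async function detectLocalProviders(
  fetcher: Fetcher,
  probes: Probe[] = PROBES,
): Promise<string[]> {
  const results = await Promise.all(
    probes.map(async (p) => {
      try {
        const res = await fetcher(p.url);
        return res.ok ? p.name : null;
      } catch {
        return null; // connection refused, blocked request, and so on
      }
    }),
  );
  return results.filter((n): n is string => n !== null);
}
```

In a browser you would pass `(url) => fetch(url)` as the fetcher. Injecting it keeps the sketch testable without a local server running.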

New in v2.2 — Multi-profile encrypted workspace  ·  Dashboard  ·  Skills system  ·  AI Generators  ·  Tasks — autonomous MCP execution  ·  MCP Catalog + Tool Tester  ·  Custom image endpoint  ·  Mobile overhaul

↓ Download v2.2  ·  Install Chrome Extension

No account · No install · No server · Any modern browser  ·  View on GitHub →

✓ Chrome extension included ✓ Any AI provider ✓ Local models auto-detected ✓ Multi-agent DAG orchestration ✓ Document RAG & web search ✓ Offline-capable
100% client-side  ·  AES-256-GCM encrypted keys  ·  Zero telemetry
Nemilia (neh-MEE-lee-ah): Nahuatl, "to think, to remember, to imagine"

Capture. Orchestrate. Automate. Deliver.

From the web page you’re reading to a finished multi-agent deliverable — without leaving the browser. The Chrome extension is the input peripheral that completes the loop.

Step 01
Capture
Captures inbox — tagged, vision-analyzed, BM25 + vector indexed, triage to Compose, Chat, Workflow, or Library. Requires the Chrome extension.
Step 02
Orchestrate
Multi-agent DAG — SCOUT, LENS, QUILL, FORGE, WEAVE executing in parallel stages with live validation scores
Step 03
Automate
Tasks — autonomous LLM + MCP tool loop; each tool call and result streamed live until the goal is done
Step 04
Deliver
Workflow Results — finished synthesis with export to Word, PDF, MD, TXT, HTML

Nemilia is infrastructure
you configure.

Seven specialist agents ship out of the box. But the real power is in what you build on top — agents with any role, any persona, any model, wired into pipelines that match exactly how you work.

Custom Agents
Any role. Any model.
Any domain.
Every agent is fully configurable — name, icon, color, role, system prompt, model override, temperature, and web search behavior. Build a legal reviewer, a financial analyst, a code auditor, or a brand voice editor. Wire them into any workflow.
🄯
Name & Role
Defines the agent's identity and task scope
💬
System Prompt
Full persona, instructions, and constraints
🔲
Model Override
Use a different model per agent — mix cloud and local
🌡️
Temperature
Per-agent control — precise for review, creative for writing
🌐
Web Search
Auto-detect, always on, or custom query per agent
👁
HITL Checkpoint
Pause and review before this agent's output continues
Custom Workflows
Drag. Wire. Run.
Iterate.
Drag agents to set order, toggle HITL checkpoints per step, and configure a default prompt template with placeholders. The orchestrator handles decomposition, parallel DAG execution, quality scoring, and synthesis automatically.
Diagram: a three-stage pipeline (Stage 1, Stage 2 in parallel, Stage 3) in which your custom Researcher, Analyst, and Writer agents feed WEAVE Synthesis.
DAG Parallel  ·  HITL per step  ·  Auto-scoring  ·  Templates  ·  Quality retry  ·  Unlimited agents  ·  Mix cloud + local
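The staged execution described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the orchestrator's real code: `Agent`, `runPipeline`, and the rule that each stage receives the previous stage's joined output are all hypothetical.

```typescript
// Hypothetical agent shape: `stage` is the position set in the workflow builder.
type Agent = {
  name: string;
  stage: number;
  run: (input: string) => Promise<string>;
};

// Group agents by stage, run each stage's agents concurrently,
// and pass the joined outputs of one stage to the next.
async function runPipeline(agents: Agent[], prompt: string): Promise<string> {
  const stages = new Map<number, Agent[]>();
  for (const agent of agents) {
    const group = stages.get(agent.stage) ?? [];
    group.push(agent);
    stages.set(agent.stage, group);
  }

  let input = prompt;
  const order = Array.from(stages.keys()).sort((x, y) => x - y);
  for (const stage of order) {
    const group = stages.get(stage) ?? [];
    // Agents that share a stage start together and are awaited as a set.
    const outputs = await Promise.all(group.map((a) => a.run(input)));
    input = outputs.join("\n");
  }
  return input;
}
```

The final stage's output is returned as the result, which is where a synthesis agent such as WEAVE would sit.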
Reusable Instruction Sets

Skills change how the AI thinks, not how you configure it.

Scope-targeted modifiers injected into agent context at run time — tone, format, analytical stance. Ten built-in skills ship pre-loaded (all off by default). Create more manually or generate with AI from a plain-language description.

Be Concise  ·  Formal Tone  ·  Cite Sources  ·  Rich Markdown  ·  Step-by-Step  ·  Code Expert  ·  Devil's Advocate  ·  Research Mode  ·  Executive Summary  ·  Structured Output
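One way to picture scope-targeted injection is a filter over enabled skills whose instructions are appended to the agent's system prompt. The sketch below assumes that model. `Skill`, `applySkills`, and the "Active skills" layout are hypothetical and are not taken from Nemilia's source.

```typescript
// Hypothetical skill shape. Scopes mirror the All / Chat / Compose options.
type Scope = "all" | "chat" | "compose";
type Skill = { name: string; scope: Scope; enabled: boolean; instruction: string };

// Append the instructions of every enabled skill whose scope covers the
// current mode. Returns the prompt unchanged when no skill applies.
function applySkills(
  systemPrompt: string,
  skills: Skill[],
  mode: "chat" | "compose",
): string {
  const active = skills.filter(
    (s) => s.enabled && (s.scope === "all" || s.scope === mode),
  );
  if (active.length === 0) return systemPrompt;
  const block = active.map((s) => `- ${s.name}: ${s.instruction}`).join("\n");
  return `${systemPrompt}\n\nActive skills:\n${block}`;
}
```

Because the skills are applied when the prompt is assembled, the agent's saved configuration stays untouched.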
Autonomous Execution

Give it a goal. It figures out the rest.

Tasks run a single LLM agent in a tool-calling loop against your connected MCP servers — no workflow DAG, just a goal and a tool set. Each tool call and result streams live. AI Generators build any asset from a plain-language description using your configured provider.
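A goal-driven tool loop of this kind can be sketched as below. This is an illustration under assumptions, not the Tasks engine itself: `nextStep` stands in for the LLM call, `tools` stands in for connected MCP tools, and every name here (`runTask`, `Step`, `ToolCall`) is hypothetical.

```typescript
type ToolCall = { tool: string; args: Record<string, unknown> };
type Step = { done: true; answer: string } | { done: false; call: ToolCall };
type Tool = (args: Record<string, unknown>) => Promise<string>;

// Ask the model for the next step, execute the requested tool, record the
// result, and repeat until the model reports the goal is done.
async function runTask(
  goal: string,
  nextStep: (goal: string, transcript: string[]) => Promise<Step>,
  tools: Record<string, Tool>,
  onEvent: (line: string) => void,
  maxSteps = 10, // hypothetical safety limit
): Promise<string> {
  const transcript: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = await nextStep(goal, transcript);
    if (step.done) return step.answer;

    const tool = tools[step.call.tool];
    const result = tool
      ? await tool(step.call.args)
      : `unknown tool: ${step.call.tool}`;
    const line = `${step.call.tool} -> ${result}`;
    transcript.push(line);
    onEvent(line); // surface each call and result as it happens
  }
  throw new Error("step limit reached before the goal was met");
}
```

The `onEvent` callback is where a live execution log would hook in, and the step limit guards against a model that never declares the goal finished.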

Generate Agent  ·  Generate Prompt  ·  Generate Workflow  ·  Generate Task  ·  Generate Skill
Legal contract review  ·  Competitive intelligence  ·  Content production pipeline  ·  Technical architecture audit  ·  Investment due diligence  ·  Market research briefing  ·  Code review & documentation  ·  SEO content factory  ·  Financial analysis report  ·  Product spec generation  ·  Threat intelligence digest  ·  Academic literature review

One file.
No compromises.

No install wizard. No backend. No account. Every capability ships inside the single HTML file you download.

📷
Chrome Extension Capture
Right-click any page to capture text, selections, screenshots, or images. Offline queue holds items until Nemilia is open. Zero-config delivery straight to your Captures inbox.
📦
Captures Inbox
Every capture lands with auto-tagging, vision analysis for images, BM25 + vector embedding, and one-click triage: send to Compose, Chat, a Workflow, or promote to Library for RAG.
💬
Chat Mode
Persistent multi-turn conversations with streaming, document RAG context, memory injection, and live web search. Conversations auto-title and persist across sessions. Pop a chat out into a dedicated window that stays side by side with your main workspace or any other app.
🔲
Multi-Agent DAG Orchestration
Seven built-in specialist agents — SCOUT, LENS, QUILL, FORGE, WEAVE, NEXUS, VISION — run in parallel pipeline stages. Outputs scored 1–10, retried automatically below threshold. Build unlimited custom agents for any domain.
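The score-and-retry behavior can be sketched as a small loop. This is a minimal illustration, assuming a scoring function that returns 1 to 10. The threshold of 7, the limit of 3 attempts, and the names `runWithRetry` and `Scored` are hypothetical, since the page does not state the real values.

```typescript
type Scored = { output: string; score: number };

// Run the agent, score its output, and retry while the score is below
// the threshold. Returns the best-scoring attempt seen.
async function runWithRetry(
  run: (attempt: number) => Promise<string>,
  score: (output: string) => Promise<number>,
  threshold = 7, // hypothetical default
  maxAttempts = 3, // hypothetical default
): Promise<Scored> {
  let best: Scored = { output: "", score: 0 };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await run(attempt);
    const s = await score(output);
    if (s > best.score) best = { output, score: s };
    if (s >= threshold) break; // good enough, stop retrying
  }
  return best;
}
```

Keeping the best attempt means that exhausting the retries still yields the strongest output rather than the last one.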
📄
Library & Document RAG
Upload PDFs, DOCX, TXT, CSV, MD, and more. Hybrid semantic + BM25 search with 384-dimension vector embeddings. Bulk-import via folder scan. Promote captures directly to Library without re-embedding.
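Hybrid search needs some way to merge a keyword ranking with a vector ranking. The page does not say which method Nemilia uses, so the sketch below shows reciprocal rank fusion, one common choice, purely as an example. `fuseRankings` and the constant `k = 60` are assumptions.

```typescript
// Reciprocal rank fusion: every ranked list contributes 1 / (k + rank)
// to each document it contains, and documents are sorted by the total.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks start at 1
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

// Hypothetical document ids: one list from BM25, one from vector similarity.
const bm25Ranking = ["a", "b", "c"];
const vectorRanking = ["b", "c", "a"];
console.log(fuseRankings([bm25Ranking, vectorRanking]));
```

Rank-based fusion avoids having to put BM25 scores and cosine similarities on a shared scale, which is why it is often used for this kind of merge.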
🌐
Live Web Research
NEXUS searches in real-time via Serper, Brave, Tavily, or SearXNG. Per-agent web search — any agent can query independently. Perplexity available as a provider with built-in web results.
👁
Human-in-the-Loop (HITL)
Toggle checkpoints on any agent. The pipeline pauses, shows you the output, and waits for your decision (Approve, Edit output, or Send feedback) before advancing to the next stage. Optional audio alert when execution pauses.
🔄
Visual Workflow Builder
Drag agents to reorder — position defines DAG stage. Agents at the same level run in parallel. Toggle HITL per step. Live SVG DAG diagram renders inline before and during runs. Built-in Research Suite and Web Research Suite templates.
🧠
Persistent Agent Memory
Agents write MEMORY[key]: value to a persistent key-value store. Memory injects into every run automatically. Independent toggles for memory writes and injection per session.
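Picking `MEMORY[key]: value` lines out of an agent's output could look like the sketch below. The line format comes from the description above. The one-entry-per-line rule, the trimming, and the name `extractMemory` are assumptions.

```typescript
// Collect every line of the form `MEMORY[key]: value` into a key-value map.
// A later write to the same key overwrites the earlier one.
function extractMemory(output: string): Map<string, string> {
  const store = new Map<string, string>();
  const pattern = /^MEMORY\[([^\]]+)\]:\s*(.+)$/gm;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(output)) !== null) {
    store.set(match[1].trim(), match[2].trim());
  }
  return store;
}

// Hypothetical agent output containing two memory writes.
const agentOutput = [
  "Summary complete.",
  "MEMORY[client]: Acme Corp",
  "MEMORY[tone]: formal",
].join("\n");
console.log(extractMemory(agentOutput));
```

Injection on the next run would then be a matter of serializing this map back into the agent's context.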
🎨
VISION — Charts, Diagrams & Images
Generates Chart.js charts, SVG diagrams, and HTML infographics using your LLM, with no image API needed. Connect DALL-E 3, FLUX, or Stable Diffusion, or add any OpenAI-compatible image endpoint as a custom provider.
🔧
MCP Tool Execution
Agents call real tools via MCP servers — read files, query databases, run code, hit APIs. Supports streamableHttp and SSE transports. Works with Supergateway and any stdio MCP server. 100% client-side.
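For orientation, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds that message. The transport, the tool name `read_file`, its arguments, and the helper `buildToolCall` are hypothetical.

```typescript
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
};

// Build the JSON-RPC 2.0 message that asks an MCP server to run a tool.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical tool and path, for illustration only.
const toolRequest = buildToolCall(1, "read_file", { path: "notes/todo.md" });
console.log(JSON.stringify(toolRequest));
```

The same message shape applies whichever transport carries it, so a client can switch between streamable HTTP and SSE without changing how calls are built.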
📁
Workspace & Obsidian Sync
Sync to a real folder on disk via the File System Access API. Every agent, prompt, workflow, and result becomes a plain file, editable in VS Code and versionable in Git. Full Obsidian vault support with wikilink stripping.
🔐
Multi-Profile Encrypted Workspace
Isolated profiles, each with its own encrypted workspace, agents, API keys, and history. AES-256-GCM encryption with PBKDF2 key derivation (200k iterations). On profile switch, all in-memory data is zeroed first.
📊
Dashboard
Post-login home with live workspace stats, active provider and model, quick-action tiles, and recent chats and workflow results. Navigate anywhere in one click.
Skills System
10 built-in reusable instruction sets, all off by default. Scope to All, Chat, or Compose. Toggle per run. Create manually or generate with AI. Injected into agent context at run time.
🤖
Tasks — Autonomous MCP
Goal-driven LLM + tool loop against your connected MCP servers. Live execution log streams every tool call and result. Full run history. Export to Library or make into a reusable agent.
AI Generators
Generate agents, prompts, workflows, tasks, and skills from a plain-language description. Uses your configured provider. Each opens the editor pre-filled — review and save.

Your models.
Your rules.
Any provider.

Use whatever AI you already trust — or run everything locally with zero configuration. Nemilia is a production environment for your intelligence, not a gatekeeper to it. Switch providers mid-session without touching a single workflow.

☁️
Cloud API — Any Provider
Use your existing OpenAI, Anthropic, Groq, Gemini, DeepSeek, Fireworks AI, or OpenRouter API key. Keys are encrypted with AES-256-GCM and never leave your browser except to call the provider directly.
✓ Works with existing keys ✓ Any model, any provider ✓ Keys stay encrypted locally
📴
In-Browser — WebLLM
SmolLM2 1.7B, Llama 3.2 1B, Phi, Qwen, and Gemma run directly on your GPU via WebLLM. No server, no API key, no internet after the first model download. Chrome AI (Gemini Nano) is available for zero-download on-device chat.
✓ Zero internet after download ✓ GPU-accelerated inference ✓ Chrome AI — no download needed
OpenAI: GPT-4.1, o3, o4-mini
Anthropic: Opus 4.7, Sonnet 4.6, Haiku 4.5
Groq: Llama 4, DeepSeek R1, Kimi K2, Qwen 3
Google Gemini: Gemini 2.5 Pro, 2.5 Flash
OpenRouter: Multi-model routing
DeepSeek: R1, V3
Mistral: Large, Codestral
xAI Grok: Grok 3, Grok 3 mini
Kimi (Moonshot): K2 1T MoE
Fireworks AI: DeepSeek V4, Kimi K2.6, Qwen 3.6
Perplexity: Built-in web search
Ollama: 💻 Auto-detected
LM Studio: 💻 Auto-detected
Jan: 💻 Auto-detected
Llamafile: 💻 Mozilla · localhost
Custom API: 🔌 Any OpenAI-compatible
WebLLM: 📴 In-browser offline

ⓘ Chrome AI (Gemini Nano) also available — zero install, zero API key, on-device chat only. Requires Chrome 127+ with Gemini Nano enabled.

Your data never touches
our servers.
Because we don’t have any.

Nemilia has no backend. No database. No analytics pipeline. It’s a file. Your documents, captures, prompts, agent memory, and API keys stay where they belong — on your machine. This isn’t a privacy policy. It’s a structural fact.

🔒
AES-256-GCM key encryption — API keys are encrypted with AES-256-GCM under a key derived via PBKDF2 (200,000 iterations) before storage. Never written as plaintext anywhere.
🚫
Absolute zero telemetry — No analytics, no tracking pixels, no usage beacons, no error reporting to any server. Verify it yourself — it’s one file.
💻
100% client-side processing — Document parsing, vector embeddings, agent memory, orchestration, vision analysis, chart generation. Everything runs locally in your browser tab.
📴
Fully air-gapped capable — Use Ollama, Jan, LM Studio, or WebLLM and Nemilia operates with no internet connection at all. Suitable for sensitive and regulated workloads.
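The key-encryption point above maps onto the browser's standard Web Crypto API. The sketch below shows one way to combine PBKDF2 (200,000 iterations) with AES-256-GCM. It is not Nemilia's code: the SHA-256 hash, the 16-byte salt, the 12-byte IV, and the function names are assumptions, and it expects a runtime that exposes a global `crypto.subtle`.

```typescript
// Derive a 256-bit AES-GCM key from a passphrase with PBKDF2.
async function deriveKey(passphrase: string, salt: BufferSource): Promise<CryptoKey> {
  const material = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(passphrase),
    "PBKDF2",
    false,
    ["deriveKey"],
  );
  return crypto.subtle.deriveKey(
    { name: "PBKDF2", salt, iterations: 200_000, hash: "SHA-256" }, // hash is assumed
    material,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt", "decrypt"],
  );
}

// Encrypt a secret. Salt and IV are random per call and stored beside the ciphertext.
async function encryptSecret(passphrase: string, secret: string) {
  const salt = crypto.getRandomValues(new Uint8Array(16));
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const key = await deriveKey(passphrase, salt);
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    new TextEncoder().encode(secret),
  );
  return { salt, iv, ciphertext };
}

// Decrypt with the same passphrase. A wrong passphrase or tampered data rejects.
async function decryptSecret(
  passphrase: string,
  box: { salt: BufferSource; iv: BufferSource; ciphertext: BufferSource },
): Promise<string> {
  const key = await deriveKey(passphrase, box.salt);
  const plain = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: box.iv },
    key,
    box.ciphertext,
  );
  return new TextDecoder().decode(plain);
}
```

Because GCM is authenticated, decryption fails outright on a wrong passphrase or modified ciphertext instead of returning garbage.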

Download the file.
Open it. You’re running.

No waitlist. No signup. No install wizard. Download one HTML file, open it in your browser, connect any AI provider you already use — ready in under 60 seconds.

Questions? luis@nemilia.com — Luis reads every email personally.

Free for personal use  ·  Business Source License 1.1  ·  Commercial license available  ·  Converts to MIT in 2030

✉️

Get in Touch

For press inquiries, partnership opportunities, or just to say hello — Luis reads every email personally.

✉ luis@nemilia.com
Response time typically within 24 hours