Free Personal Use · Zero Backend · v2.2

The AI OS
for your browser.

Nemilia turns any browser into a complete AI production environment — capture content from the web, run multi-agent workflows, search your documents, and deliver finished results. One file. Any model. No server.

Auto-detects Ollama, Jan, LM Studio — zero configuration for local models. Or connect any cloud provider with your existing API key. Switch mid-session without touching a single workflow.

New in v2.2 — Multi-profile encrypted workspace · Dashboard · Skills system · AI Generators · Tasks — autonomous MCP execution · MCP Catalog + Tool Tester · Custom image endpoint · Mobile overhaul

↓ Download v2.2 · Install Chrome Extension

No account · No install · No server · Any modern browser · View on GitHub →

✓ Chrome extension included · ✓ Any AI provider · ✓ Local models auto-detected · ✓ Multi-agent DAG orchestration · ✓ Document RAG & web search · ✓ Offline-capable · 100% client-side · AES-256-GCM encrypted keys · Zero telemetry

Nemilia (neh-MEE-lee-ah), Nahuatl: to think, to remember, to imagine

Build Your Team. Design Your Pipeline.

Nemilia is infrastructure
you configure.

Seven specialist agents ship out of the box. But the real power is in what you build on top — agents with any role, any persona, any model, wired into pipelines that match exactly how you work.

Custom Agents

Any role. Any model.
Any domain.

Every agent is fully configurable — name, icon, color, role, system prompt, model override, temperature, and web search behavior. Build a legal reviewer, a financial analyst, a code auditor, or a brand voice editor. Wire them into any workflow.

🄯

Name & Role

Defines the agent's identity and task scope

💬

System Prompt

Full persona, instructions, and constraints

🔲

Model Override

Use a different model per agent — mix cloud and local

🌡️

Temperature

Per-agent control — precise for review, creative for writing

🌐

Web Search

Auto-detect, always on, or custom query per agent

👁

HITL Checkpoint

Pause and review before this agent's output continues

Custom Workflows

Drag. Wire. Run.
Iterate.

Drag agents to set order, toggle HITL checkpoints per step, and configure a default prompt template with placeholders. The orchestrator handles decomposition, parallel DAG execution, quality scoring, and synthesis automatically.

Reusable Instruction Sets

Skills change how the AI thinks, not how you configure it.

Scope-targeted modifiers injected into agent context at run time — tone, format, analytical stance. Ten built-in skills ship pre-loaded (all off by default). Create more manually or generate with AI from a plain-language description.

Be Concise · Formal Tone · Cite Sources · Rich Markdown · Step-by-Step · Code Expert · Devil's Advocate · Research Mode · Executive Summary · Structured Output

Autonomous Execution

Give it a goal. It figures out the rest.

Tasks run a single LLM agent in a tool-calling loop against your connected MCP servers — no workflow DAG, just a goal and a tool set. Each tool call and result streams live. AI Generators build any asset from a plain-language description using your configured provider.

Generate Agent · Generate Prompt · Generate Workflow · Generate Task · Generate Skill

What people build

Legal contract review · Competitive intelligence · Content production pipeline · Technical architecture audit · Investment due diligence · Market research briefing · Code review & documentation · SEO content factory · Financial analysis report · Product spec generation · Threat intelligence digest · Academic literature review

Everything Included

One file.
No compromises.

No install wizard. No backend. No account. Every capability ships inside the single HTML file you download.

📷

Chrome Extension Capture

Right-click any page to capture text, selections, screenshots, or images. Offline queue holds items until Nemilia is open. Zero-config delivery straight to your Captures inbox.

📦

Captures Inbox

Every capture lands with auto-tagging, vision analysis for images, BM25 + vector embedding, and one-click triage: send to Compose, Chat, a Workflow, or promote to Library for RAG.

💬

Chat Mode

Persistent multi-turn conversations with streaming, document RAG context, memory injection, and live web search. Conversations auto-title and persist across sessions. Pop a chat out into a dedicated window that stays open side by side with your main workspace or any other app.

🔲

Multi-Agent DAG Orchestration

Seven built-in specialist agents — SCOUT, LENS, QUILL, FORGE, WEAVE, NEXUS, VISION — run in parallel pipeline stages. Outputs scored 1–10, retried automatically below threshold. Build unlimited custom agents for any domain.
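The score-and-retry behavior can be pictured with a small loop. This is a minimal sketch, not Nemilia's actual code: the function name, the threshold of 7, and the cap of 3 attempts are all assumptions made for illustration, and both the generator and the scorer are passed in as plain functions.

```typescript
// Hypothetical shapes: the page does not document Nemilia's internal interfaces.
type Attempt = { output: string; score: number };

async function runWithQualityGate(
  generate: (attempt: number) => Promise<string>,
  score: (output: string) => Promise<number>, // expected to return 1..10
  threshold = 7, // assumed value
  maxAttempts = 3, // assumed value
): Promise<Attempt> {
  if (maxAttempts < 1) throw new Error("maxAttempts must be at least 1");
  let best: Attempt | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await generate(attempt);
    const s = await score(output);
    // Keep the highest-scoring attempt in case none reaches the threshold.
    if (best === null || s > best.score) best = { output, score: s };
    if (s >= threshold) break;
  }
  return best as Attempt;
}
```

Returning the best attempt rather than the last one means a run that never clears the threshold still yields its strongest output.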

📄

Library & Document RAG

Upload PDFs, DOCX, TXT, CSV, MD, and more. Hybrid semantic + BM25 search with 384-dimension vector embeddings. Bulk-import via folder scan. Promote captures directly to Library without re-embedding.
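One common way to combine lexical and semantic ranking is a weighted blend of BM25 and cosine similarity. The sketch below assumes that approach; the page does not say how Nemilia fuses the two signals, so the `alpha` weight, the tokenizer, and every name here are illustrative. Embeddings are taken as given (they would be 384-dimensional in practice, but any length works).

```typescript
type Doc = { id: string; text: string; embedding: number[] };

const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Okapi BM25 score of each document for the query (k1 and b are the usual defaults).
function bm25Scores(query: string, docs: Doc[], k1 = 1.2, b = 0.75): number[] {
  const tokens = docs.map((d) => tokenize(d.text));
  const avgLen = tokens.reduce((n, t) => n + t.length, 0) / Math.max(docs.length, 1);
  const terms = Array.from(new Set(tokenize(query)));
  return tokens.map((docTokens) => {
    let score = 0;
    for (const term of terms) {
      const tf = docTokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = tokens.filter((t) => t.indexOf(term) !== -1).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * docTokens.length) / avgLen));
    }
    return score;
  });
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Blend of max-normalized BM25 and cosine similarity; alpha = 0.5 is an assumption.
function hybridSearch(
  query: string,
  queryEmbedding: number[],
  docs: Doc[],
  alpha = 0.5,
): { id: string; score: number }[] {
  const lexical = bm25Scores(query, docs);
  const maxLexical = Math.max(...lexical, 1e-9);
  return docs
    .map((d, i) => ({
      id: d.id,
      score: alpha * (lexical[i] / maxLexical) + (1 - alpha) * cosine(queryEmbedding, d.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}
```

BM25 rewards exact term matches while the embedding term catches paraphrases, which is why the two are usually blended rather than used alone.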

🌐

Live Web Research

NEXUS searches in real time via Serper, Brave, Tavily, or SearXNG. Web search is per-agent, so any agent can query independently. Perplexity is available as a provider with built-in web results.

👁

Human-in-the-Loop (HITL)

Toggle checkpoints on any agent. The pipeline pauses, shows you the output, and waits for your decision before advancing to the next stage: Approve, Edit output, or Send feedback. Optional audio alert when execution pauses.

🔄

Visual Workflow Builder

Drag agents to reorder — position defines DAG stage. Agents at the same level run in parallel. Toggle HITL per step. Live SVG DAG diagram renders inline before and during runs. Built-in Research Suite and Web Research Suite templates.
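The idea that position defines stage, and that agents sharing a stage run in parallel, can be sketched as a grouping step followed by one `Promise.all` per stage. The `Step` shape and both function names are hypothetical, and the `runAgent` callback stands in for whatever actually invokes a model.

```typescript
// Hypothetical shape: an agent name plus the stage its position maps to.
type Step = { agent: string; stage: number };

// Group agents by stage number and return the groups in ascending stage order.
function toStages(steps: Step[]): string[][] {
  const byStage = new Map<number, string[]>();
  for (const { agent, stage } of steps) {
    const group = byStage.get(stage) ?? [];
    group.push(agent);
    byStage.set(stage, group);
  }
  return Array.from(byStage.entries())
    .sort(([a], [b]) => a - b)
    .map(([, agents]) => agents);
}

async function runPipeline(
  steps: Step[],
  runAgent: (agent: string, upstream: string[]) => Promise<string>,
): Promise<string[]> {
  let upstream: string[] = [];
  for (const stage of toStages(steps)) {
    const inputs = upstream;
    // Agents in the same stage start together; the stage ends when all have resolved.
    upstream = await Promise.all(stage.map((agent) => runAgent(agent, inputs)));
  }
  return upstream; // outputs of the final stage
}
```

In this sketch each stage receives the full set of outputs from the stage before it, which is the simplest wiring; a real DAG could pass only the outputs of specific upstream agents.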

🧠

Persistent Agent Memory

Agents write MEMORY[key]: value to a persistent key-value store. Memory injects into every run automatically. Independent toggles for memory writes and injection per session.
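Extracting `MEMORY[key]: value` lines from agent output comes down to a line-anchored pattern match. The following is a sketch under assumptions: the exact grammar Nemilia accepts (whitespace, allowed key characters, multi-line values) is not specified on this page, and the function names and preamble wording are made up for the example.

```typescript
// Collect every `MEMORY[key]: value` line from an agent's output.
// Later writes to the same key overwrite earlier ones.
function extractMemory(output: string): Map<string, string> {
  const pattern = /^[ \t]*MEMORY\[([^\]\r\n]+)\]:[ \t]*(.*)$/gm;
  const entries = new Map<string, string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(output)) !== null) {
    entries.set(match[1].trim(), match[2].trim());
  }
  return entries;
}

// Render the store as a block that could be prepended to the next run's context.
function memoryPreamble(store: Map<string, string>): string {
  const lines: string[] = [];
  store.forEach((value, key) => lines.push("- " + key + ": " + value));
  return lines.length > 0 ? "Known facts from earlier runs:\n" + lines.join("\n") : "";
}
```

Keeping extraction and injection as separate functions mirrors the independent toggles for memory writes and memory injection.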

🎨

VISION — Charts, Diagrams & Images

Generates Chart.js charts, SVG diagrams, and HTML infographics using your LLM, with no image API needed. Connect DALL-E 3, FLUX, Stable Diffusion, or any OpenAI-compatible image endpoint as a custom provider.

🔧

MCP Tool Execution

Agents call real tools via MCP servers — read files, query databases, run code, hit APIs. Supports streamableHttp and SSE transports. Works with Supergateway and any stdio MCP server. 100% client-side.

📁

Workspace & Obsidian Sync

Sync to a real folder on disk via the File System Access API. Every agent, prompt, workflow, and result becomes a plain file, editable in VS Code and versionable in Git. Full Obsidian vault support with wikilink stripping.

🔐

Multi-Profile Encrypted Workspace

Isolated profiles, each with its own encrypted workspace, agents, API keys, and history. AES-256-GCM encryption, PBKDF2 200k iterations. Switch profiles — all in-memory data zeroed first.

📊

Dashboard

Post-login home with live workspace stats, active provider and model, quick-action tiles, and recent chats and workflow results. Navigate anywhere in one click.

⚡

Skills System

10 built-in reusable instruction sets, all off by default. Scope to All, Chat, or Compose. Toggle per run. Create manually or generate with AI. Injected into agent context at run time.
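Scope-targeted injection can be thought of as a filter plus a string join. The sketch below is illustrative only: the `Skill` shape, the `applySkills` name, and the way the instructions are appended to the system prompt are assumptions, and the instruction texts used in any call would be your own.

```typescript
type Scope = "all" | "chat" | "compose";
// Hypothetical shape for a skill record.
type Skill = { name: string; scope: Scope; enabled: boolean; instruction: string };

// Append every enabled skill whose scope covers the current mode.
function applySkills(systemPrompt: string, skills: Skill[], mode: "chat" | "compose"): string {
  const active = skills.filter((s) => s.enabled && (s.scope === "all" || s.scope === mode));
  if (active.length === 0) return systemPrompt;
  const block = active.map((s) => "[" + s.name + "] " + s.instruction).join("\n");
  return systemPrompt + "\n\nActive skills:\n" + block;
}
```

Because skills start disabled, a fresh setup passes the system prompt through unchanged until you toggle one on.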

🤖

Tasks — Autonomous MCP

Goal-driven LLM + tool loop against your connected MCP servers. Live execution log streams every tool call and result. Full run history. Export a run to Library or turn it into a reusable agent.
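A goal-driven tool loop follows a simple pattern: ask the model for its next move, execute the tool it requests, feed the result back, and stop when it returns a final answer. The sketch below shows that pattern with the model and the tools injected as plain functions. All type and function names are hypothetical, the step cap is an assumption, and nothing here reflects the MCP wire protocol itself.

```typescript
type ToolCall = { tool: string; args: Record<string, unknown> };
type ModelTurn = { kind: "call"; call: ToolCall } | { kind: "final"; answer: string };
type LogEntry = { call: ToolCall; result: string };

async function runTask(
  goal: string,
  nextTurn: (goal: string, log: LogEntry[]) => Promise<ModelTurn>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<string>>,
  maxSteps = 10, // assumed safety cap
): Promise<{ answer: string; log: LogEntry[] }> {
  const log: LogEntry[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await nextTurn(goal, log);
    if (turn.kind === "final") return { answer: turn.answer, log };
    const tool = tools[turn.call.tool];
    // Unknown tools are reported back to the model instead of aborting the run.
    const result = tool ? await tool(turn.call.args) : "error: unknown tool " + turn.call.tool;
    log.push({ call: turn.call, result });
  }
  throw new Error("no final answer after " + maxSteps + " steps");
}
```

The `log` array doubles as the execution record: each entry pairs a tool call with its result, which is the information a live log view would stream.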

✨

AI Generators

Generate agents, prompts, workflows, tasks, and skills from a plain-language description. Uses your configured provider. Each opens the editor pre-filled — review and save.

Provider-Agnostic by Design

Your models.
Your rules.
Any provider.

Use whatever AI you already trust — or run everything locally with zero configuration. Nemilia is a production environment for your intelligence, not a gatekeeper to it. Switch providers mid-session without touching a single workflow.

☁️

Cloud API — Any Provider

Use your existing OpenAI, Anthropic, Groq, Gemini, DeepSeek, Fireworks AI, or OpenRouter API key. Keys are encrypted with AES-256-GCM and never leave your browser except to call the provider directly.

✓ Works with existing keys ✓ Any model, any provider ✓ Keys stay encrypted locally

💻

Local — Auto-Detected

Nemilia auto-detects Ollama, Jan, LM Studio, and Llamafile — zero configuration. Every locally installed model becomes a fully capable agent. LM Studio + qwen3.5 confirmed for offline vision analysis.

✓ Zero config auto-detection ✓ Offline vision with LM Studio ✓ Any locally installed model

📴

In-Browser — WebLLM

SmolLM2 1.7B, Llama 3.2 1B, Phi, Qwen, and Gemma run directly on your GPU via WebLLM. No server, no API key, no internet after first model download. Chrome AI (Gemini Nano) available for zero-download on-device chat.

✓ Zero internet after download ✓ GPU-accelerated inference ✓ Chrome AI — no download needed

OpenAI: GPT-4.1, o3, o4-mini
Anthropic: Opus 4.7, Sonnet 4.6, Haiku 4.5
Groq: Llama 4, DeepSeek R1, Kimi K2, Qwen 3
Google Gemini: Gemini 2.5 Pro, 2.5 Flash
OpenRouter: Multi-model routing
DeepSeek: R1, V3
Mistral: Large, Codestral
xAI Grok: Grok 3, Grok 3 mini
Kimi (Moonshot): K2 1T MoE
Fireworks AI: DeepSeek V4, Kimi K2.6, Qwen 3.6
Perplexity: Built-in web search
Ollama: 💻 Auto-detected
LM Studio: 💻 Auto-detected
Jan: 💻 Auto-detected
Llamafile: 💻 Mozilla · localhost
Custom API: 🔌 Any OpenAI-compatible
WebLLM: 📴 In-browser offline

ⓘ Chrome AI (Gemini Nano) also available — zero install, zero API key, on-device chat only. Requires Chrome 127+ with Gemini Nano enabled.

Privacy by Architecture

Your data never touches
our servers.
Because we don’t have any.

Nemilia has no backend. No database. No analytics pipeline. It’s a file. Your documents, captures, prompts, agent memory, and API keys stay where they belong — on your machine. This isn’t a privacy policy. It’s a structural fact.

🔒

AES-256-GCM key encryption: API keys are encrypted with AES-256-GCM under a key derived via PBKDF2 (200,000 iterations) before storage. Never written as plaintext anywhere.
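In a browser, this kind of scheme maps onto the standard Web Crypto API. The sketch below shows one plausible arrangement, not Nemilia's actual implementation: the 16-byte salt, the 12-byte IV, the SHA-256 hash for PBKDF2, and all function names are assumptions. Only the cipher and the iteration count come from the description above.

```typescript
const ITERATIONS = 200000; // iteration count stated above

// Derive a non-extractable AES-256-GCM key from a passphrase with PBKDF2.
async function deriveKey(passphrase: string, salt: BufferSource): Promise<CryptoKey> {
  const material = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(passphrase),
    "PBKDF2",
    false,
    ["deriveKey"],
  );
  return crypto.subtle.deriveKey(
    { name: "PBKDF2", salt, iterations: ITERATIONS, hash: "SHA-256" }, // hash is an assumption
    material,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt", "decrypt"],
  );
}

type Sealed = { salt: BufferSource; iv: BufferSource; ciphertext: ArrayBuffer };

async function sealApiKey(apiKey: string, passphrase: string): Promise<Sealed> {
  const salt = crypto.getRandomValues(new Uint8Array(16)); // assumed size
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh IV for every encryption
  const key = await deriveKey(passphrase, salt);
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    new TextEncoder().encode(apiKey),
  );
  return { salt, iv, ciphertext };
}

async function openApiKey(sealed: Sealed, passphrase: string): Promise<string> {
  const key = await deriveKey(passphrase, sealed.salt);
  const plaintext = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: sealed.iv },
    key,
    sealed.ciphertext,
  );
  return new TextDecoder().decode(plaintext);
}
```

AES-GCM is authenticated, so decrypting with the wrong passphrase rejects instead of returning garbage, and the salt and IV can be stored alongside the ciphertext because neither is secret.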

🚫

Absolute zero telemetry — No analytics, no tracking pixels, no usage beacons, no error reporting to any server. Verify it yourself — it’s one file.

💻

100% client-side processing — Document parsing, vector embeddings, agent memory, orchestration, vision analysis, chart generation. Everything runs locally in your browser tab.

📴

Fully air-gap capable: use Ollama, Jan, LM Studio, or WebLLM and Nemilia operates with no internet connection at all. Suitable for sensitive and regulated workloads.

Capture. Orchestrate. Automate. Deliver.

Download the file.
Open it. You’re running.

Get in Touch
