Web Agent: Self-Learning AI in Your Browser

Hermdroid Web Agent running in a browser window with terminal interface and chat sidebar

Introduction

Web Agent in-browser architecture: self-learning loop with DOM interaction and tool execution

If you've ever tried to use an AI agent, you're probably familiar with the friction: install this Python stack, configure that API key, set up a virtual environment, debug the Docker container, remember to restart the server, and then pray your environment variables don't get lost between reboots.

What if you could use a full-featured AI agent — skills, memory, tools, automation — without installing anything at all?

Today, we're open-sourcing Web Agent: a production-ready AI agent that runs entirely in your browser, powered by WebContainers, with zero local setup required. No Python. No Node.js installation. No server. No command line. Just open the browser, create a profile, set your API key, and start working.

Built on the same architecture as Hermes Agent (our desktop AI assistant), Web Agent brings the full power of autonomous AI workflows to any modern browser — isolated profiles, persistent memory, a knowledge vault, slash commands, cron automation, and multi-platform gateways — while keeping all your data locally encrypted and never sending it to our servers.

This is the AI agent I've been using internally at aratech to automate our Directus blog workflows, research tasks, and knowledge management. Now it's yours, under the MIT license, at github.com/nikola66/web-agent and live at webagent.aratech.ae.

The Problem With AI Agents Today

Let's be honest: AI agents today are powerful, but they're also painful to use.

The standard agent setup is something like this:

Install a local runtime — Python virtual environment, node modules, Docker images, Ollama pulls
Configure your environment — API keys, proxy settings, SSL certificates, system variables
Build your pipeline - glue scripts, framework setup, vector database configuration
Hope it still works tomorrow — your OS update breaks the Python binary, a dependency changes, your local LLM crashes

This friction is why agents haven't reached mainstream adoption. The technology is ready, but the delivery mechanism is stuck in the same complexity trap that early web development was in before cloud platforms abstracted it away.

There's also the state problem. Most agents conflate your data, your conversations, your tasks, and your credentials into a single blob — or worse, require you to trust a third party with all of it. If their server goes down, your agent goes down. If they change their API, your workflow breaks. If they decide to stop offering a free tier, you're locked out.

Finally, there's the specialization gap. The average AI is trained on the entire internet — that's like having a thousand employees, none of whom know anything about your business. You spend half an hour re-explaining your context, your rules, and your goals every time you start a new conversation. That's not a knowledge worker; that's a repetitive onboarding process.

We built Web Agent to solve all three problems at once.

What Is Web Agent?

Web Agent is a full-featured AI agent that runs natively in your browser using WebContainers, the kind of in-browser Node.js runtime that powers StackBlitz and CodeSandbox.

Think of it as Hermes Agent, but ported to the browser. Same skills system, same multi-layer memory, same ~40 built-in tools, same self-learning loop. The difference is: no installation, no server, no environment variables, no Docker.

| Typical AI Agent Setup          | Web Agent                    |
|-------------------------------|------------------------------|
| Install Python/Node/Docker    | Open browser                 |
| Configure .env file           | Set API key (encrypted local)|
| Choose vector DB              | Zero config                  |
| Maintain server uptime        | Works immediately            |
| Data leaves your machine      | Everything stays in browser  |
| Single agent per install      | 4 isolated profiles at once  |

Every profile in Web Agent gets its own:

Isolated workspace — files, shells, and project state sandboxed from other profiles
Separate memory — fact store, session memory, reflections, and learnings scoped per profile
Encrypted credentials — API keys stored locally in the browser, never transmitted to servers
Skill overrides — per-profile skill definitions that inherit from a shared base

If you set up one profile for personal use, one for client work, one for open-source contribution, and one for experiments, each lives in its own world, completely isolated.
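The "skill overrides that inherit from a shared base" idea can be pictured as a simple merge. The sketch below is only an illustration: the `Skill` shape, the `resolveSkills` helper, and the sample skills are assumptions, not the project's actual data model.

```typescript
// Hypothetical shapes: the real project may model skills differently.
interface Skill {
  name: string;
  instructions: string;
}

type SkillSet = Record<string, Skill>;

// Resolve the skills visible to one profile: start from the shared base,
// then let the profile's own definitions replace entries with the same name.
function resolveSkills(base: SkillSet, overrides: SkillSet): SkillSet {
  return { ...base, ...overrides };
}

const baseSkills: SkillSet = {
  "blog-post": { name: "blog-post", instructions: "Write in a neutral tone." },
  "research": { name: "research", instructions: "Cite every source." },
};

const clientOverrides: SkillSet = {
  "blog-post": { name: "blog-post", instructions: "Write in the client's brand voice." },
};

const clientSkills = resolveSkills(baseSkills, clientOverrides);
console.log(clientSkills["blog-post"].instructions); // overridden for this profile
console.log(clientSkills["research"].instructions);  // inherited from the base
```

Because the merge produces a new object, the shared base stays untouched and other profiles keep seeing the original definitions.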

Architecture: Under the Hood

Web Agent's architecture is deliberately layered to keep execution, persistence, and infrastructure as separate concerns:

┌─────────────────────────────────────────────────────┐
│  Browser: React 19 + Vite + TypeScript + xterm.js   │
├─────────────────────────────────────────────────────┤
│  Sidebar │ Terminal (xterm) │ Chat Input             │
│  Profiles│ Transcript       │ Natural Language       │
├─────────────────────────────────────────────────────┤
│            Core Orchestrator                        │
│  • Profile lifecycle management                     │
│  • WebContainer boot/shutdown                       │
│  • Credential vault (encrypted)                     │
├─────────────────────────────────────────────────────┤
│   Embedded Agent Runtime (Node.js in WebContainers)  │
│  ┌──────────────────────────────────────────────┐   │
│  │  LLM Loop (OpenRouter / Ollama / Custom)     │   │
│  │  Tool Registry (~40 built-in tools)          │   │
│  │  Skill Manager (SKILL.md loader)             │   │
│  │  Memory Layers (fact_store, session, reflect)│   │
│  │  Cron Scheduler (heartbeat + jobs)           │   │
│  │  Channel Gateway (Telegram, Email)           │   │
│  └──────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────┤
│   Persistence: IndexedDB + OPFS (browser-local)     │
│   No server-side user state                         │
└─────────────────────────────────────────────────────┘

That last line is worth emphasizing: your data never leaves your browser unless you explicitly configure an external LLM provider. The hosted demo at webagent.aratech.ae only serves the static application; every file, memory, and credential stays in your own browser's IndexedDB or OPFS storage. Even if the demo goes offline, your local data remains accessible through export/import.

This isn't a cloud product with a free tier — it's a tool that runs on your computer, delivered through the browser.

Technology Stack

| Layer         | Technology                                                    |
|---------------|---------------------------------------------------------------|
| Frontend      | React 19.1, TypeScript 5.8, Vite 6.3, Tailwind CSS 4.1        |
| Terminal UI   | xterm.js (with unicode/fit addons)                            |
| State         | Zustand 5.0                                                   |
| Storage       | idb-keyval (IndexedDB) + browser OPFS                         |
| Runtime       | WebContainers (@codesandbox/nodebox v0.1.9)                   |
| LLM Providers | OpenRouter (default), Ollama (cloud), Custom (user base URL)  |
| Browser Tools | TinyFish (web_search / web_fetch), Resend (email)             |
| Deployment    | Static Vite build → Caddy 2.6 or any CDN                      |

Core Features

Let's walk through what actually makes this agent useful on a daily basis.

Isolated Profiles — Multiple Agents, One Browser

Think of a profile as a dedicated workspace for your agent. Each profile has its own:

WebContainer filesystem (virtualized Node.js sandbox)
Memory layers (facts, sessions, reflections, learnings)
Credential vault
Skill overrides
Export/import snapshot

You can run up to 4 agents concurrently, each in its own profile. One profile for work, one for personal use, one for a client project, one for experimentation: they never cross-contaminate.
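To make the concurrency limit concrete, here is a minimal sketch of a manager that refuses to start a fifth profile. Only the limit of 4 comes from the text above; the `ProfileManager` class and its methods are hypothetical names for illustration.

```typescript
// Assumption: the limit of 4 comes from the article; everything else is illustrative.
const MAX_RUNNING_PROFILES = 4;

class ProfileManager {
  private running = new Set<string>();

  // Returns true if the profile is (now) running, false if the cap is reached.
  start(profileId: string): boolean {
    if (this.running.has(profileId)) return true;
    if (this.running.size >= MAX_RUNNING_PROFILES) return false;
    this.running.add(profileId);
    return true;
  }

  stop(profileId: string): void {
    this.running.delete(profileId);
  }

  list(): string[] {
    return Array.from(this.running);
  }
}

const manager = new ProfileManager();
const started = ["work", "personal", "client", "experiments", "extra"].map((id) =>
  manager.start(id),
);
console.log(started);                // the fifth start is refused
manager.stop("experiments");
console.log(manager.start("extra")); // a slot is free again
```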

Knowledge Vault (PARA + Wiki)

Inspired by Karpathy's viral "AI Second Brain" concept, Web Agent has a first-class knowledge vault built in.

Three commands drive it:

/wiki-setup initializes a PARA-structured markdown vault
/wiki-sync ingests your memory, accumulated facts, and skill learnings into the vault
/wiki-search queries your vault when the agent needs to surface context

The vault grows over time as you use the agent. Your sessions, facts, and learnings get synthesized into structured knowledge — not just a flat transcript log. This is the compounding knowledge loop in action.

Multi-Layer Memory

Web Agent stores four distinct types of memory, each with a different purpose:

| Layer            | Purpose                                      | Persistence          |
|------------------|----------------------------------------------|----------------------|
| `fact_store`     | Durable facts about user/project/env         | Across sessions      |
| `session_memory` | Rolling working notes during a conversation  | Current session only |
| `reflections`    | Agent's insights after completing a task     | Across sessions      |
| `learnings`      | Cross-session procedural patterns            | Across sessions      |

Using the /memory-layers skill, you can consciously choose what to store where and avoid context duplication. This is the same memory architecture that powers Hermes Agent's ability to "remember everything that matters and forget everything that doesn't."
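As a rough illustration of how the four layers differ in lifetime, the sketch below keeps entries in one in-memory list and drops only the session notes when a session ends. The layer names come from the table; the `LayeredMemory` class is a hypothetical stand-in for whatever persistence the real runtime uses.

```typescript
// Layer names follow the table above; the store itself is an assumption.
type Layer = "fact_store" | "session_memory" | "reflections" | "learnings";

interface MemoryEntry {
  layer: Layer;
  text: string;
}

class LayeredMemory {
  private entries: MemoryEntry[] = [];

  save(layer: Layer, text: string): void {
    this.entries.push({ layer, text });
  }

  recall(layer: Layer): string[] {
    return this.entries.filter((e) => e.layer === layer).map((e) => e.text);
  }

  // Ending a session drops only the rolling working notes.
  endSession(): void {
    this.entries = this.entries.filter((e) => e.layer !== "session_memory");
  }
}

const memory = new LayeredMemory();
memory.save("fact_store", "The user prefers TypeScript over JavaScript");
memory.save("session_memory", "Currently drafting the landing page copy");
memory.save("learnings", "Fetch the post ID before assigning tags");
memory.endSession();

console.log(memory.recall("fact_store"));     // still present
console.log(memory.recall("session_memory")); // empty after the session ends
```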

Self-Learning Loop

This is the piece that turns a chatbot into an agent that actually gets better over time.

Every time the agent completes a task, it can generate:

Reflections — what worked, what didn't, what was missing
Learnings — procedural patterns that generalize across tasks
Facts — durable nuggets about your domain, preferences, and environment

These flow back into its memory and optionally into the knowledge vault. Over time, the agent doesn't just accumulate data — it assembles expertise.

Use skill_save to turn a successful, well-structured workflow into a reusable SKILL.md that the agent pulls in for related tasks in the future. Your agent's expertise grows alongside your project.
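For a sense of what saving a workflow as a skill file could involve, here is a small sketch that renders a skill into markdown. The front-matter fields, the `SkillDraft` shape, and the `renderSkillMarkdown` helper are all assumptions; the actual SKILL.md schema is whatever the project defines.

```typescript
// Hypothetical fields: the actual SKILL.md schema is defined by the project.
interface SkillDraft {
  name: string;
  description: string;
  steps: string[];
}

// Render a skill as markdown: front matter for lookup, numbered steps for the agent.
function renderSkillMarkdown(skill: SkillDraft): string {
  const steps = skill.steps.map((step, i) => `${i + 1}. ${step}`);
  return [
    "---",
    `name: ${skill.name}`,
    `description: ${skill.description}`,
    "---",
    "",
    "## Steps",
    "",
    ...steps,
    "",
  ].join("\n");
}

const skillFile = renderSkillMarkdown({
  name: "cross-post",
  description: "Cross-post an article to 5 languages with consistent formatting",
  steps: [
    "Fetch the article and its existing translations",
    "Apply the shared formatting rules to each language",
    "Report which languages were published",
  ],
});
console.log(skillFile);
```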

Knowledge Vault (PARA + Wiki): Expanded

Let's be concrete about how the knowledge vault works in practice:

Initialize with /wiki-setup: creates a PARA structure in your workspace under knowledge-vault/
Feed it: drop any business data (PDFs, goals, notes, competitor notes, voice and other transcripts) into the workspace
Sync with /wiki-sync: the agent compiles all that raw material into a structured, AI-native knowledge base with an index, log, and cross-linked concepts
Query with /wiki-search: the agent searches your vault before falling back on the LLM's general knowledge, producing outputs that are uniquely yours

This is how you turn generic AI slop into something that actually understands your business, your voice, your goals, and your past decisions. One query on your YouTube strategy vault produces video ideas that sound like you built them, not something that could have been generated for any channel.
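The setup and search steps above can be sketched in a few lines. PARA stands for Projects, Areas, Resources, Archives; beyond the `knowledge-vault/` root and the index and log mentioned above, the folder names, file names, and the keyword-count ranking are assumptions rather than the project's real behavior.

```typescript
// Only the knowledge-vault/ root, the index, and the log come from the text;
// folder and file names below are assumptions.
const VAULT_ROOT = "knowledge-vault";
const PARA_FOLDERS = ["projects", "areas", "resources", "archives"];

// Paths a setup step might create: one folder per PARA category plus an index and a log.
function vaultLayout(root: string = VAULT_ROOT): string[] {
  const folders = PARA_FOLDERS.map((name) => `${root}/${name}/`);
  return [...folders, `${root}/index.md`, `${root}/log.md`];
}

interface Note {
  path: string;
  body: string;
}

// Naive keyword search: rank notes by how many query terms appear in the body.
function searchVault(notes: Note[], query: string): Note[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return notes
    .map((note) => {
      const body = note.body.toLowerCase();
      const score = terms.filter((t) => body.indexOf(t) !== -1).length;
      return { note, score };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((hit) => hit.note);
}

const notes: Note[] = [
  { path: "knowledge-vault/projects/youtube-strategy.md", body: "Video ideas for the YouTube channel" },
  { path: "knowledge-vault/areas/editorial.md", body: "Editorial calendar and review rules" },
];

console.log(vaultLayout());
console.log(searchVault(notes, "youtube video").map((n) => n.path));
```

A real implementation would likely use something richer than substring counts, but the shape is the same: search the vault first, then hand the hits to the model as context.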

~40 Built-in Tools

Web Agent ships with a comprehensive toolset out of the box:

Filesystem: read_file, write_file, edit_file, multi_edit, delete_file, move_file, make_dir, tree, find_files, grep, file_diff, file_stat

Memory: memory_save, memory_recall, memory_search, session_memory_append, session_memory_list, session_search

Skills: skill_list, skill_view, skill_save, skill_manage, skill_bulk_save, skill_delete, skill_recall

Automation: cron_register, cron_list, todo_write

Web & Vision: web_search, web_fetch, vision_analyze, youtube_transcribe, email

System: run_shell, system_info, artifact_present, apply_patch

All of these are available inside the WebContainer sandbox. They operate on your profile's isolated workspace, so you can experiment, break things, and recover without fear of losing the rest of your system.
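One way to picture a tool registry with a guardrail on destructive operations is sketched below. The tool names `read_file` and `delete_file` come from the list above; the `ToolRegistry` API, the `destructive` flag, and the in-memory file map are assumptions made so the example runs on its own.

```typescript
// Hypothetical registry: tool names come from the list above, the API does not.
type ToolHandler = (args: Record<string, string>) => string;

interface ToolSpec {
  name: string;
  destructive: boolean; // destructive tools need explicit confirmation
  run: ToolHandler;
}

class ToolRegistry {
  private tools = new Map<string, ToolSpec>();

  register(spec: ToolSpec): void {
    this.tools.set(spec.name, spec);
  }

  call(name: string, args: Record<string, string>, confirmed = false): string {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    if (tool.destructive && !confirmed) {
      return `confirmation required for ${name}`;
    }
    return tool.run(args);
  }
}

// An in-memory stand-in for the profile's sandboxed filesystem.
const files: Record<string, string> = { "notes.md": "hello" };

const registry = new ToolRegistry();
registry.register({
  name: "read_file",
  destructive: false,
  run: (a) => files[a.path] ?? "",
});
registry.register({
  name: "delete_file",
  destructive: true,
  run: (a) => {
    delete files[a.path];
    return `deleted ${a.path}`;
  },
});

console.log(registry.call("read_file", { path: "notes.md" }));
console.log(registry.call("delete_file", { path: "notes.md" }));       // blocked
console.log(registry.call("delete_file", { path: "notes.md" }, true)); // runs
```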

Slash Commands & Planning Mode

Web Agent uses a slash command system borrowed from the best terminal UX patterns (Hermes Agent, Claude Code, OpenCode):

/help          — show all available tools and commands
/clear         — restart with a fresh conversation (keeps profile data)
/compact       — compress older context, keep current conversation going
/checkpoint    — save a named snapshot of the current session
/rollback      — load a checkpoint
/skills        — list/search installed skills
/plan [goal]   — enter specification-first planning mode
/stop          — interrupt current tool run
/exit          — terminate the terminal session

Planning mode (/plan) is especially powerful. When you want to tackle a complex task:

Type /plan build a landing page for our new product
Web Agent reads your workspace (read-only, no modifications yet)
It writes a full specification markdown file to .webagent/plans/ and presents it for your approval
You review, revise, or accept, then say "execute the plan" in your next message
It executes the plan step by step, with full transparency

This is how you get rigorous execution and human oversight — the plan is reviewed before any code is written.
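The plan-then-approve flow can be modeled as a tiny state machine on top of a slash-command parser. This is a sketch under assumptions: the `parseInput` helper, the `PlanSession` class, and the state names are hypothetical, and the real agent writes a specification file rather than returning a string.

```typescript
// Command names come from the list above; the parser and plan gate are illustrative.
interface ParsedInput {
  kind: "command" | "chat";
  name?: string;
  argument: string;
}

function parseInput(line: string): ParsedInput {
  const trimmed = line.trim();
  if (trimmed.charAt(0) !== "/") return { kind: "chat", argument: trimmed };
  const space = trimmed.indexOf(" ");
  const name = space === -1 ? trimmed.slice(1) : trimmed.slice(1, space);
  const argument = space === -1 ? "" : trimmed.slice(space + 1).trim();
  return { kind: "command", name, argument };
}

type PlanState = "idle" | "awaiting_approval" | "executing";

// Planning is a small state machine: nothing executes until the plan is approved.
class PlanSession {
  state: PlanState = "idle";
  goal = "";

  handle(line: string): string {
    const input = parseInput(line);
    if (input.kind === "command" && input.name === "plan") {
      this.goal = input.argument;
      this.state = "awaiting_approval";
      return `spec written for: ${this.goal}`;
    }
    const approved = input.kind === "chat" && input.argument === "execute the plan";
    if (this.state === "awaiting_approval" && approved) {
      this.state = "executing";
      return `executing: ${this.goal}`;
    }
    return "no plan action";
  }
}

const session = new PlanSession();
console.log(session.handle("/plan build a landing page for our new product"));
console.log(session.state); // awaiting_approval
console.log(session.handle("execute the plan"));
console.log(session.state); // executing
```

The point of the gate is that "execute the plan" does nothing unless a plan is already waiting for approval.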

Multi-Platform Gateway

Web Agent isn't confined to the browser window. It includes a channel gateway architecture that can connect the agent to:

Telegram — polling channel, long-running sessions in chat
Email — via Resend provider, send and receive email from the agent
Extensible — add new channels by dropping a capability module under src/capabilities/channels/ and rebuilding

In our Directus blog management workflow, we've wired Web Agent to manage scheduled posts, pull analytics, and respond to editorial queries, all through a Telegram chat interface. The agent runs in the browser (hosted demo), but the conversations happen in Telegram. That's the versatility you get from a proper channel abstraction layer.
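A channel abstraction of this kind usually boils down to a small interface plus a dispatcher. The sketch below uses an in-memory channel so it runs without network access; the `Channel` interface, `Gateway` class, and `pump` method are assumed names, not the project's actual module API.

```typescript
// Hypothetical interface: real channels (Telegram, email) would do network I/O,
// so this sketch uses an in-memory channel to stay runnable.
interface Channel {
  id: string;
  poll(): string[];          // fetch pending inbound messages
  send(reply: string): void; // deliver an outbound reply
}

class InMemoryChannel implements Channel {
  id: string;
  inbox: string[] = [];
  outbox: string[] = [];

  constructor(id: string) {
    this.id = id;
  }

  poll(): string[] {
    const pending = this.inbox;
    this.inbox = [];
    return pending;
  }

  send(reply: string): void {
    this.outbox.push(reply);
  }
}

// The gateway hands every inbound message to the agent and routes the reply
// back through the channel it arrived on.
class Gateway {
  private channels: Channel[] = [];
  private agent: (message: string) => string;

  constructor(agent: (message: string) => string) {
    this.agent = agent;
  }

  attach(channel: Channel): void {
    this.channels.push(channel);
  }

  // Process all pending messages once and return how many were handled.
  pump(): number {
    let handled = 0;
    for (const channel of this.channels) {
      for (const message of channel.poll()) {
        channel.send(this.agent(message));
        handled += 1;
      }
    }
    return handled;
  }
}

const telegram = new InMemoryChannel("telegram");
const gateway = new Gateway((message) => `agent reply to: ${message}`);
gateway.attach(telegram);
telegram.inbox.push("Which drafts are ready for review?");
console.log(gateway.pump());
console.log(telegram.outbox);
```

Adding a new platform then means writing one more class that satisfies the interface, which matches the "drop in a module and rebuild" extension story described above.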

Security & Privacy

There's a difference between "we claim we don't use your data" and "your data physically cannot leave your browser."

Web Agent aims for the latter. Apart from the LLM and web requests you configure yourself, the local architecture guarantees:

Encrypted per-profile API keys: stored in browser storage, never transmitted in the clear
Workspace isolation: one profile cannot read another profile's files or memory
No server-side user state: the hosted demo is transit-only, so nothing about your session is stored on the server
Stateless CORS proxy: the fetch sidecar does not log or store traffic
Secret redaction: API keys and credentials are redacted before any log output
Tool guardrails: confirmation prompts for destructive operations, loop timeout protection

You can run Web Agent entirely offline for all local work; only LLM calls and web fetch operations require network access, and you control the credentials for both.
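Secret redaction is the easiest of these guarantees to illustrate. The sketch below assumes the secrets are known values held in the profile's vault; the `redactSecrets` helper and the sample keys are made up, and a real implementation may also match key patterns.

```typescript
// Assumption: secrets are known strings from the credential vault.
function redactSecrets(line: string, secrets: string[]): string {
  // Longest first, so a secret that contains another is replaced whole.
  // Empty strings are skipped because splitting on "" would break the line apart.
  const ordered = secrets
    .filter((s) => s.length > 0)
    .sort((a, b) => b.length - a.length);
  let output = line;
  for (const secret of ordered) {
    output = output.split(secret).join("[REDACTED]");
  }
  return output;
}

const vaultSecrets = ["sk-example-123", "re_example_456"];
const logLine = "POST /chat auth=sk-example-123 retry auth=sk-example-123";
console.log(redactSecrets(logLine, vaultSecrets));
// prints "POST /chat auth=[REDACTED] retry auth=[REDACTED]"
```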

Getting Started in 60 Seconds

Here's the whole setup:

1. Open the demo: https://webagent.aratech.ae
2. Create a new profile (click "New Profile")
3. Set your LLM provider and API key (encrypted locally)
4. Start chatting: zero configuration required

That's it. No environment variables, no terminal, no build step. The agent boots its WebContainer runtime in ~5 seconds and you're in.

If you want to customize or contribute:

git clone https://github.com/nikola66/web-agent.git
cd web-agent && npm install
npm run dev          # local development with hot-reload
npm run build        # production static build

Deploy anywhere static files are served — Vercel, Netlify, Cloudflare Pages, a Caddy server, or a simple npx serve dist. No database, no server-side API required.

Self-Learning: The Agent That Gets Smarter Every Conversation

Let me highlight the self-learning loop one more time because it's the feature that will change how you think about AI agents.

Every interaction produces three things the system can store:

Facts — "The user prefers TypeScript over JavaScript", "Our Directus blog uses English, Arabic, Spanish, German, and French"
Reflections — "The video script task went well this time because the outline was approved before drafting", "I should check for typos when writing code examples"
Learnings — "When working with the Directus API, always fetch the post ID before attempting to assign tags"

These are not transcript logs. They are structured, retrievable, intent-bearing pieces of knowledge that the agent can recall, apply, and reflect on. Over time, the agent doesn't just "remember" your recent conversation — it understands your project's trajectory and can fill in context gaps without explicit prompting.

Use skill_save to promote a particularly good workflow (like "cross-post to 5 languages with consistent formatting") into a reusable skill. Next time you say "cross-post my article," the agent pulls in that skill, checks your Directus translations, formats everything consistently, and returns the job done — without re-learning the process from scratch.

Comparison: Web Agent vs The Alternatives

How does this compare to what's out there today?

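| Feature                       | Web Agent | Claude Desktop / Code | Custom GPT-style | LangChain     |
|-------------------------------|-----------|-----------------------|------------------|---------------|
| Zero-install browser-native   | ✅        | ❌                    | ❌               | ❌            |
| Browser-native tool access    | ✅        | ❌                    | ❌               | ❌            |
| Isolated per-context profiles | ✅        | Partial               | ❌               | Custom        |
| Persistent multi-layer memory | ✅        | ✅                    | ❌               | Partial       |
| Knowledge vault (PARA)        | ✅        | ❌                    | ❌               | Partial       |
| Skills system                 | ✅        | ✅                    | ❌               | Partial       |
| Self-learning loop            | ✅        | ❌                    | ❌               | ❌            |
| Multi-platform gateway        | ✅        | ❌                    | ❌               | Partial       |
| Self-hostable / open-source   | ✅        | ❌                    | ❌               | Partial (MIT) |
| Static build / no server      | ✅        | ❌                    | ❌               | ❌            |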

The honest difference: Web Agent is unusual. Most AI agent tools ship either as locally installed developer tools (Claude Code, Cursor) or as hosted cloud services (Bolt, v0). Web Agent rethinks where the agent lives: in the browser, under your control, with nothing to install first. That matters.

Real-World Usage: How We Use It

Here's a representative sample of how we use Web Agent internally:

Daily Blog Management: We route our editorial workflow through a Telegram channel connected to Web Agent. The agent reads our Directus blog, identifies drafts ready for review, formats them for publication, schedules cross-posts in 5 languages, and flags anything that needs human attention.

Research & Knowledge Compilation: We drop raw materials (videos, PDFs, competitor notes) into the agent's workspace, then run /wiki-sync to have the agent synthesize them into a structured knowledge vault, following the same Karpathy second-brain pattern we've discussed publicly. The difference: it happens automatically in the browser, not through manual prompt engineering in Claude Code.

Scheduled Automation: Cron jobs run the agent in the background against its embedded Node.js runtime. One runs nightly: "scan this folder for new designs, generate alt text using vision, and append to a changelog." All within the browser tab, no external server required.
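A background job like the nightly one can be sketched with a scheduler driven by an explicit clock tick. Everything here is an assumption for illustration: the `Scheduler` class, interval-based timing, and the job name are hypothetical, and the real `cron_register` tool may accept cron expressions instead.

```typescript
// Hypothetical scheduler: jobs are interval-based and driven by an explicit
// tick so the example stays deterministic.
interface Job {
  name: string;
  everyMs: number;
  nextRunAt: number;
  task: () => void;
}

class Scheduler {
  private jobs: Job[] = [];

  register(name: string, everyMs: number, task: () => void, now: number): void {
    this.jobs.push({ name, everyMs, nextRunAt: now + everyMs, task });
  }

  // Run every job that is due at `now` and return the names that ran.
  tick(now: number): string[] {
    const ran: string[] = [];
    for (const job of this.jobs) {
      if (now >= job.nextRunAt) {
        job.task();
        job.nextRunAt = now + job.everyMs;
        ran.push(job.name);
      }
    }
    return ran;
  }
}

const DAY_MS = 24 * 60 * 60 * 1000;
const changelog: string[] = [];
const scheduler = new Scheduler();
scheduler.register(
  "nightly-alt-text",
  DAY_MS,
  () => {
    changelog.push("alt text generated");
  },
  0,
);

console.log(scheduler.tick(DAY_MS / 2)); // not due yet
console.log(scheduler.tick(DAY_MS));     // runs once
```

In a browser tab, the tick would typically be driven by a timer, which also means jobs only run while the tab is open.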

Experimentation Sandbox: Each profile is a disposable workspace. Trying a new git repo, running an experiment with a new API, building a quick prototype: spin up a fresh profile, do the work, export or discard. Nothing persists unless you want it to.

Open Source & Community

Web Agent is MIT-licensed. We built it to be as hackable as possible:

Drop-in capability extensions: Put a folder under src/capabilities/{tools,providers,channels,skills}/ and rebuild — the system auto-discovers and loads it
Full access to agent internals: the embedded runtime is plain TypeScript compiled to ESM; browse it, modify it, rebuild it
No gated features: everything in the repo is available in the live demo, no credit card, no invite

We'd love your contributions. If you've built an interesting skill, a new tool provider, or a creative workflow, please open a PR or open an issue to tell us about it.

Repository: https://github.com/nikola66/web-agent
Live Demo: https://webagent.aratech.ae
Support (if you want to buy a coffee): http://ko-fi.com/nikola66

Roadmap: What's Next

We're actively developing on the main branch. The v0.0.6 release (May 16, 2026) added the PARA knowledge vault builtins (/wiki-setup, /wiki-sync, /wiki-search), safer memory projection, and a set of Open Web Research capabilities for deep discovery tasks.

Short-term roadmap (next few weeks):

More built-in skill templates (Directus management, blog cross-posting, podcast production)
Expanded provider list (OpenAI, DeepSeek, and others as OpenAI-compatible)
Support for more concurrent profiles
Smoke-test suite for the built-in tools (in progress)
Public skill registry — share and discover community skills

Medium-term:

Plugin system for workspace-level extensions
Media-heavy workflows (audio transcription, video analysis, image generation pipelines)
Deeper insight dashboard: "what has this agent learned about my project?"
Team/shared profile modes for small teams

Conclusion

The promise of AI agents has always been: autonomous workflows that know your context, learn from your feedback, and get smarter over time. The problem has been friction — installation, maintenance, isolation, trust.

Web Agent eliminates friction. It runs in the browser, never sends your data to our servers, keeps your profiles isolated, builds a growing knowledge base about your work, and gives you the full power of autonomous AI — no Docker, no Python, no server.

It's not a toy. It's the same system we built for ourselves, now open-sourced for anyone who wants to use it, study it, customize it, or apply it to something we haven't thought of yet.

We'd love to hear how you use it.

Try it now: https://webagent.aratech.ae
See the code: https://github.com/nikola66/web-agent
Star the repo: ⭐ https://github.com/nikola66/web-agent

What's next for you? Join our community, build a skill, share your setup. We're building something different — with you, not just for you.

Introduction

!Web Agent in-browser architecture: self-learning loop with DOM interaction and tool execution

What if you could use a full-featured AI agent — skills, memory, tools, automation — without installing anything at all?

The Problem With AI Agents Today

Let's be honest: AI agents today are powerful, but they're also painful to use.

The standard agent setup is something like this:

Install a local runtime — Python virtual environment, node modules, Docker images, Ollama pulls
Configure your environment — API keys, proxy settings, SSL certificates, system variables
Build your pipeline - glue scripts, framework setup, vector database configuration
Hope it still works tomorrow — your OS update breaks the Python binary, a dependency changes, your local LLM crashes

We built Web Agent to solve all three problems at once.

What Is Web Agent?

| Typical AI Agent Setup          | Web Agent                    |
|-------------------------------|------------------------------|
| Install Python/Node/Docker    | Open browser                 |
| Configure .env file           | Set API key (encrypted local)|
| Choose vector DB              | Zero config                  |
| Maintain server uptime        | Works immediately            |
| Data leaves your machine      | Everything stays in browser  |
| Single agent per install      | 4 isolated profiles at once  |

Every profile in Web Agent gets its own:

Isolated workspace — files, shells, and project state sandboxed from other profiles
Separate memory — fact store, session memory, reflections, and learnings scoped per profile
Encrypted credentials — API keys stored locally in the browser, never transmitted to servers
Skill overrides — per-profile skill definitions that inherit from a shared base

If you profile up for personal use, one for client work, one for open-source contribution, and one for experiments — they each live in their own world, completely isolated.

Architecture: Under the Hood

Web Agent's architecture is deliberately layered to keep execution, persistence, and infrastructure as separate concerns:

┌─────────────────────────────────────────────────────┐
│  Browser: React 19 + Vite + TypeScript + xterm.js   │
├─────────────────────────────────────────────────────┤
│  Sidebar │ Terminal (xterm) │ Chat Input             │
│  Profiles│ Transcript       │ Natural Language       │
├─────────────────────────────────────────────────────┤
│            Core Orchestrator                        │
│  • Profile lifecycle management                     │
│  • WebContainer boot/shutdown                       │
│  • Credential vault (encrypted)                     │
├─────────────────────────────────────────────────────┤
│   Embedded Agent Runtime (Node.js in WebContainers)  │
│  ┌──────────────────────────────────────────────┐   │
│  │  LLM Loop (OpenRouter / Ollama / Custom)     │   │
│  │  Tool Registry (~40 built-in tools)          │   │
│  │  Skill Manager (SKILL.md loader)             │   │
│  │  Memory Layers (fact_store, session, reflect)│   │
│  │  Cron Scheduler (heartbeat + jobs)           │   │
│  │  Channel Gateway (Telegram, Email)           │   │
│  └──────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────┤
│   Persistence: IndexedDB + OPFS (browser-local)     │
│   No server-side user state                         │
└─────────────────────────────────────────────────────┘

This isn't a cloud product with a free tier — it's a tool that runs on your computer, delivered through the browser.

Technology Stack

Layer	Technology
Frontend	React 19.1, TypeScript 5.8, Vite 6.3, Tailwind CSS 4.1
Terminal UI	xterm.js (with unicode/fit addon)
State	Zustand 5.0
Storage	idb-keyval (IndexedDB) + browser OPFS
Runtime	WebContainers (@codesandbox/nodebox v0.1.9)
LLM Providers	OpenRouter (default), Ollama (cloud), Custom (user base URL)
Browser Tools	TinyFish (web_search / web_fetch), Resend (email)
Deployment	Static Vite build → Caddy 2.6 or any CDN

Core Features

Let's walk through what actually makes this agent useful on a daily basis.

Isolated Profiles — Multiple Agents, One Browser

Think of a profile as a dedicated workspace for your agent. Each profile has its own:

WebContainer filesystem (virtualized Node.js sandbox)
Memory layers (facts, sessions, reflections, learnings)
Credential vault
Skill overrides
Export/import snapshot

Knowledge Vault (PARA + Wiki)

Inspired by Karpathy's viral "AI Second Brain" concept, Web Agent has a first-class knowledge vault built in.

You can:

/wiki-setup to initialize a PARA-structured markdown vault
/wiki-sync to ingest all your memory, accumulated facts, and skill learnings into the vault
/wiki-search to query your vault when the agent needs to surface context

Multi-Layer Memory

Web Agent stores four distinct types of memory, each with a different purpose:

Layer	Purpose	Persistence
`fact_store`	Durable facts about user/project/env	Across sessions
`session_memory`	Rolling working notes during a conversation	Current session only
`reflections`	Agent's insights after completing a task	Across sessions
`learnings`	Cross-session procedural patterns	Across sessions

Self-Learning Loop

This is the piece that turns a chatbot into an agent that actually gets better over time.

Every time the agent completes a task, it can generate:

Reflections — what worked, what didn't, what was missing
Learnings — procedural patterns that generalize across tasks
Facts — durable nuggets about your domain, preferences, and environment

These flow back into its memory and optionally into the knowledge vault. Over time, the agent doesn't just accumulate data — it assembles expertise.

Use skill_save to turn a successful, well-structured workflow into a reusable SKILL.md that the agent pulls in for related tasks in the future. Your agent's expertise grows alongside your project.

Knowledge Vault (PARA + wiki) - Expanded

Let's be concrete about how the knowledge vault works in practice:

Initialize with /wiki-setup — creates a PARA-structure in your workspace under knowledge-vault/
Feed it — drop any business data (transcripts, PDFs, goals, competitor notes, notes, voice transcripts) into the workspace
Sync with /wiki-sync — the agent compiles all that raw material into a structured, AI-native knowledge base with an index, log, and cross-linked concepts
Query with /wiki-search — the agent searches your vault before the general LLM knowledge base, producing outputs that are uniquely yours

~40 Built-in Tools

Web Agent ships with a comprehensive toolset out of the box:

Filesystem: read_file, write_file, edit_file, multi_edit, delete_file, move_file, make_dir, tree, find_files, grep, file_diff, file_stat

Memory: memory_save, memory_recall, memory_search, session_memory_append, session_memory_list, session_search

Skills: skill_list, skill_view, skill_save, skill_manage, skill_bulk_save, skill_delete, skill_recall

Automation: cron_register, cron_list, todo_write

Web & Vision: web_search, web_fetch, vision_analyze, youtube_transcribe, email

System: run_shell, system_info, artifact_present, apply_patch

Slash Commands & Planning Mode

Web Agent uses a slash command system borrowed from the best terminal UX patterns (Hermes Agent, Claude Code, OpenCode):

/help          — show all available tools and commands
/clear         — restart with a fresh conversation (keeps profile data)
/compact       — compress older context, keep current conversation going
/checkpoint    — save a named snapshot of the current session
/rollback      — load a checkpoint
/skills        — list/search installed skills
/plan [goal]   — enter specification-first planning mode
/stop          — interrupt current tool run
/exit          — terminate the terminal session

Planning mode (/plan) is especially powerful. When you want to tackle a complex task:

Type /plan build a landing page for our new product
Web Agent reads your workspace (read-only, no modifications yet)
It writes a full specification markdown file to .webagent/plans/ and presents it for your approval
You review, revise, or accept — then say "execute the plan" on your next message
It executes the plan step by step, with full transparency

This is how you get rigorous execution and human oversight — the plan is reviewed before any code is written.

Multi-Platform Gateway

Web Agent isn't confined to the browser window. It includes a channel gateway architecture that can connect the agent to:

Telegram — polling channel, long-running sessions in chat
Email — via Resend provider, send and receive email from the agent
Extensible — add new channels by dropping a capability module under src/capabilities/channels/ and rebuilding

Security & Privacy

There's a difference between "we claim we don't use your data" and "your data physically cannot leave your browser."

Web Agent does the latter. The local architecture guarantees:

Encrypted per-profile API keys — stored in browser storage, never transmitted in the clear
Workspace isolation — one profile's files and memory can't access another's
No server-side user state — the hosted demo is transit-only; closing your browser discards your session from the server
CORS proxy stateless — the fetch sidecar does not log or store traffic
Secret redaction — API keys and credentials are redacted before any log output
Tool guardrails — confirmation prompts for destructive operations, loop timeout protection

You can run Web Agent entirely offline for all local work; only LLM calls and web fetch operations require network access — and you control both credentials.

Getting Started in 60 Seconds

Here's the whole setup:

# 1. Open the demo
## → https://webagent.aratech.ae

## 2. Create a new profile (click "New Profile")
## 3. Set your LLM provider and API key (encrypted locally)
## 4. Start chatting — zero configuration required

That's it. No environment variables, no terminal, no build step. The agent boots its WebContainer runtime in ~5 seconds and you're in.

If you want to customize or contribute:

git clone https://github.com/nikola66/web-agent.git
cd web-agent && npm install
npm run dev          # local development with hot-reload
npm run build        # production static build

Deploy anywhere static files are served — Vercel, Netlify, Cloudflare Pages, a Caddy server, or a simple npx serve dist. No database, no server-side API required.

Self-Learning: The Agent That Gets Smarter Every Conversation

Let me highlight the self-learning loop one more time because it's the feature that will change how you think about AI agents.

Every interaction produces three things the system can store:

Facts — "The user prefers TypeScript over JavaScript", "Our Directus blog uses English, Arabic, Spanish, German, and French"
Reflections — "The video script task went well this time because the outline was approved before drafting", "I should check for typos when writing code examples"
Learnings — "When working with the Directus API, always fetch the post ID before attempting to assign tags"
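
The three categories above could be modeled roughly as follows. This is a sketch under assumptions: the MemoryKind, MemoryEntry, and LearningStore names and the duplicate-skipping rule are hypothetical, since the post names the categories but not their schema or storage format.

```typescript
// Hypothetical schema: the post names the three categories, not their fields.
type MemoryKind = "fact" | "reflection" | "learning";

interface MemoryEntry {
  kind: MemoryKind;
  text: string;
  createdAt: number; // epoch milliseconds
}

class LearningStore {
  private entries: MemoryEntry[] = [];

  // Store an entry; returns false for empty text or an exact duplicate of the same kind.
  record(kind: MemoryKind, text: string, now: number = Date.now()): boolean {
    const normalized = text.trim();
    if (normalized === "") return false;
    const duplicate = this.entries.some(
      (e) => e.kind === kind && e.text === normalized,
    );
    if (duplicate) return false;
    this.entries.push({ kind, text: normalized, createdAt: now });
    return true;
  }

  // All stored texts of one kind, in insertion order.
  byKind(kind: MemoryKind): string[] {
    return this.entries.filter((e) => e.kind === kind).map((e) => e.text);
  }

  // Render the stored entries as a text block that could be prepended to a later prompt.
  toPromptContext(): string {
    const kinds: MemoryKind[] = ["fact", "reflection", "learning"];
    return kinds
      .map((kind) => ({ kind, items: this.byKind(kind) }))
      .filter((group) => group.items.length > 0)
      .map((group) => `${group.kind}s:\n` + group.items.map((t) => `- ${t}`).join("\n"))
      .join("\n\n");
  }
}
```

Keeping the kind on every entry lets later conversations pull in only what is relevant, for example learnings when a tool is about to be called and facts when drafting content.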

Comparison: Web Agent vs The Alternatives

How does this compare to what's out there today?

Feature	Web Agent	Claude Desktop / Code	CustomGPT Style	LangChain
Zero-install browser-native	✅	❌	❌	❌
Browser-native tool access	✅	❌	❌	❌
Isolated per-context profiles	✅	Partial	❌	Custom
Persistent multi-layer memory	✅	✅	❌	Partial
Knowledge vault (PARA)	✅	❌	❌	Partial
Skills system	✅	✅	❌	Partial
Self-learning loop	✅	❌	❌	❌
Multi-platform gateway	✅	❌	❌	Partial
Self-hostable / open-source	✅	❌	❌	Partial (MIT)
Static build / no server	✅	❌	❌	❌

Real-World Usage: How We Use It

Here's a representative sample of how we use Web Agent internally:

Open Source & Community

Web Agent is MIT-licensed. We built it to be as hackable as possible:

Drop-in capability extensions: Put a folder under src/capabilities/{tools,providers,channels,skills}/ and rebuild — the system auto-discovers and loads it
Full access to agent internals: the embedded runtime is plain TypeScript compiled to ESM; browse it, modify it, rebuild it
No gated features: everything in the repo is available in the live demo, no credit card, no invite
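
As an illustration of the drop-in idea, a tool module might look something like the sketch below. The ToolDefinition interface, the word_count tool, and the discoverTools helper are all hypothetical; the real contract and the discovery mechanism live in the repository and may differ, so check the existing folders under src/capabilities/ before copying this shape.

```typescript
// Hypothetical contract: the interface the real runtime expects may differ.
interface ToolDefinition {
  name: string;
  description: string;
  run(args: Record<string, unknown>): Promise<string>;
}

// Example tool, as it might live in src/capabilities/tools/word-count/index.ts
const wordCountTool: ToolDefinition = {
  name: "word_count",
  description: "Counts whitespace-separated words in a string.",
  async run(args) {
    const text = typeof args.text === "string" ? args.text : "";
    const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
    return String(words.length);
  },
};

// Stand-in for build-time discovery: index every module found under a
// capability folder by tool name and reject name collisions.
function discoverTools(
  modules: Record<string, ToolDefinition>,
): Map<string, ToolDefinition> {
  const registry = new Map<string, ToolDefinition>();
  for (const [path, tool] of Object.entries(modules)) {
    if (registry.has(tool.name)) {
      throw new Error(`Duplicate tool name "${tool.name}" in ${path}`);
    }
    registry.set(tool.name, tool);
  }
  return registry;
}
```

Failing loudly on a duplicate name is a deliberate choice in this sketch: a silently shadowed tool is much harder to debug than a build that stops.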

We'd love your contributions. If you've built an interesting skill, a new tool provider, or a creative workflow, please open a PR or open an issue to tell us about it.

Repository: https://github.com/nikola66/web-agent
Live Demo: https://webagent.aratech.ae
Support (if you want to buy a coffee): http://ko-fi.com/nikola66

Roadmap: What's Next

Short-term roadmap (next few weeks):

More built-in skill templates (Directus management, blog cross-posting, podcast production)
Expanded provider list (OpenAI, DeepSeek, and other OpenAI-compatible providers)
Support for more concurrent profiles
Smoke-test suite for tools (in progress)
Public skill registry — share and discover community skills

Medium-term:

Plugin system for workspace-level extensions
Media-heavy workflows (audio transcription, video analysis, image generation pipelines)
Deeper insight dashboard: "what has this agent learned about my project?"
Team/shared profile modes for small teams

Conclusion

Web Agent is not a toy. It's the same system we built for ourselves, now open-sourced for anyone who wants to use it, study it, customize it, or apply it to something we haven't thought of yet.

We'd love to hear how you use it.

Try it now: https://webagent.aratech.ae
See the code: https://github.com/nikola66/web-agent
Star the repo: ⭐ https://github.com/nikola66/web-agent

What's next for you? Join our community, build a skill, share your setup. We're building something different — with you, not just for you.

