AI Agent

Autonomous web agent powered by multi-LLM reasoning

Details

Category: AI / Automation
Style: Experimental
Type: Desktop / Web Agent
Subject: Artificial Intelligence & Automation
Status: Active Prototype
Integrations: Gemini, GPT, Claude, DeepSeek

Overview
AI Agent is an autonomous browser agent that navigates websites, interprets page context, and carries out complex, multi-step tasks without manual intervention. Unlike traditional web scrapers, which only extract data, it reasons, plans, and acts by delegating decisions to large language models.

What I built

  • Multi-LLM integration allowing users to select between Gemini, GPT, Claude, and DeepSeek
  • Real-time browser control that observes the DOM, interprets structure, and takes actions intelligently
  • Task orchestration layer that breaks down goals into smaller executable steps
  • Sandboxed execution environment for safety and monitoring
  • Advanced reasoning loop combining retrieval, context injection, and chain-of-thought planning
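
The observe, plan, and act cycle described in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `PROVIDERS`, `stub_llm`, `FakePage`, and `run_agent` are hypothetical names, the model call is a stub that returns canned commands, and the page is an in-memory stand-in rather than a real Playwright or Puppeteer session.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A provider is modelled as a function from prompt text to reply text.
# Real entries would wrap each vendor's SDK; this is an assumption.
LLMFn = Callable[[str], str]


def stub_llm(prompt: str) -> str:
    """Stand-in for a model call: picks the next command from the page state."""
    if "search box filled: False" in prompt:
        return "type #q demo"
    if "submitted: False" in prompt:
        return "click #submit"
    return "done"


# Hypothetical registry so the user can pick a model by name.
PROVIDERS: Dict[str, LLMFn] = {
    "gemini": stub_llm,
    "gpt": stub_llm,
    "claude": stub_llm,
    "deepseek": stub_llm,
}


@dataclass
class FakePage:
    """In-memory stand-in for a browser page."""
    filled: bool = False
    submitted: bool = False
    log: List[str] = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would summarise the DOM here.
        return f"search box filled: {self.filled}; submitted: {self.submitted}"

    def act(self, command: str) -> None:
        # Record the command and update the simulated page state.
        self.log.append(command)
        verb = command.split()[0]
        if verb == "type":
            self.filled = True
        elif verb == "click":
            self.submitted = True


def run_agent(goal: str, provider: str, page: FakePage, max_steps: int = 5) -> List[str]:
    """Loop: observe the page, ask the model for one step, execute it."""
    llm = PROVIDERS[provider]
    for _ in range(max_steps):  # hard cap so a confused model cannot loop forever
        observation = page.observe()
        command = llm(f"Goal: {goal}\nPage: {observation}\nNext action?")
        if command == "done":
            break
        page.act(command)
    return page.log
```

Calling `run_agent("search for demo", "claude", FakePage())` walks the stub through one typing step and one click before it reports completion. The step cap is one simple way to keep an autonomous loop bounded; a sandboxed deployment would likely add further limits.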

Tech stack
Python • Playwright / Puppeteer • LangChain • FastAPI • TypeScript • LLM APIs (Gemini, GPT, Claude, DeepSeek)

Why it matters
This agent goes beyond data scraping: it interprets the page. It can log in, click, type, fill in forms, and make decisions much as a human researcher or assistant would. Because reasoning-capable LLMs drive it, it works out the structure of a task at run time instead of relying on brittle scripts.
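
One common way to let a model drive clicks and typing safely is to have it reply with a small JSON object and validate that object before anything touches the browser. The sketch below assumes such a format; the field names (`action`, `target`, `text`), the `Action` class, and the allowed-action set are illustrative and not taken from the project.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical whitelist of actions the agent is allowed to perform.
ALLOWED_ACTIONS = {"click", "type", "navigate", "done"}


@dataclass(frozen=True)
class Action:
    kind: str
    target: Optional[str] = None  # CSS selector or URL, depending on kind
    text: Optional[str] = None    # text to enter, used only by "type"


def parse_action(raw: str) -> Action:
    """Turn a model reply into a validated Action or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model reply must be a JSON object")

    kind = data.get("action")
    if not isinstance(kind, str) or kind not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {kind!r}")

    target = data.get("target")
    text = data.get("text")
    # Every action except "done" needs something to act on.
    if kind != "done" and not isinstance(target, str):
        raise ValueError(f"{kind} requires a string target")
    if kind == "type" and not isinstance(text, str):
        raise ValueError("type requires a string text")
    return Action(kind=kind, target=target, text=text)
```

For example, `parse_action('{"action": "click", "target": "#login"}')` yields a `click` action, while a reply naming an action outside the whitelist is rejected before it reaches the page.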

Impact
The prototype has completed multi-step tasks such as booking demos, analyzing dashboards, and compiling research data across sites. It demonstrates how AI can evolve from passive text models into autonomous systems that take action.
