
droidclaw

droidclaw turns old Android phones into AI agents via ADB. Give it a goal in plain English and it reads the screen, reasons about what to do, taps and types, and repeats until the job is done. No APIs needed.

Key takeaways

  • Turns any old Android phone into an autonomous AI agent via ADB screen reading and interaction — no app APIs needed
  • Perception → reasoning → action loop with stuck detection, repetition tracking, drift detection, and vision fallback
  • Can delegate tasks to ChatGPT, Gemini, or Google Search apps on the device without API keys — uses apps like a human would
  • TypeScript/Bun runtime with accessibility tree parsing and optional screenshot-based vision for webviews and Flutter apps

FAQ

What is droidclaw?

droidclaw is an AI agent that controls Android phones via ADB. You give it a goal in plain English, and it reads the screen, decides what to tap or type, executes via ADB, and repeats until the goal is complete.

How does droidclaw work?

A perception → reasoning → action loop: dump the accessibility tree via ADB, send screen state + goal to an LLM, execute the returned action (tap, type, swipe), and loop until done.
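As an illustration of the ADB side of this loop, the action primitives map onto standard `adb shell input` and `uiautomator` commands. A minimal sketch (hypothetical helper, not droidclaw's actual code) might build the argument lists like this:

```typescript
// Hypothetical mapping from agent actions to real ADB shell commands.
type Action =
  | { kind: "tap"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "swipe"; x1: number; y1: number; x2: number; y2: number }
  | { kind: "dump" }; // UI hierarchy snapshot (the perception step)

function adbArgs(a: Action): string[] {
  switch (a.kind) {
    case "tap":
      return ["shell", "input", "tap", String(a.x), String(a.y)];
    case "type":
      // `input text` needs spaces encoded as %s
      return ["shell", "input", "text", a.text.replace(/ /g, "%s")];
    case "swipe":
      return ["shell", "input", "swipe",
        String(a.x1), String(a.y1), String(a.x2), String(a.y2)];
    case "dump":
      // Prints the UI hierarchy XML to stdout instead of a device file
      return ["shell", "uiautomator", "dump", "/dev/tty"];
  }
}

// With Bun, each action would be executed roughly as:
// Bun.spawn(["adb", ...adbArgs({ kind: "tap", x: 540, y: 1200 })]);
console.log(adbArgs({ kind: "tap", x: 540, y: 1200 }).join(" "));
// prints: shell input tap 540 1200
```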

How much does droidclaw cost?

Free and open-source. You need an old Android phone, a computer running Bun, and an LLM API key.

Who competes with droidclaw?

No direct competitors in the personal agent space — most agents use APIs, not screen reading. Conceptually similar to browser automation agents like HappyCapy but for Android devices.

Executive Summary

droidclaw turns old Android phones into autonomous AI agents. Instead of building API integrations, it controls phones the way a human would — reading the screen via ADB accessibility trees, reasoning about what to do with an LLM, and executing taps, types, and swipes. No app APIs, no custom integrations — just install apps and tell the agent what you want done. [1]

  • Author: unitedbyai
  • Language: TypeScript (Bun)
  • License: Open Source
  • GitHub Stars: 931

Product Overview

droidclaw's core innovation is using the phone itself as the integration layer. Rather than building API connectors for every service, it reads screens and interacts with any installed app. It can even delegate questions to ChatGPT, Gemini, or Google Search on the device — no API keys needed for those services. [1]

How It Works

  1. Perceive — Dump accessibility tree via ADB, parse interactive UI elements, diff with previous screen
  2. Reason — Send screen state + goal + history to LLM, get back think/plan/action
  3. Act — Execute via ADB (tap, type, swipe), feed result back on next step
  4. Loop — Repeat until goal is done or step limit reached
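The four steps above can be sketched as a single loop. This is an assumed shape rather than droidclaw's actual code: the ADB dump and the LLM call are injected as an interface so the control flow stands out.

```typescript
// Sketch of the perceive → reason → act loop (hypothetical interfaces).
type StepDecision = { done: boolean; action?: string };

interface AgentIO {
  perceive(): string; // e.g. parsed accessibility tree from ADB
  reason(screen: string, goal: string, history: string[]): StepDecision;
  act(action: string): string; // executes via ADB, returns a result
}

function runAgent(io: AgentIO, goal: string, maxSteps = 15): string[] {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const screen = io.perceive();                      // 1. Perceive
    const decision = io.reason(screen, goal, history); // 2. Reason
    if (decision.done || !decision.action) break;      // 4. Loop until done
    const result = io.act(decision.action);            // 3. Act
    history.push(`${decision.action} -> ${result}`);   // feed result back
  }
  return history;
}
```

The step limit bounds runaway tasks, and appending each action's result to `history` is what gives later LLM calls the context described under "Action Feedback" below.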

Reliability Features

  • Stuck Loop Detection — injects recovery hints after 3 unchanged screens
  • Repetition Tracking — sliding window catches retry loops across screen changes
  • Drift Detection — nudges the agent if it spams navigation without interacting
  • Vision Fallback — screenshots for webviews, Flutter apps, and games that expose empty accessibility trees
  • Action Feedback — every action result is fed back to the LLM for the next step
  • Multi-Turn Memory — conversation history maintained across steps
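The stuck-loop and repetition checks above can be illustrated with two small heuristics. The thresholds here are assumptions based on the feature descriptions (3 unchanged screens; a sliding window), not droidclaw's actual implementation:

```typescript
// Sketch of stuck-loop detection and repetition tracking (assumed thresholds).
class ReliabilityMonitor {
  private lastScreen = "";
  private unchanged = 0;
  private recent: string[] = [];

  // True once the screen has stayed identical for 3 consecutive steps,
  // comparing a hash of the screen state between iterations.
  stuck(screenHash: string): boolean {
    this.unchanged = screenHash === this.lastScreen ? this.unchanged + 1 : 0;
    this.lastScreen = screenHash;
    return this.unchanged >= 3;
  }

  // True if the same action appears 3+ times in the last 5 steps.
  // Unlike stuck(), this catches retry loops even when each attempt
  // does change the screen.
  repeating(action: string, window = 5, limit = 3): boolean {
    this.recent.push(action);
    if (this.recent.length > window) this.recent.shift();
    return this.recent.filter((a) => a === action).length >= limit;
  }
}
```

When either check fires, the agent would add a recovery hint to the next LLM prompt rather than abort outright.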

Strengths

  • Universal integration — Works with any Android app without APIs
  • Hardware recycling — Gives old phones a second life as AI agents
  • Robust failure handling — Stuck detection, repetition tracking, drift detection, vision fallback
  • No API keys for target apps — Uses ChatGPT, Gemini, Google Search as a human would
  • Web dashboard — Visual monitoring at app.droidclaw.ai
  • Active development — 931 stars, companion APK for device-side setup

Cautions

  • Android only — No iOS support (ADB is Android-specific)
  • Inherently fragile — Screen-based automation is slower and less reliable than API calls
  • Requires ADB setup — USB debugging, computer running Bun
  • LLM latency — Each step requires an LLM call; multi-step tasks are slow
  • Bun-only — Won't run on Node.js; uses Bun-specific APIs
  • Early stage — 931 stars, v0.5.0, rapidly evolving

Pricing & Licensing

  • Open Source — Free — full agent, open license

Hardware cost: Any old Android phone + computer. Hidden costs: LLM API usage per step.


Competitive Positioning

  • OpenClaw — uses API integrations; droidclaw uses screen reading, no APIs needed
  • HappyCapy — automates browsers; droidclaw automates Android phones
  • Manus — cloud-managed; droidclaw is self-hosted phone control

Bottom Line

droidclaw is a genuinely novel approach to personal AI agents: instead of building integrations, use the phone's screen as the universal API. It's slower and more fragile than API-based agents, but it works with any app without per-app integration work. The "turn old phones into agents" pitch is compelling for hardware recycling and for automating apps that have no API.

Recommended for: Tinkerers with spare Android phones who want to automate apps without APIs.

Not recommended for: Production workflows, time-sensitive automation, or users wanting reliable API-based integrations.


Research by Ry Walker Research • methodology