Key takeaways
- The free ride is over: Qwen Code's free OAuth tier (originally 1,000 requests/day, then cut to 100) was discontinued on April 15, 2026; hosted access now requires a bring-your-own API key or the Alibaba Cloud Coding Plan (Pro is $50/mo)
- Most model-flexible of the foundation-lab agents — beyond Qwen3.6-Plus and Qwen3.5-Plus it speaks OpenAI-compatible, Anthropic, and Google GenAI protocols, plus OpenRouter, Fireworks AI, and local Ollama/vLLM
- Mature open-source footprint: 25K+ stars, 481 releases at v0.17.1, headless/CI mode, daemon mode, VS Code/Zed/JetBrains plugins, and TypeScript/Python/Java SDKs — all Apache 2.0
FAQ
What is Qwen Code?
Qwen Code is Alibaba's open-source AI coding agent that runs in your terminal, adapted from Google's Gemini CLI and optimized for Qwen-Coder models.
How much does Qwen Code cost?
The CLI itself is free (Apache 2.0). Model access requires an API key (pay-per-use via DashScope, OpenRouter, Fireworks, or any OpenAI-compatible provider), the Alibaba Cloud Coding Plan (Pro is $50/mo), or free local inference via Ollama/vLLM. The free OAuth tier ended April 15, 2026.
What models does Qwen Code support?
Qwen3.6-Plus and Qwen3.5-Plus are the headline models; the Coding Plan also includes qwen3-coder-plus, qwen3-max, glm-4.7, and kimi-k2.5. It additionally supports OpenAI, Anthropic, and Gemini models via API key, and local models through Ollama or vLLM.
How is Qwen Code different from Gemini CLI?
Qwen Code began as a Gemini CLI fork but has diverged into a multi-protocol, model-flexible agent. Gemini CLI's individual tiers sunset on June 18, 2026, whereas Qwen Code keeps shipping open-source releases; what it lost was its free hosted tier.
Executive Summary
Qwen Code is Alibaba's official open-source terminal coding agent — 25K+ GitHub stars, 2.5K forks, 481 releases at v0.17.1, Apache 2.0 — adapted from Google's Gemini CLI codebase and optimized at the parser level for Qwen-Coder models.[1] It is the most model-flexible of the foundation-lab agents: alongside Qwen3.6-Plus and Qwen3.5-Plus, it speaks OpenAI-compatible, Anthropic, and Google GenAI protocols and runs local models via Ollama or vLLM.[1][2]
The headline as of June 2026 is the pricing pivot: the free Qwen OAuth tier — once 1,000 requests/day, cut to 100/day — was discontinued on April 15, 2026. Hosted access now means a pay-per-use API key (DashScope, OpenRouter, Fireworks AI, or any compatible provider) or the Alibaba Cloud Coding Plan, whose Pro tier runs $50/month. Self-hosted open-weight Qwen models remain free.[3][4]
| Attribute | Value |
|---|---|
| Company | Alibaba (Qwen team) |
| Founded | 1999 (Qwen Code: 2025) |
| Funding | Public company (BABA) |
| GitHub Stars | 25K+ (June 2026)[1] |
| License | Apache 2.0[1] |
| Headquarters | Hangzhou, China |
Product Overview
Qwen Code is a terminal-first AI coding agent that understands codebases and automates development work, with an interactive terminal UI plus a headless mode (-p flag) for scripts and CI pipelines.[1] The project openly acknowledges its lineage: it is based on Google's Gemini CLI, adapted "to better support Qwen-Coder models" — a fork that has since accumulated 481 of its own releases.[1]
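As a rough illustration of the headless mode described above, the following Python sketch wraps a single non-interactive run for use in a script or CI job. Only the `qwen` executable name and the `-p` flag come from the project README; the helper names, the timeout, and the assumption that the response arrives on stdout with a non-zero exit code on failure are illustrative and should be checked against the CLI itself.

```python
import shutil
import subprocess
from typing import List, Optional


def build_headless_command(prompt: str, executable: str = "qwen") -> List[str]:
    """Build the argv for one non-interactive run using the -p flag."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return [executable, "-p", prompt]


def run_headless(prompt: str, timeout_s: float = 300.0) -> Optional[str]:
    """Run the CLI once and return its stdout, or None if it is not installed.

    Assumption (not confirmed by the source): the answer is written to stdout
    and a failed run exits with a non-zero status.
    """
    if shutil.which("qwen") is None:
        return None
    result = subprocess.run(
        build_headless_command(prompt),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing the prompt as a separate argv element, rather than interpolating it into a shell string, avoids quoting problems when the prompt contains spaces or shell metacharacters.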
Beyond the CLI it has grown into a small platform: IDE integrations for VS Code, Zed, and JetBrains; a daemon mode (qwen serve) exposing shared sessions over HTTP+SSE; and SDKs in TypeScript, Python, and Java for embedding the agent in other tools.[1]
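The daemon's actual routes and event names are not described in the sources cited here, so the sketch below shows only the generic Server-Sent Events framing that any HTTP+SSE client has to handle: `data:` lines accumulate, an optional `event:` line names the event, lines starting with `:` are comments, and a blank line dispatches the event. The `delta` event name in the usage note is hypothetical.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per dispatched event from an iterable of SSE text lines."""
    event_type = "message"  # default event type when no "event:" field is sent
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data_parts:
                yield {"event": event_type, "data": "\n".join(data_parts)}
            event_type, data_parts = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is ignored
        if field == "event":
            event_type = value
        elif field == "data":
            data_parts.append(value)
        # Other fields (id, retry) are ignored in this sketch.
```

Feeding it the lines `event: delta`, `data: hello`, `data: world`, and a blank line yields a single event whose data is the two payload lines joined by a newline. An event still being accumulated when the stream ends is discarded.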
Key Capabilities
| Capability | Description |
|---|---|
| Headless / CI Mode | -p flag for non-interactive scripting and pipelines[1] |
| Daemon Mode | qwen serve — shared HTTP+SSE sessions[1] |
| Multi-Protocol Models | OpenAI-compatible, Anthropic, Google GenAI protocols[2] |
| Local Models | Ollama and vLLM support — fully free, fully private[1] |
| Vision | Image understanding with approval and YOLO modes[1] |
| SDKs | TypeScript, Python, Java[1] |
Product Surfaces / Editions
| Surface | Description | Availability |
|---|---|---|
| CLI | Terminal agent, interactive + headless | GA |
| IDE Plugins | VS Code, Zed, JetBrains | GA |
| Daemon | qwen serve HTTP+SSE server | GA |
| SDKs | TypeScript / Python / Java embedding | GA |
Technical Architecture
Installation:[1]

    # Quick install (Linux/macOS)
    curl -fsSL https://qwenlm.github.io/qwen-code/install.sh | bash

    # npm / Homebrew
    npm install -g @qwen-code/qwen-code
    brew install qwen-code
Key Technical Details
| Aspect | Detail |
|---|---|
| Deployment | Local CLI, daemon server, or embedded via SDK |
| Model(s) | Qwen3.6-Plus (Apr 2026), Qwen3.5-Plus; Coding Plan adds qwen3-coder-plus, qwen3-max, glm-4.7, kimi-k2.5[1][2] |
| Integrations | VS Code/Zed/JetBrains, OpenRouter, Fireworks AI, Ollama, vLLM[1] |
| Open Source | Yes (Apache 2.0), fork of Gemini CLI[1] |
Authentication options:[2]
- Alibaba Cloud Coding Plan: subscription API key (sk-sp-...) against a dedicated DashScope coding endpoint; Beijing and international regions
- API key: OpenAI-compatible (OpenAI, Azure, OpenRouter, ModelScope, DashScope), Anthropic, or Google GenAI protocols, plus custom endpoints
- Local: Ollama or vLLM, no account required
- Qwen OAuth: free tier discontinued April 15, 2026[2][4]
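The authentication options above amount to a small decision procedure, sketched here in Python. The `sk-sp-` key prefix is taken from the docs; everything else is an assumption made for illustration: the `AuthChoice` type and function names are hypothetical, the local URLs are the stock OpenAI-compatible defaults for Ollama and vLLM, and hosted endpoints are deliberately left unset because the source does not give them.

```python
from dataclasses import dataclass
from typing import Optional

# Stock OpenAI-compatible defaults for local servers (assumption: unchanged ports).
LOCAL_ENDPOINTS = {
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
}


@dataclass(frozen=True)
class AuthChoice:
    method: str              # "coding-plan", "api-key", or "local"
    base_url: Optional[str]  # None for hosted use: take the URL from the provider's docs
    needs_account: bool


def classify_key(api_key: str) -> str:
    """Guess the auth method from the key prefix (sk-sp- marks a Coding Plan key)."""
    return "coding-plan" if api_key.startswith("sk-sp-") else "api-key"


def choose_auth(api_key: Optional[str] = None,
                local_server: Optional[str] = None) -> AuthChoice:
    """Pick an auth path; a configured local server takes precedence over a key."""
    if local_server is not None:
        return AuthChoice("local", LOCAL_ENDPOINTS[local_server], needs_account=False)
    if api_key:
        return AuthChoice(classify_key(api_key), None, needs_account=True)
    # With the free OAuth tier gone, there is no keyless hosted fallback.
    raise ValueError("provide an API key or a local server")
```

For example, `choose_auth(local_server="ollama")` resolves to the local path with no account, while calling it with no arguments raises, reflecting that the keyless hosted option no longer exists.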
Strengths
- Model flexibility unmatched among lab agents — OpenAI-compatible, Anthropic, and Gemini protocols plus OpenRouter, Fireworks, and local Ollama/vLLM, where Claude Code, Codex, and Gemini CLI are locked to their own labs[2]
- Genuinely free path survives — open-weight Qwen models self-hosted via Ollama/vLLM cost nothing, unlike the hosted tiers[3]
- Rapid release cadence — 481 releases, with v0.17.1 shipped June 3, 2026 and pushes as recent as June 11, 2026[1]
- Platform breadth — headless CI mode, daemon mode, three IDE plugins, and three SDK languages from one repo[1]
- Strong model reputation — Hacker News commenters called Qwen's coding models "the most capable agentic coding model I've tested at that size by far," with viable 20–70 tokens/sec local inference on consumer hardware[5]
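To put the quoted 20–70 tokens/sec local-inference range in concrete terms, the sketch below converts it into wall-clock generation time for a response of a given length. It is a simplification under stated assumptions: it counts only output tokens at a constant rate and ignores prompt processing, tool calls, and model load time; the function names are illustrative.

```python
from typing import Tuple


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to emit output_tokens at a constant decode rate."""
    if output_tokens < 0 or tokens_per_second <= 0:
        raise ValueError("tokens must be >= 0 and rate must be > 0")
    return output_tokens / tokens_per_second


def latency_range(output_tokens: int,
                  slow_tps: float = 20.0,
                  fast_tps: float = 70.0) -> Tuple[float, float]:
    """Return (best case, worst case) seconds across the reported speed range."""
    return (
        generation_seconds(output_tokens, fast_tps),
        generation_seconds(output_tokens, slow_tps),
    )
```

For a 1,400-token response this gives 20 seconds at the fast end of the range and 70 seconds at the slow end, which is the practical gap a developer would notice between hardware tiers.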
Cautions
- Free tier rug-pull — the OAuth tier went from 1,000 requests/day to 100 to discontinued (April 15, 2026) in a matter of months, part of a broader industry retreat from free coding-agent tiers[3][6]
- Open-source direction uncertain — a 783-point Hacker News thread documented tension between Qwen's research and product teams, key researcher departures, and speculation that future Qwen models may go closed and proprietary[5][3]
- Model steering quirks — users report Qwen models deciding mid-task that it would be "simpler" to abandon detailed instructions[5]
- Alibaba Cloud dependence for hosted use — the Coding Plan routes through Alibaba Cloud endpoints (Beijing or international regions), a compliance question for some Western enterprises[2]
- Fork lineage — architecture inherits from Gemini CLI rather than a ground-up design; differentiation lives mostly in model support and the daemon/SDK layer[1]
Pricing & Licensing
| Tier | Price | Includes |
|---|---|---|
| Self-hosted (Ollama/vLLM) | $0 | Open-weight Qwen models, local inference[3] |
| API key (BYOK) | Pay-per-use | DashScope, OpenRouter, Fireworks, OpenAI/Anthropic/Gemini, custom endpoints[2] |
| Alibaba Cloud Coding Plan Pro | $50/mo | qwen3.5-plus, qwen3-coder-plus, qwen3-max, glm-4.7, kimi-k2.5 via dedicated endpoint[3][2] |
| Qwen OAuth (free tier) | Discontinued | Was 1,000 req/day, cut to 100, ended April 15, 2026[3][4] |
Licensing model: Open source (Apache 2.0) CLI + paid hosted model access (subscription or pay-per-use)[1]
Hidden costs: Hosted Qwen access now always costs money; self-hosting frontier-class coder models requires serious GPU hardware[3]
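Choosing between the $50/month Coding Plan Pro and pay-per-use comes down to a break-even calculation, sketched below. Only the $50 figure comes from the pricing table; the per-request cost is a hypothetical input you would estimate from your own provider's token pricing, and the sketch ignores any request caps the plan itself may impose. Costs are passed as strings so the arithmetic stays exact.

```python
import math
from fractions import Fraction

CODING_PLAN_PRO_USD = Fraction(50)  # monthly price from the pricing table


def break_even_requests(cost_per_request_usd: str) -> int:
    """Smallest monthly request count at which pay-per-use costs at least the plan."""
    cost = Fraction(cost_per_request_usd)
    if cost <= 0:
        raise ValueError("cost per request must be positive")
    return math.ceil(CODING_PLAN_PRO_USD / cost)


def monthly_pay_per_use_usd(requests_per_day: int,
                            cost_per_request_usd: str,
                            days: int = 30) -> Fraction:
    """Monthly pay-per-use spend at a steady daily request rate."""
    return requests_per_day * days * Fraction(cost_per_request_usd)
```

With a hypothetical blended cost of $0.02 per request, break-even is 2,500 requests a month, and sustaining the old 100 requests/day quota for 30 days would come to $60 on pay-per-use, slightly above the plan price.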
Competitive Positioning
Direct Competitors
| Competitor | Differentiation |
|---|---|
| Claude Code | Claude Code is Anthropic-only with a stable subscription path; Qwen Code is multi-protocol and self-hostable |
| Codex | Codex spans desktop/web/mobile surfaces on OpenAI models; Qwen Code is terminal-first with BYOK flexibility |
| Gemini CLI | Qwen Code's upstream — Gemini CLI's individual tiers sunset June 18, 2026, while the Qwen fork keeps shipping (but lost its own free tier) |
| OpenCode | Both model-agnostic and open source; Qwen Code is first-party from a model lab with tuned Qwen-Coder support |
When to Choose Qwen Code Over Alternatives
- Choose Qwen Code when: You want a lab-built agent that runs on open-weight models you host yourself, or need one CLI that can point at OpenAI, Anthropic, Gemini, OpenRouter, or local endpoints
- Choose Claude Code when: You prefer Anthropic models and a predictable individual subscription
- Choose Codex when: You want the multi-surface OpenAI ecosystem with desktop apps and integrations
- Choose Gemini CLI when: You hold a Gemini Code Assist Standard/Enterprise license that survives the individual sunset
Ideal Customer Profile
Best fit:
- Developers self-hosting open-weight models via Ollama/vLLM who want a first-party agent for free
- Teams needing one agent across multiple model providers (BYOK, OpenRouter, local)
- Builders embedding agents via the TypeScript/Python/Java SDKs or daemon mode
- Asia-Pacific organizations already on Alibaba Cloud
Poor fit:
- Developers who chose it for the free hosted tier — that path is gone[3]
- Western enterprises with restrictions on Alibaba Cloud-hosted inference
- Teams that want guaranteed open-weight Qwen releases long term, given the proprietary-direction signals[5]
Viability Assessment
| Factor | Assessment |
|---|---|
| Financial Health | Strong — Alibaba is a public megacap; monetization now explicit via Coding Plan[3] |
| Market Position | Challenger — 25K+ stars vs 90K+ (Codex CLI) and 105K+ (Gemini CLI), but the only lab agent with full model flexibility[1] |
| Innovation Pace | Fast — 481 releases, active pushes through June 2026[1] |
| Community/Ecosystem | Solid but uneasy — strong model praise alongside concerns about Qwen's open-source future[5] |
| Long-term Outlook | Tool likely persists as Apache 2.0; the open question is whether future Qwen models stay open-weight[5] |
Alibaba can fund Qwen Code indefinitely, and the Apache 2.0 license plus self-hosted model path makes it more rug-pull-resistant than Gemini CLI proved to be — the code and the weights are both in users' hands. The risk is upstream: reported research-team departures and a drift toward proprietary models would erode exactly the openness that differentiates it.[5][3]
Bottom Line
Qwen Code is the foundation-lab coding agent for people who don't want to be locked to a foundation lab: Apache 2.0, forked from Gemini CLI, tuned for Qwen-Coder models, but able to drive OpenAI, Anthropic, Gemini, OpenRouter, or fully local models. With 25K+ stars and 481 releases it is past the experiment stage.[1]
The April 15, 2026 free-tier shutdown reframed it: hosted convenience now costs $50/month (Coding Plan Pro) or pay-per-use, while the self-hosted path stays free — making Qwen Code effectively two products: a paid Alibaba Cloud client and a free open-source harness for local models.[3][6]
Recommended for: Developers running open-weight models locally, multi-provider teams wanting one BYOK terminal agent, and Alibaba Cloud customers.
Not recommended for: Anyone seeking a free hosted coding agent, or enterprises that can't route inference through Alibaba Cloud.
Outlook: Watch whether Qwen's next model generation stays open-weight. If it does, Qwen Code is the natural home for Gemini CLI's free-tier refugees; if Qwen goes proprietary, the tool becomes just another paid lab client — albeit one whose Apache 2.0 code can't be taken back.[5]
Research by Ry Walker Research
Sources
- [1] Qwen Code GitHub Repository
- [2] Qwen Code Docs: Authentication
- [3] Decrypt: Free Qwen Is Dead — Alibaba Shuts Down Qwen Code Free Tier
- [4] Qwen Code Issue #3316: Update authentication methods to reflect OAuth discontinuation
- [5] Hacker News: Something is afoot in the land of Qwen
- [6] Homo Ludditus: Free AI coding agents are becoming scarce