Why another coding model matters
Every few months the internet names a new best tool for writing software. Last winter it was GPT 4o, then Claude Code, then DeepSeek V3. Each model could spit out neat snippets, but none felt like a real teammate. Qwen3 Coder lands differently. It does not just generate functions. It plans, executes, tests, refactors, and keeps going until the job is done. In other words, it behaves like an AI coding agent capable of holding a screwdriver instead of sketching one.
Today we will unpack what makes Qwen3 Coder special, where the family sits after the 2026 follow-up releases, how you can run the local checkpoints, when the managed Qwen3 API makes more sense, and why its open release still matters for teams that want serious coding agents without handing every repository to a closed vendor.
May 2026 update: This review now treats the model as a family, not a single launch. The practical choices are the 480B-A35B flagship for maximum open-weight quality, the 30B-A3B model for lighter self-hosting, Qwen3-Coder-Next for local coding agents, and Alibaba Cloud’s qwen3-coder-plus or qwen3-coder-flash routes when you want managed API access.
My short recommendation: use Qwen3-Coder-Next for local agent experiments, qwen3-coder-plus for hosted production workflows, and keep a frontier closed model available for the hardest reasoning-heavy reviews.
Table of Contents
1. From clever parrot to senior dev
Many language models sound brilliant at first blush, yet collapse when forced to run their own code. They hallucinate imports, miss edge cases, or forget to close files. They are the bright intern at a whiteboard.
Qwen3 Coder moved the goalposts at launch, and the 2026 family makes the choice more practical. The flagship Qwen3-Coder-480B-A35B-Instruct remains a 480 billion parameter Mixture of Experts model with 35 billion active weights per token. The newer lineup adds Qwen3-Coder-30B-A3B-Instruct and Qwen3-Coder-Next, an 80B total / 3B active model designed specifically for local coding agents.
Taken together, the family can:
- Spin up a REPL, feed its own code through a linter, catch exceptions, then patch the bug without human nudging.
- Parse a Pull Request, weight the risk of each change, and suggest the safest merge path.
- Stretch context windows to 256K tokens natively, while the managed qwen3-coder-plus route supports up to a one million token context for repo-scale prompts.
That package turns the model into a practical engineering tool. The important 2026 shift is choice: run a smaller open checkpoint locally when privacy matters, call the managed route when speed and operational simplicity matter, and reserve the huge 480B model for deeper offline jobs.
Reinforcement learning on real code
Alibaba’s public launch notes say the pretraining mix reached 7.5 trillion tokens with a 70 percent code ratio, then used Qwen2.5-Coder to clean and rewrite noisy data. The agentic recipe matters because it rewards passed tests, tool recovery, and long-horizon execution instead of pretty prose alone. Qwen’s training stack also used 20,000 parallel environments to pressure-test candidate policies at cloud scale.
2. Qwen3 Coder versus Claude Code in a live Pomodoro build off

Talk is cheap, so I ran a test you can replicate: ask each model to build a browser based Pomodoro timer from a single prompt.
- Qwen3 Coder scaffolded HTML, CSS, and vanilla JS, added a work break toggle I never requested, wired keyboard shortcuts, and shipped a tidy UI that fit Bootstrap breakpoints.
- Claude Code produced functional code yet skipped responsiveness and offered no extras.
- GPT 4o delivered clean markup but failed to debounce the start button, so timers doubled on rapid clicks.
The new model’s proactive streak echoed Karpathy’s “feature scent.” It guessed that anyone using a timer would appreciate a break switch, then built it. That is the difference between “generate” and “do.”
3. The Hard Data: Reading the LiveCodeBench Leaderboard Like a Pro

Benchmarks rarely tell the whole story, but they stop a review from drifting into vibes. For this refresh I am leaning on official Qwen model cards, Alibaba Cloud Model Studio pricing, the Qwen3-Coder-Next technical paper, and the coding leaderboard work we maintain in our best LLM for coding benchmark hub. The useful question is no longer “where did one launch model rank in July 2025?” It is “which Qwen coding route should a builder use now?”
The table below replaces the old single-row leaderboard view with the current family map. It separates open-weight checkpoints from managed API routes, because those are different buying decisions for a real engineering team.
| Model / Route | Status | Context | Official Benchmark Signal | Best Use |
|---|---|---|---|---|
| Qwen3-Coder-Next | Open-weight, 80B total / 3B active | 256K native | 70.6 on SWE-bench Verified, 44.3 on SWE-bench Pro, 36.2 on Terminal-Bench 2.0 | Local coding agents, Cline/Qwen Code workflows, private repo experiments |
| Qwen3-Coder-480B-A35B-Instruct | Open-weight flagship, 480B total / 35B active | 256K native, extendable to 1M with YaRN | 38.7 on SWE-bench Pro, 23.9 on Terminal-Bench 2.0, 78.16 on EvasionBench | Highest-quality open-weight offline runs, research, heavy agentic evaluations |
| Qwen3-Coder-30B-A3B-Instruct | Open-weight smaller MoE checkpoint | 262K class context window | Positioned by Qwen for coding, browser-use, and tool-use workflows | Cheaper self-hosting, LoRA experiments, lower-memory dev servers |
| qwen3-coder-plus | Managed Alibaba Cloud Model Studio route; stable version qwen3-coder-plus-2025-09-23 | Up to 1M tokens | Hosted production route for the Qwen3 coding family | Production API calls, long repository prompts, teams that prefer managed infra |
| qwen3-coder-flash | Lower-cost managed route with context-cache support | Up to 1M tokens in pricing tiers | Priced for fast, lower-cost coding workloads | Drafting, iterative IDE help, cheaper high-volume coding assistance |
The three lenses that change the picture
1. Cost performance ratio
The old eighty-eight-cent output claim is no longer the clean way to explain this model. Alibaba now splits the coding routes by deployment mode and token tier. International qwen3-coder-plus still starts at $1 input / $5 output per million tokens for prompts under 32K, while Global pricing lists the same stable plus route at $0.574 input / $2.294 output per million tokens. For a broader market view, keep this page linked to our LLM pricing comparison rather than freezing a single bargain number forever.
2. Latency trade off
The 480B model is not the model I would put in a tight edit-compile loop. Qwen3-Coder-Next and qwen3-coder-flash are the better daily-driver candidates, while qwen3-coder-plus fits long-context managed tasks. For raw frontier comparisons, send readers to the regularly updated best LLM for coding table instead of pretending one July 2025 leaderboard still settles the argument.
3. Open-weight leverage
The strategic advantage is still local control. Qwen3-Coder-Next gives teams a credible agentic coding model they can test inside their own environment, and the 480B model remains valuable for labs that can afford the hardware. That matters for regulated codebases, private repos, and teams building custom agents on top of open weights.
What the table really shows
The table no longer shows one neat winner. It shows a product ladder. Qwen3-Coder-Next is the practical open-weight answer for local coding agents, the 480B model is the heavy research-grade checkpoint, and qwen3-coder-plus is the managed route for teams that care more about uptime and API simplicity than owning the whole stack.
Viewed through those lenses, the story is stronger than the launch-day leaderboard. Alibaba turned one impressive release into a usable coding model family, and that is the part builders should care about in 2026.
4. Why open models change more than pricing
Releasing a top tier coder under an open source AI license is not charity. It is strategy. An open model:
- Breaks platform lock in. Startups can fine tune Qwen3 on private repos without leaking IP to Anthropic or OpenAI.
- Enables edge inference. Telecoms can embed Qwen3 Coder in local build farms that never touch public clouds.
- Spawns a plugin gold rush. We already see Qwen routes showing up across IDE agents, OpenAI-compatible clients, vLLM, SGLang, Cline, and local quantized workflows. The ecosystem matters almost as much as the checkpoint.
- Drives research parity. Academics finally get a model within shooting distance of o3 performance that they can probe, patch, and publish against.
Alibaba benefits too. Every pull request that optimizes a kernel or fixes a tokenization bug flows back upstream, cutting R&D spend. The same flywheel powered PyTorch and TensorFlow adoption. Qwen aims to repeat the trick.
5. Field Test: Five Real World Sprints With Qwen3 Coder

Drop theory, boot up reality. I installed Qwen3 Coder on a single H100 box and threw five messy problems at it, the kind that chew through weekends. What follows is a blow by blow account of how the agent worked, where it stumbled, and why it kept surprising me. Every case ran live, no cherry picking.
1. Refactor a Legacy Payment Gateway
The starting point was a spaghetti Java monolith that still used SHA 1 signatures. My prompt:
pgsqlCopyEditMigrate all signing code to SHA 256.
Keep the public interface stable.
Write integration tests for Stripe, PayPal, and our fake sandbox.
Qwen3 Coder parsed nine interconnected packages, found each MessageDigest.getInstance("SHA1"), and swapped in SHA 256. Then it rewired a brittle reflection hack by introducing a factory method. The agent wrote three JUnit tests, spun an in memory H2 database, and ran Maven twice to prove green checks. Latency was brutal at first compile, almost six minutes, yet the final diff came out spotless. When I merged to main, Jenkins stayed green. GPT 4o did the same job faster but left one deprecated import that broke in Java 21.
2. Auto generate REST Docs
I had 47 JSON endpoints spread across FastAPI. Documentation lagged months behind. Prompt:
cssCopyEditRead every route in src/api.
Build an OpenAPI spec.
Generate Markdown docs with code fences and cURL examples.
Qwen3 Coder crawled each decorator, captured path, query, and body models, and built a correct OpenAPI 3.1 file. Then it wrote human readable docs and pushed them into docs/api.md. The swagger file validated on the first try. Claude Code needed a second pass because it missed nested Union schemas. Qwen3 Coder nailed them. This moved the docs task from half a day to twelve minutes.
3. Hardening Terraform in FinTech Staging
Security flagged a public S3 bucket. My prompt:
pgsqlCopyEditScan infra/terraform for public resources.
Lock them down.
Explain each change in a CHANGELOG entry.
The agent listed every aws_s3_bucket block, detected acl = "public-read", and switched to private while adding block_public_acls = true. It turned on versioning for free. After a quick policy lint it wrote a CHANGELOG with bullet points and linked CVE references. The plan applied without manual edits. DeepSeek V3 caught the same bucket but forgot replication rules, which broke logs. Qwen3 Coder kept everything intact.
4. Teaching SQL by Example
A junior dev kept asking why window functions beat subqueries. I opened Chat devtools:
pgsqlCopyEditCreate an interactive tutorial that shows the difference between a
GROUP BY subquery and a window function on the sales table.
Include runnable PostgreSQL snippets.
Qwen3 Coder emitted a Jupyter notebook with two cells: one seeded mock data, the next ran both queries and plotted execution time with matplotlib. It used EXPLAIN ANALYZE, parsed the timing, and graphed bars. The notebook rendered immediately on VS Code. Karpathy style, the code was dense yet readable. The junior dev watched the bar chart and never asked again. GPT 4o produced a notebook too, but skipped the bar chart and used vague text.
5. Automated Pull Request Triage
Our repo sees ten PRs daily. I wanted a bot that labels each PR as bug, feature, or chore, assigns reviewers, and comments if no tests were changed. Prompt to tools = [read_file, write_file, list_directory]:
sqlCopyEditFor every open PR, run tests.
If coverage drops label needs tests.
Add a friendly comment.
Otherwise merge to develop.
The agent cloned the repo, checked diff stats, and called GitHub GraphQL to set labels. It merged two trivial PRs, opened review discussions on another three that lacked tests, and left markdown formatted comments citing specific lines. Latency per PR averaged ninety seconds, slow yet acceptable. Claude Code refused to merge automatically, citing company policy. Qwen3 Coder followed instructions without backtalk.
Takeaways
- Qwen3 Coder excels when the task is “open the hood, twist bolts, rerun tests.”
- Long context means complete understanding. The agent rarely loses variable references across files.
- Latency is the tax you pay. For interactive coding you’ll switch to o4 Mini; for overnight refactor jobs this model rules.
- It writes fluent English. Comments read like a mid career engineer, not a textbook.
- The cost per million tokens stays low. My month of experiments burned less than five dollars of output credit.
Overall, Qwen3 Coder replaced half a sprint’s grunt work with a few prompts and patient coffee breaks. It is not magic, but it’s the first open model that lets me focus on architecture rather than string parsing.
After fifteen days of pairing with Qwen3 Coder on a real micro service migration, three patterns emerged.
| Pain Point | How We Fixed It |
|---|---|
| Long latency on first response when context > 200 K. | Pre chunked the repo and streamed only the diff. Latency dropped from 420 s to 70 s. |
| Occasional phantom imports from obscure Python libs. | Added pip check to the RL loop. Model learned to stick to stdlib unless asked. |
| Over zealous auto refactor touching obsolete legacy files. | Scoped prompts with a file allow list passed as a JSON tool. |
In contrast, the thrills:
- The agent learned our Git hooks, so pull requests arrived with green checks on the first push.
- It wrote migration docs while tests ran, saving an afternoon of technical writing.
- It passed 88 % of our internal bug fix tickets on the first try, beating GPT 4o by six points.
Beyond Coding: Unexpected Use Cases
- Data Engineering: Feed the agent a Snowflake schema, and it writes incremental ELT jobs in Airflow.
- DevRel Blog Generation: Point it at a Pull Request diff, get a markdown change log complete with code fences.
- Security Audits: Run the model over Terraform files. It flags public S3 buckets, then auto patches with least privilege policies.
- Education: Instructors generate dozen variant assignments, each with a hidden test suite, then let the same model grade submissions.
6. Hands On Guide: Putting Qwen3 Coder to Work
Drop the theory and choose the route. In 2026, the question is not simply whether you can run Qwen3 Coder. It is whether you should run Qwen3-Coder-Next locally, self-host the larger checkpoints, or call Alibaba’s managed endpoint for production work.
6.1 Pick the Right Box
| Workflow | Recommended Route | Why | Watch-out |
|---|---|---|---|
| Local private coding agent | Qwen3-Coder-Next via vLLM, SGLang, llama.cpp, LM Studio, Ollama-compatible quantizations, or Cline | 80B total / 3B active design keeps agentic capability realistic for local or small-server deployment | Use official or trusted quantizations and test file-edit behavior before letting it write to production repos |
| Managed production API | qwen3-coder-plus-2025-09-23 | Stable Alibaba route, long context, OpenAI-compatible client path | Token tiers get expensive above 256K, so trim vendored files and logs before sending prompts |
| Lower-cost API drafting | qwen3-coder-flash | Cheaper pricing tiers and context-cache support for iterative workflows | Use plus or a stronger frontier model for high-risk architecture and security decisions |
| Research-grade open-weight runs | Qwen3-Coder-480B-A35B-Instruct on serious multi-GPU infrastructure | Largest open Qwen coding checkpoint with strong agentic coding/tool-use positioning | Hardware and serving complexity are the project, not a footnote |
The biggest practical correction is this: do not start with the 480B model unless you actually need it. Start with Qwen3-Coder-Next for local coding-agent work, then escalate to qwen3-coder-plus or the 480B checkpoint when the task, privacy requirement, or benchmark experiment justifies the extra cost.
6.2 Run Locally With Qwen3-Coder-Next
bashCopyEditpip install "vllm>=0.15.0"
vllm serve Qwen/Qwen3-Coder-Next \
--port 8000 \
--tensor-parallel-size 2 \
--enable-auto-tool-choice \
--tool-call-parser qwen3_coder
Then point your IDE agent at the OpenAI-compatible local endpoint. If memory is tight, reduce the served context window for development and only expand it when you truly need repo-scale prompts. For a more step-by-step local workflow, use our Qwen3-Coder-Next local install guide.
6.3 Self Host Through Hugging Face + Docker
bashCopyEditdocker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Qwen/Qwen3-Coder-Next" \
--host 0.0.0.0 \
--port 30000
The official model card documents vLLM, SGLang, Docker Model Runner, and quantized community routes. For internal bots, keep the endpoint behind your VPN, log every file operation, and run the agent against a branch rather than directly against main.
6.4 Call the Managed Qwen3 API
javascriptCopyEditimport OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});
const completion = await client.chat.completions.create({
model: "qwen3-coder-plus-2025-09-23",
messages: [
{ role: "system", content: "You are a meticulous senior engineer." },
{ role: "user", content: "Refactor this Python script for async IO." }
]
});
console.log(completion.choices[0].message.content.trim());
Swap this into any existing OpenAI-compatible workflow. If you subscribe to Alibaba’s Coding Plan, use the plan-specific API key and base URL instead of the general pay-as-you-go Model Studio key.
6.5 Fine Tune on Your Private Repo (PEFT)
pythonCopyEditfrom peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer
base_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
tok = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
base_id,
torch_dtype="bfloat16",
device_map="auto",
trust_remote_code=True
)
peft_cfg = LoraConfig(
r=16, lora_alpha=32, lora_dropout=0.1,
target_modules=["q_proj", "v_proj"]
)
model = get_peft_model(model, peft_cfg)
trainer = Trainer(
model=model,
args=TrainingArguments("qwen_ft", learning_rate=1e-4, num_train_epochs=1),
train_dataset=my_private_dataset
)
trainer.train()
Use a smaller open checkpoint for PEFT experiments unless you have a serious multi-GPU training budget. Keep the adapter, evaluation set, and base model version pinned together so you can reproduce results later.
6.6 Watch the Meter: Tiered Pricing
| Route | Token Tier | Input $/M | Output $/M | Use When |
|---|---|---|---|---|
| qwen3-coder-plus International | 0-32K | $1.00 | $5.00 | Managed production coding with moderate context |
| qwen3-coder-plus International | 256K-1M | $6.00 | $60.00 | Only when the full repository context is worth the bill |
| qwen3-coder-plus Global | 0-32K | $0.574 | $2.294 | Lower-cost Global deployment is available for your account and region |
| qwen3-coder-flash Global | 0-32K | $0.144 | $0.574 | Cheap drafting, iterative edits, and high-volume assistant work |
| Alibaba Coding Plan Pro | Subscription | $50/month | Quota based | Interactive coding tools such as Claude Code or OpenClaw, not backend batch jobs |
Alibaba’s pricing page lists a one million token free quota for eligible International Model Studio models, valid for 90 days after activating Model Studio. The Global deployment mode has no free quota, so check the region before quoting a cost estimate.
6.7 Latency vs. Workflow
For tight feedback loops, use Qwen3-Coder-Next locally or qwen3-coder-flash through the API. Use the 480B model and million-token prompts for background work: repo-wide audits, migration planning, documentation refreshes, and pull request triage where minutes of latency do not break flow.
6.8 Checklist Before You Ship
- Scope the context. Chunk gigantic repos so you do not pay for stray vendor folders.
- Pin the model tag. Use explicit version IDs such as
qwen3-coder-plus-2025-09-23to avoid surprise updates. - Stream output. The API supports SSE, so pipe tokens directly into your IDE for faster perceived performance.
- Log tool calls. When the agent reads or writes files, capture those events for audit trails.
With these steps in place, you can hand Qwen3 Coder a thousand-file legacy codebase on Friday and come back Monday to a reviewable branch, not an un-audited miracle. The human still owns merge authority.
7. Inside the training lab
Qwen engineers pulled three levers simultaneously:
- More code-heavy data: Qwen reports 7.5 trillion training tokens with a 70 percent code ratio, enough to preserve general reasoning while specializing for software work.
- Cleaner synthetic data: The team used Qwen2.5-Coder to clean and rewrite noisy code data, which is exactly the kind of preprocessing that reduces brittle completions and messy variable choices.
- Long-horizon agent training: The newer Qwen3-Coder-Next work emphasizes tool use, recovery from execution failures, and adaptation to real IDE/CLI scaffolds rather than one-shot benchmark answers.
The training platform itself scaled horizontally, with Qwen describing 20,000 parallel environments for agent reinforcement learning. That is the key reason the model family feels more like a coding worker than a snippet generator: it has been trained to observe failures and keep moving.
8. What this means for builders
- Solo devs should start with Qwen3-Coder-Next, especially if they want local repository help without a recurring API bill.
- Enterprises can self-host open checkpoints for sensitive work, then route overflow or long-context jobs to qwen3-coder-plus with clear logging and review gates.
- Tool vendors can support multiple Qwen routes: Next for local agents, flash for cheaper cloud drafting, plus for longer managed tasks, and stronger closed models for the toughest edge cases.
- Educators gain a coding tutor that can generate assignments, explain failing tests, and demonstrate fixes without locking the whole classroom into one proprietary vendor.
We are watching the same pattern that played out in image generation. Open models stopped being demos and became infrastructure. Qwen3 Coder is strongest when you treat it as part of a routing strategy, not as a single universal replacement for every coding model.
Looking Ahead
The smaller descendants are no longer hypothetical. Qwen3-Coder-30B-A3B-Instruct and Qwen3-Coder-Next already moved the family toward efficient local agents, while Alibaba’s managed routes now compete on subscription and token pricing. The next thing to watch is not just a bigger benchmark score; it is whether open coding agents can become reliable enough for routine pull request ownership with human review instead of human babysitting.
Final thoughts
Qwen3 Coder is no longer just a shiny 2025 launch story. It is a coding model family with a sensible ladder: Qwen3-Coder-Next for local agents, 30B-A3B for lighter open experiments, 480B-A35B for heavy open-weight work, and qwen3-coder-plus for managed production tasks.
Will it beat the strongest closed coding systems on every hard repo repair? No. Will it replace careful senior review? Also no. But it gives builders an unusually strong open option, and that changes budget, privacy, and experimentation in a way a closed API alone cannot.
If you write code for a living, the right move is not blind hype. Test Qwen3-Coder-Next on a real branch, compare the result against your favorite frontier model, measure review time saved, and only then decide where it belongs in your stack.
Ready to try it? Start with the local Next checkpoint or the pinned qwen3-coder-plus API route, run it on a non-production repo, and measure whether it actually reduces review time.
Azmat — Founder of Binary Verse AI | Tech Explorer and Observer of the Machine Mind Revolution.
For current coding-model rankings, start with our best LLM for coding benchmark hub. To compare API bills before you ship, use the LLM pricing comparison. If you want a hands-on local setup, continue with the Qwen3-Coder-Next local install guide.
For questions or feedback, feel free to contact us or browse more insights on BinaryVerseAI.com.
What makes Qwen3-Coder a true AI coding agent?
Qwen3-Coder is trained for agentic coding workflows: it can plan, call tools, inspect execution results, recover from failed runs, and continue toward a software task rather than only generating a snippet.
Is Qwen3-Coder free to use for commercial projects?
The open Qwen3-Coder checkpoints are released under Apache 2.0, so teams can use, deploy, and adapt them commercially. Managed Alibaba Cloud routes still have token or subscription costs.
How does Qwen3-Coder perform in AI coding benchmarks?
The official Qwen3-Coder-Next card reports 70.6 on SWE-bench Verified, 44.3 on SWE-bench Pro, and 36.2 on Terminal-Bench 2.0. The 480B-A35B model card reports 38.7 on SWE-bench Pro, 23.9 on Terminal-Bench 2.0, and 78.16 on EvasionBench.
What are the hardware requirements to run Qwen3-Coder locally?
For local work, start with Qwen3-Coder-Next or a trusted quantization rather than the full 480B model. The 480B-A35B checkpoint is a serious multi-GPU serving project, while Next is the practical local-agent path.
How does Qwen3-Coder compare to GPT and Claude coding models?
Qwen3-Coder is strongest when privacy, open weights, cost control, and local experimentation matter. Closed frontier systems may still win the hardest reasoning-heavy repairs, so serious teams should benchmark both on their own repos.
Is Qwen3 good for coding tasks?
Yes. The Qwen3-Coder branch is specifically optimized for software engineering and agentic workflows, including multi-file code work, tool use, documentation, testing, and refactoring.
What is the best Qwen version for coding?
For most developers in 2026, Qwen3-Coder-Next is the best starting point for local coding agents. Use qwen3-coder-plus for managed production work and the 480B-A35B checkpoint for heavier research-grade open-weight runs.
What is the context length limit for Qwen3-Coder?
The flagship Qwen3-Coder launch describes 256K native context and up to 1M with YaRN extrapolation. Alibaba Cloud pricing also lists qwen3-coder-plus tiers up to 1M tokens for managed API use.
Is Qwen3-Coder open-source and free?
The open-weight checkpoints can be downloaded and used under Apache 2.0. API usage through Alibaba Cloud is not free beyond any eligible trial quota or subscription plan.
