An engineer’s field-notes on the future of work, one conversation at a time
1. A Morning After the Headline
I woke up to a push notification that screamed, “Microsoft study says millions of jobs replaced by AI.” The phrasing felt familiar, almost lazy, yet impossible to ignore. Every cycle has its apocalypse headline, but this one carried fresh weight. It wasn’t a hypothetical model built in a vacuum. It was data from 200,000 real conversations with Bing Copilot, neatly anonymized, then dissected by a research team that reads like a dream roster of computational social scientists: Kiran Tomlinson, Sonia Jaffe, Will Wang, Scott Counts, and Siddharth Suri.
Microsoft’s paper, “Working with AI: Measuring the Occupational Implications of Generative AI,” tries to quantify how generative systems seep into daily work. They do it by mapping chat transcripts to O*NET Work Activities and scoring each occupation’s “AI applicability.” It is the first serious attempt to track jobs replaced by AI through direct usage rather than thought experiments.
Yet the raw paper is a forest of statistical footnotes. You and I need a trail. This article builds that trail by translating the findings into plain English, adding a layer of lived engineering context, and challenging a few assumptions along the way. We will see that “replacement” rarely means a pink slip on day one. Often, it means something subtler: a shift in what counts as valuable human contribution.
2. Why This Study Matters

Most research on AI job loss falls into two camps:
- Forecast models. Economists ask experts to estimate how much of a job’s task list an LLM could automate.
- Productivity trials. Small, controlled experiments measure time saved for tasks like code completion.
Both yield insight, but neither mirrors the messy reality of workers improvising with a chatbot at 2 a.m. The Microsoft team flips the script. They watch what people already do with AI and trace those actions back to occupations. That direct lens tracks the real pipeline from tool adoption to jobs replaced by AI or, just as often, jobs reshaped by AI.
This shift in analysis parallels similar methodological revolutions seen in AI cognition research and real-world impact studies like those exploring AI misinformation spread.
3. The Anatomy of an “AI Applicability” Score
The researchers break every chat into two lenses:
- User goal. What task does the human want done?
- AI action. What task does the model actually perform?
They classify both against 332 Intermediate Work Activities (IWAs) from O*NET. Then they blend five ingredients:
Ingredient | Meaning | Why it matters |
---|---|---|
Coverage | How many of an occupation’s IWAs show up in Copilot chats | Reveals breadth of AI touchpoints |
Task success | Thumbs-up ratio and LLM-judged completion | Tells whether the model truly helps |
Scope | Portion of a work activity the AI handles (minimal → complete) | Gauges depth of impact |
User goal score | Weighted by coverage, success, scope | Measures augmentation potential |
AI action score | Same formula, from the AI side | Measures automation potential |
Finally, they average the two for a single AI applicability figure.
Key takeaway: Applicability isn’t binary. Instead of stating that jobs replaced by AI will spike overnight, the score tracks how deeply AI intertwines with each profession’s fabric.
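The blend above can be sketched in a few lines of Python. This is a simplified illustration with invented numbers, not the paper’s exact formula, which weights and normalizes these ingredients more carefully:

```python
from dataclasses import dataclass

@dataclass
class IWASignal:
    """Signals for one Intermediate Work Activity of an occupation."""
    coverage: float  # share of chats touching this IWA, 0-1
    success: float   # thumbs-up / LLM-judged completion rate, 0-1
    scope: float     # depth handled, 0 (minimal) to 1 (complete)

def lens_score(signals: list[IWASignal]) -> float:
    """Average coverage-weighted success x scope across an occupation's IWAs."""
    if not signals:
        return 0.0
    return sum(s.coverage * s.success * s.scope for s in signals) / len(signals)

def ai_applicability(user_goal: list[IWASignal], ai_action: list[IWASignal]) -> float:
    """Mean of the augmentation (user-goal) and automation (AI-action) lenses."""
    return (lens_score(user_goal) + lens_score(ai_action)) / 2

# Hypothetical occupation with two goal-side IWAs and one action-side IWA
goal = [IWASignal(0.8, 0.7, 0.6), IWASignal(0.5, 0.9, 0.4)]
action = [IWASignal(0.6, 0.8, 0.5)]
print(round(ai_applicability(goal, action), 3))  # -> 0.249
```

The point of the sketch is the shape of the metric: every ingredient is continuous, so the final score is a dial, not a switch.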
4. The Surprising Winners and Losers

Here’s a distilled slice of their findings, rewritten for clarity. Think of the left column as “most exposed” and the right as “least exposed,” not as immediate layoffs but as terrain where workflows will change fastest or slowest.
Rank | High-Applicability Occupations | Why AI Fits | Low-Applicability Occupations | Why AI Lags |
---|---|---|---|---|
1 | Interpreters & Translators | Text manipulation, context recall | Phlebotomists | Physical blood draws |
2 | Historians | Archival research, narrative synthesis | Nursing Assistants | Hands-on patient care |
3 | Passenger Attendants | Real-time Q&A, schedule queries | Dishwashers | Manual labor, low text load |
4 | Sales Reps (Services) | Scripted info delivery, CRM notes | Roofers | Outdoor physical tasks |
5 | Writers & Authors | Drafting, re-drafting, stylistic edits | Water Treatment Operators | Mechanical monitoring |
Notice something: high-applicability slots are dominated by knowledge work and customer communication, domains where language is the primary tool. That meshes with personal observation from the coding floor. When a teammate grabs ChatGPT for a “write the first draft” moment or a “summarize this RFC,” the speed gain is obvious.
This trend is echoed in emerging benchmarks, such as the AI Math Olympiad Benchmark and language-focused tools like ChatGPT O3 Pro, which show language models outperforming expectations in structured domains.
5. Rethinking “Replacement”

If your role sits near the top of that table, you might gulp. Are jobs replaced by AI about to become your reality? The study itself is careful not to overstate. An LLM can handle some fraction of a task, but success and scope rarely hit 100%.
Consider technical writers. Copilot excels at reorganizing bullet points, explaining concepts, even injecting humor. But validating engineering accuracy, interviewing subject-matter experts, and aligning with brand nuance remain human-heavy. AI applicability may be high, yet full replacement is far from guaranteed.
Conversely, nursing assistants deal with people’s bodies. Chatbots don’t change diapers or lift patients. Even if the nurse uses AI for paperwork, the core value, physical caregiving, stays human for now.
Rule of thumb: The closer your daily output maps to text or structured data, the faster AI pushes from assistant to potential stand-in. Where muscle memory, fine motor skill, or empathy are core, the slope is much flatter. This pattern is also visible in recent studies like GPT-4o’s cognitive dissonance patterns and emergent misalignment.
6. Zooming Out: Occupation Groups
Aggregating individual roles into the 22 SOC major groups paints an even clearer picture. Here’s one table that underlines the pattern (scores are AI applicability figures drawn from the paper’s methodology, rounded for readability):
Major Group | Applicability Score | U.S. Workforce Size |
---|---|---|
Sales & Related | 0.32 | 13 M |
Computer & Mathematical | 0.30 | 5 M |
Office Support | 0.29 | 18 M |
Community & Social Service | 0.25 | 2 M |
Construction & Extraction | 0.08 | 6 M |
Healthcare Support | 0.05 | 7 M |
Three of the largest employment pools, Sales, Office Support, and Software, sit at the top. Translation: a huge number of citizens will eventually feel some reshuffling. That does not mean jobs replaced by AI equals 36 million layoffs. It means the scope for hybrid workflows is enormous. Expect recruiters demanding “Copilot-native sales scripts” and secretaries who juggle AI schedulers as deftly as they once managed fax machines.
This evolution reflects similar trends explored in AI in academia, where routine knowledge work is increasingly handled by copilots, leaving humans to focus on judgment and nuance.
7. Wages and the Myth of the Elite Threat
A common narrative claims that only well-paid knowledge workers face AI job loss. Reality is less tidy. The Microsoft data shows wage correlation is tiny, r ≈ 0.07. That weak signal implies both baristas and software architects may see tasks shift, though shapes differ.
The bigger wage-linked effect appears in education. Occupations needing a bachelor’s degree have slightly higher applicability than those requiring less schooling, but variance is wide. A bachelor’s security guard? Low exposure. A certificate-level web developer? High.
Lesson: Training level predicts less than task structure itself. If your day is talk, text, or numbers, assume AI will clip off the most routine layers, no matter your salary band.
This principle has surfaced in other domains too. For instance, the LLM hallucination problem affects both junior analysts and high-level advisors, challenging assumptions about which tasks are “safe.”
8. From Augmentation to Automation: A Two-Step Dance
Microsoft’s user-goal versus AI-action split lets us watch augmentation bleed into automation over time.
- Augmentation phase (user goal). Workers ask the model to assist. The human steers, edits, and finalizes. Productivity bumps. Examples: drafting marketing emails, data summarization, code scaffolding.
- Automation phase (AI action). Teams notice the assistant can run without oversight for particular sub-tasks. Scripts wrap the model, and API endpoints replace chat windows. Examples: auto-reply agents for basic tickets, nightly log analysis bots.
The paper’s 40% conversation stat, where AI-action IWAs don’t overlap user-goal IWAs, shows augmentation and automation can coexist in the same chat. A copywriter asks for brand-compliant slogans. Copilot writes, then also advises on A/B testing strategy. One chunk assists, another takes over.
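Measuring that non-overlap is plain set arithmetic. Here is a minimal sketch with made-up activity labels; the real pipeline first classifies each chat against the 332 O*NET IWAs:

```python
def disjoint_share(conversations: list[tuple[set[str], set[str]]]) -> float:
    """Fraction of conversations whose user-goal IWAs and AI-action IWAs
    share no activity at all (the paper reports roughly 40%)."""
    disjoint = sum(1 for goals, actions in conversations if goals.isdisjoint(actions))
    return disjoint / len(conversations)

# Hypothetical chats: (what the user wanted, what the AI actually did)
chats = [
    ({"Draft correspondence"}, {"Draft correspondence"}),       # pure assist
    ({"Write slogans"}, {"Advise on testing strategy"}),        # AI acts elsewhere
    ({"Summarize documents"}, {"Summarize documents", "Edit"}), # partial overlap
]
print(disjoint_share(chats))  # roughly 0.33 here
```

A conversation lands in the “disjoint” bucket only when the model’s actions touch none of the activities the user asked about, which is exactly the augmentation-versus-automation split the paper exploits.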
These blurred boundaries are also reshaping how tools like Gemini 2.5 Deep Think are used, not just for answers but for decision flows.
9. Case Study: Sales Scripts on Autopilot
Take Sales Representatives of Services, high on the applicability list. Reps flood Copilot with routine prompts: rebuttals to standard objections, follow-up emails, FAQ answers.
Initially, the rep curates. Over months, marketing ops notices that certain replies need zero tweaks, especially FAQs. They integrate Copilot via API, and those snippets populate the CRM automatically. The rep now focuses on bespoke enterprise deals while standard objections run on autopilot.
Here the model didn’t nix the position. It unbundled it. The routine layer became code. The strategic layer, relationship building, nuanced negotiations, creative bundling, rose in prominence.
That is why “jobs replaced by AI” headlines can be misleading. Whole jobs seldom evaporate overnight. Layers within them do. The human either climbs to a new layer or leaves the ladder.
The Cursor Vibe Coding Guide illustrates a similar trend: automation at the base, human design at the top.
10. What About Microsoft’s Own Workforce?
Ironically, the study emerges from a company racing to embed Copilot across every SKU. Microsoft’s internal culture has long rewarded engineers who automate themselves out of toil. Ship a PowerShell script today, claim a promotion tomorrow. The difference now is scale and accessibility. Non-technical teams can build no-code GPT chains, amplifying that philosophy company-wide.
So, if Redmond bets that jobs replaced by AI will outpace new roles, they risk cannibalizing billable seats for Word, Excel, and Teams. More likely, they predict a net expansion: more users, more data, more premium AI tiers. It’s the same paradox Amazon faced when automating fulfillment: cut headcount per box, but grow the number of boxes dramatically.
11. The Human Levers Still Intact
Here are three domains where Copilot’s applicability remains low, illustrating why many AI safe jobs stay safe, at least for the next few model upgrades:
- Complex Motor Control. Electricians thread conduit and chefs juggle heat and timing; language models can’t hold a spatula.
- High-stakes Accountability. A mis-fired diagnosis can end a life. Physicians might lean on AI, but ultimate liability is human.
- Novel Physical Environments. Disaster relief crews improvise in chaotic conditions. Robots exist, but chatbots do not parse rubble.
In other words, jobs that AI can’t replace share two traits: reliance on a non-linguistic medium (muscle, metal, or risk) and absence of abundant text-based telemetry to learn from. These same traits explain why AI in neurology enhances support roles but doesn’t substitute core care.
12. When Applicability Meets Policy
Governments eyeing tax revenue and social stability wonder how to cushion transitions. Four policy levers stand out, each interacting with Microsoft’s metrics:
- Up-skilling stipends for mid-applicability workers so they can climb to layers AI can’t handle.
- Task-level wage insurance rather than job-level, acknowledging partial displacement.
- AI literacy certification woven into vocational licenses, making augmentation the default.
- Robust data privacy rules to ensure that real-chat telemetry, the lifeblood of studies like this, isn’t misused.
Any plan ignoring the gradient nature of jobs replaced by AI will misfire. Applicability scores are continuous, not binary switches.
13. Coding, Media, Health – A Sector-by-Sector Reality Check
Sector | Top Tasks in Copilot Logs | AI applicability | Immediate Risk of jobs replaced by AI | Human Edge |
---|---|---|---|---|
Software Engineering | Unit-test scaffolds, docstrings, bug-hunt queries | High | Repetitive CRUD may vanish first | System design, debugging in live prod |
Digital Media | Headline rewrites, thumbnail slogans, ad-copy tweaks | High | Low-tier content farms already downsizing | Original reporting, investigative depth |
Healthcare (Clinical) | ICD-10 coding help, discharge summary drafts | Medium | Limited, because hands-on care resists automation | Empathy, hands-on procedures |
Finance & Accounting | Variance explanations, commentary on ledgers | Medium | Some clerical postings axed by bots | Strategic judgment, regulatory nuance |
Manufacturing | Machine-maintenance procedures in text form only | Low | Few jobs affected by AI yet | Physical upkeep, hardware intuition |
Key pattern: risk climbs when the core deliverable is text, code, or numbers. It drops when the output is physical, tactile, or anchored in direct human trust.
This mirrors findings in tools like Claude 4 Opus and OpenAI O3, which show text-based domains adopting AI fastest.
14. The Engineer’s Playbook for a Post-Copilot Desk
- Map your personal task stack. Split yesterday’s work into three piles: rote, routine, and rare. Circle everything that already looks like a Copilot prompt.
- Automate the rote tier first. Every saved minute is leverage. Even if your title is “Senior UX Researcher,” let a script label survey transcripts. Rote elimination lowers stress while guarding against AI job loss later.
- Turn routine into rare. Use the assistant so skillfully that what felt routine becomes one-click rare. Example: a marketer who once drafted eight campaign variants now spawns eighty in ten minutes, then spends the afternoon on narrative polish.
- Broadcast your new throughput. Hiring managers crave proof that you bend capacity curves. Showing you triple volume with Copilot paints you as future-proof, not as another entry in the “jobs replaced by AI” column.
- Invest in human flair. Comedy timing, story arcs, negotiation empathy: these remain AI safe jobs. They grow in value as mechanical writing shrinks.
15. Manager’s Checklist: Replace Tasks, Not People
- Inventory workloads monthly. Watch which duties shrink as models mature.
- Promote cross-training. When chatbots grab ticket triage, nudge agents toward escalation mastery.
- Tie bonuses to augmentation adoption. Reward staff for measurable output gains, not for guarding turf.
- Document decision boundaries. Make clear which calls stay human, especially in regulated zones. Microsoft’s own practice is to chain Copilot to Azure policy engines so no prompt can leak patient data.
- Budget for reskilling. Redirect freed headcount hours into courses on prompt engineering, data literacy, or product sense. That lowers long-run severance and reputational costs.
This adaptive approach is critical to avoid repeating history’s gravest mistake in AI adoption: underestimating its transformative potential until it’s too late.
16. Blind Spots in the Microsoft Lens
- Language bias. Data is U.S.-centric English. Urdu-speaking accountants in Karachi may use Copilot differently, yet they’ll still appear in macro articles about AI job loss.
- Privacy filtering. The anonymizer nixes PII, which can also nuke context. A conversation about sensitive negotiations might be scrubbed so thoroughly that the detected IWA becomes “Provide information,” understating later disruption.
- One-tool view. Users juggle Gemini, Perplexity, and local LLaMA mixes. Copilot logs miss cross-tool workflows where code leaves one chat window and re-enters another.
- Task granularity drift. O*NET updates lag. TikTok brand strategists are forced into 2010 marketing titles, masking brand-new jobs replaced by AI scenarios, like auto-generated short-form hooks.
17. Build Your DIY Applicability Meter
- Step 1: List your top ten weekly outputs.
- Step 2: Ask a public LLM to perform each one. Grade accuracy from one to five, then divide by five for a 0-to-1 score.
- Step 3: Multiply that score by the time the output normally takes.
- Step 4: If the result exceeds half of your original effort, flag the output as red. Those reds represent micro-jobs replaced by AI inside your role.
Do this quarterly. Trend lines trump static fear.
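The four steps above fold into a few lines of Python. Task names, hours, and grades here are invented placeholders for your own weekly list:

```python
def applicability_meter(outputs: list[tuple[str, float, int]]) -> list[str]:
    """Flag outputs where an LLM already covers more than half the effort.

    outputs: (name, hours the task normally takes, LLM accuracy grade 1-5)
    """
    reds = []
    for name, hours, grade in outputs:
        covered = hours * (grade / 5)   # Step 3: time-weighted accuracy, 0-1 scale
        if covered > 0.5 * hours:       # Step 4: more than half the effort covered
            reds.append(name)
    return reds

# Hypothetical weekly outputs
weekly = [
    ("status report", 2.0, 5),         # LLM nails it -> red
    ("incident postmortem", 4.0, 2),   # still mostly human
    ("sprint planning notes", 1.0, 3), # just over the line -> red
]
print(applicability_meter(weekly))  # -> ['status report', 'sprint planning notes']
```

Because hours appear on both sides of the comparison, the threshold reduces to “grade above 2.5 out of 5,” which is the point: the meter tracks accuracy trends per task, quarter over quarter.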
18. Forecast: Five Years Out
- Coding copilots spread into low-code IDEs. Expect fewer intern-level commits, more API orchestration.
- Microsoft AI jobs pivot from sheer model tuning toward trust layers: security filters, audit trails, prompt risk analytics.
- Jobs affected by AI in legal research reach paralegal core. Brief drafts go bot-first. Human counsel focuses on precedent stitching and courtroom storytelling.
- AI applicability in education jumps once multimodal GPT-style tools grade math scratch work via smartphone photos. Tutors evolve into motivation coaches.
- Governments roll out future of work certificates, verifying workers can instruct and verify AI agents. Think of forklift licenses, but for prompt safety.
19. Will the Dominoes Fall All at Once?
History rarely moves like a guillotine. The internet arrived, yet print designers still walk among us, albeit with Figma layers and CSS in their bloodstream. The same slow glue will bind this shift.
Yes, some chat transcripts hint at whole jobs replaced by AI, especially template-heavy telemarketers. More often, we’ll watch a gradual carve-out. Tasks die, titles live, pay brackets wobble, then settle.
The big constant is agency. Workers who treat Copilot as an adversary forfeit control. Those who treat it as a collaborator gain bargaining power, precisely because they understand the machine’s blind spots.
20. Closing Circuit
Microsoft’s study gives us a rare dashboard: a real-world signal of where conversational AI already grips the labor graph. It shouts that writing, translation, sales scripts, and code boilerplate stand closest to the cliff. It whispers that caregiving, craftsmanship, and embodied judgment remain firm ground. And it quietly reminds us that jobs replaced by AI is not a binary headline. It’s a sliding bar that moves each time someone figures out one more clever prompt.
So, the next time you read an alarmist tweet about robots taking every gig, pull up your own task sheet. Mark what is truly text or data, mark what is muscle or interpersonal nuance. The model eats text faster than you can imagine. The rest is yours to defend, rework, or elevate.
The future of work will belong to people who keep moving that boundary forward, one intelligently augmented deliverable at a time.
Citation:
Tomlinson, K., Jaffe, S., Wang, W., Counts, S., & Suri, S. (2025, July 22). Working with AI: Measuring the occupational implications of generative AI (Version 3) [Preprint]. arXiv. https://arxiv.org/abs/2507.07935v3
Azmat — Founder of Binary Verse AI | Tech Explorer and Observer of the Machine Mind Revolution.
Looking for the smartest AI models ranked by real benchmarks? Explore our AI IQ Test 2025 results to see how today’s top models stack up. Stay updated with our Weekly AI News Roundup, where we break down the latest breakthroughs, product launches, and controversies. Don’t miss our in-depth Grok 4 Review, a critical look at xAI’s most ambitious model to date.
For questions or feedback, feel free to contact us or browse more insights on BinaryVerseAI.com.
Which jobs is AI most likely to replace?
AI is most likely to replace jobs that involve repetitive, text-based, or rule-driven tasks. According to Microsoft’s 2025 study, occupations such as customer service agents, data entry clerks, telemarketers, content moderators, and low-level copywriters are at high risk. These roles often involve tasks that generative AI tools like Copilot, ChatGPT, or Gemini 2.5 can now perform faster and more cheaply.
What jobs will survive AI in the long run?
Jobs that require emotional intelligence, physical dexterity, or complex decision-making in dynamic environments are more likely to survive AI in the long run. Roles in healthcare, education, cybersecurity, creative direction, plumbing, and mental health counseling are considered AI-resilient. These professions rely on human connection, trust, and contextual judgment, areas where AI still falls short.
What jobs will be lost by 2030 or 2050 due to AI?
By 2030, jobs that involve template-based writing, basic financial analysis, low-level coding, and routine scheduling are expected to see widespread automation. By 2050, experts project deeper displacement across paralegal work, junior software development, logistics planning, and even some medical diagnostics, especially if multimodal AI tools continue to evolve rapidly. However, the timeline varies depending on regulation, economic shifts, and public adoption.
What degree or education is safe from AI disruption?
Degrees emphasizing critical thinking, interdisciplinary learning, and human-centric skills are considered safer from AI disruption. These include psychology, education, nursing, philosophy, social work, and cybersecurity. STEM degrees focused on AI development, robotics, and ethics of technology are also future-proof, but pure technical knowledge alone may not suffice without creativity and adaptability.
Why aren’t cybersecurity jobs on the Microsoft AI replacement list?
Cybersecurity jobs are largely absent from Microsoft’s Copilot logs because they often involve sensitive, offline, or confidential systems not exposed to AI-assisted workflows. Moreover, these roles demand continuous adaptation, human judgment, and real-time threat response, areas where current generative AI tools remain limited. Instead of being replaced, cybersecurity roles are expected to grow in demand as AI introduces new vulnerabilities.