AI in Software Engineering — Beyond the Code Completion Era
Code completion was the first wave. Agentic coding tools, AI-driven IDEs, and autonomous-bug-fix services are the second. The picture in 2025 is more nuanced than either the boosters or the sceptics suggest.
The AI software engineering tooling landscape has evolved through 2024 and into 2025. Code completion (the first wave, typified by Copilot) is now table stakes. AI-driven editors (Cursor, Continue), agentic coding assistants (Claude Code, Aider, Devin), and integrated AI in major IDEs have changed how engineering teams work in subtle but real ways.
This piece is a practitioner view of what's actually working in AI-assisted software engineering in 2025, where the value lands, and what discipline determines whether it compounds or erodes.
The tooling categories
By 2025 the tooling has differentiated:
Inline assistance (the Copilot pattern)
Suggestions as you type, integrated into the IDE. Now standard: most enterprise engineering teams use at least one such tool.
AI-native editors (Cursor, Continue, Windsurf)
Editors built around AI as a primary interaction model. More aggressive than the Copilot pattern; the editor knows the codebase context, can apply changes across files, can navigate via natural language.
Agentic coding tools (Claude Code, Aider, Devin)
Tools where the AI executes meaningful units of work (implementing a feature, fixing a bug, refactoring a module) with the engineer reviewing the result. These tools sit on the boundary between assistant and agent.
Embedded in CI/development infrastructure
AI-driven PR review, test generation in CI, automated bug fixing in issue trackers. Not interactive; runs as part of the development flow.
Specialised tools
AI for code review, code search, documentation, migration. Narrow but valuable in their niches.
Where the productivity lands
Across the engineering teams we have observed:
Boilerplate and scaffolding
Inline assistance accelerates boilerplate consistently. New files, standard patterns, type definitions, test scaffolding. Saves real time across the day.
Familiar pattern recall
When the engineer knows what to do and just needs to type it, AI assistance shortens the typing. The cumulative effect is real.
Cross-language navigation
For engineers working across multiple languages or frameworks, AI bridges unfamiliar areas. The engineer can move with reasonable productivity in languages they don't use daily.
Test coverage
Across all categories of tools, test generation is the place AI has improved engineering output most measurably. Tests that wouldn't have been written get written; coverage rises.
Documentation
Documentation that was deferred ships. AI doesn't produce perfect documentation; it produces drafts that are easier to polish than to write from scratch.
Refactoring with structure
Agentic tools handle refactoring across multiple files better than the Copilot pattern. The engineer specifies the refactor; the tool applies it with reasonable accuracy; the engineer reviews.
Migration tasks
Framework migrations, version upgrades, syntax modernisation. Tasks where the patterns are well-defined and the volume is high. AI tools are reliably useful here.
Bug fix proposals
Given an issue and a codebase, AI tools propose fixes. The proposals are starting points. Engineers review, adjust, ship. The productivity gain is real on routine bugs.
Where the productivity doesn't land
Original architectural design
Designing new systems, choosing patterns, making consequential architectural decisions. AI assists at the margins but the substance is human work.
Domain-specific reasoning
Code that requires deep understanding of a specific business domain. The AI produces plausible code; the engineer corrects for domain reality.
Production debugging
When something is broken in production, AI tools help with hypotheses but don't replace the human-driven investigation. The decision-making is human.
Performance optimisation at scale
System-level performance work requires reasoning about cross-cutting concerns the AI doesn't see. AI tools help with local optimisations; system-level optimisation is human.
Security-critical code
Code that touches security boundaries needs review at a level AI assistance doesn't yet provide. Engineers stay in the driver's seat.
Complex codebases with house conventions
Codebases with strong conventions the AI hasn't seen produce friction. Suggestions don't match house style; agentic actions don't follow patterns. Engineers spend correction time.
What changes with agentic tools
Agentic coding tools (Claude Code, Aider, Devin) are the new development of 2024-2025. They produce different working patterns from inline assistance:
Bounded autonomy
The engineer gives the tool a task. The tool does the work — reads files, edits files, runs tests, iterates. The engineer reviews the result.
This works when:
- The task is bounded and specific
- The tests provide feedback to the tool
- The engineer can review meaningfully before commit
It doesn't work when:
- The task is open-ended
- There's no fast feedback signal
- The engineer skips review
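The conditions above can be sketched as a small control loop. This is a minimal illustration, not how any particular tool is implemented: `run_bounded_task`, `TaskResult`, `propose_change`, and `run_tests` are hypothetical names, and the "agent" is a stand-in function rather than a real model call. The point is the shape: a capped number of iterations, test output fed back into the next attempt, and a result that always goes to a human before commit.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class TaskResult:
    passed: bool                     # did the test suite go green?
    iterations: int                  # propose/test cycles actually used
    last_feedback: str               # final test output, kept for the reviewer
    needs_human_review: bool = True  # the loop never commits on its own


def run_bounded_task(
    propose_change: Callable[[Optional[str]], None],
    run_tests: Callable[[], Tuple[bool, str]],
    max_iterations: int = 5,
) -> TaskResult:
    """Run a propose -> test cycle until tests pass or the budget runs out."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback: Optional[str] = None
    for iteration in range(1, max_iterations + 1):
        propose_change(feedback)        # hypothetical agent step: edits files
        passed, feedback = run_tests()  # the fast feedback signal
        if passed:
            return TaskResult(True, iteration, feedback)
    # Budget exhausted: hand back to the engineer instead of looping forever.
    return TaskResult(False, max_iterations, feedback)


# Stand-in "agent" that only gets the change right on its third attempt.
state = {"attempts": 0}


def fake_propose(feedback: Optional[str]) -> None:
    state["attempts"] += 1


def fake_tests() -> Tuple[bool, str]:
    ok = state["attempts"] >= 3
    return ok, ("all green" if ok else "1 failing test")


result = run_bounded_task(fake_propose, fake_tests, max_iterations=5)
print(result)
```

The iteration cap and the `needs_human_review` flag correspond to the first and third conditions in the list; without a `run_tests` callable that returns quickly, the loop has nothing to iterate against, which is the second.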
Productivity profile
Agentic tools produce more code per engineer hour than inline assistance. They also produce more code that needs review, refactoring, or rejection. The net productivity gain is real but smaller than the gross output suggests.
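A rough way to see the gap between gross and net is to account for review and rework time explicitly. The sketch below is illustrative arithmetic only: the function name `productivity_gain` and every number in the example are assumptions chosen for the illustration, not measurements.

```python
from typing import Tuple


def productivity_gain(
    baseline_hours: float,
    authoring_hours: float,
    review_hours: float,
    rework_hours: float,
) -> Tuple[float, float]:
    """Return (gross_gain, net_gain) as fractions of the baseline effort.

    gross_gain counts only the faster authoring; net_gain also charges the
    extra review and rework that the tool-generated code requires.
    """
    if baseline_hours <= 0:
        raise ValueError("baseline_hours must be positive")
    gross = (baseline_hours - authoring_hours) / baseline_hours
    total_with_tool = authoring_hours + review_hours + rework_hours
    net = (baseline_hours - total_with_tool) / baseline_hours
    return gross, net


# Illustrative numbers only: a task that took 40 hours unaided now takes
# 24 hours to author, plus 4 hours of extra review and 4 hours of rework.
gross, net = productivity_gain(40, 24, 4, 4)
print(f"gross: {gross:.0%}, net: {net:.0%}")
```

With these made-up inputs the gross saving is twice the net saving, which is the shape of the effect described above: the review and rework columns are where the difference goes.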
Skill profile shifts
Senior engineers use agentic tools differently from juniors. Seniors set up tasks well, review aggressively, reject confidently. Juniors can lean too hard on the tool's output; the productivity gain is smaller for them than for seniors with strong review discipline.
Code review evolves
A PR from an engineer using agentic tools looks different. More uniform style; sometimes more verbose; sometimes missing context the engineer would have included. Reviewers learn the patterns and adapt their review.
The discipline that determines compounding
The pattern that distinguishes teams where AI tools compound productivity from teams where they erode quality:
The engineer is the author
The AI drafts. The engineer authors. The engineer's judgment is what determines what ships.
Review discipline holds
PRs are reviewed at the same bar as before. AI-suggested code doesn't get a pass.
Test coverage stays meaningful
Tests verify behaviour, not just exist. AI-generated tests are scrutinised for whether they actually catch failures.
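One way to make "actually catch failures" concrete is to run a test against a deliberately broken version of the code and see whether it fails. The sketch below is a hand-rolled illustration of that idea; `apply_discount`, its buggy variant, and the `catches_bug` helper are all hypothetical names invented for the example, not part of any tool.

```python
from typing import Callable

PriceFn = Callable[[float, float], float]


def apply_discount(price: float, percent: float) -> float:
    """Correct implementation: percent is given as 0-100."""
    return price * (1 - percent / 100)


def apply_discount_buggy(price: float, percent: float) -> float:
    """Deliberately broken variant: forgets to divide percent by 100."""
    return price * (1 - percent)


def weak_test(fn: PriceFn) -> None:
    # Exercises the code and raises coverage, but asserts almost nothing.
    assert fn(100.0, 10) is not None


def behavioural_test(fn: PriceFn) -> None:
    # Pins down the expected values, so a wrong formula fails.
    assert abs(fn(100.0, 10) - 90.0) < 1e-9
    assert abs(fn(100.0, 0) - 100.0) < 1e-9


def catches_bug(test: Callable[[PriceFn], None], buggy_fn: PriceFn) -> bool:
    """Return True if the test fails when run against the broken code."""
    try:
        test(buggy_fn)
    except AssertionError:
        return True
    return False


print("weak test catches bug:", catches_bug(weak_test, apply_discount_buggy))
print("behavioural test catches bug:",
      catches_bug(behavioural_test, apply_discount_buggy))
```

Both tests pass against the correct function and both count toward coverage, but only the second fails against the broken one. That is the distinction a reviewer is looking for when scrutinising generated tests.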
Architecture stays human
System-level decisions are made by engineers. AI assists with specific choices, not with overall design.
The codebase stays maintainable
When AI tools produce code that is less maintainable than what engineers would have written, engineers tighten it before merge.
Without these disciplines, short-term productivity comes at the cost of technical debt that surfaces over the following year.
What we keep seeing
Patterns in 2025 enterprise engineering teams:
Productivity gains are real but bounded. A net productivity improvement of 15-30% is typical. Vendor claims of 50% or more rarely materialise at the team level.
Quality holds where discipline holds. Teams with strong review and testing discipline maintain quality. Teams that loosen the discipline accumulate debt.
Agentic tools are useful for narrower tasks than the demos suggest. They work for bounded tasks with good feedback signals. Open-ended use cases fail more often than they succeed.
The senior-junior productivity gap widens. Tools amplify existing skill; seniors use them more effectively than juniors.
Code review skill becomes more important. Reviewers are the quality layer for AI-assisted code, so how well they review directly determines what ships.
Tool fatigue is real. Engineering teams have been through multiple waves of new tooling. Adopting effectively requires intent, not just enthusiasm.
What we recommend
For enterprise engineering teams in 2025:
- Choose tools based on team fit, not on market leadership or marketing. The right tool is the one your team uses well.
- Maintain review discipline at the same bar as before. AI-suggested code is not pre-approved.
- Invest in test discipline. AI tools help generate tests; engineers verify that the tests actually catch failures.
- Use agentic tools for bounded tasks with feedback signals. Open-ended use cases waste effort.
- Coach junior engineers on effective use. The productivity gain depends on usage skill.
- Measure productivity, quality, and debt over months. The full picture takes time to emerge.
- Resist tool sprawl. Pick a small set; use them well.
AI in software engineering in 2025 is a real productivity tool with real but bounded impact. The teams that use it deliberately and maintain discipline ship faster without losing quality. The teams that adopt enthusiastically and let discipline slip ship faster initially and accumulate problems later. The pattern is the same as with previous productivity tools; the magnitude of the effect is larger than that of any previous tool we have seen.