The Real Role of LLMs in Software Development
2025-05-06
1. Introduction
I’ve always been fascinated by the idea of automating software development workflows. As a senior backend engineer, code review is something I deal with daily — it's repetitive, it requires focus, and often it slows teams down. So when large language models (LLMs) like GPT-4 and CodeGemma started gaining traction, I had one thought: "Why not automate code review entirely?"
I imagined a world where my Git workflow included a smart reviewer that spots bugs, catches anti-patterns, and even suggests elegant refactors — all without a human involved. No bias, no fatigue, just pure reasoning.
With that vision in mind, I started a side project — an AI-powered code reviewer. The goal was simple: take modified source code from GitLab or Bitbucket, send it to an LLM, and get back structured, insightful feedback. At least, that’s what I thought would happen.
But reality had other plans.
2. The Experiment
At first, I thought the problem was simple. I built a bot that hooked into GitLab Merge Requests, pulled the changed files, and sent them directly to an LLM via the Ollama API. I tried multiple models — DeepSeek, CodeGemma, even ChatGPT — and tuned the prompts carefully.
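Here's a minimal sketch of that pipeline, assuming a GitLab personal access token and a locally running Ollama instance. The host, token, IDs, and prompt wording below are illustrative placeholders, not the project's actual code:

```python
import requests

GITLAB_API = "https://gitlab.example.com/api/v4"  # hypothetical GitLab host
GITLAB_TOKEN = "glpat-..."                        # placeholder personal access token
OLLAMA_URL = "http://localhost:11434/api/generate"

def review_merge_request(project_id: int, mr_iid: int, model: str = "codegemma") -> str:
    # Pull the diffs of the changed files for the merge request.
    diffs = requests.get(
        f"{GITLAB_API}/projects/{project_id}/merge_requests/{mr_iid}/diffs",
        headers={"PRIVATE-TOKEN": GITLAB_TOKEN},
        timeout=30,
    )
    diffs.raise_for_status()
    patch = "\n".join(d["diff"] for d in diffs.json())

    # Hand the combined diff to a local model and return its review text.
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": model,
            "prompt": f"Review this diff and point out concrete bugs:\n\n{patch}",
            "stream": False,  # ask for one complete JSON response, not a stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```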
I expected specific feedback:
- “This method doesn't handle nulls.”
- “You're not closing the stream.”
- “This Kafka config will cause consumer lag under load.”
But what I got instead was disappointing and frustrating.
When the code contained a real bug — like an unhandled exception, a missing null-check, or a resource leak — the model didn’t highlight the issue. Instead, it rewrote the code entirely and said something vague like:
“Here's a slightly improved version of your method. I've made it cleaner and more maintainable.”
It didn’t point out what was wrong or why it needed to change. It just changed it.
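To make that concrete, here is the kind of code I mean (a hypothetical snippet, not one from an actual merge request). A human reviewer would name both problems in a sentence each; the models responded to code like this with a wholesale rewrite that named neither:

```python
import json

def load_config(path):
    # Bug 1: resource leak. The file handle is never closed, and it
    # leaks outright if json.load raises on malformed input.
    f = open(path)
    config = json.load(f)
    # Bug 2: missing null/presence check. An absent "timeout" key raises
    # KeyError, and a null value crashes the multiplication with TypeError.
    return config["timeout"] * 1000
```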
And when the code was already clean, well-structured, and followed best practices, the model still returned useless fluff like:
“Great job overall! You might consider renaming variables for readability, or breaking down large methods.”
Even after tweaking the prompts to demand "strictly highlight bugs only" or "only respond if something is clearly wrong," the model still tried to be creative or polite instead of critical and helpful.
I experimented with low temperature settings to reduce creativity, wrote extremely precise instructions, and tested the same inputs across multiple models.
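For reference, those experiments looked roughly like this. The endpoint and options are Ollama's standard generate API; the exact prompt wording and the sentinel reply are my own choices, and they varied per model:

```python
import requests

STRICT_PROMPT = (
    "You are a code reviewer. Strictly highlight bugs only. "
    "If nothing is clearly wrong, reply with exactly: NO ISSUES.\n\n{code}"
)

def strict_review(code: str, model: str = "deepseek-coder") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": STRICT_PROMPT.format(code=code),
            "stream": False,
            # Clamp sampling to near-greedy decoding to curb "creativity".
            "options": {"temperature": 0.0, "top_p": 0.1},
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```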
None of it helped.
No matter the temperature or how strict the prompt was, the model continued to generate vague suggestions, unnecessary rewrites, or motivational filler text.
It didn’t behave like a reviewer.
It behaved like a content creator trying to sound useful.
That’s when I realized: I was using LLMs for the wrong task.
3. A Shift in Perspective
After weeks of trying to mold LLMs into code reviewers, I stepped back and asked a simple question:
What are they actually good at?
They're great at assisting — not deciding.
Once I stopped expecting LLMs to behave like senior engineers, things started to click.
I began treating them as flexible assistants: tools for brainstorming, summarizing context, or even turning vague ideas into drafts.
They’re not precise tools for enforcement.
They’re creative tools for exploration.
That mindset shift made all the difference.
4. The Real Role of LLMs in Development
In real-world software projects, LLMs shine not as judges of code, but as collaborators.
They help generate boilerplate, suggest patterns, and even challenge your design assumptions.
They're perfect for moments like:
- “How would I structure this class in a more idiomatic way?”
- “Can you summarize what this function does?”
- “What are some edge cases I might miss here?”
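Framed that way, the same local setup that failed as a judge works well as a sounding board. A toy example, using a deliberately underspecified helper as the subject:

```python
import requests

SNIPPET = '''
def parse_timeout(value: str) -> int:
    """Parse a duration like "250ms" into milliseconds."""
    return int(value.rstrip("ms"))
'''

# An open, exploratory question instead of "find the bugs".
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "codegemma",
        "prompt": f"What are some edge cases I might miss here?\n{SNIPPET}",
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])
```

Here a chatty, speculative answer is exactly what I want, because I'm the one deciding which of its suggestions actually matter.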
Instead of forcing them into static roles like “code reviewer” or “linter,” I now use them for exactly what they do best: supporting the creative and cognitive parts of programming.
They help me think faster, not judge better.
But there's a danger: when developers delegate too much, they unconsciously become passive. You stop catching obvious mistakes you used to notice instantly, while the model spends minutes or even hours trying, and failing, to grasp the full picture.
LLMs are here to support us, not replace our common sense.
5. Conclusion
Working closely with LLMs taught me a simple truth: they are not silver bullets, but they are powerful tools. When treated as assistants — not replacements — they can boost creativity, help you move faster, and make coding more enjoyable.
But if you expect them to think for you, you'll be disappointed.
Real intelligence still lies in the developer — the model just helps bring it to life a little quicker.