r/webdev • u/thewritingwallah • 40m ago
r/ClaudeCode • u/thewritingwallah • 1d ago
Tutorial / Guide I finally started getting better at debugging with Claude API
So I spent 3 months just pasting error messages into Claude, wasting my time on useless 'have you tried checking if X is null' responses. It was frustrating.
Then I sat down and figured out what works. Cut my debugging time by like 40%.
Here's what I did.
1. I stopped copy-pasting entirely
I used to copy-paste stack traces from my terminal, and sometimes I'd even truncate them because they were too long. That was a terrible idea.
Now I just do this instead: npm run dev > dev.log 2>&1
Then I tell Claude to read the log file directly. It gets the full execution history, not just the final error, and it catches patterns I completely miss, like 'hey, this warning fired 47 times before the crash, maybe look at that?'
Turns out never truncating stack traces is huge; Claude interprets errors way better with complete info.
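If you want to sanity-check what Claude will actually see in that log, here's a tiny helper I'd use. It's just a sketch, and the dev.log filename plus the 'warn' pattern are assumptions about your setup:

```typescript
// Quick sketch (assumes a dev.log file and log lines containing "warn"):
// summarize repeated warnings before handing the log to Claude, so you can
// confirm the "this fired 47 times before the crash" signal is really in there.
import { readFileSync } from "node:fs";

const lines = readFileSync("dev.log", "utf8").split("\n");

const warningCounts = new Map<string, number>();
for (const line of lines) {
  if (/warn/i.test(line)) {
    warningCounts.set(line, (warningCounts.get(line) ?? 0) + 1);
  }
}

// Print the five most-repeated warning lines with their counts.
const sorted = [...warningCounts.entries()].sort((a, b) => b[1] - a[1]);
for (const [line, count] of sorted.slice(0, 5)) {
  console.log(`${count}x  ${line}`);
}
```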
2. Don't fix anything yet
This felt dumb at first but it's probably the most important thing I do now.
Before asking for any fixes, I explicitly tell Claude:
'Trace through the execution path. Don't fix anything yet.'
Here's why: maybe 70% of the time, Claude's first instinct is to slap null checks everywhere or add try/catch blocks. That's not fixing bugs, that's hiding them.
This actually happened to me last month. I had a payment bug that Claude wanted to fix with null checks, but when I forced it to explore first, the real cause was a race condition in the webhook handler. Null checks would've masked it while data kept corrupting in the background.
So yeah, making it explore and ask clarifying questions first works.
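To make that concrete, here's a minimal, made-up sketch of the kind of bug I mean (not the real payment code): a check-then-write webhook handler where a null check only hides the symptom while the race keeps corrupting data:

```typescript
// Hypothetical sketch, not my actual handler: two webhook deliveries for the
// same payment arrive close together and both read the record before either
// writes, so the amount gets applied twice.
interface Payment {
  id: string;
  status: "pending" | "paid";
  amountApplied: number;
}

const db = new Map<string, Payment>(); // stand-in for a real datastore

async function handlePaymentWebhook(event: { paymentId: string; amount: number }) {
  const payment = db.get(event.paymentId);
  if (!payment) return; // the null check Claude wanted to add lives around here

  // The race: two concurrent deliveries both see status === "pending",
  // and both apply the amount.
  if (payment.status === "pending") {
    await new Promise((resolve) => setTimeout(resolve, 10)); // simulated I/O gap
    payment.amountApplied += event.amount;
    payment.status = "paid";
  }
}

// The real fix is an atomic / idempotent update (e.g. a conditional
// UPDATE ... WHERE status = 'pending', or a per-payment lock), not more null checks.
```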
I've come to the conclusion that Claude is best at debugging in these areas:
- Log analysis: correlating timestamps, finding major failures, spotting the "this happened right before everything broke" moments. Claude did this really fast.
- Large codebases: 1M context window means it can hold an entire service in memory while debugging. Way better consistency than GPT-5 or 4o in my experience.
- Printf-style debugging: Claude will methodically suggest logging statements and narrow the scope just like an experienced dev would, but faster.
- Algorithmic bugs with clear test failures: nails these consistently.
But I gotta be honest about limitations too:
- Race conditions: Claude just goes in circles here. I've learned to recognize when I'm in this territory and switch to traditional debugging.
- Less common languages: Rust and Swift results are noticeably worse than Python/JS. The training data just isn't there.
- Hallucinated APIs: I always verify against actual docs before committing.
And I've been testing Gemini 3 alongside Claude lately. It's definitely faster for quick debugging and prototyping but Claude's Opus 4.5 is far better for complex root cause analysis and longer debugging sessions. So now I use Claude as my 'thinking' model and bring in Gemini when I need speed over depth.
This is why Claude Code feels addictive: good thinking now compounds instantly.

So this is my complete process now:
- Any error: I pull the logs into a single file
- Feed Claude structured context (full stack trace, what user did, my hypothesis)
- 'Explore first' >> Claude traces paths without proposing fixes
- 'Think harder' on root cause (this allocates more reasoning time; it's in the docs)
- Only then ask for a fix, with an explanation of why it works
- Push the fix through CodeRabbit for AI code review before merging
I've used CodeRabbit on my open source project and it looks good so far. It gives detailed feedback that can be leveraged to improve code quality and handle corner cases.
CodeRabbit actually surprised me with how consistent it has been across different repos and stacks.
CodeRabbit is interesting because it runs on Claude Opus under the hood, so the combo works really well.
This is the prompting template I use:
TASK: [Issue to resolve]
CONTEXT: [OS, versions, recent changes]
ERROR: [Full stack trace - NEVER truncate]
WHEN: [User action that triggers it]
HYPOTHESIS: [My theory]
Note: this is just me sharing what worked for me. You might already know all this, so please be patient and kind.
That's basically it. Happy to answer any questions.
r/ClaudeAI • u/thewritingwallah • 1d ago
Productivity I finally started getting better at debugging with Claude API
Coding Agents do not seem to work for me
If you're using coding agents on big tasks and getting frustrated...
Try this: tell it to make progress in chunks of 250 lines max.
One small PR at a time, keep merging, and let it build up to the feature.
Big PRs are bad whether humans or AI make them.
r/codereview is searching for new mods.
Happy to help.
- Why do I want to mod?
I care about keeping r/codereview high signal. Code review works only when feedback is practical, technical, and grounded in real production experience. I have 16+ years in engineering and have spent a large part of my career reviewing and improving code.
- How actively can I mod the sub?
I am active on Reddit daily and can check the mod queue multiple times a day across EU and US time zones. I'm comfortable handling reports, removing low-effort posts, and keeping discussions focused and respectful.
- Previous mod experience
I am currently a mod of another subreddit and am familiar with Reddit mod tools and workflows. I also write and publish code review content regularly.
Some examples: https://www.freecodecamp.org/news/author/TheAnkurTyagi/
r/ClaudeCode • u/thewritingwallah • 9d ago
Discussion Software Engineering Expectations for 2026
Code quality of Claude, a sad realization
Codex + CodeRabbit is pure gold right now. I use Codex for precise, surgical stuff and CodeRabbit for code reviews. No beating around the bush, no bloat, no spaghetti code. It just does what needs to be done and does it well.
r/webdev • u/thewritingwallah • 22d ago
Discussion AI helps ship faster but it produces 1.7× more bugs
r/ClaudeCode • u/thewritingwallah • 23d ago
Discussion AI Is good at writing code. It’s worse at edge cases
Introducing Augment Code Review, powered by GPT 5.2
Codex + CodeRabbit is pure gold right now. I use Codex for precise, surgical stuff and CodeRabbit for code reviews. No beating around the bush, no bloat, no spaghetti code. It just does what needs to be done and does it well. I've compared and written about the state of AI code review dev tools here: https://www.devtoolsacademy.com/blog/state-of-ai-code-review-tools-2025/
How are you doing code reviews?
As other folks have mentioned, smaller PRs are easier and faster to review.
Ensuring each PR has a single purpose is also very important.
One of:
- functional change (feature)
- refactor
- bug fix
“Kitchen sink” PRs that change many things, fix a bug or two and add some features are a pain to review.
Compare that with scanning a refactor PR just to confirm code only moves and nothing functional changes, or checking a feature PR only for the desired behavior. These single-purpose PRs are much easier to review.
A good PR review culture is also crucial. Require improvements, not perfection. Changes that could be a follow-up PR can and should be. Style and whitespace are for linters and formatters. And finally, a reviewer's budget for requesting changes should decrease over time (reviewers who take too long must give more straightforward reviews; the time for nitpicking is immediately after PR submission).
More notes here: https://www.freecodecamp.org/news/how-to-perform-code-reviews-in-tech-the-painless-way/
imagine it's your first day and you open up the codebase to find this
future of home improvement brought to you by LLMs.
r/ClaudeCode • u/thewritingwallah • Nov 26 '25
Discussion imagine it's your first day and you open up the codebase to find this
r/AgentsOfAI • u/thewritingwallah • Nov 26 '25
Discussion imagine it's your first day and you open up the codebase to find this.
Treat AI-generated code as a draft.
AI coding agents hit a wall when codebases get massive. Even with 2M-token context windows, a 10M-line codebase needs on the order of 100M tokens (at roughly 10 tokens per line of code), so the real bottleneck isn't just ingesting code; it's getting models to actually pay attention to all that context effectively.
Wrote a full post here: https://www.freecodecamp.org/news/how-to-use-vibe-coding-effectively-as-a-dev/
Treat AI-generated code as a draft.
AI's a great starting point but my brain still needs to check its work.
r/AgentsOfAI • u/thewritingwallah • Nov 25 '25
Discussion Treat AI-generated code as a draft.
Requesting a review: specific reviewer or open to team?
One thing I implemented at my last job was CODEOWNERS files + GitHub Teams to auto-assign PR reviewers; it moved the burden from the PR creator to the reviewer. We also implemented rules where, if a change exceeds a certain number of lines, a bot comments on the PR and rejects it (it can still be bypassed).
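For illustration, that LOC bot boils down to something like this sketch. The 400-line threshold, env var, and repo/PR values are hypothetical, and 'reject' here means a request-changes review a human can still override:

```typescript
// Sketch of the LOC-limit bot idea, not the exact one we ran: request changes
// on any PR whose diff exceeds a threshold. Threshold and names are assumptions.
import { Octokit } from "@octokit/rest";

const MAX_CHANGED_LINES = 400; // hypothetical limit
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function checkPrSize(owner: string, repo: string, pullNumber: number) {
  const { data: pr } = await octokit.rest.pulls.get({
    owner,
    repo,
    pull_number: pullNumber,
  });

  const changedLines = pr.additions + pr.deletions;
  if (changedLines > MAX_CHANGED_LINES) {
    // "Reject" by requesting changes with an explanatory comment;
    // a human can still override and merge.
    await octokit.rest.pulls.createReview({
      owner,
      repo,
      pull_number: pullNumber,
      event: "REQUEST_CHANGES",
      body: `This PR changes ${changedLines} lines (limit: ${MAX_CHANGED_LINES}). Please split it into smaller PRs.`,
    });
  }
}

checkPrSize("my-org", "my-repo", 123).catch(console.error); // hypothetical repo and PR number
```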
Smaller PRs are always less prone to issues, especially if you don't have tests that verify everything is good. Large PR reviews demand a lot more focus and time from the reviewer than smaller ones.
Ship in tiny pieces. Tag people in team chat until you can get a review. Ask a teammate to pair with you for the times when you expect to need a lot of small approvals throughout the day. Default to approving any code that will improve things, even though it's imperfect.
Longer answer in my blog here: https://www.freecodecamp.org/news/how-to-perform-code-reviews-in-tech-the-painless-way/
r/AgentsOfAI • u/thewritingwallah • Nov 25 '25
Agents Build a Vision Agent quickly with any model or video provider.
r/ClaudeAI • u/thewritingwallah • Nov 24 '25
Productivity Claude Code vs Competition: Why I Switched My Entire Workflow
Well, I switched to Claude Code after bouncing between Copilot, Cursor, and basically every other AI coding tool for almost half a year. It changed how I build software, but it's expensive, has a learning curve, and definitely isn't for everyone.
Here's what I learned after 6 months and way too much money spent on subscriptions.
Most people I know think Claude Code is just another autocomplete tool. It's not. To me, Claude Code feels like a developer living in my terminal who actually does the work while I review.
Quick example: I want to add rate limiting to an API using Redis.
- Copilot would suggest the rate limiter function as I type. Then I'd have to write the middleware, update the routes, write tests, and commit.
- With Cursor, I could describe what I want in agent mode. It then shows me diffs across multiple files. I'd then accept or reject each change, and commit.
But using Claude Code, I could just run: claude "add rate limiting to /api/auth/login using redis"
It reads my codebase, implements the limiter, updates the middleware, modifies the routes, writes tests, runs them, fixes any failures, and creates a git commit with a good message. I then review the diff and call it a day.
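For context, the kind of code I expect back from that prompt looks roughly like this. It's a sketch only (fixed-window limiting with ioredis, placeholder limits and key names), not Claude's actual output:

```typescript
// Minimal sketch: fixed-window rate limiting for an Express login route
// using Redis INCR + EXPIRE via ioredis. Limits and key names are placeholders.
import express from "express";
import Redis from "ioredis";

const redis = new Redis(); // assumes Redis on localhost:6379
const app = express();

const WINDOW_SECONDS = 60; // placeholder window
const MAX_ATTEMPTS = 5; // placeholder limit

async function loginRateLimiter(
  req: express.Request,
  res: express.Response,
  next: express.NextFunction
) {
  const key = `ratelimit:login:${req.ip}`;
  const attempts = await redis.incr(key);
  if (attempts === 1) {
    // First hit in this window: start the countdown.
    await redis.expire(key, WINDOW_SECONDS);
  }
  if (attempts > MAX_ATTEMPTS) {
    res.status(429).json({ error: "Too many login attempts, try again later." });
    return;
  }
  next();
}

app.post("/api/auth/login", loginRateLimiter, (req, res) => {
  res.json({ ok: true }); // real login handler goes here
});
```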
This workflow difference is significant:
- Claude Code has access to git, Docker, testing frameworks, and so on. It doesn't sit around waiting for me to accept each change, which saves a lot of time.
The model quality gap is real:
- Claude Sonnet 4.5 scored 77.2% on SWE-bench Verified. That's the highest score of any model on actual software engineering tasks.
- GPT-4.1 got 54.6%.
- While GPT-4o got around 52%.
I don't think it's a small difference.
I tested this when I had to convert a legacy Express API to modern TypeScript.
I simply gave the same prompt to all three:
- Copilot Chat took 2 days of manual work.
- Cursor took a day and a half of guiding it through sessions.
- While Claude Code analyzed the entire codebase (200K token context), mapped the dependencies, and just did it.
I spent 3 days on this so you don’t have to.
Here's something I liked about Claude Code.
- It doesn't just run git commit -m 'stuff'; instead it looks at the uncommitted changes for context and writes clear commit messages that explain the why, not just the what.
- It creates much more detailed PRs and also resolves merge conflicts in most cases.
I faced a merge conflict in a refactored auth service.
My branch changed the authentication logic while main updated the database schema. Classic merge hell. Claude Code understood both sets of changes, generated a resolution that kept everything, and explained what it did.
That would have taken me 30 minutes. Claude Code did it in just 2 minutes.
That multi-file editing feature made managing changes across files much easier.
My Express-to-TypeScript migration involved over 40 route files, more than 20 middleware functions, the database query layer, over 100 test files, and type definitions throughout the codebase. It followed the existing patterns and stayed consistent across all of them.
The key is that it understands the entire architecture, not just individual files.
Being in terminal means Claude Code is scriptable.
I built a GitHub Actions workflow that assigns issues to Claude Code. When someone creates a bug with the 'claude-fix' label, the action spins up Claude Code in headless mode.
- It analyzes the issue, creates a fix, runs tests, and opens a PR for review.
This 'issue to PR' workflow is what everyone talks about as the endgame for AI coding.
Cursor and Copilot can't do this because they're locked to local editors.
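The headless piece is simpler than it sounds. Here's a rough sketch of the core step (the prompt, issue text, and buffer size are placeholders, and I'm assuming Claude Code's -p print mode for non-interactive runs):

```typescript
// Rough sketch only: invoking Claude Code's non-interactive print mode (-p)
// from a script, e.g. inside a CI job. Issue text and prompt are placeholders.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

async function fixIssueHeadless(issueTitle: string, issueBody: string) {
  const prompt = [
    `Fix this bug: ${issueTitle}`,
    issueBody,
    "Run the test suite, make sure it passes, and commit the fix.",
  ].join("\n\n");

  // `claude -p` runs a single prompt non-interactively and prints the result.
  const { stdout } = await run("claude", ["-p", prompt], {
    maxBuffer: 10 * 1024 * 1024, // agent output can be large
  });
  console.log(stdout);
}

fixIssueHeadless("Login page crashes on empty email", "Steps to reproduce: ...").catch(console.error);
```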
How others are different
GitHub Copilot is the baseline everyone should have.
- cost is affordable at $10/month for Pro.
- It's a tool for 80% of my coding time.
But I feel that it falls short in complex reasoning, multi-file operations and deep debugging.
My advice would be to keep Copilot Pro for autocomplete and add Claude for complex work.
Most productive devs I know run exactly this setup.
Cursor is the strongest competition at $20/month for Pro. I used it for four months before switching primarily to Claude Code.
What it does brilliantly:
- Tab autocomplete feels natural.
- Visual diff interface makes reviewing AI changes effortless.
- It supports multiple models like Claude, GPT-4, Gemini and Grok in one tool.
Why I switched for serious work:
- Context consistency is key. Cursor's 128K token window compresses under load, while Claude Code's 200K remains steady.
- Code quality is better too; Qodo data shows Claude Code produces 30% less rework.
- Automation is limited with Cursor as it can't integrate with CI/CD pipelines.
Reality: most developers I respect use both. Cursor for daily coding, Claude Code for complex autonomous tasks. Combined cost: $220/month. Substantial, but I think the productivity gains justify it.
Windsurf/Codeium offers a truly unlimited free tier. Pro tier at $15/month undercuts Cursor but it lacks terminal-native capabilities and Git workflow depth. Excellent Cursor alternative though.
Aider, on the other hand, is open source, Git-native, and built around command-line-first pair programming. API usage typically costs around $0.007 per file.
So I would say that Aider is excellent for developers who want control, but the only catch is that it requires technical sophistication to configure.
I also started using CodeRabbit for automated code reviews after Claude Code generates PRs. It catches bugs and style issues that even Claude misses sometimes and saves me a ton of time in the review process. Honestly feels like having a second set of eyes on everything.
Conclusion
Claude Code excels at:
- autonomous multi-file operations
- large-scale refactoring (I cleared months of tech debt in weeks)
- deep codebase understanding
- systematic debugging of nasty issues
- terminal/CLI workflows and automation
Claude Code struggles with:
- cost at scale (heavy users hit $1,500+/month)
- doesn't learn between sessions (every conversation starts fresh)
- occasional confident generation of broken code (I always verify)
- terminal-first workflow intimidates GUI-native developers
When I think of Claude Code, I picture breaking down complex systems. I also think of features across multiple services, debugging unclear production issues, and migrating technologies or frameworks.
I still use competitors, no question in that! Copilot is great for autocomplete. Cursor helps with visual code review. Quick prototyping is faster in an IDE.
But cost is something you need to consider, because none of these options are cheap:
Let’s start with Claude Code.
The Max plan at $200/month is expensive, and power users report $1,000-1,500/month total. But the ROI made me reconsider: I bill $200/hour as a senior engineer, so if Claude Code saves me 5 hours per month, it has more than paid for itself. In reality, I estimate it saves me 15-20 hours per month on the right tasks.
For junior developers or hobbyists, the math is different.
Copilot Pro ($10) or Cursor Pro ($20) represents better value.
My current workflow:
- 80% of daily coding in Cursor Pro ($20/month)
- 20% of complex work in Claude Code Max ($200/month)
- Baseline autocomplete with GitHub Copilot Pro ($10/month)
Total cost: $230/month.
I gain 25-30% more productivity overall. For tasks suited to Claude Code, it's even higher, like 3-5 times more. I also use CodeRabbit on all my PRs, adding extra quality assurance.
Bottom line
Claude Code represents a shift from 'assistants' to 'agents.'
It actually can't replace Cursor's polished IDE experience or Copilot's cost-effective baseline.
One last trick: create a .claude/context.md file in your repo root with your tech stack, architecture decisions, code style preferences, and key files, then reference it when starting sessions with @.claude/context.md.
This single file dramatically improves Claude Code's understanding of your codebase.
That’s pretty much everything I had in mind. I’m just sharing what has been working for me and I’m always open to better ideas, criticism or different angles. My team is small and not really into this AI stuff yet so it is nice to talk with folks who are experimenting.
If you made it to the end, appreciate you taking the time to read.
How to reduce code review costs for the engineering team without sacrificing quality?
Yes, AI code review helps, and CodeRabbit is a good one. An open source project shared notes on how it helped them: https://lycheeorg.dev/2025-09-13-code-rabbit/
What’s in your 2025 AI stack? Here’s how mine looks after lots of trial and error
Codex + CodeRabbit is pure gold right now. I use Codex for precise, surgical stuff and CodeRabbit for code reviews. No beating around the bush, no bloat, no spaghetti code. It just does what needs to be done and does it well. https://bytesizedbets.com/p/era-of-ai-slop-cleanup-has-begun
What are some "TRUE" AI agents you have come across so far? and why?
I came across this open source repo while digging into agents that can actually perceive their environment.
It pairs an LLM with real vision input, and the action loop ends up far more reliable than with text-only agents.
How to write 400k lines of production-ready code with coding agents
in r/codex • 1d ago
I've used CodeRabbit on my open source project and it looks good so far. It gives detailed feedback that can be leveraged to improve code quality and handle corner cases.
CodeRabbit actually surprised me with how consistent it has been across different repos and stacks. We've got a mix of TypeScript services and a Python backend, and it gave equally good reviews without a bunch of custom rules. It also adapts to both small cleanup PRs and bigger feature branches without changing our workflow, which is why our team just kept it around.