Built your product with Lovable, Bolt, or Cursor and now need it production-ready? A 2026 guide for Australian founders and organisations on auditing, cleaning up, and scaling AI-built software, with realistic costs and timelines.
A 2026 guide for founders and operators who shipped with AI and now need to productionise what they built
You used Lovable, Bolt.new, Cursor, or Replit Agent to build a product without writing the code yourself. It works. Customers are using it.
You shipped something in three weeks that would have cost A$60,000 and six months a year ago.
That part isn't a problem. That part is the most important thing that's changed about software in 2026.
The problem is what comes next.
The product is live, you don't fully understand the code that's running it, the AI tool's subscription is the only thing connecting you to the system, and something just broke that you can't fix by prompting your way out of.
Or it works fine for now, but you're about to raise capital, hire your first engineer, or expand to enterprise customers, and someone is going to look at the codebase and ask hard questions.
This guide is for that situation. It is not a critique of AI-assisted development. Senior developers who work fluently with Claude Code, Codex CLI, and Cursor are shipping production-grade code every day without touching most of it themselves.
That workflow is fine. The gap this article is about is different: it's the gap between working software built by a non-technical person and a product that is safe, maintainable, and ready for scale.
Those are two different things, and 2026 is the first year where the first one is easy enough that the second one has become a category of professional service in its own right.
How we got here, briefly
The numbers from the past 12 months tell the story:
- 25% of Y Combinator's Winter 2025 batch ran on codebases that were 95% AI-generated. YC's own framing is important: these were not non-technical founders; these were technical people choosing to let the AI do the work. But the same tools that let YC founders move faster also let truly non-technical founders ship for the first time, and that is the cohort the cleanup market is built on.
- Lovable hit A$600M ARR in its first year. Bolt.new, Replit Agent, and v0 each crossed nine-figure ARR by early 2026. These platforms are specifically optimised for non-developers: describe the product in plain English, get a working app, hit deploy. They didn't exist in usable form before late 2024.
- Gartner forecasts 60% of all new software code will be AI-generated by the end of 2026. The split inside that 60% is the whole story: some of it is shipped by senior developers with proper engineering rigour, and some of it is shipped by founders who learnt SQL on a Tuesday and shipped to production on a Thursday.
- In Australia, mid-market adoption of these tools is roughly tracking the US at a 6 to 9 month lag, with disproportionate uptake in financial services, healthtech, and government-adjacent operators where engineering headcount is hard to recruit and AI tooling has been treated as a workaround.
What this means in practice: more AU products are now being built by non-engineers than at any point since the early no-code era, and the quality bar of what those products can do has gone up by an order of magnitude.
The previous generation of no-code tools produced demos. This generation produces real software. Which means the failure mode has also become more serious.
The specific situation this guide is for
You probably recognise yourself in one of these:
You're a founder who built the MVP yourself. You had an idea, the engineering market was too expensive or too slow, and you used Lovable or Cursor to ship something. It works. Customers pay.
But you can't extend it confidently, the codebase is a black box even to you, and you can't hire a senior engineer with this code as their first impression of the company.
You're an operator inside a larger organisation who built an internal tool. Maybe an ops dashboard, a reporting tool, a customer portal extension. It started as a personal project and ended up being relied on by the team.
IT now wants to know what's running on the network. Compliance wants to know where the data is going.
You need to either bring it inside the engineering function or shut it down.
You're a non-technical co-founder whose technical co-founder left. They built it fast, they used AI heavily, and now you have a working product and no one who can answer questions about it.
The next conversation with an investor or a customer is going to surface the gap.
You're a product lead at a mid-market AU business who let a "citizen developer" ship something. It's running in front of real customers. It probably shouldn't be. You need someone to assess what's there and tell you what to do next.
These are different situations but they share a structure: a product exists, it works, and the gap between "works for now" and "works for a real business" needs to be closed by someone who actually knows what they're doing.
What an experienced developer finds when they open the codebase
Most non-technical-built codebases share the same problems. Not because the AI tools are bad (they're remarkable) but because the tools optimise for getting to a working demo, not for the boring, expensive parts of software that don't show up until you have real users.
Authentication that handles the happy path and nothing else. The login screen works. The signup flow works. But token expiry, password reset edge cases, role escalation paths, session fixation, and concurrent-login race conditions weren't in the prompt because nobody thought to put them there. This is where the worst vulnerabilities tend to sit.
Secrets and keys in places they shouldn't be. Stripe credentials hardcoded into the frontend. Database URLs committed to GitHub. API keys in plain text. The AI tools have gotten better at flagging this, but a non-developer working in plain English doesn't always understand why a warning matters.
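As a rough illustration of what a first-pass secrets check looks for, here is a minimal sketch. The two patterns are deliberately narrow and are assumptions chosen for this example (Stripe secret keys do begin with `sk_live_` or `sk_test_`, but the length threshold and the URL pattern are illustrative); a real audit would rely on a dedicated scanner with a much larger rule set. The sample values are made up.

```python
import re

# Illustrative patterns only; not an exhaustive rule set.
SECRET_PATTERNS = {
    # Stripe secret keys start with sk_live_ or sk_test_.
    "stripe_secret_key": re.compile(r"sk_(?:live|test)_[0-9a-zA-Z]{16,}"),
    # Connection URLs that embed a user:password pair.
    "database_url_with_password": re.compile(
        r"(?:postgres(?:ql)?|mysql)://[^\s:/@]+:[^\s@]+@[^\s/]+"
    ),
}


def find_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# Hypothetical frontend snippet; every value here is invented.
sample = """\
const key = "sk_live_abcdefghijklmnop1234";
const db = "postgres://admin:hunter2@db.example.com:5432/app";
const publishable = "pk_live_abcdefghijklmnop1234";
"""

for line_number, name in find_secrets(sample):
    print(f"line {line_number}: possible {name}")
```

The publishable key on the third line is not flagged, because publishable keys are designed to be exposed in the browser. Knowing which warnings matter is the judgement a non-developer often lacks.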
No tests, or tests that don't actually test anything. The codebase has a tests/ folder. The tests pass. But the assertions are trivial ("the function returns a value") and don't cover the actual logic. The first time someone changes a real behaviour, nothing catches it.
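The difference between a trivial assertion and a meaningful one is easiest to see side by side. The sketch below uses a hypothetical `apply_discount` function and a deliberately broken copy of it; neither comes from any real codebase.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, in whole cents."""
    return price_cents - (price_cents * percent) // 100


def broken_apply_discount(price_cents: int, percent: int) -> int:
    """A regression: the discount is added instead of subtracted."""
    return price_cents + (price_cents * percent) // 100


def trivial_test(fn) -> bool:
    # Only checks that something came back, so it passes for both versions.
    return fn(10_000, 20) is not None


def behavioural_test(fn) -> bool:
    # Pins the actual business rule: 20% off A$100.00 is A$80.00,
    # and a 0% discount leaves the price unchanged.
    return fn(10_000, 20) == 8_000 and fn(10_000, 0) == 10_000


for fn in (apply_discount, broken_apply_discount):
    print(fn.__name__, "trivial:", trivial_test(fn), "behavioural:", behavioural_test(fn))
```

The trivial test passes for the broken version too, which is exactly the situation described above: the suite is green and nothing catches the regression.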
A database schema optimised for "make it work today." Foreign keys are inconsistent. Indexes are missing or applied indiscriminately. Migration files don't exist because the AI tool abstracted the database away. The day you need to change anything structural, this becomes a multi-week problem.
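To make the schema point concrete, the sketch below uses SQLite from the Python standard library as a stand-in for whatever database the product actually runs on. The table and column names are hypothetical. It shows a schema written down as an explicit migration, with a declared foreign key and a single targeted index, and what that foreign key buys you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on for the connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema, kept as an explicit, versioned migration rather than
# left implicit inside a hosted tool.
MIGRATION_001 = """
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total_cents INTEGER NOT NULL
);
-- Index the column used for lookups and joins, not every column.
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
"""
conn.executescript(MIGRATION_001)

conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 4500)")


def insert_orphan_order() -> bool:
    """Try to insert an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (999, 100)")
        return True
    except sqlite3.IntegrityError:
        return False


print("orphan order accepted:", insert_orphan_order())
```

With the constraint declared and enforced, the database itself rejects the orphaned order. Without it, inconsistent rows accumulate quietly until a structural change forces someone to clean them up.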
No observability. No structured logging. No error tracking with Sentry or Bugsnag. No performance monitoring. No alerts. The system runs, but when it stops running you find out from a customer support ticket, not a dashboard.
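Structured logging is usually the cheapest piece of observability to add. A minimal sketch using only the Python standard library follows; the logger name, the `context` field, and the in-memory stream are assumptions made for this example, and a production setup would normally ship these lines to an error tracker or log platform instead.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}}; the name "context"
        # is a convention chosen for this sketch.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


stream = io.StringIO()  # stand-in for stdout or a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

try:
    raise TimeoutError("payment provider did not respond")
except TimeoutError:
    logger.error("payment failed", extra={"context": {"order_id": 42}}, exc_info=True)

print(stream.getvalue())
```

Because each line is machine-readable, a dashboard or alert rule can filter on `level` or `context.order_id` rather than someone searching free-text output after a customer complains.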
Inconsistent patterns layered on top of each other. A non-developer prompting an AI tool repeatedly over weeks doesn't get a coherent architecture. They get fragments. Data fetching is done three different ways across the codebase.
State management drifts between four different approaches. The codebase doesn't have a style; it has the accumulated residue of different prompting sessions.
Most importantly: institutional fragility. Even if every line of code is fine, nobody currently inside the business can reason about the architecture well enough to extend it safely. The founder can describe what they wanted; they can't describe what the AI built.
Adding a feature requires another prompting session that breaks something unrelated. The codebase isn't dead, but it can't be steered.
The last one is the real one. It's not really a code problem. It's a knowledge problem. The cleanup work has to fix both.
What good remediation looks like
A typical engagement for productionising a non-technical-built codebase runs 1.5 to 3 weeks of focused work.
That timeline surprises people who are used to traditional consultancy engagements measured in months, but it's accurate for what a senior developer using current AI tooling can deliver. The work has four phases:
Phase 1: Audit and recommendation (2 to 5 days)
A senior developer reads the codebase, runs static analysis tools, reviews the database, tests the authentication and authorisation paths, checks for hardcoded secrets, and assesses the overall architecture.
With modern tooling (Claude Code or Cursor 3 running across the full repo, plus CodeScene or Sonar for measurement), this work is days, not weeks. The output is a prioritised report: what's urgent, what's important but not urgent, what can wait, and (critically) whether the codebase should be remediated or rebuilt.
Sometimes the honest answer is rebuild. A good audit will tell you which situation you're in. This phase alone often pays for itself by preventing a wrong decision in the next.
Phase 2: Stabilise (1 to 3 days)
The highest-risk items get fixed first. Hardcoded credentials get moved into proper environment management. Critical auth vulnerabilities get patched.
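Moving credentials into environment management often looks something like the sketch below: the application reads its secrets at startup and refuses to run if any are missing. The variable names `STRIPE_SECRET_KEY` and `DATABASE_URL` are illustrative assumptions, not a prescribed convention.

```python
import os


class MissingConfigError(RuntimeError):
    """Raised at startup when a required setting is absent."""


def require_env(name: str, environ=os.environ) -> str:
    """Read a required setting from the environment, failing fast if unset."""
    value = environ.get(name, "").strip()
    if not value:
        raise MissingConfigError(f"environment variable {name} is not set")
    return value


def load_config(environ=os.environ) -> dict:
    # Variable names are illustrative; use whatever the deployment defines.
    return {
        "stripe_secret_key": require_env("STRIPE_SECRET_KEY", environ),
        "database_url": require_env("DATABASE_URL", environ),
    }


# Demonstration with a fake environment instead of real secrets.
fake_env = {"STRIPE_SECRET_KEY": "placeholder", "DATABASE_URL": "placeholder"}
print(sorted(load_config(fake_env)))

try:
    load_config({})
except MissingConfigError as error:
    print("startup refused:", error)
```

Failing at startup is a deliberate choice here: a missing key surfaces at deploy time rather than as a broken payment an hour later.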
Observability gets wired in so the next time something breaks, you find out from a dashboard rather than a customer email.
Characterisation tests get written around the core flows, locking in what the system currently does before anything gets refactored. The codebase isn't pretty after this phase; it's safe.
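A characterisation test records what the code does today, correct or not, so that a later refactor can be checked against it. The sketch below assumes a hypothetical `legacy_shipping_cents` function and a hand-picked set of inputs; in practice the snapshot would be saved to a file and the cases drawn from real traffic.

```python
import json


def legacy_shipping_cents(weight_grams: int, express: bool) -> int:
    """Hypothetical existing logic whose rules nobody fully remembers."""
    cost = 900 if weight_grams <= 500 else 900 + ((weight_grams - 500) // 250) * 150
    if express:
        cost = cost * 2 - 100
    return cost


CASES = [(100, False), (500, True), (750, False), (2000, True)]


def record_snapshot(fn) -> str:
    """Capture current outputs for fixed inputs as a JSON 'golden' snapshot."""
    return json.dumps({f"{w}g express={e}": fn(w, e) for w, e in CASES}, sort_keys=True)


def matches_snapshot(fn, snapshot: str) -> bool:
    """True if fn still produces exactly the recorded outputs."""
    return record_snapshot(fn) == snapshot


# Step 1: record what the system does today, right or wrong.
golden = record_snapshot(legacy_shipping_cents)


# Step 2: refactor, then check that behaviour is unchanged.
def refactored_shipping_cents(weight_grams: int, express: bool) -> int:
    extra_blocks = max(0, (weight_grams - 500) // 250)
    cost = 900 + extra_blocks * 150
    return cost * 2 - 100 if express else cost


print("refactor preserves behaviour:", matches_snapshot(refactored_shipping_cents, golden))
```

The test makes no claim that the shipping rules are right. It only guarantees that the refactor did not change them by accident, which is what makes Phase 3 safe to attempt.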
Phase 3: Refactor and harden (4 to 8 days)
Now the structural work happens. Inconsistent patterns get consolidated. The database schema gets cleaned up, with proper migrations. Component structure gets rationalised.
The tests written in Phase 2 prevent regressions. This is where the AI tooling earns its place: the developer runs Claude Code for cross-file architectural changes and Cursor 3 with parallel agents for the smaller refactors, with the engineering judgement to steer everything.
Work that took 3 to 4 weeks of manual refactoring in 2024 takes 4 to 8 days in 2026 when a senior developer is composing the right agents.
Phase 4: Document and hand over (1 to 2 days)
This is the phase non-technical founders consistently undervalue and consistently regret undervaluing. Architecture documentation gets written.
The README explains how to run, deploy, and extend the system. CI/CD pipelines get formalised. The developer runs a structured handover session.
If you're about to hire a permanent engineer, this is what they'll inherit. If you're not, this is what lets the next contract developer ramp in days instead of weeks.
A subset of engagements skip straight from audit to a structured rebuild decision. If the codebase is far enough gone, two days of cleanup work followed by a five-month rebuild is genuinely worse than just rebuilding.
A good developer will tell you that on day two of the audit, not on day twenty.
How an experienced developer actually does this work in 2026
Here's the part founders sometimes miss: a cleanup developer in 2026 is using AI tooling extensively to do the cleanup. They're not hand-typing every line. That would make the work prohibitively expensive. What they bring is the judgement to use those tools properly.
The current production stack for AU developers doing this kind of work, as of mid-2026:
- Claude Code (Anthropic, terminal-based) is the dominant choice for large refactors and architectural analysis. It handles multi-file codebase-wide work better than any other tool currently available. Most cleanup engagements have Claude Code at the centre of the workflow.
- Cursor 3 (launched April 2026) is the daily IDE. Its Agents Window lets a developer run multiple parallel agents - for example, one refactoring the auth layer while another writes tests against the data layer - significantly compressing the time per phase.
- Codex (OpenAI) handles long-running autonomous tasks. The pattern: kick off a multi-hour refactor before lunch, review the proposed changes after.
- CodeScene, Sonar, and Augment Code provide the audit-and-measurement layer. They quantify the problem at the start and track the improvement as the work proceeds.
- Model Context Protocol (MCP) is now the connector standard. A cleanup developer typically sets up MCP servers for the project's GitHub repo, its database, and its observability stack on day one.
What the developer brings that the AI tools don't: knowing what good architecture looks like, recognising the specific failure patterns that AI tools introduce, and having the judgement to know when to refactor versus when to leave well enough alone.
This is the actual product Expert360 connects you with. The AI tools are commodity. The engineering judgement to use them at production standard is not.
A quick note: the tools developers use are changing rapidly, so the list above may already be out of date. A great software engineer will always work at the frontier of tool use to maximise their productivity.
What it costs in Australia, mid-2026
Engagement pricing for this work tends to sit at the upper end of the AU contract developer range because the skill set is genuinely narrow. Cleanup specialists typically come from the senior-to-principal tier and charge accordingly.
The rates below are indicative only. Experts in our network set their own rates, and you'll be able to compare real rates after requesting a talent shortlist.
Audit-only engagement (2 to 5 days): A$2,500 to A$7,500. A senior developer with the right tooling (Claude Code or Cursor 3 running against the full repo, plus CodeScene or Sonar for static analysis) can produce a thorough audit in days, not weeks.
The output is a prioritised remediation plan and a defensible recommendation on remediate-versus-rebuild. This is almost always the right starting point if you're not yet sure what you have, and at this price point the decision to start one shouldn't take long.
Full cleanup engagement (1.5 to 3 weeks): A$10,000 to A$30,000. This is the surprising one for buyers expecting agency-style timelines. A senior cleanup specialist using Claude Code and Cursor 3 properly can compress what used to be a multi-month refactor into a couple of weeks.
Specialists in this work typically charge A$1,100 to A$1,500 per day, and a typical 10-day engagement at A$1,300/day lands at A$13,000.
Codebases built with multiple AI tools by multiple founders over more than a year sit at the upper end. Genuinely complex cases occasionally push to 4 weeks.
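The day-rate arithmetic is simple enough to check yourself. The sketch below reproduces the worked example above and prints a few other combinations; the durations in the loop assume a five-day working week and are chosen for illustration only, not taken from any quote.

```python
def engagement_cost(days: float, day_rate: int) -> int:
    """Cost of an engagement billed per day, in whole Australian dollars."""
    return round(days * day_rate)


# Day-rate band quoted above for cleanup specialists.
LOW_RATE, TYPICAL_RATE, HIGH_RATE = 1_100, 1_300, 1_500

# The worked example: 10 days at A$1,300/day.
print(f"10 days at A${TYPICAL_RATE:,}/day: A${engagement_cost(10, TYPICAL_RATE):,}")

# Illustrative durations: roughly 1.5, 2, 3 and 4 working weeks.
for days in (7.5, 10, 15, 20):
    low = engagement_cost(days, LOW_RATE)
    high = engagement_cost(days, HIGH_RATE)
    print(f"{days:>4} days: A${low:,} to A${high:,}")
```

A quote well outside the band this produces for your expected duration is worth asking about, in either direction.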
Ongoing development post-cleanup: A$900 to A$1,300 per day on standard contract terms.
The comparison case is what makes the maths obvious: rebuilding from scratch with a permanent senior developer hire typically means 4 to 6 months to recruit, a salary of A$165,000 to A$215,000 a year fully loaded, and a further 3 to 6 months of build time.
A cleanup engagement that lands at A$15,000 and takes 2 weeks is a fundamentally different commitment.
The roughly 10% of cases where rebuilding is the right call usually involve codebases built with multiple AI tools by multiple founders over more than 12 months, or systems where the underlying data model is wrong for the business. A competent audit will tell you which category you're in within the first three days.
When to start the conversation
Most engagements don't start from crisis. They start when a founder or operator realises the speed advantage that got them to product-market fit has become a liability for whatever comes next.
The typical trigger is one of these:
- You're about to raise capital and the codebase needs to survive technical due diligence.
- You're hiring your first engineer and want them to inherit something they can maintain, not something they immediately propose rebuilding.
- The original development cycle has slowed: changes that used to take an afternoon now take a week because the codebase has accumulated enough drift that everything affects everything else.
- You have paying customers and the system has had its first real incident - performance degradation, data inconsistency, a security scare.
- You're operating in a regulated context (APRA CPS 234, the 2026 Privacy Act amendments, healthcare data, government tenders) and unreviewed code is no longer a tenable position.
- You realised you don't actually understand what the codebase does, and that's making product decisions harder than they should be.
If two or more of these resonate, the conversation is worth having now rather than after the next problem.
How Expert360 fits in
Expert360's contract developer network includes a growing cohort of AU senior engineers who specialise specifically in this work: taking products built by non-technical founders or operators and making them production-grade.
They work fluently with Claude Code, Cursor 3, Codex CLI, and the underlying MCP/observability stack. They've done it across multiple codebases, in fintech, healthtech, and AU government-adjacent contexts. They understand the full lifecycle from audit through handover.
Briefs are matched not just on language and framework, but on the specific AI tool your codebase was built with (Lovable codebases have a different shape from Bolt codebases, which in turn differ from Cursor-agent codebases), the industry compliance context, and the cleanup-versus-rebuild experience the engagement is going to need.
A curated shortlist is available within 48 hours of submitting a brief. The format that tends to produce the best match is straightforward: tell us what you built, what you built it with, what's currently working and what isn't, and what the next milestone is that's forcing the conversation (raise, hire, customer, incident, audit).
From there, we'll get you in front of two or three developers who have done specifically this kind of work before, and you can decide who's the right fit.
If you're not yet sure what you have, start with an audit. A few days, a clear report, and an informed decision about what comes next. That's almost always the right first step.