SaaS

Building Tradoki: an AI trading-education SaaS in 8 weeks

From idea to paying customers in eight weeks. The AI trading-education SaaS build log — stack, decisions, mistakes, and the 300-user beta that paid for itself.

Arthur (Founder, Bunny Honey Club AI)
Published Oct 17, 2025 · 11 min read

In July 2025 three of us — a discretionary options trader, a Next.js engineer, and a designer — decided to build a small AI-driven education tool for retail traders. The product is Tradoki, live today at https://www.tradoki.com, and this is the build log from idea to paying users. We gave ourselves eight weeks to reach paying customers or kill the project. We shipped on day 54 with 23 paying beta users at €19/month. By the end of September we had 312 paying users, €5,920 of monthly recurring revenue, and enough data to decide whether the thing was a real business or a pretty side project. This build log, honestly, is a story about how much faster a three-person team can ship in 2025 than it could have in 2023 — we used Claude Code to draft 70% of the initial codebase, used v0 for UI exploration, and deliberately kept the team at three people — and also a story about how the last 30% of any software product is still human work that AI can't accelerate. This is the complete build, the decisions, and the eight-week commercial outcome.

The problem we saw

The trader on the team, Jonas, has been trading options discretionarily for nine years and running a small options-education Discord since 2023. The Discord has around 800 members. Most of the recurring questions in the channel are variations on "is this trade plan sane?" — a user pastes a proposed trade (strikes, expiration, thesis, max loss), and Jonas or one of the senior members tells them what's wrong with it.

The same patterns come up every week. Buying wings on earnings. Choosing an ATM call when a credit spread is the more sensible structure. Ignoring skew. Overpaying on theta.

Jonas had been answering these questions for two years. The conversion from "user asks a bad trade plan" to "user asks a better trade plan next week" was around 30%. In his words: "I'm a bad teacher at scale. I need to say the same thing to each person individually, and I don't."

The thesis was narrow: if an LLM can handle the 70% of the feedback that's pattern-matching ("you're buying premium in a high-IV environment, consider selling instead"), Jonas can focus on the 30% that requires genuine market understanding, and a much larger number of traders can get the bad-trade critique at the moment they'd otherwise have clicked "buy."

The eight-week plan

  • Week 1: scope, stack, first working prototype of the critique engine.
  • Week 2: authentication, payment plumbing, minimal dashboard.
  • Week 3: full user flow end-to-end, beta-invite email list live.
  • Week 4: content library — 40 canonical "bad trade" patterns with pre-written critiques as RAG context.
  • Week 5: feedback loop from beta-list users; fix top 10 complaints.
  • Week 6: public landing, pricing page, Stripe live, first 20 paying users.
  • Week 7: onboarding polish, reduce time-to-first-critique from 3 minutes to 30 seconds.
  • Week 8: growth experiments, ship a referral mechanic, target 200 paying users.

We hit most milestones inside a week of plan. We slipped week 4 by three days (the RAG content library took longer than expected). We landed week 6's paying-user target a day early. We overshot week 8's target by 50%, arriving at 312 users at day 56 instead of 200 at day 49.

  • 8 weeks: idea to 200 paying users
  • €5,920: MRR at day 56
  • 14%: waitlist-to-paid conversion
  • 68%: month-2 retention

The stack, and why

We picked the stack in a two-hour meeting and didn't revisit it after. The constraint was velocity over portability; we wanted to ship, not to build a platform.

Next.js 16 with the App Router, deployed on Vercel. Server actions for form submissions, server components for rendering, edge functions for streaming the critique response. Deployed from a single repo with zero custom infra.

Postgres on Neon. Single database. Drizzle ORM for the queries we cared about; raw SQL straight to Postgres for anything heavy. Branching on Neon paid for itself in week 2 when we needed isolated schemas for the beta.

Claude Sonnet 4.6 for the critique engine. We tried four model routes — GPT-4.1, Gemini 2.5 Pro, Claude Sonnet 4.6, Claude Opus 4.6. At median latency, Sonnet 4.6 gave us the best quality per cost for structured critique output. Opus was better on hard cases but 4.5x more expensive per call, and slower.

Stripe for billing. Subscriptions, Customer Portal, nothing fancy. Four hours of setup.

Resend for transactional email. Onboarding, receipts, weekly digests. Trivial to wire up.

v0 for UI exploration. We used v0 to iterate on the dashboard layout in week 1 before committing to a final shape.

Claude Code for ~70% of the code. One of the founders (me) ran most of the backend and plumbing work out of Claude Code. The engineer on the team ran Cursor for the frontend and for anything requiring deep-file-level care. The split produced more code in week 1 than our prior projects had shipped in two months.

Total software cost at 300 users: €338/month, broken down as Vercel €40, Neon €25, Stripe fees (netted against revenue) ~€180, Claude API calls €63, Resend €20, domain/misc €10. Well inside the scaling bounds for the current growth rate.

Week 1: the prototype

Day 1 we put together a single-page prototype. A textarea on the left for the user to paste their trade plan; a panel on the right for Claude's response. Prompt template was twenty lines long. Response time was six seconds.

Jonas tested it with twenty of his canonical "bad trade" examples. Seventeen of Claude's responses were good. Two were fine. One was subtly wrong in a way a retail trader wouldn't catch but a pro would — and it was the kind of wrong that could encourage a bad trade, which was the exact outcome we were trying to prevent.

That one wrong response set the architecture for the entire product. We were not going to ship a chat-with-AI trading tool. We were going to ship a structured critique tool that Claude operated inside of, constrained by Jonas's taxonomy of known-good responses. The LLM was going to be the rendering engine, not the source of truth.

The rest of week 1 was building that constraint. The critique engine now takes a user's trade plan, classifies it into one of 47 "trade archetypes" that Jonas had pre-written critiques for, pulls the archetype's critique template from the database, and fills in the specifics from the user's plan using Claude. The model's latitude is small by design — it can personalize the critique, flag unusual specifics, but it can't invent a new critique pattern outside Jonas's taxonomy.

This constraint is the feature. It's what makes the product shippable under "education not advice."
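As a rough sketch of that constraint — every type, field name, and the placeholder syntax below is illustrative, not Tradoki's actual code — the pipeline looks something like:

```typescript
// Illustrative sketch of a template-constrained critique pipeline.
// All names and the placeholder syntax are hypothetical.

interface TradePlan {
  structure: string;   // e.g. "long call", "iron condor"
  ivRank: number;      // 0-100
  daysToExpiry: number;
}

interface Archetype {
  id: string;
  // Expert-written critique with placeholders the model may personalize.
  critiqueTemplate: string;
}

// Deterministic fill of the expert template with the plan's specifics.
// In the real flow an LLM personalizes within this frame, but it cannot
// author a critique outside the pre-written taxonomy.
function renderCritique(archetype: Archetype, plan: TradePlan): string {
  return archetype.critiqueTemplate
    .replace("{structure}", plan.structure)
    .replace("{ivRank}", String(plan.ivRank));
}
```

The point of the design is that the output can only ever be a variation of an expert-written template, so the model's failure modes are bounded by the taxonomy rather than by the model.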

Week 2: auth, billing, minimal flow

Auth: email + magic link, no password. Four hours to ship.

Billing: Stripe Checkout, subscription model, three tiers (€19 starter, €39 plus, €79 pro), 14-day trial. The trial is tracked in Stripe, not in our database, so we don't have edge cases where trial state drifts from billing state. Six hours to ship including webhook handlers.
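Keeping trial state in Stripe means access control reduces to reading the subscription's `status` field. The statuses below are the standard values Stripe reports; the access policy itself is a sketch of the idea, not Tradoki's code:

```typescript
// Standard Stripe subscription statuses (from the Stripe API).
type SubscriptionStatus =
  | "trialing" | "active" | "past_due" | "canceled"
  | "unpaid" | "incomplete" | "incomplete_expired" | "paused";

// Access is derived from Stripe's single source of truth, so trial
// state can never drift from billing state. Denying access on
// "past_due" is a product choice, not a Stripe requirement.
function hasAccess(status: SubscriptionStatus): boolean {
  return status === "trialing" || status === "active";
}
```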

Dashboard: a list of the user's past critiques, a "new critique" button, a settings page. Eight hours to ship, almost all in the UI.

End of week 2: a user could sign up, enter a trade plan, get a critique, and see their history. Paywall gates the second critique; the first is free.

Weeks 3–4: the content library

This was the longest and least glamorous part of the build.

Jonas wrote 47 trade archetype critiques, averaging 350 words each. Each archetype is keyed by structural properties: trade type (credit spread, debit spread, naked call, iron condor, etc.), market context (IV rank, time to earnings, correlated asset moves), and typical retail mistake pattern.

The classification step — given a user's pasted trade plan, which archetype is this? — runs through Claude with the archetype definitions in the prompt. We evaluated it against 400 hand-labeled trade plans and got 87% accuracy on the first pass, 94% after three rounds of prompt iteration.

The remaining 6% error rate lives as a known limitation. When classification is ambiguous, we show the user the top two candidate archetypes and ask them to pick. This is more honest than pretending the system is perfect, and in beta testing it increased trust rather than decreasing it.
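The ambiguity fallback can be expressed as a small pure function over the classifier's scores (the margin threshold here is made up for illustration):

```typescript
interface ScoredArchetype {
  archetypeId: string;
  score: number; // classifier confidence in [0, 1]
}

// Auto-select when the leader is clearly ahead; otherwise return the
// top two candidates so the user can pick. 0.15 is an illustrative margin.
function pickArchetype(
  scored: ScoredArchetype[],
  margin = 0.15,
): { auto: string } | { ask: [string, string] } {
  const sorted = [...scored].sort((a, b) => b.score - a.score);
  if (sorted.length === 1 || sorted[0].score - sorted[1].score >= margin) {
    return { auto: sorted[0].archetypeId };
  }
  return { ask: [sorted[0].archetypeId, sorted[1].archetypeId] };
}
```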

Week 5: the feedback loop

We launched a private beta on day 28 with 140 users from Jonas's Discord. First session feedback was unanimous on one point: the critique response was too long. Our templates averaged 450 words; beta users wanted closer to 150.

We spent three days rewriting every critique template to a shorter format with progressive disclosure — headline finding, three-sentence rationale, "read more" for the deep breakdown. Engagement lifted immediately; time-to-second-critique dropped from 22 minutes to 8 minutes; users started rating critiques at higher accuracy scores.
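The shortened format maps naturally to a three-part response shape — something like the following, with all field names hypothetical:

```typescript
// Hypothetical shape for the progressive-disclosure critique format.
interface Critique {
  headline: string;   // one-line finding, always visible
  rationale: string;  // ~3 sentences, shown by default
  deepDive: string;   // full breakdown, behind "read more"
}

// Rough authoring-time guard: keep the default-visible portion near
// the ~150-word budget beta users asked for.
function visibleWordCount(c: Critique): number {
  return (c.headline + " " + c.rationale).trim().split(/\s+/).length;
}
```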

Second-tier feedback: users wanted to see a practice scenario, not just a critique. "Here's what's wrong with your plan — now show me a better one to paper-trade." This took a week of work to add — a secondary generation step that produces a modified version of the user's trade plan with the critique applied. This feature became the single most-cited reason users kept subscribing past the first month.

Third-tier feedback: mobile. The dashboard was designed mobile-first in Figma but built desktop-first in the rush of week 2. Users were about 55% on mobile. We spent four days rewriting the dashboard to work properly on small screens. Conversion from trial to paid lifted 5 points after the mobile fix.

Week 6: the public launch

Day 36 we flipped the landing page public. Stripe went live. Jonas posted once to his Discord (which now numbered around 1,400 members), once on X with a 40-second demo video, and once on a Reddit subforum relevant to options traders.

First 20 paying users landed in four days. First 50 in a week. We were at 80 paying users by day 44, pacing ahead of the week 8 plan.

The landing page was plain. A headline ("Critique your options trade plans before you place them"), a 10-second looping video, a single CTA, the three pricing tiers, three testimonials from beta users, a FAQ. No pop-ups. No discount spinner. No scrolly marketing animation. We spent four hours on the landing and didn't revisit it for six weeks.

What worked: the demo video. It's a screen recording of a user pasting a plan and getting back a critique, set to 60 seconds of narration from Jonas. That video drove the majority of the inbound. A written landing without the video would have converted at half the rate.

Week 7: onboarding polish

At day 44, time-to-first-critique for a new user was 3 minutes. We set a goal of 30 seconds. It took three days to close.

Main changes:

  • Removed the mandatory profile-setup step. It wasn't doing anything useful; the user's trade plan contains everything the system needs.
  • Added a one-click sample trade. "Don't have a trade to analyze? Try this one." Sampled a canonical bad trade plan with known critique.
  • Deferred Stripe onboarding to the second critique. The first is still free; we just stopped asking for billing info upfront.

Time-to-first-critique dropped to 38 seconds. Trial-to-paid conversion lifted from 9% to 14%.

"I paid because the first critique told me something that was going to lose me $400. The second critique was the proof. By the third one, this was a cost I was going to pay regardless of the price."

— a user in our beta feedback survey, September 2025

Week 8: the referral loop

We added a simple referral mechanic: invite a friend, both get a month free when they upgrade. Shipped on day 51, by day 56 it had generated 48 new signups with 7 conversions. Not a home run; workable baseline for month 2.

We also ran one paid experiment — €400 on targeted Reddit ads on r/options. It converted at 3%, which is poor for paid acquisition, and we paused it. The organic channels (X, Discord, word-of-mouth) were producing better economics, and we had no reason to force paid.

End of day 56: 312 paying users, €5,920 MRR, 68% month-2 retention on the first cohort. The three of us met in person for the first time in two months and agreed the project had passed the "is this real?" threshold and was worth a second quarter.

The team composition, honestly

Three people, none of whom were full-time on this project. Jonas continued to trade his own book. The engineer continued his main day job. I split attention across this and three other businesses. Total combined human-hours into the eight-week build: roughly 640, which is about four weeks of a single full-time person.

The reason we shipped in that timeframe is not that we're unusually fast. It's that the tools compressed the work. Claude Code wrote the plumbing. Cursor handled the deep file work. v0 iterated on the UI ideas. What was left for the humans was product decisions, trade-archetype writing (Jonas), visual taste (the designer), and the hundred small choices that require judgement. That work is still slow, but there is far less of it than there is code, so total time compressed.

This is not a defense of AI-assisted coding generally. It's an observation specific to shipping narrow software products: when the product is well-scoped, the content layer is expert-authored, and the team has discipline about what to build, eight weeks is a plausible timeline now. In 2022 it wasn't.

The unit economics

Line                                    Monthly, at 300 users (€)
Gross MRR                               5,920
Stripe fees (3.5%)                      (207)
Refunds (~2%)                           (118)
Net revenue                             5,595
Infrastructure (Vercel, Neon, Resend)   (75)
Claude API                              (63)
Domain + misc                           (10)
Net contribution                        5,447

Fully-loaded founder time is not in this table. None of the three founders took compensation from the project in the eight-week window. We agreed to revisit once MRR crossed €15,000, which we're pacing toward for Q1 2026.

CAC is hard to measure honestly because most acquisition is organic. Counting the Reddit ad cohort, the only paid channel: €400 spent, 12 conversions, a CAC of €33 per user against €19 monthly ARPU. Payback in under two months, workable but not remarkable.

What broke

Rate-limiting on Claude API. We hit the quota twice in the first month during traffic spikes, and some users experienced the "critique is taking longer than usual" spinner for 30+ seconds before timing out. We moved to a larger tier of the Anthropic API and set up a retry with exponential backoff as a belt-and-braces. Hasn't happened since.
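The retry we added is the standard exponential-backoff-with-jitter pattern; a minimal sketch, with illustrative parameters (the real version wraps the Anthropic API call):

```typescript
// Full-jitter backoff: random delay in [0, min(cap, base * 2^attempt)].
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * ceiling);
}

// Retry an async operation up to maxAttempts times, sleeping a jittered,
// exponentially growing delay between failures.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 4): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
  throw lastErr;
}
```

Jitter matters here: during a traffic spike, it spreads the retries out instead of having every stuck request hammer the API on the same schedule.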

A pricing A/B test that went badly. Week 5 we ran a "pay what you want" trial on half of new signups. The PWYW half paid 40% less on average, didn't retain differently, and confused our analytics for three weeks. We killed it. The lesson: don't A/B test pricing on a product with 300 users. You don't have the sample size to learn from noise.

A critique bug in week 6. One of the archetype templates had a bug in the Claude prompt that occasionally produced a critique in the wrong tone ("dismissive / sarcastic" instead of "constructive / teacherly"). Four users flagged it before we caught it. We apologized, refunded them a week each, and tightened the prompt test suite. None of the four churned.

What's next

We've given ourselves Q4 2025 and Q1 2026 to get to 1,000 paying users and €20,000 MRR. If we hit that, two of us take full-time on the project. If we don't, we keep it running as a part-time side project and move on to other bets.

The product roadmap is narrow on purpose. We're adding: more archetypes (Jonas is at 63 now, targeting 120), a weekly portfolio-review feature, and a tighter mobile experience. We are not adding: social features, community, any form of execution integration (that's a whole different regulatory problem), or expansion into non-options asset classes.

The broader story: a three-person team with modern tooling can ship a real software product inside eight weeks now, and charge for it, and make some money. This is not going to make anyone rich in the next quarter. It is a structural change in what "launching a SaaS" looks like compared to three years ago, and it is why, if you are sitting on a narrowly-scoped software idea in 2026, the risk of not trying it is higher than the risk of trying.
