# Reddit Researcher Prototype Process

Date: 2026-05-11

This prototype is a static end-to-end demo of a completed research run for the exact product space we are evaluating: a tool that turns Reddit and adjacent public conversations into interview-ready customer discovery evidence for founders.

The current prototype is intentionally narrower than a generic Reddit researcher. It validates three jobs:

1. Pain Evidence Finder: ranked pain clusters, exact quotes, links, current workarounds, existing solutions, and caveats.
2. Interview Candidate Finder: public posters, context, quote, why relevant, safest interview angle, questions to ask, and outreach variants.
3. Competitor Gap Miner: positive and negative competitor mentions, missing capabilities, pricing/status facts, switching triggers, and source-backed gaps.

## Scope

The prototype excludes the `reddit_capture` browser extension. That extension remains an adjacent OSS/marketing hook for capturing Reddit browsing sessions in LLM-friendly Markdown. The paid flow shown here assumes a Codex-style backend can perform Reddit and web research directly, cite sources, cluster evidence, and return structured outputs.

The report should validate two things for us:

1. Solution: is the strongest product an interview-ready evidence set rather than a dashboard, score, or market-verdict generator?
2. UX: does the interface make the evidence actionable enough that a founder knows what is real, who has the pain, what people do today, why current options fail, and who to talk to next?

## Manual Analysis Process

1. Defined the target customer:
   - Solo technical founders and indie SaaS builders who can build but struggle to find concrete problems, current alternatives, and people worth interviewing.

2. Defined the customer-discovery questions every pain cluster must answer:
   - Is this a real problem?
   - Who exactly has it?
   - What do they do today?
   - Why are current solutions insufficient?
   - Who can I talk to next?

3. Reviewed local captured Reddit data through the `reddit_data` symlink:
   - 114 Markdown captures were available in `reddit_data/2026-05`.
   - 8 primary source threads were selected for this report because they contained direct founder pain, manual research workflows, competitor/tool mentions, outreach trust constraints, or build-before-validation failures.

4. Preserved source-level evidence before synthesis:
   - Exact quote text.
   - Thread URL.
   - Subreddit.
   - Public username when available.
   - Why the quote matters.
   - What to ask the poster next.

5. Clustered pain by customer job, not by subreddit:
   - Conversation-to-problem synthesis.
   - Build-before-validation failure.
   - Manual Reddit/G2 research being trusted but slow.
   - Monitoring tools not closing the loop.
   - Reddit outreach trust risk.
   - Category demand plus Reddit-platform risk.

6. Checked current competitor facts where they materially change product strategy:
   - F5Bot establishes the free/cheap alert baseline.
   - GummySearch validates Reddit audience research demand but also proves platform/commercial continuity risk.
   - Brand24 establishes serious monitoring budgets but is framed and priced for brand monitoring, not pre-build founder validation.
   - Reddit Pro is official business tooling, not founder research synthesis.
   - Leadmatically / ParseStream / IndiePilot-style tools validate conversation discovery but pull toward spam unless the UX stays interview-first.

7. Selected interview candidates:
   - Prioritized public posters who described the exact pain, named a workaround, mentioned insufficient current tools, or could critique the product format.
   - Generated outreach variants as interview requests, not sales or automated lead-gen messages.

8. Added UX implications to the evidence:
   - Replace "market verdict" with evidence-set summary.
   - Replace "opportunity score" with pain evidence indicators.
   - Make source cards first-class.
   - Treat candidate users as a mini interview workbench.
   - Keep competitor gaps tied to pain clusters and quotes.
   - Make anti-spam guardrails visible in message generation.

## Product Conclusions Reflected In The Prototype

1. The strongest v1 is Pain Evidence Finder plus Interview Candidate Finder.
2. Competitor Gap Miner is valuable, but should support pain discovery rather than become a generic social-listening product.
3. The buyer is not paying for raw Reddit access. They are paying for a reusable, source-linked evidence set they can use to decide what to validate next.
4. The UX should not pretend to know the market verdict. It should show evidence quality, caveats, current behavior, alternatives, and reachable people.
5. Candidate users are one of the highest-value outputs, but outreach must stay human-reviewed, specific, and research-oriented.
6. The product must preserve provenance. If a founder cannot inspect the original quote and context, the report loses trust.

## Prototype Data Caveats

1. The prototype uses real quotes from local captured Reddit files, lightly normalized for punctuation and readability.
2. The run is static and does not call Reddit, Codex, or a backend.
3. The run counters describe this evidence set and local captured corpus, not a fresh live Reddit crawl.
4. Product pricing/status facts can change; the current report includes source links so those facts can be refreshed.
5. Interview candidates are public Reddit usernames from captured threads. A production product needs privacy, safety, rate-limit, and anti-spam guardrails.
6. Reddit evidence is discovery evidence, not final validation. The next real test is whether target founders will pay for or meaningfully use the interview-ready evidence set.
