# Evidence Appendix

Date: 2026-05-11

This appendix packages the prototype's source provenance so the demo does not depend solely on an untracked local `reddit_data` symlink. Local capture paths are listed so the evidence can be reproduced on this machine; the stable source IDs below are what the prototype UI refers back to.

## Source IDs

| ID | Source | URL | Local capture | Role in prototype |
| --- | --- | --- | --- | --- |
| SRC-001 | r/startups customer-research thread | https://www.reddit.com/r/startups/comments/1t7iypn/customer_research_how_to_i_will_not_promote/ | `reddit_data/2026-05/r-startups__2026-05-10__customer-research-how-to-i-will-not-promote.md` | Core pain: Reddit conversations do not naturally become concrete product problems. |
| SRC-002 | r/ycombinator $100/month thread | https://www.reddit.com/r/ycombinator/comments/1t50tr9/what_problems_would_or_do_you_pay_100month_for/ | `reddit_data/2026-05/r-ycombinator__2026-05-10__what-problems-would-or-do-you-pay-100-month-for.md` | Money signal and willingness-to-pay skepticism. |
| SRC-003 | r/founder $100/month thread | https://www.reddit.com/r/founder/comments/1t51lty/what_problems_would_or_do_you_pay_100month_for/ | `reddit_data/2026-05/r-founder__2026-05-11__what-problems-would-or-do-you-pay-100-month-for.md` | Direct market-research and customer-interview quotes. |
| SRC-004 | r/SaaS Reddit marketing tool census | https://www.reddit.com/r/SaaS/comments/1s4blqi/why_your_reddit_marketing_tool_will_fail/ | `reddit_data/2026-05/r-saas__2026-05-11__why-your-reddit-marketing-tool-will-fail.md` | Competitor mention counts, category critique, strategic-loop gap. |
| SRC-005 | r/SaaSMarketing F5Bot / filtering thread | https://www.reddit.com/r/SaaSMarketing/comments/1sl2tfo/built_a_tool_that_finds_filters_reddit_posts/ | `reddit_data/2026-05/r-saasmarketing__2026-05-11__built-a-tool-that-finds-filters-reddit-posts-where-users-are-literally-asking-for-what-you-offer.md` | Low-yield alert pain and manual keyword-search friction. |
| SRC-006 | r/AskMarketing organic Reddit / AEO thread | https://www.reddit.com/r/AskMarketing/comments/1t4lnc7/anyone_doing_organic_reddit_marketing_to_improve/ | `reddit_data/2026-05/r-askmarketing__2026-05-10__anyone-doing-organic-reddit-marketing-to-improve-llm-search-results.md` | Spam risk, official Reddit participation constraints, outreach guardrails. |
| SRC-007 | r/EntrepreneurRideAlong validation-after-failure thread | https://www.reddit.com/r/EntrepreneurRideAlong/comments/1t4j8ev/im_finally_doing_what_everyone_says_to_do_first/ | `reddit_data/2026-05/r-entrepreneurridealong__2026-05-11__i-m-finally-doing-what-everyone-says-to-do-first-validate-with-real-conversations-before-launching-this-is-now-my-5th-platform-after-4-fai.md` | Build-before-validate failure pattern. |
| SRC-008 | r/startups founder-mistakes thread | https://www.reddit.com/r/startups/comments/1t7hrmi/what_thing_do_you_think_youve_done_wrong_thats/ | `reddit_data/2026-05/r-startups__2026-05-10__what-thing-do-you-think-you-ve-done-wrong-that-s-stopped-or-most-limited-your-startup-s-success-what-advice-would-you-give-earlier-yourself-.md` | Founder time-cost and platform-before-demand evidence. |
| SRC-009 | Customer problem discovery methodology memo | n/a (local file) | `/Users/gabedottl/.clawsy/research/customer-problem-discovery-methodology-2026-05-11.md` | Methodology, scoring rubric, search process, expert synthesis. |
| SRC-010 | Reddit problem discovery domain memo | n/a (local file) | `/Users/gabedottl/.clawsy/research/reddit-problem-discovery-2026-05-11.md` | Earlier cross-domain evidence synthesis and quote preservation. |

## Evidence Samples

| ID | Quote | Prototype use |
| --- | --- | --- |
| SRC-001 | "It feels incredibly hard to actually go from talking to strangers on Reddit to getting valuable insights that translate into concrete problems I can solve." | Pain cluster: raw conversations do not become clear customer problems. |
| SRC-002 | "Market research for deep customer problems is worth 100 a month if the insights are fresh and not generic. Most research tools are surveys or reddit scrapers. That is not worth 100." | Money signal plus warning against generic scraping. |
| SRC-003 | "High quality user research / customer interviews that actually turn into clear insights, not just nice to know data." | Buyer language for insight quality. |
| SRC-003 | "Market research i still do manually on Reddit and G2 because i have not found anything that surfaces the real complaints better than just reading them myself." | Direct competitor-gap quote: existing tools fail to surface real complaints. |
| SRC-004 | "Monitoring tools treat channels as data sources but never close the loop." | Competitor gap: monitoring is not strategic synthesis. |
| SRC-005 | "Like many of you, I used F5bot to find Reddit posts where my product could actually help. The problem is you can find only 2-3 in those 50 posts." | Low-yield alert/filtering pain. |
| SRC-005 | "I usually just search keywords manually and it is such a massive time sink." | Manual Reddit-search friction. |
| SRC-006 | "Bots or forced brand drops will just create spam signals." | Outreach safety guardrail and no-auto-DM scope decision. |
| SRC-007 | "Since then, I've built 4 platforms and each one has failed... I would get an idea, do surface-level research and get to work." | Build-before-validate pain. |
| SRC-008 | "Early on, I spent three hours a day manually hunting for leads on Reddit, which is a massive waste of founder time." | Time-cost and reachability evidence. |

## Negative Controls And Rejected Evidence

| Rejected group | Why rejected | How it changed the prototype |
| --- | --- | --- |
| Generic Reddit-tool launch posts | Launch replies and founder promotion are weak demand evidence unless users independently describe the pain. | The UI separates source ledger status from positive claims and does not treat tool-launch applause as validation. |
| Broad AI persona survey idea | Comments suggested the concept was already common and did not show paid urgency. | The product is framed as cited pain discovery, not generic AI personas. |
| AEO manipulation / auto-posting queries | Strong spam-risk warnings appeared; auto-participation would damage trust. | Outreach is gated behind manual review and framed as interview requests only. |
| Third-party competitor mention counts | Useful category signal, but not independently recounted during this prototype. | Competitor counts are labeled as reported counts everywhere they appear. |

## Score Model

The raw evidence score is the sum of five dimensions, each scored 0-20, for a maximum of 100: specificity, repetition, current workaround, money signal, and reachability. Adjusted confidence subtracts a reducer from that score for each caveat that applies: weak source quality, self-promotion risk, Reddit-only evidence, third-party count caveats, or spam-sensitive workflows.

The prototype intentionally shows both values so a founder can see the difference between "there is a lot of signal" and "we should trust this enough to build software."
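
The two values can be sketched in a few lines of Python. This is a minimal illustration, not the prototype's code: the class name, field names, caveat keys, reducer sizes, and the floor at zero are all assumptions, since this appendix names the dimensions and caveats but does not state their weights. The example inputs are invented for the arithmetic and are not taken from any source above.

```python
from __future__ import annotations

from dataclasses import dataclass

# The five dimensions named in the score model, each scored 0-20.
DIMENSIONS = (
    "specificity",
    "repetition",
    "current_workaround",
    "money_signal",
    "reachability",
)

# Hypothetical reducer sizes. The appendix lists these caveats but not
# how many points each one removes, so these numbers are placeholders.
REDUCERS = {
    "weak_source_quality": 10,
    "self_promotion_risk": 10,
    "reddit_only_evidence": 5,
    "third_party_count_caveat": 5,
    "spam_sensitive_workflow": 10,
}


@dataclass(frozen=True)
class EvidenceScore:
    specificity: int
    repetition: int
    current_workaround: int
    money_signal: int
    reachability: int
    caveats: frozenset[str] = frozenset()

    def __post_init__(self) -> None:
        # Reject out-of-range dimension scores and unknown caveat keys.
        for name in DIMENSIONS:
            value = getattr(self, name)
            if not 0 <= value <= 20:
                raise ValueError(f"{name} must be in 0-20, got {value}")
        unknown = set(self.caveats) - set(REDUCERS)
        if unknown:
            raise ValueError(f"unknown caveats: {sorted(unknown)}")

    @property
    def raw(self) -> int:
        """Sum of the five 0-20 dimensions (0-100)."""
        return sum(getattr(self, name) for name in DIMENSIONS)

    @property
    def adjusted(self) -> int:
        """Raw score minus the reducers for each caveat that applies.

        Flooring at zero is an assumption made for this sketch.
        """
        reducer = sum(REDUCERS[c] for c in self.caveats)
        return max(0, self.raw - reducer)


# Invented example: a strong cluster backed only by Reddit threads and
# by competitor counts that were not independently recounted.
example = EvidenceScore(
    specificity=16,
    repetition=14,
    current_workaround=15,
    money_signal=12,
    reachability=13,
    caveats=frozenset({"reddit_only_evidence", "third_party_count_caveat"}),
)
print(example.raw, example.adjusted)  # prints: 70 60
```

Keeping `raw` and `adjusted` as separate properties mirrors the choice to display both: the gap between them is the total reducer, which makes it visible how much of the signal is discounted by caveats.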
