VCBench: Benchmarking LLMs for Founder Success Prediction in Venture Capital
VCBench is the first benchmark for evaluating how well large language models (LLMs) predict founder success in venture capital (VC). The domain is marked by sparse signals and uncertain outcomes, and even leading investors perform only modestly: at inception, the market index achieves a precision of 1.9%, Y Combinator outperforms the index by a factor of 1.7, and tier-1 firms by a factor of 2.9. VCBench comprises 9,000 anonymized founder profiles, standardized to preserve predictive features while limiting identity leakage; adversarial tests show a reduction of more than 90% in re-identification risk. Nine advanced LLMs were evaluated, including DeepSeek-V3 and GPT-4o. DeepSeek-V3 delivers more than six times the baseline precision, GPT-4o achieves the highest F0.5 score, and most models surpass the human benchmarks. VCBench is intended as a public, continuously updated resource for advancing AI-driven analysis in venture capital.
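The investor multiples above can be turned into implied precision figures with simple arithmetic. The sketch below assumes that each multiple is a ratio of precisions against the 1.9% market index, which is how the text reads but is not spelled out; the names `MARKET_INDEX_PRECISION`, `LIFT_OVER_INDEX`, and `implied_precision` are illustrative. DeepSeek-V3 is left out on purpose, because the text does not say which baseline its "six times" refers to.

```python
# Figures quoted in the text above.
MARKET_INDEX_PRECISION = 0.019  # 1.9% precision of the market index
LIFT_OVER_INDEX = {"Y Combinator": 1.7, "Tier-1 firms": 2.9}


def implied_precision(baseline: float, lift: float) -> float:
    """Precision implied by a multiplicative lift over a baseline precision.

    Assumption: the quoted multiple is a plain ratio of precisions.
    """
    return baseline * lift


implied = {
    name: implied_precision(MARKET_INDEX_PRECISION, lift)
    for name, lift in LIFT_OVER_INDEX.items()
}

for name, precision in implied.items():
    print(f"{name}: {precision:.2%}")
```

Under that assumption, Y Combinator lands at roughly 3.2% precision and tier-1 firms at roughly 5.5%, which illustrates how modest even top-investor hit rates are in absolute terms.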
Key facts
- VCBench is the first benchmark for predicting founder success in venture capital.
- Market index precision at inception is 1.9%.
- Y Combinator outperforms the index by a factor of 1.7.
- Tier-1 firms outperform the index by a factor of 2.9.
- VCBench provides 9,000 anonymized founder profiles.
- Adversarial tests show more than 90% reduction in re-identification risk.
- DeepSeek-V3 delivers over six times the baseline precision.
- GPT-4o achieves the highest F0.5 score among evaluated models.
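For readers unfamiliar with the F0.5 score mentioned above: it is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. The sketch below implements the general textbook formula; the function name `f_beta` and the example precision and recall values are hypothetical and are not results from VCBench.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R).

    With beta < 1 (here 0.5 by default), precision counts more than recall.
    Returns 0.0 when the denominator is zero (both P and R are zero).
    """
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Hypothetical values, chosen only to show the precision weighting.
precise_model = f_beta(precision=0.50, recall=0.25)
high_recall_model = f_beta(precision=0.25, recall=0.50)

print(f"precise model F0.5:     {precise_model:.3f}")
print(f"high-recall model F0.5: {high_recall_model:.3f}")
```

The two hypothetical models have the same F1 score, but the one with higher precision gets the higher F0.5. That weighting suits a setting where acting on a false positive (backing a founder who does not succeed) is costly.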
Entities
Institutions
- Y Combinator
- VCBench