Researchers Introduce CapitalVX: A Machine Learning (ML) Model That May Help Investors Identify The Next Unicorn

Launching a start-up is harder than most people imagine, and the failure rate is high. According to studies, about 90% of start-ups fail, with 21% failing within their first year. This presents a considerable risk to venture capitalists and other investors in early-stage companies.

With the aim of identifying companies that are more likely to succeed, researchers from Venhound Inc. and Santa Clara University propose a new machine-learning model that predicts whether a start-up will fail or succeed. The model, named CapitalVX (for Capital Venture eXchange), is trained on the historical performance of over 1 million companies.

According to the researchers, the work demonstrates how ensembles of non-linear machine-learning models applied to big data can map large feature sets to business outcomes, a mapping that typical linear regression methods cannot capture.

The ensemble combines the contributions of several models, and its combined predictive power exceeds that of any individual model. Each model classifies a company into one of the success or failure categories with a specific probability. For example, a company may be considered very likely to succeed if most of the ensemble’s predictions place it in the IPO (listed on a stock exchange) or ‘acquired by another company’ category, while only 25% of its predictions place it in the failed category.
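The voting idea described above can be illustrated with a minimal sketch. The article does not say how CapitalVX actually combines its models, so the simple vote-share rule, the category labels, and the function names `ensemble_vote` and `success_share` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical outcome labels; the article names IPO, acquired, and failed.
CATEGORIES = ("ipo", "acquired", "failed")


def ensemble_vote(model_predictions):
    """Return the share of models that voted for each category."""
    counts = Counter(model_predictions)
    total = len(model_predictions)
    return {category: counts.get(category, 0) / total for category in CATEGORIES}


def success_share(vote_shares):
    """Share of votes that fall in a 'success' category (IPO or acquired)."""
    return vote_shares["ipo"] + vote_shares["acquired"]


# Toy example: four hypothetical models each predict one category for a company.
votes = ["ipo", "acquired", "acquired", "failed"]
shares = ensemble_vote(votes)
print(shares)
print("success share:", success_share(shares))
```

In this toy case, one of the four models predicts failure, which corresponds to the 25% figure in the example above.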

Their findings suggest that these models may accurately forecast a company’s outcome up to 90% of the time, meaning roughly 9 out of 10 businesses are likely to be assessed correctly.

The model draws on company data from Crunchbase. Given Crunchbase’s crowdsourced nature, it is not surprising that some companies’ entries are missing information. This observation led the researchers to measure the amount of missing data for each company and use that value as an input to the model. They then combined the Crunchbase data with patent data from the United States Patent and Trademark Office (USPTO). This provided one of the most critical factors in predicting whether a business would be acquired or fail on its own.
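As a rough sketch of the missing-data idea, the amount of missing information in a company record can be counted and attached as an extra feature. The field names and the record below are hypothetical and are not taken from Crunchbase or from the researchers’ feature set.

```python
# Hypothetical fields expected in a company record (not the real schema).
EXPECTED_FIELDS = ("founded_year", "headquarters", "total_funding_usd", "employee_count")


def missing_count(record, expected_fields=EXPECTED_FIELDS):
    """Count expected fields that are absent, None, or an empty string."""
    return sum(1 for field in expected_fields if record.get(field) in (None, ""))


# Made-up record for illustration only.
company = {
    "founded_year": 2015,
    "headquarters": "",
    "total_funding_usd": None,
    "employee_count": 40,
}

# Keep the original fields and add the missing-data count as a model input.
features = dict(company)
features["missing_count"] = missing_count(company)
print(features["missing_count"])
```

Treating missingness as a signal, rather than only filling in the gaps, lets a model learn whether sparse profiles correlate with particular outcomes.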

According to the researchers, the ensemble of models with the new data features achieves higher accuracy, precision, and recall than previous studies. The study is intended to help investors evaluate prospects quickly, spot potential red flags, and make more informed decisions about the composition of their portfolios.
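For readers unfamiliar with the three metrics mentioned above, here is a small sketch of how accuracy, precision, and recall are computed for a binary success/failure classifier. The labels are toy data invented for illustration, not results from the study, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred, positive="success"):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy labels: 2 successes and 8 failures; the classifier misses one success.
y_true = ["success", "success"] + ["failed"] * 8
y_pred = ["success", "failed"] + ["failed"] * 8

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case, accuracy is 0.9 while recall for the success class is only 0.5, which is one reason all three metrics are usually reported together.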



Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence in various fields. She is passionate about exploring new advancements in technology and their real-life applications.
