What do airline ticket prices, car sales and thousands of corporate filings have in common? In each of these areas, we applied advanced big data techniques to tackle an equity investing conundrum that couldn’t be solved by human researchers alone.

Figuring out how to use big data is the next frontier for the asset-management industry. But applying analytical tools such as machine learning and artificial intelligence to disparate investment research themes also presents huge challenges. There’s no simple recipe for translating oceans of data into investment advantages. Equity investors must have the right culture—and ask the right questions—to successfully integrate data science into research and investment processes.

Why Is Big Data So Important?

There’s a colossal amount of data available to investors today. For example, more than 8,000 US-listed companies produce quarterly 10-Q reports and annual 10-K reports, each hundreds of pages long. We’ve collected 675,000 of these reports that were filed over the past 26 years. Globally, companies also conduct about 20,000 earnings calls a year in English, each yielding detailed transcripts. And if you include non-English corporate documents, the data mountain would be far larger still.

In theory, portfolio managers have a fiduciary obligation to pore over thousands of pages of data to fully gauge the risks and opportunities that a company faces. Practically speaking? It simply isn’t humanly feasible to parse so much information efficiently. For the US market alone, you would need at least 30 full-time research analysts just to read every filing and call transcript. Investment firms need to ask whether this is a cost-effective effort that would lead to good investing outcomes.

Data science offers a solution by applying machine learning and artificial intelligence to the mountains of information. Yet even the smartest software requires human direction and expertise to translate data into actionable fundamental research conclusions.

For example, consider an analyst researching the cycle of a car sale. The information available today is varied and vast. During the initial engagement, potential US-based customers might conduct web research on sites such as Caranddriver.com or Edmunds.com while consuming information from social media marketing campaigns. As customers near a decision, price comparison websites create more data, along with surveys, apps and webcarts. Completed transactions generate email receipts. After a sale, customer satisfaction is gauged via online reviews.

Collectively, all these data should provide an analyst with unprecedented intelligence. But to generate insight, the analyst must ask questions that make sense out of the data: How many customers are coming into the store to research the product? What’s the competitive pricing environment? Which product features are being praised or panned in reviews?

These questions must be as specific as possible. Better questions will lead to better outcomes and generate the investment insight that can support a high-conviction portfolio position.

How Can Skills Be Brought Together to Generate Insight?

Generating this insight requires combining a broad set of investment skills. Large data sets must be crunched and combined with complex statistical and economic models. Investment organizations rooted in quantitative research may seem more attuned to data science, but they might not be equipped to make sense of the information.

Fundamental analysts can apply research intuition by asking the right questions needed to extract useful information from huge pools of data, but they may lack the technical skills to process it efficiently. In the following case studies, we aim to show how a hybrid approach that draws upon diverse analytical skill sets—combining data science with fundamental and quantitative analysis—can help investment teams rise to the data challenge.

Case Study 1: Big Data to Study Airline Capacity Utilization

Question: How does additional airline capacity affect pricing power and what are the implications for specific airline holdings?

As any traveler knows, airfare pricing is extremely complex and opaque. Prices on different routes can fluctuate dramatically from one day to the next, as seat supply and passenger demand shift to the tune of multiple market forces. This makes it very difficult for a transportation analyst to draw conclusions about an airline’s capacity, its pricing and ultimately its profitability.

So in 2018, we set out to mine big data in order to learn more about how airline capacity affects pricing power. The project reinforced the importance of applying thoughtful fundamental research techniques to derive insights from the data.

The research question was straightforward: How does additional capacity impact pricing power? To answer the question, we scraped millions of rows of ticket pricing data from airline websites. But the raw, unstructured data first needed to be aggregated and cleaned up by our data scientists.

This was a good start, but our fundamental analysts weren’t satisfied. The pricing data alone couldn’t really answer the research question. What we needed was more information on aircraft capacity and how many seats were actually being filled on each flight (Display). For that, we turned to a data set provided by the Department of Transportation (DOT) on airline capacity.

We asked our data scientists to overlay more than 1 million rows of DOT data, which carry a six-month lag, on airline-reported data by route. Working with the big data team, our analysts started by matching quarterly fare and capacity data for 264,000 pairs of cities, then consolidated the airlines and removed monopoly markets from the data set to focus on the effects of competition. This reduced the quarterly city pairs to 54,000.
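The matching and filtering steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline we used. The record layout, the field names (`quarter`, `origin`, `dest`, `carrier`, `avg_fare`, `seats`), the carrier and city codes, and every sample value are hypothetical. We also assume here that a "monopoly market" means a city pair served by a single carrier in a given quarter.

```python
from collections import defaultdict

# Hypothetical sample records; names and values are illustrative only.
fares = [
    {"quarter": "2017Q1", "origin": "NYC", "dest": "LAX", "carrier": "AIR1", "avg_fare": 310.0},
    {"quarter": "2017Q1", "origin": "LAX", "dest": "NYC", "carrier": "AIR2", "avg_fare": 325.0},
    {"quarter": "2017Q1", "origin": "BOS", "dest": "MSO", "carrier": "AIR3", "avg_fare": 410.0},
]
capacity = [
    {"quarter": "2017Q1", "origin": "NYC", "dest": "LAX", "carrier": "AIR1", "seats": 120000},
    {"quarter": "2017Q1", "origin": "NYC", "dest": "LAX", "carrier": "AIR2", "seats": 95000},
    {"quarter": "2017Q1", "origin": "BOS", "dest": "MSO", "carrier": "AIR3", "seats": 8000},
]

def route_key(rec):
    """Key a record by quarter, direction-agnostic city pair, and carrier."""
    pair = tuple(sorted((rec["origin"], rec["dest"])))
    return (rec["quarter"], pair, rec["carrier"])

def match_fares_to_capacity(fares, capacity):
    """Inner-join fare rows to capacity rows on (quarter, city pair, carrier)."""
    seats_by_key = {route_key(r): r["seats"] for r in capacity}
    matched = []
    for f in fares:
        key = route_key(f)
        if key in seats_by_key:
            quarter, pair, carrier = key
            matched.append({
                "quarter": quarter,
                "pair": pair,
                "carrier": carrier,
                "avg_fare": f["avg_fare"],
                "seats": seats_by_key[key],
            })
    return matched

def drop_monopoly_markets(rows):
    """Keep only quarterly city pairs served by more than one carrier."""
    carriers = defaultdict(set)
    for r in rows:
        carriers[(r["quarter"], r["pair"])].add(r["carrier"])
    return [r for r in rows if len(carriers[(r["quarter"], r["pair"])]) > 1]

matched = match_fares_to_capacity(fares, capacity)
competitive = drop_monopoly_markets(matched)
```

In this toy data set, the single-carrier BOS to MSO market drops out, leaving only the two rows for the contested NYC to LAX pair. The same filter applied at scale is what narrows a large set of quarterly city pairs to the competitive ones.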

The study revealed two clear conclusions. First, airlines typically add capacity to routes with higher-than-average growth of passenger revenue per available seat mile (or PRASM, the industry measure of unit revenue on a given route). Second, within four quarters, the increase in capacity tends to slow, which triggers a recovery in PRASM.
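For readers less familiar with the metric, PRASM is passenger revenue divided by available seat miles, where available seat miles are the seats offered multiplied by the distance flown. The short sketch below computes PRASM and its year-over-year growth. The function names and all input figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def prasm_cents(passenger_revenue_usd, seats, route_miles):
    """Passenger revenue per available seat mile, expressed in cents."""
    available_seat_miles = seats * route_miles
    return 100.0 * passenger_revenue_usd / available_seat_miles

def yoy_growth(current, year_ago):
    """Year-over-year growth as a fraction (0.10 means 10%)."""
    return (current - year_ago) / year_ago

# Hypothetical route: 100,000 seats flown 2,500 miles in each quarter.
prasm_now = prasm_cents(33_000_000, 100_000, 2_500)
prasm_prior = prasm_cents(30_000_000, 100_000, 2_500)
growth = yoy_growth(prasm_now, prasm_prior)
```

With these made-up inputs, PRASM rises from 12.0 cents to 13.2 cents, a 10% increase on unchanged capacity. Comparing each route's growth with the average across routes is one simple way to flag the higher-than-average cases.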

These conclusions were not academic. By gaining a better understanding of pricing dynamics, we developed a clearer view of the earnings potential of one of our holdings, which underpinned our conviction and allowed us to increase our position. What’s more, the research has enabled us to monitor real-time airline pricing more accurately, and we use these data to engage management in discussions about their plans for future route expansions.

Case Study 2: Measuring Consumer Sentiment Toward New Cars

Question: Can we predict the success of Subaru’s new vehicles, which are key to helping the company recover its operating margin?

For an auto industry analyst, it can be hard to know which way the winds are blowing when it comes to brand popularity. In 2018, our analyst was researching Subaru to find out whether the Japanese company’s new and refreshed models would gain sales momentum and help the company recover its operating margin.

To answer the question, we scraped nine different consumer car websites, collecting more than 50,000 historical SUV reviews. When studying the reviews, we looked at aggregate ratings for different models and applied natural language processing (NLP) to extract key topics and create sentiment scores based on review texts and historical vehicle sales. Fundamental analysts provided quarterly revenue data for each vehicle, which helped guide us toward the important topics and car features that were good predictors of changes in quarterly sales.
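To give a feel for topic-level sentiment scoring, here is a deliberately simple sketch. It is a keyword-lexicon stand-in, not the NLP models we applied, and it does not use sales data. The topic keywords, the positive and negative word lists, and the sample reviews are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical lexicons; a production system would learn these from data.
TOPIC_KEYWORDS = {
    "technology": {"infotainment", "touchscreen", "bluetooth"},
    "driving dynamics": {"handling", "steering", "acceleration"},
    "styling": {"design", "styling", "interior"},
}
POSITIVE = {"great", "love", "smooth", "responsive"}
NEGATIVE = {"laggy", "cheap", "sluggish", "dated"}

def topic_sentiment(reviews):
    """Average per-topic sentiment in [-1, 1], scored sentence by sentence."""
    scores = defaultdict(list)
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review.lower()):
            words = set(re.findall(r"[a-z']+", sentence))
            polarity = len(words & POSITIVE) - len(words & NEGATIVE)
            if polarity == 0:
                continue  # skip neutral or empty sentences
            sign = 1 if polarity > 0 else -1
            # Credit the sentence's sentiment to every topic it mentions.
            for topic, keywords in TOPIC_KEYWORDS.items():
                if words & keywords:
                    scores[topic].append(sign)
    return {topic: sum(v) / len(v) for topic, v in scores.items()}

# Hypothetical reviews, not drawn from any real website.
reviews = [
    "The touchscreen is laggy. Steering feels responsive and smooth.",
    "Love the interior design. The infotainment system is great.",
]
result = topic_sentiment(reviews)
```

Here the two technology mentions cancel out to a neutral score, while driving dynamics and styling both score fully positive. Tracking scores like these by model and by quarter produces a time series that can be compared with changes in sales.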

We found that sentiment around technology, driving dynamics and styling was an important harbinger of success for a model. And by monitoring real-time sentiment, we found that consumer attitudes toward the new model were declining at the same time as quality concerns around other Subaru models began to materialize. This led us to initiate calls with the company’s management to investigate the issues further.

More broadly, the analysis has allowed us to monitor consumer sentiment toward new vehicles, which supports more accurate forecasts that don’t rely on corporate reports. Independently generated insight also creates opportunities to engage with management in deeper conversations about their products and financial trends.

Don’t Dismiss the Human Touch

In these cases, getting big data to work involves getting the structure and culture of an organization right. We think the data analysis function must be integrated closely with investment teams and industry analysts. And given the costs involved, it’s important to choose the right projects, where data analysis can make the biggest impact.

Investment firms that address these challenges will have a greater chance of success at leveraging the benefits of data science to deliver portfolio returns for clients, in our view. There are no easy answers. But one guiding principle is clear: even when applying the most sophisticated artificial intelligence systems, adding a human touch to the equity research process is the best way to turn big data into big investment insights.

Nelson Yu is Head of Quantitative Research at AllianceBernstein (AB).

Chris Hogbin is Chief Operating Officer—Equities at AllianceBernstein (AB).

The views expressed herein do not constitute research, investment advice or trade recommendations, do not necessarily represent the views of all AB portfolio-management teams, and are subject to revision over time.
