In the unprecedented environment of 2020, alternative data finally arrived. With its wide variety of datasets and rapid cadence, alt data proved invaluable for many newcomers trying to make sense of market conditions the likes of which they had never encountered before.
At least twenty years in the making and often shrouded in secrecy, the sector has finally come of age, assisted by developments in both technology and demand. Market participants agree that by 2030 the alternative data sector will be exponentially larger than today, with one regularly referenced research note seeing the market set for 10x growth by 2027. It may even, by that point, have crossed the final maturity threshold to be known as merely ‘data’.
What is Alternative Data?
Defining alternative data has always been a challenge. The hive mind has decided that alternative data is ‘anything that is not traditional data’, with the latter meaning the kinds of corporate and government financial and market data on which investment decisions have been based for decades.
While the phrase alternative data is generally used in financial markets, a corporate buyer might refer to the exact same information as ‘external data’. These often novel data sources can be viewed as a means for an outside party to get beyond the official information a company wants you to see, and to glimpse the underlying streams being analyzed in its boardroom.
How it all began
It is hard to pinpoint when alternative data was born. Some have gone back as far as the Babylonian custom of measuring the Euphrates as intelligence ahead of business investments; or to Venetian traders studying the flags of incoming ships to get advance knowledge of what produce they might unleash on the domestic market. More recently, fans of ’80s cinema might remember the market value afforded by an early crop report in “Trading Places”.
I see three major step-changes. The soft start was at the turn of the millennium, when some of today’s longer-toothed major players first began producing data for the market; the second stage began in 2015, when a Wall Street Journal article first thrust the shadowy sector into the limelight and anointed ‘alternative data’ with its given name; and the third and current act began in 2020, when the pandemic rendered countless models rudderless in alien new market conditions, and the call for high frequency alt data became a clamour.
Alt data: the universe
The types of alternative data are innumerable and growing, but some major segments attract the most activity and attention, while countless smaller players contribute to its diversity. As alt data usually aims to reveal economic behavior close to real time, aggregated credit card transactions have long been a powerful source of insight, as an almost instant map of customer behavior, be it around a particular brand, location or time of day. Geo-location data from mobile phones can be an indirect route to the same answers and more. Satellites and alternative data have long gone hand in hand, with hedge funds using imagery from space to count cars in parking lots, or get an early jump on the health of crops. New imaging technology and the increasing availability of data can reward the innovative practitioner with many creative use cases.
One of the earliest but still highly prevalent forms is web-scraping: capturing publicly available information online and turning it into functional data. Scraping product prices on a retail website and seeing how they change, for example, can give an investor or a competitor insights into the finances and market strategies of both the retailer and product creators. Equally, tracking a company’s job postings or hires on websites can give strong hints as to what and how that company is doing in a particular segment or geography. Recently some of the hottest alt data has been in the ESG space, where the ability to scrape and parse quantities of obscure local news outlets, or track the diversity of company boards, has powerfully aided investors.
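To make the price-tracking idea concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's method: the HTML markup, the `data-product` and `data-price` attributes, the product names and the prices are all invented, and the two page snapshots are inline strings rather than live downloads. It parses each snapshot with the standard library and reports the percentage price change per product.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects {product: price} from tags carrying the hypothetical
    data-product and data-price attributes (an assumed page layout)."""

    def __init__(self):
        super().__init__()
        self.prices = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-product" in attrs and "data-price" in attrs:
            self.prices[attrs["data-product"]] = float(attrs["data-price"])


def extract_prices(html):
    """Turn one captured page into a {product: price} mapping."""
    parser = PriceParser()
    parser.feed(html)
    return parser.prices


def price_changes(old, new):
    """Percentage change for every product present in both snapshots."""
    return {
        name: round((new[name] - old[name]) / old[name] * 100, 2)
        for name in old
        if name in new and old[name] != 0
    }


# Two invented snapshots of the same page, captured a week apart.
SNAPSHOT_WEEK_1 = """
<ul>
  <li data-product="trainers" data-price="80.00">Trainers</li>
  <li data-product="jacket" data-price="120.00">Jacket</li>
</ul>
"""
SNAPSHOT_WEEK_2 = """
<ul>
  <li data-product="trainers" data-price="72.00">Trainers</li>
  <li data-product="jacket" data-price="120.00">Jacket</li>
</ul>
"""

changes = price_changes(
    extract_prices(SNAPSHOT_WEEK_1),
    extract_prices(SNAPSHOT_WEEK_2),
)
print(changes)
```

In practice the hard parts sit outside this sketch: fetching pages politely and within a site's terms, coping with layout changes, and matching products consistently over time.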
Forces behind the rise
Alternative data has not grown in a vacuum. Many of the factors driving its rise are familiar, such as exponential computing power growth and the Internet of Things, with appliances such as refrigerators becoming ‘smart’ and opening up countless opportunities for relevant data creation.
Simultaneously, processing capabilities were advancing. In the early ’00s, the work of the ‘godfather of AI’, Geoffrey Hinton, opened the door to huge strides in deep learning and neural networks. These in turn unleashed Natural Language Processing, enabling new meaning such as sentiment to be extracted from words scraped from Twitter, or even from the tone of senior management during earnings calls. Perhaps most appositely, this progress has created a general data culture of Kaggle competitions and prompted the use of machine learning to achieve ever higher degrees of predictive accuracy. This increased predictive capacity has in turn unleashed immense value when combined with new data sources and brought to bear on financial markets.
Until now, the major beneficiaries from alternative data have been hedge funds. These early adopters have had the agility and the financial clout to find, ingest and map alternative data, using it to extract astronomical amounts of market value. This was to be expected in a new sector with large up-front costs and marked competitiveness around information-sharing.
But as a sector matures, the barriers to entry begin to fall away. In alternative data we see this in the increasing numbers of companies springing up to facilitate the process of managing and extracting value from alternative data. There are companies that clean data, companies that transport data in bulk, marketplaces to ease the buying and selling, and companies that understand the techniques developed by hedge funds to find the ‘signals’ in the data and are able to aid investors in the process.
Marketplace chatter often centres around the concept of ‘filling the mosaic’. The idea is that combining various forms of alternative data with traditional data in one place could form a single holistic ‘mosaic’ picture of a stock. This is viewed as the holy grail, since the challenges in obtaining, manipulating and aligning diverse data sets are substantial.
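As a rough illustration of what ‘filling the mosaic’ involves mechanically, the Python sketch below joins one traditional series and two alternative series on a shared (ticker, quarter) key. Everything in it is assumed for the example: the ticker `ABC`, the figures, the choice of quarters as the common period, and the rules for rolling each series up (summing card spend, taking the latest job-postings count). Real datasets also need entity mapping and point-in-time handling, which are not shown.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d):
    """Map a calendar date to a label such as '2021Q1'."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


def build_mosaic(reported_revenue, card_spend, job_postings):
    """Join three sources on (ticker, quarter).

    reported_revenue: {(ticker, quarter): revenue}       traditional data
    card_spend:       [(ticker, date, amount), ...]      alt data, summed
    job_postings:     [(ticker, date, open_roles), ...]  alt data, latest kept
    """
    spend_total = defaultdict(float)
    for ticker, day, amount in card_spend:
        spend_total[(ticker, quarter_of(day))] += amount

    # Process postings in date order so the last write per quarter wins.
    latest_roles = {}
    for ticker, day, roles in sorted(job_postings, key=lambda r: r[1]):
        latest_roles[(ticker, quarter_of(day))] = roles

    keys = set(reported_revenue) | set(spend_total) | set(latest_roles)
    return {
        key: {
            "reported_revenue": reported_revenue.get(key),
            "card_spend": spend_total.get(key),
            "open_roles": latest_roles.get(key),
        }
        for key in sorted(keys)
    }


# Invented sample data for a hypothetical ticker.
reported_revenue = {("ABC", "2021Q1"): 1000.0}
card_spend = [
    ("ABC", date(2021, 1, 15), 100.0),
    ("ABC", date(2021, 2, 20), 150.0),
    ("ABC", date(2021, 4, 3), 90.0),
]
job_postings = [
    ("ABC", date(2021, 3, 29), 18),
    ("ABC", date(2021, 1, 4), 12),
]

mosaic = build_mosaic(reported_revenue, card_spend, job_postings)
for key, row in mosaic.items():
    print(key, row)
```

Gaps are left as `None` rather than filled, so it stays visible where a source has not yet reported; the second quarter here has card spend but no reported revenue, which is exactly the window in which alternative data is meant to help.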
Alternative data: the future
Market participants today agree on two seeming certainties for the near future: consolidation and expansion. Currently over 1,000 alternative data providers are attempting to differentiate themselves in a market around $1.7bn in size, with consolidation amongst the existing field appearing inevitable. That said, the constant birth of new datasets will also create a ‘comet’s tail’ that will always keep the picture from simplifying too much.
The imminent, universally predicted explosion in the size of the alt data ecosystem will be driven by a range of players. Private equity is increasingly cognisant of what alt data can provide when sourcing new targets or conducting due diligence. Corporations, often stimulated by consulting firms, see the value in using alt data as intelligence to feed into their own strategies. Governments want ever more granular means of measuring their domestic and relative economic performance, with clunky GDP surveys now seen as old-fashioned.
The wider asset management sector still holds some of the lowest-hanging fruit for alternative data. An AIMA survey in 2021 showed that 50% of large hedge funds are already investing in alternative data. Investment management meanwhile is waking up to the opportunities offered by alternative data, with internal Data Science teams an increasingly regular presence.
Exabel was founded in Oslo in 2016 by Øyvind Grotmol, a brilliant serial winner of international competitions in maths, physics and programming. In fact, those competitions proved a fertile recruiting ground for Exabel’s early team, many of whom still make up the company’s core, and who have now been supplemented by world class talent from the investment management, product and technology domains.
Five years on, the Exabel team have created a flexible and intuitive technology platform to simplify both the provider and consumer sides of the alternative data market. Data vendors use the platform to build and distribute easy-to-consume branded data insight offerings on top of their raw data assets. Investment teams can consume these insights and use Exabel’s powerful machine learning analytics and modelling toolset to further refine, combine and evolve them. Exabel creates the zone in which ‘filling the Alt-Data mosaic’ can become reality.
by Mark Fleming-Williams, Host of The Alternative Data Podcast.