“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay.” – Sherlock Holmes.
Arthur Conan Doyle’s famed detective may have been tackling a crime mystery in “The Adventure of the Copper Beeches” when he proclaimed the above, but today his drive to amass reams of data would be as at home in Canary Wharf as it once was in 221B Baker Street.
While the concept of “big” data (datasets so large or complex that traditional methods cannot process them) has become a familiar one, the investment world’s efforts to identify innovative and exotic sources of returns have given rise to a new theme: “alternative” data.
Broadly speaking, anything that falls outside typical sources of investment information, such as financial reports and economic indices, can be classified as alternative data. This can range from now well-known indicators that attempt to glean market sentiment from Twitter babble or Google searches, to sophisticated algorithms that automatically listen in on and analyze earnings calls, or even scrutinize high-resolution satellite imagery of retailers’ parking lots.
As the number of alternative data sources grows, so does the number of companies promising to provide new and profitable information. Start-ups like RavenPack, which provides a variety of alternative data services, have grown in prominence, while established data providers, including Bloomberg and Thomson Reuters, have developed their own proprietary sentiment data offerings.
Even companies from outside the industry are getting in on the act, with some discovering that they are sitting on a trove of user data that can be shared, at a price of course. For example, the social networking app Foursquare now enables investors to analyze where users have been “checking in,” potentially uncovering trends in business activity. Paired with the growth of the so-called “Internet of Things,” through which Silicon Valley envisages the complete inter-networking of our lives, investors willing to pay for the privilege could soon have formidable access to information relating to every aspect of how we live.
Bringing together these myriad sources of alternative data are new data aggregators, which are attempting to disrupt an industry dominated by big players such as Bloomberg. One such company is Quandl, a data provider which has amassed over 200,000 users since it was founded in 2011 and has now received over $20 million in venture capital funding.
Although the growth in alternative vendors is providing investors with cheaper access to important data, the choice presents a headache; much like media-streaming companies Netflix or Amazon Prime, each data provider promises exclusive access to certain content, leaving subscribers with the choice of either missing out on potentially valuable information, or paying for a growing set of data providers.
“Big data is not about the data” – Gary King, Harvard University
Sourcing alternative data is one thing, but turning it into effective investment signals is quite another. In some cases, the sheer scale of the information can be too much for traditional desktop software. Innovative data storage solutions, from the likes of Apache Hadoop and Microsoft’s Azure cloud computing offering, have stepped in to fill the gap.
The need to decipher the cacophony of modern data also brings in the other major tech theme of the day: artificial intelligence (AI). Specifically, machine learning, the subfield of AI that specializes in autonomously learning complex relationships in data, can be required to analyze alternative data in order to produce investment signals. This is an area of active research within Aberdeen.
For several years we have partnered with the Scottish Financial Risk Academy to provide industrial placements to Quantitative Finance MSc students undertaking their dissertations. Last year we worked with students from the University of Edinburgh and Heriot-Watt University to investigate whether using alternative data sources could enhance investment decisions.
"The study demonstrated that using this alternative data enhanced the performance of various investment strategies”
For this project we used two different data sources, which automatically read through thousands of news articles from around the world (“data-scraping” in technical jargon), applying the machine learning field of natural language processing to discern whether a particular article has favorable or unfavorable sentiment towards certain asset classes. The study demonstrated that using this alternative data, which gives investors a real-time insight into the ebb and flow of public sentiment, enhanced the performance of various investment strategies forecasting regional equity markets and trading crude oil contracts.
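To make the idea concrete, the following is a minimal sketch of how article-level sentiment might be turned into a simple trading signal. It is not the method used in the study, nor that of any data provider: the word lists, the `article_sentiment` and `asset_signals` functions, the asset labels and the 0.2 threshold are all hypothetical, and a real system would rely on trained natural language processing models rather than a hand-picked lexicon.

```python
from collections import defaultdict
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# production systems use trained NLP models instead of word lists.
POSITIVE = {"surge", "rally", "gain", "growth", "beat", "strong"}
NEGATIVE = {"slump", "fall", "loss", "cut", "weak", "glut"}


def article_sentiment(text):
    """Score one article in [-1, 1]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def asset_signals(articles, threshold=0.2):
    """Average article scores per asset, then map the mean to
    +1 (favorable), -1 (unfavorable) or 0 (no clear view)."""
    scores = defaultdict(list)
    for asset, text in articles:
        scores[asset].append(article_sentiment(text))
    signals = {}
    for asset, vals in scores.items():
        mean = sum(vals) / len(vals)
        signals[asset] = 1 if mean > threshold else -1 if mean < -threshold else 0
    return signals


# Made-up headlines, tagged with the asset class they refer to.
articles = [
    ("crude_oil", "Oil prices slump as supply glut persists"),
    ("crude_oil", "Producers cut output after weak demand"),
    ("us_equities", "Stocks rally on strong earnings growth"),
]
signals = asset_signals(articles)
print(signals)
```

On these invented headlines the sketch flags crude oil as unfavorable and equities as favorable. The threshold leaves a neutral band so that mixed or sparse coverage produces no signal at all, which is one simple way to avoid trading on noise.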
While providers scramble to sell alternative data, some investors may remain skeptical; the bar is high to prove that expensive new data sources provide sufficient extra information and alpha over traditional analyses to justify their price. The proliferation of alternative data sources also raises questions around market efficiency, as well as whether any market advantage will be eroded as the information becomes more widely accessible.
The growing debate around data privacy could also pose a challenge to those wishing to monetize their customers’ activity. Some consumers may be more wary of smartphone apps’ terms and conditions when they realize their location data is being sold on to Wall Street. Nevertheless, financial analysts now have unprecedented access to an abundance of sophisticated information, and the key question today is whether we will drown in a sea of bytes or manage to find the signal in the noise.
Companies mentioned are for illustrative purposes only and are not intended to be a recommendation to buy or sell any security.
Image credit: Hazel Stevenson / Alamy Stock Photo