“The race is not always to the swift nor the battle to the strong, but that’s the way to bet” is attributed to Damon Runyon, who wrote about Broadway in the 1940s. Some investors use this as a guiding principle for predicting winners and losers. Notwithstanding the advice on post-Depression Broadway, we created a list of Big Data players based on a simplistic set of criteria summarized in five words:
Business value was a basic requirement. We also chose not to consider the “MISO” (Microsoft, IBM, SAP, Oracle) and equivalent large players such as Cisco, HP and EMC because they run multi-faceted businesses. The Sand Hill 50 represents a unique set of players of all shapes, sizes and hues that stood out from the crowd. Here they are in alphabetical order.
1) Actian — Business-oriented data management solutions to transact, analyze and take automated action across business operations. They have successfully incorporated technologies such as Ingres, Pervasive and ParAccel. 10,000 paying customers are a major asset.
2) Actifio — Infrastructure player with a compelling ROI value proposition of minimizing copies of data — a key hygiene factor in managing Big Data in the enterprise. They have a lot of momentum with a potential IPO in 2014.
3) Aerospike — Real-time Big Data analytics with a hybrid approach. They promise the speed of an In-Memory database with the persistence of rotational drives. They are classified as the only “visionary” in the Gartner Magic Quadrant for operational database management systems.
4) Alpine Data Labs — Predictive analytics platform using Hadoop. Targeted at customers that have taken the first step with Hadoop and want to deploy advanced analytics solutions. They have several banking customers including Barclays. Other customers include Sony, Nike and Kaiser Permanente.
5) Alteryx — SAS alternative for statistical analysis applications such as marketing analytics, with an advanced visualization story based on the R statistical programming language. Their success will depend on how well they execute on the consumer-friendly promise with traditional users. Customers include Paychex, Kroger, Michaels and Equifax.
6) Appfluent — Addresses an immediate practical requirement to manage the coexistence of Hadoop in the traditional IT environment. They promise to reduce waste by analyzing business activity and data usage across traditional data warehouses and identify data that can be offloaded to Hadoop. Customers include Pfizer and Union Bank of California.
7) Attivio — Advanced content analytics across data silos with a few twists such as intelligent correlation. This is a variation of Endeca (acquired by Oracle) with a technical value proposition and an engineering-centric DNA from MathWorks and Ab Initio.
8) Ayasdi — Machine learning with high-end visualization of complex data sets based on topological data analysis. Partnering with Texas Medical Center and Lawrence Livermore National Laboratory. Customers include UCSF, Merck and GE.
9) C3global — Scotland-based provider of predictive operational analytics for manufacturing, energy and utilities, with a measurable ROI value proposition. Customers include Chevron, National Grid (UK) and SA Water (Australia).
10) ClearStory — High-speed data analytics and visualization using In-Memory database technology and the Apache Spark clustering system. Google pedigree from the designers of Google Analytics and Google AdWords. Customers include the Dannon Company, Kantar Media and DataSift (see below).
11) Cloudera — Market leader that was a pioneer in 2009 with the Hadoop platform and founders from Google, Yahoo, Facebook and Oracle. They have parlayed their pioneer status to become an influential member of the Big Data ecosystem.
12) DataKind — Outstanding story of a non-profit of data scientists working for social change. They bring high-end skills to disenfranchised communities and social organizations and tackle complex problems such as natural disasters and crime using data analytics.
13) Datameer — Brings Big Data technologies to business users familiar with using spreadsheets for analyzing and presenting data for traditional BI solutions. Extensive list of customers includes Sears, Workday and Visa.
14) DataSift — Leading data aggregator and reseller for Twitter and other social media sources. Based in the UK. Major player in the emerging data ecosystem around Twitter. Prominent customers include Dell, Yum Brands and CBS Interactive.
15) DataStax — Ecosystem player and commercial vendor for enterprise-ready Cassandra, Apache Hadoop and Apache Solr. Rapid adoption in the last two years has led to 300 customers including Adobe, eBay, Thomson Reuters and Netflix as well as 20 of the Fortune 100.
16) Elasticsearch — Open search alternative to Solr that combines search and analytics, with over two million downloads and widespread adoption by enterprises. Company provides enterprise-grade support, consulting and training. They have success stories with customers such as McGraw Hill, Klout and Foursquare.
17) Gnip — Ecosystem player for data aggregation from social media sources including Twitter, Klout, Tumblr and WordPress. Their customers include IBM, Adobe, Pivotal, Salesforce and 95 percent of the Fortune 500.
18) GoodData — Solution to integrate data from standard data sources such as Salesforce and create visualizations and dashboards. Their customer base of 20,000 includes Target, Time Warner Cable and GitHub.
19) Guavus — Analytics solution focused on telecommunication companies and network providers, both of which have large volumes of data. Their customers include industry leaders in these areas.
20) Hadapt — Analytic platform to natively integrate SQL with Apache Hadoop enabling easier querying of large data sets by mainstream users. Use cases cited by the company are in the areas of advertising, security and electronic discovery.
21) Hazelcast — Open source In-Memory data grid with over 10,000 deployments. They address a key data management problem in analytics by distributing the data in a grid. Their customer examples are in the areas of financial trading and massively multiplayer gaming. Also targeted at use cases that require “burst” capacity.
22) Hortonworks — Commercial Hadoop platform leader with a large number of code committers for Hadoop and extensive partnerships in the Big Data ecosystem. Diverse customer base includes Cardinal Health, Western Digital, eBay and Samsung.
23) Jaspersoft — Open source BI suite with 14,000 commercial customers and a large number of partners. Customers include Alcatel-Lucent, McKesson and Puma.
24) Kaggle — Creates predictive analytics competitions for the data scientist community to solve. Real-world problems solved in the areas of financial services, healthcare, energy and retail. Results delivered to GE, Allstate, NASA, Tesco and Merck.
25) Karmasphere — Collaborative analytics workspace that brings data science to business analysts using SQL. Customers include Playfish and XGraph.
26) Kontagent — Mobile analytic solution for app developers, marketers and producers with 250 million monthly active users. Announced on December 11 that they are merging with PlayHaven. Customers include Electronic Arts, eHarmony, Kaiser Permanente and Turner Broadcasting.
27) LucidWorks — Search, discovery and analytics solution based on Apache Lucene/Solr. Customers include Sears, ADP and Raytheon.
28) MapR — Big Data platform based on Hadoop and NoSQL. Their customers come from financial services, retail, media, healthcare and manufacturing as well as Fortune 100 companies. Customers include Cisco, Xactly, Cision and Rubicon.
29) MarkLogic — Schema-agnostic enterprise NoSQL database technology, coupled with powerful search and flexible application services. Their clients include Warner Brothers, Dow Jones, Citigroup and Boeing.
30) MongoDB — NoSQL database solution with four million downloads and 600 customers. Their customers include MetLife, Forbes, Cisco and Foursquare.
31) Mu Sigma — Consultants providing analytics services to 75 Fortune 500 companies in the areas of marketing, risk and supply-chain management. They have customer case studies from companies in pharmaceuticals, retail, insurance and banking.
32) Neo Technology — Services based on the Neo4j graph database that has a large ecosystem of partners and extensive deployments worldwide. Neo4j has been implemented in Adobe, Cisco and Deutsche Telekom.
33) NGData — Consumer intelligence solutions based on structured and unstructured data with a focus on banking, retail and publishing. Their recommendation engine is based on real-time analysis of customer behavior and integrates with ecosystem players such as SAP, SAS and Tableau.
34) Opera Solutions — Consulting leader in predictive analytics using Big Data. They partner with Oracle, QlikView and SAP. They have success stories in a large number of verticals including consumer finance, insurance and healthcare.
35) Oxdata — Statistical analysis software that works with HDFS targeted at the non-statistician. The founding team comes from DataStax and Platfora.
36) Palantir — Analytics solutions with a focus on large-scale problems for the public sector such as Medicare fraud, environmental impact of oil spills and gang violence. Reported to have raised $605 million in financing in the last five years.
37) ParStream — Columnar database for real-time Big Data analytics. They have use cases in the areas of search and selection, business analytics, and automatic response systems. They have customers in telecommunications, financial services and marketing.
38) Pentaho — Suite of applications for data access, visualization, integration, analysis and mining with 10,000 deployments in 185 countries. Their prominent customers include Lufthansa, Telefonica and Marketo.
39) Pivotal — Big Data and cloud application platform formed in 2013 from EMC/VMware/Greenplum with established technology products and customer base.
40) Platfora — Big Data analytics platform for analyzing business data across events, actions, behaviors and time. Their customers include Disney, Shopify and Edmunds.com.
41) PROS — Predictive analytics solutions for sales, pricing and revenue management. Their targeted areas include travel, distribution, manufacturing and services. Customers include Lufthansa, Cummins and Navistar.
42) Qubole — Cloud data platform that hides the complexity of infrastructure management. Founded by former Facebook data service team members. Customers include Pinterest, Nextdoor and Quora.
43) Revolution Analytics — Commercial support for users of “R” language for statistical analysis. Extensive customer base includes American Express, Kraft Foods and Merck.
44) Rocket Fuel — Media buying platform for advertisers using advanced analytics. Diverse customer base includes BMW, Comcast and Pizza Hut.
45) SISense — Analytics platform with a focus on scalability and visualization using a columnar database and HTML5 technologies. Their customers include Caterpillar, Philips and Target.
46) Skytree — Advanced analytics using machine learning implemented using a distributed architecture. Customers include SETI Institute, eHarmony and US Golf Association.
47) Splunk — Operational intelligence software for analyzing machine data, used by 6,400 enterprises globally including half of the Fortune 100. Case studies include Tesco.com, SurveyMonkey and NPR.
48) Tableau Software — Visualization solution for analytics with extensive partnerships in the BI ecosystem. They have customers in a wide range of industries.
49) The Hive — Co-creator and accelerator for businesses that use large volumes of data for intelligent decision-making. The Hive regularly hosts events featuring thought leaders in the application of Big Data technologies.
50) WibiData — Platform that allows companies to create a site enabled by advanced analytics that fine-tunes itself based on user interaction. Customers include Wikipedia, Rich Relevance, Opower and Atlassian.
Shirish Netke is president and CEO of Amberoon Inc.
M.R. Rangaswami is co-founder and CEO of Sand Hill Group and publisher of SandHill.com.