Premise
Big data analytics (BDA) is the heart of the digital business, the basis for turning data into business value that drives differentiating operations and customer experiences. Clouds have become the principal development platforms for BDA apps in this new world of 24×7 always-on operations. In large enterprises, more BDA now runs in hybrid clouds, while public clouds have become the predominant BDA platform for many small to mid-sized businesses.
With George Gilbert and Ralph Finos
In the 21st century, the enterprises that succeed are those that converge their big data analytics (BDA) investments into data-driven transactions, operational business intelligence, predictive analytics, machine learning (ML), deep learning (DL), artificial intelligence (AI), stream computing, and other big data-fueled capabilities in cloud-native environments. Consequently, BDA has become a boardroom conversation, although too often the question being discussed is “why aren’t we generating the expected returns from BDA?”
The tech industry is making progress on answering that question. User convergence on full-stack BDA architectures is a key that will unlock digital business success. However, technology alone is insufficient for driving analytic applications unless organizations accelerate digital transformation toward structures and processes that leverage BDA’s disruptive potential. Likewise, organizations will need to step up the empowerment of their digital workforces by investing in people, roles, skills, organizational centers of excellence, and other pillars of data-driven culture. At the very least, more business analysts and other knowledge workers will need to shift toward more of a development role in the ML, DL, and AI initiatives that drive disruptive business applications.
The worldwide BDA market grew 24.5% in 2017 over 2016, faster than we forecast in last year’s report, as a result of better-than-expected public cloud deployment and utilization as well as progress in tool convergence. Enterprises are moving more rapidly out of the experimentation and proof-of-concept phases and toward achieving higher levels of business value from their big data deployments.
Looking forward, the overall BDA market will grow at an 11% compound annual growth rate (CAGR) to $103B by 2027 (Figure 1). Edge computing, including streaming and ML app deployments on smart devices, will boost the market in the out years.
Figure 1: Worldwide Big Data Hardware, Software, and Services Revenue $B 2016-2027
Trends Overview
Going forward, the following digital business trends will accelerate solution convergence up and down the stack:
- Achieving strategic outcomes. Customers will assure strategic business outcomes by converging on strategic BDA solution providers who deliver pre-built applications that incorporate best practices and are rapidly customizable to their unique requirements.
- Reducing latencies. Customers will increasingly eliminate delays and bottlenecks throughout data, application, and business infrastructures by converging on internal pipelines that incorporate fewer discrete BDA products.
- Streamlining processes. Customers will reduce complexities by converging on fewer, simpler, more consistent, and more automated BDA development and operations processes that span disparate pipelines, platforms, tasks, and roles.
- Tightening controls and safeguards. Customers will tighten oversight and compliance by converging on unified governance tools that enforce security, policy, and other guardrails up and down the BDA stack and across disparate pipelines, platforms, and tasks.
In Wikibon’s latest update to its annual BDA market study, the following challenges remain the same as in previous years:
- Excessive complexity. BDA environments and applications are still too complex and need to be simplified to put them within reach of mainstream users and developers, many of whom lack in-house IT staff with the requisite specialized skills.
- Cumbersome overhead. BDA administration and governance processes are still too siloed, costly, and inefficient for many IT professionals.
- Protracted pipelines. BDA application development, training, and deployment pipelines are still too time-consuming, manual, and inconsistent, and need to be automated, accelerated, and streamlined to a greater degree.
- Bespoke applications. BDA professional services are still essential for developing, deploying, and managing the many bespoke applications that span across hybrid clouds, involve disparate platforms and tools, and incorporate unfathomably complex data processes.
Wikibon’s latest BDA market study found that several trends are driving the industry’s competitive landscape. See Figure 2 for a breakout by year through 2030.
The trends are:
- Public cloud providers are expanding their sway over the BDA market. By 2020, the BDA industry will have converged around three principal public cloud providers—most likely, AWS, Microsoft Azure, and Google Cloud Platform—and most ISVs will build solutions that operate in all of them. These and other BDA public cloud providers—including such established BDA vendors as IBM and Oracle—offer managed IaaS and PaaS data lakes into which customers and partners are encouraged to develop new applications and into which they’re migrating legacy applications. As a consequence, the pure data platform/NoSQL vendors seem to be flatlining and becoming marginalized in a BDA space increasingly dominated by diversified public cloud providers. Likewise, the old-line business analytics vendors are losing momentum and falling by the wayside as the market migrates to cloud-based solutions from diversified BDA providers for these capabilities.
- Public cloud advantages over private clouds for BDA continue to widen. By 2021, public clouds will be the preferred BDA approach for every customer segment, including large enterprises. That’s because BDA public cloud solutions are maturing more rapidly than on-premises stacks, adding richer functionality, with increasingly competitive cost of ownership. Unlike on-premises alternatives, public cloud providers can offer BDA databases (e.g., Google’s Spanner, Microsoft’s Cosmos DB, and AWS’s DynamoDB) that provide global consistency and availability, owing to the proprietary fiber networks linking their data centers. Likewise, public cloud services can collect far more operational telemetry than products designed for on-premises deployment, thereby enabling public clouds to use ML to automate IT operations and application performance management to a greater degree than is possible with on-premises alternatives. Furthermore, public clouds are growing their application programming interface (API) ecosystems, deepening their functionality, and enhancing their administrative tools faster than what is emerging from the traditional BDA solutions designed for on-premises deployments.
- Hybrid clouds are becoming an intermediate stop for enterprise BDA on the way to more complete deployment in public clouds. By 2022, hybrid clouds will figure into the plans of most large enterprises, but predominantly as a makeshift strategy. The balance is tipping toward enterprises putting more of their BDA assets in public clouds, thereby boosting the importance of hybrid public/private BDA clouds as transitional architectures. Traditional BDA vendors are developing their products for hybrid use cases, creating new options for traditional workloads. By the same token, more traditionally premises-based BDA platforms are being rearchitected to deploy primarily in public clouds. For example, there are several partially “cloudified” versions of Hadoop on AWS from established distribution vendors, as well as alternatives both from that public cloud provider (i.e., Elastic MapReduce) and from other BDA solution providers such as Qubole, Snowflake, and Teradata.
- Cloud-based BDA silo convergence is speeding enterprise time-to-value. By 2023, most large enterprise customers will have converged their siloed BDA assets into at least one strategic public cloud. The growing dominance of public cloud providers is collapsing the cross-business silos that have heretofore afflicted enterprises’ private BDA architectures. Just as important, BDA solutions, both cloud-based and on-premises, are converging into integrated offerings geared to reducing complexity and accelerating time to value. More solution providers are providing standardized APIs for simplifying access, accelerating development, and enabling more comprehensive administration throughout their BDA solution stacks. In the public cloud, API standardization enables solution providers to hide the complexities of their underlying BDA solutions underneath abstraction layers that simplify access to heterogeneous collections of open-source and proprietary components.
- Innovative BDA startups are bringing increasingly sophisticated cloud-facing solutions to market. By 2024, the dominant BDA application providers will be those that have disrupted the competitive landscape with AI-based solutions of astonishing sophistication. The threat from new market entrants is accelerating in every BDA niche, with most of the innovations being designed for public or hybrid cloud deployments. Many new analytic and application databases, stream processing, and data science startups have entered the BDA arena in the past several years. Many of these new vendors provide solutions that incorporate sophisticated ML/AI and are packaged to address disruptive business applications. Nevertheless, the leading BDA solution providers have risen to the competitive challenge and are rapidly converging capabilities across their established solution portfolios. Likewise, the public cloud providers are expanding their ability to serve customers’ BDA requirements in Internet of Things (IoT), edge, streaming, serverless, and containerized application environments. Just as important, public cloud providers are expanding their BDA SaaS offerings by providing a growing range of self-service, prepackaged public-cloud services for diverse industry and line-of-business applications.
- Disruptive BDA approaches are becoming viable alternatives to established platforms. By 2025, a new generation of “unicorn” BDA powerhouse solution providers will have emerged from startup status on the strength of a new technical approach of unprecedented flexibility and sophistication—probably based on some blend of IoT, blockchain, and stream computing. The threat of substitute products and services in the BDA market is coming from startups contending with incumbents, but also from incumbents who are themselves shaking up the old order by introducing a steady stream of innovative solutions that address established and emerging business requirements in new ways using new technological approaches. AI has become the predominant paradigm, with “cognitive computing” rarely invoked in this context. TensorFlow is now the principal open-source DL tool, supplementing Kafka, Spark, R, and Hadoop in more big data analytics deployments. DevOps for data science is becoming the core technical approach for operationalized data apps that leverage the cloud. Development is increasingly being automated, in terms of the end-to-end ML pipelines. Edge is increasingly a factor in mobile, smart sensor, and robotics applications. Batch-oriented BDA deployments are diminishing in favor of more completely real-time, streaming, low-latency end-to-end environments. New entrants, such as Confluent, are building more end-to-end streaming analytics offerings incorporating Kafka, which is being extended to serve as a durable source of truth for diverse applications and—as an alternative to Hadoop—to play well alongside extract-transform-load (ETL) and Spark workloads executing on NoSQL databases (a minimal sketch of this source-of-truth pattern follows Figure 2).
- Hadoop is becoming just a piece in the BDA puzzle, not the be-all solution for all use cases. By 2026, Hadoop will be thought of as a legacy technology in the BDA arena, though—like RDBMSs, OLAP cubes, and other long-vintage database technologies—it will probably retain its footprint in enterprise BDA architectures for many years to come. Though Hadoop remains fundamental to many enterprises’ BDA initiatives, the market no longer believes that it can serve as the catch-all platform for all analytic and data management applications. Nevertheless, Hadoop distribution providers continue to enhance their offerings by hiding the complex seams that came from stitching together independently developed components; in this regard, MapR anticipated this at the storage layer and has done the best work on that front. In addition, the Hadoop distribution vendors have all evolved their solutions in various directions to address a growing open-source data analytics stack that includes Spark, Kafka, and TensorFlow.
- Users can increasingly mix-and-match multivendor deployments in open BDA ecosystems. By 2027, no BDA solution vendor will go to market with any offering that incorporates proprietary, non-standard, non-open-source componentry at any level, including the hardware substrate. The bargaining power of buyers in the BDA market is growing, due to expanding choice among a growing range of innovative commercial and open-source offerings. Customers are leveraging this open, vibrant, competitive BDA ecosystem to extract differentiated value—in the form of efficiencies, governance, consumability, and other value points—from existing and new BDA suppliers. The suppliers, in turn, are decoupling or bifurcating their layered solutions into mix-and-match combinations of infrastructure, apps, pipeline managers, and other BDA enablers in order to grow their footprint in ecosystems in which full-stack vendor lock-in is a thing of the past.
- Databases are being deconstructed and reassembled into new approaches to address emerging requirements. By 2028, the database as we used to know it will be ancient history, from an architectural standpoint, in a world where streaming, in-memory, serverless infrastructures reign supreme. The BDA market continues to evolve through the rethinking of traditional database architectures to address new analytic use cases. Vendors are innovating by exploring new ways to re-architect the core database capabilities to address emerging requirements such as end-to-end integrated AI DevOps pipelines and edge-facing IoT analytics. In this evolution, analytic and application databases are converging as more high-performance transactional analytics and atomic/consistent/isolated/durable (ACID) transaction capabilities are integrated into data platforms of all types. Also, the database storage engine is becoming a repository primarily for machine data that is addressable through alternate structures such as key-value indices and object schema. The database query engine is evolving beyond Structured Query Language (SQL) optimization into tighter integration with Apache Spark and other open-source code in order to support scalable training and inferencing of ML models. In addition, the database system catalog is evolving into a platform for managing data relationships across diverse storage engines within BDA architectures in support of operationalized data curation, lineage tracking, and policy administration by teams of data stewards, data scientists, and business analysts.
- Data science toolchains are increasingly automating the end-to-end DevOps pipeline for BDA applications. By 2029, every BDA solution provider will provide tools that automate the vast majority of coding, modeling, training, and other tasks involved in building, optimizing, and deploying a never-ending stream of AI/DL/ML models in production environments. BDA-augmented programming will continue to grow in sophistication. Developers have access to a growing range of data-science pipeline tools for automating various phases and processes of ML/DL/AI development, deployment, and operations. Available solutions (from vendors such as Alteryx, AWS, DataRobot, Domino Data Lab, and ParallelM) automatically sort through a wide range of alternatives relevant to model development, optimization, deployment, and operationalization tasks. Available automation solutions vary widely in the scope of pipeline phases and processes they address. The more comprehensive solutions automate modeling, training, and refinement, as well as the front-end processes of data discovery, exploration, and preparation and the back-end and ongoing processes of model deployment, operationalization, and governance. A growing range of these solutions leverage specialized ML algorithms to drive automated development, optimization, and deployment of these models.
- BDA packaged applications are becoming more widely available. By 2030, every customer—from large enterprises down to mom-and-pop shops—will acquire BDA solutions as pre-built, pre-trained, templatized cloud offerings that continuously and automatically adapt and tune themselves to deliver desired business outcomes. Packaged business applications are driving mainstream adoption of BDA solutions, just as they did for transactional business applications. In that regard, a growing range of packaged BDA applications are coming to market, leveraging AI, DL, and ML as their core value-add. Many of these applications rely on the solution providers incorporating pre-trained models that, to varying degrees, the customers can tweak and extend to their own bespoke needs. This focus on model training as a key application evolution/maintenance task is driving solution providers to align their solutions with enterprise customers’ DevOps processes for continuous application-release pipelines. Currently, only cloud vendors can offer developer-ready APIs on models they’ve trained with data from their consumer services. Even if data engineers need to customize the models with additional, proprietary data, these services are accessible to orders of magnitude more developers than the plethora of data science tools that start with low-level frameworks or algorithms.
Figure 2: Chief BDA market trends through 2030
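To make the “durable source of truth” point in the trends above concrete, below is a minimal sketch (not drawn from any vendor’s product) of the pattern: a downstream analytic consumer replays a Kafka topic from the earliest offset to rebuild its state, rather than querying a separate batch store. It assumes a broker at localhost:9092 and uses the open-source kafka-python client; the orders topic, event fields, and running aggregate are hypothetical.

```python
# Minimal sketch: treating a Kafka topic as a replayable source of truth
# for a downstream analytic view. Broker address, topic name, and the
# event schema are hypothetical illustrations.
import json
from collections import defaultdict

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "orders",                               # hypothetical event topic
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",           # replay the full log to rebuild state
    enable_auto_commit=False,
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    consumer_timeout_ms=5000,               # stop once the log is exhausted (demo only)
)

revenue_by_region = defaultdict(float)
for record in consumer:
    event = record.value
    revenue_by_region[event["region"]] += event["amount"]

for region, revenue in sorted(revenue_by_region.items()):
    print(f"{region}: {revenue:,.2f}")
```

Because the log is durable and ordered, the same replay can rebuild this view, feed a Spark or ETL job, or hydrate a NoSQL store, which is what lets the event stream stand in as the system of record.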
Segment Overviews
In this section, Wikibon presents our findings on key trends and forecasts for each of the following BDA solution segments:
- BDA software solutions
- BDA professional services
- BDA hardware
BDA software solutions
The BDA software market grew at 32% in 2017 and will grow over the forecast period (2017-2027) at a 16% CAGR (See Figure 3). This continued strong growth will be a function of enterprises and governments seeking tools and solutions to realize the significant potential of harnessing big data for business value in all areas of their operations. Over time, the market will mature and there will be a more even distribution of software spending across databases, tools, and applications, with big data analytic application software becoming increasingly capable and good enough to support many use cases in the 2020s. (The short sketch below shows how the compound growth rates cited throughout this report are computed.)
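For reference, the compound annual growth rates cited in this report relate a starting and ending revenue figure over a number of years; the minimal sketch below shows the arithmetic, using the analytic-applications segment figures reported later in this section ($1.4B in 2017 growing to $17.8B by 2027) as the sample values.

```python
# Minimal sketch: the compound annual growth rate (CAGR) arithmetic used
# throughout this report. Sample values are the analytic-applications
# segment revenues cited later in this section.

def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate over the given period."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# $1.4B in 2017 growing to $17.8B by 2027 implies roughly a 29% CAGR.
rate = cagr(1.4, 17.8, years=2027 - 2017)
print(f"implied CAGR: {rate:.1%}")
```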
In updating our annual survey of BDA solution providers, Wikibon focused on solution providers and their ecosystems of partners who offer leading BDA functionality in the following key segments:
- Analytic applications. This segment includes solutions for data query, reporting, business analytics, dashboarding, cataloguing, discovery, visualization, and exploration, as well as packaged solutions that incorporate deep domain content for particular industry or line-of-business applications. Some providers of broader BDA solutions who also offer analytic apps include Alation, AWS, Datameer, IBM, Microsoft, Micro Focus, Oracle, SAP, SAS Institute, Teradata, and Zoomdata.
- Data science pipelines. This segment includes solutions for ML, DL, AI, data modeling, data preparation, data mining, predictive analytics, and text analytics tools and platforms. The chief providers of data science pipeline solutions include AWS, Cloudera, Databricks, Google, Hortonworks, IBM, Microsoft, Oracle, ParallelM, Qubole, SAP, SAS Institute, and Teradata.
- Stream processing. This segment includes solutions for real-time, streaming, low-latency data acquisition, movement, ingest, processing, analytics, query, and other approaches for managing data in motion. The chief providers of stream processing include Attunity, AWS, Confluent, Cloudera, Data Artisans, Databricks, Google, Hortonworks, HPE, IBM, Iguazio, Informatica, MapR, MemSQL, Micro Focus, Microsoft, Oracle, Qubole, SAP, SAS Institute, Snowflake, Splice Machine, Splunk, Syncsort, Teradata, and Zoomdata.
- Application infrastructure. This segment includes solutions for data integration, transformation, augmentation, governance, and movement in BDA architectures. The chief providers of application infrastructure include Attunity, AWS, BMC, Confluent, Data Artisans, Google, Hortonworks, IBM, Iguazio, Informatica, Microsoft, Oracle, SAP, Syncsort, and Talend.
- Analytic and application databases. Analytic databases include any of several data platforms (relational, OLAP, in-memory, Hadoop, NoSQL, file systems, etc.) for storing, processing, and managing data for delivering actionable insights. The chief providers of analytic databases include Actian, AWS, Confluent, Cloudera, Google, Hortonworks, HPE, IBM, MapR, MemSQL, Micro Focus, Microsoft, MongoDB, Oracle, Qubole, Redis Labs, SAP, Snowflake, Splunk, and Teradata. Application databases include relational and other data platforms designed for managing transactional and other non-analytic applications. The chief providers of these include AWS, Google, IBM, Microsoft, Oracle, and SAP.
Figure 3: Worldwide Big Data Software Revenue $B 2016-2027
Analytic Applications
BDA analytic application solution revenues were $1.4B in 2017 and will grow to $17.8B by 2027 (29% CAGR), accounting for 39% of big data software revenue by that time (up from 13% in 2017). See Figure 4.
- Old-line business analytics application vendors will lose momentum as the market migrates to cloud-based solutions from diversified BDA solution providers for these capabilities.
- Analytic apps for production reporting, ad-hoc query, and other traditional business analytics will increasingly migrate to BDA cloud environments that combine cloud data warehouses, exploratory data lakes, stream computing backbones, in-memory data stores and application databases for real-time decision support and contextualized transactions.
- The first horizontal AI-infused analytic application to be adopted by most enterprises will be for IT operations management, to address 24×7 automated event monitoring, root-cause diagnostics, and predictive remediation of application, network, and system performance (a minimal sketch of the underlying anomaly-detection building block follows Figure 4).
- Established analytic app vendors will continue to migrate their applications to the principal public clouds, in descending order of priority based on cloud provider market share: AWS, Microsoft Azure, IBM Cloud, and Google Cloud Platform.
- Big data catalogs will become a standard component of every enterprise’s analytic application infrastructure, running across hybrid clouds in support of collaborative data-lake curation by teams of data and business professionals.
- Analytic app vendors will focus on delivering packaged AI, DL, and ML applications for both business and consumer uses.
- Most new analytic application platforms will be architected for continued feature evolution, with a vendor emphasis on development pipelines that continually refresh the models, metadata, rules, and other artifacts in intelligent apps that have been deployed to the cloud edge.
Figure 4: Big Data Analytic Applications Software Revenue $B – 2017, 2022, 2027
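As a concrete illustration of the AI-infused IT operations management use case noted above, here is a minimal sketch of the anomaly-detection building block such applications rely on; the synthetic latency telemetry, injected incidents, and IsolationForest settings are illustrative assumptions, not any vendor’s implementation.

```python
# Minimal sketch: flagging anomalous service latency in operational telemetry,
# the kind of building block behind AI-driven IT operations monitoring.
# The synthetic data and settings are illustrative only.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
latency_ms = rng.normal(loc=120, scale=15, size=(1000, 1))          # normal telemetry
latency_ms[::100] += rng.normal(loc=400, scale=50, size=(10, 1))    # injected incidents

detector = IsolationForest(contamination=0.01, random_state=0).fit(latency_ms)
flags = detector.predict(latency_ms)   # -1 marks suspected anomalies
print(f"flagged {int(np.sum(flags == -1))} of {len(flags)} samples for investigation")
```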
Data Science Pipelines
BDA data-science pipeline solution revenues were $0.3B in 2017 and will grow to $2.8B by 2027 (26% CAGR), accounting for 6% of big data software revenue by that time (up from 3% in 2017). See Figure 5.
- Providers of data pipeline tools will deepen the workflow, collaboration, and model governance features of their tools in order to align with the DevOps practices of the application development shops in which more data scientists will work.
- Providers of data pipeline tools will differentiate through their incorporation of innovative solutions for feature engineering, algorithm selection, model training, and other key pipeline tasks in the development and operationalization of ML, DL, and other AI assets (a minimal sketch of this kind of pipeline-stage automation follows Figure 5).
- The BDA solution providers that succeed in the data pipeline segment will be those with diverse solution accelerator templates, API-ready model libraries, training data, and deep consulting expertise to accelerate customer time to value on disruptive AI-focused business initiatives.
- The BDA public cloud providers with B2C businesses will have an advantage in the data science pipeline market through their ability to leverage their deep data and ML automation solutions to offer pre-trained ML models for diverse applications.
Figure 5: Big Data Data Science Pipelines Software Revenue $B – 2017, 2022, 2027
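Below is a minimal sketch of the kind of pipeline-stage automation described in the bullets above, using open-source scikit-learn rather than any specific vendor tool: feature preparation, a candidate algorithm, and hyperparameter search are wrapped in a single reproducible object. The dataset and parameter grid are illustrative.

```python
# Minimal sketch: automating feature preparation, algorithm configuration,
# and model selection in one reproducible pipeline. Dataset and parameter
# grid are illustrative, not drawn from any vendor tool.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    ("scale", StandardScaler()),               # feature preparation
    ("model", GradientBoostingClassifier()),   # candidate algorithm
])

search = GridSearchCV(                         # automated hyperparameter selection
    pipeline,
    param_grid={"model__n_estimators": [100, 300],
                "model__max_depth": [2, 3]},
    cv=5,
)
search.fit(X_train, y_train)
print("best params:", search.best_params_)
print("holdout accuracy:", search.score(X_test, y_test))
```

Commercial pipeline tools extend this same idea outward to the front-end data preparation and back-end deployment and governance phases described above.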
Stream Computing
BDA stream-computing solution revenues were $0.3B in 2017 and will grow to $1.2B by 2027 (32% CAGR), accounting for 6% of big data software revenue by that time (from 12% in 2017). See Figure 6.
- Streaming low-latency platforms will become the core and primary orientation of most BDA solutions, platforms, and apps, due to the skyrocketing adoption of mobile, embedded, IoT, edge, microservices, and serverless computing.
- Most enterprises and cloud providers will adopt streaming data backplanes as the single source of truth across their BDA applications.
- Convergence of streaming data with batch and request/response processing will become the core approach for ingesting, processing, and delivering all cloud-centric BDA applications (a minimal sketch of this unified approach follows Figure 6).
- Kafka, Flink, and Spark Streaming will become the principal stream computing platforms for BDA.
Figure 6: Big Data Stream Computing Software Revenue $B – 2017, 2022, 2027
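To illustrate the batch/streaming convergence point above, the following is a minimal PySpark Structured Streaming sketch; it assumes a Spark installation with the Kafka connector on the classpath, a broker at localhost:9092, and a hypothetical clicks topic. The same DataFrame operations would run unchanged against a static source read with spark.read, which is the convergence the bullet describes.

```python
# Minimal sketch: the same DataFrame API applied to a streaming source.
# Assumes the spark-sql-kafka connector is available; the broker address
# and "clicks" topic are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.appName("stream-batch-convergence").getOrCreate()

clicks = (spark.readStream                      # swap in spark.read for a batch run
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "clicks")
          .load())

# One-minute windowed event counts, continuously updated as records arrive.
counts = clicks.groupBy(window(col("timestamp"), "1 minute")).count()

query = (counts.writeStream
         .outputMode("complete")
         .format("console")
         .start())
query.awaitTermination()
```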
Application Infrastructure
BDA application infrastructure solution revenues were $2.4B in 2017 and will grow to $7.4B by 2027 (12% CAGR), accounting for 16% of big data software revenue by that time (down from 22% in 2017) as applications take a much larger share of the big data software pie. See Figure 7.
- BDA public cloud providers will roll out abstraction layers that enable simplified development, administration, and access to data integration, governance, transformation, and other application infrastructure capabilities spanning multiclouds.
- More application infrastructure layers will be built for agile deployment of data integration, movement, and workload management across public, private, and hybrid clouds running diverse BDA applications.
- More BDA apps, models, workloads, and data will be moved seamlessly across hybrid and edge clouds to balance resource consumption, maintain service levels, and ensure adaptive resilience for disaster recovery and other scenarios.
- BDA public cloud providers will expand and enhance their application infrastructure stacks in order to support a growing range of performance, compliance, and other stakeholder requirements.
- Cloud-native BDA application infrastructures will shift toward serverless, microservices, edge, IoT, embedded, and mobile architectures.
Figure 7: Big Data Application Infrastructure Software Revenue $B – 2017, 2022, 2027
Analytic and Application Databases
BDA analytic and application database solution revenues were $6.4B in 2017 and will grow to $12.0B by 2027 (6% CAGR), accounting for 26% of big data software revenue by that time (down from 60% in 2017). See Figure 8.
- Analytic and application databases are converging. Stand-alone application and analytic databases will diminish in footprint as enterprises migrate toward next-generation BDA platforms that converge transactional, analytic, and other database use cases in high-performance, scalable, and secure cloud architectures. High-performance transactional analytics and ACID transaction capabilities are being integrated into data platforms of all types.
- Solution providers that build their solutions primarily on Hadoop will decline, as that open-source platform’s role in the BDA ecosystem declines in favor of Kafka, Spark, TensorFlow, and other codebases that more directly address AI/ML/DL, data science, stream computing, and edge analytics apps (a minimal sketch of Spark-based, in-platform model training follows Figure 8).
- Providers that support hybrid deployment of disparate analytic data platforms (including Hadoop, NoSQL, in-memory, streaming, graph, file, object, and document databases) will grow their market share in the enterprise segment for data lake and data fabric solutions.
Figure 8: Big Data Application and Analytic Databases Software Revenue $B – 2017, 2022, 2027
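As a minimal sketch of the Spark-based, in-platform model training referenced in the bullets above (training against data where it already resides rather than exporting it to a separate tool), the following uses PySpark’s MLlib; the Parquet path, feature columns, and 0/1 churn label are hypothetical.

```python
# Minimal sketch: training a model directly against data held in the analytics
# platform. The storage path, feature columns, and "churned" label (assumed 0/1)
# are hypothetical.
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("in-platform-training").getOrCreate()

df = spark.read.parquet("s3://example-bucket/churn_features/")  # hypothetical dataset

assembler = VectorAssembler(
    inputCols=["tenure_months", "monthly_spend", "support_tickets"],  # hypothetical features
    outputCol="features",
)
train = assembler.transform(df).select("features", "churned")

model = LogisticRegression(labelCol="churned").fit(train)
print("training AUC:", model.summary.areaUnderROC)
```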
BDA professional services
Professional services are essential for developing, deploying, and managing increasingly complex BDA applications across hybrid clouds.
The big data professional services market grew at 23% in 2017 and will grow over the forecast period (2017-2027) at a 9% CAGR (See Figure 1). This continued strong growth will be a function of enterprises and governments seeking support in creating valuable solutions for their businesses, because they may have difficulty hiring sufficient numbers of personnel with the skills needed to efficiently and effectively deploy BDA platforms, tools, and applications. BDA software will become capable enough to dampen the need for professional services in the 2020s. However, there will always be new and unplanned-for BDA applications that will best be delivered by professional services firms skilled at creating practices to handle them. See Figure 9:
- Packaging of solution-accelerator BDA offerings for professional services ecosystems. To address the need for targeted BDA solutions, providers are emphasizing their SaaS offerings by diversifying into a wide range of templated industry solutions that they customize for specific clients. Wikibon’s research shows that the more diversified BDA solution providers are increasingly offering tailored packages of domain content, cloud services, licensed software, and optimized hardware bundles for industry-specific and line-of-business offerings. These are templatized solutions that the providers’ professional services ecosystems—both their own personnel and channel partners—can customize for specific client engagements. This trend is consistent with the broader market trends under which analytic application vendors are delivering more packaged AI/DL/ML solutions for specific business and consumer use cases.
- Incorporation of BDA professional services knowledge assets into productized solutions. Professional services are becoming an increasingly important differentiator in a BDA market that is converging on dominant public cloud providers and their solution-partner ecosystems. Wikibon’s research shows that the professional services teams of BDA solution providers are proving to be a critical conduit for productizing domain expertise, content, and best practices. As packaged applications and cloud services become the high-margin solution segments in BDA, professional service teams will prove pivotal to vendors’ efforts to partner with their customers long term and to address particular fine-grained opportunities beyond the reach of horizontal solution providers.
- Delivery of BDA emerging technologies into early-adopter professional service engagements. Wikibon’s research shows many of the most high-value, complex, and sophisticated BDA technologies are being proven out in customer pilots, proofs of concept, and early-adoption cycles. These sorts of engagements provide BDA solution providers with invaluable vehicles for proving out distributed AI microservices, cognitive digital assistants, blockchain, and other emerging technologies with live customers in operational settings.
Figure 9: Big Data Professional Services Revenue $B – 2017, 2022, 2027
BDA Hardware
Advanced hardware architectures will become even more essential for BDA scaling, performance, and robust operation under diverse scenarios.
The big data hardware market grew at 19% in 2017 and will grow over the forecast period (2017-2027) at a 9% CAGR (See Figure 1). The specialized parallel processors and memory-intensive servers that big data demands will be a source of this growth. This strong growth will also be a function of strong big data deployment growth in the public cloud (Wikibon includes public cloud IaaS devoted to big data workloads as hardware), as well as the emergence of edge computing deployments requiring more capable hardware assets in remote locations. We expect the impact of edge computing to become an important factor in the mid-2020s. See Figure 10.
In prior years, Wikibon segmented hardware into compute, storage, and networking components. As hardware becomes increasingly converged (and hyperconverged) into true private cloud offerings (on-premises offerings that approximate the public cloud experience), we believe this distinction will matter less and less over time.
- Distributed in-memory low-latency grids are the future of BDA in the cloud. Wikibon’s research indicates that the UniGrid hardware architecture is becoming the future shape of BDA workload deployment, scaling, acceleration, and management in radically distributed environments. As it emerges as an architectural best practice from the hyperconverged infrastructure space, UniGrid will make it easier to build internet-scale BDA hardware platforms that combine analytics with some transactions and rely on in-memory processing. UniGrid architecture will make it possible to squeeze latency out of existing OLTP databases and to apply the same approach to running the full range of AI, DL, and ML pipeline and operational workloads in BDA clouds.
- Edge computing is where BDA-optimized hardware is heading. Wikibon’s research indicates that BDA edge computing will increasingly require deep memory architectures to accommodate massive data sets and real-time needs. In this regard, the AI chipset wars will accelerate as more innovative startups enter the segment with edge-facing applications, as mergers and acquisitions intensify, as incumbent BDA solution providers (especially Google, AWS, Microsoft, and IBM) deepen their edge hardware portfolios, and as the incumbent AI chip manufacturers—especially NVIDIA and Intel—defend their positions in what’s sure to be the fastest growing chip segment of the next several years. What will surely emerge from this ferment are innovative BDA edge-hardware approaches that combine graphics processing units (GPUs) with tensor processing units (TPUs), field programmable gate arrays (FPGAs), central processing units (CPUs), and various neuromorphic application-specific integrated circuits (ASICs). A new generation of AI-optimized commodity chipsets will emerge to accelerate the technology’s adoption in devices, gateways, and other tiered edge-cloud deployments. More embedded, mobile, and IoT&P platforms will come to market incorporating blends of GPUs, TPUs, FPGAs, CPUs, and ASICs.
- Streaming speed and scale are the prime hardware imperatives in BDA in the cloud. AI’s growing ubiquity is the paramount BDA market trend that will spur further performance boosts at the hardware level. Wikibon’s research indicates that AI hardware manufacturers will likely achieve continued year-over-year 10-100x boosts in the price-performance, scalability, and power efficiency of their chipset architectures over at least the next 10 years. Between now and 2027, every AI chipset that comes to market will be optimized for the core DL and ML algorithms, especially convolutional neural networks, recurrent neural networks, long short-term memory networks, and generative adversarial networks. Increasingly, AI co-processors will be optimized primarily for low-cost, low-power, edge-based inferencing, and only secondarily for edge-based training. Embedding of low-cost low-power AI chipsets is the driving force behind AI’s, hence BDA’s, growing ubiquity in our lives. Embedded AI-inferencing co-processors will become standard components of all computing, communications, and consumer electronics devices over the next 5-10 years. By 2027, the principal AI client chipset architectures will shift away from all-GPU architectures toward hybrid approaches. GPU architectures will continue to improve to enable highly efficient inference at the edge, defending against further encroachments from CPUs, FPGAs, ASICs, and other chipset architectures for these workloads. However, for the foreseeable future, GPUs will continue to dominate AI training in public clouds, private clouds, and data centers.
Figure 10: Big Data Hardware Revenue $B – 2017, 2022, 2027
Action Items
- Over the coming 3-6 months, BDA professionals should begin to evaluate the growing range of commercial packaged solutions that accelerate time-to-value for specific industry or line-of-business analytic and transactional applications.
- Over the coming 3-12 months, you should explore moving more of your BDA applications to public cloud environments to take advantage of the rapidly maturing, low-cost options in this arena.
- Over the coming 3-24 months, you should consider building out your hybrid public/private multi-clouds to ensure a graceful long-term transition to the more completely public-cloud BDA computing environments of the future.
- As you work out your long-range public-cloud BDA architecture—including the underlying hardware platforms needed to process all these workloads at scale—you should evaluate cloud providers on the richness of their own solution portfolio, on the range of partner/ISV solutions built for those environments, and on APIs for mixing and matching open-source and proprietary solutions for diverse analytics, transactional, and other data-driven workloads.
- Starting immediately, you should engage professional services partners for assistance in planning and executing your move toward BDA in the public cloud.