Data has value. Yet there is no international standard rate of exchange for data, and no formalized system of currency denotes how much an organization should pay for a bushel, barrel, quart, pinch or baker’s dozen of data when aggregated today.
Despite this enduring truth, technology vendors are so fond of talking about how well their platforms enable a business to work with information that they are almost incapable of saying the word “data” out loud unless it is followed by “value”. It’s almost as if tech firms are trying to personify information into some corporeal, incarnate thing… actually, it’s not almost like that, that’s exactly how it is.
Because data itself comes in so many forms (some say 13, but perhaps there are even more), travels at a variety of different speeds (data at rest, data at its “normal” speed and speedier streams of real-time data) and arrives in so many different physical shapes (a byte, a nibble, a block, a file or a chunk of big data), technologists have largely never extended the core definitions of terabytes, petabytes and exabytes to include some kind of qualitative measure of data. At least, not beyond that which denotes whether data is stale, duplicated, biased or in some other way misconfigured or corrupted.
So then, can we ask whether AI changes the baseline for data value? Most technology advocates and indeed vendors would argue yes, provided AI is used properly, prudently and proficiently, but why is this so… and what tools do we need to help us through this newly recalibrated landscape?
A Triumvirate Of Data, Analytics & AI
The challenge for vendors in this space is to convince us that they don’t just offer a platform for big data analytics. Today, they need to provide evidence of a more hybrid and unified platform that is capable of serving a triumvirate of data, analytics and AI. Among the companies vocal on this subject is Cloudera, a specialist in data services for AI.
The company’s Cloudera Data Services offering is engineered to combine private AI (i.e. AI that a business operates on its own databases, running its own data and language models, executed on-premises inside its own private cloud instances behind its own company firewall) with GPU-accelerated generative AI. According to company CEO Charles Sansbury, Cloudera Data Services allows data teams to accelerate workload deployment, enhance security by automating complex tasks and achieve faster time to value (there’s that word) for AI deployment.
“I like this notion of data value changing as a result of our current work to develop, implement and extend enterprise AI services,” said Sansbury, speaking at a press briefing this week. “We talk about data anywhere, for AI everywhere… but there are lots of emerging views now about performing fine tuning on core enterprise data to achieve better outcomes for AI models. But to do that, organizations need to have good control of that enterprise data… and most companies’ data stacks are a mess, right? The solution cannot be… oh, just move it to the cloud i.e. that’s not compelling for a chief security officer. What we’re finding is companies are not thinking enough about how they maintain control of quality enterprise data.”
Sansbury says it’s a challenge (and of course he thinks his organization’s platform can fix it) and this is an especially sensitive and important topic right now (with nations drilling down on border controls and identity management) because no firm wants to run into data sovereignty issues or what the Cloudera chief labels as “pure abject fear” over the state of data.
“I think the best example we have of data value centricity, in terms of numbers, is Informatica and Salesforce [the CRM giant bought data management company Informatica for $8 billion in May this year to strengthen its data cloud technology gambit] – and that really was because it recognized the value of data gravity [a term used to explain how data has a gravitational ‘pull’ on applications and services that can make it hard to dislodge] and having data under an appropriate level of management. It’s all pretty exciting and, while we didn’t create these constructs, I do think we’re in the right place (with a good resume) to be relevant to this discussion,” added Sansbury.
Salesforce itself isn’t far behind in its willingness to speak on this subject. Muralidhar Krishnaprasad, president and CTO of the unified Agentforce platform at Salesforce, reminds us that agentic AI “thrives on clean, trustworthy data with context”. Without it, even the fastest AI delivers super-fast rubbish, per the garbage-in, garbage-out mantra that AI engineers live by.
“That shift in data analytics and information management cadence certainly represents something of a recalibration, if not a wholesale re-architecting process in some enterprise IT scenarios,” said Krishnaprasad. “Because data is a cross-functional asset and its true power is unleashed when it’s understood and unified at its core. A unified data platform ensures that AI models are grounded in accurate, comprehensive information and enables a complete understanding of all the enterprise’s data – both structured and unstructured information – so that AI agents can automate tasks and interact intelligently.”
Google Cloud: Data Is Fuel
Perhaps no firms are better placed to comment on the changing value scale of data than the hyperscalers. After all, these are the data behemoths running, evolving and extending the planet’s datacenters, so it’s in their vested interest to make sure customers (users at every level from corporates to kids gaming) increasingly think of data as a “good” with a prescribed and also changeable level of value in the classical economic sense. Yasmeen Ahmad, managing director for Data Cloud at Google Cloud, agrees.
“Data is no longer a passive corporate asset; its role as the uncontested fuel for AI means its value is now purely strategic. This is a fundamental re-architecture of the enterprise, transforming current workflows to leverage autonomous agentic systems. Unlocking the massive potential of AI is only possible if organizations build those core data foundations that are governed and accessible. Frankly, implementing an AI-ready data platform is now a strategic imperative for staying competitive,” stated Ahmad, in an online press analysis session this week.
Ahmad suggests that the largest immediate gain in data value when it comes to AI is activating what we can call “dark data” i.e. the 80% to 90% of all company data that is unstructured and often locked away in images, video and documents. She says that AI acts as the essential “universal translator” for these information stores and that this technology instantly makes multimodal information “a first-class citizen”, allowing businesses to get a more comprehensive view by analyzing a “richer tapestry” of data today. This, she proposes, opens the door to analytical capabilities that were previously impossible for structured systems alone.
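To make the “universal translator” idea concrete, here is a minimal sketch of how dark data might be catalogued so it becomes queryable alongside structured records. It is purely illustrative: the `extract_metadata` function, the `DocumentRecord` fields and the JSON-lines catalog are hypothetical stand-ins for whatever multimodal model or document-AI service an organization actually uses; none of them are named in the article or tied to a specific product.

```python
# Illustrative sketch only: turning "dark" unstructured files into queryable records.
from dataclasses import dataclass, asdict
from pathlib import Path
import json

@dataclass
class DocumentRecord:
    source_path: str      # where the unstructured file lives
    doc_type: str         # e.g. "invoice", "contract", "site photo"
    summary: str          # model-generated plain-text summary
    entities: list[str]   # people, products, locations the model found

def extract_metadata(path: Path) -> DocumentRecord:
    # Placeholder: in practice this is where a multimodal model or
    # document-AI service would read the file and return real fields.
    return DocumentRecord(
        source_path=str(path),
        doc_type="unknown",
        summary="(model-generated summary goes here)",
        entities=[],
    )

def index_dark_data(folder: Path, catalog_file: Path) -> None:
    # Walk the folder of previously "dark" files and write one JSON line
    # per document, ready to be loaded into a warehouse or lakehouse table.
    with catalog_file.open("w", encoding="utf-8") as out:
        for path in folder.rglob("*"):
            if path.is_file():
                record = extract_metadata(path)
                out.write(json.dumps(asdict(record)) + "\n")
```

Once each file has a structured record like this, the previously opaque content can be joined, filtered and aggregated like any other table.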
While security specialists would offer words of caution here (i.e. if AI can suddenly read and interpret everything, then we need additional governance controls that oversee software system aspects like permissions, identity and data sovereignty), the point is well made: AI is good at digging, so all data may now be surfaced.
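That caution can be expressed as a simple gate in code. The sketch below is a hedged illustration (all names and fields are invented for the example) of filtering surfaced documents through permission and residency checks before they ever reach a model or an agent, so the AI sees only what a comparable human user would be allowed to see.

```python
# Hedged sketch: apply identity and sovereignty checks to documents surfaced
# by AI before handing them to a model. Field names are illustrative only.
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    region: str              # where the data may legally reside or be processed
    allowed_groups: set[str] # groups permitted to read this document

def permitted(doc: Document, agent_groups: set[str], processing_region: str) -> bool:
    # Identity check: the agent acts on behalf of groups, just as a user would.
    if not (doc.allowed_groups & agent_groups):
        return False
    # Sovereignty check: do not ship the document to a disallowed region.
    return doc.region == processing_region

def filter_for_agent(docs: list[Document], agent_groups: set[str], region: str) -> list[Document]:
    return [d for d in docs if permitted(d, agent_groups, region)]
```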
Also competing in this space is data engineering, data science and machine learning company Databricks. Dael Williamson, the company’s CTO for the EMEA region, says that AI has made it clear that the value of data is determined by how easily it can be trusted, governed and activated, not by how much of it an organization holds.
“Yet most companies do not even have an inventory of their data assets let alone a valuation. Unlike physical or financial assets that are recorded on balance sheets, data has always been treated as intangible in accounting terms despite investor communities consistently attaching a significant valuation multiple to it. AI is now making data asset valuation a priority for CFOs, and a data intelligence platform is the perfect factory for bringing those assets under management to drive AI-enabled value,” said Databricks’ Williamson.
Unified Operational & Analytical Data
He thinks that delivering on this vision requires a new approach to how data is managed day to day and suggests that we need architectures that “unify operational and analytical data” to eliminate fragile pipelines and provide more elastic scale. Williamson says that this may be why the IT industry is moving toward adopting Postgres databases built for the AI era, where transactional data and AI workloads can coexist in one environment.
“By enabling experimentation at machine speed and ensuring that AI models always have access to the freshest and most reliable data, these approaches allow enterprises to realize measurable returns from assets that have too often been overlooked or undervalued,” he added.
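As a rough illustration of the “transactional data and AI workloads in one Postgres environment” idea, the sketch below keeps an operational orders table and an AI-facing embedding column side by side, so a similarity query always runs against the freshest rows the application has written. It assumes the pgvector extension and psycopg2 driver; neither is named by Williamson or the article, so treat this as one possible way to realize the pattern, not as any vendor’s implementation.

```python
# Minimal sketch of unified operational + AI data in Postgres (assumes pgvector).
import psycopg2

DSN = "dbname=shop user=app password=secret host=localhost"  # placeholder credentials

SCHEMA = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS orders (
    order_id        bigserial PRIMARY KEY,
    customer_id     bigint NOT NULL,
    notes           text,            -- operational, transactional fields
    notes_embedding vector(384)      -- AI-facing column kept alongside them
);
"""

# Semantic search over the same table the application writes to.
SIMILAR_ORDERS = """
SELECT order_id, notes
FROM orders
ORDER BY notes_embedding <-> %s::vector
LIMIT 5;
"""

def find_similar(query_embedding: list[float]) -> list[tuple[int, str]]:
    # pgvector accepts embeddings as a '[x,y,z]' literal string.
    literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute(SCHEMA)                      # idempotent setup for the sketch
        cur.execute(SIMILAR_ORDERS, (literal,))  # dimension must match vector(384)
        return cur.fetchall()
```

The design point is simply that the AI query and the transactional writes share one source of truth, which is what removes the fragile pipelines between operational and analytical stores.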
What makes the difference between the services that work in this space is not so much the governance factors, the expansive scalability considerations, the mission-critical user authentication controls or the multi‑cluster node template fairy dust… it is usability. What sets one data management vendor apart from another is its ability to juggle different data workloads (streaming, machine learning, deeper slower analytics) all in one platform, with metadata-driven architectures designed to reduce time-to-delivery.
Analyst house Gartner assesses this space via its quadrant for cloud database management as-a-service vendors. Further, it defines competency at this level as the ability to support both on-premises deployments and multiple public clouds, so that’s hybrid cloud for sure. While Cloudera has data lakehouse (a notion of unstructured data lake information pooled with a degree of the structure of a data warehouse) competencies, competitors in this arena also include Databricks, Snowflake (for its Iceberg tables), AWS Lake Formation, Google Cloud BigLake and of course Microsoft for Azure Data Lake.
What To Think About Next
Firms that now aim to deliver data value for AI will have to be aligned with the double helix of cloud-native technologies from the start. This makes things tough, because weighing heritage, experience and longevity against any perception of cutting-edge newness is never easy.
Equally difficult is the need to exhibit expansiveness and service comprehensiveness, with all the challenges that brings in terms of infrastructure provisioning, configuration, administration and management. All companies in the data value for AI space will be judged on their legacy data platform baggage (or lack of it), their operational total cost of ownership, their ability to offer one unified workspace across different data, their infrastructure simplicity and whether or not there is a cloud-native alternative waiting in the wings, which might offer greater scale and could even be cheaper.