Nick Hart, President & CEO, Data Foundation.
The White House AI Action Plan charts an ambitious path to make America the global leader in artificial intelligence (AI), with its emphasis on innovation, infrastructure and international leadership. As organizations nationwide mobilize to implement this vision, there’s a critical foundation that must be laid first—one I learned about the hard way during a recent move.
Three years ago, I found myself staring at boxes I hadn’t opened since my last move, filled with items I couldn’t even identify. Old cables, obsolete electronics, papers from jobs I’d forgotten—all taking up space, costing money to store and creating nothing but clutter. My data systems weren’t much different: Terabytes of files with no clear purpose, consuming storage, multiplying security risks and making it harder to find what actually mattered.
The AI Action Plan rightly calls for building “world-class scientific datasets” with quality standards. But here’s what most organizations miss as they rush to implement AI: More data doesn’t automatically mean better AI. It often means worse AI—built on digital junk that’s expensive to maintain and impossible to govern effectively.
The Data Hoarding Problem
Organizations treat data like pack rats treat possessions—more must be better. They hastily collect everything accessible regardless of quality, store massive datasets “just in case” or inherit legacy systems without documentation about what data exists or why. This digital clutter creates expensive storage costs, multiplies security risks and obscures valuable insights.
The AI Action Plan represents a bold vision for American technological leadership, and the Data Foundation strongly supports its comprehensive approach. The plan’s emphasis on building world-class datasets and establishing quality standards provides the perfect framework for what comes next: Ensuring those datasets are curated, not just comprehensive.
Four Strategies For Data Efficiency In 2025
Strategies for making data more efficient in 2025 are essential not just for business models but also for protecting privacy and enabling AI systems that actually work.
1. Collaborate on knowledge needs.
Before collecting data, organizational leaders must jointly identify what they actually need to know. In government, we call this strategic planning that leads to a learning agenda—a systematic approach I helped advance through the Foundations for Evidence-Based Policymaking Act.
Learning agendas require leaders to explicitly identify priority questions they need answered to achieve their missions, typically updated every four years. Private sector organizations should adopt similar practices, with executives and operational leaders collaborating to define what knowledge gaps actually matter for decision-making.
2. Establish data leadership.
Governance standards must be set by senior leaders specifically responsible for data—ideally a chief data officer (CDO). If your organization has more than 25 staff members and doesn’t have someone explicitly tasked to lead data efforts, it’s time to plan for both growth and data governance needs.
CDOs serve as the air traffic controllers of the data ecosystem, ensuring collections serve strategic purposes rather than accumulating randomly. They work alongside chief AI officers to ensure data governance frameworks support AI deployment while maintaining appropriate protections. In 2018, congressional leaders from across the political spectrum recognized the need for exactly this kind of strategic steward of federal government data. They passed the OPEN Government Data Act, signed by President Trump in his first term, which requires each agency to designate a non-political appointee as chief data officer.
3. Emphasize data minimization, not maximization.
The goal isn’t to have as much data as possible—it’s to have data that’s fit for purpose and available when needed. Data minimization means collecting only what serves identified knowledge needs, maintaining it only as long as it provides value and ensuring quality over quantity.
Consider the Treasury Department’s “Do Not Pay” system, which provides resources to help agencies prevent improper payments. The system has access to a wide range of data sources but queries only the specific data needed for verification, which helps protect privacy. This targeted approach has already saved hundreds of millions of taxpayer dollars, proving that strategic data use beats comprehensive data collection.
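The field-level minimization principle behind this approach can be sketched in a few lines of code. This is a hypothetical illustration, not the Treasury system’s actual implementation: the record, field names and eligibility check are invented for the example. The idea is that a verification task receives only an allow-listed slice of a record, never the full record.

```python
# Hypothetical sketch of field-level data minimization for a
# verification task: the verifier sees only allow-listed fields.

FULL_RECORD = {
    "id": "V-1001",
    "name": "Acme Corp",
    "status": "ineligible",    # the one field verification needs
    "bank_account": "XXXX-4821",
    "address": "123 Main St",
}

# Policy: only these fields may leave the source system.
ALLOWED_FIELDS = {"id", "status"}

def minimized_view(record, allowed=ALLOWED_FIELDS):
    """Return only the fields the verification task is allowed to see."""
    return {k: v for k, v in record.items() if k in allowed}

def is_eligible(view):
    """Answer the one question verification needs, nothing more."""
    return view.get("status") == "eligible"

view = minimized_view(FULL_RECORD)
print(view)               # {'id': 'V-1001', 'status': 'ineligible'}
print(is_eligible(view))  # False
```

Sensitive fields such as the bank account never reach the verifier at all, so there is less to secure, less to audit and less that can leak.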
4. Archive what you don’t need.
Organizations must systematically archive or delete data that no longer serves its purpose. This cuts storage costs, reduces security risks and ensures teams aren’t paying cloud computing bills for digital junk. Just as moving forces you to confront possessions you no longer need (and to discuss with family whether certain items are worth keeping), AI deployment should trigger comprehensive data audits.
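A first pass at such an audit can be automated. The sketch below is a minimal, hypothetical example (the one-year retention window is an assumption, not a recommendation for any particular organization): it walks a directory tree and flags files that haven’t been modified within the retention window as candidates for archival or deletion.

```python
# Hypothetical retention audit: flag files untouched for longer than
# a retention window as candidates for archival or deletion.
import os
import time

RETENTION_DAYS = 365  # assumed window; set per your retention policy

def stale_files(root, retention_days=RETENTION_DAYS, now=None):
    """Yield paths under root last modified before the cutoff."""
    now = now if now is not None else time.time()
    cutoff = now - retention_days * 86400
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                yield path

# Usage: review the list with data owners before deleting anything.
# for path in stale_files("/data/archive-candidates"):
#     print(path)
```

A real audit would also consult legal holds, records schedules and data-owner sign-off; the point is that finding the digital junk can be routine rather than a once-a-decade cleanup.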
Moving Forward
The national AI Action Plan correctly emphasizes building world-class datasets with quality standards. Organizations that master data minimization will discover that AI systems perform better, cost less to operate and pose fewer risks than those built on data maximization approaches.
The convergence of AI capabilities with mature data governance represents an unprecedented opportunity to make government more efficient and businesses more competitive. The foundations are in place through laws like the Evidence Act and frameworks like the Federal Data Strategy. The technology exists to implement these principles at scale.
It’s time to clean house before we build the future. America’s AI leadership depends not just on having the most data, but on having the right data, used responsibly, with clear purpose and strong governance.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
