Although roughly 90 percent of state technology offices have launched AI pilot programs, a growing gap separates states running innovative, whole-of-government testing environments from those struggling to track outcomes or scale beyond isolated agency tests. The Center for Data Innovation published a report on June 4, 2026, examining how states can move AI tools from experimental phases into statewide deployment. The report argues that without clear evaluation metrics, states risk repeating cycles of experimentation without meaningful implementation.

Five states have established structured AI testing frameworks that move beyond fragmented, agency-by-agency pilots. Utah created a statewide AI pilot program through the 2024 Artificial Intelligence Policy Act, establishing an Office of Artificial Intelligence and a regulatory sandbox that lets companies test AI tools under temporary regulatory relief and state supervision. Connecticut's 2025 AI Engagement and Enablement Lab has enabled 20 pilot programs, including Microsoft Copilot for productivity, automated Q&A tools for citizen inquiries, and systems that identify election-related misinformation. Ohio's emerging technology sandbox, which went into effect in March 2026, supports more than a dozen agencies operating under state-mandated privacy requirements. Texas enacted the Texas Responsible Artificial Intelligence Governance Act in 2025, giving developers up to 36 months to test AI systems under temporary regulatory relief. North Carolina created an AI Accelerator with a 60-day rapid-testing cycle in which agencies can pilot AI solutions in a secure environment.

The report finds that these five states have moved beyond cautious observation by creating structured environments that prioritize hands-on experimentation. According to the analysis, Utah's approach grounds future regulation in real performance data: the health technology company Doctronic, for example, uses an approved AI system to help patients with chronic conditions renew prescriptions at participating pharmacies. The report notes that Ohio has already deployed pilots such as AI-equipped vehicles that detect road defects and AI-powered analysis of Medicare prior-authorization requests, while North Carolina's Treasury agency has piloted an AI system that streamlines management of public financial records.

The report explains that other states can follow their lead by moving away from agency-by-agency pilots that often operate in isolation and lack shared evaluation standards. Instead, states should establish centralized hubs for rapid internal deployment and multidisciplinary oversight councils to coordinate cross-agency strategy. Connecticut's framework pairs its testing lab with an AI Advisory Board that oversees implementation, coordinates with labor organizations, and engages private-sector and academic experts, so that AI adoption proceeds with agency alignment and continuous expert oversight rather than fragmented experimentation. Ohio's multi-agency AI Council governs statewide AI use and includes controls that ensure sensitive citizen data can't resurface in public AI systems later on. Many vendors already offer these safeguards, but Ohio now requires them for all pilots. This balance of innovation and strong safeguards gives agencies a safe, structured environment to test AI systems, strengthening their long-term operational capacity and public trust.

The report recommends that policymakers move from theory to practice by adopting these emerging sandbox models to close the growing gap between AI's potential and its responsible integration. By shifting away from restrictive blanket bans and toward supervised, whole-of-government testing, states can generate the data needed to craft effective and safe guardrails. To ensure that public services remain modern and secure, states need structured pilot environments that collect real performance data, so that regulatory decisions rest on evidence rather than assumptions.