May 18, 2026
Khaza Anuarul Hoque is innovating frameworks to deploy AI foundation models on edge devices while preserving mathematically verified behavioral guarantees.

Artificial intelligence (AI) is rapidly moving out of massive data centers and onto everyday devices — from smartphones and laptops to drones and autonomous vehicles.
But making that transition possible requires more than just shrinking today’s powerful models. It requires ensuring that, when they are compressed to run on limited hardware, they still behave the way we expect, especially in situations where safety is critical.
A new research effort led by Mizzou Engineering computer scientist Khaza Anuarul Hoque aims to do exactly that.
Hoque and his team are developing a framework designed to compress foundation models such as large language models (LLMs) and vision-language models (VLMs) to run on devices with limited processing power and energy while still providing formal, mathematical guarantees that key behaviors are preserved.
The project, which the researchers call VeriFAI, has drawn support from NVIDIA, which recently awarded the team 32,000 hours of GPU computing time to scale their approach.
“At a high level, we’re trying to answer a simple but critical question,” Hoque said. “Can we make these models smaller without introducing behaviors we don’t want — and mathematically prove that we haven’t?”
The hidden risk of compression
Today’s most advanced AI systems are notoriously resource intensive. Running them typically requires powerful servers and large amounts of energy, making them difficult to deploy on so-called edge devices (e.g., laptops and smartphones) with limited computing power, memory and energy.
To address this, engineers routinely compress models using techniques such as pruning and quantization. These methods reduce size and computation, making it feasible to run AI locally.
In fact, many consumers are already using compressed AI models without realizing it. Features like predictive text, voice assistants and on-device AI tools rely on optimized versions of larger models.
But there’s a catch.
“When you compress a model, you’re not just making it smaller,” Hoque said. “You’re also changing how it behaves, sometimes in ways you don’t fully understand.”
Even when overall accuracy remains high, compression can degrade specific, fine-grained capabilities. A model might perform well on average but fail in exactly the scenario that matters most; misinterpreting a visual cue, for example, generating incorrect information or violating a safety constraint.
Those risks become particularly important in domains such as autonomous driving, aerospace, defense, healthcare and critical infrastructure, where failures can carry significant real-world consequences.
“Without formal guarantees, you cannot pass certification,” Hoque said. “And without that regulatory certification from the FAA, NHTSA or other agencies, you cannot deploy.”
From bottleneck to breakthrough
VeriFAI addresses this problem by integrating the kinds of formal verification methods already used in hardware design directly into the model compression process.
Instead of just checking whether the model is generally accurate, VeriFAI lets users set specific, precise mathematical rules to guide the model; “always correctly identify school buses,” for example.
Then, when VeriFAI compresses the model, it searches for a solution that follows all those specific rules, not just one that performs well on average.
Crucially, the approach does not require retraining the model from scratch, saving both time and computational cost.
“We’re not trying to fix all the problems in AI,” Hoque said. “If the original model has issues, those are still there. What we ensure is that compression doesn’t make things worse.”
The idea has already shown promising early results. In prior experiments across several model architectures, Hoque’s team demonstrated significant reductions in energy use and model size while maintaining verified behavioral properties.
But scaling the approach to the largest, most complex models — especially multimodal systems that combine vision and language — requires substantial computing power.
“These models are enormous,” Hoque said. “Without access to high-performance GPUs, you simply can’t explore the design space effectively.”
The NVIDIA grant removes that barrier, allowing the team to extend their methods to next-generation systems and run large-scale experiments.
Correct-by-construction
Looking ahead, Hoque envisions a broader shift in how AI systems are built.
Today, most models are developed through an iterative process: train, test, adjust and repeat. Ensuring safety requires extensive validation and retraining.
VeriFAI points toward a different paradigm that Hoque calls “correct by construction.”
“What if we could design models so that they satisfy safety and performance requirements from the beginning?” he asked. “That would fundamentally change how we build AI systems.”
As governments and industries consider standards for deploying AI in safety-critical environments, formal guarantees may become increasingly important for certification and operational approval.
“In many regulated domains, reliability cannot rely solely on empirical testing,” Hoque said. “Formal guarantees provide stronger assurance that critical behaviors remain preserved under deployment constraints.”
The research aligns with growing industry and regulatory interest in trustworthy and certifiable AI systems.
“The future of AI is not only about making models more capable,” he said. “It is also about making them dependable, energy-efficient and safe enough to operate reliably wherever they are deployed.”
Unlock solutions to real-world problems. Join Mizzou Engineering.