The Usability Problem for AI

Imagine having a forklift powerful enough to lift anything, yet it comes with just a single crude lever and a steering wheel that barely turns. That’s where we stand today with deep learning models. Recent advancements have enabled these systems to represent increasingly complex data distributions through enormous parameter counts and massive training sets. But while we’ve unlocked new capabilities, we haven’t developed the fine-grained controls to match. In other words, we have immensely powerful tools, but our ability to guide them remains rudimentary.

The result is a kind of usability gap. Take text-to-speech as an example. We can now generate human-like speech from written prompts, and we can instruct the model with some limited descriptors—say, “speak with a warm, calm voice.” But this process is still imprecise. Natural language often underspecifies our true intentions. One path to refinement would be to spell out every subtlety of our desired output in excruciating detail, leaving no room for misinterpretation. Yet that approach quickly becomes cumbersome.

A better solution might be to have the model itself infer our intentions from context, much as people do with close friends and colleagues—understanding what we mean without our stating it explicitly. But this kind of general, context-sensitive understanding remains elusive for now.

The rise of “GPT wrappers” reflects one attempt to bridge this usability gap. Some regard these wrappers as superficial hacks, but they are, in fact, efforts to improve how users interact with complex systems. By fine-tuning outputs or constraining responses, these solutions aim to make models more practical. Over time, as we refine these interfaces and create more intuitive ways of shaping a model’s behavior, the line between “just a wrapper” and a truly integrated product may blur.

The current rush to build better interfaces and control mechanisms is not just a passing fad. It's an attempt to seize the opportunity offered by increasingly capable—yet still hard to use—AI models. If we can move beyond the forklift's crude lever and stiff wheel, we might finally steer these powerful systems with the nuance and precision needed to unlock greater productivity and progress in society.
