AI progress is accelerating, but the mystery of how frontier systems actually operate remains unsolved.
SAN DIEGO — Over the past week, researchers, startup founders, and industry leaders from around the world gathered in sunny San Diego for the premier conference in artificial intelligence. The Neural Information Processing Systems conference, known as NeurIPS, is now in its 39th year and attracted a record 26,000 attendees, twice as many as just six years ago.
Since its inception in 1987, NeurIPS has explored neural networks and the intersections of computation, neuroscience, and physics. What once looked like a niche curiosity has become foundational to modern AI, turning NeurIPS from a compact meeting in a Colorado hotel into an event that now fills the San Diego Convention Center—also the home of Comic-Con.
Even as the conference expands alongside a booming AI industry and features sessions on highly specialized topics like AI-generated music, one of the loudest conversations still centers on a fundamental question: how do frontier AI systems actually work?
Most leading AI researchers and company executives acknowledge a core mystery: no one fully understands how today's top models function. The field of interpretability, which tries to explain what models do and how they do it, remains young and unsettled.
Shriyash Upadhyay, an AI researcher and co-founder of Martian, a company focused on interpretability, described the field as still in its infancy. He noted that the landscape is full of competing ideas and divergent agendas.
“Traditionally, science advances by incremental steps, like refining a measurement by a decimal point,” Upadhyay said. “With interpretability, we’re asking bigger questions: What are electrons? Do they exist? Are they measurable? The parallel question for AI is: what does it mean for an AI system to be interpretable?” He and Martian used NeurIPS to launch a $1 million prize to boost interpretability efforts.
During the conference, interpretability teams from major tech players signaled divergent paths for understanding increasingly capable systems.
Google signaled a notable shift toward practical methods that emphasize real-world impact rather than attempts to reverse-engineer every component of a model. Neel Nanda, an interpretability lead at Google, explained that grand goals like near-complete reverse-engineering feel out of reach on the timescale that matters to him, roughly the next decade. He cited rapid progress on pragmatic methods, alongside slower-than-expected advances from earlier, more ambitious approaches, as reasons for the pivot.
In contrast, OpenAI’s head of interpretability, Leo Gao, said at NeurIPS that his team plans to pursue a deeper, more ambitious form of interpretability aimed at fully understanding how neural networks operate.
Yet skepticism remains. Adam Gleave, an AI researcher and co-founder of the nonprofit FAR.AI, questioned whether complete comprehension of model behavior is attainable, suggesting that deep-learning models may not admit simple explanations or be fully reverse-engineered in a way that makes sense to humans.
Even so, Gleave remained optimistic that progress in interpretability will make AI systems more reliable and trustworthy by clarifying how models behave across different levels. He also highlighted rising interest in safety and alignment within the machine-learning research community, while noting that NeurIPS still hosts capability-focused sessions drawing crowds large enough to fill rooms that could double as aircraft hangars.
Beyond understanding, many researchers argue that current evaluation methods fall short. Sanmi Koyejo, a Stanford computer science professor who leads the Trustworthy AI Research Lab, pointed out that there are not yet robust tools to measure more complex concepts like general intelligence and reasoning. He stressed the need for new, meaningful benchmarks capable of assessing broader AI behavior.
The same questions apply to AI models used in biology, chemistry, and other sciences. Ziv Bar-Joseph of Carnegie Mellon University, founder of GenBio AI, said biology-specific evaluations are in their infancy: the field is still figuring out not just what to study with AI, but how to assess it.
Despite these gaps, many researchers recognize that AI is already accelerating scientific discovery in tangible ways. Upadhyay likened the situation to engineers building bridges before Newton formalized the underlying physics: practical impact can precede complete understanding.
For the fourth consecutive year, an offshoot conference at NeurIPS focused on AI methods for advancing scientific discovery. Ada Fang, a Harvard Ph.D. student working at the intersection of AI and chemistry, described this year's edition as a strong success, noting that frontier AI research now spans biology, materials science, chemistry, and physics, with challenges and ideas shared across fields.
Jeff Clune, a professor at the University of British Columbia and a pioneer in applying AI to science, observed a rapid surge in interest: the volume of inquiries, meetings, and conversations about building AI that can learn, discover, and drive scientific innovation is at an all-time high, he said. Looking around the room, he added that it is inspiring to see a once-small, overlooked community drawing widespread attention now that AI is effective enough to put humanity's most pressing problems within reach.
Jared Perlo contributes coverage on AI for NBC News, supported by the Tarbell Center for AI Journalism.