Only a year ago, ChatGPT woke the world up to the power of foundation models. But this power is not about shiny, jaw-dropping demos. Foundation models will permeate every sector, every aspect of our lives, in much the same way that computing and the Internet transformed society in previous generations. Given the extent of this projected impact, we must ask not only what AI can do, but also how it is built. How is it governed? Who decides?
We don’t really know, because transparency in AI is on the decline. For much of the 2010s, openness was the default orientation: Researchers published papers, code, and datasets. In the last three years, transparency has waned. Very little is known publicly about the most advanced models (such as GPT-4, Gemini, and Claude): What data was used to train them? Who created this data, and what were the labor practices? What values are these models aligned to? How are these models being used in practice? Without transparency there is no accountability, and we have seen the problems that a lack of transparency caused in previous generations of technology, such as social media.
To make assessments of transparency rigorous, the Center for Research on Foundation Models introduced the Foundation Model Transparency Index, which scores major foundation model developers on 100 indicators of transparency. The good news is that many aspects of transparency (for example, having proper documentation) are achievable and aligned with companies’ incentives. In 2024, perhaps we can start to reverse the trend.
By now, policymakers widely recognize the need to govern AI. Alongside transparency, one of the first priorities is evaluation, which features in the United States executive order and the European Union AI Act and is central to the mandate of the UK’s new AI Safety Institute. Indeed, without a scientific basis for understanding the capabilities and risks of these models, we are flying blind. About a year ago, the Center for Research on Foundation Models released the Holistic Evaluation of Language Models (HELM), a resource for evaluating foundation models, including language models and image generation models. Now we are partnering with MLCommons to develop an industry standard for safety evaluations.
But evaluation is hard, especially for general, open-ended systems. How do you cover the nearly unbounded space of use cases and potential harms? How do you prevent developers from gaming the benchmarks? How do you present the results to the public in a legible way? These are open research questions, and we have little time to solve them if we are to keep pace with the rapid development of AI. We need the help of the entire research community.
It does not seem far-fetched to imagine that ChatGPT-like assistants will become the primary way we access information and make decisions. Therefore, the behavior of the underlying foundation models, including any biases and preferences, is consequential. These models are said to be aligned with human values, but whose values are we talking about? Again, due to the lack of transparency, we have no visibility into what these values are or how they are determined. Rather than having these decisions made by a single organization, could we imagine a more democratic process for eliciting values? It is the integrity and legitimacy of the process that matters. OpenAI is funding work in this area, and Anthropic has published research in this direction, but these are still early days. I hope that some of these ideas will make their way into production systems.
The foundation model semi truck will barrel on, and we don’t know where it is headed. We need to turn on the headlights (improve transparency), make a map to see where we are (perform evaluations), and ensure that we are steering in the right direction (elicit values in a democratic way). If we can do even some of this, we will be in a better place.
Percy Liang is an associate professor of computer science at Stanford, director of the Center for Research on Foundation Models, senior fellow at the Institute for Human-Centered AI, and co-founder of Together AI.