Faced with a classification task, engineers typically browse the catalog of machine learning architectures in search of a good performer. Researchers are exploring ways to automate that search.
What’s new: Esteban Real, Chen Liang, and their colleagues at Google Brain developed AutoML-Zero, an evolutionary meta-algorithm that generates a wide variety of machine learning algorithms to classify data. Applied to the small CIFAR-10 image dataset, it discovered several common deep learning techniques.
Key insight: Past meta-algorithms for machine learning constrain their output to particular architectures. Neural architecture search, for instance, finds only neural networks. AutoML-Zero finds any algorithm that can learn using high school-level math.
How it works: The researchers used AutoML-Zero to generate models for various resolutions of CIFAR-10.
- In the authors’ view, a machine learning model comprises a trio of algorithms: Setup initializes parameter values, Predict produces a scalar output given input vectors, and Learn updates the weights based on the inputs, training labels, outputs, and current weight values.
- AutoML-Zero starts with a population of models whose Setup, Predict, and Learn are empty, then evolves them for improved performance on a set of tasks.
- In each training iteration, the meta-algorithm trains an instance of every model: it applies each model’s Predict and Learn to a task’s training set, then evaluates performance on the task’s validation set.
- It samples a random subset of the population and mutates the subset’s best-performing model by adding an operation, exchanging one operation for another, or switching input variables. The mutated model replaces the oldest model in the population, as sketched below.
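To make the loop concrete, here is a minimal Python sketch of this style of evolutionary search. It is illustrative rather than the authors’ implementation: the op vocabulary, memory addresses, mutation probabilities, and the stubbed evaluate function are all assumptions, and a real evaluator would execute a model’s programs on actual task data.

```python
import random

# Toy sketch of AutoML-Zero-style search (assumed details throughout).
# An op is (opcode, input1, input2, output); a model is a trio of op lists.
OPCODES = ["add", "sub", "mul", "dot", "max"]   # assumed op vocabulary
VARS = ["s0", "s1", "v0", "v1", "m0"]           # assumed memory addresses

def random_op():
    return (random.choice(OPCODES), random.choice(VARS),
            random.choice(VARS), random.choice(VARS))

def empty_model():
    # The search starts from empty Setup, Predict, and Learn programs.
    return {"setup": [], "predict": [], "learn": []}

def mutate(model):
    """One of the three mutations described above: add an op, exchange
    an op for another, or switch one of an op's input variables."""
    child = {k: list(v) for k, v in model.items()}
    prog = child[random.choice(["setup", "predict", "learn"])]
    kind = random.random()
    if kind < 1 / 3 or not prog:
        prog.insert(random.randrange(len(prog) + 1), random_op())
    elif kind < 2 / 3:
        prog[random.randrange(len(prog))] = random_op()
    else:
        i = random.randrange(len(prog))
        op = list(prog[i])
        op[random.choice([1, 2])] = random.choice(VARS)  # new input arg
        prog[i] = tuple(op)
    return child

def evaluate(model, tasks):
    # Stub fitness. A real evaluator runs Setup once, interleaves Predict
    # and Learn over each task's training set, and returns mean accuracy
    # on the tasks' validation sets.
    return random.random()

def evolve(tasks, pop_size=100, tournament=10, steps=1000):
    population = [empty_model() for _ in range(pop_size)]
    fitness = [evaluate(m, tasks) for m in population]
    age = list(range(pop_size))  # lower value = older model
    for step in range(steps):
        # Sample a random subset and take its best model as the parent.
        subset = random.sample(range(pop_size), tournament)
        parent = population[max(subset, key=lambda i: fitness[i])]
        # The mutated child replaces the oldest model in the population.
        oldest = min(range(pop_size), key=lambda i: age[i])
        child = mutate(parent)
        population[oldest], fitness[oldest] = child, evaluate(child, tasks)
        age[oldest] = pop_size + step
    return population[max(range(pop_size), key=lambda i: fitness[i])]
```

Replacing the oldest model rather than the worst (so-called regularized or aging evolution) keeps the population turning over, so no model survives indefinitely on a lucky early evaluation.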
Results: AutoML-Zero regularly generated models that achieved 84 percent accuracy on CIFAR-10, compared to only 82 percent achieved by a two-layer, fully connected network. In the process, it rediscovered gradient descent, ReLU activations, gradient normalization, and useful hyperparameter values.
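For intuition about what a discovered model looks like, here is a hypothetical Setup/Predict/Learn trio written in NumPy rather than AutoML-Zero’s primitive ops. It mirrors the kind of program the search rediscovered, a small ReLU network trained by gradient descent, but the layer sizes, initialization scale, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def setup(n_in, n_hidden):
    # Setup: initialize parameter values (weights plus a learning rate).
    return {"W1": 0.1 * rng.normal(size=(n_hidden, n_in)),
            "w2": 0.1 * rng.normal(size=n_hidden),
            "lr": 0.01}

def predict(p, x):
    # Predict: map an input vector to a scalar via a ReLU hidden layer.
    h = np.maximum(p["W1"] @ x, 0.0)
    return p["w2"] @ h

def learn(p, x, y):
    # Learn: gradient descent on squared error, the kind of update rule
    # the search composed from primitive vector and matrix operations.
    h = np.maximum(p["W1"] @ x, 0.0)
    err = p["w2"] @ h - y
    g_w2 = err * h                                 # output-layer gradient
    g_W1 = err * np.outer(p["w2"] * (h > 0.0), x)  # backprop through ReLU
    p["w2"] -= p["lr"] * g_w2
    p["W1"] -= p["lr"] * g_W1

# Toy usage: one training step on a random example.
params = setup(n_in=16, n_hidden=8)
x, y = rng.normal(size=16), 1.0
learn(params, x, y)
```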
Why it matters: The researchers estimate that, given AutoML-Zero’s wide-ranging purview, the chance of randomly generating a model suited to a CIFAR-10 classification task is vanishingly small (around 1 in 10⁷ for linear regression, and 1 in 10¹² if that line is offset by a constant). Yet it did so frequently, demonstrating the meta-algorithm’s power to come up with useful architectures. If AutoML-Zero can find nearly state-of-the-art models on such a complex task, it may well be able to discover techniques that humans haven’t yet devised.
We’re thinking: CIFAR-10 was developed over a decade ago for machine learning experiments on the CPU-based neural networks of the day. We’re curious to learn how AutoML-Zero scales to larger datasets.
We’re not thinking: Today we have learning algorithms that design other learning algorithms. When will we have learning algorithms that design learning algorithms that design learning algorithms?