Neural Architecture Search (NAS) is widely used to automatically identify the best-performing neural network from a vast space of candidate architectures. Networks found through NAS often outperform manually designed ones and have proven effective across many mainstream applications. Consider, for example, the EfficientNet family (B0 through B7), which was discovered using NAS (see figure). Given a certain compute budget (say, in terms of FLOPs), these architectures are likely to serve as a promising backbone for your application.
In their groundbreaking paper, "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks," Mingxing Tan et al. build on the ideas introduced in their earlier work, "MnasNet: Platform-Aware Neural Architecture Search for Mobile." They formulate a multi-objective optimization problem: find Convolutional Neural Network (CNN) models that deliver high accuracy while keeping computational overhead low, measured by inference latency or the number of floating-point operations (FLOPs).
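To make the multi-objective idea concrete, here is a minimal Python sketch of a weighted-product reward in the spirit of MnasNet's soft-constraint objective, ACC(m) × (LAT(m)/T)^w. The function name, the default target latency, and the exponent value are illustrative choices for this sketch rather than values lifted verbatim from the paper.

```python
def multi_objective_reward(accuracy: float,
                           latency_ms: float,
                           target_ms: float = 80.0,
                           w: float = -0.07) -> float:
    """Weighted-product reward combining accuracy and latency.

    Loosely follows the soft-constraint objective in MnasNet:
    reward = ACC(m) * (LAT(m) / T) ** w, where w < 0 penalizes
    models slower than the target T and mildly rewards faster ones.
    The default `target_ms` and `w` here are illustrative.
    """
    return accuracy * (latency_ms / target_ms) ** w


# Example: two candidates with similar accuracy but different latency
# receive noticeably different rewards.
print(multi_objective_reward(accuracy=0.76, latency_ms=70.0))   # faster than target
print(multi_objective_reward(accuracy=0.77, latency_ms=120.0))  # slower than target
```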
To accomplish this, Tan et al. follow the methodology outlined by Barret Zoph et al. in "Learning Transferable Architectures for Scalable Image Recognition." In this approach, each CNN model within the pre-defined search space is encoded as a list of tokens. Rather than being chosen at random, these tokens are produced by a sequence of actions taken by a reinforcement learning (RL) agent. The goal is to maximize the expected reward, interpreted here as high model accuracy combined with low latency or FLOPs. Through training, the RL agent learns to choose a sequence of actions, i.e., an architecture, that yields a model with superior performance on both objectives.
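As a rough illustration of that loop, the framework-free sketch below samples token sequences from a softmax policy, scores each candidate with a placeholder evaluate() that stands in for actually training and benchmarking the CNN (which is where the real cost lies), and updates the policy with a REINFORCE-style rule. It reuses the multi_objective_reward helper from the earlier sketch; the search space, learning rate, and toy reward model are invented for illustration and are not the ones used in MnasNet or NASNet.

```python
import math
import random

# Toy search space: each token position picks one option.
SEARCH_SPACE = [
    ["conv3x3", "conv5x5", "mbconv3", "mbconv6"],   # block type
    [16, 24, 32, 48],                               # number of filters
    [1, 2],                                         # stride
]

# Controller parameters: one logit per (position, choice).
logits = [[0.0] * len(choices) for choices in SEARCH_SPACE]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sample_architecture():
    """Sample a token sequence: one choice index per search-space position."""
    tokens = []
    for pos, choices in enumerate(SEARCH_SPACE):
        probs = softmax(logits[pos])
        tokens.append(random.choices(range(len(choices)), weights=probs)[0])
    return tokens

def evaluate(tokens):
    """Placeholder for 'train the candidate CNN and measure it'.
    A real NAS run would return validation accuracy and measured latency;
    this toy model just makes the sketch run end to end."""
    accuracy = 0.70 + 0.02 * tokens[0] + 0.01 * tokens[1] + random.gauss(0, 0.01)
    latency = 60 + 15 * tokens[0] + 5 * tokens[1]
    return multi_objective_reward(accuracy, latency)

LR = 0.1
baseline = 0.0  # moving-average baseline to reduce gradient variance

for step in range(200):
    tokens = sample_architecture()
    reward = evaluate(tokens)
    baseline = 0.9 * baseline + 0.1 * reward
    advantage = reward - baseline
    # REINFORCE: nudge the logits of the chosen tokens in proportion
    # to the advantage (d log p(choice) / d logit_i = 1[i==choice] - p_i).
    for pos, choice in enumerate(tokens):
        probs = softmax(logits[pos])
        for i in range(len(probs)):
            grad = (1.0 - probs[i]) if i == choice else -probs[i]
            logits[pos][i] += LR * advantage * grad

best = [choices[max(range(len(l)), key=lambda i: l[i])]
        for choices, l in zip(SEARCH_SPACE, logits)]
print("Most favored tokens after search:", best)
```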
The catch with NAS, however, is its high computational cost. For instance, the search for EfficientNet-B0 took a whopping 3,800 GPU days! The majority of these research papers originate from Google, which possesses the infrastructure for such endeavors. Individuals like myself, and perhaps you, may not have access to such resources. Given this constraint, how do we get around the challenge?
Excited to learn more about how we can tackle the high computational cost of NAS? Don't miss my next post, where we'll delve into the intriguing world of zero-shot NAS, including methods such as "ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients" by Guihong Li et al. This exciting line of work is all about designing training-free proxies that can predict how well a given architecture will perform on a test dataset.