International Journal of Sensor Networks and Data Communications

A Compiler-Aware Framework of Network Pruning and Architecture Search for Mobile Acceleration

Abstract

Zhengang Li

With the growing demand to deploy DNNs efficiently on mobile edge devices, reducing unnecessary computation and increasing execution speed has become much more important. Prior methods toward this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimization, which is essential for mobile acceleration. In this work, we propose NPAS, a compiler-aware unified framework for network pruning and architecture search, together with comprehensive compiler optimizations that support different DNNs and different pruning schemes, bridging the gap between weight pruning and NAS. Our framework achieves ImageNet inference times of 6.7 ms, 5.9 ms, and 3.9 ms with Top-1 accuracies of 78%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level), respectively, on an off-the-shelf mobile phone, consistently outperforming prior work.

Disclaimer: This abstract was translated with the help of artificial intelligence and has not yet been reviewed or verified.
