Efficient Object Detection in Mobile and Embedded Devices with Deep Neural Networks
Academic theses and dissertations
Universidad de Salamanca (Spain)
Deep Neural Networks
Mobile and Embedded Devices
Efficient Object Detection
Publication date
[EN] Neural networks have become the standard for high-accuracy computer vision. These algorithms can be built with arbitrarily large architectures to handle the ever-growing complexity of the data they process. State-of-the-art neural network architectures are primarily concerned with increasing recognition accuracy when performing inference on an image, which creates an insatiable demand for energy and compute power. These models are primarily targeted to run on dense compute units such as GPUs. In recent years, demand has grown to run these models in limited-capacity environments such as smartphones; however, even the most compact variants of these state-of-the-art networks constantly push the boundaries of the power envelope under which they run. With the emergence of the Internet of Things, it is becoming a priority to enable mobile systems to perform image recognition at the edge with small energy requirements. This thesis focuses on the design and implementation of an object detection neural network that attempts to solve this problem, providing reasonable accuracy with extremely low compute power requirements. This is achieved by re-imagining the meta-architecture of traditional object detection models and discovering a mechanism to classify and localize objects through a set of neural-network-based algorithms that are better suited to mobile and embedded devices.
The main contributions of this thesis are: (i) an image processing algorithm that is better suited to preparing data for consumption, taking advantage of the characteristics of the ISP (image signal processor) available in these devices; (ii) a neural network architecture that maintains acceptable accuracy targets with minimal computational requirements by making efficient use of basic neural algorithms; and (iii) a programming framework for implementing these systems efficiently, optimized for the underlying hardware units available in these devices and taking into account their memory and computation restrictions.