
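Deep Learning Object Detection Optimization Algorithm for Embedded Devices (面向嵌入式设备的深度学习物体检测优化算法)
Citation: DAI Lei-Yan, FENG Jie, DONG Hui, YANG Xiao-Li. Deep Learning Object Detection Optimization Algorithm for Embedded Devices[J]. Computer Systems & Applications (计算机系统应用), 2019, 28(4): 163-169.
Authors: DAI Lei-Yan (戴雷燕)  FENG Jie (冯杰)  DONG Hui (董慧)  YANG Xiao-Li (杨小利)
Affiliation (all authors): School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Funding: National Natural Science Foundation of China, Young Scientists Fund (61501402)
Abstract: As research on deep neural networks advances, both the accuracy and the speed of object detection keep improving. However, as networks grow deeper, model size increases and computation becomes more expensive, so the networks cannot perform fast forward inference directly on embedded devices. To address this problem, this paper studies optimization of deep learning object detection for embedded devices. First, a suitable object detection framework and neural network architecture are selected. Next, the model is trained on images collected in a specific detection scenario and then pruned. Finally, the pruned object detection model is ported to the embedded device and optimized at the assembly-instruction level. After the combined optimizations, compared with the original network model, model size is reduced by 9.96% and speed is increased by a factor of 8.82.

Keywords: deep learning  object detection  pruning  assembly optimization  embedded devices
Received: 2018/10/29
Revised: 2018/11/19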

Deep Learning Object Detection Optimization Algorithm for Embedded Devices
DAI Lei-Yan, FENG Jie, DONG Hui and YANG Xiao-Li. Deep Learning Object Detection Optimization Algorithm for Embedded Devices[J]. Computer Systems & Applications, 2019, 28(4): 163-169.
Authors: DAI Lei-Yan  FENG Jie  DONG Hui and YANG Xiao-Li
Affiliation (all authors): School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Abstract: With continued research on deep neural networks, both the accuracy and the speed of object detection have improved. However, as networks grow deeper, model size and computational cost increase, so such networks cannot perform fast forward inference directly on embedded devices. To solve this problem, we study the optimization of deep learning object detection for embedded devices. First, we choose an appropriate object detection framework and neural network architecture. Then, the model is trained on images collected in a specific detection scenario and pruned. Finally, the pruned object detection model is ported to the embedded device and optimized at the assembly-instruction level. After these combined optimizations, compared with the original network model, the model size is reduced by 9.96% and inference is 8.82 times faster.
Keywords: deep learning  object detection  pruning  assembly optimization  embedded device
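
The abstract names model pruning as one optimization step but does not say which pruning criterion the authors use. The following is only a minimal sketch of one common approach, magnitude-based filter pruning by L1 norm, and is not claimed to be the paper's method. The function names `l1_norm`, `prune_filters`, and `size_reduction`, the `keep_ratio` parameter, and the representation of each convolution filter as a flat list of weights are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def l1_norm(weights: Sequence[float]) -> float:
    """Sum of absolute values of one filter's flattened weights."""
    return sum(abs(w) for w in weights)


def prune_filters(
    filters: List[List[float]], keep_ratio: float
) -> Tuple[List[int], List[List[float]]]:
    """Keep the fraction `keep_ratio` of filters with the largest L1 norm.

    Hypothetical helper: each filter is a flat list of weights.
    Returns the kept indices (in original order) and the kept filters.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    # Always keep at least one filter so the layer stays usable.
    n_keep = max(1, int(len(filters) * keep_ratio))
    # Rank filter indices from largest to smallest L1 norm.
    ranked = sorted(
        range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True
    )
    kept = sorted(ranked[:n_keep])  # restore original layer order
    return kept, [filters[i] for i in kept]


def size_reduction(original_params: int, pruned_params: int) -> float:
    """Fraction of parameters removed, e.g. 0.25 means 25% smaller."""
    return 1.0 - pruned_params / original_params


if __name__ == "__main__":
    layer = [[0.1, -0.1], [1.0, 2.0], [0.0, 0.01], [-3.0, 0.5]]
    kept_idx, pruned = prune_filters(layer, keep_ratio=0.5)
    print(kept_idx, pruned)
    print(size_reduction(8, sum(len(f) for f in pruned)))
```

In practice a pruned network is usually fine-tuned afterwards to recover accuracy; that step, and the assembly-level optimization the abstract mentions, are outside the scope of this sketch.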
This article is indexed in Wanfang Data (万方数据) and other databases.