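Tea garden navigation path recognition method using an improved Unet network
Cite this article:Zhao Yan, Zhang Rentian, Dong Chunwang, Liu Zhongyuan, Li Yang. Tea garden navigation path recognition method using an improved Unet network[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(19): 162-171.
Authors:Zhao Yan  Zhang Rentian  Dong Chunwang  Liu Zhongyuan  Li Yang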
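Affiliation:1. College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000; 2. Tea Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310000; 3. Key Laboratory of Northwest Agricultural Equipment, Ministry of Agriculture and Rural Affairs, Shihezi 832000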
Funding:Zhejiang Province "Jianbing" (Pioneer) R&D Program (2022C02010); Central Public-interest Scientific Institution Basal Research Fund (1610212021004; 1610212022004)
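Abstract:To address the low accuracy, poor real-time performance, and difficult model interpretation of current navigation path recognition between tea garden ridges, this study optimizes the Unet model and proposes a Unet-ResNet34 model that combines the strengths of Unet and ResNet. Taking the navigation path extracted by this model as the basis, path midpoints are generated and then fitted with a multi-segment cubic B-spline curve to produce the inter-ridge navigation line. The model was trained on an augmented training set of tea garden inter-ridge road images and then applied to the validation set for navigation path recognition. Gradient-weighted class activation mapping (Grad-CAM) was used to interpret the recognition process and to compare the results of different models visually. Under different illumination and weed conditions, the Unet-ResNet34 model achieved a mean intersection over union (MIoU) of 91.89% for navigation path segmentation, enabling pixel-level segmentation of inter-ridge roads. The model processes RGB images at an inference speed of 36.8 frames/s, which meets the real-time requirement of navigation path segmentation. The navigation line deviation test gave a mean pixel deviation of 8.2 pixels and a mean distance deviation of 0.022 m; given an average inter-ridge road width of 1 m, the mean distance deviation amounts to 2.2% of the road width. The tea garden tracked vehicle travels at 0 to 1 m/s, and the average processing time for a single tea ridge image is 0.179 s. The results can provide a technical and theoretical basis for visual navigation equipment in tea gardens.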
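The midpoint and B-spline steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `row_midpoints` and `cubic_bspline`, the row-by-row midpoint rule, the uniform knot spacing, and the toy mask are all assumptions made for illustration. The midpoints are used as control points, so the sampled curve approximates them rather than passing through each one.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def row_midpoints(mask: List[List[int]], step: int = 1) -> List[Point]:
    """Midpoint of the path pixels (value 1) in every `step`-th row of a binary mask.

    Assumes the path forms a single run of pixels in each row, so the midpoint
    is the mean of the leftmost and rightmost path columns.
    """
    points: List[Point] = []
    for y in range(0, len(mask), step):
        xs = [x for x, value in enumerate(mask[y]) if value == 1]
        if xs:  # rows without path pixels contribute no midpoint
            points.append(((xs[0] + xs[-1]) / 2.0, float(y)))
    return points


def cubic_bspline(ctrl: List[Point], samples: int = 10) -> List[Point]:
    """Sample a uniform cubic B-spline; each run of 4 control points gives one segment."""
    if len(ctrl) < 4:
        raise ValueError("a cubic B-spline needs at least 4 control points")
    curve: List[Point] = []
    for i in range(len(ctrl) - 3):
        window = ctrl[i:i + 4]
        for s in range(samples):
            t = s / samples
            # Uniform cubic B-spline basis functions; they sum to 1 for any t.
            basis = (
                (1 - t) ** 3 / 6,
                (3 * t**3 - 6 * t**2 + 4) / 6,
                (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
                t**3 / 6,
            )
            x = sum(b * p[0] for b, p in zip(basis, window))
            y = sum(b * p[1] for b, p in zip(basis, window))
            curve.append((x, y))
    return curve


# Toy 5x7 mask (hypothetical data): a straight path three pixels wide, columns 2..4.
mask = [[0, 0, 1, 1, 1, 0, 0] for _ in range(5)]
midpoints = row_midpoints(mask)
navigation_line = cubic_bspline(midpoints)
```

With a real segmentation mask, `step` would typically be larger than 1 so that only a handful of midpoints act as control points, which keeps the fitted line smooth.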

Keywords:navigation  deep learning  tea garden visualization  path recognition  semantic segmentation  spline curve fitting
Received:2022/6/1
Revised:2022/7/26

Navigation path recognition between tea ridges using improved Unet network
Zhao Yan, Zhang Rentian, Dong Chunwang, Liu Zhongyuan, Li Yang. Navigation path recognition between tea ridges using improved Unet network[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(19): 162-171.
Authors:Zhao Yan  Zhang Rentian  Dong Chunwang  Liu Zhongyuan  Li Yang
Affiliation:1. College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China; 2. Tea Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310000, China; 3. Key Laboratory of Northwest Agricultural Equipment, Ministry of Agriculture and Rural Affairs, Shihezi 832000, China
Abstract:Navigation path recognition is one of the most important sub-tasks of intelligent agricultural equipment, and an intelligent tracked vehicle is expected to navigate automatically along the road between tea garden ridges. However, recognizing the navigation path between tea ridges with deep learning models still faces challenges, including low accuracy, poor real-time performance, and limited model interpretability. In this research, a new Unet-ResNet34 model was proposed to recognize the navigation path between tea ridges accurately and rapidly using semantic segmentation. The midpoints of the navigation path were generated from the path extracted by the model, and a multi-segment cubic B-spline curve was then fitted to the midpoints, which served as control points, to generate the navigation line between the tea garden ridges. The Image Labeler toolbox in Matlab 2019 was used to label the navigation path in the collected images, giving a navigation path dataset of 1,824 images, of which 1,568 and 256 were randomly selected for the training set and the validation set, respectively. With the Mean Intersection over Union (MIoU) as the accuracy indicator, the Unet-ResNet34 model reached 91.89% for tea road segmentation under different illumination and weed conditions. The navigation path segmentation mask was used to generate the navigation information and the keypoints for path fitting, and the resulting navigation line was used to calculate the pixel and distance errors.
The mean pixel and distance errors of the predicted navigation line were 8.2 pixels and 0.022 m, respectively. Given a tea navigation path width of about 1 m, the mean distance error was 2.2% of the path width. In terms of real-time performance and model size, the Unet-ResNet34 model had 26.72 M parameters and an inference speed of 36.8 frames per second on RGB images of size 960×544. Gradient-weighted class activation mapping (Grad-CAM) was used to visualize the final extracted features of the improved models. The visualization showed that the optimized Unet-ResNet34 structure highlighted features on the navigation path between the tea ridges while retaining only the most crucial feature extractors. The tracked vehicle mostly travelled at 0-1 m/s in the tea garden, and the average processing time for a single inter-ridge image was 0.179 s. In summary, the improved model can achieve real-time and accurate navigation path recognition between tea ridges. The findings can also provide technical and theoretical support for intelligent agricultural equipment in tea gardens.
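As a rough illustration of the metrics quoted above, the following sketch computes a mean intersection over union for label masks and reproduces the simple arithmetic behind the reported figures. The function `mean_iou`, the constant names, and the toy masks are hypothetical; the numeric constants are the values stated in the abstract. The abstract does not say which steps the 0.179 s per image covers, so the sketch only evaluates each figure on its own.

```python
from typing import List


def mean_iou(pred: List[List[int]], truth: List[List[int]], num_classes: int = 2) -> float:
    """Mean of per-class intersection over union for two label masks of equal shape."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                inter += int(p == c and t == c)
                union += int(p == c or t == c)
        if union:  # ignore classes absent from both masks
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Toy 2x2 masks (hypothetical data): 1 = path, 0 = background.
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
toy_miou = mean_iou(pred, truth)

# Values reported in the abstract.
ROAD_WIDTH_M = 1.0          # average inter-ridge road width
MEAN_DISTANCE_ERR_M = 0.022  # mean distance error of the navigation line
INFERENCE_FPS = 36.8         # segmentation inference speed
PROCESS_TIME_S = 0.179       # average processing time per image
MAX_SPEED_M_S = 1.0          # upper end of the vehicle speed range

error_ratio = MEAN_DISTANCE_ERR_M / ROAD_WIDTH_M     # fraction of the road width
inference_time_s = 1.0 / INFERENCE_FPS               # seconds per inferred frame
travel_per_image_m = MAX_SPEED_M_S * PROCESS_TIME_S  # distance covered per processed image

print(f"toy MIoU: {toy_miou:.4f}")
print(f"error ratio: {error_ratio:.1%}")
print(f"inference time per frame: {inference_time_s:.4f} s")
print(f"travel per processed image at top speed: {travel_per_image_m:.3f} m")
```

At the top speed of 1 m/s the vehicle covers under 0.2 m between successive processed images, which gives a sense of how the processing time relates to the 1 m road width.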
Keywords:navigation  deep learning  visualization of tea garden  path recognition  semantic segmentation  spline curve fitting