1.

Automatic License Plate Recognition System for Vehicles Using a CNN

Parneet Kaur, Yogesh Kumar, Shakeel Ahmed, Abdulaziz Alhumam, Ruchi Singla, Muhammad Fazal Ijaz 《Computers, Materials & Continua》2022,71(1):35-50

Automatic License Plate Recognition (ALPR) systems are important in Intelligent Transportation Services (ITS) because they help ensure effective law enforcement and security. These systems play a significant role in border surveillance, ensuring safeguards, and handling vehicle-related crime. The most effective approach for implementing ALPR systems uses deep learning via a convolutional neural network (CNN). A CNN works on an input image by assigning significance to various features of the image and differentiating them from each other. CNNs are popular for license plate character recognition. However, little has been reported on how these systems perform on unusual varieties of license plates or at night. We present an efficient ALPR system that uses a CNN for character recognition. A combination of pre-processing and morphological operations is applied to enhance input image quality, which aids system efficiency. The system can recognize multi-line, skewed, and multi-font license plates. It also works efficiently in night mode and can be used for different vehicle types. An overall accuracy of 98.13% was achieved using the proposed CNN technique.

2.

FPGA Implementation of Deep Leaning Model for Video Analytics

P. N. Palanisamy, N. Malmurugan 《Computers, Materials & Continua》2022,71(1):791-808

In recent years, deep neural networks have become an influential research subject, and they play a critical role in video processing and analytics. Because video analytics is predominantly hardware-centric, implementing deep neural networks in hardware deserves more research attention. However, the computational complexity and resource demands of deep neural networks are increasing exponentially over time. Convolutional neural networks (CNNs) are among the most popular deep learning architectures, especially for image classification and video analytics, but they need an efficient implementation strategy to support more real-time computation when handling video in hardware. Field-programmable gate arrays (FPGAs) are considered more advantageous than graphics processing units (GPUs) for implementing CNNs in terms of energy efficiency and low computational complexity. Still, an intelligent architecture is required to implement a CNN on an FPGA for video processing. This paper introduces a high-performance, energy-efficient Bat Pruned Ensembled Convolutional network (BPEC-CNN) for processing video in hardware. The system integrates Bat Evolutionary Pruned layers for the CNN and implements new shared Distributed Filtering Structures (DFS) for handling the CNN filter layers with a pipelined data path on the FPGA. In addition, the proposed system adopts a hardware-software co-design methodology for energy efficiency and lower computational complexity. Extensive experiments are carried out using CASIA video datasets with ARTIX-7 FPGA boards (number), and algorithm-centric parameters such as accuracy, sensitivity, and specificity, as well as architecture-centric parameters such as power, area, and throughput, are analyzed. These results are then compared with existing pruned CNN architectures such as CNN-Prunner, and the proposed architecture is shown to perform 25% better than the existing architectures.

3.

Image Augmentation-Based Food Recognition with Convolutional Neural Networks

Lili Pan, Jiaohua Qin, Hao Chen, Xuyu Xiang, Cong Li, Ran Chen 《Computers, Materials & Continua》2019,59(1):297-313

Image retrieval for food ingredients is important work, but it is tremendously tiring, uninteresting, and expensive. Computer vision systems have made extraordinary advances in image retrieval using CNNs. However, applying convolutional neural networks directly to small food datasets is not feasible. In this study, a novel image retrieval approach is presented for small and medium-scale food datasets. It augments images with image transformation techniques to enlarge the datasets, and it improves the average accuracy of food recognition with state-of-the-art deep learning technologies. First, typical image transformation techniques are used to augment food images. Then, transfer learning based on deep learning is applied to extract image features. Finally, a food recognition algorithm is applied to the extracted deep-feature vectors. The presented image-retrieval architecture is analyzed on a small-scale food dataset composed of forty-one categories of food ingredients with one hundred pictures per category. Extensive experimental results demonstrate the advantages of the image-augmentation architecture for small and medium datasets using deep learning. The novel approach combines image augmentation, ResNet feature vectors, and SMO classification, and comprehensive experiments show its superiority for food detection on small and medium-scale datasets.

4.

Two-tier architecture for unconstrained handwritten character recognition

K. V. Prema, N. V. Subba Reddy 《Sadhana》2002,27(5):585-594

In this paper, we propose an approach that combines unsupervised and supervised learning techniques for unconstrained handwritten numeral recognition. This approach uses the Kohonen self-organizing neural network for data classification in the first stage and the learning vector quantization (LVQ) model in the second stage to improve classification accuracy. The combined architecture performs better than the Kohonen self-organizing map alone. In the proposed approach, the collection of centroids at different phases of training plays a vital role in the performance of the recognition system. Four experiments were conducted, and the results show that collecting centroids in the middle of training gives high performance in terms of speed and accuracy. The systems developed also resolve the confusion between handwritten numerals.
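
The second-stage refinement described in this abstract can be illustrated with the standard LVQ1 update rule. This is a minimal sketch and not the authors' implementation: the function names, the learning rate, and the toy prototypes (which stand in for centroids collected from a Kohonen map) are assumptions made for illustration.

```python
def nearest_prototype(prototypes, x):
    """Return the index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i][0], x)),
    )


def lvq1_step(prototypes, x, label, lr=0.1):
    """Apply one LVQ1 update in place and return the index of the winning prototype.

    prototypes is a list of (vector, class_label) pairs. The winner moves
    toward x when its label matches the sample label and away from x otherwise.
    """
    i = nearest_prototype(prototypes, x)
    vec, proto_label = prototypes[i]
    sign = 1.0 if proto_label == label else -1.0
    prototypes[i] = ([p + sign * lr * (v - p) for p, v in zip(vec, x)], proto_label)
    return i


# Hypothetical centroids, e.g. taken from a trained self-organizing map.
prototypes = [([0.0, 0.0], 0), ([1.0, 1.0], 1)]
lvq1_step(prototypes, [0.2, 0.0], label=0, lr=0.5)  # matching class: pulled closer
lvq1_step(prototypes, [0.8, 1.0], label=0, lr=0.5)  # mismatched class: pushed away
```

Starting from centroids found by unsupervised training and then correcting them with labeled samples is the general idea of a two-tier scheme; which centroids are collected, and when, is specific to the paper.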

5.

Bolt Loosening State Monitoring Based on the Inner Product Matrix and a Convolutional Autoencoder (indexed in EI, PKU Core, CSCD)

Zhang Minzhao (张敏照), Wang Le (王乐), Tian Xinhai (田鑫海) 《Engineering Mechanics》2022,39(12):222-231

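Bolt loosening in bolted joints can easily lead to structural failure, and monitoring the loosening state of bolts in a structure is currently an active research topic. Using correlation analysis of structural vibration responses under ambient excitation, combined with deep learning, this paper studies a neural network model that jointly uses the inner product matrix (IPM) and a convolutional autoencoder (CAE), namely the IPM-CAE deep learning model. An experimental study on monitoring the bolt loosening state of a bolted lap plate verifies the feasibility and effectiveness of the method. Compared with an IPM-based convolutional neural network (CNN), a stacked autoencoder (SAE), and a capsule network (CapsNet), the IPM-CAE method shows faster training convergence and higher recognition accuracy.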

6.

A Hybrid Duo-Deep Learning and Best Features Based Framework for Action Recognition

Muhammad Naeem Akbar, Farhan Riaz, Ahmed Bilal Awan, Muhammad Attique Khan, Usman Tariq, Saad Rehman 《Computers, Materials & Continua》2022,73(2):2555-2576

Human Action Recognition (HAR) is a current research topic in computer vision, driven by the important application of video surveillance. Researchers in computer vision have introduced various intelligent methods based on deep learning and machine learning, but they still face many challenges, such as similarity between actions and redundant features. In this paper, we propose a framework for accurate HAR based on deep learning and an improved feature optimization algorithm. The proposed framework includes several critical steps, from deep learning feature extraction to feature classification. The original video frames are normalized before training the fine-tuned deep learning models (MobileNet-V2 and Darknet53). The pre-trained deep models are used for feature extraction, and the extracted features are fused using the canonical correlation approach. An improved particle swarm optimization (IPSO)-based algorithm is then used to select the best features. Finally, the selected features are used to classify actions with various classifiers. The experiments were performed on six publicly available datasets, namely KTH, UT-Interaction, UCF Sports, Hollywood, IXMAS, and UCF YouTube, and attained accuracies of 98.3%, 98.9%, 99.8%, 99.6%, 98.6%, and 100%, respectively. Compared with existing techniques, the proposed framework achieved improved accuracy.

7.

Pedestrian Physical Education Training Over Visualization Tool

Tamara al Shloul, Israr Akhter, Suliman A. Alsuhibany, Yazeed Yasin Ghadi, Ahmad Jalal, Jeongmin Park 《Computers, Materials & Continua》2022,73(2):2389-2405

E-learning approaches are among the most important learning platforms for learners using electronic equipment. Such study techniques are useful for other groups of learners in areas such as crowds, pedestrians, sports, transport, communication, emergency services, management systems, and the education sector. E-learning remains a challenging domain in which researchers and developers seek new trends and advanced tools and methods, and many of them are currently working in this domain to meet the requirements of industry and the environment. In this paper, we propose a method for pedestrian behavior mining of aerial data using deep flow features, a graph mining technique, and a convolutional neural network (CNN). For input data, the state-of-the-art University of Minnesota (UMN) crowd activity dataset is adopted, which contains aerial indoor and outdoor views of pedestrians. Pre-processing is applied to remove extra information and reduce computational cost. Deep flow features are then extracted to obtain more accurate information. Furthermore, a graph mining algorithm is applied to deal with repetition in the feature data and to mine features, while the CNN is applied for pedestrian behavior mining. The proposed method achieves a mean accuracy of 84.50% and an error rate of 15.50%. The achieved results therefore show higher accuracy than state-of-the-art classification algorithms such as decision trees and artificial neural networks (ANNs).

8.

LPNet: A lightweight CNN with discrete wavelet pooling strategies for colon polyps classification

Pallabi Sharma, Dipankar Das, Anmol Gautam, Bunil Kumar Balabantaray 《International journal of imaging systems and technology》2023,33(2):495-510

The traditional process of disease diagnosis from medical images is manual, which is tedious and arduous. A computer-aided diagnosis (CAD) system can work as an assistive tool to improve the diagnosis process. In this pursuit, this article introduces a unique architecture, LPNet, for classifying colon polyps from colonoscopy video frames. Colon polyps are abnormal growths of cells in the colon wall. Over time, untreated colon polyps may cause colorectal cancer. Different systems based on convolutional neural networks (CNNs) have been developed in recent years. A CNN uses pooling to reduce the number of parameters and expand the receptive field. However, pooling results in data loss and is deleterious to subsequent processes. Pooling strategies based on discrete wavelet operations are proposed in our architecture as a solution to this problem, with the promise of achieving a better trade-off between receptive field size and computing efficiency. According to experimental results on a colonoscopy dataset, the overall performance of this model is superior to the others. LPNet with a bio-orthogonal wavelet achieved the highest performance, with an accuracy of 93.55%. It outperforms the other state-of-the-art (SOTA) CNN models on the polyp classification task, and it is lightweight in terms of the number of learnable parameters compared with them, making the model easily deployable on edge devices.
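
The idea of pooling with a discrete wavelet transform can be sketched with the simplest case, a one-level 2D Haar transform that keeps only the low-frequency (LL) subband. This is an illustrative assumption rather than LPNet itself: the abstract reports its best result with a bio-orthogonal wavelet, and the function names below are hypothetical.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of a 2D list with even height and width.

    Returns (ll, details), where ll is the low-frequency subband and details
    is a tuple of the three high-frequency subbands.
    """
    ll, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_c, row_r, row_d = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            e, f = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + e + f) / 2)  # local average (scaled)
            row_c.append((a - b + e - f) / 2)   # differences across columns
            row_r.append((a + b - e - f) / 2)   # differences across rows
            row_d.append((a - b - e + f) / 2)   # diagonal differences
        ll.append(row_ll)
        d_cols.append(row_c)
        d_rows.append(row_r)
        d_diag.append(row_d)
    return ll, (d_cols, d_rows, d_diag)


def wavelet_pool(img):
    """Halve each spatial dimension by keeping only the LL subband."""
    return haar_dwt2(img)[0]


feature_map = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [5, 5, 1, 3],
    [5, 5, 1, 3],
]
pooled = wavelet_pool(feature_map)  # a 2x2 map
```

Unlike max pooling, the transform is invertible when the detail subbands are kept, so the information discarded by the pooling step is explicit rather than lost by selection.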

9.

Detecting Driver Distraction Using Deep-Learning Approach

Khalid A. AlShalfan, Mohammed Zakariah 《Computers, Materials & Continua》2021,68(1):689-704

Currently, distracted driving is among the most important causes of traffic accidents. Consequently, intelligent vehicle driving systems have become increasingly important. Recently, interest has increased in driver-assistance systems that detect driver actions and help drivers drive safely. Although these studies use some distinct data types, such as the physical condition of the driver, audio and visual features, and vehicle information, the primary data source is images of the driver, including the face, arms, and hands, taken with a camera inside the car. In this study, an architecture based on a convolutional neural network (CNN) is proposed to classify and detect driver distraction. An efficient CNN with high accuracy is implemented, and to implement deep convolutional networks for large-scale image recognition, a new architecture is proposed based on the existing Visual Geometry Group (VGG-16) architecture. The proposed architecture was evaluated using the StateFarm dataset for driver-distraction detection. This dataset is publicly available on Kaggle and is frequently used for this type of research. The proposed architecture achieved 96.95% accuracy.

10.

WDBM: Weighted Deep Forest Model Based Bearing Fault Diagnosis Method

Letao Gao, Xiaoming Wang, Tao Wang, Mengyu Chang 《Computers, Materials & Continua》2022,72(3):4741-4754

In bearing fault diagnosis research, classical deep learning models suffer from too many parameters and a high computing cost. In addition, classical deep learning models are not effective when data are scarce. In recent years, the deep forest has been proposed, which has fewer hyperparameters and an adaptive model depth. Weighted deep forest (WDF) further improves the deep forest by assigning weights to decision trees based on the accuracy of each tree. In this paper, a weighted deep forest model-based bearing fault diagnosis method (WDBM) is proposed. The WDBM is a novel bearing fault diagnosis method that not only inherits the advantages of the WDF, such as strong robustness, good generalization, fewer parameters, and faster convergence, but also achieves effective diagnosis with high precision and low cost when samples are few. To verify the performance of the WDBM, experiments are carried out on the Case Western Reserve University (CWRU) bearing data set. Experimental results demonstrate that the WDBM achieves comparable recognition accuracy with less computational overhead and faster convergence.
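
Weighting each tree by its accuracy can be sketched as a weighted average of the class-probability vectors that the trees output. The abstract does not give the exact weighting formula, so normalizing the accuracies to sum to one is an assumption, as are the function name and the toy numbers.

```python
def weighted_vote(class_probs, accuracies):
    """Combine per-tree class probabilities using accuracy-proportional weights.

    class_probs: one probability vector per tree.
    accuracies:  one validation accuracy per tree (assumed weighting signal).
    """
    total = sum(accuracies)
    weights = [acc / total for acc in accuracies]
    n_classes = len(class_probs[0])
    return [
        sum(w * probs[k] for w, probs in zip(weights, class_probs))
        for k in range(n_classes)
    ]


# Three hypothetical trees voting on two fault classes.
probs = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
accs = [0.9, 0.6, 0.5]
combined = weighted_vote(probs, accs)
```

With equal weights the three trees above would tie at 0.5 per class; the accuracy weights break the tie in favor of the most reliable tree.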

11.

Deep learning for photoacoustic tomography from sparse data

Stephan Antholzer, Johannes Schwab 《Inverse Problems in Science & Engineering》2019,27(7):987-1005

The development of fast and accurate image reconstruction algorithms is a central aspect of computed tomography. In this paper, we investigate this issue for the sparse data problem in photoacoustic tomography (PAT). We develop a direct and highly efficient reconstruction algorithm based on deep learning. In our approach, image reconstruction is performed with a deep convolutional neural network (CNN) whose weights are adjusted prior to the actual image reconstruction based on a set of training data. The proposed reconstruction approach can be interpreted as a network that uses the PAT filtered backprojection algorithm for the first layer, followed by the U-net architecture for the remaining layers. Actual image reconstruction with deep learning consists of one evaluation of the trained CNN, which does not require a time-consuming solution of the forward and adjoint problems. At the same time, our numerical results demonstrate that the proposed deep learning approach reconstructs images with a quality comparable to state-of-the-art iterative approaches for PAT from sparse data.

12.

A Deep Learning Hierarchical Ensemble for Remote Sensing Image Classification

Seung-Yeon Hwang, Jeong-Joon Kim 《Computers, Materials & Continua》2022,72(2):2649-2663

Artificial intelligence, which has recently emerged with the rapid development of information technology, is drawing attention as a tool for solving various problems posed by society and industry. In particular, convolutional neural networks (CNNs), a type of deep learning technology, are prominent in computer vision fields such as image classification, image recognition, and object tracking. Training these CNN models requires a large amount of data, and a lack of data can degrade performance through overfitting. As studies on CNN architecture development and optimization have become active, ensemble techniques have emerged that perform image classification by combining features extracted from multiple CNN models. In this study, data augmentation and contour image extraction were performed to overcome the data shortage problem. In addition, we propose a hierarchical ensemble technique to achieve high image classification accuracy even when training on a small amount of data. First, we trained pretrained VGGNet, GoogLeNet, ResNet, DenseNet, and EfficientNet models on the UC-Merced land use dataset and on the contour images extracted from each image. We then applied a hierarchical ensemble technique to the number of cases in which each model can be deployed. These experiments were performed with training-set proportions of 30%, 50%, and 70%, resulting in a performance improvement of up to 4.68% compared to the average accuracy of the entire model.

13.

Image Recognition of Citrus Diseases Based on Deep Learning

Zongshuai Liu, Xuyu Xiang, Jiaohua Qin, Yun Tan, Qin Zhang, Neal N. Xiong 《Computers, Materials & Continua》2021,66(1):457-466

In recent years, the development of machine learning and deep learning has made it possible to identify and even control crop diseases using electronic devices instead of manual observation. In this paper, an image recognition method for citrus diseases based on deep learning is proposed. We built a citrus image dataset covering six common citrus diseases. A deep learning network is trained on these images and can effectively identify and classify crop diseases. In the experiment, we use the MobileNetV2 model as the primary network and compare it with other network models in terms of speed, model size, and accuracy. Results show that our method reduces prediction time and model size while maintaining good classification accuracy. Finally, we discuss the significance of using MobileNetV2 to identify and classify agricultural diseases on mobile terminals and put forward relevant suggestions.

14.

Enhancing CNN for Forensics Age Estimation Using CGAN and Pseudo-Labelling

Sultan Alkaabi, Salman Yussof, Sameera Al-Mulla 《Computers, Materials & Continua》2023,74(2):2499-2516

Age estimation using forensic odontology is an important process in identifying victims in criminal or mass disaster cases. Traditionally, this process is done manually by a human expert. However, speed and accuracy may vary depending on the expert's level of expertise and on other human factors such as fatigue and attentiveness. To improve recognition speed and consistency, researchers have proposed automated age estimation using deep learning techniques such as the convolutional neural network (CNN). A CNN requires many training images to reach high recognition accuracy. Unfortunately, it is very difficult to obtain a large number of dental images for training the CNN because of the need to comply with privacy acts. A promising solution to this problem is the Generative Adversarial Network (GAN), a technique that can generate synthetic images with statistics similar to those of the training set. A variation of the GAN called the Conditional GAN (CGAN) allows the generation of synthetic images to be controlled more precisely, so that only the specified type of image is generated. This paper proposes a CGAN for generating new dental images to increase the number of images available for training a CNN model to perform age estimation. We also propose a pseudo-labelling technique to label the generated images with the proper age and gender. We used the combination of real and generated images to train Dental Age and Sex Net (DASNET), a CNN model for dental age estimation. In the experiment conducted, the accuracy, coefficient of determination (R2), and Absolute Error (AE) of DASNET improved to 87%, 0.85, and 1.18 years, respectively, as opposed to 74%, 0.72, and 3.45 years when DASNET is trained on a smaller number of real images only.

15.

A deep learning model integrating convolution neural network and multiple kernel K means clustering for segmenting brain tumor in magnetic resonance images

Balakumaresan Ragupathy, Manivannan Karunakaran 《International journal of imaging systems and technology》2021,31(1):118-127

In medical imaging, segmenting brain tumors is a vital task that enables early diagnosis and treatment. Manual segmentation of brain tumors in magnetic resonance (MR) images is time-consuming and challenging. Hence, there is a need for a computer-aided brain tumor segmentation approach. Using deep learning algorithms, a robust brain tumor segmentation approach is implemented by integrating a convolutional neural network (CNN) and multiple kernel K-means clustering (MKKMC). In the proposed CNN-MKKMC approach, the CNN classifies MR images as normal or abnormal. Next, the MKKMC algorithm segments the brain tumor from the abnormal brain image. The proposed CNN-MKKMC algorithm is compared with existing segmentation methods both visually and objectively in terms of accuracy, sensitivity, and specificity. The experimental results demonstrate that the proposed CNN-MKKMC approach yields better accuracy in segmenting brain tumors at a lower time cost.

16.

A Novel Hybrid Machine Learning Approach for Classification of Brain Tumor Images

Abdullah A. Asiri, Amna Iqbal, Javed Ferzund, Tariq Ali, Muhammad Aamir, Khalaf A. Alshamrani, Hassan A. Alshamrani, Fawaz F. Alqahtani, Muhammad Irfan, Ali H. D. Alshehri 《Computers, Materials & Continua》2022,73(1):641-655

Abnormal growth of brain tissue is the real cause of brain tumors. Diagnosing a brain tumor at an early stage is one of the key steps in saving a patient's life. Manual segmentation of brain tumor magnetic resonance images (MRIs) takes time, and results vary significantly in low-level features. To address this issue, we propose a multi-level deep convolutional neural network (CNN) built on a ResNet-50 feature extractor for reliable image segmentation that considers the low-level features of MRI. In this model, we extract features with the ResNet-50 architecture and feed these feature maps to the multi-level CNN model. For the classification task, we collected MRI data from a total of 2043 patients covering normal, benign, and malignant cases. Three models, a CNN, a multi-level CNN, and a ResNet-50-based multi-level CNN, are used for detection and classification of brain tumors. All model results are reported in terms of precision (P), recall (R), accuracy (Acc), and F1-score (F1-S). The average results obtained are much better than those of existing methods. This modified transfer learning architecture might help radiologists and doctors as a more effective system for tumor diagnosis.
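
The four reported metrics have standard definitions, which the following sketch computes for one class treated as positive against all others. Evaluating a three-class problem one class at a time in this way is an assumption, and the labels and predictions are made-up toy data, not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive):
    """Compute precision, recall, accuracy, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": correct / len(pairs),
        "f1": f1,
    }


# Toy labels for illustration only.
y_true = ["malignant", "malignant", "malignant", "benign", "normal"]
y_pred = ["malignant", "malignant", "benign", "malignant", "normal"]
scores = classification_metrics(y_true, y_pred, positive="malignant")
```

Here accuracy is computed over all classes, while precision, recall, and F1 refer only to the chosen positive class.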

17.

Multi-Object Tracking Algorithm Based on YOLOv3 Detection and Feature Point Matching

Tan Fang (谭芳), Mu Ping'an (穆平安), Ma Zhongxue (马忠雪) 《Acta Metrologica Sinica》2021,42(2):157-162

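Traditional multi-object tracking algorithms track pedestrians poorly because pedestrian detection is slow and is easily affected by illumination changes, fast pedestrian motion, and partial occlusion. To address these problems, an algorithm is proposed that follows the classic tracking-by-detection paradigm: the deep learning YOLOv3 algorithm detects pedestrian targets, and the FAST corner detector and the BRISK feature descriptor are then used to match feature points of pedestrian targets between adjacent frames, achieving multi-object pedestrian tracking. Experimental results show that pedestrian targets are tracked continuously and well under complex conditions such as backlighting, fast motion, and partial occlusion, with an average precision of 87.7% and a speed of 35 frames/s.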

18.

Anomaly Based Camera Prioritization in Large Scale Surveillance Networks

Altaf Hussain, Khan Muhammad, Hayat Ullah, Amin Ullah, Ali Shariq Imran, Mi Young Lee, Seungmin Rho, Muhammad Sajjad 《Computers, Materials & Continua》2022,70(2):2171-2190

Digital surveillance systems are ubiquitous and continuously generate massive amounts of data, and manual monitoring is required in order to recognise human activities in public areas. Intelligent surveillance systems that can automatically identify normal and abnormal activities are highly desirable, as these would allow for efficient monitoring by selecting only those camera feeds in which abnormal activities are occurring. This paper proposes an energy-efficient camera prioritisation framework that intelligently adjusts the priority of cameras in a vast surveillance network using feedback from the activity recognition system. The proposed system addresses the limitations of existing manual monitoring surveillance systems using a three-step framework. In the first step, the salient frames are selected from the online video stream using a frame differencing method. In the second step, a lightweight 3D convolutional neural network (3DCNN) architecture is applied to extract spatio-temporal features from the salient frames. Finally, the probabilities predicted by the 3DCNN and the metadata of the cameras are processed using a linear threshold gate sigmoid mechanism to control the priority of the camera. The proposed system performs well compared to state-of-the-art violent activity recognition methods in terms of efficient camera prioritisation in large-scale surveillance networks. Comprehensive experiments and an evaluation of activity recognition and camera prioritisation showed that our approach achieved an accuracy of 98% with an F1-score of 0.97 on the Hockey Fight dataset, and an accuracy of 99% with an F1-score of 0.98 on the Violent Crowd dataset.
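
The first step of the framework, selecting salient frames by frame differencing, can be sketched as follows. The abstract does not specify the difference measure or the threshold, so the mean absolute pixel difference, the threshold value, and the function names are assumptions chosen for illustration.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def select_salient_frames(frames, threshold):
    """Return indices of frames that differ from their predecessor by more than threshold."""
    salient = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[i - 1]) > threshold:
            salient.append(i)
    return salient


# Four tiny 2x2 "frames": a static scene, a sudden change, then a small flicker.
frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
    [[10, 10], [10, 10]],
    [[10, 10], [10, 12]],
]
salient = select_salient_frames(frames, threshold=1.0)
```

Only the frames that pass this cheap test would need to be handed to a heavier model, which is how such a filter saves computation.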

19.

Feature Fusion-Based Deep Learning Network to Recognize Table Tennis Actions

Chih-Ta Yen, Tz-Yun Chen, Un-Hung Chen, Guo-Chang Wang, Zong-Xian Chen 《Computers, Materials & Continua》2023,74(1):83-99

A system for classifying four basic table tennis strokes using wearable devices and deep learning networks is proposed in this study. The wearable device consisted of a six-axis sensor, a Raspberry Pi 3, and a power bank. Multiple kernel sizes were used in a convolutional neural network (CNN) to evaluate their performance in extracting features. Moreover, a multi-scale CNN with two kernel sizes was used to perform feature fusion at different scales by concatenation. The CNN successfully recognized the four table tennis strokes. Experimental data were obtained from 20 research participants who wore sensors on the back of their hands while performing the four strokes in a laboratory environment. The data were collected to verify the performance of the proposed models for wearable devices. Finally, the sensor and multi-scale CNN designed in this study achieved accuracy and F1 scores of 99.58% and 99.16%, respectively, for the four strokes. The accuracy under five-fold cross-validation was 99.87%. This result also shows that the multi-scale CNN is robust under five-fold cross-validation.
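
Feature fusion across two kernel sizes can be sketched on a single 1D sensor channel: the same signal is filtered with a short and a long kernel, each response is reduced by global max pooling, and the results are concatenated. The averaging kernels, the pooling choice, and the function names are assumptions; the paper's learned filters and layer layout are not given in the abstract.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def multiscale_features(signal, kernels):
    """Concatenate one max-pooled response per kernel into a feature vector."""
    features = []
    for kernel in kernels:
        response = conv1d_valid(signal, kernel)
        features.append(max(response))  # global max pooling per scale
    return features


# One hypothetical accelerometer axis and two averaging kernels of size 3 and 5.
signal = [0, 1, 2, 3, 2, 1, 0]
kernels = [[1 / 3] * 3, [1 / 5] * 5]
fused = multiscale_features(signal, kernels)
```

The short kernel responds to brief peaks and the long kernel to slower trends, so the concatenated vector describes the stroke at both time scales.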

20.

Deep Learning-Based Property Prediction for Short-Fiber-Reinforced Polyurethane Composites

Yan Hai (闫海), Deng Zhongmin (邓忠民) 《Acta Materiae Compositae Sinica》2019,36(6):1413-1420

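Building on the strengths of deep learning in image recognition, a convolutional neural network (CNN) is applied as a finite element surrogate model to predict the effective elastic parameters of polyurethane composites reinforced with randomly distributed planar short fibers, and a data augmentation method is proposed to address the overfitting that arises during training. To verify the surrogate model, its accuracy in predicting the effective Young's modulus and shear modulus is compared with that of a traditional surrogate model. On this basis, the CNN surrogate model is combined with the Monte Carlo method to study the forward propagation of error caused by uncertainty in the material's micro-geometric parameters. The results show that, compared with the traditional surrogate model, the CNN model learns the internal features of the image samples better, gives more accurate predictions, and remains reasonably robust within a certain range outside the training sample space. As the fiber aspect ratio increases, uncertainty in the micro-geometric parameters propagates larger errors to the predicted effective properties.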
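
Forward propagation of input uncertainty through a surrogate with the Monte Carlo method can be sketched as below. The trained CNN surrogate is replaced by a hypothetical linear function, and the Gaussian input distribution, the sample count, and all names and numbers are assumptions for illustration only.

```python
import random
import statistics


def propagate_uncertainty(surrogate, mean, std, n_samples=20000, seed=0):
    """Sample an uncertain input, evaluate the surrogate, and summarize the outputs.

    Returns the mean and sample standard deviation of the surrogate outputs.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outputs = [surrogate(rng.gauss(mean, std)) for _ in range(n_samples)]
    return statistics.fmean(outputs), statistics.stdev(outputs)


def toy_surrogate(aspect_ratio):
    """Hypothetical stand-in for a trained surrogate: property = 2 * aspect ratio."""
    return 2.0 * aspect_ratio


out_mean, out_std = propagate_uncertainty(toy_surrogate, mean=10.0, std=1.0)
```

For this linear stand-in the output spread is simply twice the input spread; with a nonlinear surrogate the same loop shows how the propagated error changes across the input range.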