
Citation: Hakan Pabuçcu, Serdar Ongan, Ayse Ongan. Forecasting the movements of Bitcoin prices: an application of machine learning algorithms[J]. Quantitative Finance and Economics, 2020, 4(4): 679-692. doi: 10.3934/QFE.2020031
[1] | Hongxia Ni, Minzhen Wang, Liying Zhao . An improved Faster R-CNN for defect recognition of key components of transmission line. Mathematical Biosciences and Engineering, 2021, 18(4): 4679-4695. doi: 10.3934/mbe.2021237 |
[2] | Chen Chen, Guowu Yuan, Hao Zhou, Yutang Ma, Yi Ma . Optimized YOLOv7-tiny model for smoke detection in power transmission lines. Mathematical Biosciences and Engineering, 2023, 20(11): 19300-19319. doi: 10.3934/mbe.2023853 |
[3] | Kangjian Sun, Ju Huo, Qi Liu, Shunyuan Yang . An infrared small target detection model via Gather-Excite attention and normalized Wasserstein distance. Mathematical Biosciences and Engineering, 2023, 20(11): 19040-19064. doi: 10.3934/mbe.2023842 |
[4] | Miaolong Cao, Hao Fu, Jiayi Zhu, Chenggang Cai . Lightweight tea bud recognition network integrating GhostNet and YOLOv5. Mathematical Biosciences and Engineering, 2022, 19(12): 12897-12914. doi: 10.3934/mbe.2022602 |
[5] | Wenjie Liang . Research on a vehicle and pedestrian detection algorithm based on improved attention and feature fusion. Mathematical Biosciences and Engineering, 2024, 21(4): 5782-5802. doi: 10.3934/mbe.2024255 |
[6] | Kun Zheng, Bin Li, Yu Li, Peng Chang, Guangmin Sun, Hui Li, Junjie Zhang . Fall detection based on dynamic key points incorporating preposed attention. Mathematical Biosciences and Engineering, 2023, 20(6): 11238-11259. doi: 10.3934/mbe.2023498 |
[7] | Xian Fu, Xiao Yang, Ningning Zhang, RuoGu Zhang, Zhuzhu Zhang, Aoqun Jin, Ruiwen Ye, Huiling Zhang . Bearing surface defect detection based on improved convolutional neural network. Mathematical Biosciences and Engineering, 2023, 20(7): 12341-12359. doi: 10.3934/mbe.2023549 |
[8] | Zeyong Huang, Yuhong Li, Tingting Zhao, Peng Ying, Ying Fan, Jun Li . Infusion port level detection for intravenous infusion based on Yolo v3 neural network. Mathematical Biosciences and Engineering, 2021, 18(4): 3491-3501. doi: 10.3934/mbe.2021175 |
[9] | Jiaming Ding, Peigang Jiao, Kangning Li, Weibo Du . Road surface crack detection based on improved YOLOv5s. Mathematical Biosciences and Engineering, 2024, 21(3): 4269-4285. doi: 10.3934/mbe.2024188 |
[10] | Jing Zhou, Ze Chen, Xinhan Huang . Weakly perceived object detection based on an improved CenterNet. Mathematical Biosciences and Engineering, 2022, 19(12): 12833-12851. doi: 10.3934/mbe.2022599 |
Inspecting power transmission lines is a daily task for power grid companies. China has a growing number of long-distance transmission lines, so the patrol workload is heavy and operation and maintenance costs are high. Long-distance transmission lines run through complex terrain and cover a wide area. Manual inspection of key components can no longer meet practical needs, and more intelligent detection methods are urgently needed. Large numbers of transmission line images are captured by cameras installed on transmission lines or on unmanned aerial vehicles, and these images are used for inspection based on computer vision. With its high efficiency, accuracy and safety, image-based inspection has gradually become an important means of transmission line inspection [1]. For automatic image-based inspection, it is important to detect the key components of transmission lines accurately, because these detections are the foundation for detecting key component defects and abnormal events.
Object detection, one of the most fundamental and challenging problems in computer vision, has received significant attention in recent years [2]. Following the tremendous success of deep learning in image classification, object detection techniques based on deep learning have been actively studied [3]. Adding an attention mechanism to the detection framework has been found to improve detection accuracy [4]. Current research also extends to detecting objects in video [5]. After detecting a target, many applications still need to track it, and target tracking must cope with target occlusion and complex backgrounds [6]. Object detection based on deep learning has had a major impact on many applications [7,8].
Some researchers have proposed automatic detection methods for key components of transmission lines. Lin et al. [9] improved the faster region-based convolution network (Faster R-CNN) model to achieve multi-object detection in transmission line patrol images. Although the accuracy improved, the detection speed still needs to be improved. Li et al. [10] improved the single shot multibox detector (SSD) model to detect pin defects in transmission lines. The detection accuracy is higher than that of traditional algorithms, but the recall and average precision (AP) are not high. Zhang et al. [11] replaced the backbone network with re-parameterization visual geometry group (RepVGG) modules in the You Only Look Once Version 3 (YOLOv3) model and added a multi-scale detection box to the network. This improvement achieved foreign object detection on transmission lines. However, although the accuracy improved, the detection speed still needs to be improved. Tao et al. [12] used convolutional neural networks to learn the properties of insulators and to locate them. This approach can efficiently and automatically detect insulators in UAV images, but the accuracy is not sufficient. Jenssen et al. [13] used data enhancement and multi-stage component detection methods to address training data deficiency, sample imbalance and small target detection. However, the detection accuracy remains insufficient. Chen et al. [14] combined YOLOv3 with Super Resolution Convolutional Network (SRCNN), and the accuracy is 1–3% higher than that of Faster R-CNN and SSD. The method runs in almost real time, but problems remain in small object detection. Liang et al. [15] detected grid component defects based on the Faster R-CNN model. The method effectively improves the detection accuracy, but it produces false positives and false negatives for certain defects. Ni et al. [16] also improved the Faster R-CNN model, using Inception-ResNet-V2 as the basic feature extraction network, which effectively improved the network operation efficiency. Although the fault detection accuracy for transmission lines reaches 98.65%, the detection speed needs to be improved. Chen et al. [17] combined deformable convolution network (DCN) and feature pyramid network (FPN), and used a data-driven iterative learning algorithm to form an intelligent closed-loop image processing pipeline. This method provides new ideas for improving the efficiency of grid detection, but the detection accuracy needs to be enhanced. Liu et al. [18] constructed a large dataset for transmission lines to better suit the complex detection environment. Their method improves the detection performance on small objects, but the algorithm has high loss fluctuation and slow convergence.
Transmission line images contain many tiny targets, large variations in object scale and shooting angle, and complex backgrounds, and detection accuracy must be balanced against speed, so a robust and fast model is urgently needed [19,20,21]. The YOLOv5 model generates its anchor boxes adaptively and can predict at multiple scales. It is better suited to our application and balances speed and accuracy better than other models such as Faster R-CNN, SSD, CenterNet, YOLOv3, and YOLOv4 (You Only Look Once Version 4). Therefore, we propose an improved YOLOv5s model for detecting key targets in power transmission lines.
In this paper, we propose a target detection model for key components of transmission lines according to the requirements of intelligent inspection of power grids. Our model has higher detection accuracy than other models, and the detection speed is also faster than most models. Our model can be integrated into the front-end image acquisition equipment for power grid inspection and monitoring.
Our dataset is from Yunnan Limited Company of China Southern Power Grid. The images are captured by surveillance cameras installed on transmission towers or by unmanned aerial vehicles (UAVs) used for transmission line inspection. Currently, the power grid monitoring system is relatively complete, and most image data comes from surveillance cameras. UAVs are only used in the few places that surveillance cameras cannot cover. Therefore, most of the images in the dataset are captured by surveillance cameras, and very few by UAVs. The dataset contains 4290 images with a resolution of 1920 × 1080 pixels. The backgrounds of these images are complex, including hills, land, and streets.
We defined and annotated five types of targets according to the application needs of power grid companies: screws, poles, insulators, transmission towers, and vehicles. The first four are key components of transmission lines. Because large trucks and engineering vehicles can easily damage transmission lines, power grid companies specifically require that vehicles be detected as well.
The five categories of labeled targets are shown in Figure 1.
After labeling the dataset, we counted the number and size of various targets, and the results are shown in Figure 2.
Figure 2(a) shows that the width and height of most targets are less than 10% of the image size, so they are small targets. Figure 2(b) shows that the number of samples per class is not balanced, with vehicles being the fewest. The dataset also shows a large gap between the height and the width of electric poles and transmission towers.
Given the application requirements and the characteristics of this dataset, we use the YOLOv5s network and improve it for object detection of key components of power grid transmission lines.
In practical applications, in key areas (substations and key areas for abnormality monitoring), we consider integrating the detection module into the cameras for real-time detection. In non-key areas (field transmission lines), we capture an image every 30 minutes and transmit it through a 4G network to the data center, where detection is performed. Because the object detection method for key components of transmission lines needs to be integrated into outdoor surveillance equipment, it must run fast on low-end hardware. YOLOv5 is a fast object detection model that can be applied in the actual working environment, so we chose YOLOv5 as the basic model. According to the characteristics of our application, we improved the distance measurement in the K-means clustering, added an attention mechanism, and upgraded the loss function.
The YOLO family consists of regression-based algorithms for object detection [22]. YOLOv5 is the fifth version of the YOLO series. The YOLOv5 series contains four object detection versions: YOLOv5s, YOLOv5m (You Only Look Once Version 5 Middle), YOLOv5l (You Only Look Once Version 5 Large), and YOLOv5x (You Only Look Once Version 5 Extra Large). They differ in network depth, network volume, number of parameters, and feature map width [23,24,25]. Among them, YOLOv5s has the smallest network width and depth, so it is the fastest but the least accurate. Its network structure is shown in Figure 3.
In the YOLOv5s model, the K-means algorithm uses Euclidean distance to measure the distance between samples. Poles and transmission towers are tall but narrow, so their height-to-width ratio is large. Using the Euclidean distance to measure the distance between the predicted box and the real box of electric poles and transmission towers leads to a significant error in the clustering results. To avoid this error, we replaced the Euclidean distance with the 1-IoU (Intersection over Union) distance, which reduces the error caused by the large height-to-width ratio of detected objects [26,27].
The 1-IoU distance $D_{IoU}$ based on IoU is calculated as follows:
$$D_{IoU} = 1 - IoU = 1 - \frac{|A \cap B|}{|A \cup B|} \tag{1}$$
where A represents the real box and B represents the prediction box.
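As an illustration of Eq (1), the following is a minimal sketch of the 1-IoU distance for two axis-aligned boxes. The corner format `(x1, y1, x2, y2)` and the function name `iou_distance` are assumptions made for this example and are not taken from the original implementation.

```python
def iou_distance(box_a, box_b):
    """Return 1 - IoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    The corner-coordinate format is an assumption made for this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero when the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    # Degenerate boxes with zero union are treated as maximally distant.
    return 1.0 - inter / union if union > 0 else 1.0


# A tall, narrow box and a wide, flat box that share the same center (5, 50):
# the center-to-center distance is 0, yet the 1-IoU distance is close to 1.
tall_box = (0, 0, 10, 100)
wide_box = (-45, 45, 55, 55)
distance = iou_distance(tall_box, wide_box)
```

The two example boxes are hypothetical; they only mimic the situation where boxes with very different shapes have nearly coincident centers.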
Figure 4 shows the comparison of the Euclidean distance with the 1-IoU distance.
The comparison in Figure 4 considers a real box and a prediction box with a large height-to-width ratio. The red rectangle A represents the real box, and the green rectangle B represents the prediction box.
In Figure 4(a), the Euclidean distance between the real box A and the prediction box B is calculated as the distance between the centers of the two boxes. Although the positions of boxes A and B are very different, their centers are very close, so the Euclidean distance is small. In Figure 4(b), the 1-IoU distance between the real box A and the prediction box B is calculated from the ratio of the intersection to the union of A and B. According to Eq (1), the overlapping area of A and B is small, so the IoU is small and the 1-IoU distance is large. Therefore, if the widths and heights of the detected targets differ significantly, the 1-IoU distance is the better measure.
We use the 1-IoU distance in the K-means algorithm to reduce the error of anchor matching in the YOLOv5s model.
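The clustering step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that each labeled box is reduced to a `(width, height)` pair, that boxes are compared as if aligned at a common corner (a common convention for anchor clustering), and that cluster centers are updated with the mean of their members. The function names are hypothetical.

```python
import numpy as np


def wh_iou(boxes, centers):
    """IoU between (w, h) pairs, with all boxes aligned at a common corner.

    boxes: array of shape (N, 2); centers: array of shape (K, 2).
    Returns an (N, K) array of IoU values.
    """
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0])
             * np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centers[:, 0] * centers[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors using the 1-IoU distance."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every box to the center with the smallest 1-IoU distance.
        assign = np.argmin(1.0 - wh_iou(boxes, centers), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Sort anchors by area so that small anchors come first.
    return centers[np.argsort(centers.prod(axis=1))]
```

For example, `kmeans_anchors(wh_pairs, k=9)` would return nine anchors that could then be split into three groups for the three detection scales; `wh_pairs` is a hypothetical array of labeled box sizes.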
Attention modules have been shown to effectively enhance the representation ability of convolutional neural networks [28]. In the dataset of key components of transmission lines, many images have complex backgrounds and dense targets. To strengthen the feature expression of the detected targets against complex backgrounds, their saliency must be improved. Therefore, we introduce an attention mechanism to enhance features, mainly capturing the distinctive appearances of key components of transmission lines.
Attention mechanisms can be divided into two types: channel attention and spatial attention. Channel attention mainly explores the feature mapping relationship between different feature channels. Spatial attention uses multi-channel features at different spatial locations to model pairwise relationships between locations, thus capturing spatial context. The Convolutional Block Attention Module (CBAM) is an attention module that combines spatial and channel attention [35].
The CBAM automatically learns the importance of each feature channel. In addition, the CBAM learns the importance of each spatial location in a similar way, and uses the learned importance to enhance useful features and suppress features that are not important to the current task. Our CBAM architecture for grid transmission line detection is shown in Figure 5.
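To make the two attention steps concrete, the following is a forward-pass sketch of a CBAM-style block in NumPy. It follows the commonly described design of CBAM (channel attention from average- and max-pooled descriptors passed through a shared two-layer MLP, followed by spatial attention from a convolution over channel-wise average and maximum maps). The weights here are random placeholders, the function names are hypothetical, and the exact configuration used in the paper (Figure 5) may differ.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C // r, C); w2: (C, C // r). Returns (C, 1, 1) weights."""
    avg = x.mean(axis=(1, 2))  # global average pooling -> (C,)
    mx = x.max(axis=(1, 2))    # global max pooling -> (C,)

    def mlp(v):
        # Shared two-layer MLP with a ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k) with odd k. Returns (1, H, W) weights."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    # Plain "same" cross-correlation, written as a sum over kernel offsets.
    for c in range(2):
        for i in range(k):
            for j in range(k):
                out += kernel[c, i, j] * padded[c, i:i + h, j:j + w]
    return sigmoid(out)[None]


def cbam(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, kernel)


# Example with random placeholder weights (C = 8, reduction ratio r = 4, 7 x 7 kernel).
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16, 16))
refined = cbam(features,
               rng.normal(size=(2, 8)),
               rng.normal(size=(8, 2)),
               rng.normal(size=(2, 7, 7)))
```

Because both attention maps lie in (0, 1), the block rescales the features without changing their shape, which is why it can be inserted into an existing backbone.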
Our network structure after adding the CBAM is shown in Figure 6. After adding the CBAM, our model can focus more on the detected target to reduce the classification error.
The class imbalance problem occurs when some classes have many more instances than others. Class imbalance usually reduces the effectiveness of classification. In the dataset of transmission line key components, there are far fewer vehicle samples than samples of the other categories.
Focal loss is a loss function that can reduce the impact of class imbalance [29]. It was originally proposed to address the loss of model performance caused by class imbalance. It weights positive and negative samples through a weight factor $a_t$, and weights the loss of each sample according to how difficult the sample is to discriminate through a modulating factor $(1-p_t)^{\gamma}$. That is, samples that are easy to discriminate receive a smaller weight, and samples that are difficult to discriminate receive a larger weight.
The focal loss is calculated as follows:
$$FL(p_t) = -a_t (1 - p_t)^{\gamma} \log(p_t) \tag{2}$$
where
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \tag{3}$$
where $p_t$ measures how close the prediction is to the ground truth (the class $y$). A large $p_t$ means that the prediction is close to the class $y$, that is, the sample is classified accurately. $\gamma$ is a tunable parameter with $\gamma > 0$, and $a_t$ is the weight that balances the contributions of positive and negative samples to the total loss.
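A small numerical sketch of Eqs (2) and (3) is given below. The text does not state the values of the weight factor or of γ, so the defaults used here (0.25 and 2.0, values commonly reported for focal loss) are assumptions, as is the convention of using `a` for positive samples and `1 - a` for negative samples.

```python
import numpy as np


def focal_loss(p, y, a=0.25, gamma=2.0, eps=1e-12):
    """Per-sample binary focal loss following Eqs (2) and (3).

    p: predicted probability of the positive class; y: label (1 or 0).
    a and gamma are assumed defaults, not values taken from the paper.
    """
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)   # Eq (3)
    a_t = np.where(y == 1, a, 1.0 - a)   # assumed class weighting convention
    # Clip p_t before the logarithm to avoid log(0).
    return -a_t * (1.0 - p_t) ** gamma * np.log(np.clip(p_t, eps, 1.0))


# An easy positive (p = 0.9) contributes far less than a hard positive (p = 0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With `gamma=0` the modulating factor equals 1 and the expression reduces to a weighted cross-entropy, which is a convenient sanity check.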
We present the experimental results and analysis for each aspect of our model improvement: the distance measurement in K-means clustering, the attention mechanism, and the focal loss function. After that, we report ablation experiments and comparative experiments with other methods. Finally, we report detection experiments on another public insulator image dataset.
Due to the insufficient sample images, we used a data enhancement tool library (ImgAug) to augment our dataset. Data augmentation can expand training sets, improve the model's generalization ability, and effectively improve the robustness of the model.
The ImgAug library provides many image processing functions, which can efficiently realize the rotation, flipping, affine, brightness enhancement, contrast enhancement, color enhancement and other operations of the original image. The data augmentation effect of transmission line key components images using ImgAug is shown in Figure 7.
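For object detection, geometric augmentations must also transform the bounding box labels. The following library-free sketch illustrates this for a horizontal flip and a brightness change. It does not use the ImgAug API; the function names and the `(x1, y1, x2, y2)` pixel box format are assumptions made for this example.

```python
import numpy as np


def hflip_with_boxes(image, boxes):
    """Flip an image horizontally and update its boxes.

    image: (H, W, 3) array; boxes: (N, 4) array of (x1, y1, x2, y2) in pixels.
    """
    w = image.shape[1]
    flipped = image[:, ::-1].copy()
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    # The left edge of the flipped box comes from the old right edge, and vice versa.
    out[:, 0] = w - boxes[:, 2]
    out[:, 2] = w - boxes[:, 0]
    return flipped, out


def adjust_brightness(image, factor):
    """Scale pixel intensities by a factor and clip to the valid uint8 range."""
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
```

Photometric changes such as brightness or contrast leave the boxes untouched, whereas flips, rotations and affine transforms require the labels to be transformed together with the image.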
There were 4290 images in our original dataset, and the number of images increased to 11,335 after data augmentation. We labeled the targets in these images; a single image may contain multiple targets. After counting the labeled targets, the number of targets in each of the five classes is shown in Figure 8.
We randomly divided the training set, validation set and test set in the ratio of 8:1:1, as shown in Table 1.
Training set | Validation set | Test set | Total |
9067 | 1134 | 1134 | 11,335 |
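The counts in Table 1 can be reproduced with a simple random split. The sketch below is illustrative only: the rounding rule (validation and test sets each take one tenth of the data, rounded to the nearest integer, and the training set takes the remainder) and the fixed seed are assumptions chosen to match the reported sizes.

```python
import random


def split_dataset(items, seed=0):
    """Shuffle and split items into training, validation and test sets (about 8:1:1)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_val = (n + 5) // 10          # one tenth, rounded to the nearest integer
    n_test = n_val
    n_train = n - n_val - n_test   # the remainder goes to the training set
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


train_set, val_set, test_set = split_dataset(range(11335))
print(len(train_set), len(val_set), len(test_set))  # 9067 1134 1134
```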
We used Microsoft Windows 10 as the operating system, one NVIDIA GeForce RTX 2080 Ti as the GPU and PyTorch 1.10.0 as the deep learning framework.
The learning rate momentum was set to 0.937, the batch size was 16, the initial learning rate was 0.01, the weight decay was 0.0005, and the number of training epochs was limited to 300 to prevent overfitting.
The evaluation indexes used in the experiments were precision, recall and mean average precision (mAP). The precision (P) and recall (R) are defined as follows:
$$P = \frac{TP}{TP + FP} \tag{4}$$
$$R = \frac{TP}{TP + FN} \tag{5}$$
where TP is the number of positive samples correctly classified as positive, FP is the number of negative samples incorrectly classified as positive, and FN is the number of positive samples incorrectly classified as negative.
After obtaining the P and R of each category, a precision-recall (P-R) curve can be plotted. AP is the area enclosed by the P-R curve and the coordinate axes, and mAP is the average of the AP values of all categories. The AP and mAP are calculated as follows:
$$AP = \int_0^1 P(R)\,dR \tag{6}$$
$$mAP = \frac{1}{N}\sum_{k=1}^{N} AP(k) \tag{7}$$
where N represents the total number of categories, and AP(k) represents the AP of the category k.
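The metrics in Eqs (4)–(7) can be sketched in a few lines. This is a simplified illustration: the AP integral is approximated with the trapezoidal rule over sampled points of the P-R curve, whereas detection toolkits often apply an interpolated precision envelope first. The function names are hypothetical.

```python
import numpy as np


def precision_recall(tp, fp, fn):
    """Eqs (4) and (5): precision and recall from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(recall, precision):
    """Eq (6): area under the P-R curve by the trapezoidal rule.

    recall must be sorted in ascending order and paired with precision.
    """
    r = np.asarray(recall, dtype=float)
    p = np.asarray(precision, dtype=float)
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0))


def mean_average_precision(ap_per_class):
    """Eq (7): mean of the per-class AP values."""
    return float(np.mean(ap_per_class))
```

For instance, 90 true positives with 10 false positives and 30 false negatives give a precision of 0.9 and a recall of 0.75.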
The speed index of the model is FPS (frames per second), and its reciprocal is the required time to process each image.
In Section 3.2, we modified the distance measurement in K-means clustering by replacing the Euclidean distance with the 1-IoU distance. This change makes the generated anchor boxes better adapted to the image dataset of transmission line key components. There are three sets of preset anchors in YOLOv5s, each containing three anchors of different sizes and shapes. The three sets of anchors are used to detect small objects in an 80 × 80 feature map, medium-sized objects in a 40 × 40 feature map, and large objects in a 20 × 20 feature map. Table 2 shows the anchor box sizes obtained by the two distance measurement methods.
Method | Small object | Medium object | Large object |
Euclidean distance | [[6,6], [9,9], [7,17]] | [[13,28], [18,52], [32,53]] | [[45,62], [158,160], [182,199]] |
1-IoU distance | [[4,4], [7,7], [6,14]] | [[10,10], [15,8], [11,24]] | [[25,15], [17,48], [41,103]] |
As seen from Table 2, after using the 1-IoU distance in the K-means clustering, the height-width ratio of the anchor boxes for detected large objects becomes significantly larger. It can be more suitable for detecting transmission towers and poles in the dataset.
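The height-to-width ratios of the large-object anchors in Table 2 can be checked directly. The sketch below assumes that each anchor is listed as `[width, height]`, which the table does not state explicitly.

```python
# Large-object anchors copied from Table 2 (assumed to be [width, height]).
large_anchors = {
    "Euclidean distance": [[45, 62], [158, 160], [182, 199]],
    "1-IoU distance": [[25, 15], [17, 48], [41, 103]],
}


def height_width_ratios(boxes):
    """Return the height-to-width ratio of each anchor, rounded to two decimals."""
    return [round(h / w, 2) for w, h in boxes]


for method, boxes in large_anchors.items():
    print(method, height_width_ratios(boxes))
# Euclidean distance [1.38, 1.01, 1.09]
# 1-IoU distance [0.6, 2.82, 2.51]
```

Under this assumption, two of the three 1-IoU anchors are more than twice as tall as they are wide, while the Euclidean anchors are all close to square.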
In the YOLOv5s model, the comparative experimental results of two distance measurement methods are shown in Table 3.
Method | mAP@0.5/% | Precision/% | Recall/% |
Euclidean distance | 94.7 | 96.5 | 92.5 |
1-IoU distance | 95.3 | 97.1 | 91.5 |
As can be seen from Table 3, the mAP and precision improve after the 1-IoU distance is adopted. However, the recall decreases by 1 percentage point. To improve the model further, we added the CBAM attention mechanism module to strengthen the model's attention to the detected targets when the background is complex.
To solve the low recall of the model, we added the attention mechanism module to the YOLOv5s model to improve the model's attention to essential features. We tried several standard attention modules: SENet (Squeeze-and-Excitation Networks) [32], ECA (Efficient Channel Attention) [33], CA (Coordinate Attention) [34] and CBAM [35].
Table 4 shows the comparison of each attention mechanism module.
Method | mAP@0.5/% | Precision/% | Recall/% |
YOLOv5s | 94.7 | 96.5 | 92.5 |
YOLOv5s + 1-IoU | 95.3 | 97.1 | 91.5 |
YOLOv5s + 1-IoU + SENet | 94.9 | 95.6 | 91.6 |
YOLOv5s + 1-IoU + ECA | 94.9 | 95.1 | 91.6 |
YOLOv5s + 1-IoU + CA | 95.3 | 93.7 | 93 |
YOLOv5s + 1-IoU + CBAM | 95.5 | 97.1 | 91.8 |
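Based on the three evaluation indexes in Table 4, the CBAM offers the best balance: it achieves the highest mAP@0.5 (95.5%) and precision (97.1%) among the four attention modules, and its recall (91.8%) is second only to that of CA. Therefore, we chose to add the CBAM module to our improved model.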
Figure 9 shows the comparison results for adding the CBAM module into the backbone of the YOLOv5s model. The left images in Figure 9 are the original images, and the detected targets are in the red boxes. The center images show the detected target of the YOLOv5s model, and the right images show the detected targets after adding the CBAM module. In the middle and right images of Figure 9, the darker color represents the greater attention.
As seen in Figure 9, the CBAM module can enhance the saliency of the dense small targets and the targets in dark or complex backgrounds.
Although the data augmentation in Section 4.1 reduced the class imbalance, the numbers of screws and vehicles are still small. The recall in Table 4 also needs to be improved. After analyzing the experimental results, we found that the recall of screws and vehicles was too low, most likely because these two categories have few sample images.
Therefore, we used the focal loss function to reduce the impact of class imbalance. Table 5 shows the recall comparison using the focal loss function in each category.
Recall/% | Tower | Screws | Vehicle | Insulator | Pole | All |
YOLOv5s | 93.6 | 94.8 | 78.9 | 100 | 95.2 | 92.5 |
YOLOv5s + focal loss | 95.7 | 96.3 | 81.2 | 96.2 | 94.3 | 92.7 |
YOLOv5s + Data Augmentation | 92.8 | 92.1 | 94.3 | 96.3 | 92.5 | 93.6 |
Our model + Data Augmentation | 92.9 | 94.7 | 96.3 | 96.5 | 91.5 | 94.4 |
As seen in Table 5, after our data augmentation and using the focal loss function, the recall has improved. In particular, the recall of vehicles has increased significantly.
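The "All" column in Table 5 appears to be consistent with the unweighted mean of the five per-class recalls. The text does not state how this column was computed, so the check below is an observation under that assumption, using the values copied from the table.

```python
# Per-class recall (%) from Table 5: Tower, Screws, Vehicle, Insulator, Pole.
per_class_recall = {
    "YOLOv5s": [93.6, 94.8, 78.9, 100.0, 95.2],
    "YOLOv5s + focal loss": [95.7, 96.3, 81.2, 96.2, 94.3],
    "YOLOv5s + Data Augmentation": [92.8, 92.1, 94.3, 96.3, 92.5],
    "Our model + Data Augmentation": [92.9, 94.7, 96.3, 96.5, 91.5],
}
# Reported "All" column (%) from Table 5.
reported_all = {
    "YOLOv5s": 92.5,
    "YOLOv5s + focal loss": 92.7,
    "YOLOv5s + Data Augmentation": 93.6,
    "Our model + Data Augmentation": 94.4,
}


def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


for name, values in per_class_recall.items():
    print(f"{name}: mean = {macro_average(values):.2f}, reported = {reported_all[name]}")
```

A macro-average of this kind gives the vehicle class the same weight as the larger classes, which is why the jump in vehicle recall is clearly visible in the overall figure.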
Figure 10 shows the detected result comparison for vehicles.
In Figure 10(a), a car is missed, and the car is detected in Figure 10(b). In addition, the confidence of the detected cars is also higher in Figure 10(b). Therefore, our model reduces the missed detection of cars and improves the recall.
We conducted more detailed ablation experiments to further verify the effectiveness of our improvements. Table 6 shows the results of all ablation experiments.
CBAM | 1-IoU | Focal loss | Data augmentation | mAP@0.5/% | Precision/% | Recall/% |
94.7 | 96.5 | 92.5 | ||||
√ | 95.6 | 96.7 | 92.6 | |||
√ | 95.3 | 97.1 | 91.5 | |||
√ | 94.7 | 96.2 | 92.7 | |||
√ | 96.1 | 97.0 | 93.6 | |||
√ | √ | 95.5 | 97.1 | 91.8 | ||
√ | √ | 94.7 | 94.1 | 92.5 | ||
√ | √ | 95.1 | 96.4 | 92.6 | ||
√ | √ | √ | 95.9 | 97.9 | 92.9 | |
√ | √ | √ | √ | 98.1 | 97.5 | 94.4 |
As can be seen in Table 6, our improvements mainly raise the mAP, and combining all of them gives the best overall result. Compared with the original YOLOv5s model, the mAP, precision, and recall of the final model increase by 3.4, 1.0, and 1.9 percentage points, respectively.
The comparison between our improved model and the original YOLOv5s model is shown in Figure 11.
It can be seen from Figure 11 that our improved model detects all categories with higher confidence and with no missed detections.
Figure 12 shows the loss change during the training for our improved model and the original YOLOv5s model.
In Figure 12, the gray curve is the loss curve of the original YOLOv5s model, and the dark blue curve is the loss curve of our improved model. The yellow curve is the loss curve of the original YOLOv5s model after data augmentation, and the light blue curve is the loss curve of our improved model after data augmentation. It can be seen that the loss values gradually stabilize as the training progresses. Regardless of the data augmentation, the loss value of our improved model is always smaller than that of the original YOLOv5s model. It indicates that our improved model has less loss and better convergence during training.
The mAP curves of our improved model and the original YOLOv5s model are compared in Figure 13. The mAP@0.5 is the mean of the per-category AP values when the IoU threshold is set to 0.5. The mAP@0.5:0.95 is the mAP averaged over different IoU thresholds (from 0.5 to 0.95, in steps of 0.05).
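The averaging behind mAP@0.5:0.95 can be sketched as follows. The function names are hypothetical, and the mAP values in the example are placeholders rather than results from the paper.

```python
def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """Return the IoU thresholds 0.50, 0.55, ..., 0.95."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(n)]


def map_50_95(map_at_threshold):
    """Average the mAP over all thresholds.

    map_at_threshold: dict mapping each IoU threshold to the mAP at that threshold.
    """
    thresholds = iou_thresholds()
    return sum(map_at_threshold[t] for t in thresholds) / len(thresholds)


# Placeholder values only: the mAP usually falls as the IoU threshold becomes stricter.
example = {t: 1.0 - t for t in iou_thresholds()}
score = map_50_95(example)
```

Because the stricter thresholds require tighter localization, mAP@0.5:0.95 is normally lower than mAP@0.5 for the same model.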
As shown in Figure 13, the mAP@0.5 and mAP@0.5:0.95 of our improved model in this paper are higher than the original YOLOv5s model.
In Figure 13(a), the mAP@0.5 of our improved model eventually stabilizes at around 0.959, while the mAP@0.5 of the original YOLOv5s model eventually stabilizes at around 0.947. Therefore, our improved model raises the mAP@0.5 by 1.2 percentage points. After our data augmentation, the mAP@0.5 of the original YOLOv5s model and of our improved model rise to 96.1% and 98.1%, respectively, which shows that the data augmentation has a significant effect, and our improved model remains ahead of the original YOLOv5s model.
Similarly, in Figure 13(b), the mAP@0.5:0.95 values of the four models also show that our improved model is effective.
We compared our improved model with the Faster R-CNN, SSD, CenterNet, YOLOv3, YOLOv4 and YOLOv5s models. We trained all object detection models using the same datasets with the same division and parameters, and the comparison experiment results are shown in Figure 14.
Figure 14 shows that the mAP and recall of our improved model are higher than those of other models. Although our improved model's detection speed (FPS) is lower than the original YOLOv5s model, it is better than the other five models. Through comprehensive analysis, the experimental results prove that our improved model can balance the detection speed and accuracy.
Reference [30] provided a publicly available insulator image dataset and proposed a novel deep convolutional neural network (CNN) cascading architecture for localizing insulators and detecting their defects. The dataset consists of 840 aerial images of composite insulators collected by UAV, and each image has a resolution of 1152 × 864 pixels. Reference [31] also used this dataset.
In the comparison experiments, we randomly divided the insulator dataset: 50% as the training set, 25% as the validation set, and the remaining 25% as the test set. We compared our improved model with the models in references [30] and [31], and the results are shown in Table 7.
Method | mAP@0.5/% | Recall/% |
CNN cascading architecture [30] | 91.0 | 96.0 |
Attention mechanism + Fast RCNN [31] | 94.3 | 98.42 |
Our model | 99.5 | 100 |
Table 7 shows that our improved model also performs better than the models of references [30] and [31] on the composite insulator dataset.
In this paper, we proposed an improved YOLOv5s model to meet the requirements of detecting key components of power transmission lines. Our model can automatically detect the key components of the transmission line, which is the preliminary work of the automatic transmission line inspection. The research work can reduce the workload and cost of transmission line inspection.
We modified the distance measurement in the K-means clustering, added the CBAM attention mechanism, and used the focal loss function. Our model improved the detection accuracy of key components of power transmission lines. The experimental results show that our improved model achieves 98.1% mAP@0.5, 97.5% precision, and 94.4% recall. However, the speed of our improved model is slightly slower than the YOLOv5s model.
Next, we will use fine-grained identification to detect the key components' defects and anomalies. We will refer to the references [36,37,38] for defect detection and abnormal detection of key components of power transmission lines to improve intelligent detection of power transmission lines.
This research was funded by the Key R & D Projects of Yunnan Province (Grant No. 202202AD080004), the Natural Science Foundation of China (Grant No. 62061049, 12263008), the Yunnan Provincial Department of Science and Technology-Yunnan University Joint Special Project for Double-Class Construction (Grant No. 202201BF070001-005), and the Application and Foundation Project of the Yunnan Province (Grant No. 202001BB050032).
The authors declare there is no conflict of interest.
[1] | Adcock R, Gradojevic N (2019) Non-fundamental, non-parametric Bitcoin forecasting. Phys A 531: 121727. |
[2] |
Armano G, Marchesi M, Murru A (2005) A hybrid genetic-neural architecture for stock indexes forecasting. Inf Sci 170: 3-33. doi: 10.1016/j.ins.2003.03.023
![]() |
[3] |
Atsalakis GS, Atsalaki IG, Pasiouras F, et al. (2019) Bitcoin price forecasting with neuro-fuzzy techniques. Eur J Oper Res 276: 770-780. doi: 10.1016/j.ejor.2019.01.040
![]() |
[4] |
Atsalakis GS, Valavanis KP (2009) Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Syst Appl 36: 10696-10707. doi: 10.1016/j.eswa.2009.02.043
![]() |
[5] |
Balcilar M, Bouri E, Gupta R, et al. (2017) Can volume predict Bitcoin returns and volatility? A quantiles-based approach. Econ Model 64: 74-81. doi: 10.1016/j.econmod.2017.03.019
![]() |
[6] | Breiman L (1984) Classification and regression trees (Online pub), New York, NY: Routledge. |
[7] |
Butner JE, Munion AK, Baucom BRW, et al. (2019) Ghost hunting in the nonlinear dynamic machine. PloS One 14: 1-21. doi: 10.1371/journal.pone.0226572
![]() |
[8] |
Chen Z, Li C, Sun W (2020) Bitcoin price prediction using machine learning: An approach to sample dimension engineering. J Comput Appl Math 365: 1-13. doi: 10.1007/s12190-020-01341-8
![]() |
[9] |
Corbet S, Eraslan V, Lucey B, et al. (2019) The effectiveness of technical trading rules in cryptocurrency markets. Financ Res Lett 31: 32-37. doi: 10.1016/j.frl.2019.04.027
![]() |
[10] | Felizardo L, Oliveira R, Del-Moral-Hernandez E, et al. (2019) Comparative study of Bitcoin price prediction using WaveNets, Recurrent Neural Networks and other Machine Learning Methods, In 2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC), 1-6. |
[11] |
Gyamerah SA (2019) Modelling the volatility of Bitcoin returns using GARCH models. Quant Finan Econ 3: 739-753. doi: 10.3934/QFE.2019.4.739
![]() |
[12] |
Huang JZ, Huang W, Ni J (2019) Predicting bitcoin returns using high-dimensional technical indicators. J Financ Data Sci 5: 140-155. doi: 10.1016/j.jfds.2018.10.001
![]() |
[13] |
Jang H, Lee J (2018) An Empirical Study on Modeling and Prediction of Bitcoin Prices With Bayesian Neural Networks Based on Blockchain Information. IEEE Access 6: 5427-5437. doi: 10.1109/ACCESS.2017.2779181
![]() |
[14] | Ji S, Kim J, Im H (2019) A comparative study of bitcoin price prediction using deep learning. Mathematics 7: 1-20. |
[15] | Kara Y, Acar Boyacioglu M, Baykan ÖK (2011) Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Syst Appl 38: 5311-5319. doi: 10.1016/j.eswa.2010.10.027 |
[16] | Kim KJ (2003) Financial time series forecasting using support vector machines. Neurocomputing 55: 307-319. doi: 10.1016/S0925-2312(03)00372-2 |
[17] | Kwon DH, Kim JB, Heo JS, et al. (2019) Time series classification of cryptocurrency price trend based on a recurrent LSTM neural network. J Inf Process Syst 15: 694-706. |
[18] | Lahmiri S, Bekiros S (2020) Intelligent forecasting with machine learning trading systems in chaotic intraday Bitcoin market. Chaos Solitons Fractals 133: 109641. |
[19] | Makridakis S, Spiliotis E, Assimakopoulos V (2018) Statistical and Machine Learning forecasting methods: Concerns and ways forward. PLoS One 13: 1-26. doi: 10.1371/journal.pone.0194889 |
[20] | Mallqui DCA, Fernandes RAS (2019) Predicting the direction, maximum, minimum and closing prices of daily Bitcoin exchange rate using machine learning techniques. Appl Soft Comput 75: 596-606. doi: 10.1016/j.asoc.2018.11.038 |
[21] | McNally S, Roche J, Caton S (2018) Predicting the Price of Bitcoin Using Machine Learning, Proceedings of the 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018, 339-343. |
[22] | Miller N, Yang Y, Sun B, et al. (2019) Identification of technical analysis patterns with smoothing splines for bitcoin prices. J Appl Stat 46: 2289-2297. doi: 10.1080/02664763.2019.1580251 |
[23] | Nguyen DT, Le HV (2019) Predicting the Price of Bitcoin Using Hybrid ARIMA and Machine Learning, In: T. K. Dang, J. Küng, M. Takizawa, & S. H. Bui (Eds.), Future Data and Security Engineering, Cham: Springer International Publishing, 696-704. |
[24] | Panagiotidis T, Stengos T, Vravosinos O (2018) On the determinants of bitcoin returns: A LASSO approach. Financ Res Lett 27: 235-240. doi: 10.1016/j.frl.2018.03.016 |
[25] | Patel J, Shah S, Thakkar P, et al. (2015) Predicting stock and stock price index movement using Trend Deterministic Data Preparation and machine learning techniques. Expert Syst Appl 42: 259-268. doi: 10.1016/j.eswa.2014.07.040 |
[26] | Quinlan JR (1986) Induction of decision trees. Mach Learn 1: 81-106. |
[27] | Quinlan JR (1988) C4.5: programs for machine learning, London, England: Morgan Kaufmann Publishers, Inc. |
[28] | Rebane J, Karlsson I, Denic S, et al. (2018) Seq2Seq RNNs and ARIMA models for Cryptocurrency Prediction: A Comparative Study. SIGKDD Fintech 18: 2-6. |
[29] | Shu M, Zhu W (2020) Real-time prediction of Bitcoin bubble crashes. Phys A 548: 124477. |
[30] | Vapnik V (1995) The nature of statistical learning theory, New York, NY: Springer. |
[31] | Yao W, Xu K, Li Q (2019) Exploring the Influence of News Articles on Bitcoin Price with Machine Learning, In: 2019 IEEE Symposium on Computers and Communications (ISCC), 1-6. |
1. | Elisavet Bellou, Ioana Pisica, Konstantinos Banitsas, Aerial Inspection of High-Voltage Power Lines Using YOLOv8 Real-Time Object Detector, 2024, 17, 1996-1073, 2535, 10.3390/en17112535 |
2. | İpek İNAL ATİK, Isolator Detection in Power Transmission Lines using Lightweight Dept-wise Convolution with BottleneckCSP YOLOv5, 2023, 9, 2149-9144, 150, 10.22399/ijcesen.1307309 |
3. | Yulong Zhang, Shaowei Cao, Lingxia Mu, Xianghong Xue, Jing Xin, Youmin Zhang, 2024, A Fault Detection Method for Power Transmission Lines Using Aerial Images, 979-8-3503-5788-2, 1247, 10.1109/ICUAS60882.2024.10556891 |
4. | Zejian Feng, Martina Karaskova, Marwa Mahmoud, 2023, Open-Sheep-Face: A Comprehensive Application for Sheep Face Analysis and Pain Estimation, 979-8-3503-2745-8, 1, 10.1109/ACIIW59127.2023.10388128 |
5. | Kangjian Sun, Ju Huo, Qi Liu, Shunyuan Yang, An infrared small target detection model via Gather-Excite attention and normalized Wasserstein distance, 2023, 20, 1551-0018, 19040, 10.3934/mbe.2023842 |
6. | Zhenyue Wang, Guowu Yuan, Hao Zhou, Yi Ma, Yutang Ma, Foreign-Object Detection in High-Voltage Transmission Line Based on Improved YOLOv8m, 2023, 13, 2076-3417, 12775, 10.3390/app132312775 |
7. | Zheng Zhang, Xiang Lu, Shouqi Cao, An efficient detection model based on improved YOLOv5s for abnormal surface features of fish, 2024, 21, 1551-0018, 1765, 10.3934/mbe.2024076 |
8. | Kangjian Sun, Ju Huo, Heming Jia, Lin Yue, Reinforcement learning guided Spearman dynamic opposite Gradient-based optimizer for numerical optimization and anchor clustering, 2023, 11, 2288-5048, 12, 10.1093/jcde/qwad109 |
9. | Fuhong Meng, Guowu Yuan, Hao Zhou, Hao Wu, Yi Ma, Improved MViTv2-T model for insulator defect detection, 2024, 9, 2578-1588, 1, 10.3934/electreng.2025001 |
10. | Andrew Ponomarev, Anton Agafonov, Alexander Smirnov, Nikolay Shilov, Andrey Sukhanov, Andrey Shulzhenko, 2024, Chapter 40, 978-3-031-77687-8, 420, 10.1007/978-3-031-77688-5_40 |
11. | Wenrui Wang, Fanglin Lu, Bo Wu, Jianfeng Yu, Xunguang Yan, Hongyong Fan, GFRF R-CNN: Object Detection Algorithm for Transmission Lines, 2025, 82, 1546-2226, 1439, 10.32604/cmc.2024.057797 |
Training set | Validation set | Test set | Total |
9,067 | 1,134 | 1,134 | 11,335 |

Method | Small object | Medium object | Large object |
Euclidean distance | [[6,6], [9,9], [7,17]] | [[13,28], [18,52], [32,53]] | [[45,62], [158,160], [182,199]] |
1-IoU distance | [[4,4], [7,7], [6,14]] | [[10,10], [15,8], [11,24]] | [[25,15], [17,48], [41,103]] |

Method | mAP@0.5/% | Precision/% | Recall/% |
Euclidean distance | 94.7 | 96.5 | 92.5 |
1-IoU distance | 95.3 | 97.1 | 91.5 |
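
The anchor tables above compare k-means clustering under Euclidean distance with clustering under a 1-IoU distance. As a rough illustration of what the 1-IoU variant might look like, the sketch below clusters box sizes (width, height) using `1 - IoU` as the distance. The function names `wh_iou` and `kmeans_anchors`, the mean-based centroid update, and the random initialisation are assumptions made for this example; the tables do not specify those details.

```python
import numpy as np


def wh_iou(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between (N, 2) box sizes and (K, 2) anchor sizes.

    Boxes and anchors are treated as sharing one corner, so only
    width and height matter. Returns an (N, K) matrix.
    """
    inter_w = np.minimum(boxes[:, None, 0], anchors[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    inter = inter_w * inter_h
    area_boxes = boxes[:, 0] * boxes[:, 1]
    area_anchors = anchors[:, 0] * anchors[:, 1]
    union = area_boxes[:, None] + area_anchors[None, :] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster box sizes into k anchors using d = 1 - IoU (illustrative only)."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialisation: pick k distinct boxes at random.
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = 1.0 - wh_iou(boxes, anchors)   # (N, K) distances
        assign = dist.argmin(axis=1)          # nearest anchor per box
        updated = anchors.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):                  # keep the old anchor if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, anchors):
            break
        anchors = updated
    # Sort by area so small-object anchors come first.
    return anchors[np.argsort(anchors.prod(axis=1))]
```

Unlike Euclidean distance, `1 - IoU` is scale-relative: a 2-pixel error on a 4×4 box costs far more than the same error on a 180×200 box, which is one plausible reason the 1-IoU row yields smaller, tighter anchors.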
Method | mAP@0.5/% | Precision/% | Recall/% |
YOLOv5s | 94.7 | 96.5 | 92.5 |
YOLOv5s + 1-IoU | 95.3 | 97.1 | 91.5 |
YOLOv5s + 1-IoU + SENet | 94.9 | 95.6 | 91.6 |
YOLOv5s + 1-IoU + ECA | 94.9 | 95.1 | 91.6 |
YOLOv5s + 1-IoU + CA | 95.3 | 93.7 | 93.0 |
YOLOv5s + 1-IoU + CBAM | 95.5 | 97.1 | 91.8 |

Method (Recall/%) | Tower | Screws | Vehicle | Insulator | Pole | All |
YOLOv5s | 93.6 | 94.8 | 78.9 | 100 | 95.2 | 92.5 |
YOLOv5s + focal loss | 95.7 | 96.3 | 81.2 | 96.2 | 94.3 | 92.7 |
YOLOv5s + Data Augmentation | 92.8 | 92.1 | 94.3 | 96.3 | 92.5 | 93.6 |
Our model + Data Augmentation | 92.9 | 94.7 | 96.3 | 96.5 | 91.5 | 94.4 |
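
The per-class recall table above includes a focal-loss variant. For readers unfamiliar with the term, the sketch below shows the standard binary focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), in NumPy. The defaults `gamma=2.0` and `alpha=0.25` are the commonly used values from the original focal loss formulation and are an assumption here; the settings behind the table are not stated.

```python
import numpy as np


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Per-example binary focal loss (illustrative sketch).

    probs   : predicted probability of the positive class
    targets : ground-truth labels, 1 for positive and 0 for negative
    gamma   : focusing parameter; 0 reduces to weighted cross-entropy
    alpha   : weight for the positive class (negatives get 1 - alpha)
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma down-weights examples that are already classified well.
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

With `gamma > 0`, a confidently correct prediction contributes much less loss than a confidently wrong one, so training concentrates on hard examples such as the under-represented classes.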
CBAM | 1-IoU | Focal loss | Data augmentation | mAP@0.5/% | Precision/% | Recall/% |
94.7 | 96.5 | 92.5 | ||||
√ | 95.6 | 96.7 | 92.6 | |||
√ | 95.3 | 97.1 | 91.5 | |||
√ | 94.7 | 96.2 | 92.7 | |||
√ | 96.1 | 97.0 | 93.6 | |||
√ | √ | 95.5 | 97.1 | 91.8 | ||
√ | √ | 94.7 | 94.1 | 92.5 | ||
√ | √ | 95.1 | 96.4 | 92.6 | ||
√ | √ | √ | 95.9 | 97.9 | 92.9 | |
√ | √ | √ | √ | 98.1 | 97.5 | 94.4 |
Method | mAP@0.5/% | Recall/% |
CNN cascading architecture [30] | 91.0 | 96.0 |
Attention mechanism + Fast RCNN [31] | 94.3 | 98.42 |
Our model | 99.5 | 100 |