| Abstract: To address the large volume of defect texts in current relay protection systems and the limitations of traditional mining methods, namely insufficient text feature extraction, inaccurate semantic recognition, and low computational efficiency, a deep mining method for defect texts based on a Linear Transformer-CNN model is proposed. First, the massive defect text data are preprocessed, and the processed words are input into MacBERT to generate comprehensive word embeddings. Next, a linear attention mechanism is introduced into the Transformer to improve overall computational efficiency. Then, a multi-layer CNN module is added to compensate for the Linear Transformer's limited ability to extract defect text features. Finally, the comprehensive word embeddings are fed into the multi-layer CNN and Linear Transformer modules, which extract the local key features and the long-distance semantic features of the defect texts, respectively; the fused features are then classified by a softmax layer. Experimental results show that, compared with traditional text mining methods, the proposed method requires less training and testing time and reaches a classification accuracy of 94.24%, enabling fast and accurate classification of defect texts. |
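
The efficiency gain attributed to linear attention can be illustrated with a minimal sketch. The abstract does not say which linear attention variant is used, so the example assumes the common kernelised form with the feature map elu(x) + 1. The function names (`feature_map`, `linear_attention`, `quadratic_attention`) and the toy dimensions are hypothetical and are not taken from the paper. It is written in plain Python to stay dependency-free. The point is that the key-value summary is computed once and reused for every query, so the cost grows linearly with the number of tokens instead of quadratically.

```python
import math
import random


def feature_map(x):
    # elu(x) + 1: strictly positive, so the normaliser below is never zero.
    # Assumed choice; the abstract does not name a feature map.
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(queries, keys, values):
    """Kernelised attention in O(n * d * d_v) time for n tokens."""
    phi_q = [[feature_map(x) for x in row] for row in queries]
    phi_k = [[feature_map(x) for x in row] for row in keys]
    d, d_v = len(keys[0]), len(values[0])

    # Summaries over all keys, computed once:
    #   kv[i][j] = sum_t phi_k[t][i] * values[t][j],  z[i] = sum_t phi_k[t][i]
    kv = [[0.0] * d_v for _ in range(d)]
    z = [0.0] * d
    for pk, v in zip(phi_k, values):
        for i in range(d):
            z[i] += pk[i]
            for j in range(d_v):
                kv[i][j] += pk[i] * v[j]

    out = []
    for pq in phi_q:
        norm = sum(pq[i] * z[i] for i in range(d))
        out.append([sum(pq[i] * kv[i][j] for i in range(d)) / norm
                    for j in range(d_v)])
    return out


def quadratic_attention(queries, keys, values):
    """Same attention weights computed pairwise (O(n^2)); used only as a check."""
    phi_q = [[feature_map(x) for x in row] for row in queries]
    phi_k = [[feature_map(x) for x in row] for row in keys]
    d_v = len(values[0])
    out = []
    for pq in phi_q:
        scores = [sum(a * b for a, b in zip(pq, pk)) for pk in phi_k]
        total = sum(scores)
        out.append([sum(s * v[j] for s, v in zip(scores, values)) / total
                    for j in range(d_v)])
    return out


# Toy input: 6 tokens, key/query width 4, value width 3 (arbitrary sizes).
random.seed(0)
n, d, d_v = 6, 4, 3
q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
v = [[random.gauss(0, 1) for _ in range(d_v)] for _ in range(n)]

fast = linear_attention(q, k, v)
slow = quadratic_attention(q, k, v)
```

Both functions produce the same output up to floating-point rounding; only the order of the multiplications differs. This sketch covers the attention step alone and leaves out the MacBERT embeddings, the multi-layer CNN branch, and the feature fusion described in the abstract.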