Abstract: Because recording standards have changed over time and manufacturers interpret them differently, homologous recording data often show device-specific differences in channel names and channel index numbers, which makes accurate matching of homologous recording data difficult. To solve this problem, a method for matching homologous recording data based on the Sentence-MacBERT model is proposed. First, the characteristics of the recording file format are analyzed, and a verification information table is constructed from these characteristics. Then, the verification information table is used to automatically verify the recording files. Finally, a Sentence-MacBERT homologous channel matching model is built on the BERT model and used to complete the matching of homologous recording data. Case studies show that the verification information table enables automatic verification of recording files and that alerts are generated for files that fail to parse. The Sentence-MacBERT model performs well in channel name matching, effectively completing the homologous matching of recording data and helping operators analyze faults.
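
The matching step summarized above can be illustrated with a minimal sketch: each channel name is turned into a vector, and every channel from one recording is paired with the most similar channel name from the other by cosine similarity. This is not the proposed model. The `embed` function below is a character-bigram stand-in for the Sentence-MacBERT encoder, and the channel names, the function names, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def embed(name: str) -> Counter:
    """Stand-in encoder (assumption): bag of character bigrams.

    The paper's method would use Sentence-MacBERT sentence embeddings here.
    """
    text = name.lower().replace(" ", "")
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_channels(source, target, threshold=0.5):
    """Map each source channel name to its most similar target name.

    A source name maps to None when no target reaches the threshold
    (the threshold value is a hypothetical choice, not from the paper).
    """
    target_vecs = {t: embed(t) for t in target}
    result = {}
    for s in source:
        vec = embed(s)
        best, score = max(
            ((t, cosine(vec, tv)) for t, tv in target_vecs.items()),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        result[s] = best if score >= threshold else None
    return result


# Hypothetical channel names from two devices recording the same fault.
device_a = ["Line1 Ia", "Line1 Ub", "Bus Voltage Uc", "Frequency"]
device_b = ["Ub Line 1", "Ia Line 1", "Uc Bus Volt"]
matches = match_channels(device_a, device_b)
```

In this sketch, names that differ only in word order or abbreviation are still paired, while a channel with no counterpart is left unmatched. A learned sentence encoder is intended to capture semantic similarity that surface character overlap would miss.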