Small & Special Electrical Machines ›› 2025, Vol. 53 ›› Issue (4): 34-38.

• Design and Analysis •

Wind Turbine Data Cleaning Method Based on Self-Attention-Optimized Density Clustering


  

  1. Windey Energy Technology Group Co., Ltd., Hangzhou 310000, China
  • Received: 2025-01-15 Published: 2025-04-28 Released online: 2025-04-28

Transformer-Optimized DBSCAN for Wind Turbine Data Cleaning

  1. Windey Energy Technology Group Co., Ltd., Hangzhou 310000, China
  • Received: 2025-01-15 Online: 2025-04-28 Published: 2025-04-28

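Abstract: The supervisory control and data acquisition (SCADA) systems of wind turbines are often affected by various
factors, which results in abnormal data. To address this problem, a density clustering model improved by a self-attention
encoder is proposed. The method combines the feature extraction ability of the self-attention encoder with the spatial
characteristics of density clustering, and introduces relative position encoding and an optimized multi-head attention
mechanism to improve the identification of abnormal SCADA data. Experimental results show that the proposed method
outperforms traditional methods in both data cleaning and model accuracy: the abnormal data removal rate reaches 26.58%,
and in fitting the wind speed-power curve the mean absolute error and root mean square error are the lowest and the
coefficient of determination is the highest. When the cleaned SCADA data are applied to turbine fault diagnosis, fault
identification accuracy rises to above 92%, fault warning timeliness improves by 20%, and fault type classification
accuracy improves by 30%. The method not only improves the operating efficiency and reliability of wind turbines, but
also provides more reliable data support for wind farm operation management and decision-making.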

Key words: self-attention encoder, density clustering algorithm, data cleaning, supervisory control and data acquisition system, wind turbine

Abstract: The supervisory control and data acquisition (SCADA) system of a wind turbine is often affected by various
factors, which leads to data anomalies. To address this problem, a method that improves the density-based spatial
clustering of applications with noise (DBSCAN) model with a Transformer autoencoder is proposed. The method combines the
feature extraction ability of the Transformer with the density clustering characteristics of DBSCAN, and introduces
relative position encoding and an optimized multi-head attention mechanism to improve the recognition of abnormal SCADA
data. Experimental results show that the proposed method achieves better data cleaning performance and model accuracy
than traditional methods: its abnormal data removal rate reaches 26.58%, and when the wind speed-power curve is fitted,
its mean absolute error (MAE) and root mean square error (RMSE) are the lowest and its coefficient of determination (R²)
is the highest. When the cleaned SCADA data are used for turbine fault diagnosis, fault identification accuracy rises to
above 92%, fault warning timeliness improves by 20%, and fault type classification accuracy improves by 30%. The method
not only improves the operational efficiency and reliability of wind turbines, but also provides more reliable data
support for the operation management and decision-making of wind farms.

Key words: Transformer autoencoder, density-based spatial clustering of applications with noise (DBSCAN), data
cleaning, supervisory control and data acquisition (SCADA) system, wind turbine unit
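
To make the cleaning and evaluation steps named in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It runs plain DBSCAN directly on normalized (wind speed, power) pairs, whereas the proposed method clusters on features extracted by a Transformer autoencoder with relative position encoding; that encoder is omitted here. The toy cubic power curve, the six injected anomalies, and the parameters `eps=0.08` and `min_pts=5` are all hypothetical choices for illustration.

```python
import math


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: return one label per point, with -1 meaning noise."""
    n = len(points)
    # Brute-force neighbourhoods, O(n^2); each point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # only core points expand the cluster
    return labels


def fit_metrics(actual, predicted):
    """Return (MAE, RMSE, R^2) of predicted values against actual values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean = sum(actual) / n
    ss_tot = sum((a - mean) ** 2 for a in actual)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return mae, rmse, r2


def curve(v):
    """Hypothetical normalised power curve (toy assumption, not from the paper)."""
    return v ** 3


# Toy data: 200 points near the curve plus 6 injected anomalies (all hypothetical).
normal = [(i / 199, curve(i / 199) + 0.01 * math.sin(i)) for i in range(200)]
anomalies = [(0.70, 0.05), (0.80, 0.10), (0.90, 0.20),
             (0.95, 0.40), (0.20, 0.50), (0.40, 0.80)]
raw = normal + anomalies

labels = dbscan(raw, eps=0.08, min_pts=5)
clean = [pt for pt, lab in zip(raw, labels) if lab != -1]
removal_rate = 1 - len(clean) / len(raw)

# Score the reference curve against observed power before and after cleaning.
mae_raw, rmse_raw, r2_raw = fit_metrics([p for _, p in raw], [curve(v) for v, _ in raw])
mae_clean, rmse_clean, r2_clean = fit_metrics([p for _, p in clean], [curve(v) for v, _ in clean])

print(f"removal rate: {removal_rate:.2%}")
print(f"raw:   MAE={mae_raw:.4f} RMSE={rmse_raw:.4f} R2={r2_raw:.4f}")
print(f"clean: MAE={mae_clean:.4f} RMSE={rmse_clean:.4f} R2={r2_clean:.4f}")
```

On real SCADA records, wind speed and power would first need scaling to comparable ranges, since DBSCAN's `eps` is a single Euclidean radius; the toy data here are already normalized to roughly [0, 1]. The removal rate and metric values printed by this sketch describe only the toy data and are not comparable to the figures reported in the abstract.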