Abstract: With the continued construction of smart substations and the development of their condition monitoring systems, the volume of condition monitoring data for electric power equipment is growing rapidly. To address the deficiencies of current electric power data warehouses in storing, querying, and analyzing massive condition monitoring data, a Hive-based data warehouse method for fast query and analysis of multidimensional data is proposed. First, based on an analysis of the condition monitoring system and the production management system, the static information and condition monitoring information of electric power equipment are stored in a Hive data warehouse. Second, the architecture of the data warehouse and the storage structure for massive condition data are designed, adopting the Hadoop Distributed File System (HDFS) for distributed storage and management, MapReduce as the computing model for querying and analyzing massive data, and Hive Query Language (HiveQL) as the control tool of the data warehouse. The processing workflows of the data warehouse are then described. Finally, an experimental platform for a Hive-based data warehouse of electric power equipment condition information is established. Results of multidimensional data queries on 5-node and 10-node clusters show that the method scales well and can meet the need for fast queries on large-scale multidimensional condition data of electric power equipment.
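
As an illustration of the kind of multidimensional query the abstract refers to, the following minimal sketch imitates in plain Python how a grouped aggregate could be evaluated in map, shuffle, and reduce steps. It is not the authors' implementation: the record fields (`equipment_type`, `substation`, `month`, `value`), the sample data, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical schema and sample condition-monitoring records (illustrative only).
FIELDS = ("equipment_type", "substation", "month", "value")
RECORDS = [
    ("transformer", "S1", "2013-01", 60.0),
    ("transformer", "S1", "2013-01", 66.0),
    ("transformer", "S2", "2013-01", 57.0),
    ("breaker", "S1", "2013-02", 40.0),
]


def map_phase(records, dims):
    """Emit (dimension key, measurement) pairs, as a mapper would."""
    for rec in records:
        row = dict(zip(FIELDS, rec))
        yield tuple(row[d] for d in dims), row["value"]


def shuffle(pairs):
    """Group all values that share the same dimension key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Average the measurements in each group, as a reducer would."""
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def multidimensional_avg(records, dims):
    """Roughly analogous to a grouped aggregate such as
    SELECT <dims>, AVG(value) FROM <table> GROUP BY <dims>
    (table and column names here are assumptions)."""
    return reduce_phase(shuffle(map_phase(records, dims)))


if __name__ == "__main__":
    for key, avg in multidimensional_avg(RECORDS, ("equipment_type", "month")).items():
        print(key, avg)
```

Changing the `dims` tuple (for example to `("substation",)`) regroups the same records along a different dimension, which is the sense in which such a query is multidimensional. In a real deployment the grouping and averaging would be distributed across cluster nodes rather than run in a single process.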