Complex building’s energy system operation patterns analysis using bag of words representation with hierarchical clustering
 Usman Habib^{1, 3},
 Khizar Hayat^{2} and
 Gerhard Zucker^{1}
https://doi.org/10.1186/s4029401600200
© The Author(s) 2016
Received: 1 February 2016
Accepted: 13 May 2016
Published: 13 June 2016
Abstract
Purpose
Due to the large quantity of data that are recorded in energy efficient buildings, understanding the behavior of various underlying operations has become a complex and challenging task. This paper proposes a method to support analysis of energy systems and validates it using operational data from a cold water chiller. The method automatically detects various operation patterns in the energy system.
Methods
K-means clustering is proposed to automatically identify the On (operational) cycles of a system operating with a duty cycle. The data from these cycles are subsequently transformed into symbolic representations using the symbolic aggregate approximation (SAX) method. The symbols are then converted to a bag of words representation (BoWR) for hierarchical clustering. The gap statistic is used to find the best number of clusters in the data. Finally, the operation patterns of the energy system are grouped together in each cluster. An adsorption chiller, operating under real-life conditions, supplies the reference data for validation.
Results
The proposed method has been compared with the dynamic time warping (DTW) method using cophenetic coefficients, and BoWR produced better results than DTW. The BoWR results are investigated further, and gap statistics are used to find the optimal number of clusters. Finally, interesting patterns in each cluster are discussed in detail.
Conclusion
The main goal of this research work is to provide analysis algorithms that automatically find the various patterns in the energy system of a building using as little configuration or field knowledge as possible. A bag of words representation method with hierarchical clustering has been proposed to assess the performance of a building energy system.
Keywords
Building energy performance; Fault detection and diagnosis (FDD); Clustering; Symbolic aggregate approximation (SAX); Bag of words representation (BoWR); Hierarchical clustering; Coefficient of performance (CoP); Dynamic time warping (DTW); Heating ventilation and air conditioning (HVAC); Gap statistics analysis
Background
This paper is an extension of work originally presented in the proceedings of the Frontiers of Information Technology (FIT’15) conference 2015 (Habib and Zucker 2015). The energy systems of a typical contemporary building are usually complex and may contain several subsystems deployed independently of each other. To analyze the various energy performance aspects of a given building, a large amount of raw data is recorded during monitoring (Khan et al. 2011). The recorded data is studied at later stages to find interesting features, using a variety of visualization tools (Mourad and Bertrand-Krajewski 2002). The massive amount of recorded data makes any detailed performance analysis a formidable task. Moreover, there is a high chance of overlooking important patterns in the data which, if noticed, may help identify faults that compromise energy efficiency.
Patterns are regular, usually repetitive, sequences in given data and may owe their existence to a specific event. A pattern is thus dependent on the characteristics of a system and may represent the underlying processes and structure of the system. Methods that automatically identify interesting patterns in building data help to gain useful insights into the various parameters of energy usage as well as the sources of faults in different components. In this context, data mining techniques like clustering are feasible tools to address these issues. Automatically finding the various patterns in the data can make the subsequent analysis easier, more feasible and less laborious (Miller et al. 2015; Iglesias and Kastner 2013; Narayanaswamy et al. 2014; Lin and Li 2009).
We aim to exploit machine learning for finding various patterns in energy-related building data. The idea is to achieve this with as little configuration and field knowledge as possible. More specifically, to automatically find different patterns in the operation of an adsorption chiller, we propose a bag of words representation (BoWR) with subsequent hierarchical clustering. The suggested method has been applied to the operation data of a water chiller and compared with another approach, dynamic time warping (DTW), using the cophenetic correlation. DTW uses dynamic programming to find the best alignment between two time series (Keogh and Ratanamahatana 2004). The cophenetic correlation indicates how strongly the cluster tree correlates with the distances between objects in the distance vector (Lin and Li 2009). The On/Off state information required by the suggested technique is detected using the k-means clustering algorithm. Because the sensors are placed outside the chiller, their readings reflect the behavior of the chiller during its operational cycle. Thus, the On (operational) states are of greater importance for assessing the performance of chillers and for fault detection and diagnosis (FDD). Moreover, excluding the Off cycles from the pattern search reduces the amount of data. The On (operational) cycles are discretized using the symbolic aggregate approximation (SAX) method. These discretized values are called symbols or words. After the On cycles are transformed to words, a normalized histogram is created for each On cycle; this is called the bag of words representation (BoWR). The BoWR is normalized because the On (operational) cycles vary in their duration.
The hierarchical clustering uses the normalized BoWR of the On cycles to find the various operational patterns of the chiller. The clusters created by hierarchical clustering are also explained in detail.
The rest of the paper is arranged as follows. The next section discusses the state of the art methods available in the literature. In the subsequent section, the design of the demonstrated system is elaborated. This is followed by a section describing the methodology of the proposed solution for finding the patterns in the data. The penultimate section explains the different experiments and results, followed by a "Conclusion" section.
State of the art
This section discusses state of the art methods, from literature, proposed for finding operation patterns in different energy systems, in the context of buildings. The energy systems can be modeled using simulation tools.
Complex systems (CA)

people creating social systems,

our nervous system, with brain spinal cord and neurons being the subsystems, and

a weather forecast system with factors like wind flow, pressure and temperature contributing in predictions.

Cities can be considered a system, with different aspects such as social physics, urban economics, transportation theory, regional science, and urban geography acting as subsystems (agents) in their design (Batty 2007).
Complex adaptive systems (CAS)
A complex adaptive system (CAS) can be defined as an open system “with large variability and diversity of elements or agents, with dynamic interactions among them that create nonlinear feedback systems” (Faucher 2010). Such systems are usually linked to learning activities, in order to provide various features of CAS, like self-organization and unpredictability. They are also described as “special cases of complex systems, which can be called as ’complex macroscopic collection’ of relatively similar microstructures that are partially connected. These macrostructures are formed to adapt the changes in the environment, and increase its survivability” (Kayman 2014).
 1. Adaptation: This characteristic of CAS concerns the adaptability of the system to changes in the environment.
 2. Self-organizing: This characteristic depends on the structure of the system as well as its internal processes; the underlying question is how the larger dynamic system organizes itself in critical situations.
 3. Emergence: This characteristic defines the qualitative change in the behavior of the system when its observation scale changes. It is one of the common characteristics of CAS, where the behavior of the system is more complex than the sum of the behaviors of its components. The emergent property is lost when the system is decomposed into its component parts or when a component is eliminated.
Buildings as CAS
Complex systems scale from large systems like ecosystems (Levin 1998; Grimm et al. 2005) or social-ecological systems (Olsson et al. 2004) to smaller systems such as secure authentication systems (Habib et al. 2011) or buildings (Oosterhuis 2012) and their energy systems (Azar and Menassa 2010, 2011; Jensen et al. 2016). Its limited extent notwithstanding, a building’s energy system is complex to analyze because it consists of several subsystems. To enable a detailed analysis of the energy systems, buildings are monitored using sensors. Nowadays, it is feasible to maintain a record of the historic operation data of a building. While other domains have considerably larger amounts of data, operation data from buildings is particularly challenging, since there is commonly no appropriate underlying data model that can be generally applied to it; the data is very specific to one building or component. Data analytics methods therefore have to provide a high degree of unsupervised automation in order to treat different types of data. Today, the main approach to data analysis is a simple visualization of the process parameters using time graphs. Such visualizations may further require manual follow-up performance analysis. Analysis methods like these can be time intensive, and there is always a chance of missing areas of interest that may eventually be of greater importance (Mourad and Bertrand-Krajewski 2002).
To make buildings energy efficient, prototype models of them are simulated for energy performance. For better design and to handle the dynamic nature of a building’s characteristics, each component of the building can be modeled as an active part; the different components of the building then constitute a complex network (Oosterhuis 2012). Many energy modeling methods are commonly used to predict a building’s performance during the design phase. The actual energy consumption usually deviates from the value predicted during the modeling phase (Azar and Menassa 2010, 2011). Some of the reasons for this deviation are dynamic parameters like occupants’ behavior, climate, and building properties (Azar and Menassa 2011). Agent-based modeling can be used to handle such dynamic parameters. For example, the dynamic nature of occupants’ behavior can be correlated with its impact on energy consumption in commercial buildings (Azar and Menassa 2010, 2011) or used in managing the ventilation system in residential buildings (Jensen et al. 2016). Several bottom-up approaches have been put forward for agent-based modeling. Grimm et al. (2005) have proposed a framework using a pattern-oriented approach for agent-based modeling to handle the problems of complexity and uncertainty.
Other methods for analysis of energy systems in buildings
The reasons for analyzing data from the energy subsystems are manifold and include objectives such as assessment of the overall system performance, comparison with other systems, calculation of operating costs, and prediction of energy consumption and faults. The International Energy Agency (IEA) has launched an implementing agreement (i.e., a technology initiative) called “IEA Solar Heating and Cooling Programme (SHC)”. Within this implementing agreement, IEA SHC Task 38 “Solar Air-Conditioning and Refrigeration” was one of the research topics. IEA SHC Task 38 (subtask A3aB3b: “Monitoring procedure for solar cooling system”) defines a generic monitoring policy that provides information on sensor locations and naming for the evaluation of systems, evaluation of the system performance, and comparison of different energy systems (Napolitano et al. 2011). The literature contains many methods for fault detection and diagnosis (FDD) in building components. One important area is concerned with heating, ventilation and air conditioning (HVAC) (Pietruschka et al. 2015; Isermann 2005; Fan and Qiao 2011; Katipamula and Brambley 2005; Capozzoli et al. 2015; Lee and Eun 2015; Narayanaswamy et al. 2014). Prior knowledge about the system can be useful in finding some simple undetected faults using first principles (i.e., energy balance, mass balance and other physical principles), but more sophisticated techniques are still required to judge various aspects of a building’s energy performance. One known class of techniques that makes use of historic operation data describes the behavior of the system with black box models, which are fitted using the historical data (Katipamula and Brambley 2005, 2005). Faults can also be detected in buildings with machine learning algorithms using the information from the installed electricity consumption meters (Figueiredo et al. 2005; Domínguez et al. 2013). Different parameters are available that can be useful for predicting the electricity consumption of each HVAC component; multivariate analysis can be used to calculate these parameters (Djuric and Novakovic 2012).
To detect various patterns in an energy system using data-driven techniques, the focus is on extracting information from the recorded data using little to no domain expertise. Several machine learning techniques can be used to extract information from the data; for example, clustering can be used for finding similar daily performance patterns in buildings (Miller et al. 2015; Seem 2005), detecting abnormal performance from electricity consumption (Seem 2007), and further enhancing performance optimization algorithms (Kusiak and Song 2008). Moreover, at a larger scale, wavelet transformations and clustering can be used to classify the electrical demand profiles of buildings (Florita et al. 2013).
The data is usually stored as a time series for later analysis. Time series data can be represented with different techniques that help in finding similarities between data showing the same behavior. An example is the symbolic aggregate approximation (SAX), which builds on piecewise aggregate approximation (PAA) and can be used to improve the speed and usability of several analysis techniques (Lin et al. 2007). The similarity between time series can even be calculated by simply using the Euclidean distance, but the problem with this method is that even a slight shift of the data can lead to erroneous results (Lin and Li 2009). Lin and Li (2009) compare time series similarity algorithms (Euclidean, DTW, wavelets) with the bag of patterns method using hierarchical clustering. The authors concluded that the bag of patterns representation (BoPR) approach performed better than the other methods at finding similarities in time series data. The bag of words model is also used for classification in various fields (Anwar et al. 2015).
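The shift sensitivity of the Euclidean distance, and the way DTW compensates for it, can be illustrated with a small sketch. This is not the implementation used in the paper: the function name `dtw_distance`, the absolute-difference cost and the two toy pulses are assumptions chosen for illustration, and no warping-window constraint is applied.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook dynamic-programming DTW with an absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# Hypothetical example: the same pulse, delayed by two samples in y.
x = np.array([0, 0, 1, 2, 1, 0, 0, 0, 0], dtype=float)
y = np.array([0, 0, 0, 0, 1, 2, 1, 0, 0], dtype=float)

euclidean = float(np.linalg.norm(x - y))  # penalizes the shift sample by sample
dtw = dtw_distance(x, y)                  # warping absorbs the shift
print(f"Euclidean: {euclidean:.3f}, DTW: {dtw:.3f}")
```

For these two pulses the Euclidean distance is clearly non-zero although the shapes are identical, whereas the warping path can stretch the leading and trailing zeros so that the DTW cost vanishes.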
One of the well-known data mining methods for finding similar groups is clustering (Armano and Javarone 2013; Shah et al. 2015). Deciding the optimal number of clusters is an important issue in unsupervised methods in general, and in hierarchical clustering in particular. A clustering algorithm gives better results if the intracluster variations are minimal and the intercluster variations are maximal (Tibshirani et al. 2001). Clustering algorithms can also be used for finding various energy states in a building; for example, k-means clustering can be used to detect the state (On/Off) of a machine, as the data toggle between these two states (Habib et al. 2015; Zucker et al. 2015a, b). Another example of using clustering for finding system states can be found in Zucker et al. (2014), where the X-Means clustering algorithm is used to automatically detect the system states (On/Off) in order to examine the operational data of adsorption chillers.
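As a rough illustration of On/Off detection with k-means, the sketch below runs a two-cluster, one-dimensional Lloyd iteration on a single signal and then extracts the contiguous On runs. The function names `detect_on_state` and `on_cycles`, the min/max initialization and the toy power readings are assumptions; the cited works may use different features and initializations.

```python
import numpy as np

def detect_on_state(signal, n_iter=50):
    """Two-cluster 1-D k-means (Lloyd's algorithm); True where a sample is 'On'."""
    signal = np.asarray(signal, dtype=float)
    centers = np.array([signal.min(), signal.max()])  # deterministic start (assumption)
    for _ in range(n_iter):
        # Assign each sample to the nearest of the two centers.
        labels = np.abs(signal[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = np.array([
            signal[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # The cluster with the larger center is interpreted as the On state.
    return labels == centers.argmax()

def on_cycles(is_on):
    """Return (start, end) index pairs, end exclusive, for each contiguous On run."""
    padded = np.concatenate(([False], is_on, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))

# Hypothetical power readings toggling between two levels.
power_kw = [0.1, 0.2, 5.1, 5.3, 4.9, 0.1, 0.0, 5.2, 5.0, 0.2]
is_on = detect_on_state(power_kw)
cycles = on_cycles(is_on)
print(cycles)
```

Each returned pair delimits one On cycle, which is the unit that the later symbolic transformation operates on.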
Cluster evaluation methods
Design of the demonstration system

The low temperature (LT) cycle represents the part of the system that handles the low temperature water produced by the chiller.

The medium temperature (MT) cycle represents the part of the system where the unwanted heat is transferred to the environment using a cooling tower.

The high temperature (HT) cycle represents the part of the system where heat is supplied to the chiller to produce cold water.
Parameters description
Sensors  Description 

E6  High temperature (HT) electricity consumption meter 
E7  Medium temperature (MT) electricity consumption meter 
E8  Low temperature (LT) electricity consumption meter 
\(Q6a\_m3h\)  HT cycle Flow (water) reading 
\(Q12\_m3h\)  MT cycle Flow (water) reading 
\(Q7\_m3h\)  LT cycle Flow (water) reading 
\(T\_HTre\)  HT cycle temperature on return side 
\(T\_HTsu\)  HT cycle temperature on supply side 
\(T\_MTre\)  MT cycle temperature on return side 
\(T\_MTsu\)  MT cycle temperature on supply side 
\(T\_LTre\)  LT cycle temperature on return side 
\(T\_LTsu\)  LT cycle temperature on supply side 
\(Q6a\_KW\)  HT cycle Energy consumption reading 
\(Q12\_KW\)  MT cycle Energy consumption reading 
\(Q7\_KW\)  LT cycle Energy consumption reading 
PR6  Pressure in HT cycle 
PR7  Pressure in LT cycle 
PR8  Pressure in MT cycle 
Selected features for hierarchical clustering
Features  Description 

\(\Delta Temp\_LT\)  Temperature difference of low temperature cycle 
\(\Delta Temp\_HT\)  Temperature difference of high temperature cycle 
\(\Delta Temp\_MT\)  Temperature difference of medium temperature cycle 
\(Q6a\_m3h\)  Flow in high temperature cycle 
\(Q7\_m3h\)  Flow in low temperature cycle 
\(Q12\_m3h\)  Flow in medium temperature cycle 
\(Q6a\_KW\)  Energy reading in high temperature cycle 
\(Q7\_KW\)  Energy reading in low temperature cycle 
\(Q12\_KW\)  Energy reading in medium temperature cycle 
Methods
This section describes the methodology proposed in this paper. The first step in any data analysis is preprocessing and outlier detection. The data used here has already been preprocessed, so it can be used without this step.
Selected algorithms analysis
Methods  Algorithms  Knowledge of the field required  Configuration required 

Duty cycle detection  k-means  No  No 
Duty cycle representation  BoWR  No  No 
Clustering  Hierarchical clustering  No  No 
On state (operational) detection using k-means clustering
Symbolic aggregate approximation (SAX) transformation
Each cycle’s data is first broken down uniformly into M non-overlapping subsequences, as in the example illustrated in Fig. 3, where the partitions are represented by the letters a, b, c and d. This process is called chunking, and the period (x-axis) can have a different time length (P) depending on the application (Miller et al. 2015). In this research, P is 5 min. Each data point is assigned a symbol according to the breakpoints. The number of breakpoints (M) used in this research is 60. This transforms the data of each cycle into symbols. The resulting SAX representation is specific to the length of each cycle. To obtain a symbolic representation that is comparable across cycles of different lengths, the BoWR is used.
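A minimal sketch of a SAX-style transformation is given below. It assumes the standard formulation of Lin et al. (2007): z-normalization, piecewise aggregate approximation, and breakpoints that split the standard normal distribution into equiprobable regions. The function name `sax_transform`, the four-letter alphabet and the toy series are illustrative assumptions and do not reproduce the 5 min period or the 60 breakpoints used in this research.

```python
import numpy as np
from statistics import NormalDist

def sax_transform(series, n_segments, alphabet="abcd"):
    """Z-normalize, reduce with PAA, then map each segment mean to a letter."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    z = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    # PAA: mean of each of n_segments (nearly) equal-length chunks.
    # n_segments should not exceed len(series), otherwise chunks are empty.
    paa = np.array([chunk.mean() for chunk in np.array_split(z, n_segments)])
    # Breakpoints cutting the standard normal into len(alphabet) equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    symbols = np.searchsorted(breakpoints, paa)
    return "".join(alphabet[s] for s in symbols)

# Hypothetical cycle: a low plateau followed by a high plateau.
word = sax_transform([1, 1, 2, 2, 8, 8, 9, 9], n_segments=4)
print(word)
```

Because the word length equals the number of segments, cycles of different durations yield words of different lengths when the period is fixed, which is the motivation for the histogram-based BoWR that follows.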
Bag of words representation (BoWR)
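The Background section describes the BoWR as a normalized histogram of the words of each On cycle, normalized because the cycles vary in duration. The sketch below follows that description; the function name `bag_of_words`, the four-letter vocabulary and the two example cycles are hypothetical.

```python
from collections import Counter

def bag_of_words(words, vocabulary):
    """Normalized histogram of `words` over a fixed `vocabulary` (sums to 1)."""
    if not words:
        raise ValueError("an On cycle must contain at least one word")
    counts = Counter(words)
    total = len(words)
    return [counts[w] / total for w in vocabulary]

vocabulary = ["a", "b", "c", "d"]
short_cycle = list("abba")      # hypothetical 4-symbol On cycle
long_cycle = list("aabbbbaa")   # same behavior, twice as long

short_bowr = bag_of_words(short_cycle, vocabulary)
long_bowr = bag_of_words(long_cycle, vocabulary)
print(short_bowr, long_bowr)
```

Both cycles map to the same fixed-length vector even though their durations differ, so they can be compared directly by a distance-based clustering method.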
Hierarchical clustering
There are different techniques available to decide the best level or number of clusters for hierarchical clustering. One such technique is the gap method (Tibshirani et al. 2001). A clustering algorithm gives better results when the intracluster difference is as small as possible while the intercluster difference is as large as possible.
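The gap method can be sketched as follows, assuming the formulation of Tibshirani et al. (2001) with uniform reference data drawn over the bounding box of the observations. The helper names, the average linkage, the number of reference sets and the synthetic three-blob data are assumptions made for illustration, not the configuration used in this research.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def within_dispersion(data, labels):
    """Sum over clusters of squared distances to the cluster mean (W_k)."""
    return sum(((data[labels == c] - data[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))

def gap_statistic(data, max_k=5, n_refs=20, seed=0):
    """Return (gaps, s) for k = 1..max_k using hierarchical clustering."""
    rng = np.random.default_rng(seed)
    lo, hi = data.min(axis=0), data.max(axis=0)

    def log_wk(points):
        tree = linkage(points, method="average")
        return np.array([
            np.log(within_dispersion(points, fcluster(tree, k, criterion="maxclust")))
            for k in range(1, max_k + 1)
        ])

    observed = log_wk(data)
    # Reference dispersion from uniform samples over the data's bounding box.
    refs = np.array([log_wk(rng.uniform(lo, hi, size=data.shape)) for _ in range(n_refs)])
    gaps = refs.mean(axis=0) - observed
    s = refs.std(axis=0) * np.sqrt(1 + 1 / n_refs)
    return gaps, s

def choose_k(gaps, s):
    """Smallest k with Gap(k) >= Gap(k+1) - s(k+1); else the largest k tried."""
    for k in range(len(gaps) - 1):
        if gaps[k] >= gaps[k + 1] - s[k + 1]:
            return k + 1
    return len(gaps)

# Hypothetical data: three well-separated blobs in two dimensions.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])
gaps, s = gap_statistic(data)
best_k = choose_k(gaps, s)
print(np.round(gaps, 2), best_k)
```

The gap compares the observed within-cluster dispersion with the dispersion expected for structureless data, so a sharp rise in the gap marks a cluster count at which the data is much more compact than the reference.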
Methodology overview

The first step is to find the On (operational) cycles in the data using the k-means algorithm. This step can be applied to any energy system, because the two states exist in any energy-dependent system and the On duty cycle is readily detectable.

The On cycles data are transformed to symbolic data with the SAX transformation method. This step also does not need any field knowledge and is applicable to almost all energy systems.

A BoWR is created for the symbols of each On cycle. This procedure does not need any field knowledge.

The BoWR are clustered by using the hierarchical clustering for finding various operation patterns of the chiller. This process does not need any field knowledge.

The gap statistic is used to find the optimal number of clusters in the data. This procedure does not need any field knowledge.

The cluster patterns can be further investigated using the average performance indicators of each cluster.
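The clustering step in the list above can be sketched with SciPy’s hierarchical clustering routines. The BoWR vectors, the average linkage, the Euclidean metric and the choice of two clusters are hypothetical values for illustration; in this research the number of clusters is selected with the gap statistic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical normalized BoWR vectors (rows sum to 1), one row per On cycle.
bowr = np.array([
    [0.50, 0.50, 0.00, 0.00],
    [0.45, 0.55, 0.00, 0.00],
    [0.00, 0.05, 0.45, 0.50],
    [0.00, 0.00, 0.50, 0.50],
    [0.55, 0.45, 0.00, 0.00],
])

# Build the cluster tree, then cut it into at most two flat clusters.
tree = linkage(bowr, method="average", metric="euclidean")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Cycles 1, 2 and 5 share one label and cycles 3 and 4 the other, so each flat cluster groups On cycles with a similar symbol distribution.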
Experiments and results
Cophenetic coefficients of dynamic time warping (DTW) and BoWR
No.  Clustering methods  Bag of word representation (BoWR)  Dynamic time warping (DTW) 

1  Average  0.9897  0.0375 
2  Centroid  0.9851  0.037 
3  Complete  0.9753  0.035 
4  Median  0.9803  0.0363 
5  Single  0.9848  0.0414 
6  Ward  0.9835  0.0363 
7  Weighted  0.9888  0.0368 
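Cophenetic coefficients of the kind reported in the table can be computed with SciPy as sketched below. The synthetic feature vectors are a hypothetical stand-in for the BoWR data, so the resulting values are not comparable with the table; only the seven linkage methods are taken from it.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Hypothetical stand-in for BoWR vectors: two tight groups of 4-bin histograms.
group_a = rng.dirichlet([20, 20, 1, 1], size=10)
group_b = rng.dirichlet([1, 1, 20, 20], size=10)
features = np.vstack([group_a, group_b])

# Condensed vector of pairwise distances between all observations.
distances = pdist(features, metric="euclidean")

scores = {}
for method in ["average", "centroid", "complete", "median", "single", "ward", "weighted"]:
    tree = linkage(features, method=method)
    # Correlation between tree (cophenetic) distances and original distances.
    c, _ = cophenet(tree, distances)
    scores[method] = float(c)

for method, c in scores.items():
    print(f"{method:9s} {c:.4f}")
```

A coefficient close to 1 means that the heights at which observations merge in the tree track their original pairwise distances closely.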
Cluster information of the five clusters with hierarchical clustering
Cluster_no  Percentage of cycles in cluster (%)  Average CoP of On cycles in cluster  Average time of On cycles in cluster (hours) 

\(Cluster_1\)  0.73  0.16  67.65 
\(Cluster_2\)  98.75  0.54  0.95 
\(Cluster_3\)  0.06  0.62  0.09 
\(Cluster_4\)  0.34  0.87  0.07 
\(Cluster_5\)  0.12  0.7  0.06 
Comparison of proposed method with CAS modeling
The main aim of this research is to find various patterns in the operation of the energy system in buildings using the minimum possible input from engineers. For the analysis of the energy system, the data has been selected according to IEA SHC Task 38. The issues that may surface when modeling an existing system using CAS can be traced back to the need for complete knowledge of the system, its behaviors or states, and the interaction of the subsystems; this is a problem of scale that involves a very large state-space representation. Secondly, complex dynamic systems will require transitions between completely different behaviors, in the form of so-called phase transitions. Hence, detecting a critical transition will require a detailed state-space model.
Conclusions
The main goal of this research work is to provide analysis algorithms that automatically find the various patterns in the energy system of a building using as little configuration or field knowledge as possible. A bag of words representation method with hierarchical clustering has been proposed to assess the performance of a building energy system. In the first phase, a k-means clustering algorithm is used to find the On (operational) cycles of the chiller. These On cycles are represented with symbols using the symbolic aggregate approximation (SAX) method. The symbolic representation is then transformed to BoWR, which is provided to hierarchical clustering. The proposed method has been compared with the dynamic time warping (DTW) method using cophenetic coefficients, and BoWR produced better results than DTW. The BoWR results are investigated further, and gap statistics are used to find the optimal number of clusters. Finally, interesting patterns in each cluster are discussed in detail.
In future, the current research can be used in the field of automatic fault detection and diagnostics (FDD) in buildings, as it helps in finding the different performance patterns. This would allow experts in the field to look only at those areas where the performance is poor. Further research is needed to find intelligent ways of diagnosing the faults.
Declarations
Authors’ contributions
UH, KH and GZ conceived and designed the experiments. The experiments are performed by UH. The data has been analyzed by UH, KH and GZ. The paper is written by UH, KH and GZ. All authors read and approved the final manuscript.
Acknowledgements
This work was partly funded by the Austrian Funding Agency in the funding programme e!MISSION within the project “extrACT”, Project Number 838688.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 Andrii C (2014) Exploring behavioral patterns in complex adaptive systems. PhD thesis, University of Pittsburgh, Pennsylvania
 Anwar H, Zambanini S, Kampel M (2015) Efficient scale and rotation invariant encoding of visual words for image classification. IEEE Signal Process Lett 22(10):1762–1765
 Armano G, Javarone MA (2013) Clustering datasets by complex networks analysis. Complex Adapt Syst Model 1(1):5
 Avram V, Rizescu D (2014) Measuring external complexity of complex adaptive systems using Onicescu’s informational energy. Mediterr J Soc Sci 5(22):407
 Azar E, Menassa CC (2011) Agent-based modeling of occupants and their impact on energy use in commercial buildings. J Comp Civ Eng 26(4):506–518
 Azar E, Menassa C (2010) A conceptual framework to energy estimation in buildings using agent based modeling. In: Proceedings of the 2010 winter simulation conference (WSC), pp 3145–3156
 Batty M (2007) Cities and complexity: understanding cities with cellular automata, agent-based models, and fractals. The MIT Press, Massachusetts
 Caliński T, Harabasz J (1974) A dendrite method for cluster analysis. Commun Stat 3(1):1–27
 Capozzoli A, Lauro F, Khan I (2015) Fault detection analysis using data mining techniques for a cluster of smart office buildings. Expert Syst Appl 42(9):4324–4338
 Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Anal Mach Intell 1(2):224–227
 Djuric N, Novakovic V (2012) Identifying important variables of energy use in low energy office building by using multivariate analysis. Energy Build 45:91–98
 Domínguez M, Fuertes JJ, Alonso S, Prada MA, Morán A, Barrientos P (2013) Power monitoring system for university buildings: architecture and advanced analysis tools. Energy Build 59:152–160
 Fan W, Qiao P (2011) Vibration-based damage identification methods: a review and comparative study. Struct Health Monit 10(1):83–111
 Faucher JB (2010) Reconceptualizing knowledge management: knowledge, social energy, and emergent leadership in social complex adaptive systems. PhD thesis, University of Otago, Dunedin
 Figueiredo V, Rodrigues F, Vale Z, Gouveia JB (2005) An electric energy consumer characterization framework based on data mining techniques. IEEE Trans Power Syst 20(2):596–602
 Florita AR, Brackney LJ, Otanicar TP, Robertson J (2013) Classification of commercial building electrical demand profiles for energy storage applications. J Solar Energy Eng 135(3):031020–031020
 Grimm V, Revilla E, Berger U, Jeltsch F, Mooij WM, Railsback SF, Thulke HH, Weiner J, Wiegand T, DeAngelis DL (2005) Pattern-oriented modeling of agent-based complex systems: lessons from ecology. Science 310(5750):987–991
 Habib U, Jørstad I, Thanh DV, Khan IA (2011) A framework for secure Linux based authentication in enterprises via mobile phone. J Basic Appl Sci Res 1(12):3058–3066
 Habib U, Zucker G (2015) Finding the different patterns in buildings data using bag of words representation with clustering. In: 2015 13th International conference on Frontiers of information technology, pp 303–308
 Habib U, Zucker G, Blochle M, Judex F, Haase J (2015) Outliers detection method using clustering in buildings data. In: Industrial electronics society, IECON 2015 - 41st Annual Conference of the IEEE, pp 000694–000700
 Hadzikadic M (2010) Energy in the context of complex adaptive systems: predator-prey dynamics. In: LAWDN-Latin-American workshop on dynamic networks, p 1
 Iglesias F, Kastner W (2013) Analysis of similarity measures in times series clustering for the discovery of building energy patterns. Energies 6(2):579–597
 Isermann R (2005) Model-based fault-detection and diagnosis - status and applications. Ann Rev Control 29(1):71–85
 Jensen T, Holtz G, Baedeker C, Chappin ÉJ (2016) Energy-efficiency impacts of an air-quality feedback device in residential buildings: an agent-based modeling assessment. Energ Build 19(1):4
 Katipamula S, Brambley MR (2005) Review article: methods for fault detection, diagnostics, and prognostics for building systems - a review. HVAC&R Res 11(1):3–25
 Katipamula S, Brambley MR (2005) Review article: methods for fault detection, diagnostics, and prognostics for building systems - a review. HVAC&R Res 11(2):169–187
 Kayman EA (2014) Chaos in education as an intelligent complex adaptive system. Chaos and complexity theory in world politics 280
 Keogh E, Ratanamahatana CA (2004) Exact indexing of dynamic time warping. Knowl Inf Syst 7(3):358–386
 Ketchen DJ, Shook CL (1996) The application of cluster analysis in strategic management research: an analysis and critique. Strateg Manag J 17(6):441–458
 Khan A, Hornbæk K (2011) Big data from the built environment. Proceedings of the 2nd International Workshop on Research in the Large, LARGE ’11. ACM, New York, pp 29–32
 Korhonen J, Snäkin JP (2015) Quantifying the relationship of resilience and eco-efficiency in complex adaptive energy systems. Ecol Econom 120:83–92
 Kusiak A, Song Z (2008) Clustering-based performance optimization of the boiler-turbine system. IEEE Trans Energ Convers 23(2):651–658
 Lee ET, Eun HC (2015) Damage identification through the comparison with pseudo-baseline data at damaged state. Eng Comp 40:1–8
 Levin SA (1998) Ecosystems and the biosphere as complex adaptive systems. Ecosystems 1(5):431–436
 Levin MS (2007) Towards hierarchical clustering (Extended Abstract). In: Diekert V, Volkov MV, Voronkov A (ed) Computer Science - theory and applications: proceedings of second international symposium on computer science in Russia, CSR 2007, Ekaterinburg, pp 205–215
 Lin J, Keogh E, Wei L, Lonardi S (2007) Experiencing SAX: a novel symbolic representation of time series. Data Min Knowl Discov 15(2):107–144
 Lin J, Li Y (2009) Finding structural similarity in time series data using bag-of-patterns representation. In: Winslett M (ed) Scientific and statistical database management, vol 5566, Lecture notes in computer science. Springer, Berlin, pp 461–477
 Miller C, Nagy Z, Schlueter A (2015) Automated daily pattern filtering of measured building performance data. Autom Constr 49:1–17
 Moffat J (2010) Complexity theory and network centric warfare. DIANE Publishing, Pennsylvania
 Mourad M, Bertrand-Krajewski JL (2002) A method for automatic validation of long time series of data in urban hydrology. Water Sci Technol 45(4–5):263–270
 Napolitano A, Sparber W, Thür A, Finocchiaro P, Nocke B (2011) Monitoring procedure for solar cooling systems. Technical Report IEA Task 38, International Energy Agency
 Narayanaswamy B, Balaji B, Gupta R, Agarwal Y (2014) Data driven investigation of faults in HVAC systems with model, cluster and compare (MCC). In: Proceedings of the 1st ACM conference on embedded systems for energy-efficient buildings, BuildSys ’14. ACM, New York, pp 50–59
 Olsson P, Folke C, Berkes F (2004) Adaptive co-management for building resilience in social-ecological systems. Environ Manag 34(1):75–90
 Oosterhuis K (2012) Simply complex, toward a new kind of building. Front Arch Res 1(4):411–420
 Pietruschka D, Dalibard A, Ben I, Focke H, Judex F, Preisler Helm M, Ohnewein P, Frein A, Muscherá M (2015) Report for self-detection on monitoring procedure. Technical Report IEA Task 48/B6, International Energy Agency
 Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comp Appl Math 20:53–65
 Saraçli S, Doğan N, Doğan İ (2013) Comparison of hierarchical cluster analysis methods by cophenetic correlation. J Inequal Appl 2013(1):1–8
 Seem JE (2005) Pattern recognition algorithm for determining days of the week with similar energy consumption profiles. Energy Build 37(2):127–139
 Seem JE (2007) Using intelligent data analysis to detect abnormal energy consumption in buildings. Energy Build 39(1):52–58
 Shah MA, Abbas G, Dogar AB, Halim Z (2015) Scaling hierarchical clustering and energy aware routing for sensor networks. Complex Adapt Syst Model 3(1):5
 Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc 63(2):411–423
 Vesanto J, Alhoniemi E (2000) Clustering of the self-organizing map. IEEE Trans Neural Netw 11(3):586–600
 Zucker G, Habib U, Blöchle M, Judex F, Leber T (2015) Sanitation and analysis of operation data in energy systems. Energies 8(11):12776–12794
 Zucker G, Habib U, Blöchle M, Wendt A, Schaat S, Siafara LC (2015) Building energy management and data analytics. In: 2015 international symposium on smart electric distribution systems and technologies (EDST), pp 462–467
 Zucker G, Malinao J, Habib U, Leber T, Preisler A, Judex F (2014) Improving energy efficiency of buildings using data mining technologies. In: 2014 IEEE 23rd international symposium on industrial electronics (ISIE), pp 2664–2669