180 X. Liang et al. / Building and Environment 102 (2016) 179-192

[12,13].

Owing to the significant impacts on energy consumption and prediction in buildings, a number of studies have focused on occupants' energy use characteristics, defined as the presence of occupants in the building and their actions to influence (or not to influence) energy consumption [14]. D'Oca and Hong [15] observed and identified the patterns of window opening and closing behavior in an office building. Zhou et al. [16] analyzed lighting behavior in large office buildings based on a stochastic model. Zhang et al.
[17] simulated occupant movement, light and equipment use behavior synthetically with agent-based models. Sun et al. [18] investigated the impact of overtime working on energy consumption in an office building. Azar and Menassa [6] showed the education and learning effect of energy saving behavior, and proposed the impacts of energy conservation promotion on energy saving.

Before modelling occupants' energy use characteristics, there is a more essential research question: how to identify the pattern of occupant presence and predict the occupancy schedule? Without the answer to this question, the modelling of occupants' energy use characteristics cannot be put on a firm footing. However, due to highly stochastic activities and insufficient data, it is difficult to observe and predict occupant presence. Previous studies did not pay enough attention to the occupancy schedule, and this question has not been well addressed. In general, three typical methods were applied to model occupant presence in previous studies. The first method is fixed schedules. Occupants are categorized into several groups (e.g., early bird, timetable complier and flexible worker), then each group is assigned a specific schedule [17]. Combining the schedules of each group proportionally generates the schedule of the whole building. The second method assumes that occupant presence follows a certain probability distribution. The distribution can be a Poisson distribution [16], binomial distribution [18], uniform distribution or triangle distribution [19]. The occupancy schedule is then obtained by generating virtual occupants following that distribution. The third method is analyzing practical observation data. D'Oca and Hong [8] observed 16 private offices with single or dual occupancy and Wang et al. [20] observed 35 offices with single occupancy.

Although these methods had advantages and improved occupancy schedule modeling, there are still some limitations: (1) the assumptions are not solid.
Occupancy schedule is highly stochastic, so it is inappropriate to simply assume that occupants belong to a certain group or follow a certain distribution; (2) previous research emphasized summarizing rules of occupant presence, but less attention has been paid to predicting future schedules. The results are not practical if they cannot guide future work; (3) the resulting schedules lack validation with real data; (4) observed data mainly focused on a single office or a few offices, so the data are limited and results may be biased if applied to the whole building.

To bridge the aforementioned research gaps, this study proposes a data mining based approach to learning and predicting the occupancy schedule for the whole building. Data mining can be defined as: “The analysis of large observation data sets to find unsuspected relationships and to summarize the data in novel ways so that owners can fully understand and make use of the data” [21]. Data mining methods have significant advantages in revealing underlying patterns of data and have been widely used in various research and industry fields, such as marketing, biology, engineering and social science [22]. However, the application of data mining to occupancy schedules and building energy consumption is still underdeveloped. Some previous studies applied data mining methods to discover patterns of occupant behavior [15,23,24], and others focused on interactions between occupants and energy consumption [8,25,26]. These studies demonstrated the power of data mining methods in recognizing patterns of occupant behavior and energy consumption, but occupancy schedule learning and prediction still needs exploration.

The aim of this study is to present a new approach for occupancy schedule learning and prediction in office buildings using data mining based methods.
The process of this study includes recognizing the patterns of occupant presence, summarizing the rules of the recognized patterns and finally predicting the occupancy schedules. This study hypothesizes that the patterns and rules identified by the proposed data mining approach are correct, namely, that they represent the true characteristics of the occupancy data. This hypothesis is validated by comparing the prediction accuracy of the proposed method with that of the traditional methods. If the accuracy of the prediction results is improved, it indicates the hypothesis is true.

This model only needs a few types of inputs, typically the time series data of the number of occupants entering and exiting a building. Another advantage of this model is that it allows for relatively simple operations, excluding probability distribution fitting and other complex mathematical processing. This makes the method readily adaptable to practical projects. The results of this study are critical for providing insight into the pattern of occupant presence, facilitating energy simulation and prediction, and improving energy saving operation and retrofit.

2. Methodology

2.1. Framework of occupancy schedule learning and prediction

Traditional methods of transforming data into knowledge normally use statistical tests, regression and curve fitting with a certain probability distribution. These methods are effective when the data are small in volume, accurate and standardized. However, as the volume of data has grown exponentially in recent years, these methods have become slow and expensive. More seriously, when there is considerable missing data, deviating data, or an inconsistent data format (e.g. different time steps, a mix of numbers and words), these methods cannot be applied or cannot produce satisfactory results. Data mining is an emerging method which can process big data and unstructured data effectively and robustly.
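The input named above, counts of occupants entering and exiting the building per time step, can be turned into an occupancy profile with very little processing. The Python sketch below is illustrative only and is not the authors' implementation: the function names `occupancy_from_counts` and `normalise_to_peak`, the clamping of negative headcounts to zero, the normalisation to the daily peak, and the sample numbers are all assumptions introduced here.

```python
def occupancy_from_counts(entering, exiting, initial=0):
    """Return the number of occupants present after each time step.

    entering / exiting: per-time-step counts of people passing the door.
    initial: assumed headcount before the first time step.
    """
    if len(entering) != len(exiting):
        raise ValueError("entering and exiting must have the same length")
    present = initial
    profile = []
    for n_in, n_out in zip(entering, exiting):
        present += n_in - n_out
        # Assumption: counters can miss people, so a negative running
        # total is treated as an empty building rather than kept as-is.
        present = max(present, 0)
        profile.append(present)
    return profile


def normalise_to_peak(profile):
    """Scale a headcount profile to the range 0..1 (assumed convention)."""
    peak = max(profile, default=0)
    return [p / peak if peak else 0.0 for p in profile]


if __name__ == "__main__":
    # Hypothetical counts for six consecutive time steps.
    entering = [5, 20, 10, 0, 0, 0]
    exiting = [0, 0, 5, 10, 15, 5]
    profile = occupancy_from_counts(entering, exiting)
    print(profile)  # [5, 25, 30, 20, 5, 0]
    print([round(v, 2) for v in normalise_to_peak(profile)])
```

A running sum of this kind involves no distribution fitting, which is consistent with the "relatively simple operations" the text describes, although the paper's actual preprocessing may differ.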
Machine learning, as a main method of data mining, is particularly good at identifying patterns and inducing rules. Since this study involves a huge volume of data and aims to induce rules of occupancy schedules, data mining is selected as the research method.

Data mining, which is also named knowledge discovery in databases (KDD), is a relatively young and interdisciplinary field of computer science. It is the process of discovering new patterns from large data sets, involving methods at the intersection of pattern recognition, machine learning, artificial intelligence, cloud architecture, and data visualization [27]. Normally, the process of KDD involves six steps: (1) Data selection; (2) Data cleaning and preprocessing; (3) Data transformation; (4) Data mining; (5) Data interpretation and evaluation; and (6) Knowledge extraction [8].

This study proposes a data mining based approach to discover occupancy schedule patterns and extrapolate the occupancy schedule from observed big data streams of a building. The framework of this proposed method includes six steps, illustrated in Fig. 1.

Step 1: problem framing. The first step is to clarify the problem definition, boundary, assumptions and key metric of success. The research problem is defined as how to predict the occupancy schedule from historical observed data. The scope of this study focuses on schedule prediction for weekdays in office buildings. The key metric of success is the similarity of the prediction results to the observed data.

Step 2: data acquisition and preparation. The second step is to