Deep imputation network for missing-value imputation of concurrent continuous missing patterns using relationship among multiple IoT data streams in a smart space = A study on value imputation for continuous missing patterns using the relationships among multiple Internet of Things data streams
Missing values with varied missing patterns and high missing rates, arising from the many data streams of Internet of Things (IoT) devices, remain a challenging problem. While analyzing a dataset collected from a smart space with multiple IoT devices, we found a continuous missing pattern that differs markedly from previously described missing-value patterns. This pattern consists of large blocks of consecutive missing values, ranging from a few seconds to a few hours in length. In addition, concurrent continuous missing patterns, in which such blocks occur simultaneously in multiple IoT data streams, are very common. Given the explosive growth of IoT data streams, this pattern is a vital factor in the availability and reliability of IoT applications; however, existing missing-value imputation methods cannot handle it. A novel approach to imputing concurrent continuous missing patterns is therefore required for a smart space with multiple IoT data streams. We hypothesized that, even when a continuous missing pattern occurs in multiple IoT data streams, the missing values can be imputed by learning from other related data streams with a machine-learning method. To this end, we analyzed multiple IoT data streams in a single smart space and determined the relationships among them, namely correlation and causality, where correlation is a linear interdependency between streams and causality indicates cause and effect among multiple IoT data streams in a smart space. To impute the missing values of concurrent continuous missing patterns in an IoT environment, we propose a deep-learning-based imputation model that exploits information on these relationships, called the deep imputation network (DeepIN). DeepIN uses multiple long short-term memory (LSTM) networks that are constructed according to the relationships of each IoT data stream.
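To make the notions of a continuous missing block, concurrency across streams, and correlation between streams concrete, the following is a minimal sketch in plain Python. It is not DeepIN and does not reproduce the thesis pipeline: the function names, the `min_len` threshold, the use of Pearson correlation as the correlation measure, and the toy sensor values are all assumptions made for illustration, and causality analysis is not covered.

```python
from math import sqrt
from typing import Dict, List, Optional, Sequence, Tuple

Stream = Sequence[Optional[float]]  # None marks a missing sample (assumed encoding)


def missing_blocks(stream: Stream, min_len: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for runs of None of length >= min_len."""
    blocks: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, value in enumerate(stream):
        if value is None:
            if start is None:
                start = i  # a new run of missing values begins here
        elif start is not None:
            if i - start >= min_len:
                blocks.append((start, i))
            start = None
    # Close a run that extends to the end of the stream.
    if start is not None and len(stream) - start >= min_len:
        blocks.append((start, len(stream)))
    return blocks


def concurrent_missing(streams: Dict[str, Stream], min_len: int = 2) -> List[int]:
    """Return time indices at which two or more streams are inside a missing block."""
    length = min(len(s) for s in streams.values())
    counts = [0] * length
    for stream in streams.values():
        for start, end in missing_blocks(stream[:length], min_len):
            for t in range(start, end):
                counts[t] += 1
    return [t for t, c in enumerate(counts) if c >= 2]


def pearson(a: Stream, b: Stream) -> Optional[float]:
    """Pearson correlation over samples observed in both streams; None if undefined."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None  # a constant stream has no defined correlation
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical values, not taken from the thesis dataset).
streams: Dict[str, Stream] = {
    "temperature": [20.0, 21.0, None, None, None, 25.0, 26.0, 27.0],
    "humidity": [60.0, 58.0, 56.0, None, None, None, 48.0, 46.0],
    "light": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
}

if __name__ == "__main__":
    for name, stream in streams.items():
        print(name, missing_blocks(stream))
    print("concurrent indices:", concurrent_missing(streams))
    print("corr(temperature, light):", pearson(streams["temperature"], streams["light"]))
```

In this toy setting, the fully observed `light` stream is the kind of related stream that a learned model could draw on while `temperature` and `humidity` are missing at the same time; the sketch only locates the blocks and measures the relationship, and leaves the imputation itself to a model such as the one proposed here.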
We evaluated DeepIN on an IoT dataset from our real smart-office testbed. The experimental results showed reasonable missing-value imputation accuracy (75% to 80%) when continuous missing patterns occur simultaneously in up to three IoT data streams. Furthermore, our proposed approach dramatically improves imputation performance over the state-of-the-art missing-value imputation algorithm. From our extensive experiments and analyses, we found that the causality information among multiple IoT data streams is the most significant factor in the missing-value imputation of concurrent continuous missing patterns. These results suggest that our proposed approach is a promising methodology for enabling many smart-space services and applications, even when a long block of values is missing in an IoT environment.