FINE: A Framework for Distributed Learning on Incomplete Observations for Heterogeneous Crowdsensing Networks

Luoyi Fu1, Songjun Ma2, Lingkun Kong1, Shiyu Liang2, Xinbing Wang1,2
Dept. of {Computer Science and Engineering1, Electronic Engineering2}
Shanghai Jiao Tong University, Shanghai, China
Email: {yiluofu,masongjun,klk316980786,lsy18602808513,xwang8}@sjtu.edu.cn

Abstract—In recent years there has been a wide range of applications of crowdsensing in mobile social networks and vehicle networks. As centralized learning methods lead to unreliable data collection, the high cost of a central server and privacy concerns, one important problem is how to carry out an accurate distributed learning process to estimate the parameters of an unknown model in crowdsensing. Motivated by this, we present the design, analysis and evaluation of FINE, a distributed learning Framework for Incomplete-data and Non-smooth Estimation. Our design, devoted to developing a feasible framework that efficiently and accurately learns the parameters in crowdsensing networks, generalizes previous learning methods in that it supports heterogeneous dimensions of data records observed by different nodes, as well as minimization based on non-smooth error functions.
In particular, FINE uses a novel Distributed Record Completion algorithm that allows each node to obtain the global consensus through efficient communication with its neighbours, and a Distributed Dual Average algorithm that efficiently minimizes non-smooth error functions. Our analysis shows that all of these algorithms converge, and we derive their convergence rates to confirm their efficiency. We evaluate the performance of our framework with experiments on synthetic and real-world networks.

I. INTRODUCTION

Recently, a large number of crowdsensing/participatory sensing applications have emerged in mobile social networks and vehicle networks [1]–[7]. The crowd acquires some (potentially high-dimensional) data from the environment, and each user in the crowd can exploit the cooperatively acquired data to perform a learning process that accurately estimates the parameters of some specific model. This, in turn, leads to accurate prediction of future events and correct decisions about subsequent actions.

In this paper, we aim to address the issue of accurate learning in undirected-static-random crowdsensing networks. Various approaches have been proposed to solve this problem [8]–[10]; their learning processes are usually formulated as optimizations of the total training error, the likelihood function, etc. [11] (e.g., linear regression, support vector machines or expectation-maximization [12]). However, these methods usually employ centralized learning algorithms, which leads to three major problems. First, in real-world crowdsensing settings, mobile devices are likely to be spread over an enormous space, which makes it both energy-consuming and error-prone for a central server to collect data from all mobile devices, especially those located far from the server.
Second, processing a large volume of data with centralized algorithms requires an expensive, high-configuration data center with huge memory for data storage and processing. Third, managing data on central servers makes users' private information more likely to be exposed to an adversary [13]–[15], which might cause severe information leakage.

The three problems above imply the necessity of a distributed realization of parameter learning in crowdsensing environments. However, when existing distributed learning methods are applied to our scenarios, two restrictive characteristics of the common distributed framework create additional problems. First, for mathematical tractability, the error function is usually assumed to be smooth and convex so that efficient algorithms can be designed; however, because emerging crowdsensing applications may have different properties, the training error functions may be non-smooth in nature [16], [17]. For instance, in distributed detection [18], source intensity functions may be non-smooth, resulting in a non-smooth training error as well. Second, the common framework requires each terminal to acquire a set of complete records, i.e., records with data elements in all dimensions, to ensure the accuracy of the learning process. This implicitly assumes that the terminals are homogeneous in functionality, so that each of them records the same types of signals (e.g., each mobile phone can record traveling speed, waiting time and ambient noise at every position). In contrast, it is impossible for each terminal to record full-dimension data in crowdsensing applications. For example, one mobile phone can only be responsible for data acquisition at its own position, leaving the observation of elements at other positions (i.e., dimensions) to other mobile phones. Moreover, mobile phones may hold different types of sensors, and are therefore unable to acquire all kinds of signals.
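
To make the two restrictions above concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the FINE algorithms: the node names, the sample values, the choice of absolute error as the non-smooth loss, and the helper functions are all hypothetical. It shows (a) a convex but non-smooth error function, for which only a subgradient is available at the kink, and (b) heterogeneous nodes that each observe only some dimensions of a record.

```python
# Illustrative sketch only: all names and values are hypothetical,
# not taken from the paper.

def abs_loss(theta, x, y):
    """Non-smooth absolute error |<theta, x> - y| for a single record."""
    pred = sum(t * xi for t, xi in zip(theta, x))
    return abs(pred - y)

def abs_loss_subgradient(theta, x, y):
    """Return one valid subgradient of abs_loss with respect to theta.

    Where the residual is zero the loss has a kink and no gradient exists;
    the zero vector is a valid subgradient there.
    """
    residual = sum(t * xi for t, xi in zip(theta, x)) - y
    sign = (residual > 0) - (residual < 0)  # -1, 0 or 1
    return [sign * xi for xi in x]

# Heterogeneous nodes: each observes only part of a 3-dimensional record.
# None marks a dimension the node cannot sense.
observations = {
    "node_a": [1.0, None, None],
    "node_b": [None, 2.0, None],
    "node_c": [None, 3.0, 4.0],
}

def observed_dims(record):
    """Indices of the dimensions a node actually observed."""
    return [i for i, v in enumerate(record) if v is not None]

def merge_records(records):
    """Average the available values in each dimension.

    This is a centralized stand-in, used here only to show what a complete
    record looks like; a distributed scheme would have to reach the same
    result through communication between neighbouring nodes.
    """
    dims = len(next(iter(records.values())))
    merged = []
    for d in range(dims):
        vals = [r[d] for r in records.values() if r[d] is not None]
        merged.append(sum(vals) / len(vals) if vals else None)
    return merged

complete = merge_records(observations)
theta = [1.0, 1.0, 1.0]
print(complete)
print(abs_loss(theta, complete, 7.0))
print(abs_loss_subgradient(theta, complete, 7.0))
```

In this sketch no single node holds a full record, so none of them could evaluate the loss alone. The merge step is what a record-completion procedure would need to achieve without a central server.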
We are thus motivated to propose a distributed learning Framework of Incomplete-data and Non-smooth Estimation (FINE), which aims to exhibit high compatibility with learning applications in crowdsensing environments. There are two major challenges in the design: 1) It is difficult for each node to supplement the unknown dimensions of the observed vector