The distributed alternating direction method of multipliers (ADMM) is an effective algorithm for solving large-scale optimization problems. However, distributed ADMM still incurs massive computation and communication costs when processing high-dimensional data. To address this problem, we propose a distributed ADMM with sparse computation and Allreduce communication (SCAC-ADMM) that processes high-dimensional data efficiently. In the algorithm, each node optimizes a sub-model of the target model in parallel, and the target model is then obtained by aggregating all sub-models. The features included in a sub-model are called its associated features. In SCAC-ADMM, we first design a method for selecting associated features, which determines the composition of each sub-model. By setting appropriate parameters, this method bounds the dimension of each sub-model and thereby limits the computation cost. Second, to reduce the communication traffic caused by transmitting high-dimensional parameters, we propose a novel Allreduce communication model that aggregates only the associated parameters of the sub-models. Experiments on high-dimensional datasets show that SCAC-ADMM incurs less computation cost and achieves higher communication efficiency than traditional distributed ADMM. When solving a large-scale logistic regression problem, SCAC-ADMM reduces system time by 73% compared with traditional distributed ADMM.
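To make the sparse-aggregation idea concrete, below is a minimal single-process sketch of how sub-models restricted to associated features could be combined into a global consensus variable. This is an illustrative assumption, not the paper's exact formulation: the names (`associated_idx`, `sub_models`, `z`) and the coordinate-wise averaging rule are hypothetical, and the real SCAC-ADMM performs this aggregation across nodes via its Allreduce communication model rather than in one process.

```python
import numpy as np

# Illustrative sketch (assumed, not the paper's exact method):
# each node holds a sub-model only over its associated features,
# and aggregation touches only those associated parameters.

d = 10          # full model dimension
n_nodes = 4     # number of worker nodes
rng = np.random.default_rng(0)

# Hypothetical associated-feature sets and sub-models per node.
associated_idx = [rng.choice(d, size=5, replace=False) for _ in range(n_nodes)]
sub_models = [rng.standard_normal(len(idx)) for idx in associated_idx]

# Sparse aggregation: scatter each sub-model into the full space,
# then average each coordinate over the nodes that actually hold it,
# so non-associated parameters are never transmitted or combined.
num = np.zeros(d)
cnt = np.zeros(d)
for idx, w in zip(associated_idx, sub_models):
    num[idx] += w
    cnt[idx] += 1
z = np.divide(num, np.maximum(cnt, 1))  # global consensus estimate

print("aggregated model z:", np.round(z, 3))
```

In a distributed deployment, the per-coordinate sums `num` and counts `cnt` are exactly the quantities an Allreduce over the associated entries would produce, which is why restricting the reduction to sub-model parameters cuts the traffic relative to all-to-all exchange of full d-dimensional vectors.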