Decentralized algorithms have received increasing attention in recent work. In a centralized network, every worker node must communicate with the central nodes, so communication traffic grows as the network expands. To reduce communication costs in distributed systems, we propose Grouping Ring All-Reduce ADMM (GR-ADMM), a decentralized algorithm based on the alternating direction method of multipliers (ADMM). First, GR-ADMM adopts a decentralized architecture to avoid the communication bottleneck of a centralized network. Second, to keep the distributed system scalable, GR-ADMM introduces Ring All-Reduce into ADMM. The advantage of the Ring All-Reduce architecture is its constant communication overhead; however, its performance is bounded by stragglers (i.e., slow nodes). Third, GR-ADMM adopts a grouping strategy to alleviate the straggler problem. Experiments show that GR-ADMM converges better than QSGD and GADMM, especially on large clusters. Compared with GADMM, GR-ADMM reduces the overall communication cost by 72%.
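To make the communication-overhead property of Ring All-Reduce concrete, the following is a minimal single-process sketch of a generic ring all-reduce (sum). It is not the GR-ADMM implementation and does not include the ADMM updates or the grouping strategy; the function name `ring_all_reduce` and the `sent` counter are assumptions introduced here for illustration only.

```python
def ring_all_reduce(vectors):
    """Simulate a generic ring all-reduce (sum) in one process.

    vectors[i] is the local vector of node i; all vectors have equal length.
    Returns (results, sent): results[i] is node i's final vector and
    sent[i] is the number of elements node i transmitted to its neighbour.
    """
    n = len(vectors)
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all nodes must hold vectors of the same length")

    # Split the index range into n near-equal chunks.
    bounds = [(k * length) // n for k in range(n + 1)]

    def chunk(k):
        return range(bounds[k], bounds[k + 1])

    data = [list(v) for v in vectors]  # working copy per node
    sent = [0] * n

    # Phase 1 (scatter-reduce): after n - 1 steps, node i holds the
    # fully reduced chunk (i + 1) % n.
    for step in range(n - 1):
        # Build all messages first so the sends in a step are simultaneous.
        messages = []
        for i in range(n):
            k = (i - step) % n
            messages.append((k, [data[i][j] for j in chunk(k)]))
        for i, (k, payload) in enumerate(messages):
            dst = (i + 1) % n  # each node only talks to its right neighbour
            for j, value in zip(chunk(k), payload):
                data[dst][j] += value
            sent[i] += len(payload)

    # Phase 2 (all-gather): circulate the reduced chunks around the ring.
    for step in range(n - 1):
        messages = []
        for i in range(n):
            k = (i + 1 - step) % n
            messages.append((k, [data[i][j] for j in chunk(k)]))
        for i, (k, payload) in enumerate(messages):
            dst = (i + 1) % n
            for j, value in zip(chunk(k), payload):
                data[dst][j] = value
            sent[i] += len(payload)

    return data, sent
```

In this sketch each node transmits 2(N-1) chunks of roughly 1/N of the vector, so the per-node volume stays below twice the vector length regardless of the number of nodes N. For example, with 4 nodes and vectors of length 12, each node sends 18 elements. Because every step waits on all neighbours, a single slow node delays the whole ring, which is the straggler issue the grouping strategy is meant to alleviate.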