Data/Feature Distributed Stochastic Coordinate Descent for Logistic Regression

Cited 0 times in Web of Science, 8 times in Scopus
  • Hit : 186
  • Download : 0
DC Field: Value (Language)
dc.contributor.author: Kang, Dongyeop (ko)
dc.contributor.author: Lim, Woosang (ko)
dc.contributor.author: Shin, Kijung (ko)
dc.contributor.author: Sael, Lee (ko)
dc.contributor.author: Kang, U. (ko)
dc.date.accessioned: 2019-03-19T01:15:14Z
dc.date.available: 2019-03-19T01:15:14Z
dc.date.created: 2019-03-04
dc.date.issued: 2014-11-05
dc.identifier.citation: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014, pp. 1269-1278
dc.identifier.uri: http://hdl.handle.net/10203/251600
dc.description.abstract: How can we scale up logistic regression, or L1-regularized loss minimization in general, to terabyte-scale data that do not fit in memory? How can we design an efficient distributed algorithm for it? Although there are two major algorithms for logistic regression, namely Stochastic Gradient Descent (SGD) and Stochastic Coordinate Descent (SCD), both face limitations in distributed environments. Distributed SGD enables data parallelism (i.e., different machines access different parts of the input data), but it does not allow feature parallelism (i.e., different machines computing different subsets of the output), and thus its communication cost is high. On the other hand, distributed SCD allows feature parallelism, but it does not allow data parallelism and is therefore not well suited to distributed environments. In this paper we propose DF-DSCD (Data/Feature Distributed Stochastic Coordinate Descent), an efficient distributed algorithm for logistic regression, or L1-regularized loss minimization in general. DF-DSCD allows both data and feature parallelism. The benefits of DF-DSCD are (a) full utilization of the capabilities provided by modern distributed computing platforms such as MapReduce to analyze web-scale data, and (b) independence of each machine in updating parameters, with little communication cost. We prove the convergence of DF-DSCD theoretically, and show empirical evidence that it is scalable, handles very high-dimensional data with up to 29 million features, and converges 2.2 times faster than competitors. (A minimal single-machine sketch of the SCD update it builds on appears after this record.)
dc.language: English
dc.publisher: Association for Computing Machinery, Inc
dc.title: Data/Feature Distributed Stochastic Coordinate Descent for Logistic Regression
dc.type: Conference
dc.identifier.scopusid: 2-s2.0-84937597355
dc.type.rims: CONF
dc.citation.beginningpage: 1269
dc.citation.endingpage: 1278
dc.citation.publicationname: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
dc.identifier.conferencecountry: CC
dc.identifier.conferencelocation: Shanghai, China
dc.identifier.doi: 10.1145/2661829.2662082
dc.contributor.localauthor: Shin, Kijung
dc.contributor.nonIdAuthor: Kang, Dongyeop
dc.contributor.nonIdAuthor: Lim, Woosang
dc.contributor.nonIdAuthor: Sael, Lee
dc.contributor.nonIdAuthor: Kang, U.
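
For context on the building block the abstract refers to, the following is a minimal single-machine sketch of stochastic coordinate descent (SCD) with soft-thresholding for L1-regularized logistic regression. It is only an illustration under simplifying assumptions, not the paper's DF-DSCD algorithm or its MapReduce implementation; the function names (soft_threshold, scd_l1_logreg), the step sizes, and the toy data are choices made for this sketch rather than details taken from the paper.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t*|.|, used for the L1 penalty.
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def scd_l1_logreg(X, y, lam=0.05, epochs=20, seed=0):
        # X: (n, d) dense feature matrix; y: labels in {-1, +1}; lam: L1 weight.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        margins = y * (X @ w)                      # cache of y_i * x_i^T w
        # Per-coordinate Lipschitz bounds for the logistic loss (sigmoid' <= 1/4).
        beta = np.maximum((X ** 2).sum(axis=0) / (4.0 * n), 1e-12)
        for _ in range(epochs):
            for j in rng.permutation(d):           # visit coordinates in random order
                p = 1.0 / (1.0 + np.exp(np.clip(margins, -30, 30)))  # sigma(-y_i x_i^T w)
                grad_j = -(y * p) @ X[:, j] / n    # partial derivative of the smooth loss
                w_j_new = soft_threshold(w[j] - grad_j / beta[j], lam / beta[j])
                if w_j_new != w[j]:
                    margins += y * X[:, j] * (w_j_new - w[j])  # keep the cache consistent
                    w[j] = w_j_new
        return w

    # Toy usage: recover a sparse weight vector from synthetic data.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 50))
    y = np.sign(X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200))
    w = scd_l1_logreg(X, y)
    print("non-zero coefficients:", int(np.count_nonzero(w)))

In the distributed setting described in the abstract, the rows of X (data parallelism) and its columns (feature parallelism) would additionally be partitioned across machines, which this single-machine sketch does not attempt.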
Appears in Collection
EE-Conference Papers(학술회의논문)
Files in This Item
There are no files associated with this item.
