Data/Feature Distributed Stochastic Coordinate Descent for Logistic Regression

Cited 0 times in Web of Science, 8 times in Scopus
  • Hit : 186
  • Download : 0
DC Field: Value (Language)
dc.contributor.author: Kang, Dongyeop (ko)
dc.contributor.author: Lim, Woosang (ko)
dc.contributor.author: Shin, Kijung (ko)
dc.contributor.author: Sael, Lee (ko)
dc.contributor.author: Kang, U. (ko)
dc.date.accessioned: 2019-03-19T01:15:14Z
dc.date.available: 2019-03-19T01:15:14Z
dc.date.created: 2019-03-04
dc.date.issued: 2014-11-05
dc.identifier.citation: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014, pp. 1269-1278
dc.identifier.uri: http://hdl.handle.net/10203/251600
dc.description.abstract: How can we scale up logistic regression, or L1-regularized loss minimization in general, to terabyte-scale data that do not fit in memory? How can we design an efficient distributed algorithm for it? Although there are two major algorithms for logistic regression, namely Stochastic Gradient Descent (SGD) and Stochastic Coordinate Descent (SCD), both face limitations in distributed environments. Distributed SGD enables data parallelism (i.e., different machines access different parts of the input data), but it does not allow feature parallelism (i.e., different machines computing different subsets of the output), and thus its communication cost is high. On the other hand, distributed SCD allows feature parallelism, but it does not allow data parallelism and is therefore not well suited to distributed environments. In this paper we propose DF-DSCD (Data/Feature Distributed Stochastic Coordinate Descent), an efficient distributed algorithm for logistic regression, or L1-regularized loss minimization in general. DF-DSCD allows both data and feature parallelism. The benefits of DF-DSCD are (a) full utilization of the capabilities provided by modern distributed computing platforms such as MapReduce to analyze web-scale data, and (b) independence of each machine in updating parameters, with little communication cost. We prove the convergence of DF-DSCD theoretically, and show empirical evidence that it is scalable, handles very high-dimensional data with up to 29 million features, and converges 2.2 times faster than competitors. (A minimal single-machine sketch of the SCD update it builds on appears after this record.)
dc.language: English
dc.publisher: Association for Computing Machinery, Inc
dc.title: Data/Feature Distributed Stochastic Coordinate Descent for Logistic Regression
dc.type: Conference
dc.identifier.scopusid: 2-s2.0-84937597355
dc.type.rims: CONF
dc.citation.beginningpage: 1269
dc.citation.endingpage: 1278
dc.citation.publicationname: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
dc.identifier.conferencecountry: CC
dc.identifier.conferencelocation: Shanghai, China
dc.identifier.doi: 10.1145/2661829.2662082
dc.contributor.localauthor: Shin, Kijung
dc.contributor.nonIdAuthor: Kang, Dongyeop
dc.contributor.nonIdAuthor: Lim, Woosang
dc.contributor.nonIdAuthor: Sael, Lee
dc.contributor.nonIdAuthor: Kang, U.
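
For context on the building block the abstract refers to, the following is a minimal single-machine sketch of stochastic coordinate descent (SCD) with soft-thresholding for L1-regularized logistic regression. It is only an illustration under simplifying assumptions, not the paper's DF-DSCD algorithm or its MapReduce implementation; the function names (soft_threshold, scd_l1_logreg), the step sizes, and the toy data are choices made for this sketch rather than details taken from the paper.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t*|.|, used for the L1 penalty.
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def scd_l1_logreg(X, y, lam=0.05, epochs=20, seed=0):
        # X: (n, d) dense feature matrix; y: labels in {-1, +1}; lam: L1 weight.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        margins = y * (X @ w)                      # cache of y_i * x_i^T w
        # Per-coordinate Lipschitz bounds for the logistic loss (sigmoid' <= 1/4).
        beta = np.maximum((X ** 2).sum(axis=0) / (4.0 * n), 1e-12)
        for _ in range(epochs):
            for j in rng.permutation(d):           # visit coordinates in random order
                p = 1.0 / (1.0 + np.exp(np.clip(margins, -30, 30)))  # sigma(-y_i x_i^T w)
                grad_j = -(y * p) @ X[:, j] / n    # partial derivative of the smooth loss
                w_j_new = soft_threshold(w[j] - grad_j / beta[j], lam / beta[j])
                if w_j_new != w[j]:
                    margins += y * X[:, j] * (w_j_new - w[j])  # keep the cache consistent
                    w[j] = w_j_new
        return w

    # Toy usage: recover a sparse weight vector from synthetic data.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 50))
    y = np.sign(X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200))
    w = scd_l1_logreg(X, y)
    print("non-zero coefficients:", int(np.count_nonzero(w)))

In the distributed setting described in the abstract, the rows of X (data parallelism) and its columns (feature parallelism) would additionally be partitioned across machines, which this single-machine sketch does not attempt.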
Appears in Collection
EE-Conference Papers(학술회의논문)
Files in This Item
There are no files associated with this item.
