DSpace at KOASAS: Valkyrie: Leveraging inter-TLB locality to enhance GPU performance

DC Field	Value	Language
dc.contributor.author	Baruah, Trinayan	ko
dc.contributor.author	Sun, Yifan	ko
dc.contributor.author	Mojumder, Saiful A	ko
dc.contributor.author	Abellán, José L	ko
dc.contributor.author	Ukidave, Yash	ko
dc.contributor.author	Joshi, Ajay	ko
dc.contributor.author	Rubin, Norman	ko
dc.contributor.author	Kim, John	ko
dc.contributor.author	Kaeli, David	ko
dc.date.accessioned	2021-12-01T06:50:15Z	-
dc.date.available	2021-12-01T06:50:15Z	-
dc.date.created	2021-11-26	-
dc.date.issued	2020-10-06	-
dc.identifier.citation	2020 ACM International Conference on Parallel Architectures and Compilation Techniques, PACT 2020, pp.456 - 466	-
dc.identifier.uri	http://hdl.handle.net/10203/289868	-
dc.description.abstract	Programming on a GPU has been made considerably easier with the introduction of Virtual Memory features, which support common pointer-based semantics between the CPU and the GPU. However, supporting virtual memory on a GPU comes with some additional costs and overhead, with the largest being from the support for address translation. The fact that a massive number of threads run concurrently on a GPU means that the translation lookaside buffers (TLBs) are oversubscribed most of the time. Our investigation into a diverse set of GPU workloads shows that TLB misses can be extremely high (up to 99%), which inevitably leads to significant performance degradation due to long-latency page-table walks. Our profiling of TLB-sensitive workloads reveals a high degree of page sharing across the different cores of a GPU. In many applications, a page can be accessed in temporal proximity by multiple cores, following similar memory access patterns. To support the inherent sharing present in GPU workloads, we propose Valkyrie, an integrated cooperative TLB prefetching mechanism and an inter-L1-TLB probing scheme that can efficiently reduce TLB bottlenecks in GPUs. Our evaluation using a diverse set of GPU workloads reveals that Valkyrie is able to achieve an average speedup of 1.95×, while adding modest hardware overhead.	-
dc.language	English	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	Valkyrie: Leveraging inter-TLB locality to enhance GPU performance	-
dc.type	Conference	-
dc.identifier.scopusid	2-s2.0-85094207692	-
dc.type.rims	CONF	-
dc.citation.beginningpage	456	-
dc.citation.endingpage	466	-
dc.citation.publicationname	2020 ACM International Conference on Parallel Architectures and Compilation Techniques, PACT 2020	-
dc.identifier.conferencecountry	US	-
dc.identifier.conferencelocation	Virtual	-
dc.identifier.doi	10.1145/3410463.3414639	-
dc.contributor.localauthor	Kim, John	-
dc.contributor.nonIdAuthor	Baruah, Trinayan	-
dc.contributor.nonIdAuthor	Sun, Yifan	-
dc.contributor.nonIdAuthor	Mojumder, Saiful A	-
dc.contributor.nonIdAuthor	Abellán, José L	-
dc.contributor.nonIdAuthor	Ukidave, Yash	-
dc.contributor.nonIdAuthor	Joshi, Ajay	-
dc.contributor.nonIdAuthor	Rubin, Norman	-
dc.contributor.nonIdAuthor	Kaeli, David	-

Appears in Collection: EE-Conference Papers

Files in This Item: There are no files associated with this item.
