DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kislal, Orhan | ko |
dc.contributor.author | Kotra, Jagadish | ko |
dc.contributor.author | Tang, Xulong | ko |
dc.contributor.author | Kandemir, Mahmut Taylan | ko |
dc.contributor.author | Jung, Myoungsoo | ko |
dc.date.accessioned | 2019-12-13T12:29:23Z | - |
dc.date.available | 2019-12-13T12:29:23Z | - |
dc.date.created | 2019-11-28 | - |
dc.date.issued | 2018-06-18 | - |
dc.identifier.citation | 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018, pp.312 - 327 | - |
dc.identifier.issn | 0362-1340 | - |
dc.identifier.uri | http://hdl.handle.net/10203/269575 | - |
dc.description.abstract | Going beyond a certain number of cores in modern architectures requires an on-chip network more scalable than conventional buses. However, employing an on-chip network in a manycore system (to improve scalability) makes the latencies of the data accesses issued by a core non-uniform. This non-uniformity can play a significant role in shaping the overall application performance. This work presents a novel compiler strategy that exposes architecture information to the compiler to enable an optimized computation-to-core mapping. Specifically, we propose a compiler-guided scheme that takes into account the relative positions of (and distances between) cores, last-level caches (LLCs) and memory controllers (MCs) in a manycore system, and generates a mapping of computations to cores with the goal of minimizing the on-chip network traffic. The experimental data collected using a set of 21 multi-threaded applications reveal that, on average, our approach reduces the on-chip network latency in a 6×6 manycore system by 38.4% in the case of private LLCs, and 43.8% in the case of shared LLCs. These improvements translate to execution time improvements of 10.9% and 12.7% for the private LLC and shared LLC based systems, respectively. | - |
dc.language | English | - |
dc.publisher | Association for Computing Machinery | - |
dc.title | Enhancing computation-to-core assignment with physical location information | - |
dc.type | Conference | - |
dc.identifier.wosid | 000452469600022 | - |
dc.identifier.scopusid | 2-s2.0-85049567574 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 312 | - |
dc.citation.endingpage | 327 | - |
dc.citation.publicationname | 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | Philadelphia, PA | - |
dc.identifier.doi | 10.1145/3192366.3192386 | - |
dc.contributor.localauthor | Jung, Myoungsoo | - |
dc.contributor.nonIdAuthor | Kislal, Orhan | - |
dc.contributor.nonIdAuthor | Kotra, Jagadish | - |
dc.contributor.nonIdAuthor | Tang, Xulong | - |
dc.contributor.nonIdAuthor | Kandemir, Mahmut Taylan | - |