PCM is a promising non-volatile memory technology, as it can offer a unique trade-off between density and latency compared with DRAM and flash memory. Albeit PCM is much faster than flash memory, it is still notably slower than DRAM, which can significantly degrade system performance. In this paper, we analyze a PCM implementation in depth, and identify the primary cause of PCM's long latency, i.e., a long interconnect (high resistance/capacitance) path between a cell and a sense-amp/write-driver. This in turn requires (1) a very large charge pump consuming: similar to 20% of PCM chip space, similar to 50% of latency of write operations, and similar to 2x more power than a write operation itself; and (2) a large current sense-amp with long time to pre-charge the interconnect path. Then, we propose Low-Latency PCM (LL-PCM) architecture. Our analysis shows that LL-PCM can give 119% higher performance and consume 43% lower memory energy than PCM for memory-intensive applications. LL-PCM is only similar to 1% larger than PCM, as the cost of reducing the resistance/capacitance of the interconnect path is negated by its 4.1x smaller charge pump.