A key practical constraint on the design of hybrid automatic repeat request (HARQ) schemes is the size of the on-chip buffer that is available at the receiver to store previously received packets. In fact, in modern wireless standards such as LTE and LTE-A, the HARQ buffer size is one of the main drivers of modem area and power consumption. This has recently highlighted the importance of HARQ buffer management, that is, the use of buffer-aware transmission schemes and of advanced compression policies for the storage of received data. This work investigates HARQ buffer management by leveraging information-theoretic achievability arguments based on random coding. Specifically, the standard HARQ schemes, namely Type-I, Chase Combining, and Incremental Redundancy, are first studied under the assumption of a finite-capacity HARQ buffer by considering both coded modulation, via Gaussian signaling, and Bit-Interleaved Coded Modulation (BICM). The analysis sheds light on the impact on the throughput of different compression strategies, namely the conventional compression of log-likelihood ratios and the direct digitization of baseband signals. The optimization of the coding blocklength is also investigated, highlighting the benefits of HARQ buffer-aware transmission schemes.