Deep Predictive Video Compression Using Mode-Selective Uni- and Bi-Directional Predictions Based on Multi-Frame Hypothesis

Cited 8 times in Web of Science · Cited 3 times in Scopus
DC Field: Value
dc.contributor.author: Park, Woonsung (ko)
dc.contributor.author: Kim, Munchurl (ko)
dc.date.accessioned: 2021-02-02T06:30:05Z
dc.date.available: 2021-02-02T06:30:05Z
dc.date.created: 2021-01-27
dc.date.issued: 2021-01
dc.identifier.citation: IEEE ACCESS, v.9, pp.72 - 85
dc.identifier.issn: 2169-3536
dc.identifier.uri: http://hdl.handle.net/10203/280465
dc.description.abstract: Recently, deep learning-based image compression has shown significant performance improvements in terms of coding efficiency and subjective quality. However, there has been relatively little effort on video compression based on deep neural networks. In this paper, we propose an end-to-end deep predictive video compression network, called DeepPVCnet, using mode-selective uni- and bi-directional predictions based on a multi-frame hypothesis with a multi-scale structure and a temporal-context-adaptive entropy model. Our DeepPVCnet jointly compresses motion information and residual data that are generated from the multi-scale structure via the feature transformation layers. Recent deep learning-based video compression methods were proposed in a limited compression environment using only P-frames or B-frames. Learning from the lessons of conventional video codecs, we first incorporate a mode-selective framework into our DeepPVCnet with uni- and bi-directional predictive modes in a rate-distortion minimization sense. We also propose a temporal-context-adaptive entropy model that utilizes the temporal context information of the reference frames for coding the current frame. The autoregressive entropy models for CNN-based image and video compression are difficult to compute with parallel processing. In contrast, our temporal-context-adaptive entropy model utilizes temporally coherent context from the reference frames, so that the context information can be computed in parallel, which is computationally and architecturally advantageous. Extensive experiments show that our DeepPVCnet outperforms AVC/H.264, HEVC/H.265, and state-of-the-art methods in terms of MS-SSIM.
dc.language: English
dc.publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.title: Deep Predictive Video Compression Using Mode-Selective Uni- and Bi-Directional Predictions Based on Multi-Frame Hypothesis
dc.type: Article
dc.identifier.wosid: 000607730600006
dc.identifier.scopusid: 2-s2.0-85098761125
dc.type.rims: ART
dc.citation.volume: 9
dc.citation.beginningpage: 72
dc.citation.endingpage: 85
dc.citation.publicationname: IEEE ACCESS
dc.identifier.doi: 10.1109/ACCESS.2020.3046040
dc.contributor.localauthor: Kim, Munchurl
dc.description.isOpenAccess: Y
dc.type.journalArticle: Article
dc.subject.keywordAuthor: AVC/H.264
dc.subject.keywordAuthor: deep learning
dc.subject.keywordAuthor: frame prediction
dc.subject.keywordAuthor: HEVC/H.265
dc.subject.keywordAuthor: video compression
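
The abstract's mode-selective framework chooses between uni- and bi-directional prediction "in a rate-distortion minimization sense". A minimal sketch of such a decision, assuming the standard Lagrangian cost J = D + λ·R (the paper's actual cost terms and λ are not given here; the numbers below are purely illustrative):

```python
# Hypothetical sketch, NOT the authors' code: pick the prediction mode
# that minimizes the Lagrangian rate-distortion cost J = D + lam * R.

def select_mode(costs, lam):
    """costs: dict mapping mode name -> (distortion, rate_in_bits).
    Returns the mode with the smallest J = D + lam * R."""
    return min(costs, key=lambda m: costs[m][0] + lam * costs[m][1])

# Illustrative numbers: bi-directional prediction spends more bits but
# reduces distortion enough to win at a small lambda.
modes = {"uni": (4.0, 100.0), "bi": (2.5, 130.0)}
print(select_mode(modes, lam=0.01))  # -> bi
print(select_mode(modes, lam=1.0))   # -> uni (rate dominates)
```

The same comparison flips with λ: a larger λ penalizes rate more heavily, which is how a codec trades quality against bitrate across operating points.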
Appears in Collection: EE-Journal Papers (Journal Papers)
Files in This Item: 09300040.pdf (2.1 MB)
