Recurrent DETR: Transformer-Based Object Detection for Crowded Scenes

Recent transformer-based object detectors have achieved remarkable performance on benchmark datasets, but few have addressed the real-world challenge of object detection in crowded scenes. This limitation stems from the fixed query set size of the transformer decoder, which caps the number of objects the model can predict per image. To overcome this challenge, we propose Recurrent Detection Transformer (Recurrent DETR), an object detector that iterates the decoder block to produce more predictions from a finite number of query tokens. Recurrent DETR adaptively controls the number of decoder block iterations based on the image's crowdedness or complexity, resulting in a variable-size prediction set. This is enabled by our novel Pondering Hungarian Loss, which helps the model learn when additional computation is required to identify all the objects in a crowded scene. We demonstrate the effectiveness of Recurrent DETR on two datasets: COCO 2017, which represents a standard setting, and CrowdHuman, which represents a crowded setting. On these datasets, Recurrent DETR achieves performance gains of 0.8 AP and 0.4 AP, respectively, over its base architectures. Moreover, we conduct comprehensive analyses under different query set size constraints to provide a thorough evaluation of the proposed method.
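The control flow described in the abstract (reusing one decoder block for several passes and stopping once the scene appears fully covered) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and is not taken from the paper: the `decoder_step` signature, the scalar halting probability, the `halt_threshold` value, and the toy step function are assumptions made for illustration only. The Pondering Hungarian Loss used for training is not implemented.

```python
from typing import Callable, List, Tuple

# Hypothetical prediction format: (x, y, w, h, score). Not from the paper.
Prediction = Tuple[float, float, float, float, float]
StepFn = Callable[[List[float], int], Tuple[List[float], List[Prediction], float]]


def recurrent_decode(
    decoder_step: StepFn,
    queries: List[float],
    max_iters: int = 4,
    halt_threshold: float = 0.5,
) -> Tuple[List[Prediction], int]:
    """Apply the same decoder step repeatedly and pool its predictions.

    Assumption: each pass returns updated query states, the predictions
    for that pass, and a halting probability in [0, 1].
    """
    state = list(queries)
    predictions: List[Prediction] = []
    steps = 0
    for step in range(max_iters):
        state, step_preds, halt_prob = decoder_step(state, step)
        predictions.extend(step_preds)  # the prediction set grows with each pass
        steps = step + 1
        if halt_prob >= halt_threshold:  # assumed stopping rule
            break
    return predictions, steps


def make_toy_step(num_objects: int) -> StepFn:
    """Toy stand-in for a decoder block: each pass 'detects' at most
    len(state) objects and signals a halt once all objects are emitted."""

    def step_fn(state: List[float], step: int):
        k = len(state)
        start = step * k
        end = min(start + k, num_objects)
        preds = [(float(i), 0.0, 1.0, 1.0, 0.9) for i in range(start, end)]
        halt_prob = 1.0 if end >= num_objects else 0.0
        return state, preds, halt_prob

    return step_fn


if __name__ == "__main__":
    queries = [0.0] * 3  # only three query tokens
    sparse, sparse_steps = recurrent_decode(make_toy_step(2), queries)
    crowded, crowded_steps = recurrent_decode(make_toy_step(7), queries)
    print(len(sparse), sparse_steps)
    print(len(crowded), crowded_steps)
```

In this toy setting, a scene with more objects than query tokens simply triggers more passes, so the size of the prediction set varies with crowdedness while the query count stays fixed. A real detector would replace `make_toy_step` with a learned decoder block and a learned halting signal.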
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2023
Language
English
Article Type
Article
Citation

IEEE Access, vol. 11, pp. 78623-78643

DOI
10.1109/ACCESS.2023.3293532
URI
http://hdl.handle.net/10203/327849
Appears in Collection
CS-Journal Papers