A training-free framework that converts SAM3 into a real-time, multi-class, open-vocabulary detector. It achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (measured with 4 classes at 1008px input) on a single RTX 4080.
HOI-DETR is a transformer-based framework for detecting hands, hand-held objects, and their interactions in images and video. Built on the Co-DETR architecture, it adds a lightweight interaction ...
Abstract: You Only Look Once (YOLO) has established itself as a prominent object detection framework due to its excellent balance between speed and accuracy. This article provides a thorough review of ...
Abstract: Tiny-object detection is increasingly crucial in fields such as remote sensing, traffic monitoring, and robotics. Inspired by human visual perception, the attention mechanism has become a ...