Unified Multi-Task Learning vs. Decoupled Transformer-based Perception: A Comparative Analysis

Caño Pascual, Pablo; Fran Abadía, Pablo; Valdes-Ramirez, Danilo; González Briones, Alfonso; Barrio Val, Pablo

Keywords

Autonomous driving perception

Multi-task learning

Real-time object detection

Vision Transformers

Drivable area segmentation

UNESCO classification

1203 Computer science

Publication date

2026-03-02

Series / No.

ACM conference proceedings

Abstract

Efficient environmental perception is a cornerstone of Advanced Driver Assistance Systems (ADAS) and autonomous driving. A persistent architectural dilemma in this domain is whether to employ unified Multi-Task Learning (MTL) frameworks, which reduce computation through shared backbones, or modular multi-model pipelines, which prioritize task-specific accuracy. This paper presents a comparative analysis of these two paradigms for joint object detection and drivable area estimation. Specifically, we evaluate YOLOPX, a representative anchor-free MTL architecture, against a decoupled multi-model system that combines RT-DETRv2 for vehicle detection with the lightweight YOLO11n-seg for drivable area segmentation, on the BDD100K benchmark under identical hardware conditions. The results show that, although YOLOPX achieves higher throughput, the decoupled system delivers substantially better detection performance, particularly on the stricter mAP50:95 metric, while preserving competitive segmentation quality and maintaining real-time latency suitable for edge deployment. These findings suggest that modular designs, rather than monolithic MTL models, can offer a more favorable balance between safety-critical detection accuracy and computational efficiency for next-generation intelligent vehicles.

URI

https://hdl.handle.net/10366/171782

Appears in collections