| dc.contributor.author | Caño Pascual, Pablo | |
| dc.contributor.author | Fran Abadía, Pablo | |
| dc.contributor.author | Valdes-Ramirez, Danilo | |
| dc.contributor.author | González Briones, Alfonso | |
| dc.contributor.author | Barrio Val, Pablo | |
| dc.date.accessioned | 2026-06-10T07:59:53Z | |
| dc.date.available | 2026-06-10T07:59:53Z | |
| dc.date.issued | 2026-03-02 | |
| dc.identifier.uri | http://hdl.handle.net/10366/171782 | |
| dc.description.abstract | [EN] Efficient environmental perception is a cornerstone of Advanced Driver Assistance Systems (ADAS) and autonomous driving. A persistent architectural dilemma in this domain is whether to employ unified Multi-Task Learning (MTL) frameworks, which optimize computation through shared backbones, or modular multi-model pipelines, which prioritize task-specific accuracy. This paper presents a comparative analysis of these two paradigms for joint object detection and drivable area estimation. Specifically, we evaluate YOLOPX, a representative anchor-free MTL architecture, against a decoupled multi-model system that integrates RT-DETRv2 for vehicle detection and the lightweight YOLO11n-seg for drivable area segmentation on the BDD100K benchmark under identical hardware conditions. The results show that, although the MTL YOLOPX model achieves higher throughput, the decoupled system delivers substantially better detection performance, particularly in the stricter mAP50:95 metric, while preserving competitive segmentation quality and maintaining real-time latency suitable for edge deployment. These findings suggest that modular designs, rather than monolithic MTL models, can offer a more favorable balance between safety-critical detection accuracy and computational efficiency for next-generation intelligent vehicles. | es_ES |
| dc.description.sponsorship | Project “A catalyst for EuropeaN ClOUd Services in the era of data spaces, high-performance and edge computing (NOUS)”, Grant Agreement Number 101135927. Funded by the European Union. | es_ES |
| dc.format.mimetype | application/pdf | |
| dc.language.iso | eng | es_ES |
| dc.relation.ispartofseries | ACM conference proceedings; | |
| dc.rights | Attribution 4.0 International | es_ES |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | es_ES |
| dc.subject | Autonomous driving perception | es_ES |
| dc.subject | Multi-task learning | es_ES |
| dc.subject | Real-time object detection | es_ES |
| dc.subject | Vision Transformers | es_ES |
| dc.subject | Drivable area segmentation | es_ES |
| dc.title | Unified Multi-Task Learning vs. Decoupled Transformer-based Perception: A Comparative Analysis | es_ES |
| dc.type | info:eu-repo/semantics/article | es_ES |
| dc.subject.unesco | 1203 Ciencia de los ordenadores | es_ES |
| dc.relation.projectID | European Union 101135927 | es_ES |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | es_ES |
| dc.journal.title | International conference on Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) solutions for Europe’s Next-Gen Cloud Infrastructure | es_ES |
| dc.type.hasVersion | info:eu-repo/semantics/draft | es_ES |