Metrics and Benchmarks for Automated Driving

Abstract

Deep neural networks (DNNs) conquer more and more of the autonomous vehicle driving stack, up to fully end-to-end trained systems. Numerous challenges and benchmarks exist, inviting researchers to push metric scores ever higher. But does this directly translate into safer or otherwise improved automated vehicle operation? The practitioner asks: How good is good enough, and are we even measuring the right things?

Our workshop invites both scientific and industrial approaches to better understand and overcome what we call the “crisis of metrics”. We request contributions addressing the following questions: Which metrics are the right ones for which problems? What are the dos and don’ts? Do singular metrics, e.g. of perception outputs, still have their place in end-to-end trained systems? How should metrics be combined? Are metrics sufficiently invariant towards diverse system outputs that could all be considered safe and reasonable operation? Should metrics be designed to take downstream applications into account, rather than being fully independent, to prevent silo thinking across research groups or departments?

In the age of generative approaches, the challenges around metrics go even further. How can we measure quality without ground truth, or usefulness for a given purpose? Must generated videos/images be photorealistic to be useful, or is it enough if the specific aspects that the downstream application (such as an object detector) relies on are accurately generated?
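
One established way to quantify generative quality without per-sample ground truth is to compare feature statistics of generated and real data, as the Fréchet Inception Distance (FID) does. Below is a minimal sketch, assuming a pretrained feature extractor that maps images to N x D feature arrays is available:

```python
# Illustrative sketch only: Fréchet distance between Gaussian fits of real and
# generated feature sets, in the spirit of FID. The feature extractor that
# produces the (N x D) arrays is assumed, not specified here.
import numpy as np
from scipy.linalg import sqrtm

def frechet_feature_distance(feats_real: np.ndarray, feats_gen: np.ndarray) -> float:
    """Lower values indicate generated features are statistically closer to real ones."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):  # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))
```

Note, however, that such distribution-level scores say little about whether the aspects a downstream detector relies on are rendered faithfully, which is exactly the question raised above.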

What are the right losses in DNN training, and how do they interact with the right metrics? Are intermediate losses useful in end-to-end learning systems, or should we focus only on the system’s outcome? Should metrics and losses be developed that are invariant with respect to multiple “good” solutions?
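
One existing example of such invariance is the minimum average displacement error (minADE) used in trajectory prediction benchmarks, which scores only the best of K predicted modes and thus does not penalize a model for additionally proposing other plausible futures. A minimal sketch, assuming predictions and ground truth are given as NumPy arrays:

```python
# Illustrative sketch: minADE over K predicted trajectory modes.
import numpy as np

def min_ade(pred_modes: np.ndarray, gt_future: np.ndarray) -> float:
    """pred_modes: (K, T, 2) candidate trajectories; gt_future: (T, 2) ground truth."""
    # Euclidean error per mode and timestep, averaged over time, minimum over modes.
    errors = np.linalg.norm(pred_modes - gt_future[None, :, :], axis=-1)  # (K, T)
    return float(errors.mean(axis=1).min())
```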

With this workshop, we bring practitioners and researchers together to discuss and potentially advance the state of the art of metrics and benchmarks for automated driving, and to reach a common understanding of the challenges ahead.

We invite contributions on the following topics:

Metrics and Benchmarks

  • Metrics for End-to-End Driving Systems
  • Alignment between single-function metrics and system performance
  • Novel performance metrics and loss functions
  • Evaluation of un- and self-supervised learning
  • Plausibility checks of generative AI
  • Alignment of loss functions and system performance
  • Quantitative and qualitative comparison of Automated Driving stacks

Best Practices

  • Suitability of Generative AI data for training, test and validation
  • Best practices for fair performance comparisons
  • Novel test and validation data sets and procedures
  • Minimal performance requirements for reliable and safe AD systems
  • Online monitoring of function and system performance
  • Challenges and pitfalls of existing metrics

Organizers Info

  • Corina Apachite (Continental AG)
  • Holger Caesar (TU Delft)
  • Tim Fingscheidt (TU Braunschweig)
  • Christian Hubschneider (FZI)
  • Ziquan Liu (Queen Mary University of London)
  • Ulrich Kreßel (Mercedes-Benz AG)
  • Thomas Monninger (Mercedes-Benz RDNA Inc.)
  • Jörg Reichardt (Continental AG)
  • Ömer Sahin Tas (FZI)
  • Andrei Vatavu (Mercedes-Benz AG)
  • Zifan Zeng (Huawei Technologies Düsseldorf GmbH)
  • Xingyu Zhao (University of Warwick)
  • Marius Zöllner (FZI)

Dates & Agenda

  • March 15, 2025: Workshop Paper Submission
  • March 30, 2025: Notification of Acceptance
  • April 25, 2025: Final Paper Submission
  • June 22, 2025: Workshop Date

  • 12:00-13:30  Lunch break and joint poster session with the workshop “Ensuring and Validating Safety for Automated Vehicles”

Workshop: Metrics and Benchmarks for Automated Driving

  • 13:30-14:00  Keynote on Benchmarks
    Speaker: Dariu Gavrila (Delft University of Technology)
  • 14:00-14:30  Metrics for Automated Driving: Challenges and Pitfalls
    Speaker: Matthias Schreier (Technical University of Applied Sciences Würzburg-Schweinfurt)
    The talk highlights challenges and pitfalls of commonly used evaluation metrics in modular automated driving stacks, in end-to-end stacks, and in generative AI. How good is good enough, and are we even measuring the right things? How useful are intermediate metrics in their current form in the overall system context, and how can we arrive at metrics that really matter in the end?
  • 14:30-15:00  Paper Presentation
    Speaker: NN
  • 15:00-15:45  Coffee Break
  • 15:45-17:05  Four Invited Talks
    - MAN TruckScenes: a truckload of data to fill research gaps
      Speaker: Fabian Kuttenreich (MAN Truck & Bus SE)
      Dive into MAN TruckScenes, the first public large-scale multimodal dataset for autonomous trucking! It addresses various challenges such as trailer occlusions, novel sensor perspectives, long-range perception, and diverse weather conditions. Comprising 747 scenes, it features sensor data from cameras, lidars, and 4D radars, along with annotations for tracked bounding boxes and scene tags. MAN TruckScenes thereby not only provides a foundation for exploring new perception solutions, but also serves as a valuable resource for developing benchmarks and metrics to tackle these challenges!
    - Bridging the Reality Gap: Simulation-Based Evaluation of Driving Models
      Speaker: Jannik Zürn (Wayve Technologies Ltd.)
      Real-world evaluation of driving models is expensive, difficult to scale, and lacks controllability. Simulation offers a scalable alternative, enabling reproducible testing in both open-loop and closed-loop settings. This talk examines methods for assessing the domain gap between the real world and simulation and explores the role of scene reconstruction models in generating high-fidelity virtual environments.
    - NN
      Speaker: Ziquan Liu (Queen Mary University of London)
    - NN
      Speaker: Xingyu Zhao (University of Warwick)
  • 17:05-17:30  Conclusions, closing remarks
    Speaker: Workshop Chair

Keynote Speakers

  • Dariu Gavrila (TU Delft)
  • Fabian Kuttenreich (MAN Truck & Bus SE)
  • Ziquan Liu (Queen Mary University of London)
  • Matthias Schreier (Technical University of Applied Sciences Würzburg-Schweinfurt)
  • Xingyu Zhao (University of Warwick)
  • Jannik Zürn (Wayve Technologies Ltd.)

IEEE IV 2025
