Metrics and Benchmarks for Automated Driving
Paper Submission until: March 15, 2025

Abstract
Deep neural networks (DNNs) conquer more and more of the autonomous vehicle driving stack, up to fully end-to-end trained systems. Numerous challenges and benchmarks exist, inviting researchers to push metrics ever higher. But does this directly translate to safer or improved automated vehicle operation? The practitioner asks: How good is good enough, and are we even measuring the right things?
Our workshop invites both scientific and industrial approaches to better understand and overcome what we call the “crisis of metrics”. We invite contributions addressing questions such as: Which metrics are the right ones for which problems? What are the dos and don’ts? Do singular metrics, e.g. of perception outputs, still have their place in end-to-end trained systems? How should metrics be combined? Are metrics invariant enough towards diverse system outputs that could all be considered safe and reasonable operation? Should metrics be designed to take downstream applications into account, rather than being fully independent, in order to prevent silo thinking across research groups or departments?
In the age of generative approaches, the challenges regarding metrics go even further. How can we measure quality without ground truth, or usefulness for a given purpose? Do generated videos and images need to be realistic to be useful, or is it enough if the specific aspects that downstream applications (such as object detectors) focus on are accurately generated?
What are the right losses in DNN training, and how do they interact with the right metrics? Are intermediate losses useful in end-to-end learning systems, or should we focus only on the system’s final outcome? Should metrics and losses be developed that are invariant with respect to multiple “good” solutions?
With this workshop, we bring practitioners and researchers together to discuss and potentially advance the state of the art of metrics and benchmarks for automated driving, and to reach a common understanding of the challenges ahead.
We invite contributions on the following topics:
Metrics and Benchmarks
- Metrics for end-to-end driving systems
- Alignment between single-function metrics and system performance
- Novel performance metrics and loss functions
- Evaluation of un- and self-supervised learning
- Plausibility checks of generative AI
- Alignment of loss functions and system performance
- Quantitative and qualitative comparison of automated driving stacks
Best Practices
- Suitability of generative AI data for training, testing, and validation
- Best practices for fair performance comparisons
- Novel test and validation data sets and procedures
- Minimal performance requirements for reliable and safe AD systems
- Online monitoring of function and system performance
- Challenges and pitfalls of existing metrics
Organizers

- Corina Apachite, Continental AG
- Holger Caesar, TU Delft
- Tim Fingscheidt, TU Braunschweig
- Christian Hubschneider, FZI
- Ziquan Liu, Queen Mary University of London
- Ulrich Kreßel, Mercedes-Benz AG
- Thomas Monninger, Mercedes-Benz RDNA Inc.
- Jörg Reichardt, Continental AG
- Ömer Sahin Tas, FZI
- Andrei Vatavu, Mercedes-Benz AG
- Zifan Zeng, Huawei Technologies Düsseldorf GmbH
- Xingyu Zhao, University of Warwick
- Marius Zöllner, FZI
Dates & Agenda
- March 15, 2025: Workshop Paper Submission
- March 30, 2025: Notification of Acceptance
- April 25, 2025: Final Paper Submission
- June 22, 2025: Workshop Date
TIME | AGENDA ITEM | SPEAKER
12:00-13:30 | Lunch break and joint poster session with “Ensuring and Validating Safety for Automated Vehicles” |
 | Workshop: Metrics and Benchmarks for Automated Driving |
13:30-14:00 | Keynote on Benchmarks | Dariu Gavrila, Delft University of Technology
14:00-14:30 | Metrics for Automated Driving: Challenges and Pitfalls. The talk highlights challenges and pitfalls of commonly used evaluation metrics in modular automated driving stacks, in end-to-end stacks, and in generative AI. How good is good enough, and are we even measuring the right things? How useful are intermediate metrics in their current form in the overall system context, and how can we arrive at metrics that really matter in the end? | Matthias Schreier, Technical University of Applied Sciences Würzburg-Schweinfurt
14:30-15:00 | Paper Presentation | NN
15:00-15:45 | Coffee Break |
15:45-17:05 | 4 Invited Talks |
 | MAN TruckScenes: a truckload of data to fill research gaps. Dive into MAN TruckScenes, the first public large-scale multimodal dataset for Autonomous Trucking! It addresses challenges such as trailer occlusions, novel sensor perspectives, long-range perception, and diverse weather conditions. Comprising 747 scenes, it features sensor data from cameras, lidars, and 4D radars, along with annotations for tracked bounding boxes and scene tags. MAN TruckScenes thereby not only provides a foundation for exploring new perception solutions, but also serves as a valuable resource for developing benchmarks and metrics to tackle these challenges! | Fabian Kuttenreich, MAN Truck & Bus SE
 | Bridging the Reality Gap: Simulation-Based Evaluation of Driving Models. Real-world evaluation of driving models is expensive, difficult to scale, and lacks controllability. Simulation offers a scalable alternative, enabling reproducible testing in both open-loop and closed-loop settings. This talk examines methods for assessing the domain gap between the real world and simulation and explores the role of scene reconstruction models in generating high-fidelity virtual environments. | Jannik Zürn, Wayve Technologies Ltd.
 | NN | Ziquan Liu, Queen Mary University of London
 | NN | Xingyu Zhao, University of Warwick
17:05-17:30 | Conclusions, closing remarks | Workshop Chair
Keynote Speakers

- Dariu Gavrila, TU Delft
- Fabian Kuttenreich, MAN Truck & Bus SE
- Ziquan Liu, Queen Mary University of London
- Matthias Schreier, Technical University of Applied Sciences Würzburg-Schweinfurt
- Xingyu Zhao, University of Warwick
- Jannik Zürn, Wayve Technologies Ltd.
IEEE IV 2025