AV metrics and safety with Jeff Wishart
This week we dive into automated driving systems and their need for metrics we can all understand with returning guest, Dr. Jeff Wishart. Jeff recently published a couple of articles focused on the dynamic driving task and on modern convenience and safety features, and on how to make these easier to understand for everyone in the industry. Hopefully, regulators will take note and use Jeff’s work as a resource.
Subscribe using your favorite podcast service:
Transcript
note: this is a machine-generated transcript and may not be completely accurate. It is provided for convenience and should not be used for attribution.
Anthony: [00:00:00] You are listening to There Auto Be A Law, the Center for Auto Safety Podcast with executive director Michael Brooks, chief engineer Fred Perkins, and hosted by me Anthony Cimino. For over 50 years, the Center for Auto Safety has worked to make cars safer.
Hey listeners, welcome to October 1st, starting a new month. Today we have Dr. Jeff Wishart back to the show. He’s an adjunct professor at Arizona State University. Welcome back Jeff.
Jeff Wishart: Thanks Anthony. Great to see you guys. Yeah, so happy to be back.
Anthony: We’re gonna get into a couple.
Deep Dive into Dynamic Driving Task Assessment
Anthony: We’re gonna deep dive into a couple articles you wrote recently.
One, Dynamic Driving Task Assessment Metrics for Automated Driving Systems, and another one, Safety Impacts of Active Safety and Driving Automation Features. So I started going through the Dynamic Driving Task [00:01:00] Assessment Metrics. That is a mouthful. Yeah.
Jeff Wishart: Yeah. Apologies for that.
Anthony: No it’s great. So let’s start off first on a very high level.
What exactly is this? What has your research been around?
Jeff Wishart: So this is coming through SAE International, a standards development organization, and this is the On-Road Automated Driving committee, of which Fred is also a member. We’re developing standards documents for the automated driving system industry.
And so this is a recommended practice. They have three different tiers: the lowest tier is an information report, then a recommended practice, and then a standard, basically in order of the consensus that we see from the stakeholders.
Challenges in Automated Driving Systems
Jeff Wishart: And so this recommended practice is around what you have to measure in order to make an assessment of the dynamic driving task [00:02:00] (DDT) performance of an automated vehicle.
And so right now, as some of your listeners probably know, all we have at the moment are crash data that come through the standing general order from NHTSA. And that’s useful in a lot of ways, but it doesn’t get us everything that we need, not by a long shot. So the motivation for this recommended practice is to have a set of metrics.
No single metric will be sufficient; a set of metrics can be used to make that overall assessment. And so this is the first step in what I foresee to be a series of documents to get to the place where we can make an evaluation of how these AVs are performing out on public roads.
Because right now we really don’t know all that much.
Anthony: Okay. So we’ve seen situations where Waymo, as we’ve called it, is driving drunk, ’cause it will go into the wrong lane of traffic, do this for a while, and then come back. And like you’re saying, we only have the information [00:03:00] when they hit a stationary object, like when they hit that light pole or telephone pole in an alley.
But even then, we’re not necessarily getting the data from before that, of why did it think this was okay. And in these situations where it crosses other lanes of traffic, we have no idea why that happened. Waymo just puts out a press release, and we read the comments on it, and it says, you guys aren’t safe.
Get off my lawn.
Transparency and Data Collection Issues
Anthony: So is this to try and get manufacturers, these developers, to say, hey, let’s be a lot more transparent, this is why this situation happened?
Jeff Wishart: We don’t get into the policy of whether developers should be more transparent or not, although I’m certainly supportive of that. What I would say is that as you stated there, we don’t get any data unless it shows up in the SGO or someone happens to film it.
Or somehow the press finds out about an incident. But for incidents that don’t rise to the level of a crash, [00:04:00] we miss a lot of them. So these metrics are a way to get some of those data, to find out about some of those incidents, to find out how these vehicles are performing.
Metrics for Evaluating AV Performance
Jeff Wishart: And I should note that these metrics can be used in the development process, so in simulation and on a closed course, but they can also be used once vehicles are out on public roads, whether they’re still in the testing phase with fallback test drivers or in the deployment phase where they’re actually commercialized.
So these metrics are universally applicable.
Michael: Yeah. You have to assume that manufacturers are using some of these, or something similar to this, right now. We at least hope so, just for internal error reporting or to assist them in developing and improving their operations.
Obviously they’re probably not using all of these or an exact copy of this. These metrics only apply to the vehicle performance measures and not to a lot of other things involving AVs, like how the [00:05:00] passengers are treated and all sorts of other things.
But it seems like they have a much broader range of use cases beyond simply internal company reporting.
The Importance of Standardized Metrics
Michael: We’ve talked a little about transparency, but also, would they possibly allow, at some point, for comparison between AV manufacturers’ safety records, or for more advanced reporting of crash or disengagement information to regulators?
I was even thinking along the lines of an autonomous vehicle NCAP, where we could actually take some of this data and compare AV companies so that it incentivizes the ones who are bringing up the rear to raise their level of operation, that type of thing.
Fred: Yeah, you’re exactly right. Let me jump in, if I may. I just wanna give a little additional context. The state of the art right now, if you will, for safety reporting by the companies, and basically the information that they provide to regulators, is essentially, we haven’t killed anybody yet, or we haven’t killed too many people yet, [00:06:00] and that’s the only data that they accumulate, except for the total number of miles that they’ve driven using some form of the software that’s associated with the cars.
So the need exists for government, the public, for people like us, safety advocates, to look deeper into the vehicles. For example, what you’d wanna know is that each successive software release being used to run the autonomous vehicles is providing better results than the previous one. But if the only data that you’re getting is we still haven’t killed anybody, then you cannot really compare the evolution of the software, or the evolution of the safety, or even progress towards a known goal that is somewhat less than “we haven’t killed anybody.”
For example, you might wanna have a goal to say we wanna have a crash rate [00:07:00] of X crashes per thousand miles. How do you do that if the only thing you’re recording is the number of deaths associated with it? Or even what they’re doing now, which is looking at the total number of crashes but rejecting the crashes that they say are not their fault.
Clearly the presence of an autonomous vehicle in a crash scenario is a factor in that crash evolving. If the autonomous vehicle isn’t there, the crash would never have occurred. But if you reject all of the instances of a crash where the vehicle is deemed not at fault, which is a legal term, not an engineering term, then you don’t really have any basis of statistics or metrics to say.
We’re getting better or we’re getting worse, or we shouldn’t have, we should have done this, or we should not have done that. So that’s the gap that Jeff is really trying to fill with this [00:08:00] report. What is the basis for rational collection of metrics that can be used to assess the operational safety of a vehicle to assess the operational characteristics of a vehicle?
Short of “we haven’t killed anybody yet.” Because we, the public, and the regulators need to be able to have some insight into what effect the evolution of the technology and the experience being gained is having on the overall safety of the vehicles. And that’s exactly the gap that this report, which Jeff chaired and is largely responsible for, intends to fill.
Jeff Wishart: Yeah, you guys are both exactly right, you and Michael. I think I’ll address your points first, Fred.
Real-World Applications and Testing
Jeff Wishart: And you guys have been talking about that really unfortunate incident in Tempe.
Again, I have the unfortunate circumstance of living near to the two most famous deaths [00:09:00] that have occurred because of automated vehicles. Both very different: one in testing, with Uber ATG, and now with the Waymo vehicle. To Fred’s point, if you don’t count any of the crashes that aren’t your fault, you don’t learn.
If you are doing something that, yes, maybe everything you did was legal, but it’s not necessarily what the other road users around you are expecting, that could be a problem. And we know that AVs have historically had issues with getting rear-ended; in this case, they got rear-ended by the motorcyclist when they were making a right-hand turn.
We don’t know the details of this unfortunate incident, but we know that historically there’s something about the way AVs make that right-hand turn, usually at a lighted intersection, that is resulting in, maybe not causing, but resulting in, human drivers crashing into the [00:10:00] back of them.
And to your point, Michael, yes, I’m sure that the companies are using some version of this, or their own set of metrics. And we did have industry people as part of the development team, so a lot of input from industry, which is great. What this is trying to do is have a common set, a common language, so everyone’s using the same metrics, so that, as you say, we can make comparisons between two different manufacturers’ vehicles, or, to Fred’s point, a manufacturer’s vehicles if they make a change.
So that’s really important, and as I said, that’s further down the road. This is just the first step of metrics. And I’ll get to another point you made, Michael, which is that this is just part of the overall performance that’s really important to look into. Right now, the DDT assessment metrics, or DA metrics as we call them in this report, SAE J3237 for listeners who wanna search, [00:11:00] cover only what I call nominal driving. So this is when things are pretty much normal. We can use these metrics to measure the performance, but there are other aspects of the driving performance that you care about.
So if you leave your operational design domain, for example, or you have some sort of failure that means the automated driving system can no longer perform the trip, it goes into what’s called DDT fallback, and then you’ve got your minimal risk condition at the end, with one or more minimal risk maneuvers to get to that end condition.
And so there’s a whole set of metrics that we’ll need to develop to evaluate the performance of that.
Michael: Yeah, in addition, even maybe a set of metrics that makes sure the software and hardware is working properly, or that type of thing.
Jeff Wishart: Yeah, maybe I’ll get into that after I make the next point about crash avoidance.
So as I said, there’s nominal, now we have DDT fallback, and then we have [00:12:00] crash avoidance metrics. So if you’re trying to avoid a crash that’s imminent, there are all sorts of metrics that we could develop to measure that, the ability of the AV to avoid or mitigate that crash. But as you said, there are all sorts of other metrics.
Do you care about the occupant? What about the remote operations? What about the fleet maintenance? There are all sorts of operational metrics, and I’ll say this: I just proposed a new task force under the OAD committee, an operations task force, because as we get more of these vehicles out on public roads, we need to be thinking more about operations in particular.
And so they’ll get into all of those issues, hopefully; I’m not the chair. But I guess to your point, Michael, about getting information on the subsystems.
Black Box, Gray Box, and White Box Metrics
Jeff Wishart: So in J3237 we have a taxonomy of black, gray, and white box metrics. Black box metrics are metrics that don’t need any data coming from the automated driving system.
They don’t actually even need any data coming [00:13:00] from the vehicle itself. You can get everything from offboard sources, such as a camera at an intersection. The measurement uncertainty will be different depending on where you get your measurement. For gray box metrics, you need some limited access to the automated driving system, but it’s just status messages.
So it’s nothing too proprietary. And then for white box metrics, you need less limited, more extensive access to the automated driving system. And we did this because when I first started the research behind this recommended practice, I was working with Intel, Mobileye, and other AV developers.
And it became very clear that they’re very sensitive around data, and perhaps rightfully so. There have been times when they’ve been asked to provide data without a real purpose behind it. They’ve been told, just give us all your data.
What are you gonna do with it? So that’s not the situation we wanted to be in. So we wanted to think of, okay, on the white box metrics, what do we need [00:14:00] that will give some insight into some subsystems, like perception, but without getting into anything too proprietary that’s really sensitive to the developers.
And so this is the first iteration, and maybe we’ll find that we need to delve a little more deeply into the white box metrics and add some more. But we started with a gentle touch. Hopefully we can get industry behind it, and then we’ll see how that goes. And again, we will iterate as necessary as we hopefully get people implementing this recommended practice.
And we’ll get feedback on what’s working and maybe what isn’t.
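For readers who find the taxonomy easier to see in code, here is a minimal Python sketch of the black/gray/white box idea as Jeff describes it; the enum and metric names are the editor's illustrations, not definitions taken from J3237.

```python
# Illustrative sketch only: the black / gray / white box taxonomy Jeff
# describes, keyed by how much access to the automated driving system (ADS)
# a metric's measurement requires. Names are the editor's, not SAE's.
from enum import Enum

class MetricAccess(Enum):
    BLACK_BOX = "no ADS data needed; measurable from offboard sources such as an intersection camera"
    GRAY_BOX = "limited ADS access; status messages only, nothing proprietary"
    WHITE_BOX = "more extensive ADS access, e.g. insight into subsystems such as perception"

# Hypothetical classification of two metrics mentioned in the episode.
EXAMPLE_METRICS = {
    "traffic_law_violation": MetricAccess.BLACK_BOX,
    "perception_subsystem_health": MetricAccess.WHITE_BOX,
}
```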
Anthony: Before we go too much further, there’s been a lot of acronyms thrown around, and Jeff, just so we can define a few of them: we had DDT, which I thought was a wrestling move, DA, a whole bunch of things. Just for the non-technical listeners at home.
Jeff Wishart: I try not to use too many when I speak, although I’m sure that I do. DDT, dynamic driving task. DA is the DDT assessment, so [00:15:00] dynamic driving task assessment. Got it. ODD, operational design domain, and ADS, automated driving system. Perfect. Thank you.
Anthony: The dynamic driving task, this is just the action of driving the vehicle, right?
Jeff Wishart: All the things that you would need to do in order to operate your vehicle. It includes the looking, monitoring the environment for different events and objects. It includes things like turning on your signal, which not everyone does, as you guys discuss a lot. But it’s all the maneuvers and behaviors that you need to engage in to operate your vehicle.
Anthony: Okay. So I’ve got a question about one of your black box options here, ’cause this relates to what you’re saying: a lot of the issues these vehicles have is that they act in ways that other road users don’t expect. Exactly. One of the black box metrics is traffic law violation. And as drivers, we know there are certain traffic laws that we just don’t follow.
And it’s not that we’re being scofflaws. For example, I’ve mentioned living in [00:16:00] New York City, you’re regularly going through red lights because that’s what everyone else expects you to do and that’s the only way to move through. Or I had a situation, I think I mentioned here, where there was an off-ramp that said all traffic to the right, but nobody went that way.
’Cause you’d loop around people selling guns and drugs, and so everyone would make a left turn, crossing over a lane, until one day a cop stopped me and gave me a $275 ticket and three points on my license. So with AVs being programmed to strictly follow the law, one would assume there are all these edge cases that we as human drivers do not obey and do not follow, and we’re not being dangerous.
How is something like that handled?
Jeff Wishart: It’s a good question and a tough one. I think of a couple of examples in addition to yours, like if the rest of traffic is going at 10 miles an hour over the speed limit, do you follow the speed limit or do you follow the speed of traffic? And then there’s the [00:17:00] infamous Pittsburgh left, which is not codified in law, but I guess people do it in Pittsburgh.
I’ve never been there, so I don’t know, but people tell me that it’s true, and that’s a tough one. And again, this recommended practice doesn’t have anything to say about that. It just says, did you violate the law or not? And then what you do with that information is up to the implementer or the evaluator of the implementer.
But I would say it’s a tough call. I think we want our AVs to follow the traffic laws, but maybe we’ll get to a point where certain ones are flexible, and in certain situations we’ll want them to break the law. For example, if there’s a tree that’s fallen down across the entire lane and there’s a human traffic controller, probably a police officer, directing people to go over a double line to go around the tree, you want them to be able to do that.
So in certain instances, you’d want that capability. But in general, I would say that we [00:18:00] wouldn’t want them to break the traffic laws. I don’t think we have data to support breaking traffic laws, but I guess we don’t have the data to support never breaking traffic laws in certain instances like you mentioned.
Fred: Yeah. Anthony, let me just say that there’s not a lot of evidence that the AVs are being designed to always obey the law. There’s a lot of anecdotal evidence, reports from San Francisco mostly, about cars infringing on crosswalks, endangering children, endangering crossing guards, going the wrong way up streets, et cetera.
So it seems like the programmers are constantly hedging between obeying traffic laws and the expediency of zooming along at more than legal speeds, or in ways that a human driver is trained not to operate a [00:19:00] vehicle.
Anthony: But sorry, what I wonder is, is it that they’re not being programmed to follow these laws, or are their perception systems failing somehow?
We have no way of knowing. That’s what I would ask Jeff, ’cause you have traffic law violations as a black box metric, where you’re like, we don’t need direct access to the data. But wouldn’t you need direct access to the data to figure out why it violated a traffic law?
Jeff Wishart: On some level, we don’t. If I’m a regulator, I’m just gonna put on a regulator hat for a second.
I don’t care why you broke the traffic law; I’m gonna deal with that in a certain way, perhaps. I don’t need to know why you did it. Maybe I do want to know. And then, let’s say you’re NHTSA, you could start a safety defect investigation. Okay, we find out that AV developer X is committing a lot of traffic law violations.
Now we need to investigate, and then we would have a conversation where we’re getting more of the data. But I think just knowing that you’re breaking the traffic laws is, I guess, [00:20:00] a first step, and something that we can all agree is normally a bad thing.
Yeah.
Anthony: Okay. Fair enough. So now I understand the black box, gray box, white box a little bit more. It’s from the perspective of the regulators, what they need, right?
Jeff Wishart: Yeah, exactly. But we will hopefully be getting feedback from regulators saying, oh, we need this, or we need that, or we don’t need this.
Perhaps we’ll get that too. So this is the first version of this document, and it’s the first version really anywhere in the world of a set of metrics. So we’re breaking new ground here, and maybe we’ll find that people have thoughts once they start implementing. That’s my strong hope.
Anthony: Do you imagine a kind of update to NCAP, the New Car Assessment Program, related to this? Besides everyone having the same nomenclature and being able to speak the same way, what’s your hope?
Jeff Wishart: Yeah, that would be my dream. When I think of a [00:21:00] test program to evaluate AVs, it can’t just be a series of maneuvers that you could program toward, that you could train the AV to pass, right?
I see a whole program that includes randomized scenario selection. So you would randomize a lot of parameters in all sorts of different scenarios that they wouldn’t know in advance, and they would have to run through that and, using these metrics, perform to a certain level.
And I think I’ve talked on the show before about other research that I’ve done, where I’ve taken the metrics that I helped develop and created a methodology to give you a score out of a hundred. So if you did a left-hand turn at a signalized intersection, how did you do it?
Did you stay in your lane? Did you almost hit any pedestrians? Did you follow all the laws? Did you avoid all the other road users [00:22:00] within a certain margin? All these sorts of things would go into calculating your score. And then you get that score, and maybe you have to get an 80 to pass, or maybe it’s a 90; that’s up to the regulator.
So I would foresee a whole testing program. That could be an NCAP, it could be FMVSS, I’m not sure how that’s gonna work out. This set of metrics could be used in all sorts of different ways.
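To make the scoring idea concrete, here is a minimal, purely illustrative Python sketch of turning per-scenario metric results into a 0–100 score with a regulator-chosen pass threshold; the metric names, weights, and threshold are hypothetical and are not taken from Jeff's methodology or from J3237.

```python
# Illustrative sketch only: toy scoring of one driving scenario.
# Metric names and weights are hypothetical, not from J3237.
METRIC_WEIGHTS = {
    "stayed_in_lane": 30,
    "no_traffic_law_violations": 25,
    "kept_safety_envelope": 30,
    "no_pedestrian_near_miss": 15,
}

def scenario_score(results: dict) -> int:
    """Sum the weights of the metrics the vehicle satisfied in this scenario."""
    return sum(w for name, w in METRIC_WEIGHTS.items() if results.get(name, False))

def passes(score: int, threshold: int = 80) -> bool:
    """Pass/fail against a threshold chosen by the evaluator or regulator."""
    return score >= threshold

# Example: a left turn at a signalized intersection with one envelope breach.
run = {
    "stayed_in_lane": True,
    "no_traffic_law_violations": True,
    "kept_safety_envelope": False,
    "no_pedestrian_near_miss": True,
}
print(scenario_score(run), passes(scenario_score(run)))  # 70 False
```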
Michael: I like the idea of randomization, because that’s something that we don’t currently see in NCAP, where I think it’s a valid criticism that manufacturers are able to design to pass the NCAP tests or the IIHS tests or these very specific protocols.
But in terms of overall safety, where does that really get you?
Jeff Wishart: Yeah. And we saw that in fuel economy with Dieselgate, right? It’s not just that one company, it was other companies. You know exactly what test you’re gonna run, so you tune your engine or your powertrain for [00:23:00] that drive cycle, and that’s not ideal, right?
We want the real world. There are lots of different real-world applications and ways of driving, and for AVs, all the different scenarios that they have to encounter in the real world. The analogy, obviously, is to the human driver.
The human driver test, where you go out and they ask you to make a turn, they ask you to parallel park, you drive around for a little bit. Now they’re not testing everything, because they can’t, but they get a feel for how well that particular human driver can complete the dynamic driving task.
But in this case we have the ability to do a lot of this in simulation. So we can do a lot more scenarios, and hopefully we’ll get to a point where we can get some confidence that the ADS under test has some competency.
Fred: One of the things we’ve advocated for here is that the government set up a series of tests that explores the [00:24:00] corners of the performance envelope necessary to determine that an AV is acceptably safe, right? So high-speed turns or low-speed turns, or whatever it is, there are certain tests that you could set up that give you reasonable confidence that the AV is going to be able to perform properly in a real-world environment. What Jeff has done here is provide a language for the discussion of that envelope, to figure out how you would actually determine what those envelope corners are, and then how you would measure the performance of the vehicle in those corners of the envelope.
It is extremely valuable, and as Jeff said, this is really the world’s first foray into this environment. Very important work.
Jeff Wishart: Appreciate that, Fred.
Anthony: How would this work? Because I imagine a lot of AV companies would be like, oh, you’re testing this version. But we just put a hot patch [00:25:00] in, we just upgraded.
So that is not relevant anymore. Or even worse than that, you’re testing one version and we slipped out an update. We didn’t tell you about it, and we didn’t run any tests on it, and oh, it broke left turn signals. So this car is only for sale in Massachusetts, since they don’t use left turn signals.
Jeff Wishart: Yeah, I mean it’s, that’s a real open question.
And so another publication that came out this year: I led a team for the SAE EDGE research report, and Fred is a contributor to that report. And yeah, we don’t know. In that report, I talk about Waymo and their performance. They’re the only company whose vehicles we have any sort of data on, for which we can make an evaluation of performance.
But as you say, if they make a change, what does that mean for the previous miles under the previous software version? Or if they make some sort of other design change: maybe they take away a sensor, or they add a sensor, or they change the [00:26:00] location of a sensor.
They change the calibration of that sensor. For any design change, it’s an open question how big a change it needs to be before the previous data are irrelevant. We don’t have an answer to that question at the moment.
Anthony: Yeah. ’cause I think just as a human driver, every time you need to renew your license in person, they make you take an eye test.
‘Cause a decade has passed or whatnot.
Jeff Wishart: They do not in Arizona where I live. That’s state dependent.
Anthony: Oh really? New York does that. It’s hilarious, ’cause when I was like 16 they said I had to wear corrective lenses. But years later I had LASIK and they’re like, you still gotta take it.
And I’m like, all right. And yeah, we’re not removing this from your license. I’m like, okay.
Jeff Wishart: In Arizona, you get your license and then it’s good for like 50 years. So you don’t need to go back and renew for some ridiculous amount of time. And a lot can happen in 50 years. It’s not just your eyesight going.
Anthony: Wow. Okay. States’ rights, everybody. Ha. [00:27:00] And if you like states’ rights, go to autosafety.org and click on donate. I had to throw it in there somewhere. I didn’t know where it was gonna be, but hey, that’s how this happens.
Fred: Did they have Piggly Wigglys in Arizona?
Jeff Wishart: I don’t believe so.
I wondered if that was gonna come up, but No, I don’t think so. Okay. I wish.
Safety Envelope Metrics Explained
Anthony: Let’s get into some of your equations here a little bit more, ’cause I know Michael and I were talking about this briefly beforehand, and we’re going, we got the wrong degree, man. What’s going on here? So you have a formulation here, and it’s the measured boundary over the safety envelope boundary.
And this equals the SER. What does this mean to the average person? Is this the scoring you were just talking about?
Jeff Wishart: No, it’s still upstream of the scoring. Okay. So this is the safety envelope ratio, the SER, and it’s a black box metric.
The first portion of the meat of the document is these safety envelope metrics. The safety envelope is [00:28:00] what’s called a spatiotemporal boundary. It’s the space around your vehicle, between you and other road users. And most people know that you should keep some space between you and other road users.
And so say that I should be, say, 10 meters behind the vehicle in front of me, based on our speeds, in a car-following situation. And let’s say the ratio is five. If I go down such that the ratio is now four, that’s a violation. So it’s an SER violation.
So that’s one of the safety envelope metrics that we have. We also have: if it’s supposed to be 10 and I cause it to be nine meters, or I guess I should say feet in the US, then we have a safety envelope violation. But if I’m the vehicle in front and the vehicle behind me is the one that causes the infringement, that’s just a safety envelope infringement.
So we distinguish between those two things. [00:29:00] And then, regardless of who causes the safety envelope infringement, you should always restore your safety envelope within a certain timeframe. So if I make it nine meters, I should bring it back to 10 meters within a certain amount of time.
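For readers who want to see the idea in code, here is a minimal, purely illustrative Python sketch of a safety envelope ratio check in a car-following situation; the function names, the gap values, and the breach-at-1.0 convention are the editor's assumptions for illustration, not definitions from J3237.

```python
# Illustrative sketch only, not the J3237 formulation: a safety envelope
# ratio (SER) for car following, and the violation / infringement split
# Jeff describes depending on who caused the breach.

def safety_envelope_ratio(measured_gap_m: float, required_gap_m: float) -> float:
    """Measured boundary divided by the required safety envelope boundary."""
    return measured_gap_m / required_gap_m

def classify(ratio: float, caused_by_subject_vehicle: bool) -> str:
    """Below the assumed breach threshold of 1.0 the envelope is breached:
    a violation if the subject vehicle caused it, an infringement otherwise."""
    if ratio >= 1.0:
        return "nominal"
    return "violation" if caused_by_subject_vehicle else "infringement"

# Example: the envelope calls for a 10 m gap, and the subject vehicle
# closes to 9 m on its own; it should then restore the gap within some time.
ratio = safety_envelope_ratio(measured_gap_m=9.0, required_gap_m=10.0)
print(ratio, classify(ratio, caused_by_subject_vehicle=True))  # 0.9 violation
```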
Anthony: I love that. I think I, I wish human drivers had that.
Jeff Wishart: Yeah. A lot of these are applicable to human drivers. It’s not just automated driving systems that should drive with a safety envelope that they maintain at a safe distance. A lot of human drivers don’t follow defensive, safe driving techniques, as you guys talk about a lot.
Anthony: No, it’s very offensive. So these papers are both relatively new. The SAE paper, has this actually been published?
Jeff Wishart: Yeah, it came out in August. Oh, wow. So it’s fairly hot off the press. Okay. The SAE EDGE research report, I think, was in April.
Anthony: Okay.
Industry Feedback and Future Directions
Anthony: What’s been the industry feedback?
Jeff Wishart: So far, I’ve had a lot of interest. I’ve gotten on calls [00:30:00] with people from different companies who want to learn more.
Introduction to Metrics and Feedback
Jeff Wishart: So I think that’s promising. And I’m happy to talk to anybody, so if anyone’s listening and wants to learn more, I’m happy to jump on a call to explain and walk through it.
I’ve got a PowerPoint that has a whole explainer, and we walk through the different metrics, so I can describe all of the features. All of the metrics have a bunch of characteristics that help give information. They each have a definition. They each have assumptions and subjectivity. They each have observable variables.
They each have the formulation that you mentioned earlier, Anthony, and applicability: where is this metric applicable? And then we give an example usage to show how you could use this metric. Each metric has that same structure. So I could talk people through that, and I’m getting good feedback there.
I think regulators are interested both at the national, state and city levels. So far so good, but it’s still pretty early.
Anthony: That’s great to hear.
Contact Information and Call for Feedback
Fred: For any of our listeners, if you want to contact Jeff, [00:31:00] just send us a note at contact@autosafety.org and we’ll be happy to refer your request to Jeff.
Anthony: You realize they just opened it up for you to get a whole bunch of emails about perpetual motion machines.
Jeff Wishart: You guys are gonna filter those out, right? So you guys, you’re my middle man
Michael: only.
Discussion on Safety Impacts and Automation
Michael: So switching gears to the Safety Impacts EDGE report, something that we’ve discussed a good bit is, there’s a lot of conversation, at least in the media.
I constantly get questions from reporters who just assume that some of the cruise control assist features, some of the driving automations that are available in today’s vehicles, are safety features. And yet we’ve heard from the Insurance Institute, which did a study on that relationship, that there’s really not a lot of evidence yet that these cruise control [00:32:00] type assisted driving features are really producing any positive safety outcomes.
When it comes to a lot of the active safety features, automatic emergency braking of all sorts, lane departure is one, adaptive headlights,
and there are some other ones, like blind spot monitoring, that do tend to show some pretty easily provable safety benefits to drivers. Do you have any sense of where we’re moving with the automations? It seems like they’re more geared towards driver convenience in a lot of respects.
And in some cases, at least as we’ve discussed a lot with Tesla Autopilot and Full Self-Driving, they may even have negative safety consequences.
Jeff Wishart: Yeah, that’s exactly right, Michael.
Active Safety Features and Their Benefits
Jeff Wishart: So I hope people will read the whole report, but as you say, active safety features intervene intermittently to correct a human driver.
It depends on the manufacturer, and there’s wide variance between [00:33:00] manufacturers, even between models of a single manufacturer, but those have proven benefits. Rear AEB is the best, frontal AEB is second. These are really powerful ways of reducing crashes.
I think there’s still a lot of work to be done, especially at higher speeds and at nighttime with pedestrians, things like that, with AEB. But we know that a lot of these active safety features are very good crash avoidance features, and crash mitigation features too, but hopefully mostly crash avoidance. On the driver support features, this is SAE Level 1 and Level 2 from J3016.
Challenges with Driver Support Features
Jeff Wishart: As you say, these are mostly, they could become safety features, but they are right now more convenience features. We don’t have the data to support any guidance or advice that these are safer than a human driver who’s with no driving automation features, [00:34:00] especially as you mentioned it’s the human machine interaction issues at that level.
Two, we’ve got. The drivers su supervising the vehicle. But as the level two system becomes more and more capable so maybe they can go beyond just a divided highway situation. They can go to surface streets, they can go to parking lots. So all sorts of HMI issues ranging from do I start to have my skills atrophy because I’m not driving as much to, how do we keep drivers in the loop?
Now, there are things there’s a system, driver management system, a driver monitoring system, I should say. Are like, like other things, depends on the manufacturer for the implementation, but we’re, everything’s a proxy, right? You are looking at eye gaze, you’re looking at blinking, you’re looking at head pose.
You’re look, you’re looking at ha hands touching the wheel. Perhaps you can use different proxies for [00:35:00] driver situational awareness, but we don’t really have a good understanding of how to ke make sure that the driver main is still in the loop right now. We need to learn from the aviation industry that has pilots that have been doing this for decades.
We need to really think about that. But yeah, as you say, this is, it’s far too early to be able to say that these issues are not inherent. And I say that in the report that we don’t know that yet. It’s, these may not ever be safer necessarily.
Anthony: That’s surprising, that features like automatic lane departure and active cruise control haven’t increased safety.
Jeff Wishart: Yeah, it’s too early, but adaptive cruise control can cause complacency and help you tune out. If you’re going down a divided highway and you turn on adaptive cruise control, let’s say you’re still steering, but you’ve turned it on, and it’s a very straight highway, you might lose focus.
And so you lose situational [00:36:00] awareness and become complacent. And if something else were to happen, say there’s a vehicle cutting in from the lane beside you, you may not be sufficiently situationally aware to make the right decision. And then if you layer on the steering, to make it a Level 2 system, then you increase the possibility that you no longer have sufficient situational awareness.
Anthony: I’m so happy my wife doesn’t listen to this podcast ’cause those are my two favorite features while driving, but I am still paying attention people. Good. And you’re all driving worse than I am.
Jeff Wishart: Oh, and we haven’t even talked about what that means for Level 3, right? Where, if you are sitting as what’s called a fallback-ready user from SAE J3016, you have to be able to take over. If you leave, for example, your operational design domain, you need to be able to take over the vehicle and either take over the dynamic driving task or pull the vehicle over.
[00:37:00] But I haven’t seen any instance of a Level 3 system where they’re enumerating specifically what secondary tasks the fallback-ready user can engage in and what they can’t. So for example, can I be reading? Can I be eating? Can I be watching TV on that big screen that’s right next to me?
Can I be sleeping? Probably not the latter, but I haven’t seen, as I said, any instance where they’ve been specific about those secondary tasks. I think that’s really needed.
Anthony: The marketing’s definitely implied that you can play video games. You can do whatever you want.
Jeff Wishart: Yeah.
And we need the data to support those decisions on secondary tasks. I don’t think that we have it. I’ve seen ranges of how long it takes a driver who’s engaging in a secondary task to regain situational awareness, and it’s up to about 45 seconds, depending on the secondary task.
So not all secondary tasks are created equal, and not all drivers are created equal, equally either. So it’s a very [00:38:00] tricky situation and I don’t think that we have all the answers yet.
Fred: We do now have the metrics to start measuring people’s response in some of those situations.
Jeff Wishart: Yeah. Thank you, Fred. Exactly. It’s a start.
Anthony: Yeah.
Learning from the Aviation Industry
Anthony: So you did mention that we can learn a lot from the aviation industry. To become a pilot, you’re highly trained. You also have air traffic control monitoring you, so you’ve got backup there as well, telling you to put the iPad down and focus.
So you’ve got all of these systems. Obviously we’re not gonna put, some sort of traffic control system inside cars. Sadly in the United States, we’re not going to increase the requirements for getting a driver’s license and make it harder like it is in the UK or in Ireland. What other features do you think we could take from that industry?
Or is that just, that’s not the point of my paper?
Jeff Wishart: No, it’s a great question and a tough one. I think you’re exactly right that these are trained pilots, and so they get a lot more training than we will ever [00:39:00] give our human drivers. I love to talk about the three E’s to reduce crashes, right?
Enforcement, engineering, and education. There are certain things that we can do to improve human drivers, but it’s very limited and we won’t be able to reach everybody, so we have to focus on enforcement and engineering as well. Maybe this is getting too much into the weeds, but another thing that we’re learning, or trying to take, from the aviation industry and their success until recently, is safety cases.
And so the industry is talking a lot about safety cases as a way of proving safety. Not just saying that safety is priority number one. It’s easy to say, but you have to prove it. So if you make a claim, you have to provide evidence to support that claim. And so that’s becoming there’s a consensus in the AV industry that we should be talking about safety cases going forward.
So I think that’s something that we can learn from the aviation industry as well. But it’s also, [00:40:00] how do you keep the pilots engaged while autopilot is on? That’s something that they’ve looked into, and we need to figure out ways to keep our human drivers engaged while their Level 2 system is engaged.
Fred: One more thing that we could use from the aviation industry is type certificates. They use type certificates in Europe for new versions of cars that they’re putting on the road. In the United States, and Michael, you know the details on this, right, there are no type certificates, and car companies are restricted to self-certification.
Jeff Wishart: Yeah, that’s true. And there’s also the issue that, with human drivers, you could get into a rental vehicle tomorrow and you’ve got a whole new set of features, and a whole different set of limitations and capabilities. And no one’s reading the owner’s manual, even in their own vehicle, let alone a rental vehicle.
So it’s a difficult [00:41:00] situation. We don’t even throw airline pilots into a plane they’ve never trained on before. So the human driver situation is difficult.
Anthony: Yeah. If you’re a Boeing pilot, you’re not getting into an Airbus and flying that all of a sudden.
Jeff Wishart: exactly.
It’s not transferrable. Yeah.
Anthony: I had a rental car once where I’m driving and like 45 minutes in, I’m like, oh my God, I’m sweating. I had accidentally turned on the heat seater. The seat heater. There you go, heat seater. And then trying to turn that off while I was doing 70 miles per hour, no, I was not doing that.
Availability of Research Papers
Anthony: Jeff, are these papers available to the average person or do you have to be an SAE member?
Jeff Wishart: Good question. If you have a Mobilus subscription, you can get both, but neither is free, unfortunately. J3016 is free, and I encourage everyone to get a copy and read it, ’cause it’s what we call the Bible of taxonomy and definitions for the AV industry.
But J3237 is not free, and the EDGE research report is not free. [00:42:00] I can’t send people free versions; I wish I could. But get a Mobilus subscription, perhaps through your institution, and you can get access to these.
Fred: You can also go to sae.com and wind your way through the website there, and you’ll be able to buy the reports or get the J3016 report for free.
Michael: That’s sae.org. Dot com, you’re going to, I don’t even know what that website is. Yeah, don’t go there. My bad. All right. Don’t go there.
Anthony: Yeah. Only put safe things in a web browser. Don’t you know, if you’re unsure, do not do it.
Pedestrian Automatic Emergency Braking
Anthony: ’Cause in the EDGE research report, you have a great chart here talking about the average percent speed reduction in PAEB.
Oh man, PAEB, another acronym: pedestrian automatic emergency braking. I know. But it’s very cool where it’s breaking out daylight, high beams, low beams. And it looks like our friends at Volvo are the clear winners in daylight, but they can’t do anything with [00:43:00] low beams. Is this what I’m looking at?
I know I’m jumping all over the place. I saw a nice graph.
Michael: Basically, I think that most vehicles probably have a difference there. They’re not as able to detect pedestrians when they have their low beams on as when they have their high beams on. I think that’s pretty consistent across manufacturers, but Volvo has a pretty big gap there.
Jeff Wishart: Yeah, it’s true. And as I said before, there’s a big difference as if you’re looking at that chart. It’s figure six on page 13. There’s a big difference between the manufacturers and and they might be good in this test, but not as good in the other test, right? So there’s no consistency in who’s necessarily the best in everything.
And so there’s still a lot of room for improvement. These are already saving lives, these active safety features, but there’s a lot of room for improvement.
Michael: It looks like a lot of them, other than the Volkswagen, are working great during the day, but you get to night and they’re all over the place.
Yeah. Exactly.
Importance of Multiple Sensors
Anthony: It was fascinating [00:44:00] seeing that, ’cause everyone talks about, oh, lidar is the answer, lidar is the answer for everything. But your paper points out that depending on the vehicle paint color, maybe it’s not that great, or with pedestrians, depending on what clothing they’re wearing.
Jeff Wishart: So that’s where, in my view, a range of sensors is important.
Maybe we’ll get to a point where a single-sensor system, like a camera-only system, will be as safe or perhaps even safer, but I don’t think we’re at that point yet. So I think in the near term, a variety of sensors, which each have advantages and disadvantages, is the safer choice at the moment.
And I would say that, until proven otherwise, that’s the case.
Anthony: Okay. So I wanna repeat that. You’re saying a variety of sensors together makes sense. Hey, Elon? Yes, Elon, this is a good idea. But I gotta go into this more with the colors, ’cause it also mentions that a matte finish makes the lidars less effective.
And I’m seeing more and more cars with a matte finish. Are we [00:45:00] going to get to a situation where you can have any color car you want, except for black, except for matte finish?
Jeff Wishart: Yeah, so a lidar is what’s called an active sensor, ’cause it sends out a light pulse and then it gets returned to its receiver.
And so if that return doesn’t come, or it doesn’t come the way it’s expected, your data is not as good. A camera, for example, is not an active sensor, it’s a passive sensor, so it doesn’t get affected by that. But as we know, cameras don’t do very well at night, unless we’re talking about a thermal camera.
So that’s where, in this case, a lidar’s data may not be as good for a certain finish on a paint, but then you cover that with your radar, you cover that with your camera, with other sensors, to make up for the limitations in that instance. If you’re relying on a single sensor, you obviously can’t do that.
Anthony: So going through all of this data, when you were putting the EDGE paper together in [00:46:00] particular, what were the things that stood out to you as being, oh my God, this is not as good as I imagined, or things where you’re like, oh, we’re in a better place than I imagined?
Jeff Wishart: Active safety was really it. It was clear that we’ve made big improvements, and we still have a long way to go, but those are already saving lives. So I’ve got a chart in the paper, just pulling it up, where it shows all of the features and their impacts on crashes. This is Table 1 on page 15. Rear automatic braking has a 78% decrease in crashes in this study, and that’s pretty good, right?
Yeah. So automatic emergency braking, the frontal, 50 and 56 percent for different types of crashes. So that’s really good, and we could do better, and we really need to make sure that we get as many vehicles with the best of these systems on the road. So that kind of surprised me; it was better than I even anticipated. On the driver [00:47:00] support side,
I was hopeful that I would get more data that showed something one way or the other, but because it didn’t, I can’t really say that I was surprised either way. I suspected that was gonna be the case, but I didn’t know. And then on the automated driving systems, so Levels 3 through 5: Level 3, we don’t have enough data to really say anything either way, because in the US, for example, we only have the Mercedes system and it’s been very limited in its deployment.
So we don’t really have data there. On Level 4, ’cause we don’t have any Level 5, we basically have Waymo, and Waymo’s done quite well, I would say. They continue to accumulate miles, and they have their incidents, there’s no question, but they also have a lot of crash-free miles. I don’t know about incident-free miles, because, as we say, we’re only getting crashes and not the J3237 metrics.
But I’d say there are caveats to the Waymo data. These are not transferable to any other developer, and if Waymo makes changes, as we discussed [00:48:00] earlier, we can’t say that the earlier data are all that relevant. And their current deployments are mostly in Sunbelt-type cities,
Phoenix being one, where I live. Once they start expanding, I know they’re testing in DC and I think New York, that will be interesting, especially in snow. But they still haven’t unlocked their highway commercial service, or rather their service does not go on highways currently, so that will be a very different situation as well.
So there are caveats to the Waymo data, but it’s quite promising for this one AV developer, I would say.
Anthony: And any data from the autonomous trucking companies, like Kodiak?
Jeff Wishart: No, we don’t get any of those data. If they get in crashes, we get some. But another thing about Waymo: most of their data are not peer-reviewed.
They’ll put out some peer-reviewed papers, and that’s great, I really support Waymo in doing that. I’d love to see more of the data being peer-reviewed when they put out their publications. But some of their publications are not peer [00:49:00] reviewed, like their Safety Impact website, and I encourage people to go to it.
It’s very informative, but it would be better if it were peer-reviewed. That’s just the way things work best in the scientific world. I think it’s promising, but I would like to see some more peer review.
Anthony: Alright. Gentlemen, I’m outta questions.
You guys have anything?
Fred: I would like to just dive in very briefly to the Gaslight illumination for this week, ’cause it’s very relevant, if I may.
Waymo’s Safety Demonstration
Fred: Dmitri Dolgov, who’s a co-CEO at Waymo, posted a video on LinkedIn, or at least that’s where I saw it. And in that video, a Waymo avoids crashing into another car.
So he’s extolling that as an example of Waymo safety. I looked at that many times, and apparently the Waymo is really traveling at an excessive speed on a congested urban street alongside parked [00:50:00] cars. And then all of a sudden, one of those cars reverses from the parking area into the Waymo’s path.
Now, the Waymo took about one second after detecting the reversing vehicle before it did anything. And while still traveling at that speed, it came very close to the reversing vehicle. Then all of a sudden the Waymo cranked the wheel 270 degrees and moved into the adjacent lane, which was fortunately unoccupied, briefly crossed the double line into the opposing travel lane, then steered back and returned to the previously occupied lane.
So Waymo touts this as proof of its operational safety, but they don’t have any metrics, like Jeff would’ve wanted from this, that describe what happened. And this really doesn’t support their safety case, because they would’ve rejected this in their safety case analysis, because they’d say it’s [00:51:00] not our fault.
You know, we’re reacting to another reckless maneuver. So from an engineering perspective, it is not really proof of engineering safety. It’s an individual event. It’s really not part of a parameterized study that is amenable to the metrics. So the question that I’ve got after this is, why is this safe?
The Waymo impinged on another car because the Waymo was going too fast for the environment, and there was a high-g maneuver to avoid the accident. But if there were occupants, which is not clear, that might have been really bad for the occupants. Jeff, what do you think?
Jeff Wishart: Yeah. Fred, you’re teeing that up nicely.
I think if you go to the Waymo website, they talk about crashes, right? They don’t talk about situations like this. And so in J3237, we’ve got a metric called event response time violations. So if there’s an event to [00:52:00] which the AV should respond, you can set a threshold beyond which it’s a violation.
So in this case, you said it was a second; if you set your threshold at 0.8 or 0.5 seconds, which is often chosen, then you’ve got a violation. It took too long to react, and it got into a situation because it took so long. So I think we do cover that kind of situation, but we don’t cover, as I said, all of the situations of crash avoidance.
Although, as people know, if you drive defensively, you can avoid the crash avoidance situation a lot of the time. So yeah, we would cover that with J3237, with that metric at least; that would help. But I don’t know what Waymo is doing internally. They probably have some sort of metric that they’re measuring and keeping track of there, but I don’t know what it actually is.
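As a rough illustration of the event response time violation idea, here is a tiny Python sketch; the 0.5-second threshold is just one of the example values mentioned above, and the function name and timestamps are hypothetical, not taken from J3237.

```python
# Illustrative sketch only: flag an event-response-time violation when the
# gap between detecting an event that requires a response and initiating
# that response exceeds a chosen threshold (0.5 s used here as an example).

def response_time_violation(event_detected_s: float,
                            response_initiated_s: float,
                            threshold_s: float = 0.5) -> bool:
    """True when the vehicle took longer than the threshold to start reacting."""
    return (response_initiated_s - event_detected_s) > threshold_s

# Example loosely modeled on the clip discussed above: roughly one second
# elapsed before the vehicle reacted, so a 0.5 s threshold flags a violation.
print(response_time_violation(event_detected_s=10.0, response_initiated_s=11.0))  # True
```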
Fred: So these close encounters, they matter. They do, and they can be analyzed if you’ve got [00:53:00] appropriate data collection and you apply the proper metrics. Agreed. But note that the crash avoidance in this case was at least partly due to the reversing car’s application of brakes once it detected the encroaching Waymo.
So there are a lot of factors involved in this, but from the Gaslight perspective, this really is not a testament to safety. On the other hand, if Waymo had divulged the statistics that supported this and the engineering data that allowed it to happen, and then described their recovery, then it would be at least a contributor to an overall assessment of what the safety could be, good or bad.
What Waymo is doing here is clearly gaslighting, because it’s just not revealing the information necessary to support their claim that this was part of a safety demonstration.
Jeff Wishart: So I would like to see, yes, more instances where the J3237 metric measurements [00:54:00] are provided, and/or the DA score for a given scenario.
So maybe they get below 80, who knows what their score would be; it would depend on the metrics and the measurements. But that’s a way for developers to provide information on performance without providing sensitive information, unless they consider their performance in any scenario to be sensitive, which maybe they do. But I think there are ways that we can get to transparency, which I think is good for everybody in the end, with these metrics and perhaps with the DA score.
Fred: Oh, I agree. So Waymo, and what’s the name of that other company? Tesla. If you’re listening, Tesla: okay, you’ve now got a language we can use to communicate. Let’s get it on.
Anthony: Hey Fred, I’m gonna make your day. Look, it’s a new month and you win the Gaslight this week,
Fred: That’s great. No competition, [00:55:00] but still, I’ll take the win.
Yeah
Anthony: We’re abstaining. But yeah, you win. Congrats. Thank you.
Conclusion and Final Thoughts
Anthony: Yeah, I think that’s the show. Jeff, thanks for coming by. I think this is great. You’re literally doing God’s work, because these types of things need to happen. We’ve talked about this forever, but you’re actually doing it, which is wonderful.
I would love to see an update to NCAP using your formulations. And then next week we’ll have a quiz on every single acronym we’ve talked about that’s in your papers.
Fred: Hey, thanks Jeff. I’ll see you in Tempe in a couple weeks.
Jeff Wishart: You will.
Fred: Alright, good. I’ll be looking for you.
I’ll buy you a beer.
Jeff Wishart: Sounds good. Awesome. Thanks you guys. I appreciate the invitation. I look forward to the next time.
Anthony: Yeah, absolutely. We’re dying to find out updates to this when they happen. Will do. All right. Bye-bye. Bye-bye. Bye-bye. Bye, everybody.
For more information, visit www.autosafety.org.