Building Confidence Through Design in AI-Driven Engineered Systems
Avi Nehemiah, MathWorks
The integration of AI into automotive software platforms is accelerating, highlighting the critical need for greater confidence in the robustness and accuracy of these AI-driven systems. This talk covers five keys to enhancing this confidence by using system-level simulation, leveraging the latest models from research, formal verification, and testing on hardware.
Published: 22 Dec 2024
All right. Good morning, everybody. Through this session, I'm going to try and help you take away three things.
First, I want to expose you to some of the applications of AI that we hear in automotive industry. Second, I want to dig into the concerns, some of the top concerns, things that are blocking you from building confidence in incorporating AI. And last, we'll talk about the techniques, the development of methodologies that you can use to build that confidence.
Now, the use of AI in automotive is very diverse, and it continues to grow. Here are three examples. The first is incorporating AI into products with virtual sensors. The second is using AI to speed up simulation and analysis time, so to improve your development process with AI-based reduced order modeling. And the third is to use AI for data generation, so taking recorded test data and using AI models to convert it to simulation for virtual validation.
Now, much of the insight that I'll share over the next 20-plus minutes comes from you. It comes from collaboration with our users incorporating AI into their work. And you'll see the best examples of that this afternoon in the AI track. We have two user talks. And then, we have a set of MathWorks experts that will dive deeper into many of the topics I cover today.
So Sunil challenged us at the start of the session to take away one thing. Jim said he's going to take away much more than one. So, hopefully, today, I'll provide a lens for you to critically assess, is an AI development process robust? And can I have confidence in it?
And we'll use five questions to think about it, to critically look at this AI process. The first: am I using the best possible AI model? AI sees a lot of development, a lot of research. How do I know I'm using the best model?
Second, AI is usually being added to an existing system. Some of these systems have been developed and improved over maybe a decade. What's the impact of adding the AI component? How does it change the behavior of the system?
Third, when you think about deployment, if I have an accurate AI model, do I have enough compute? Do I have enough memory? Will it fit?
Now, in automotive, safety is absolutely critical. Are there ways, are there formal methods to verify and validate the AI models? What about certification? Those are the questions we think about.
And lastly, for the individual developer on the bench, is there a way to use the target hardware during development so there are no surprises later on? Now, I'm going to use these five questions as a framework of my talk, as I go through it. I'll start from AI modeling and end with testing on hardware.
And I'll use a virtual sensor as an example as we walk through the different aspects of the talk. So a virtual sensor, or a soft sensor, mimics a physical sensor by taking measured outputs and estimating what the physical sensor would measure. Now, some of you are probably thinking, virtual sensor?
AI seems like overkill. I can use a Kalman filter. I can add a cheap sensor. So why do you need an AI-based virtual sensor?
Well, a physical sensor could be expensive. It might be slow. There could be added noise because of your system design. Maybe you require redundancy. But also, as the amount of compute in our cars increases, a virtual sensor gives you the ability to add functionality further down the road.
The specific virtual sensor that I'll talk about is something that will estimate battery state-of-charge. Everybody has at least one of these in their pocket. It's what shows you your battery indicator on your cell phone, your laptop, or your EV.
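Before the AI version, it helps to see what the classic non-AI baseline for state-of-charge looks like. One traditional approach (an illustrative baseline, not the method demonstrated in this talk) is coulomb counting: integrating measured current over time. A minimal Python sketch with made-up numbers:

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Classic non-AI state-of-charge baseline: integrate measured
    current over time (coulomb counting). It drifts with any current
    sensor bias, which is one reason recursive estimators and AI-based
    virtual sensors are used instead."""
    soc = soc0
    trace = []
    for i in currents_a:
        # Positive current = discharge; capacity_ah * 3600 converts Ah to As
        soc -= i * dt_s / (capacity_ah * 3600.0)
        trace.append(soc)
    return trace

# Hypothetical numbers: a 4 Ah cell discharged at 2 A for ten 60 s steps
trace = coulomb_count(1.0, [2.0] * 10, 60.0, 4.0)
print(f"final SoC: {trace[-1]:.4f}")
```

Each step removes 2 A times 60 s = 120 As out of 14,400 As, so ten steps drop the state-of-charge by about 8.3 percentage points.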
OK, so now, let's go through the five questions. So the first one on AI modeling. How do I have the best model?
So in order to enable you to find the best model, we provide two ways for you to think about it. The first, you build your own using MATLAB. And the second is you leverage models from research. It can be from academic research. It can be from other teams in your organization.
Now, of course, you can build these models in MATLAB. You can bring them in from other AI frameworks, like TensorFlow, PyTorch, the open neural network exchange. And you can also directly call into these Python-based frameworks from MATLAB and Simulink.
Now, let's start with creating a white-box model in MATLAB. So here, as you can see on the screen, I have the Deep Network Designer app. I'm creating the model. This is actually the same model that is used for battery state-of-charge estimation, assembling it layer by layer, and then parameterizing the layers.
And then, you can export this so it can be trained with different parameters and run different experiments. You can also programmatically create these in MATLAB. In this case, I'm actually importing the model from a TensorFlow model.
And so that brings it into MATLAB. You can see the structure of the model, a very simple model, seven layers. And then, you can take a closer look at it.
But we recognize that creating your own model or maybe getting it from somebody else that's using TensorFlow or PyTorch in your organization, that's not the only way. There's lots of research out there now, kind of breaking accuracy records all the time.
And we've provided a way for you to access these models through the MATLAB Deep Learning Model Hub. This has over 60 pre-trained models from a variety of fields, like computer vision, NLP, et cetera. And the key point here is to keep checking back because we update it monthly.
Once you have a model, you could have received it from outside. You could have built it yourself. It's really important to find the best parameters. And that's where experiment management comes into play.
You explore the hyperparameters, different training parameters, different model types. You can, of course, run training experiments directly from this app. But the real key for this is you set up a way to assess your model. And then, you keep track of the different models.
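The bookkeeping idea behind experiment management, run every configuration, record it, and rank the results, can be sketched tool-agnostically in Python. This is not the Experiment Manager API; the training function and parameter names here are hypothetical:

```python
import itertools

def run_experiment(train_fn, param_grid):
    """Run train_fn for every combination in param_grid and keep a
    record of each trial, so the best model is traceable afterwards."""
    trials = []
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        accuracy = train_fn(**params)          # user-supplied training routine
        trials.append({"params": params, "accuracy": accuracy})
    # The "best model" question: which recorded trial scored highest?
    best = max(trials, key=lambda t: t["accuracy"])
    return best, trials

# Hypothetical stand-in for a real training run: a fake scoring function
def fake_train(learning_rate, hidden_units):
    return 1.0 - abs(learning_rate - 0.01) - abs(hidden_units - 64) / 1000

best, trials = run_experiment(
    fake_train,
    {"learning_rate": [0.1, 0.01, 0.001], "hidden_units": [32, 64, 128]},
)
print(best["params"])
```

The point is less the grid search than the record: every trial's parameters and score are kept, so "is this the best I can do?" has a traceable answer.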
So when you're thinking about, am I using the best AI model? You want to keep track of, what's the model? What's the accuracy? Is it the best that I can do?
Now, the use cases I showed you so far are really related to taking the AI model either from outside or building it within MATLAB, but doing it locally. But generative AI is a big trend, and a lot of those models are too large to be used locally or require a connection to the cloud.
So when you think about, what are the top performing models? You think about things like ChatGPT, or models from Google, or Anthropic. And what I have here is just the leaderboard for the top-performing LLMs.
Now, again, in that MATLAB Deep Learning GitHub, we provide a repository that lets you connect to these LLMs in the cloud in MATLAB. There's actually a user talk this afternoon that uses this repository. And that will let you connect to text-generation models, so if you want to build a chatbot, if you want to analyze your internal test data, or image-generation models.
OK. So the thing to think about, if you're asking, do I have the best AI model, is this: before you decide to build from scratch, try to pull in the latest and best research, either from GitHub or from your colleagues around you.
Now, the next question, and this might be the most important, is, what's the impact of adding AI to an existing system? The AI component is usually working in a much more complex system. And the answer to that is, create a system-level model, hopefully, in Simulink, that incorporates both the AI and the rest of the system.
So battery state-of-charge is usually used in a BMS. So here, we have the plant, which is the battery model, and then a model of the loading. And then, the BMS is a mixture of logic, so you need to check for overcharging and undercharging and manage the thermal state, and then you have the state-of-charge estimation.
Now, the state-of-charge cannot be measured directly from the battery, so it has to be estimated. So let's zoom in there. Now, the most prevalent way to do this is recursive estimation with an extended Kalman filter.
What we're going to do here instead is estimate this with AI. And if you're wondering, why use AI? Because, one, you don't need an internal model of the battery. But, second, it can model much more complex relationships between the data and the output.
Now, I expect many of you are in Simulink. And a big investment of ours has been providing the ability to access AI models from Simulink blocks. They can be AI models built in MATLAB and Simulink. They can be models imported from external frameworks, or you can even run AI models directly by calling Python for co-simulation.
Once you have your AI component integrated, that's when you can see how it works with the rest of your system. You can perform baseline testing. You can compare performance against data measured in the lab and against your simulation output.
But, of course, after the algorithm development, you can then test with SIL, PIL, and HIL. So how do you test the impact of the AI component on your system? You perform system-level testing to assess what the impact of adding that AI component will be to your system.
OK, so moving on to memory and execution time. And this is the question, do I have enough compute? Do I have enough memory? Will this AI model fit?
And this is where we see a lot of fear: AI is great, it's giving me great results, but will it work with the hardware that I already have? And very typical questions or comments that we get from our users sound like this.
I have a model. It's 40 megabytes. It needs to be less than 10 megabytes to fit. Or, very often, you know exactly how much memory budget you have, and that's all you're working with. You need to get your model down to a certain size. Otherwise, it won't work.
Now, there are two stages to optimizing the size of the network. The first is structural compression, so you compress down the actual size of the model. And then, you can compress the data types, so quantize them, and then leverage code generation to improve performance.
The first of those techniques is projection-based compression. So here's an example of an AI model. This is a more complex version for state-of-charge. It still has seven layers, but it has 400,000 learnables, so lots of parameters and lots of memory.
Now, using the projection-based technique, you can actually bring it down to about 30,000 learnables, with significantly less memory; in fact, the model is 93% smaller. If you look at the accuracy chart, you'll see the accuracy is almost the same. There is a reduction; that's expected. And it's 1.5 times as fast.
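Those figures are easy to sanity-check with back-of-the-envelope arithmetic, assuming single-precision (4-byte) parameters:

```python
def compression_stats(learnables_before, learnables_after, bytes_per_param=4):
    """Memory arithmetic for structural compression: parameter count
    times bytes per parameter, plus the relative reduction."""
    size_before = learnables_before * bytes_per_param
    size_after = learnables_after * bytes_per_param
    reduction = 1.0 - learnables_after / learnables_before
    return size_before, size_after, reduction

# Parameter counts quoted in the talk: 400,000 down to about 30,000
before_b, after_b, reduction = compression_stats(400_000, 30_000)
print(f"{before_b / 1e6:.1f} MB -> {after_b / 1e6:.2f} MB, {reduction:.1%} fewer learnables")
```

That works out to 92.5% fewer learnables, consistent with the roughly 93% smaller model quoted in the talk.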
Now, you can also apply the pruning techniques that are available in MATLAB. Pruning changes the structure of the model, so it might reduce the number of layers or change the connections.
And you can apply that across a wide variety of network types. You can get up to a four-times reduction in memory use. And you can generally keep roughly equivalent accuracy; there will be a reduction in accuracy, but it will be pretty close.
Once you've structurally reduced the size of the model, then, before deployment, you can use deep learning quantization. This works on deep learning models to reduce the bit depth to scaled integer data types. And that will give you the smallest, probably most memory-efficient model for your deployment.
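The core idea of scaled integer quantization fits in a few lines. This is a toy per-tensor symmetric int8 scheme in plain Python, not the MATLAB quantization workflow, and the weights are made up:

```python
def quantize_int8(weights):
    """Toy symmetric quantization: map floats onto integers in
    [-127, 127] using one shared scale factor per tensor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]

w = [0.5, -1.2, 0.03, 0.9]                 # hypothetical float32 weights
q, scale = quantize_int8(w)
round_trip_error = max(abs(a - b) for a, b in zip(w, dequantize(q, scale)))
print(q, round(round_trip_error, 4))
```

Each value now needs 1 byte instead of 4, a 4x storage reduction, at the cost of a small round-trip error, which is why you re-validate accuracy after quantizing.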
So the thing to think about is this: once you have your trained model, you've tested the component, you compress it, you quantize it, and you have the most efficient model. And then, you apply automatic code generation to deploy to a processor of your choice.
Now, when you look at these different models, one of the reasons that we actually integrate with system-level simulation is so you can do some sort of trade-off analysis. So here, you have a chart. I have the Kalman filter, which is the traditional non-AI technique, on the left. And I have multiple AI techniques on the rest of the table.
The technique with the compressed model is on the far right. And if you look at it, you can see a few things. So it is harder to develop the AI model. There's training time. There was the time to compress it down.
It gives you better accuracy. It's about the same size as the Kalman filter and about the same speed. So the trade-off for all of that work that you have on the front end is better accuracy, similar memory, equivalent performance.
So when you're thinking about, will the model fit? Can it be performant enough? The takeaway is, think about model compression, quantization, perform the trade-off studies, and, of course, leverage automatic code generation.
OK. Moving over to verification and validation. I'm going to use a slightly different example here because it's just a little easier to think about the problem looking at an image versus a signal for state-of-charge. Now, AI models, especially the neural-network-based ones, can misclassify inputs after very imperceptible changes to the input.
So here, I have a model. It classifies the image that you see as a bicycle. If you add some noise to it, you can have it change the output to a pole. Now, this effect is called being open to an adversarial attack. And it is important to verify whether a model is robust to these adversarial attacks.
So when people say, is your AI model robust? They're usually referring to whether the model is robust to these adversarial attacks. And since 2022, we've been investing in formal methods, formal methods that help you check if a model is robust to these adversarial attacks.
And instead of checking individual inputs one at a time, these techniques consider a whole search space of inputs and formally check whether the network is robust to adversarial attacks across that space. What this means as an end user is you can figure out the bounds on the input within which your model is safe and is not going to give you unpredictable answers for small changes to the input.
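For deep networks this kind of search-space reasoning needs dedicated formal tools, but the underlying question has an exact closed-form answer for a linear classifier, which makes the idea easy to see. An illustrative sketch with hypothetical weights: within an L-infinity ball of radius eps, the worst-case shift in the score of sign(w.x + b) is eps times the L1 norm of w, so the prediction cannot flip as long as that shift stays below the margin.

```python
def linf_robust(w, b, x, eps):
    """Exact robustness check for a linear classifier sign(w.x + b):
    no input within an L-infinity ball of radius eps around x can flip
    the prediction iff eps * ||w||_1 < |w.x + b|."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    worst_case_shift = eps * sum(abs(wi) for wi in w)
    return worst_case_shift < abs(score)

w, b = [2.0, -1.0], 0.5      # hypothetical trained weights
x = [1.0, 0.5]               # score = 2.0, so the margin is 2.0
print(linf_robust(w, b, x, eps=0.5))   # worst-case shift 1.5 < 2.0: robust
print(linf_robust(w, b, x, eps=0.8))   # worst-case shift 2.4 > 2.0: not robust
```

Formal verification tools extend this style of bound through every layer of a nonlinear network rather than one linear map.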
Now, when we're developing AI models, you always have a range of data that it's trained on. And the AI model will give you the best results when the input is in that range, with some variance.
Now, another technique to ensure that these models are safe, especially as you get into safety critical situations, is checking for out-of-distribution, which means you have a separate detector that sits next to your AI model that analyzes the input data. And it tells you, is the input data that's coming in, in the distribution that it was trained on or is it not? And then, you can choose to apply additional safety methods to account for those changes.
Again, I'm going to use an image for this. It's a little bit easier to see. So what we have here is two Simulink models. We're going to have one generated image which is just clean and one with a little bit of smog added.
And you can see the indicator there. That's a runtime monitor that tells you whether the input data is in the distribution or not. In this case, the model was trained on very clean data.
So the smog images are out of distribution. And there's an alert that they are out of distribution. Now, it's up to the system engineer to decide what to do with this, but at least we have an indicator as to whether the AI model is working within its expected range.
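The runtime-monitor idea can be sketched with a deliberately simple statistical detector. Real out-of-distribution detectors typically score internal network activations rather than a raw scalar input, and all numbers here are made up:

```python
import statistics

class DistributionMonitor:
    """Toy runtime out-of-distribution monitor: learns the mean and
    spread of a scalar input feature from training data, then flags
    inputs that fall far outside that range."""
    def __init__(self, training_values, threshold=3.0):
        self.mu = statistics.mean(training_values)
        self.sigma = statistics.stdev(training_values)
        self.threshold = threshold

    def in_distribution(self, value):
        # Flag anything more than `threshold` standard deviations away
        return abs(value - self.mu) / self.sigma <= self.threshold

train = [0.48, 0.50, 0.52, 0.49, 0.51, 0.50, 0.47, 0.53]  # hypothetical data
monitor = DistributionMonitor(train)
print(monitor.in_distribution(0.50), monitor.in_distribution(0.95))
```

An input more than three learned standard deviations from the training mean raises the flag; what the system does with that flag, fall back to a conventional estimator, alert, or degrade gracefully, is the system engineer's call.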
Another way to help increase that confidence in the AI model is to look inside of it. And an investment for us, really over the last five years, is providing tools for you to go into the model, investigate it, visualize, and get a sense of why is it behaving the way it's behaving.
Now, you're probably wondering about certification. And this is a very active area. So in auto, we have ISO 8800, and there are equivalent standards in aerospace and in medical devices. And then, we have broader legislation, like the AI Act in the EU.
And what MathWorks is doing here is we are presently working with these standards bodies, understanding the concerns of our users, and understanding how other vendors are thinking about it. And we are now starting to think about recommended workflows to help as these certification processes go live.
The aerospace industry is probably a little bit ahead of auto in this space. And they have started publishing guidance on what it will take for a component to comply with DO-178C. We published an example on this very recently, on adding an AI component to an airborne system.
And, really, the way to think about it is, if you have an AI model, and, again, this is a little easier to visualize if you think about a vision-based system, you can have the input from a camera and the system output. But for it to comply, you need multiple components. You need redundancy with the models.
And you need a safety monitor, as well, before hitting the system output. And this is a published workflow by MathWorks. We published it in the last year.
So when it comes to thinking about methods to check, is my model behaving optimally? And is it within the operating range that it was trained for? The things to think about are: verify the robustness, test for out-of-distribution, and also leverage techniques to visualize the models.
All right. Lastly, testing on hardware. Now, this is really important, as you all know, to prevent trouble later on with integration. Things might work great on the developer's desk, but if it has not been tested on hardware, it can create lots of problems later on.
So here, I have the Simulink model of the same problem, the battery state-of-charge estimation. You can see, as I've talked about through the talk, the model has both the Kalman filter and the AI model. And, of course, you can use Embedded Coder to generate code for the model.
Of course, then, the next step is to run PIL testing, where I take the algorithm, I put it on hardware, and the deployed code communicates with the plant that is in Simulink. So in this example, we are running on an NXP board.
You can go in. You can check the settings. You can modify the parameters of the board. And you can see, in the main chart, that's comparing the true output versus the simulated output. So you can see the performance on the actual hardware that you want to deploy to.
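The pass/fail criterion behind this kind of back-to-back comparison is simple to state: the trace from the deployed code must track the simulation reference within a tolerance. A sketch with hypothetical traces and a hypothetical tolerance:

```python
def equivalent(reference, deployed, abs_tol=1e-3):
    """Back-to-back (e.g. SIL/PIL) equivalence idea: the code running
    on the target must match the simulation reference within abs_tol."""
    if len(reference) != len(deployed):
        return False
    return all(abs(r - d) <= abs_tol for r, d in zip(reference, deployed))

sim_out = [0.80, 0.79, 0.78, 0.77]           # hypothetical simulation trace
pil_out = [0.8002, 0.7899, 0.7801, 0.7700]   # hypothetical on-target trace
print(equivalent(sim_out, pil_out))
```

In practice the tolerance reflects the numeric differences you are willing to accept between host and target, for example from quantized arithmetic.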
So again, we had five questions. These are things you should look at critically when you're starting to develop an AI component or integrate an AI component into a system that you're already working on. And the last thing is: enable your developers to test on hardware with PIL and HIL. It makes a huge difference later on when it comes to integration.
Now, virtual sensors: we're actually finding this to be maybe the most popular kind of production deployment of AI. We've had other OEMs deploy as well. Gotion is using it for the state-of-charge of the battery pack.
Daimler Mercedes-Benz is using it for engine piston pressure, so a very different type of application. And there's a variety of others. And many of these stories are available on MathWorks.com.
So maybe a quick recipe, the five keys. What do you need to do to build confidence? The first, make sure you have the best model, so access, explore, see what the state-of-the-art is.
The second, maybe the most important, system-level simulation and testing. Test the effect of the AI component along with the rest of your system. Three, to make it fit, apply model compression and automatic code generation.
For the verification and validation, test for robustness and test if the data is out of distribution. And just doing those two things can go a long way in ensuring safety. And, lastly, enable your developers to do PIL and HIL testing.
Now, before I summarize and close up, you're probably wondering: this is a talk on AI, and you've not said anything about generative AI. So the way we think about generative AI at MathWorks, there's a spectrum. And there's the near term, which Jim showed you some of, in code creation and analysis. And we are very actively working on this.
But there are two things that you can use right away. The first is the MATLAB AI Chat Playground. This will let you experiment with a chatbot right alongside your MATLAB code. This is available on MathWorks.com right now. It's powered by ChatGPT. It'll give you the experience of what it's like to work with a chat-based assistant.
The second, if you want to use some of these models in your MATLAB work right now: again, the GitHub repository is LLMs with MATLAB. This has the code that will enable you to connect to OpenAI, and also to local models using Ollama. And there are many examples, like creating a chatbot, et cetera.
Now, I'll be around for the rest of the day. And if you want to maybe dive into a little bit more on what MathWorks is doing-- Jim talked about the copilot-- I'm happy to talk offline. Now, before I close up, again, I want to point out the sessions this afternoon, the AI sessions.
We have two user talks, one of which actually uses GenAI. The second is on virtual sensors. And then, of course, we have the much deeper dives from our MathWorks experts, a really good chance to interact with our experts at MathWorks. That's it from me. Thank you for your time.
[APPLAUSE]