Thermal Runaway: Understanding the Fundamentals to Ensure Safer Batteries

September 19, 2019 | Lithium-ion (Li-ion) battery thermal runaway occurs when a cell, or area within the cell, achieves elevated temperatures due to thermal failure, mechanical failure, internal/external short circuiting, and electrochemical abuse. At elevated temperatures, exothermic decomposition of the cell materials begins. Eventually, the self-heating rate of the cell is greater than the rate at which heat can be dissipated to the surroundings, the cell temperature rises exponentially, and stability is ultimately lost. The loss in stability results in all remaining thermal and electrochemical energy being released to the surroundings.

William Walker is working to unravel the fundamentals of this explosive process. It is only by fully comprehending thermal runaway through the testing of statistically significant quantities of cells, Walker believes, that the complete impacts of the event can be effectively mitigated. Dr. Walker tells Battery Power Online that this is important because successful mitigation techniques ultimately lead to the design of safer batteries with adequate thermal management systems.

Walker works at the NASA Johnson Space Center. He started as a thermal analyst focusing on thermal modeling of spacecraft structures in space environments, but while earning his PhD in Materials Science and Engineering, NASA expressed a need for a broader knowledge base regarding thermal runaway in Li-ion batteries. From there, Walker re-directed his focus to the thermal aspects of Li-ion batteries with a special focus on battery safety and thermal runaway.

His quest to elucidate the intricacies of thermal runaway recently led to the publication of a paper he co-authored outlining the creation of a statistical regression model used for predicting the test-to-test variability of thermal runaway energy release as a function of cell-level design variables (e.g. impacts of bottom vent vs. non-bottom vent, impacts of high melting temperature separator materials, and impacts of casing thickness). The publication also provides industry with insight into the fraction of energy that is released through the cell casing vs. that ejected away from the cell – information that is very helpful in the design of safe battery systems. The data discussed in the publication was only made possible through the utilization of a special fractional thermal runaway calorimeter (FTRC), developed by NASA teams (for which Dr. Walker provided thermal expertise), for the measurement of total energy release and the energy fractions.

Battery Power Online: What does a typical day in the office or lab look like for you?

William Walker: My time is split between the office and the testing facilities. My office time is spent supporting NASA projects and programs that need thermal analysis or thermal design expertise. I consult and collaborate with individuals and programs, providing guidance in safe battery design, especially as it relates to thermal management. A lot of my time is also spent working with the Fractional Thermal Runaway Calorimetry (FTRC) development teams and testing teams.

When I’m not in the office, I might be found at the Energy Systems Test Area, or ESTA. This is a safe and designated area for abuse testing of Li-ion cells at the NASA Johnson Space Center. We frequently perform abuse testing using the FTRC device; in this role I am typically supporting our lead mechanical designer and test facilitator Jacob Darst (also with NASA Johnson Space Center) in executing the tests. From there I process the data to help provide our team with an understanding of what happened when the cell exploded.

Can you briefly explain a thermal runaway event?

A few things can trigger thermal runaway events. Thermal runaway can be initiated from mechanical or thermal failures. Electro-chemical abuse from overcharging or over-discharging the cell can also initiate thermal runaway. Also, there’s the possibility of an internal short circuit within the cell leading to thermal runaway. Any of these events can lead to elevated temperatures that are high enough to induce rapid exothermic decomposition of the cell materials.

To explain further why we see such rapid heating rates, we need to understand a little more about the decomposition reactions. When a Li-ion cell, or a small spot within the cell, reaches a certain critical temperature range, the materials inside the cell start to break down, to decompose. These decomposition reactions are exothermic in nature, which is why we have a self-heating behavior. Further, the decomposition rates, which are directly proportional to the exothermic self-heating rates, follow the Arrhenius form, which means that the decomposition rate (and subsequently the self-heating rate) goes up exponentially as temperature goes up. Put simply, as the temperature increases, so does the decomposition rate, and likewise, so does the self-heating rate. The result is a self-feeding heating rate within the cell that increases until the cell loses stability, ruptures, and all remaining thermal and electrochemical energy is released into the surroundings.
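The feedback loop described here can be sketched numerically. The following is a minimal illustration, not a model from Walker's work: it assumes a single Arrhenius reaction and a simple linear heat-loss term, and every parameter value (pre-exponential factor, activation energy, heat release, heat-transfer coefficient) is a hypothetical number chosen only to make the trend visible.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def self_heating_power(temp_k, pre_factor=1.0e13, activation_energy=1.3e5,
                       heat_release=5.0e4):
    """Heat generated (W) by one Arrhenius reaction. All defaults are hypothetical."""
    # Arrhenius form: the rate grows exponentially as temperature rises.
    rate = pre_factor * math.exp(-activation_energy / (R_GAS * temp_k))  # 1/s
    return heat_release * rate  # J * 1/s = W


def dissipation_power(temp_k, ambient_k=298.15, h_times_area=0.05):
    """Heat lost to the surroundings (W), assumed linear in the temperature difference."""
    return h_times_area * (temp_k - ambient_k)


def onset_temperature(start_k=300.0, stop_k=600.0, step_k=0.5):
    """First scanned temperature at which self-heating exceeds dissipation."""
    temp_k = start_k
    while temp_k <= stop_k:
        if self_heating_power(temp_k) > dissipation_power(temp_k):
            return temp_k
        temp_k += step_k
    return None  # no crossover found in the scanned range


if __name__ == "__main__":
    for temp_k in (320.0, 360.0, 400.0, 440.0):
        print(temp_k, self_heating_power(temp_k), dissipation_power(temp_k))
    print("crossover near (K):", onset_temperature())
```

With these made-up inputs, heat loss grows only linearly while heat generation grows exponentially, so once the crossover temperature is passed the gap keeps widening. That is the self-feeding behavior described above.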

What has NASA’s response been to thermal runaway?

Like most industries, NASA also uses Li-ion batteries. However, we operate in an arena that can be especially high-cost and high-risk, and those risks have to be addressed. Following certain well-known field failure incidents in 2013, NASA decided to re-evaluate its certification criteria for Li-ion batteries. The agency sought to re-evaluate what defines a safe Li-ion battery assembly, and to reconsider what certification requirements should be instituted to ensure that a battery is safe for flight, particularly for human spaceflight applications.

The NASA Engineering and Safety Center, or NESC, led the effort, in collaboration with battery engineers across the Agency. When they concluded, it boiled down to several things, but primarily that the certification requirements needed to be updated to mandate that, for batteries of a certain size, the severity of thermal runaway would have to be evaluated with testing and analysis, and that designs would have to show that cell-to-cell propagation could not occur.

I think that standard sets us apart from other standards that are out there. We didn’t choose to ignore thermal runaway, nor did we try to develop a standard that requires us to completely prevent thermal runaway altogether (at a fundamental level that’s not possible anyway). Think about it… An effective thermal management system can make sure that thermal failure does not happen. A first-rate structural design effectively ensures a mechanical failure will not occur. Instituting effective battery management systems prevents electro-chemical abuse situations. What can’t be prevented is the possibility of a latent defect, of foreign object debris, or of something in general being inside the cell that can cause an internal short circuit. Therefore NASA chose to start assuming that thermal runaway could happen, and would happen, and then to design a thermal management system that both prevents this from being a catastrophic event and prevents cell-to-cell propagation. In other words, we assume it will happen and we plan for it, we give the heat somewhere to go, and we have a thermal management system designed such that when heat is released it doesn’t force neighboring cells into thermal runaway.

However, we quickly realized that although there were several calorimetric test methods available for characterizing the heat generated from thermal runaway, there was nothing out there to help us discern the energy fractions. That’s what you really care about upfront. The fraction of energy remaining in the cell that can conduct to neighbor cells, combined with cell spacing and interstitial material selection, dictates how hot neighbor cells will get. The fraction of ejected heat is important as well; you’ve got to give it a place to go without it landing on neighbor cells and causing them to go into thermal runaway. With this newfound motivation to understand the thermal runaway energy fractions, NASA also decided to develop a new testing technique. Part of the 2016 to 2017 timeline was the invention of the FTRC device that I mentioned earlier. The FTRC was developed as an NESC-led and sponsored assessment, with collaborators at Johnson Space Center and at SAIC. The completed FTRC enabled us to perform rapid turnaround experiments, where we were able to determine not just the total energy release, but also how much heat comes through the cell casing vs. how much heat is ejected away. This new testing technique ultimately gave us the information we needed to start addressing our new safe battery design criteria.
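The bookkeeping behind an energy-fraction result can be sketched in a few lines. This is an assumed, simplified illustration and not the FTRC data-reduction procedure: the function name, the two-way split into "casing" and "ejected" energy, and the sample numbers are all hypothetical.

```python
def energy_fractions(casing_kj, ejected_kj):
    """Split a total energy release into casing and ejected fractions.

    casing_kj:  heat that came through the cell casing (hypothetical input, kJ)
    ejected_kj: heat carried away by ejected material (hypothetical input, kJ)
    """
    if casing_kj < 0 or ejected_kj < 0:
        raise ValueError("energies must be non-negative")
    total_kj = casing_kj + ejected_kj
    if total_kj == 0:
        raise ValueError("total energy release must be positive")
    return {
        "total_kj": total_kj,
        "casing_fraction": casing_kj / total_kj,
        "ejected_fraction": ejected_kj / total_kj,
    }


if __name__ == "__main__":
    # Made-up numbers for a single test, not measured data.
    print(energy_fractions(casing_kj=20.0, ejected_kj=40.0))
```

In a pack design, the casing fraction is the part that feeds conduction to neighbor cells, while the ejected fraction is the part that needs a path away from them.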

You recently co-authored a paper, Decoupling of Heat Generated From Ejected and Non-ejected Contents of 18650-format Lithium ion Cells Using Statistical Methods (DOI: . What can you tell us about the paper?

Using the FTRC device, our team tested a number of cells ranging from 2.4 amp hours up to 3.5 amp hours with a number of features considered as controlled variables: capacity, casing thickness, bottom vent versus non-bottom vent cells, varied triggering mechanisms, etc. For each of the categories, we ran up to 10 FTRC experiments and documented thermal runaway response (i.e. total energy release and the energy release fractions). It was very important to run enough experiments to characterize the total range of expected outcomes. Recall that no two thermal runaway events are the same. Decomposition reactions can go to different levels of completion. Sometimes the cell doesn’t fail out at the top or bottom the way it’s supposed to. Sometimes there’s a sidewall rupture or a spin groove breach. Sometimes there is a bottom rupture for a cell that’s not a bottom vent cell. Each of these failure modes leads to a different thermal response. This is why we have been expounding in our papers and presentations that no two thermal runaway events are alike. When considering thermal runaway for a given cell, one shouldn’t consult a single experiment, or two or three experiments averaged together. It’s necessary to look at the statistical variability on a test-to-test basis for a single cell configuration and to then use that gained understanding of the variability to inform your design.

In addition to determining the energy release fractions, we also used the FTRC results as inputs for a regression model; the final model was capable of predicting the overall range of expected thermal runaway behavior for any of the cell configurations considered by our study. Specifically, the regression model was used to analyze the probability density of total energy release for each cell configuration. There was the best case scenario, that is, the least amount of energy ejected; there was the worst case scenario, or the greatest amount of energy ejected. The model also predicted the highest probability event. With the data presented in our paper, users are not limited to considering a single thermal runaway event; they are provided with the inputs necessary for analyzing the impacts of the entire range of possible thermal runaway outcomes.
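The idea of designing against a range of outcomes rather than a single test can be illustrated with a small sketch. It is not the regression model from the paper: it simply assumes that repeated total-energy measurements for one cell configuration are roughly normally distributed, and the ten sample values are invented for illustration.

```python
from statistics import NormalDist


def expected_range(total_energies_kj, lower_q=0.05, upper_q=0.95):
    """Summarize repeated tests as a low, most likely, and high energy release.

    Assumes (for illustration only) that the samples follow a normal
    distribution; real thermal runaway data may not.
    """
    dist = NormalDist.from_samples(total_energies_kj)
    return {
        "low_kj": dist.inv_cdf(lower_q),    # optimistic end of the range
        "most_likely_kj": dist.mean,        # peak of the assumed density
        "high_kj": dist.inv_cdf(upper_q),   # pessimistic end of the range
    }


if __name__ == "__main__":
    # Ten hypothetical test results for one cell configuration, in kJ.
    samples = [58.0, 61.5, 63.2, 59.8, 66.1, 62.4, 60.3, 64.7, 57.9, 62.0]
    print(expected_range(samples))
```

A thermal design sized only for the most likely value would be undersized for the high end of the range, which is the point of testing enough cells to see the spread.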

Another outcome of this paper was that we demonstrated that certain variables and mechanical design features can have an influence on the thermal runaway profile. The casing thickness of the cell matters. The venting features matter. The way the cell fails matters. It all impacts the heat generated and where that heat goes. If you want to design an optimized battery system with adequate thermal management, but one that is also optimized from a cell-spacing perspective, this is important information.

The paper covered certain aspects of thermal runaway that I think should be considered when designing a battery system to be safe and to prevent cell-to-cell propagation (i.e. the statistical variability of total energy release and the energy fractions). From an analysis driven design perspective, to meet the aforementioned goals/strategies, several questions come up. What should the cell spacing be? How far apart do they need to be? Do I use an interstitial insulation? Do I use a heatsink? How do I move the heat around to prevent cell-to-cell propagation? What is the best way to optimize that design? Particularly in space exploration and aerospace in general, mass and volume are essential. You don’t want unnecessary mass with your thermal management system, but at the same time you need your safety margin. The right balance is imperative.

What technologies need to be developed to engineer safer lithium ion batteries and where do we go from here?

We need to continue developing technology that helps us understand thermal runaway until we can effectively prevent thermal runaway altogether. Until we can prevent it, we have to plan for it. We learned a lot about thermal runaway with our initial FTRC study, but there is still much to be learned. Step one is to continue learning more about the impacts of cell design features on the thermal runaway response through testing of statistically significant quantities of cells.

Right now, we’re limited by the mass and volume impacts of battery thermal management systems designed to protect against thermal runaway. General cell protection components, heat sinking, insulating, and/or cooling: all of that gets heavy. When you’re talking about automobiles, military applications, and spacecraft, where mass and volume are major drivers, that’s a problem and you’re forever limiting your potential energy and power densities. But what if we didn’t need a thermal management system designed for thermal runaway protection? There are companies out there developing technologies and new cell level materials that effectively prevent thermal runaway altogether or make it very unlikely to happen. With the threat of thermal runaway further reduced or removed from the equation altogether, could we significantly reduce the mass of battery thermal management systems, and unlock the potential for higher energy and power densities? My hope is that one day we will be able to buy Li-ion cells right off the shelf with the materials built in that reduce the overall risk of thermal runaway or even prevent thermal runaway altogether.