AI Geospatial Wildfire Risk Prediction

Wildfires are unplanned fire events that occur both naturally and as a result of human activity. In the United States, they are responsible for several billion dollars in costs each year [1], and in 2020 alone they burned over 10 million acres [2]. Much of the cost comes from prevention efforts. For example, the state of California recently approved a wildfire prevention budget of over $1.5 billion [3]. Deciding where to allocate that spending requires knowing which areas are at high risk, so in this article, I present a framework for producing up-to-date maps of fire risk using deep learning image processing and geospatial datasets.

The first section of this article presents some background on the field of Remote Sensing as well as the tool Google Earth Engine which was used extensively to create the dataset in this project. If you are already familiar with these topics, or just want to skip to the project, feel free to scroll down to AI Geospatial Wildfire Risk Prediction which covers Preparing the Dataset and training a Deep Learning Model.

This project was conducted as part of a graduate level class ECE-471: Machine Learning Topics in Remote Sensing and Earth Observation at The Cooper Union.

Remote Sensing and Earth Observation

For the uninitiated, remote sensing is the science of obtaining information from a distance, usually using airplanes or satellites. Often, it is thought of in the context of satellites observing the earth; one common example would be the satellite view on Google Earth.

The sensors themselves range from passive sensors, which detect natural light much like a camera, either emitted or reflected, to active sensors, which emit their own energy and observe the reflections, like radar or lidar. The different imaging frequencies of light are called spectral bands. Satellites orbit around the earth imaging swaths of land as they go, as portrayed in the video below. Because of this pattern, it takes a certain amount of time to image the entire Earth.

NASA’s Landsat polar orbit imaging swaths revisiting each point every 16 days. Retrieved from NASA at

The period of time elapsed between consecutive images of the same point on earth is known as the revisit rate. Generally, there is a trade-off between revisit rate (temporal resolution) and spatial resolution, which refers to the size of a pixel on the ground. For example, the popular MODIS satellite has about a 2-day revisit rate and a resolution of 250 to 1000 meters depending on the spectral band [4]. For comparison, Landsat-8 has a resolution of 30 meters in the visible light bands, but a revisit rate of about 16 days [5].

Google Earth Engine

Google Earth Engine (GEE) is a powerful interface that lets you access over thirty years' worth of satellite imagery and other geospatial data from anywhere on earth [6]. I highly recommend you explore their datasets.

In the code below, I show how easy it is to pull some data and plot it on an interactive map. First, you have to sign up for a developer account with GEE. Make sure you have geemap and ee installed (ee comes preloaded on Google Colab but geemap does not), and authenticate your account. Then find the code for a dataset you are interested in; GEE calls these Image Collections. For this demo, I chose MODIS using “MODIS/006/MCD43A4”. This collection contains every image in the public catalog (years of data covering the entire earth), so you need to apply some filters before you can retrieve any images.

In remote sensing and earth observation, it is common to use composite images, which combine multiple images to create a new one. For example, if a cloud happens to obstruct a portion of your image on one particular day, taking a median composite over the span of multiple days will give you a clean and cloudless image. Mean and median composites are the most common examples, but you can get creative. For example, if you are making a flood map, you might try a “wettest pixel” composite using a clever combination of bands that indicate the presence of water.
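To make the idea of a composite concrete, here is a minimal sketch in plain NumPy rather than GEE. It assumes the images are already stacked into one array along a time axis and that a cloud mask is available; the function name `median_composite` and the toy values are illustrative only.

```python
import numpy as np

def median_composite(stack, cloud_mask):
    """Per-pixel median over time, ignoring cloudy observations.

    stack:      float array of shape (time, height, width)
    cloud_mask: bool array of the same shape, True where a pixel is cloudy
    """
    masked = np.where(cloud_mask, np.nan, stack)  # hide cloudy observations
    return np.nanmedian(masked, axis=0)           # median of what remains

# Toy example: 3 days of a 2x2 scene whose true reflectance is 0.2 everywhere.
stack = np.full((3, 2, 2), 0.2)
cloud_mask = np.zeros_like(stack, dtype=bool)
stack[1, 0, 0] = 0.9        # a bright cloud on day 2 over the top-left pixel
cloud_mask[1, 0, 0] = True

composite = median_composite(stack, cloud_mask)
print(composite)
```

Even without the mask, the median would reject the single bright outlier here, which is why median composites are a popular way to get a cloud-free image.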

Running the above code in a Jupyter environment like Colab will pull up an interactive map like the one below. In just a few lines of code, we have access to satellite images of the entire earth! Here is a median composite of the contiguous United States (CONUS) in June of 2020. Note that this particular image collection does not show the open ocean; the lighter blue on the edges is just the default map in the background.

MODIS imagery over the contiguous United States, June 2020 [4]. Visualization by author.

GEE also lets you export these images in common formats including geotiff. You can specify the resolution, projection, and a region of interest which can be defined using formats like geojson. This will come in handy for creating a dataset formatted for Machine Learning (ML).

Lastly, it is hard to talk about processing geospatial datasets without bringing up GDAL [7], but this tool is incredibly powerful and can get really complicated really fast! Rather than dive into it here, I’ll link you to an excellent 3 part tutorial by Robert Simmon [8].

AI Geospatial Wildfire Risk Prediction

The goal of this project is to use the expansive geospatial datasets available on GEE to create a map that rates wildfire risk and danger levels across the United States. Approaching this as a pixel-wise classification problem, there are two big steps to discuss. First is the preparation of the dataset, which involves the selection of input geospatial data, the annotation of training labels, and preparing the data for ML using GIS software. Second is the training of a deep neural network to perform the pixel-wise classification, also called semantic segmentation. Finally, we can visualize the results and discuss the model’s performance.

i) Preparing a Dataset

Input: GEE Datasets

The input dataset consists of various geospatial data from the GEE catalog. The first and perhaps most straightforward image collection for this task was MODIS (shown above) [4], whose seven optical bands provide daily coverage of the United States at a resolution of 500m. It was chosen due to its high revisit rate which will allow the wildfire hazard prediction tool to be regenerated weekly.

The second catalog is a collection of meteorological data derived from GRIDMET, the University of Idaho’s Gridded Surface Meteorological Dataset [9]. The data are available daily at a resolution of 4000 meters, and contain information like maximum surface temperature, minimum humidity, standardized precipitation index (SPI), and Evaporative Demand Drought Index (EDDI).

GRIDMET maximum surface temperature over the contiguous United States [9]. Visualization by author.

The third catalog we considered was the LANDFIRE mean fire return interval (MFRI) [10]. Assembled by the U.S. Department of Agriculture’s Forest Service, and the U.S. Department of Interior’s Geological Survey, this map categorizes each 30 meter pixel based on the average number of years between wildfires using historical trends. The values are presented as categorical data, binned in 5 year intervals. While this map was only published once, the nature of the data keeps it relevant even several years later. The problem here of course is that this layer of input data will not change over time, but it should still help the model learn regional trends. In other words it is very useful for increasing spatial accuracy, but not for temporal accuracy. This tradeoff is discussed in more detail in part (iii).

LANDFIRE mean fire return interval over the contiguous United States [10]. As presented in this map, darker spots see fires more frequently. The map says nothing about the severity of fires. Visualization by author.

The fourth and final collection considered for the input dataset is the USDA National Agricultural Statistics Service Cropland Data Layer [11]. It is a 30 meter resolution land cover classification map, classifying each pixel into one of over a hundred categories corresponding to different crops as well as various non-cultivated land types. This dataset will prove useful due to the way we measure wildfire hazard, which considers croplands to be “non-burnable,” in the same category as perennial snow/ice and bare ground. This may lead to some land that is difficult to classify, as it might appear burnable, but wouldn’t be considered so by our dataset. To counteract this, we manually created a binary mask categorizing each pixel as either cultivated or non-cultivated. (In the end, we noticed that the model actually performed just as well, if not better, without this additional layer.)

National Agricultural Statistics Service Cropland Data Layer [11]. Though we implemented a binary layer for cultivated and non-cultivated land, this map shows all the available categories. Visualization by author.

Output: Wildfire Hazard Potential

Having a bunch of data to predict fires is great, but how are we going to learn what is actually correlated to fire risk? We need some reference map that quantifies fire danger. Obviously we can’t use images of places that did burn, because that doesn’t tell us much about the conditions preceding the fire. We considered finding regions that burned, mapping out their boundaries, and considering the weeks preceding the fire to be examples of high fire danger. While this approach could have good results, manually searching for fires and scripting a way to recover images from before the fire would be tedious, and would likely limit us to hot-spots like California. Furthermore, there would be no nuance for severity of fires.

Searching for an alternative fire hazard dataset, we came across this, the USDA Forest Service’s Wildfire Hazard Potential (WHP) map [12].

2020 Wildfire Hazard Potential map “to depict the relative potential for wildfire that would be difficult for suppression resources to contain” [12]. Image by USDA Forest Service.

The WHP map is a raster geospatial dataset produced to “help inform evaluations of wildfire hazard or prioritization of fuels management needs across very large landscapes.” The goal is not just to quantify how likely wildfires are, but also to take into account how intense, destructive, and hard to contain they would be.

“Areas mapped with higher WHP values represent fuels with a higher probability of experiencing torching, crowning, and other forms of extreme fire behavior under conducive weather conditions.” [12]

The map was published as a geospatial dataset for three years: 2014, 2018, and 2020. It is available either as categorical values, with 5 levels of fire hazard (very low, low, moderate, high, and very high) and 3 other categories (non-burnable, water, and developed), or as continuous integer values. While it is not available as part of the GEE catalog, the dataset is in the public domain and we were able to download it from USDA research data archives.

Processing: Preparing data for ML

We now have several input datasets that can be easily combined for both training and running models, as well as an output dataset that can serve as labels for a supervised learning algorithm. Great! But one crucial step is still missing: the data is not at all formatted for ML, and especially not for algorithms designed for image processing.

The first thing we have to do is filter all the raster datasets by month and make sure they are properly lined up, in the same coordinate reference system (CRS), and at the same resolution. This is made rather simple with GDAL and the gdalwarp command. We chose the common EPSG:4326 as our CRS (it is used in GPS), and 500 meters as our resolution. (Technical note: when resampling a categorical raster dataset, make sure to use “near”, “mode”, or a similar resampling algorithm which preserves input values, NOT “bilinear”, “cubic”, etc.)
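The technical note about categorical rasters is easy to demonstrate. The sketch below is a plain NumPy analogue, not GDAL itself: it downsamples a small class raster by a factor of 2 using either the most common class per block (in the spirit of “mode”) or the block average (in the spirit of interpolating methods like “bilinear”). The function names and the toy raster are made up for illustration.

```python
import numpy as np

def downsample_mode(raster, factor):
    """Downsample an integer class raster, keeping the most common class per block."""
    h, w = raster.shape
    blocks = raster.reshape(h // factor, factor, w // factor, factor)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(h // factor, w // factor, -1)
    out = np.empty(blocks.shape[:2], dtype=raster.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # bincount counts each class id; ties go to the lowest class id
            out[i, j] = np.bincount(blocks[i, j]).argmax()
    return out

def downsample_mean(raster, factor):
    """Downsample by averaging each block (only meaningful for continuous data)."""
    h, w = raster.shape
    blocks = raster.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Toy class raster containing only classes 1 and 5.
classes = np.array([[1, 1, 5, 5],
                    [1, 5, 5, 5],
                    [1, 1, 1, 5],
                    [1, 1, 5, 5]])

by_mode = downsample_mode(classes, 2)   # only classes 1 and 5 appear
by_mean = downsample_mean(classes, 2)   # values 2 and 4 appear out of nowhere
print(by_mode)
print(by_mean)
```

Averaging invents the values 2 and 4, which would be read as entirely different categories, while the mode only ever returns classes that were present in the input.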

The second step is to cut up the entire CONUS into smaller images which can be processed by a deep learning model, and pair up the feature images with their labels. For the model to work, all input images must be of the exact same size, which is harder than it sounds when working within the quirks of projection systems — there are millions of ways to flatten spheres, and none of them are really that good! We used the online tool GeoJSON Grid Creator to generate a grid that perfectly tiled the United States, then manually removed any squares outside of US borders. We aimed for each tile to be roughly 256km across, or 512 pixels at 500m per pixel. The final grid pictured below contains 177 tiles.

Full grid across the CONUS. Tiles are approximately 256x256km. Image by author.
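We built our grid with an online tool, but the same idea can be sketched in a few lines of Python. The snippet below is an illustration, not the procedure we used: the bounding box is a rough guess at the extent of the CONUS, the tile size in degrees is arbitrary, and `make_grid` is a hypothetical helper. It covers the whole bounding box, so tiles outside US borders would still have to be removed afterwards.

```python
import json
import math

def make_grid(west, south, east, north, tile_deg):
    """Return a GeoJSON FeatureCollection of square tiles covering a bounding box."""
    n_cols = math.ceil((east - west) / tile_deg)
    n_rows = math.ceil((north - south) / tile_deg)
    features = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = west + col * tile_deg
            y0 = south + row * tile_deg
            x1, y1 = x0 + tile_deg, y0 + tile_deg
            # Closed ring, counterclockwise: first and last points are identical
            ring = [[x0, y0], [x1, y0], [x1, y1], [x0, y1], [x0, y0]]
            features.append({
                "type": "Feature",
                "properties": {"row": row, "col": col},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            })
    return {"type": "FeatureCollection", "features": features}

# Assumed, approximate bounding box and an arbitrary 2.5 degree tile size.
grid = make_grid(west=-125.0, south=24.0, east=-66.0, north=50.0, tile_deg=2.5)
print(len(grid["features"]), "tiles")
geojson_text = json.dumps(grid)  # ready to save as a .geojson file
```

Because the tiles are defined in degrees, their size in kilometers varies with latitude, which is one of the projection quirks mentioned above.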

With the grid ready, we can now loop through every tile in the dataset and export the region as a geotiff with each of the feature layers stacked, along with a paired geotiff with the labels of the WHP map. We did this using median composites over three summer months (the height of the fire season in most of the country), June, July, and August, for each of the three years the WHP map was available, 2014, 2018, and 2020. The final dataset contains images with 13 layers of geospatial data over the entire CONUS 9 times over.

ii) Deep Learning Model

What is the goal?

Before we get into the model, I want to quickly talk about our objectives. Our initial goal was to have an algorithm that could generate wildfire risk maps weekly over the CONUS. We knew early on that finding labeled data for what was an “at risk area” was going to be hard. Not only is the process of annotating every pixel over the US at any resolution difficult, but what constitutes fire danger is nuanced and complicated, and we are not well versed enough in the subject to make that judgement ourselves. Finding the WHP map solves this issue for us, but from an ML perspective, it has its problems. The biggest is that it is generated once every few years and does not correspond to any particular point in time. If we generate a new map every week, or even every month, we are really stretching the intended use of the WHP map.

Essentially, we are performing temporal interpolation by assuming that the WHP map corresponds best to observation of the US at the peak of fire season, in summer months, and hoping that the model will learn enough about geographically different locations to apply those nuances to different times. In other words, we want it to learn temporal differences from spatial differences. This will lead to somewhat of a domain gap at inference time if we try to predict fire hazard during different seasons. A domain gap in ML is when the data used for training are dissimilar from the data used at inference. While we know this will make the model imperfect, we still hope to see how it performs, and perhaps learn something for future attempts.

A Neural Network for Semantic Segmentation

The task of labeling every pixel in an image according to more than 2 classes is known as multi-class semantic segmentation. The U-Net introduced by O. Ronneberger, P. Fischer, and T. Brox in 2015 is an excellent neural network model for this task [13]. While originally used for binary segmentation of cells in microscope images, it has been shown to generalize well to other segmentation tasks including multi-class segmentation of remote sensing data. The architecture is pictured below.

Architecture of the U-Net as presented in the original paper. Diagram by Ronneberger et al.

In this example, input images are 572×572. They pass through convolutional layers, increasing the number of channels. They are then contracted by a factor of 2 using a max pool operation, where they pass through another set of convolutional layers. This goes on 3 more times, after which the “images” or feature maps are then passed back up, using an up-convolution to invert the max pool operation. The images from the contracting path are appended to the corresponding images in the expanding path to help preserve the shapes and content of the original inputs.
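To see how the image size evolves along the two paths, the short sketch below traces the spatial dimension through a U-Net like the one in the original paper, which uses unpadded 3×3 convolutions (two per level) and four max pool steps. The function `unet_output_size` is a hypothetical helper written for this illustration; it only tracks sizes and contains no actual network.

```python
def unet_output_size(size, depth=4, convs_per_level=2, kernel=3):
    """Trace the spatial size through a U-Net built from unpadded convolutions."""
    shrink = convs_per_level * (kernel - 1)  # each unpadded conv trims kernel-1 pixels
    for level in range(depth):               # contracting path
        size -= shrink
        if size % 2:
            raise ValueError(f"size {size} at level {level} is not divisible by 2")
        size //= 2                            # 2x2 max pool halves the size
    size -= shrink                            # convolutions at the bottom of the "U"
    for _ in range(depth):                    # expanding path
        size = size * 2 - shrink              # up-convolution doubles, convs trim
    return size

print(unet_output_size(572))
```

For the 572×572 input of the paper, this gives the 388×388 output shown in the diagram. The check inside the loop also shows why the size must be even before every max pool step.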

Our implementation of the model using Tensorflow and Keras differs slightly to support the multiple input channels, the different image dimensions, and the multiple classes at the output. The full model can be initialized using a function, as shown below.

Our Adjustments

To support the sizes of our images, we had to make another important set of changes. Because the images have to be appended to one another, the dimensions in the contracting and expanding paths have to match exactly. For this to happen, the image dimensions must be cleanly divisible by 2 before each contraction. Our images were 513×565, clearly not divisible by 2, let alone 16. Using GDAL to read the geotiffs into a numpy array, we then pad the images to a dimension of 528×576, the next available multiples of 16.
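A minimal sketch of this padding step with NumPy is shown below. It assumes the array is laid out as (height, width, channels) and that the padding is added at the bottom and right edges with a constant fill value; `pad_to_multiple` is an illustrative helper, and the real pipeline may place the padding differently.

```python
import numpy as np

def pad_to_multiple(image, multiple=16, fill=0):
    """Pad the first two axes up to the next multiple of `multiple`."""
    h, w = image.shape[:2]
    pad_h = (-h) % multiple   # rows needed to reach the next multiple
    pad_w = (-w) % multiple   # columns needed to reach the next multiple
    # Pad only height and width; leave any trailing axes (channels) untouched
    pad_width = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad_width, mode="constant", constant_values=fill)

tile = np.ones((513, 565, 13), dtype=np.float32)  # stand-in for one 13-layer tile
padded = pad_to_multiple(tile)
print(padded.shape)
```

With four contractions in the network, a multiple of 2⁴ = 16 guarantees that the size stays even before every max pool step.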

We also implemented a weighted loss function to prevent the model from completely ignoring the less common classes like Very High Risk or Water. It does so by making the cost of a misclassification inversely proportional to the class’s prevalence. The class distribution is shown below; note how the padding has added empty pixels covering about 12.6% of the image.

Pixel-wise distribution of the classes in the padded and labeled images. Image by author.
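The idea behind the weighting can be sketched with NumPy as follows. This is one common way to build inverse-frequency weights, not necessarily the exact scheme in our model; both function names and the normalization (weights of the classes present average to 1) are assumptions made for this example.

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Weight each class by the inverse of its share of the labeled pixels."""
    counts = np.bincount(labels.ravel(), minlength=n_classes).astype(float)
    freq = counts / counts.sum()
    weights = np.zeros(n_classes)
    present = freq > 0                        # avoid dividing by zero for absent classes
    weights[present] = 1.0 / freq[present]
    return weights / weights[present].mean()  # normalise so present classes average 1

def weighted_cross_entropy(probs, labels, weights, eps=1e-7):
    """probs: (n_pixels, n_classes) softmax outputs; labels: (n_pixels,) class ids."""
    picked = probs[np.arange(labels.size), labels]   # probability of the true class
    per_pixel = -weights[labels] * np.log(np.clip(picked, eps, 1.0))
    return per_pixel.mean()

labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])    # class 2 is rare
weights = inverse_frequency_weights(labels, n_classes=3)
uniform = np.full((labels.size, 3), 1.0 / 3.0)       # an uninformed prediction
loss = weighted_cross_entropy(uniform, labels, weights)
print(weights, loss)
```

With these weights, every class contributes the same total amount to the loss regardless of how many pixels it covers, so a rare class can no longer be ignored cheaply.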

With those important adjustments, we were able to train our model using 80% of our dataset, keeping the remaining 20% for validation and testing. We chose a batch size of 16, as our GPU didn’t have enough memory for 32. Our model converged after 20 epochs with an accuracy of 65% for both training and validation data.

Visualizing the Output

To observe the outputs of our model, we had to convert them back into geotiffs with appropriate coordinates. To do this, we save the metadata of the target images for each tile, and use it to export the array at the output of the neural network. We can then use QGIS to reassemble the tiles and display them on a map along with some additional elements like country and state borders, as well as a legend [14]. Using this method along with our model, we can now finally recreate the WHP map for years during which it was not made available!

Predicted Wildfire Hazard Potential map for July of 2019. Image by author.
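Before a prediction can be written back to a geotiff, the network output has to be turned into a single class per pixel and the padding has to be removed. The sketch below shows that step only, under the assumptions that the model outputs per-class probabilities on the last axis and that the padding sits at the bottom and right edges; `probabilities_to_class_map` is an illustrative name, and the export itself is left out.

```python
import numpy as np

def probabilities_to_class_map(probs, original_shape):
    """Pick the most likely class per pixel and crop away the padding."""
    class_map = np.argmax(probs, axis=-1).astype(np.uint8)
    h, w = original_shape
    return class_map[:h, :w]   # assumes padding was added at the bottom/right

# Toy output: a 4x4 padded prediction over 3 classes for a 3x3 original tile.
probs = np.zeros((4, 4, 3))
probs[..., 0] = 0.2
probs[..., 1] = 0.5            # class 1 wins almost everywhere
probs[..., 2] = 0.3
probs[0, 0] = [0.1, 0.1, 0.8]  # class 2 wins in the top-left pixel

class_map = probabilities_to_class_map(probs, original_shape=(3, 3))
print(class_map)
```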

From observing a few outputs during the summer months, we notice that our model is quite shy about predicting the highest level of risk. Comparing this map to the original WHP maps, we notice a lot of red missing, particularly from the westernmost part of the United States. Promisingly, however, the model classifies most of these regions as High Risk. Similarly, the model often ignores the Low Risk category, opting for Very Low Risk instead. These issues could possibly be solved by adjusting the loss function to further punish misclassifications of these underrepresented classes.

The model had little to no trouble identifying water as well as the non-burnable areas, including cultivated land in the Midwest and along the Mississippi River.

So the model performs reasonably well when presented with images from summers for which it wasn’t trained, but what would happen if we tried to predict wildfire hazard in winter months? To test this, we ran the predictions for January of 2020. Here are the results.

Predicted Wildfire Hazard Potential map for January of 2020. Image by author.

Excellent! This shows that the model doesn’t just memorize and recreate the WHP map. There is evidence that suggests the model learned to generalize to months outside of its training domain. For example, the parts of the country that were covered in snow now appear as non-burnable, which makes sense since perennial ice and snow (which are very uncommon in the training data) were considered non-burnable in the original WHP map. Another nice result is that many High Risk and Very High Risk areas in the west have been downgraded, to Moderate Risk or even lower.

While the model does show evidence of learning to generalize temporal differences from geographic ones, there are still some clear problems. For example, much of the south is now considered at higher risk than it was in the summer. Some research does suggest that states like Georgia and the Carolinas are more likely to experience fires in late winter or spring than states like California, but it still seems the model is overestimating the risk. Since the south does not experience much snow, the leafless trees and dry winter conditions (not seen during training) could be causing this.

Another couple of problems that are only apparent in the winter predictions are the noisier outputs and grid-like artifacts. In the non-burnable areas throughout the northernmost parts of the country, there appear to be some speckles of unnaturally high predictions, including a lot of single pixels of Very High Risk. The grid-like artifacts, most apparent around Oklahoma, Missouri, and Arkansas, seem to be caused by the tiling, though they don’t show up in summer predictions so there may be a way to correct them.

iii) What did we learn?

Given the daunting nature of this project, and that we only had a few short weeks to complete it, we were thrilled with the results. Learning to work with geospatial data was eye-opening, and using it to prepare a labeled machine learning dataset was anything but trivial. Working on this project has taught us a tremendous amount about both working with geospatial raster data and preprocessing data for ML.

At the time we started this project, no one to our knowledge had published models for automatically predicting wildfire risk at the scale of the entire country. While this model is far from perfect, it serves as a proof of concept. It demonstrates the possibilities for using geospatial data through APIs like GEE in deep neural networks to perform tasks on scales as large as the contiguous United States.

Geospatial data are plentiful, easy to access, and provide the perfect framework for data-hungry deep learning models. I highly encourage everyone to explore the possibilities! However, obtaining good labels can be quite difficult. Quantifying fire risk or danger remains the biggest problem with this particular task. The WHP map was a great starting point for this project, but with some expert help or even some manual labelling, the results would likely improve significantly. I look forward to revisiting this project when time permits.

I hope you enjoyed this read! If you have any questions about this project and the many implementation details I glossed over, please reach out either by leaving a comment or by contacting me directly. I’d like to thank my collaborators on this project, Kevin Kerliu and Brandon Bunt, as well as our professor Krishna Karra, who showed us the ropes for raster data, geographic information systems, and tools like GDAL, QGIS, and GEE.


[1] National Statistics (2021), National Interagency Fire Center
[2] T. Feo and S. Evans, The Costs of Wildfire in California (2020), California Council on Science & Technology
[3] A. Beam, California OKs new spending on drought, wildfire prevention (2021), Associated Press News
[4] ORNL DAAC, MODIS and VIIRS Land Products Global Subsetting and Visualization Tool (2018), NASA EOSDIS Land Processes DAAC
[5] S.C. Goslee, Analyzing Remote Sensing Data in R: The LANDSAT Package (2011), Journal of Statistical Software
[6] N. Gorelick, M. Hancher, M. Dixon, S. Ilyushchenko, D. Thau, and R. Moore, Google Earth Engine: Planetary-scale geospatial analysis for everyone (2017), Remote Sensing of Environment
[7] GDAL/OGR contributors, GDAL/OGR Geospatial Data Abstraction software Library (2022), Open Source Geospatial Foundation
[8] R. Simmon, A Gentle Introduction to GDAL (2017), Planet Stories
[9] J.T. Abatzoglou, Development of gridded surface meteorological data for ecological applications and modelling (2013), The International Journal of Climatology
[10] LANDFIRE, LANDFIRE: Existing Vegetation Type (2020), U.S. Department of Agriculture and U.S. Department of the Interior
[11] USDA National Agricultural Statistics Service, Cropland Data Layer (2020)
[12] G.K. Dillon, J. Menakis, and F. Fay, Wildland Fire Potential: A Tool for Assessing Wildfire Risk and Fuels Management Needs (2015), U.S. Department of Agriculture, Forest Service
[13] O. Ronneberger, P. Fischer, and T. Brox, U-Net: Convolutional Networks for Biomedical Image Segmentation (2015), The Computing Research Repository
[14] QGIS Geographic Information System (2022), QGIS Association
