This tutorial is about the land cover classification of several Landsat images in order to create a classification of a large study area using the Semi-Automatic Classification Plugin (SCP). For very basic tutorials see Tutorial 1: Your First Land Cover Classification and Tutorial 2: Land Cover Classification of Landsat Images.
The study area of this tutorial is Costa Rica , a Country in Central America that has an extension of about 51,000 square kilometres. In particular, we are going to classify Landsat 8 and Landsat 7 images, masking clouds and creating a mosaic of classifications. We are going to identify the following land cover classes:
First, install the SCP. For information about the installation of QGIS in various systems see Plugin Installation.
Open QGIS; from the main menu, select
Manage and Install Plugins;
From the menu
All, select the Semi-Automatic Classification Plugin and click the button
The SCP should be automatically activated; however, be sure that the Semi-Automatic Classification Plugin is checked in the menu
Installed(the restart of QGIS could be necessary to complete the SCP installation);
We are going to set some of the SCP options in order to optimize the following processes. Open the Settings clicking the button and select Settings: Processing. Now, in Classification process check the options
Use virtual rasters for temp filesand
Raster compression, in order to save disk space during the processing.
In RAM set the available RAM (in MB) for processing entering half of the system RAM; for instance it your system has 2GB of RAM enter 1024. If the system is 32bit, due to system limitations you should not enter values higher than 512MB.
In order to ease the photo interpretation in the following steps, we are going to use also the OpenLayers Plugin which allows for the display of several maps. If you don’t have already installed, follow the same steps previously described and install the OpenLayers Plugin in QGIS.
Download and Pre processing of Landsat images
We are going to download Landsat 7 and 8 images using the SCP tool Download Landsat. Landsat images are available from the U.S. Geological Survey, and these bands are downloaded through the Google Earth Engine and the Amazon Web Services. Also, we are going to convert Landsat images to reflectance and apply the DOS1 atmospheric correction (see Landsat image conversion to reflectance and DOS1 atmospheric correction).
First, we need to download the Landsat database in SCP. Open the tab Download Landsat clicking the button in the SCP menu or the Toolbar. Click the button
Select database directoryin order to define where to save the database. It is preferable to create a new directory (e.g.
LandsatDB) in the user directory. Click the button
Update databaseand click
Yesin the following question about updating the image database.
TIP : Landsat databases are updated daily, therefore when you need up to date images you should click the button
Update databasein order to the get the latest Landsat images.
Now we could define the Area coordinates of the study area, click
Find imagesand browse Landsat images. Each Landsat image has a unique ID (i.e. identifier). In this tutorial we are going to use two Landsat 8 images acquired on February 2014 (IDs: LC80150532014050LGN00 and LC80160532014057LGN00) and a Landsat 7 image acquired on March 2014 (ID: LE70150532014090EDC00); of course, more images are required for the classification of the whole Country.
With SCP, it is possible to find an image basing on the ID thereof using the Search options. In particular, in
Image IDpaste the following IDs and click
After a few seconds, the three images are listed in the Landsat images.
Before downloading the images, we need to define the options for the conversion to reflectance which will be performed automatically to downloaded images. Open the tab Landsat clicking the button in the Toolbar ; enable the options
Apply DOS1 atmospheric correctionand
Brightness temperature in Celsius. Also, leave checked
Use NoData value (image has blackborder).
TIP : check
Perform pan-sharpeningin order to perform the Pan-sharpening of Landsat images producing bands with 15m spatial resolution; of course, using pan-sharpened images increases the classification time (because a greater number of pixels need to be processed) and can increase the spectral variability.
Now, open the tab Download Landsat and uncheck the options
only if preview in Layersand
Load bands in QGIS(leave checked
Pre process imagesin order to convert bands to reflectance automatically). Click the button
Download images from listto select an output directory and start the download process (this may take a while).
When the download process is finished, several directories are created in the output directory with the name like Landsat ID, containing the original Landsat bands and the converted bands (with the suffix
Classification of Landsat Images
We are going to start the classification of the Landsat 8 image
LC80150532014050LGN00converted to reflectance. Open the directory
In QGIS, open the following bands (also with drag and drop):
- RT_LC80150532014050LGN00_B2.tif = Blue;
- RT_LC80150532014050LGN00_B3.tif = Green;
- RT_LC80150532014050LGN00_B4.tif = Red;
- RT_LC80150532014050LGN00_B5.tif = Near-Infrared;
- RT_LC80150532014050LGN00_B6.tif = Short Wavelength Infrared 1;
- RT_LC80150532014050LGN00_B7.tif = Short Wavelength Infrared 2.
Open the tab Band set clicking the button in the SCP menu or the Toolbar. Click the button
Select All, then
Add rasters to set, and then
Sort by namefor ordering bands automatically. Finally, select
Landsat 8 OLIfrom the combo box
Quick wavelength settings, in order to set automatically the center wavelength of each band (this is required for the spectral signature calculation).
TIP : click the button
Build band overviewsin order to improve display performance of bands.
In the list
RGBselect the item
4-3-2for displaying a Color Composite of Near-Infrared, Red, and Green. A temporary virtual raster of the Band set will be created in QGIS, allowing for the photo interpretation of the image.
Now we need to create
Signature list filein order to collect Training Areas (ROIs) and calculate the Spectral Signature thereof (for very basic definitions see Tutorial 1: Your First Land Cover Classification).
In the ROI Creation dock click the button
New shpand define a name (e.g.
ROI.shp) in order to create the
Training shapefilethat will store ROI polygons. The shapefile is created and added to QGIS. The name of the
Training shapefileis displayed in Training shapefile .
Also, click the button
Savein the Classification dock and define a name (e.g.
SIG.xml) in order to create the
Signature list filethat will store spectral signatures. The path of the
Signaturelist fileis displayed in Signature list file .
Now we are ready for the creation of ROIs. We are going to use the same codes for ROIs in all the Landsat images, according to the following table.
|Macroclass name||Macroclass ID|
About the basics of ROI creation see Create the ROIs. It is possible to create ROIs by drawing manually a polygon using the button or with region growing pressing the button + and then clicking the map. Use the button in ROI creation for zooming to the polygon extent of the ROI, and
Showfor showing or hiding the temporary ROI.
With the ROI Creation dock create as many ROIs as possible, assigning a unique Class ID (C_ID) to each ROI, and the Macroclass ID (MC_ID) of the corresponding Macroclass. If
Displaycursor foris checked in the ROI creation, the NDVI value of the pixel beneath the cursor is displayed in the map: it is useful for detecting vegetation pixels (characterized by high NDVI values).
TIP : change frequently the Color Composite and use the buttons and in the Toolbar for stretching the minimum and maximum values of the displayed image; also, use the button
Showfor hiding and showing the image.
ROIs are used for the calculation of spectral signatures that are used by the classification algorithm in order to classify the entire image. In this tutorial we are going to use the Maximum Likelihoodalgorithm.
After the creation of each ROI it is useful to check the Spectral Distance in order to assess the separability of ROI; in fact, each ROI should be different (i.e. spectrally distant) from the others, in order to avoid spectral confusion and achieve better classification results.
In the Signature list highlight the ROIs and click the button . Spectral signature are added to the Spectral Signature Plot.
Now click the tab Spectral distances. Each table represent the Spectral Distance of each ROI combination.
As shown in the following figure, the comparison of the Built-up ROI and the Soil ROI highlights very low Spectral Angle and Euclidean Distance; this means high similarity if we used the Spectra Angle Mapping or the Minimum Distance algorithms. The Jeffries-Matusita Distance is near 2; this means that the two ROIs are separable for the Maximum Likelihood algorithm.
Since we are using the Maximum Likelihood algorithm, it is important that the Jeffries-Matusita Distance is near 2 for each ROI combination.
Now we can create a classification preview (see Create a Classification Preview for the basics of classification previews).
In the Classification algorithm select the classification algorithm
Maximum Likelihood. In Classification preview set
Size= 500 , click the button
+and then left click in the map in order to create a classification preview. Use the
Transparencytool for changing the preview transparency and display the classification over the image.
In the Classification algorithm, click the button
+and then right click in the map for calculating the
algorithm raster. The
algorithm rasterrepresents the calculation result of the Classification Algorithms; it is useful for locating where we need to create new ROIs.
As shown in the following figure, the
algorithm rasterhas a grey scale symbology, where dark areas represent pixels that the algorithm found distant from all the spectral signatures and white areas represents pixels that are very similar to spectral signatures. In these dark areas we have a greater level of uncertainty, therefore we need to create new ROIs in order to improve the classification results.
We can notice the presence of clouds in the image. In order to avoid classification errors, we need to mask clouds.
There are several methods for masking clouds; during the classification step, a simple method for masking clouds is the creation of ROIs. Create a new ROI inside a cloud in the image, and assign a unique Class ID and the Macroclass ID equals to 0. In fact, the MC ID = 0 is used by SCP for the category
Unclassified, which means that cloud pixels are not classified (i.e. masked).
In the following image, we can see that clouds are now masked. However, pixels near the border of clouds are classified incorrectly as Built-up. In the next paragraphs, more effective methods are described for masking clouds after the classification process (see Cloud Masking).
TIP : load a service such as OpenStreetMap using the OpenLayers Plugin, which can ease the photo interpretation and the ROI creation.
When we are happy with the results of the previews, we can perform the classification of the whole image. In Classification algorithm, activate the checkbox
Use Macroclass ID. In theClassification output click the button
Perform classificationand define the name of the classification output (e.g.
We can see that part of the clouds are black (i.e. unclassified), however several cloud pixels are classified as Built-up. Also, the black border of the Landsat image is classified as Built-up. We are going to correct these errors and refine the classification in the next steps.
Now, in QGIS open the following Landsat 8 bands that are inside the directory
- RT_LC80160532014057LGN00_B2.tif = Blue;
- RT_LC80160532014057LGN00_B3.tif = Green;
- RT_LC80160532014057LGN00_B4.tif = Red;
- RT_LC80160532014057LGN00_B5.tif = Near-Infrared;
- RT_LC80160532014057LGN00_B6.tif = Short Wavelength Infrared 1;
- RT_LC80160532014057LGN00_B7.tif = Short Wavelength Infrared 2.
Repeat the above steps for the creation of the Band set, the
Signature list file.
TIP : close QGIS and create a new QGIS project for each Landsat image, in order to delete temporary files and free disk space.
Create a land cover classification repeating the steps previously described.
In a new QGIS project, open the Landsat 7 bands inside the directory
- RT_LE70150532014090EDC00_B1.tif = Blue;
- RT_LE70150532014090EDC00_B2.tif = Green;
- RT_LE70150532014090EDC00_B3.tif = Red;
- RT_LE70150532014090EDC00_B4.tif = Near-Infrared;
- RT_LE70150532014090EDC00_B5.tif = Short Wavelength Infrared 1;
- RT_LE70150532014090EDC00_B7.tif = Short Wavelength Infrared 2.
You can see that this image covers the same area as the Landsat 8 image
LC80150532014050LGN00. In fact, we are going to use the classification of this Landsat 7 image in order to fill the Unclassified pixels of the Landsat 8 image.
Again, create a land cover classification following the steps previously described.
Now, we have 3 land cover classifications that we can enhance in several ways.
Enhancement of Classification Using NDVI
We are going to calculate NDVI for enhancing the classification using the Band calc (see Tutorial: Using the tool Band calc). In particular, pixels where NDVI value is above a certain threshold will be classified as vegetation (code 2). Below this NDVI threshold, the Maximum Likelihood classification is untouched.
Of course, this is an example of integration of ancillary data; we could use other data such as other vegetation indices or the result of other classifications (e.g. using Spectra Angle Mapping).
Now, in QGIS load the bands of the Landsat 8 image
LC80150532014050LGN00and the respective land cover classification. Open the Band calc and click the button
Refresh list. In the Band calc, calculate the NDVI copying the following Expression:
Click the button
Calculate, select where to save the NDVI (e.g. a new file named
Then, calculate the following Expression for enhancing the classification basing on the NDVI:
Click the button
Calculate, and select where to save the new classification (e.g.
classification_1_NDVI.tif). We can see in the following figure that the area classified as vegetation has increased.
In this case we have used a NDVI threshold equals to 0.6 . However, the threshold value has to be chosen for every image, because NDVI can vary from image to image.
Now we perform the same enhancement for the other land cover classifications. For the Landsat 8 image
LC80160532014057LGN00calculate NDVI with the following expression:
and the following expression for enhancing the classification:
For the Landsat 7 image
LE70150532014090EDC00calculate NDVI with the following expression:
and the following expression for enhancing the classification:
Now that the classification of vegetation has been enhanced for the three images, we are going to mask clouds and border pixels in order to avoid classification errors.
Landsat 8 images include Quality Assessment bands (QA) that are useful for identifying clouds. Pixel values of QA bands are codes that represent combinations of surface and atmosphere conditions. These values indicate with high confidence cirrus or clouds pixels (for the description of these codes see the table at http://landsat.usgs.gov/L8QualityAssessmentBand.php ).
The QA band of the Landsat 8 image
LC80150532014050LGN00includes mainly the values 53248 and 61440 indicating clouds, and the value 36864 indicating potential clouds. Therefore, we are going to write an expression that masks our classification (i.e.
classification_1_NDVI) where pixels of the QA band are equal to one of these values.
In QGIS, open the band
LC80150532014050LGN00_BQAthat is inside the directory
LC80150532014050LGN00of the downloaded Landsat image and the
classification_1_NDVI. Copy the followingExpression in the Band calc:
Click the button
Calculate, and select where to save the new classification (e.g.
Clouds are almost completely masked (i.e. Unclassified); however, some pixels are still classified as Built-up (in red). We can do the same for the image
LC80160532014057LGN00using the following Expression in the Band calc:
The Landsat 7 image does not have the QA band. Another method for masking clouds uses the Blue and the Thermal Infrared (converted to temperature) bands, basing on the fact that clouds are generally colder than soil and have high reflectance values in the blue band. Landsat 7 is also affected by black stripes (i.e. SLC-off) that we are going to mask as well.
We are going to create an expression that identifies pixel values below a certain temperature threshold for the Thermal band (band 6 for Landsat 7), and above a certain reflectance threshold for the Blue band (band 1).
In QGIS load all the Landsat bands inside the directory
LE70150532014090EDC00_converted. Use the following expression in the Band calc:
The first part (
("RT_LE70150532014090EDC00_B6_VCID_1"<23) & ("RT_LE70150532014090EDC00_B1">0.1)) means that we are going to mask pixels that have both temperature lower than 23°C and Blue band reflectance greater than 0.1 . These threshold values have been identified in the image, using the tool
Identifyof QGIS for cloud pixels in band 1 and band 6.
or, so that the other expressions (e.g.
"RT_LE70150532014090EDC00_B1" == 0) identify pixel values equal to 0 (which are NoData) for every Landsat band, in order to mask the black stripes due to SLC-off and the black border.
We could use the same method of cloud masking also for Landsat 8 images. For the image
LC80150532014050LGN00load the bands
RT_LC80150532014050LGN00_B2, and use the following Expression in the Band calc:
"RT_LC80160532014057LGN00_B2" == 0allows for the masking of the image black border.
As you can see, there are still gaps (Unclassified pixels) in the classification; we would require the classification of other Landsat images in order to fill those gaps. After the cloud masking of these three classifications, we can create one mosaic that is the classification of the whole study area.
Part of the unclassified gaps has been filled with the Landsat 7 classification. Of course, we would require more classifications in order to fill all the gaps.
Mosaic of Classifications
In order to create a mosaic of classifications, we are going to write an expression that will fill Unclassified pixels of the Landsat 8 image (ID
LC80150532014050LGN00) with the classification of the Landsat 7 image (ID
LE70150532014090EDC00). Also, we are going to merge these classifications to third one (the Landsat 8 image with ID
Uncheck the checkbox
Intersectionin Output raster and click
Calculate. The result (e.g.
classification_mosaic) is shown in the following image.
In the following steps we are going to perform the accuracy assessment and the estimation of land cover area.
Accuracy Assessment is an important step of a land cover classification. In this tutorial we are going to use the
Training shapefileas reference for assessing classification accuracy. However, there other methods that can improve the validation reliability (see http://fromgistors.blogspot.com/2014/09/accuracy-assessment-using-random-points.html ).
In QGIS, load the classification mosaic and the
Training shapefileused for the image
LC80150532014050LGN00. In SCP open the tab Accuracy and click the buttons
Refresh list. Select
classification_mosaicas the classification to assess and the
Training shapefileas reference shapefile. Also, select
Shapefile field. Click
Calculate error matrixand choose the output destination (e.g.
The process produces an
error matrixand an
error rasterwhich are useful for assessing the quality of our classification.
Clip of the Classification
Before calculating the area of each land cover class, we need to clip the classification to the extent of the study area, which is Costa Rica.
Download the Shapefile of Sub-National Administrative Units of Costa Rica from http://data.fao.org/map?entryId=c7a0f990-88fd-11da-a88f-000d939bc5d8&tab=metadata (clicking the Download button) by the FAO .
Extract the downloaded file (
1173.zip) and load the shapefile
costa rica.shpin QGIS (select WGS84 as projection).
In this case, we need to define the projection of this shapefile. In QGIS, open the command
Vector > Data management tool > Define current projection; select the shapefile
Input vector layerand choose
EPSG:4326 - WGS 84as spatial reference, and click
Now we can clip the
classification_mosaic.tif. Load the classification in QGIS. Open the command
Raster > Extraction > Clipper. Select the
classification_mosaicas input raster; set the output file (e.g.
classification_clip.tif), and set
No data valueequals to 0. In
Mask layerand select
costa rica, then click
Finally, we have a classification clipped to the extent of Costa Rica (as you can see we would need other classifications for covering the whole extent of Costa Rica), and we can calculate the classification report.
In SCP open the tab Classification report and click the buttons
Refresh list. Check
Use NoData valuesetting the value equals to 0 and click the button
Calculate classification report. The classification report is displayed with the count of pixels, the area, and percentage of each land cover class. You can save the report to text file clicking the button
Save report to file.
We have completed this tutorial about the land cover classification of a large area, using multiple Landsat images and creating a classification mosaic. It is worth pointing out that classification results depend on the season of the images. Therefore, the input images should be acquired in the same period, in order to avoid differences due for instance to the phenological state of vegetation or occurred land cover change.