Updated tutorial at https://fromgistors.blogspot.com/2016/09/semi-automatic-classification-pluginv5.html
This basic tutorial illustrates how to perform a supervised classification of land cover using the Semi-Automatic Classification Plugin (SCP) 3.0 "Rome" for QGIS.
We are going to classify a subset of a Landsat 8 image acquired over Rome, Italy, on June 12, 2014 (data available from the U.S. Geological Survey). Using a semi-automatic approach, we are going to rapidly classify the image and estimate the land cover area in only six phases.
First, I would like to congratulate the QGIS developers on the release of QGIS 2.4.
A very brief installation guide for QGIS and the SCP follows:
- For Windows, download the QGIS installer from here (preferably the 64 bit version); then install the SCP from the plugin manager;
- For Linux, install QGIS and the python-scipy package; then install the SCP from the plugin manager;
- For Mac OS, install QGIS (and GDAL) from here and the python modules Numpy, Scipy, and Matplotlib from here; then install the SCP from the plugin manager.
The sample dataset of this tutorial is available for download from here. The zip file can be extracted with any file archiver software (for instance the open source 7-zip).
The dataset includes the metadata file (MTL.txt) and the following Landsat 8 bands (16 bit raster):
- Band 2 = Blue;
- Band 3 = Green;
- Band 4 = Red;
- Band 5 = Near-Infrared;
- Band 6 = Short Wavelength Infrared 1;
- Band 7 = Short Wavelength Infrared 2.
The objective of this tutorial is to classify the following land cover classes:
- Class 1 = Water (e.g. surface water);
- Class 2 = Vegetation (e.g. grassland or trees);
- Class 3 = Built-up (e.g. artificial areas, buildings and asphalt);
- Class 4 = Bare soil (e.g. soil without vegetation).
The tutorial phases are illustrated below, each with a brief description.
1. Conversion of raster bands from DN to Reflectance
The pre processing phase is required before the actual image processing in order to improve the classification results.
SCP allows for the automated conversion of Landsat DN (i.e. Digital Numbers) to the physical measure of Top Of Atmosphere reflectance (TOA). Also, SCP implements an image-based atmospheric correction using the DOS1 method (Dark Object Subtraction 1).
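To make the conversion more concrete, here is a minimal Python sketch of the standard Landsat 8 rescaling from DN to TOA reflectance. It is not the SCP code: the function name is hypothetical, the coefficients are the typical REFLECTANCE_MULT_BAND_x and REFLECTANCE_ADD_BAND_x values found in MTL.txt, the sun elevation is an illustrative value, and the DOS1 correction is not included.

```python
import math

def dn_to_toa_reflectance(dn, refl_mult, refl_add, sun_elevation_deg):
    """Convert a Landsat 8 digital number to TOA reflectance."""
    # Rescale the DN with the per-band coefficients from MTL.txt
    unscaled = refl_mult * dn + refl_add
    # Correct for the sun angle (elevation above the horizon)
    return unscaled / math.sin(math.radians(sun_elevation_deg))

# Illustrative values only: the real ones must be read from MTL.txt
REFL_MULT = 2.0e-5
REFL_ADD = -0.1
SUN_ELEVATION = 30.0  # degrees, hypothetical

reflectance = dn_to_toa_reflectance(10000, REFL_MULT, REFL_ADD, SUN_ELEVATION)
print(round(reflectance, 4))
```

In practice the SCP reads these values from the metadata file, which is why MTL.txt must be in the same directory as the bands.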
Also, we are going to create a color composite of the image. In particular, the composite RGB = 543 (the equivalent of RGB = 432 for Landsat 7) is useful for the interpretation of the image, because healthy vegetation reflects a large part of the incident light in the near-infrared wavelengths, resulting in higher reflectance values for band 5; thus vegetation pixels appear red in this color composite.
Steps:
- Open QGIS and start the Semi-Automatic Classification Plugin;
- In the main interface select the tab Pre processing > Landsat;
SCP Toolbar
SCP Landsat tab
- Select the directory that contains the Landsat bands (and also the required metafile MTL.txt), and select the output directory where converted bands are saved;
- Check the option Apply DOS1 atmospheric correction, and click Perform conversion to convert Landsat bands to reflectance (leaving checked Create Virtual Raster);
SCP Landsat conversion to TOA reflectance
- At the end of the process, converted bands are loaded in QGIS. Also, a virtual raster named landsat.vrt is loaded (containing all the Landsat bands converted to reflectance), which is useful for the creation of color composites;
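- Select the Landsat virtual raster in the layer list, right click it and open its properties; in Style select band 4 (i.e. Near-Infrared) for the red band, band 3 (i.e. Red) for the green band, and band 2 (i.e. Green) for the blue band (these numbers refer to the band order inside the virtual raster, which starts from Landsat band 2).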
Selection of RGB bands in QGIS
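The band numbers selected in the layer style differ from the Landsat band numbers, because the virtual raster starts from Landsat band 2. The following Python sketch shows the mapping, assuming that the virtual raster stacks bands 2 to 7 in ascending order; the names used here are hypothetical and not part of QGIS or the SCP.

```python
# Landsat 8 bands stacked in the virtual raster, in ascending order (assumption)
LANDSAT_BANDS_IN_VRT = [2, 3, 4, 5, 6, 7]

def vrt_band_index(landsat_band):
    """Return the 1-based band index inside the virtual raster."""
    return LANDSAT_BANDS_IN_VRT.index(landsat_band) + 1

# Color composite RGB = 543 expressed as Landsat band numbers
composite = {"red": 5, "green": 4, "blue": 3}
vrt_composite = {channel: vrt_band_index(band) for channel, band in composite.items()}
print(vrt_composite)  # {'red': 4, 'green': 3, 'blue': 2}
```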
2. Definition of the classification inputs
In SCP, we need to define the input image (i.e. the Landsat bands), the training shapefile (for the ROI collection), and the signature list file (which stores the spectral signatures calculated from ROIs or imported from other sources).
Tip: in order to use a training shapefile created with previous versions of the SCP, or a shapefile from other sources, add 4 fields to this shapefile: Macro_ID (int), MCl_info (text), Class_ID (int), Class_info (text).
If some of these fields are already present in the shapefile but have different names, you can simply change the field names in the tab Settings of the SCP, according to your shapefile.
Remember that every record should have no empty field.
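As a rough illustration of this tip, the following Python sketch checks that every record has the four fields with a non-empty value of the expected type. The records are plain dictionaries standing in for the shapefile attribute table, and the function name is hypothetical; reading the shapefile itself is left out.

```python
# Required fields and their types, as listed in the tip above
REQUIRED_FIELDS = {"Macro_ID": int, "MCl_info": str, "Class_ID": int, "Class_info": str}

def find_invalid_records(records):
    """Return the indexes of records with a missing, empty, or mistyped field."""
    invalid = []
    for index, record in enumerate(records):
        for name, expected_type in REQUIRED_FIELDS.items():
            value = record.get(name)
            if value is None or value == "" or not isinstance(value, expected_type):
                invalid.append(index)
                break  # one bad field is enough to flag the record
    return invalid

# Hypothetical attribute table: the second record has an empty MCl_info
records = [
    {"Macro_ID": 1, "MCl_info": "Water", "Class_ID": 1, "Class_info": "River"},
    {"Macro_ID": 2, "MCl_info": "", "Class_ID": 2, "Class_info": "Grass"},
]
print(find_invalid_records(records))  # [1]
```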
Steps:
- In the SCP Main interface select the tab Band set (also, a button is available in the SCP toolbar); click the button Select All, then Add rasters to set (order the band names in ascending order, from top to bottom, using the ↑ and ↓ arrows); then select Landsat 8 OLI from the combo box Quick wavelength settings, in order to automatically set the center wavelength of the bands.
SCP Toolbar
Band set definition
- In order to create the training shapefile, in the dock ROI creation click the button New shp, and select where to save the shapefile (for instance ROI.shp);
- Click the button Save in the dock Classification, in order to create a signature list file (for instance SIG.xml).
In the SCP toolbar, the name << band set >> is displayed in the Input image combo box. The shapefile name is displayed in the Training shapefile combo box, and the path to the xml file is displayed in the Signature list file. Now we are ready to collect the ROIs.
SCP input defined
Tab Settings
Extra: Classification using only spectral signatures.
In order to show you the main new feature of the SCP, please download the signature file here (this is the result of the following step 3), and import the file SIG.xml in the Classification dock (click Import list).
Now, perform a classification preview with a click on the button +. As you can see, having the spectral signatures that we need, we are able to classify the image without creating ROIs. In the same way, we could classify another image with the same spectral signatures (if the image is acquired with the same sensor, and pixel values are converted to reflectance). Now delete all the signatures (highlight the items in the table with a click and then click the button Delete highlighted signatures) because we are going to collect the ROIs and calculate the signatures from these ROIs.
3. Collection of ROIs and Spectral Signatures
ROIs are polygons drawn over homogeneous areas of the image that represent land cover classes. ROIs can be drawn manually or with a region growing process (i.e. image segmentation that groups similar pixels), and they should account for the spectral variability of land cover classes.
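To give an idea of how region growing works, here is a simplified Python sketch: starting from a seed pixel, it collects the connected pixels whose value is within a given distance from the seed value. This is only an illustration under stated assumptions, not the SCP algorithm: it uses a single band, 4-connectivity, and a range_radius parameter named after the SCP option; the pixel values are made up.

```python
from collections import deque

def grow_region(image, seed, range_radius):
    """Collect pixels connected to the seed whose value is within range_radius of it."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= range_radius:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# Hypothetical single-band image: low values are water-like, high values are not
image = [
    [20, 30, 300],
    [20, 35, 310],
    [290, 30, 25],
]
roi = grow_region(image, seed=(0, 0), range_radius=15)
print(sorted(roi))
```

A larger range radius lets the region spread over more heterogeneous pixels, which is why this value needs tuning for each land cover class.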
SCP calculates the spectral signatures (which are used by classification algorithms) considering the pixel values under each ROI.
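The signature stored for a ROI is the mean of the ROI pixel values for each band, and the signature plot can also show the standard deviation. A minimal Python sketch of this calculation follows; the function name and the reflectance values are hypothetical, and the use of the population standard deviation is an arbitrary choice that may differ from what the SCP does.

```python
from statistics import mean, pstdev

def spectral_signature(roi_pixels):
    """Return per-band (mean, standard deviation) for a list of pixel spectra."""
    bands = list(zip(*roi_pixels))  # transpose: one tuple of values per band
    return [(mean(values), pstdev(values)) for values in bands]

# Hypothetical reflectance values for three ROI pixels and three bands
roi_pixels = [
    (0.10, 0.20, 0.40),
    (0.12, 0.22, 0.44),
    (0.14, 0.24, 0.48),
]
for band_mean, band_std in spectral_signature(roi_pixels):
    print(round(band_mean, 3), round(band_std, 3))
```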
SCP allows for the definition of a Macroclass ID (i.e. MC ID) and a Class ID (i.e. C ID) for each ROI or spectral signature, which are the identification codes of land cover classes.
Macroclasses allow for the classification of materials that have different spectral signatures (and are therefore processed individually), but belong to the same land cover class (thus the same MC ID is assigned to these pixels). For instance, we could classify grass (e.g. C ID = 1 and MC ID = 1) and trees (e.g. C ID = 2 and MC ID = 1) as a vegetation class (e.g. MC ID = 1).
Every ROI (or spectral signature) should have a unique C ID, while the MC ID can be shared with other ROIs. In the dock Classification it is possible to choose between MC ID and C ID classification.
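The relation between the two codes can be sketched in a few lines of Python. The grass and trees entries follow the example above, while the third entry and all the names are hypothetical; this only illustrates the idea and is not how the SCP stores signatures.

```python
# Hypothetical signatures: Class ID -> (Macroclass ID, description)
signatures = {
    1: (1, "grass"),
    2: (1, "trees"),
    3: (2, "asphalt"),
}

def to_macroclass(class_ids):
    """Replace each Class ID with the Macroclass ID it belongs to."""
    return [signatures[c_id][0] for c_id in class_ids]

# Pixels classified by C ID, then grouped by MC ID
classified_by_class = [1, 2, 3, 2]
print(to_macroclass(classified_by_class))  # [1, 1, 2, 1]
```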
Steps:
- In order to create a ROI, in the dock ROI creation click the button + beside Create a ROI and then click any pixel of the image; zoom in the map and click on a blue pixel of the Tiber river (in order to define the ROI extent, change the values for Min ROI size and Range radius); after a few seconds the ROI polygon will appear over the image (a semitransparent orange polygon);
ROI in the Tiber river
- Under ROI Signature definition type a brief description of the ROI inside the field Class Information and Macroclass Information, and assign a Macroclass ID and Class ID (it is possible to rename and change these codes later from the ROI list table, which also changes the values in the training shapefile);
- In order to save the ROI to the training shapefile click the button Save ROI to shapefile; if the checkbox Add sig. list is checked, then the spectral signature is calculated (the mean of ROI pixel values for all the bands) and added to the Signature list table; also, it is possible to add signatures later, highlighting the ROIs in the ROI list and clicking Add to signature (if two or more highlighted ROIs have the same MC ID and C ID, a unique spectral signature is calculated considering all the pixels that are under those ROIs);
- Define the color of classes that will be used in the classification, with a double click on the Color column in the Signature list (the signature list is automatically saved when you save the QGIS project, or when you click the button Save in the dock Classification).
ROI created with SCP region growing
Color selection for the spectral signature
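Repeat the above steps for every land cover class, assigning to each ROI a new incremental class ID and one of the following macroclass IDs:
- Water (e.g. surface water): MC ID = 1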
- Vegetation (e.g. grassland or trees): MC ID = 2
- Built-up (e.g. artificial areas, buildings and asphalt): MC ID = 3
- Bare soil (e.g. soil without vegetation): MC ID = 4
Following, a few examples of ROIs created for these land cover classes.
Bare soil
Built-up
Vegetation - Grass
Vegetation - Trees
After the collection of several ROIs, it is useful to visualize the spectral signatures thereof, in order to assess the spectral similarity:
- In the Signature list table, highlight one or more signatures, and click the button spectral signature plot; in the Spectral Signature Plot window, if the checkbox Plot σ is checked, then the plot will display the standard deviation of the signatures.
Spectral signature plot
You can download the final training shapefile and spectral signature list, where I collected 11 spectral signatures.
Tip: it is possible to change the attributes of ROIs and signatures from the ROI list table and the Signature list table respectively, with a click on the table item. Also, it is possible to order items in the tables, with a click on the column name.
4. Classification of the study area
SCP allows for classification previews, in order to assess the classification results very rapidly. Classification previews are useful during the collection of ROIs, and for the selection of the most accurate spectral signatures.
If the preview results are considered good (i.e. classes are correctly identified), the classification of the entire image can be performed. Otherwise, it is possible to remove one or more spectral signatures, or add new spectral signatures creating other ROIs as described in the previous phase.
Steps:
- In the dock Classification, under Classification preview set Size = 500 (i.e. the side of the classification preview in pixel unit), and select the Spectral Angle Mapping algorithm; click the button + and then click on the image; after a few seconds, the classification preview will be displayed;
Classification preview using classes
- Check Use Macroclass ID, and click the button Redo; another preview will be performed, but now using macroclasses;
Classification preview using macroclasses
- In order to perform the final classification, click the button Perform classification and select where to save the output (e.g. classification.tif).
Classification result
Tip: during the ROI/Signature collection, perform some classification previews using the C ID, in order to assess how individual spectral signatures affect the classification; then check Use Macroclass ID, in order to calculate the final land cover classification.
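For readers curious about the Spectral Angle Mapping algorithm selected above, the following Python sketch shows the general idea: each pixel is assigned to the signature that forms the smallest angle with the pixel spectrum. It is a generic illustration and not the SCP implementation; the function names and the three-band signatures are hypothetical.

```python
import math

def spectral_angle(pixel, signature):
    """Angle in radians between a pixel spectrum and a reference signature."""
    dot = sum(p * s for p, s in zip(pixel, signature))
    norm = math.sqrt(sum(p * p for p in pixel)) * math.sqrt(sum(s * s for s in signature))
    # Clamp to guard against rounding slightly outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def classify(pixel, signatures):
    """Return the class ID whose signature has the smallest angle to the pixel."""
    return min(signatures, key=lambda class_id: spectral_angle(pixel, signatures[class_id]))

# Hypothetical three-band signatures (reflectance), keyed by class ID
signatures = {
    1: (0.05, 0.03, 0.01),  # water-like: decreasing with wavelength
    2: (0.04, 0.03, 0.40),  # vegetation-like: high near-infrared
}
print(classify((0.05, 0.04, 0.35), signatures))  # 2
```

Because the angle depends on the shape of the spectrum and not on its magnitude, a pixel that is twice as bright as a signature still has an angle of about zero with it.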
5. Calculation of classification accuracy
The accuracy assessment of land cover classification is useful for identifying map errors. SCP allows for the calculation of accuracy comparing the classification raster to a reference shapefile.
Usually, accuracy assessment requires ancillary data and field survey. In this tutorial we are going to compare the land cover classification to the training ROIs.
Steps:
- Select the tab Post processing > Accuracy of the SCP main interface;
SCP toolbar
Tab Accuracy
- Select the classification.tif beside Select the classification to assess and select the ROI shapefile beside Select the reference shapefile;
- Click the button Calculate error matrix and select a directory where the error matrix (a .csv file separated by tab) and the error raster are saved; the error matrix will be displayed, and the error raster will be loaded in QGIS, showing the errors in the map (each value of this raster represents a class of comparison between classification and reference shapefile, which is the ErrorMatrixCode in the error matrix file).
Accuracy assessment
The results of the error matrix show an overall accuracy of 95%, which is very good.
The following error matrix represents the number of pixels classified correctly in the major diagonal. As you can see, most of the errors are between class 3 (built-up) and 4 (bare soil).
Classification (rows) vs. Reference (columns) | 1 | 2 | 3 | 4 | Total |
1 | 87 | 0 | 0 | 0 | 87 |
2 | 0 | 1327 | 1 | 17 | 1345 |
3 | 0 | 0 | 4417 | 2 | 4419 |
4 | 0 | 21 | 268 | 606 | 895 |
Total | 87 | 1348 | 4686 | 625 | 6746 |
From the error matrix file, we have also calculated the user and producer accuracy; the results show that class 4 (bare soil) has a high commission error (100 - user accuracy) and a low omission error (100 - producer accuracy). In order to improve the results, we should collect more ROIs and spectral signatures for the bare soil class, paying attention to the spectral similarity with the built-up class.
Class 1 producer accuracy [%] = 100.0 user accuracy [%] = 100.0
Class 2 producer accuracy [%] = 98.4421364985 user accuracy [%] = 98.6617100372
Class 3 producer accuracy [%] = 94.2594963722 user accuracy [%] = 99.9547408916
Class 4 producer accuracy [%] = 96.96 user accuracy [%] = 67.7094972067
Tip: the reference shapefile must have two fields named the same as the Macroclass ID and Class ID fields in the training shapefile; if Use Macroclass ID is checked in the dock Classification, then the Macroclass ID field of the shapefile is used as reference for the classes, otherwise the Class ID field is used.
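The overall, producer and user accuracy values reported in this phase can be reproduced from the error matrix with a few lines of Python. This is a sketch of the standard definitions applied to the pixel counts of this tutorial, with variable names of my own; it is not the SCP code.

```python
# Error matrix from the tutorial: rows = classification, columns = reference
matrix = {
    1: {1: 87, 2: 0, 3: 0, 4: 0},
    2: {1: 0, 2: 1327, 3: 1, 4: 17},
    3: {1: 0, 2: 0, 3: 4417, 4: 2},
    4: {1: 0, 2: 21, 3: 268, 4: 606},
}
classes = sorted(matrix)

total = sum(sum(row.values()) for row in matrix.values())
correct = sum(matrix[c][c] for c in classes)  # major diagonal
overall_accuracy = 100.0 * correct / total

# User accuracy: correct pixels over the row (classification) total
user_accuracy = {c: 100.0 * matrix[c][c] / sum(matrix[c].values()) for c in classes}
# Producer accuracy: correct pixels over the column (reference) total
producer_accuracy = {
    c: 100.0 * matrix[c][c] / sum(matrix[r][c] for r in classes) for c in classes
}

print(round(overall_accuracy, 2))
for c in classes:
    print(c, round(producer_accuracy[c], 2), round(user_accuracy[c], 2))
```

For class 4, the row total (895) is much larger than the diagonal value (606), which is where the low user accuracy comes from.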
6. Calculation of the area of classes
SCP allows for the calculation of a classification report with the percentage and the area of land cover classes.
Steps:
- Select the tab Post processing > Classification report of the SCP Main interface;
SCP toolbar
Tab Classification report
- Select classification.tif beside Select the classification and click Calculate classification report;
- After a few seconds the report will be displayed, showing the percentage and the area (area unit is calculated from the image itself).
Classification report
The report table follows.
Class | PixelSum | Percentage % | Area [metre^2] |
1 | 2353 | 0.41 | 2117700 |
2 | 178040 | 30.91 | 160236000 |
3 | 269664 | 46.82 | 242697600 |
4 | 125943 | 21.86 | 113348700 |
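The figures in the report follow directly from the pixel counts. The Python sketch below recomputes the percentage and the area, assuming a pixel side of 30 m (the Landsat 8 multispectral resolution, consistent with the areas in the table); the variable names are my own.

```python
PIXEL_SIZE_M = 30  # Landsat 8 multispectral pixel side, in metres (assumption)

# Pixel counts per class, taken from the report table
pixel_sum = {1: 2353, 2: 178040, 3: 269664, 4: 125943}
total_pixels = sum(pixel_sum.values())

report = {
    class_id: {
        "percentage": 100.0 * count / total_pixels,
        "area_m2": count * PIXEL_SIZE_M ** 2,  # each pixel covers 900 square metres
    }
    for class_id, count in pixel_sum.items()
}

for class_id, row in report.items():
    print(class_id, round(row["percentage"], 2), row["area_m2"])
```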
From these results, we can see that about 31% of the study area is vegetated, 47% is occupied by built-up, 22% is bare soil surface (also in agricultural areas), and 0.4% is surface water.
Of course, these figures are the result of a tutorial for demonstrating the main features of the SCP for the land cover classification of a Landsat image; several ROIs for each class are required for a good classification (only 11 ROIs were collected in this tutorial), considering their spectral variability; also, field data is useful for improving the collection of ROIs and spectral signatures.
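Soon, I am going to post new tutorials showing advanced features, and I am going to publish the new user manual of the SCP.
The video of this tutorial follows.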