Mapping Urban Area with Sentinel-1 Data: a Tutorial Using SNAP and SCP for QGIS

I am glad to post this tutorial about urban mapping developed by Laure Boudinaud, a satellite data scientist, during her work as a Young Graduate Trainee at the European Space Agency (ESA), ESRIN. The method aims to create a binary Urban / Non-urban map through the supervised classification of a Sentinel-1 image stack.
I personally thank Laure Boudinaud very much for her valuable work and her kindness in sharing this innovative method with the SCP community.


Before starting this tutorial, you should download and install SNAP, the free software developed by ESA for processing Sentinel data, from http://step.esa.int/main/download/.
You should download the following Sentinel-1 images (the study area is Rome, Italy) from the Copernicus Open Access Hub (after free registration):
  • S1A_IW_GRDH_1SDV_20161008T051144_20161008T051209_013394_015602_7B4A
  • S1B_IW_GRDH_1SDV_20161008T170446_20161008T170511_002418_004147_C814
Also, you should download the following reference data (the shapefiles used later in the QGIS analysis):
  • GRA.shp
  • Tiber.shp
  • Roma_municipality_district.shp
Finally, download a Sentinel-2 image of the same area, which will be the reference optical data, following this previous tutorial.

The tutorial is organized in two main steps:
  1. Processing of the images in SNAP and extraction of the features to be used for the classification;
  2. Supervised classification in QGIS (using the Semi-Automatic Classification Plugin, SCP).
We choose images from January 2016, when there is less vegetation than in summer and thus the backscattering coefficient in vegetated areas is lower, which makes the distinction from urban areas (which also have high backscattering) easier. Moreover, we work with one ascending image and one descending image in order to reduce radar geometric distortions.

1) Processing in SNAP

  a. Open the two Sentinel-1 images.
  b. Check in the Metadata that one image is Ascending and the other one Descending.
  c. Visualise the bands (VH and VV) and locate the city of Rome. Is it easy to distinguish urban areas from non-urban ones?
    Unfortunately, unlike optical data (which offers more than 12 bands with different spectral characteristics), SAR data is affected by speckle, contains grey levels only, and is acquired in radar geometry, which introduces distortions; it therefore does not give good classification results if used as is. That is why we need to process it and generate new layers from this backscattering information.
  d. Apply Orbit File: click on Radar > Apply Orbit File. Select the image and click on Run. This provides additional and more precise information about the image acquisition, which will be useful later when stacking the two products, which are not acquired on the same day.
  e. Calibration: click on Radar > Radiometric > Calibrate. Select the image and in Processing Parameters, select both the polarisations VH and VV. Select Output sigma0 band. Run.
  f. Terrain correction: open the new product and click on Radar > Geometric > Terrain Correction > Range Doppler Terrain Correction. Select the right calibrated product and in Processing Parameters, select all the bands, leave the default parameters and click Run. (A scripted equivalent of steps d to f is sketched below the figure caption.)
  g. Import the vector file Roma_municipality_district.shp to visualise it on the SAR image and zoom in on the area of Rome. When opening it, specify in SNAP that the CRS of the shapefile is ED50 / UTM zone 32N. The image can now be clipped in order to save computational time and reduce the final product size.
    For this, zoom on the image as you want to clip it (zoomed on Rome, including the municipality shapefile), right click, then in Spatial Subset from View… click OK.

Terrain corrected products with the Roma_municipality_district shapefile imported, for the images 12Jan2016 (VH) and 06Jan2016 (VH).
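For readers who prefer scripting, the same pre-processing (steps d to f) can be reproduced with snappy, the Python interface of SNAP. The following is a minimal sketch, not part of the original tutorial: the input path is a placeholder, and the operator and parameter names ('Apply-Orbit-File', 'Calibration', 'Terrain-Correction', 'outputSigmaBand', 'selectedPolarisations') should be verified for your SNAP version, for instance with gpt -h.

```python
# Hedged sketch of steps d-f with snappy (SNAP's Python interface).
# Verify operator/parameter names with "gpt <Operator> -h"; the input
# path below is a placeholder for one of the downloaded products.
from snappy import ProductIO, GPF, HashMap

product = ProductIO.readProduct('S1A_IW_GRDH_1SDV_.../manifest.safe')

# d. Apply Orbit File (default parameters)
orbit = GPF.createProduct('Apply-Orbit-File', HashMap(), product)

# e. Radiometric calibration to sigma0 for both polarisations
cal_params = HashMap()
cal_params.put('outputSigmaBand', True)
cal_params.put('selectedPolarisations', 'VH,VV')
calibrated = GPF.createProduct('Calibration', cal_params, orbit)

# f. Range Doppler terrain correction (default DEM and parameters)
terrain = GPF.createProduct('Terrain-Correction', HashMap(), calibrated)

ProductIO.writeProduct(terrain, 'S1_Rome_terrain_corrected', 'BEAM-DIMAP')
```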
  h. Do the same process for the other image.
  i. Stack the two images by clicking on Radar > Coregistration > Stack Tools > Create Stack. Click on the Add Opened icon and keep the two latest created images by removing the other ones with the Remove icon. In 2-CreateStack, leave NONE for Resampling Type and select Product Geolocation for Initial Offset Method. Then, in 3-Write, choose the output file name and location and click on Run.
    Remark: we stack the products and do not co-register them because the coregistration modifies one of the two products to make it fit (pixel-wise) to the other one. Fortunately, thanks to the Orbit File that was applied, and thus the precise acquisition information provided with the Sentinel data, the stack should have bands from both images with pixels already fitting (no shift). This can be checked by opening one band from each of the two images and visualising them side by side (Tile icons).
The image we now visualise is the backscattering coefficient (sigma0) for the polarisations VH and VV of each of the two images; the product should thus contain 4 bands so far. However, this is not enough as an input for the classification: for instance, urban areas and mountain tops would tend to be classified together. We therefore need to extract extra features from the image to avoid such problems, and a common way to do this is to analyse the texture of the SAR imagery.

Texture describes the type and characteristic of the re-occurrence of objects. It is a feature of an area – the texture of a point is not defined. The Grey Level Co-occurrence Matrix (GLCM) is the tabulation of how often different combinations of pixel brightness values (grey levels) occur in an image. The GLCM contains the frequencies with which pairs of neighbouring pixels, at a defined distance and in a certain direction, occur with a given grey-value combination.
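To make the three texture features used later (Energy, GLCM Mean, GLCM Variance) more concrete, here is a small illustrative sketch in Python using scikit-image; it is not SNAP's implementation, which applies the same computation on a moving window over the whole band.

```python
# Conceptual illustration of GLCM-based texture features on one 9x9 window.
import numpy as np
from skimage.feature import graycomatrix

window = np.random.randint(0, 32, size=(9, 9), dtype=np.uint8)  # toy 9x9 window, 32 grey levels

# Co-occurrence of grey levels for neighbours 1 pixel apart, direction 0 degrees
glcm = graycomatrix(window, distances=[1], angles=[0], levels=32,
                    symmetric=True, normed=True)
P = glcm[:, :, 0, 0]                # normalised co-occurrence probabilities
i = np.arange(P.shape[0])[:, None]  # grey-level index of the reference pixel

energy = np.sum(P ** 2)                                # uniformity of the texture
glcm_mean = np.sum(i * P)                              # co-occurrence-weighted mean grey level
glcm_variance = np.sum((i - glcm_mean) ** 2 * P)       # spread around that mean

print(energy, glcm_mean, glcm_variance)
```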

  j. Click on Radar > SAR Applications > Urban Areas > Speckle Divergence. Select the stack and in Processing Parameters, select the four source bands and choose 13x13 for the window size. Run it. A new product is created, containing both the backscattering coefficient and the speckle divergence for each image (8 bands).
  k. We are now going to create some extra texture features. First, click on Radar > Speckle Filtering > Multi-temporal Speckle Filter. Select the last product and in Processing Parameters, select only the backscattering coefficient bands (that is to say, not the speckle divergence ones). Choose the filter called Refined Lee and run it.
  l. Now click on Raster > Image Analysis > Texture Analysis > Grey Level Co-occurrence Matrix. Select the last (filtered) product and choose 9x9 for the window size. Leave the other parameters at their defaults. Untick all the features except Energy, GLCM Mean and GLCM Variance.
Speckle divergence images derived from 06Jan2016 (VH) and from 12Jan2016 (VV)

  m. Finally, stack the last product with the one created in step j (containing the 4 backscattering coefficient bands and the 4 speckle divergence bands). For this, proceed as in step i. (A scripted sketch of steps j to l is given right after this step.)
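As for steps d to f, the feature-extraction chain can also be scripted with snappy. The sketch below is only indicative: it assumes the stacked product from step i has already been saved, and the operator aliases ('Speckle-Divergence', 'Multi-Temporal-Speckle-Filter', 'GLCM') and their parameters (window sizes, filter type, selected features) must be checked for your SNAP version with gpt -h; defaults are used here for brevity.

```python
# Hedged sketch of steps j-l with snappy; operator aliases and parameters
# are assumptions to verify with "gpt <Operator> -h". Window sizes (13x13,
# 9x9), the Refined Lee filter and the selected GLCM features would be set
# in the corresponding parameter maps instead of the empty defaults below.
from snappy import ProductIO, GPF, HashMap

stack = ProductIO.readProduct('S1_Rome_stack.dim')  # hypothetical stack saved in step i

spk_div = GPF.createProduct('Speckle-Divergence', HashMap(), stack)              # step j
filtered = GPF.createProduct('Multi-Temporal-Speckle-Filter', HashMap(), stack)  # step k
glcm = GPF.createProduct('GLCM', HashMap(), filtered)                            # step l

ProductIO.writeProduct(glcm, 'S1_Rome_texture', 'BEAM-DIMAP')
```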

2) GIS analysis in QGIS

a. Preparation of the layer to be classified

Open the OpenStreetMap layer from the OpenLayers plugin (which can be installed in Plugins > Manage and Install Plugins…) in order to visualise it as a background. This helps for the visualisation of the product we work with. Then open the geotiff product in QGIS and check that it is at the right location.

Open the reference shapefiles “GRA.shp”, “Tiber.shp”, and “Roma_municipality_district.shp”. They have a different Coordinate Reference System (CRS), so in order to visualise them on top of the product of interest, we need to set the right CRS for each shapefile. Right click on the shapefile in the Layers panel and in Set Layer CRS, select ED50 / UTM zone 33N for GRA and Tiber, and ED50 / UTM zone 32N for Roma_municipality_district. They should now appear on top of the product. Check that the river and the GRA correspond to what can be seen in the SAR product. (For this, you can right click on the shapefile name and in Properties > Style, change the Transparency of the layer.)

Clip to the area of interest: to clip the SAR product with the Roma_municipality_district vector file, we first need to reproject it into the same Coordinate Reference System (CRS). Use the Reproject layer tool for this (search for it in the Processing Toolbox panel). First check what the CRS of the geotiff is (it should be WGS 84, EPSG:4326, the default when exporting from SNAP) and select this CRS as Target CRS when running the Reproject layer tool on the municipality shapefile (selected as input layer).
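The same reprojection can be done from the QGIS Python console; this is a small optional sketch (the file names are placeholders, and the target CRS assumes the geotiff is indeed in WGS 84, EPSG:4326):

```python
# Reproject the municipality shapefile to the CRS of the SAR geotiff
# using the QGIS processing framework (run from the QGIS Python console).
import processing

processing.run("native:reprojectlayer", {
    "INPUT": "Roma_municipality_district.shp",        # hypothetical path to the shapefile
    "TARGET_CRS": "EPSG:4326",                         # CRS of the exported geotiff
    "OUTPUT": "Roma_municipality_district_4326.shp",   # reprojected output
})
```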

Now use the Raster > Extraction > Clipper tool. Set the geotiff product as input, specify the output name and location, select Mask layer and choose the reprojected shapefile of the municipality of Rome.

Run the algorithm; the process can take several minutes.
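For reference, an equivalent clipping can be done with GDAL's Python bindings (the Clipper tool relies on gdalwarp); the file names below are placeholders:

```python
# Clip the stacked feature geotiff with the reprojected municipality mask.
from osgeo import gdal

gdal.Warp(
    "S1_Rome_features_clip.tif",                           # output raster
    "S1_Rome_features.tif",                                # geotiff exported from SNAP
    cutlineDSName="Roma_municipality_district_4326.shp",   # reprojected mask shapefile
    cropToCutline=True,
    dstNodata=0,
)
```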

b. Supervised Classification

For this, we will be using the Semi-Automatic Classification Plugin (SCP), which can be installed in Plugins > Manage and Install Plugins… Then open the SCP dock on the side of the image.
First, we need to create the Band Set on which the classification will be performed. Click on the reload icon to refresh the layer list and select the clipped raster (the Sentinel-1 product of bands) as the Input image. Then click on the Band Set icon. Check that the 18 bands of the product appear in the Band set definition table. Select Band order for Quick wavelength settings and band number for Wavelength unit. Indeed, this plugin was originally developed to work with optical data and thus needs this information for spectral signature calculation. Close the Band Set window.

Creation of the training ROIs (Regions of Interest)

Open the reference optical data, that is the Sentinel-2 image.
We need to manually create polygons, based on the optical data, to distinguish urban areas from non-urban areas.

We now look at the Training Input section of the SCP dock. Click on the Create a new Input File icon, name the file and save it. Now we can create the polygons. We are going to create several ROIs using the Macroclass IDs 1 for Urban and 2 for Non-Urban. For this, we look at the Classification dock section.

Click on the Create a ROI polygon icon in the working toolbar and manually draw a polygon on an urban area. Left click on the map to define the ROI vertices and right click to draw the last vertex, which closes the polygon. An orange temporary polygon is then displayed on the map. Draw 10-15 polygons for each of the two classes, in the city centre but also at the limits of the municipality (for Urban, draw polygons on roads too; for Non-urban, draw polygons on the Tiber river, on the parks in the centre as well, and of course on the mountains, fields, etc.). To add each additional polygon to the same temporary ROI, press Ctrl when right clicking to close the new polygon. To save it to the training input, in the Classification dock, define the Classes and Macroclasses by setting the IDs (set MC ID = 1, C ID = 1 and MC Info = C Info = Urban). Then, click on the Save temporary ROI to training input icon and unselect the Display NDVI option. This calculates the spectral signature and saves the polygon to the training input. Do the same now for non-urban areas (set MC ID = 2, C ID = 2 and MC Info = C Info = Non-Urban).

Two lines must now appear in the ROI signature list table. B in the Type column means that the ROI spectral signature was calculated and saved to the Training input (B stands for “Both”). A new layer with the training ROIs should also have appeared in the Layers panel.

Run the classification

Now click on the Classification algorithm tab: tick the “MC ID” option and choose the Maximum Likelihood algorithm. Then, in the Classification output tab, tick Create Vector and run the classification. We need to save the output file and give it a name. The classification step can take several minutes.
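To clarify what the Maximum Likelihood algorithm does with the ROI spectral signatures, here is a generic numpy/scipy sketch (it is not SCP's implementation): each class is modelled as a multivariate Gaussian estimated from its training pixels, and each image pixel is assigned to the class with the highest log-likelihood.

```python
# Generic illustration of Maximum Likelihood classification.
import numpy as np
from scipy.stats import multivariate_normal

def maximum_likelihood(train_pixels, image_pixels):
    """train_pixels: {class_id: array of shape (n_samples, n_bands)}
    image_pixels: array of shape (n_pixels, n_bands).
    Returns the most likely class id for each pixel."""
    class_ids = sorted(train_pixels)
    log_likes = []
    for cid in class_ids:
        samples = train_pixels[cid]
        mean = samples.mean(axis=0)                    # class mean signature
        cov = np.cov(samples, rowvar=False)            # class covariance
        log_likes.append(
            multivariate_normal(mean, cov, allow_singular=True).logpdf(image_pixels)
        )
    best = np.argmax(np.stack(log_likes), axis=0)      # class with highest log-likelihood
    return np.array(class_ids)[best]

# Toy usage with 2 classes and 3 bands
rng = np.random.default_rng(0)
train = {1: rng.normal(5, 1, (50, 3)), 2: rng.normal(0, 1, (50, 3))}
print(maximum_likelihood(train, rng.normal(0, 1, (10, 3))))
```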

c. Accuracy assessment of the classification

A classification needs to come with statistics in order to understand its reliability and to identify possible errors.

Select the tab Postprocessing > Accuracy in the plugin main interface; select the classification raster file and, beside it, select the reference shapefile (for instance the ROI shapefile); then click the button Calculate error matrix and the matrix will be displayed. You can save the error matrix by clicking the button Save error matrix to file.
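Conceptually, the error matrix is just a cross-tabulation of classified versus reference labels over the assessed pixels; here is a toy numpy sketch (with made-up labels) of how the matrix and the overall accuracy are derived:

```python
# Toy error matrix: rows = classified class, columns = reference class.
import numpy as np

classified = np.array([1, 1, 2, 2, 1, 2, 2, 1])   # class assigned to each assessed pixel
reference  = np.array([1, 1, 2, 1, 1, 2, 2, 2])   # reference (ROI) class of the same pixels

classes = np.union1d(classified, reference)
matrix = np.zeros((classes.size, classes.size), dtype=int)
for c, r in zip(classified, reference):
    matrix[np.searchsorted(classes, c), np.searchsorted(classes, r)] += 1

overall_accuracy = np.trace(matrix) / matrix.sum()  # correctly classified / total
print(matrix)
print("Overall accuracy:", overall_accuracy)
```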

3) Calculation of the urban surface in km²

First of all, we need to make sure that the layer is in the right projection (some CRSs use degrees as units, others metres). If not, use the Reproject layer tool in the QGIS Processing Toolbox to reproject the classification vector file into the same CRS as the municipality shapefile (WGS 84 / UTM zone 32N). This coordinate system uses metres as units.

Once the reprojected classification product is created, you can clip it to the municipality area, which is the area we are interested in. For this, go to Vector > Geoprocessing Tools > Clip. The clipping process can take several minutes.

Then, enter Edit mode (Toggle Editing button, the yellow pencil icon), right click on the classification shapefile > Open Attribute Table > Field Calculator and search for the $area operator (under Geometry). Click on it to add it to the expression (on the left) and enter “Area” as the output field name. Click OK. The area in m² is then calculated for each polygon, and a new column called “Area” should appear in the Attribute Table.

Now we need to sum up all these numbers to obtain one total area per class. First, leave Edit mode by clicking again on the yellow pencil icon and click on Save: this saves the new Area column to the product’s attribute table. Now let’s use the tool called “Statistics by category” in the Processing Toolbox. Select the clipped, reprojected classification vector as the input vector layer. Select “Area” for “Field to calculate statistics on” and “C_ID” for “Field with categories”. Click on Run. A product called “Statistics by category” appears in the Layers panel. Right click on it and open the Attribute Table: read the column Sum for the total area in m² of each class, and Count to know how many polygons are in each class.
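The same area calculation can be scripted, for instance with geopandas; this is a hedged sketch (the file name is a placeholder, and the layer is assumed to be in a metric CRS such as UTM zone 32N):

```python
# Per-class area and polygon count from the clipped, reprojected classification vector.
import geopandas as gpd

gdf = gpd.read_file("classification_clip_32632.shp")   # hypothetical clipped classification
gdf["Area"] = gdf.geometry.area                         # area in m² per polygon (metric CRS assumed)

stats = gdf.groupby("C_ID")["Area"].agg(total_m2="sum", polygons="count")
stats["total_km2"] = stats["total_m2"] / 1e6            # convert to km² per class
print(stats)
```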

Compare the result with the one obtained from optical Sentinel-2A data (for instance, see this previous tutorial): 354.54 km² in 2015 / 351 km² in 2016.

Supervised classification result based on Sentinel-1 images – QGIS 



Again, I thank Laure Boudinaud for this valuable tutorial.
I am very interested in this application of Sentinel-1 data and its use in SCP. Of course, I invite you to share your methodologies or applications regarding the use of SCP.

For any comment or question, join the Facebook group and the Google+ Community about the Semi-Automatic Classification Plugin.
