Sometimes I prefer to publish my map in gray instead of black. But all newly added QGIS composer items are set to black by default. To change the colors more easily and quickly, I created the “Turn Gray” plugin. By default it changes all foreground colors (labels and outlines) to gray. But you are free … Continue reading Gray is the new Black
It’s been a long time since I last blogged here. Let’s just blame that on the amount of changes going into QGIS 3.0 and move on…
One new feature which landed in QGIS 3.0 today is a processing algorithm for automatic coloring of a map in such a way that adjoining polygons are all assigned different color indexes. Astute readers may be aware that this was possible in earlier versions of QGIS through the use of either the (QGIS 1.x only!) Topocolor plugin, or the Coloring a map plugin (2.x).
What’s interesting about this new processing algorithm is that it introduces several refinements for cartographically optimising the coloring. The earlier plugins both operated by pure “graph” coloring techniques. What this means is that first a graph consisting of each set of adjoining features is generated. Then, based purely on this abstract graph, the coloring algorithms are applied to optimise the solution so that connected graph nodes are assigned different colors, whilst keeping the total number of colors required minimised.
The new QGIS algorithm works in a different way. Whilst the first step is still calculating the graph of adjoining features (now super-fast due to use of spatial indexes and prepared geometry intersection tests!), the colors for the graph are assigned while considering the spatial arrangement of all features. It’s gone from a purely abstract mathematical solution to a context-sensitive cartographic solution.
Let’s explore the differences. First up, the algorithm has an option for the “minimum distance between features”. It’s often the case that features aren’t really touching, but are instead just very close to each other. Even though they aren’t touching, we still don’t want these features to be assigned the same color. This option allows you to control the minimum distance that must separate two features before they can be assigned the same color.
The biggest change comes in the “balancing” techniques available in the new algorithm. By default, the algorithm now tries to assign colors in such a way that the total number of features assigned each color is equalised. This avoids having a color which is only assigned to a couple of features in a large dataset, resulting in an odd looking map coloration.
Another available balancing technique is to balance the color assignment by total area. This technique assigns colors so that the total area of the features assigned to each color is balanced. This mode can be useful to help avoid large features resulting in one of the colors appearing more dominant on a colored map.
The final technique, and my personal preference, is to balance colors by distance between colors. This mode will assign colors in order to maximize the distance between features of the same color. Maximising the distance helps to create a more uniform distribution of colors across a map, and avoids certain colors clustering in a particular area of the map. It’s my preference as it creates a really nice balanced map – at a glance the colors look “randomly” assigned with no discernible pattern to the arrangement.
As these examples show, considering the geographic arrangement of features while coloring allows us to optimise the assigned colors for cartographic output.
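To make the “balance by feature count” idea more concrete, here is a minimal sketch in plain Python. It is not the QGIS implementation: the adjacency dictionary, the function name and the greedy strategy are all assumptions chosen for illustration. Each feature takes, among the colors not already used by its neighbours, the one that has been assigned to the fewest features so far.

```python
from collections import Counter


def balanced_coloring(adjacency):
    """Greedy coloring that prefers the least-used color (hypothetical helper).

    adjacency maps each feature id to the ids of its adjoining features.
    """
    colors = {}
    usage = Counter()  # how many features use each color index
    # Visit features with the most neighbours first; ties are broken by id.
    for feature in sorted(adjacency, key=lambda f: (-len(adjacency[f]), f)):
        taken = {colors[n] for n in adjacency[feature] if n in colors}
        available = [c for c in usage if c not in taken]
        if available:
            # Balance by count: pick the color used by the fewest features.
            color = min(available, key=lambda c: (usage[c], c))
        else:
            color = len(usage) + 1  # open a new color index
        colors[feature] = color
        usage[color] += 1
    return colors


# Made-up adjacency graph: A, B and C touch each other, D touches C, E is isolated.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
    "E": [],
}
print(balanced_coloring(adjacency))
```

Balancing by area or by distance would only change the key used to pick among the available colors, for example the summed area per color instead of the feature count.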
The other nice thing about having this feature implemented as a processing algorithm is that unlike standalone plugins, processing algorithms can be incorporated as just one step of a larger model (and also reused by other plugins!).
QGIS 3.0 has tons of great new features, speed boosts and stability bumps. This is just a tiny taste of the handy new features which will be available when 3.0 is released!
Today’s post was motivated by a question following up on my recent post “Details of good flow maps“: How to create arrows with gradients from transparent to opaque?
The key idea is to use a gradient fill to color the arrows:
It all seems perfectly straightforward: determine the direction of the line and set the gradient rotation according to the line direction.
But wait! That doesn’t work!
The issue is that all default angle functions available in expressions return clockwise angles but the gradient rotation has to be set in counter-clockwise angles. So we need this expression:
In my previous posts, I discussed classical flow maps that use arrows of different width to encode flows between regions. This post presents an alternative take on visualizing flows, without any arrows. This style is inspired by Go with the Flow by Robert Radburn and Visualisation of origins, destinations and flows with OD maps by J. Wood et al.
The starting point of this visualization is a classic OD matrix.
For my previous flow maps, I already converted this data into a more GIS-friendly format: a Geopackage with lines and information about the origin, destination and strength of the flow:
In addition, I grabbed state polygons from Natural Earth Data.
At this point, we have 72 flow features and 9 state polygon features. An ordinary join in the layer properties won’t do the trick. We’d still be stuck with only 9 polygons.
Virtual layers to the rescue!
The QGIS virtual layers feature (Layer menu | Add Layer | Add/Edit Virtual Layer) provides database capabilities without us having to actually set up a database … *win!*
Using a classic SQL query, we can join state polygons and migration flows into a new virtual layer:
The resulting virtual layer contains 72 polygon features. There are 8 copies of each state.
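The effect of such a join can be tried outside QGIS with Python's built-in sqlite3 module. This is only a sketch under assumptions: the table names, column names and placeholder weights are hypothetical, and joining on the destination state is my reading of the description above, not the author's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (state_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE flows (origin INTEGER, destination INTEGER, weight INTEGER)")

state_ids = range(1, 10)  # 9 states
conn.executemany(
    "INSERT INTO states VALUES (?, ?)",
    [(i, f"State {i}") for i in state_ids],
)
# One flow per ordered pair of distinct states: 9 * 8 = 72 flows.
# The weight of 0 is a placeholder, not real migration data.
conn.executemany(
    "INSERT INTO flows VALUES (?, ?, ?)",
    [(o, d, 0) for o in state_ids for d in state_ids if o != d],
)

# Each state row is repeated once for every flow that ends in that state.
rows = conn.execute(
    "SELECT s.state_id, s.name, f.origin, f.weight "
    "FROM states AS s JOIN flows AS f ON f.destination = s.state_id"
).fetchall()
print(len(rows))  # 72
```

In a virtual layer the state geometry would be carried along in the SELECT, which is what turns 9 polygons into 72 polygon features.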
Now that the data is ready, we can start designing the visualization in the Print Composer.
This is probably the most manual step in this whole process: We need 9 map items, one for each mini map in the small multiples visualization. Create one and configure it to your liking, then copy and paste to create 8 more copies.
I’ve decided to arrange the map items in a way that resembles the actual geographic location of the state that is represented by the respective map, from the state of Vorarlberg (a proud QGIS sponsor by the way) in the south-west to Lower Austria in the north-east.
To configure which map item will represent the flows from which origin state, we set the map item ID to the corresponding state ID. As you can see, the map items are numbered from 1 to 9:
Once all map items are set up, we can use the map item IDs to filter the features in each map. This can be implemented using a rule based renderer:
The first rule will ensure that each map only shows flows originating from a specific state and the second rule will select the state itself.
We configure the symbol of the first rule to visualize the flow strength. The color represents the number of people moving to the respective district. I’ve decided to use a smooth gradient instead of predefined classes for the polygon fill colors. The following expression maps the feature’s weight value to a shade on the Viridis color ramp:
ramp_color( 'Viridis', scale_linear("weight",0,2000,0,1) )
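For readers unfamiliar with scale_linear, the following plain-Python sketch shows the mapping it performs as I understand it: values are interpolated linearly from the input domain to the output range and clamped at both ends. The Python function only mimics the expression function and is not part of any QGIS API.

```python
def scale_linear(value, domain_min, domain_max, range_min, range_max):
    """Linear interpolation from a domain to a range, clamped at both ends."""
    if value <= domain_min:
        return range_min
    if value >= domain_max:
        return range_max
    ratio = (value - domain_min) / (domain_max - domain_min)
    return range_min + ratio * (range_max - range_min)


# A weight of 500 people lands a quarter of the way along the color ramp.
print(scale_linear(500, 0, 2000, 0, 1))   # 0.25
# Weights above 2000 are clamped to the end of the ramp.
print(scale_linear(3000, 0, 2000, 0, 1))  # 1
```

The upper domain bound of 2000 therefore decides which flows share the brightest color; it should be set close to the largest weight in the data.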
You can use any color ramp you like. If you want to use the Viridis color ramp, save the following code into an .xml file and import it using the Style Manager. (This color ramp has been provided by Richard Styron on rocksandwater.net.)
<!DOCTYPE qgis_style>
<qgis_style version="0">
  <symbols/>
  <colorramps>
    <colorramp type="gradient" name="Viridis">
      <prop k="color1" v="68,1,84,255"/>
      <prop k="color2" v="253,231,36,255"/>
      <prop k="stops" v="0.04;71,15,98,255:0.08;72,29,111,255:0.12;71,42,121,255:0.16;69,54,129,255:0.20;65,66,134,255:0.23;60,77,138,255:0.27;55,88,140,255:0.31;50,98,141,255:0.35;46,108,142,255:0.39;42,118,142,255:0.43;38,127,142,255:0.47;35,137,141,255:0.51;31,146,140,255:0.55;30,155,137,255:0.59;32,165,133,255:0.62;40,174,127,255:0.66;53,183,120,255:0.70;69,191,111,255:0.74;89,199,100,255:0.78;112,206,86,255:0.82;136,213,71,255:0.86;162,218,55,255:0.90;189,222,38,255:0.94;215,226,25,255:0.98;241,229,28,255"/>
    </colorramp>
  </colorramps>
</qgis_style>
If we go back to the Print Composer and update the map item previews, we see it all come together:
Finally, we set title, legend, explanatory texts, and background color:
I think it is amazing that we are able to design a visualization like this without having to create any intermediate files or having to write custom code. Whenever a value is edited in the original migration dataset, the change is immediately reflected in the small multiples.
A year ago I asked the QGIS community what their favourite new QGIS features from 2015 were and published this blog post. This year I decided to ask again. In 2016, we saw the second long-term release (2.14 LTR) and two other stable versions (2.16 and 2.18).
2016 was again a very productive year for the QGIS community, with lots of improvements and new features landing in the QGIS source code, not to mention all the work already in place for QGIS 3. This is a great assurance of the project’s vitality.
5 – Paste a style to multiple selected layers or to all layers in a legend group (2.14)
This is a productivity feature that I only noticed now, with so many people voting for it. If copying and pasting styles was, in my opinion, a killer feature, being able to apply it to multiple layers or even a whole group is just great.
4 – fTools plugin has been replaced with Processing algorithms (2.16)
When you check the Vector menu, the tools look the same as in the previous version, but once you open them you understand the difference. All vector tools, provided until now by the fTools core plugin, were replaced by equivalent Processing algorithms. For users this means easier access to more functionality, like running the tools in batch mode or getting outputs as temporary layers. In addition, some of the tools have been improved.
3 – Virtual layers (2.14)
This is definitely one of my favourite new features, and it seems I’m not alone. With virtual layers you can run SQL queries using the layers loaded in the project, even when the layers are not stored in a relational database. We are not talking about WHERE statements to filter data; with this you can run real SQL queries, with spatial analysis, aggregations, and so on. Besides, virtual layers act as VIEWs, and any change to any of the input layers will automatically update the layer.
2 – Speed and memory improvements (2.14)
It’s no surprise that speed and memory improvements were one of the most voted features. Lots of improvements were made for loading and managing large datasets, and this has a tremendous impact on all users. According to the changelog, zoom is faster, selecting features is faster, updating attributes on selected features is faster, and it consumes less memory. So don’t be afraid to put QGIS to the test.
1 – Trace digitising tool (2.14)
If you do lots of digitising, you had better look into this new feature that landed in QGIS 2.14. It allows you to digitize new features by tracing the boundaries of other layers. Besides improving the quality of layer topology, this can make digitizing feel almost pleasant and fast! Just click the first point, then move your mouse along the edges of other features to pick up more vertices.
There were other new features that also delighted many users. For example, several improvements to labeling, Georeference outputs (e.g. PDF) from composer (2.16), Filter legend by expression (2.14), 2.5D Renderer. Personally, the Style docker made my day/year. But you can check the full results of the survey, if you like.
Obviously, this list means nothing at all. All new features were of tremendous value, and will be useful for thousands (yes thousands) of people. It was a mere exercise as, with such a diverse QGIS crowd, it would be impossible to build a list that would fit us all. Besides, there were many great enhancements, introduced during 2016, that might have fallen under the radar for most users. Check the visual changelogs for a full list of new features.
On my behalf, to all developers, sponsors, and general QGIS contributors, once again
THANK YOU VERY MUCH FOR YOUR TREMENDOUS WORK!
I wish you a fantastic 2017.
QGIS Developer Sprint in Lyon
QGIS Server 3.0 is going to be better than ever! Last week I attended the mini code sprint organized by the French QGIS developers in Lyon.
The code sprint was focused on QGIS Server refactoring to reach the following goals:
- increase maintainability through modularity and clean code responsibilities
- increase performance
- better multi-project handling and caching
- multi threaded rendering
Because the developers of a big open source project like QGIS work for different companies, coordination between them is largely achieved through this kind of event.
We were a small group of engaged QGIS Server developers and I think that the alternation between brainstorming and coding proved to be very productive: after two days we were able to set common milestones and commitments that will ensure a bright future for QGIS Server.
A huge and warm thank-you to the French QGIS developers who organized this meeting!
Last time, I wrote about the little details that make a good flow map. The data in that post was made up and simpler than your typical flow map. That’s why I wanted to redo it with real-world data. In this post, I’m using domestic migration data of Austria.
With 9 states, that makes 72 potential flow arrows. Since that’s too much to map, I’ve decided in a first step to only show flows with more than 1,000 people.
Following the recommendations mentioned in the previous post, I first designed a basic flow map where each flow direction is rendered as a black arrow:
Even with this very limited number of flows, the map gets pretty crowded, particularly around the north-eastern node, the Austrian capital Vienna.
To reduce the number of incoming and outgoing lines at each node, I therefore decided to change to colored one-sided arrows that share a common geometry:
The arrow color is determined automatically based on the arrow direction using the following expression:
CASE
  WHEN "weight" < 1000 THEN color_rgba(0,0,0,0)
  WHEN x(start_point($geometry)) - x(end_point($geometry)) < 0 THEN '#1f78b4'
  ELSE '#ff7f00'
END
The same approach is used to control the side of the one-sided arrow head. The arrow symbol layer has two “arrow type” options for rendering the arrow head: on the inside of the curve or on the outside. This means that, if we didn’t use a data-defined approach, the arrow head would always be on the same side, independent of the line geometry direction.
CASE
  WHEN x(start_point($geometry)) - x(end_point($geometry)) < 0 THEN 1
  ELSE 2
END
Obviously, this ignores the corner case of start and end points at the same x coordinate but, if necessary, this case can be added easily.
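The logic of the two expressions, including one possible way of handling the corner case, can be sketched in plain Python. The function name and the fallback to the y coordinate are assumptions for illustration only; any other tie-breaking rule would work just as well.

```python
def arrow_style(start, end, weight, min_weight=1000):
    """Return (color, head_type) for a flow, or None if the flow is hidden.

    start and end are (x, y) tuples. Falling back to the y coordinate when
    both x coordinates are equal is an assumption, not part of the original
    expressions.
    """
    if weight < min_weight:
        return None  # corresponds to the fully transparent color
    delta = end[0] - start[0]
    if delta == 0:
        delta = end[1] - start[1]  # corner case: purely vertical flow
    if delta > 0:
        return ("#1f78b4", 1)
    return ("#ff7f00", 2)


print(arrow_style((0, 0), (5, 0), 1500))  # ('#1f78b4', 1)
print(arrow_style((5, 0), (0, 0), 1500))  # ('#ff7f00', 2)
print(arrow_style((0, 0), (0, 5), 1500))  # ('#1f78b4', 1)
```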
Of course the results are far from perfect and this approach still requires manual tweaking of the arrow geometries. Nonetheless, I think it’s very interesting to see how far we can push the limits of data-driven styling for flow maps.
Give it a try! You’ll find the symbol and accompanying sample data on the QGIS resource sharing plugin platform:
In my previous post, I shared a flow map style that was inspired by a hand drawn map. Today’s post is inspired by a recent academic paper recommended to me by Radoslaw Panczak @RPanczak and Thomas Gratier @ThomasG77:
Jenny, B., Stephen, D. M., Muehlenhaus, I., Marston, B. E., Sharma, R., Zhang, E., & Jenny, H. (2016). Design principles for origin-destination flow maps. Cartography and Geographic Information Science, 1-15.
Jenny et al. (2016) performed a study on how to best design flow maps. The resulting design principles are:
- number of flow overlaps should be minimized;
- sharp bends and excessively asymmetric flows should be avoided;
- acute intersection angles should be avoided;
- flows must not pass under unconnected nodes;
- flows should be radially arranged around nodes;
- quantity is best represented by scaled flow width;
- flow direction is best indicated with arrowheads;
- arrowheads should be scaled with flow width, but arrowheads for thin flows should be enlarged; and
- overlaps between arrowheads and flows should be avoided.
Many of these points concern the arrangement of flow lines but I want to talk about those design principles that can be implemented in a QGIS line style. I’ve summarized the three core ideas:
- use arrow heads and scale arrow width according to flow,
- enlarge arrow heads for thin flows, and
- use nodes to arrange flows and avoid overlaps of arrow heads and flows
To get started, we can use a standard QGIS arrow symbol layer. To represent the flow value (“weight”) according to the first design principle, all arrow parameters are data-defined:
To enlarge the arrow heads for thin flow lines, as required by the second design principle, we can add a fixed value to the data-defined head length and thickness:
The main issue with this flow map is that it gets messy as soon as multiple arrows end at the same location. The arrow heads are plotted on top of each other and at some point it is almost impossible to see which arrow starts where. This is where the third design principle comes into play!
To fix the overlap issue, we can add big round nodes at the flow start and end points. These node buffers are both used to render circles on the map, as well as to shorten the arrows by cutting off a short section at the beginning and end of the lines:
difference(
  difference(
    $geometry,
    buffer( start_point($geometry), 10000 )
  ),
  buffer( end_point($geometry), 10000 )
)
Note that the buffer values in this expression only produce appropriate results for line datasets which use a CRS in meters and will have to be adjusted for other units.
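What the nested difference() calls achieve can be illustrated for the simplest case, a straight two-point line, with a little plain Python. This is only a geometric sketch: the helper name is hypothetical, and real flow lines with an intermediate point are curved, which this version ignores.

```python
import math


def trim_line(start, end, distance):
    """Cut `distance` off both ends of a straight line (hypothetical helper).

    Returns the shortened (start, end) pair, or None if nothing remains.
    """
    (x1, y1), (x2, y2) = start, end
    length = math.hypot(x2 - x1, y2 - y1)
    if length <= 2 * distance:
        return None  # the two node buffers swallow the whole line
    ux, uy = (x2 - x1) / length, (y2 - y1) / length  # unit direction vector
    return (
        (x1 + ux * distance, y1 + uy * distance),
        (x2 - ux * distance, y2 - uy * distance),
    )


# A 100 km line in a meter-based CRS, trimmed by the 10000 m node radius.
print(trim_line((0, 0), (100000, 0), 10000))
# ((10000.0, 0.0), (90000.0, 0.0))
```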
It’s great to have some tried and evaluated design guidelines for our flow maps. As always: Know your cartography rules before you start breaking them!
PS: To draw a curved arrow, the line needs to have one intermediate point between start and end – so three points in total. Depending on the intermediate point’s position, the line is more or less curved.
The QGIS map style I want to share with you today was inspired by a hand-drawn map by Philippe Rekacewicz that I saw on Twitter:
The look reminds me of conveyor belts, thus the name choice.
The conveyor belt is a line symbol that makes extensive use of Geometry generators. One generator for the circle at the flow line start and end point, respectively, another generator for the belt, and a final one for the small arrows around the colored circles. The color and size of the circle are data defined:
The collection also contains a sample Geopackage dataset which you can use to test the symbol immediately. It is worth noting that the circle size has to be specified in layer CRS units.
It’s great fun playing with the power of Geometry generator symbol layers and QGIS geometry expressions. For example, this is the expression for the final geometry that is used to draw the small arrows around colored circles:
line_merge(
  intersection(
    exterior_ring(
      convex_hull(
        union(
          buffer( start_point($geometry), "start_size" ),
          buffer( end_point($geometry), 500000 )
        )
      )
    ),
    exterior_ring(
      buffer( start_point($geometry), "start_size" )
    )
  )
)
The expression constructs buffer circles, the belt geometry (convex_hull around buffers), and finally extracts the intersecting part from the start circle and the belt geometry.
Hope you enjoy it!
It’s holiday season, why not share one of your own symbols with the QGIS community?
QGIS’ handling of color ramps has just gotten much better with a series of improvements we committed to the open source project’s upcoming version 3.0.
These slides give a brief summary of the changes: Color ramp handling, made fun!
On the developer front, one nice improvement is the addition of an invert() function directly attached to color ramp classes (QgsColorRamp and its children). This removed the need for symbol layers and renderers to implement individual invert-related functions; those are now served with a customized source color ramp, with edited steps and/or reversed order already taken into account.
The following packages can now be installed:
- qgis 2.18.0
- qgis-debuginfo 2.18.0
- qgis-devel 2.18.0
- qgis-grass 2.18.0
- qgis-python 2.18.0
- qgis-server 2.18.0
Installation instructions (run as “root” user or use “sudo”):
su
# Fedora 23, Fedora 24:
dnf copr enable neteler/QGIS-2.18-Las-Palmas
dnf update
# note: the "qca-ossl" package is the OpenSSL plugin for QCA
dnf install qgis qgis-grass qgis-python qca-ossl
In the previous post, I presented an approach to generalize big trajectory datasets by extracting flows between cells of a data-driven irregular grid. This generalization provides a much better overview of the flow and directionality than a simple plot of the original raw trajectory data can. The paper introducing this method also contains more advanced visualizations that show cell statistics, such as the overall count of trajectories or the generalization quality. Another bit of information that is often of interest when exploring movement data, is the time of the movement. For example, at LBS2016 last week, M. Jahnke presented an application that allows users to explore the number of taxi pickups and dropoffs at certain locations:
By adopting this approach for the generalized flow maps, we can, for example, explore which parts of the research area are busy at which time of the day. Here I have divided the day into four quarters: night from 0 to 6 (light blue), morning from 6 to 12 (orange), afternoon from 12 to 18 (red), and evening from 18 to 24 (dark blue).
The resulting visualization shows that overall, there is less movement during the night hours from midnight to 6 in the morning (light blue quarter). Sounds reasonable!
One implementation detail worth considering is which timestamp should be used for counting the number of movements. Should it be the time of the first trajectory point entering a cell, or the time when the trajectory leaves the cell, or some average value? In the current implementation, I have opted for the entry time. This means that if the tracked person spends a long time within a cell (e.g. at the work location) the trip home only adds to the evening trip count of the neighboring cell along the trajectory.
Since the time information stored in a PostGIS LinestringM feature’s m-value does not contain any time zone information, we also have to pay attention to handle any necessary offsets. For example, the GeoLife documentation states that all timestamps are provided in GMT while Beijing is in the GMT+8 time zone. This offset has to be accounted for in the analysis script, otherwise the counts per time of day will be all over the place.
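The offset handling and the assignment to the four quarters of the day can be sketched in a few lines of Python. This is not the analysis script itself; it assumes that the m-value holds a Unix timestamp in seconds (GMT), and the function and constant names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

QUARTERS = ["night", "morning", "afternoon", "evening"]  # 0-6, 6-12, 12-18, 18-24


def quarter_of_day(epoch_seconds, utc_offset_hours=8):
    """Map a GMT timestamp to a quarter of the day in local time.

    Assumes the m-value stores Unix epoch seconds; the default offset of
    8 hours corresponds to Beijing (GMT+8).
    """
    utc_time = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    local_time = utc_time + timedelta(hours=utc_offset_hours)
    return QUARTERS[local_time.hour // 6]


# 23:30 GMT is already 07:30 the next morning in Beijing.
m_value = datetime(2008, 10, 23, 23, 30, tzinfo=timezone.utc).timestamp()
print(quarter_of_day(m_value))                      # morning
print(quarter_of_day(m_value, utc_offset_hours=0))  # evening
```

The second call shows what happens when the offset is forgotten: the same movement is counted in the wrong quarter.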
Using the same approach, we could also investigate other variations, e.g. over different days of the week, seasonal variations, or the development over multiple years.
In the first two parts of the Movement Data in GIS series, I discussed modeling trajectories as LinestringM features in PostGIS to overcome some common issues of movement data in GIS and presented a way to efficiently render speed changes along a trajectory in QGIS without having to split the trajectory into shorter segments.
While visualizing individual trajectories is important, the real challenge is trying to visualize massive trajectory datasets in a way that enables further analysis. The out-of-the-box functionality of GIS is painfully limited. Except for some transparency and heatmap approaches, there is not much that can be done to help interpret “hairballs” of trajectories. Luckily researchers in visual analytics have already put considerable effort into finding solutions for this visualization challenge. The approach I want to talk about today is by Andrienko, N., & Andrienko, G. (2011). Spatial generalization and aggregation of massive movement data. IEEE Transactions on visualization and computer graphics, 17(2), 205-219. and consists of the following main steps:
- Extracting characteristic points from the trajectories
- Grouping the extracted points by spatial proximity
- Computing group centroids and corresponding Voronoi cells
- Dividing trajectories into segments according to the Voronoi cells
- Counting transitions from one cell to another
The authors do a great job at describing the concepts and algorithms, which made it relatively straightforward to implement them in QGIS Processing. So far, I’ve implemented the basic logic but the paper contains further suggestions for improvements. This was also my first pyQGIS project that makes use of the measurement value support in the new geometry engine. The time information stored in the m-values is used to detect stop points, which – together with start, end, and turning points – make up the characteristic points of a trajectory.
The following animation illustrates the current state of the implementation: First the “hairball” of trajectories is rendered. Then we extract the characteristic points and group them by proximity. The big black dots are the resulting group centroids. From there, I skipped the Voronoi cells and directly counted transitions from “nearest to centroid A” to “nearest to centroid B”.
The resulting visualization makes it possible to analyze flow strength as well as directionality. I have deliberately excluded all connections with a count below 10 transitions to reduce visual clutter. The cell size / distance between point groups – and therefore the level-of-detail – is one of the input parameters. In my example, I used a target cell size of approximately 2km. This setting results in connections which follow the major roads outside the city center very well. In the city center, where the road grid is tighter, trajectories on different roads mix and the connections are less clear.
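The shortcut of counting transitions from “nearest to centroid A” to “nearest to centroid B” can be sketched in plain Python. This is a simplified illustration and not the Processing script: the function names and the sample coordinates are made up, and distances are plain Euclidean distances in layer units.

```python
import math
from collections import Counter


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def count_transitions(trajectories, centroids):
    """Count moves between cells, where a cell is the nearest centroid."""
    counts = Counter()
    for trajectory in trajectories:
        cells = [nearest_centroid(p, centroids) for p in trajectory]
        for a, b in zip(cells, cells[1:]):
            if a != b:  # only a change of cell counts as a transition
                counts[(a, b)] += 1
    return counts


centroids = [(0, 0), (10, 0)]
trajectories = [
    [(1, 0), (2, 1), (9, 0)],  # moves from cell 0 to cell 1
    [(9, 1), (1, 1)],          # moves from cell 1 to cell 0
]
counts = count_transitions(trajectories, centroids)
print(counts[(0, 1)], counts[(1, 0)])  # 1 1

# Dropping weak connections, as done for the map above (threshold of 10).
strong = {pair: n for pair, n in counts.items() if n >= 10}
```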
Since trajectories in this dataset are not limited to car trips, it is expected to find additional movement that is not restricted to the road network. This is particularly noticeable in the dense area in the west where many slow trajectories – most likely from walking trips – are located. The paper also covers how to ensure that connections are limited to neighboring cells by densifying the trajectories before computing step 4.
Running the scripts for over 18,000 trajectories requires patience. It would be worth evaluating if the first three steps can be run with only a subsample of the data without impacting the results in a negative way.
One thing I’m not satisfied with yet is the way to specify the target cell size. While it’s possible to measure ellipsoidal distances in meters using QgsDistanceArea (irrespective of the trajectory layer’s CRS), the initial regular grid used in step 2 in order to group the extracted points has to be specified in the trajectory layer’s CRS units – quite likely degrees. Instead, it may be best to transform everything into an equidistant projection before running any calculations.
It’s good to see that PyQGIS enables us to use the information encoded in PostGIS LinestringM features to perform spatio-temporal analysis. However, working with m or z values involves a lot of v2 geometry classes which work slightly differently than their v1 counterparts. It certainly takes some getting used to. This situation might get cleaned up as part of the QGIS 3 API refactoring effort. If you can, please support work on QGIS 3. Now is the time to shape the PyQGIS API for the following years!
The possibility to easily share plugins with other users and discover plugins written by other community members has been a powerful feature of QGIS for many years.
The QGIS Resources Sharing plugin is meant to enable the same sharing for map design resources. It allows you to share collections of resources, including but not limited to SVGs, symbols, styles, color ramps, and processing scripts.
Using the Resource Sharing plugin is like using the Plugin Manager. Once installed, you are presented with a list of available resource collections for download. You will find that there are already some really nice collections, including nautical symbols, Mapbox Maki Icons, and my Google-like OSM road style.
By pressing Install, the resource collection is downloaded and you can have a look at the content using the Open folder button. In case of the Mapbox Maki Icon collection, it contains a folder of SVGs:
Using the new icons is as simple as opening the layer styling settings and selecting the Mapbox Maki Icons collection in the SVG group list:
Similarly, if you download the OSM Spatialite Googlemaps collection, its road line symbols are added to your existing list of available line symbols:
By pressing the Open Library button, you get to the Style Manager where you can browse through all installed symbols and delete, rename, or categorize them.
The Resource Sharing plugin was developed by Akbar Gumbira during this year’s Google Summer of Code. The full documentation, including instructions for how to share your own symbols with the community, is available at www.akbargumbira.com/qgis_resources_sharing.
I’ve recently spent some time optimising the performance of various QGIS plugins and algorithms, and I’ve noticed that there’s a few common performance traps which developers fall into when fetching features from a vector layer. In this post I’m going to explore these traps, what makes them slow, and how to avoid them.
As a bit of background, features are fetched from a vector layer in QGIS using a QgsFeatureRequest object. Common use is something like this:
from qgis.core import QgsFeatureRequest

request = QgsFeatureRequest()
for feature in vector_layer.getFeatures(request):
    # do something
    pass
This code would iterate over all the features in the layer. Filtering the features is done by tweaking the QgsFeatureRequest, such as:
request = QgsFeatureRequest().setFilterFid(1001)
feature_1001 = next(vector_layer.getFeatures(request))
In this case calling getFeatures(request) just returns the single feature with an ID of 1001 (which is why we shortcut and use next(…) here instead of iterating over the results).
Now, here’s the trap: calling getFeatures is expensive. If you call it on a vector layer, QGIS will be required to set up a new connection to the data store (the layer provider), create some query to return data, and parse each result as it is returned from the provider. This can be slow, especially if you’re working with some type of remote layer, such as a PostGIS table over a VPN connection. This brings us to our first trap:
Trap #1: Minimise the calls to getFeatures()
A common task in PyQGIS code is to take a list of feature IDs and then request those features from the layer. I see a lot of older code which does this using something like:
for id in some_list_of_feature_ids:
    request = QgsFeatureRequest().setFilterFid(id)
    feature = next(vector_layer.getFeatures(request))
    # do something with the feature
Why is this a bad idea? Well, remember that every time you call getFeatures() QGIS needs to do a whole bunch of things before it can start giving you the matching features. In this case, the code is calling getFeatures() once for every feature ID in the list. So if the list had 100 features, QGIS has to do all of the following 100 times: create a connection to the data source, set up and prepare a query to match a single feature, wait for the provider to process that, and then finally parse the single feature result. That’s a lot of wasted processing!
If the code is rewritten to take the call to getFeatures() outside of the loop, then the result is:
request = QgsFeatureRequest().setFilterFids(some_list_of_feature_ids)
for feature in vector_layer.getFeatures(request):
    # do something with the feature
    pass
Now there’s just a single call to getFeatures() here. QGIS optimises this request by using a single connection to the data source, preparing the query just once, and fetching the results in appropriately sized batches. The difference is huge, especially if you’re dealing with a large number of features.
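The difference in the number of requests is easy to demonstrate without QGIS. The toy class below is purely hypothetical (it is not a PyQGIS class and does not model connections or queries); it only counts how often features are requested in each of the two patterns.

```python
class FakeLayer:
    """Stand-in for a vector layer that counts feature requests (hypothetical)."""

    def __init__(self, features):
        self._features = dict(features)
        self.requests = 0

    def get_features(self, ids):
        # Every call stands for one round trip to the data provider.
        self.requests += 1
        return (self._features[i] for i in ids if i in self._features)


feature_ids = list(range(100))
features = {i: f"feature {i}" for i in feature_ids}

# Pattern from Trap #1: one request per feature ID.
slow_layer = FakeLayer(features)
for feature_id in feature_ids:
    feature = next(slow_layer.get_features([feature_id]))
print(slow_layer.requests)  # 100

# Rewritten pattern: a single request for all IDs.
fast_layer = FakeLayer(features)
for feature in fast_layer.get_features(feature_ids):
    pass
print(fast_layer.requests)  # 1
```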
Trap #2: Use QgsFeatureRequest filters appropriately
Here’s another common mistake I see in PyQGIS code. I often see this one when an author is trying to do something with all the selected features in a layer:
    for feature in vector_layer.getFeatures():
        if not feature.id() in vector_layer.selectedFeaturesIds():
            continue
        # do something with the feature
What’s happening here is that the code is iterating over all the features in the layer, and then skipping over any which aren’t in the list of selected features. See the problem here? This code iterates over EVERY feature in the layer. If your layer has 10 million features, we are fetching every one of these from the data source, going through all the work of parsing it into a QGIS feature, and then promptly discarding it if it’s not in our list of selected features. It’s very inefficient, especially if fetching features is slow (such as when connecting to a remote database source).
Instead, this code should use the setFilterFids() method for QgsFeatureRequest:
    request = QgsFeatureRequest().setFilterFids(vector_layer.selectedFeaturesIds())
    for feature in vector_layer.getFeatures(request):
        # do something with the feature
Now, QGIS will only fetch features from the provider with matching feature IDs from the list. Instead of fetching and processing every feature in the layer, only the actual selected features will be fetched. It’s not uncommon to see operations which previously took many minutes (or hours!) drop down to a few seconds after applying this fix.
Another variant of this trap uses expressions to test the returned features:
    filter_expression = QgsExpression('my_field > 20')
    for feature in vector_layer.getFeatures():
        if not filter_expression.evaluate(feature):
            continue
        # do something with the feature
Again, this code is fetching every single feature from the layer and then discarding it if it doesn’t match the “my_field > 20” filter expression. By rewriting this to:
    request = QgsFeatureRequest().setFilterExpression('my_field > 20')
    for feature in vector_layer.getFeatures(request):
        # do something with the feature
we hand over the bulk of the filtering to the data source itself. Recent QGIS versions intelligently translate the filter into a format which can be applied directly at the provider, meaning that any relevant indexes and other optimisations can be applied by the provider itself. In this case the rewritten code means that ONLY the features matching the ‘my_field > 20’ criteria are fetched from the provider – there’s no time wasted messing around with features we don’t need.
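To see what "handing the filter to the data source" buys you, here's a small stand-alone analogy using Python's built-in sqlite3 module in place of a QGIS provider. The table layout and values are invented for the example, and it only illustrates the idea; it is not how QGIS translates expressions.

```python
import sqlite3

# In-memory table standing in for a layer's data store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (fid INTEGER PRIMARY KEY, my_field INTEGER)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?)",
    [(fid, fid % 50) for fid in range(1, 1001)],
)

# Anti-pattern: pull every row across, then throw most of them away in Python.
all_rows = conn.execute("SELECT fid, my_field FROM features").fetchall()
filtered_in_python = [row for row in all_rows if row[1] > 20]

# Better: let the data store apply the filter, so only matching rows come back.
filtered_at_source = conn.execute(
    "SELECT fid, my_field FROM features WHERE my_field > 20"
).fetchall()
```

The two result lists are identical, but the first approach had to transfer and parse all 1000 rows to get there.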
Trap #3: Only request values you need
The last trap I often see is that more values are requested from the layer than are actually required. Let’s take the code:
    my_sum = 0
    for feature in vector_layer.getFeatures(request):
        my_sum += feature['value']
In this case there’s no way we can optimise the filters applied, since we need to process every feature in the layer. But – this code is still inefficient. By default QGIS will fetch all the details for a feature from the provider. This includes all attribute values and the feature’s geometry. That’s a lot of processing – QGIS needs to transform the values from their original format into a format usable by QGIS, and the feature’s geometry needs to be parsed from its original type and rebuilt as a QgsGeometry object. In our sample code above we aren’t doing anything with the geometry, and we are only using a single attribute from the layer. By calling setFlags( QgsFeatureRequest.NoGeometry ) and setSubsetOfAttributes() we can tell QGIS that we don’t need the geometry, and we only require a single attribute’s value:
    my_sum = 0
    request = QgsFeatureRequest().setFlags(QgsFeatureRequest.NoGeometry).setSubsetOfAttributes(['value'], vector_layer.fields())
    for feature in vector_layer.getFeatures(request):
        my_sum += feature['value']
None of the unnecessary geometry parsing will occur, and only the ‘value’ attribute will be fetched and populated in the features. This cuts down both on the processing required AND the amount of data transfer between the layer’s provider and QGIS. It’s a significant improvement if you’re dealing with larger layers.
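The saving is easy to picture with a plain database table. The sketch below is an analogy only, using Python's built-in sqlite3 module with a made-up table in which a 10 kB blob plays the role of each feature's geometry; none of the names come from QGIS.

```python
import sqlite3

# Hypothetical table: one small numeric attribute plus a bulky "geometry" blob.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (fid INTEGER PRIMARY KEY, value REAL, name TEXT, geom BLOB)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?)",
    [(fid, float(fid), "feature %d" % fid, b"\x00" * 10000) for fid in range(1, 101)],
)

# Anti-pattern: fetch every column, including the blob we never look at.
full_rows = conn.execute("SELECT * FROM features").fetchall()
sum_from_full = sum(row[1] for row in full_rows)
unused_bytes = sum(len(row[3]) for row in full_rows)

# Better: ask only for the single column the sum needs.
slim_rows = conn.execute("SELECT value FROM features").fetchall()
sum_from_slim = sum(row[0] for row in slim_rows)
```

Both sums agree, but the first query dragged along a megabyte of blob data (`unused_bytes`) which was never used.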
Optimising your feature requests is one of the easiest ways to speed up your PyQGIS script! It’s worth spending some time looking over all your uses of getFeatures() to see whether you can cut down on what you’re requesting – the results can often be mind blowing!
This is a guest post by Mickael HOARAU @Oneil974
For people who are working with the QGIS Atlas feature, I have made an Atlas version of my last tutorial. The difficulty level is a little higher than in the last tutorial, but there are features that you may appreciate. I’m happy to share it with you and I hope it will be helpful.
You can download the tutorial here:
And sources here:
PS : I’m looking for job offers, feel free to contact me on twitter @Oneil974
In the first part of the Movement Data in GIS series, I discussed some of the common issues of modeling movement data in GIS, followed by a recommendation to model trajectories as LinestringM features in PostGIS to simplify analyses and improve query performance.
Of course, we don’t only want to analyse movement data within the database. We also want to visualize it to gain a better understanding of the data or communicate analysis results. For example, take one trajectory:
Visualizing movement direction is easy: just slap an arrow head on the end of the line and done. What about movement speed? Sure! Mean speed, max speed, which should it be?
Speed along the trajectory, a value for each segment between consecutive positions.
With the usual GIS data model, we are back to square one. A line usually has one color and width. Of course we can create dotted and dashed lines but that’s not getting us anywhere here. To visualize speed variations along the trajectory, we therefore split the original trajectory into its segments, 1429 in this case. Then we can calculate speed for each segment and use a graduated or data defined renderer to show the results:
Very unsatisfactory! We had to increase the number of features 1429 times just to show speed variations along the trajectory, even though the original single trajectory feature already contained all the necessary information and QGIS does support geometries with measurement values.
Starting from QGIS 2.14, we have an alternative way to deal with this issue. We can stick to the original single trajectory feature and render it using the new geometry generator symbol layer. (This functionality is also used under the hood of the 2.5D renderer.) Using the segments_to_lines() function, the geometry generator basically creates individual segment lines on the fly:
Once this is set up, we can style the segments with a data-defined expression that determines the speed on the segment and returns the respective color along a color ramp:
Speed is calculated using the length of the segment and the time between segment start and end point. Then speed values from 0 to 50 km/h are mapped to the red-yellow-blue color ramp:
ramp_color(
  'RdYlBu',
  scale_linear(
    length(
      transform(
        geometry_n($geometry, @geometry_part_num),
        'EPSG:4326', 'EPSG:54027'
      )
    ) / (
      m(end_point(geometry_n($geometry, @geometry_part_num))) -
      m(start_point(geometry_n($geometry, @geometry_part_num)))
    ) * 3.6,
    0, 50,
    0, 1
  )
)
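To make the arithmetic in this expression concrete, here is a plain-Python sketch of the same calculation outside of QGIS. It assumes the vertices are already in a projected CRS with metre units and that the M values are timestamps in seconds (which is what the factor 3.6 for m/s to km/h implies). The trajectory values are made up, and the helper functions only imitate the QGIS expression functions of similar name.

```python
import math


def segments(vertices):
    """Pair up consecutive vertices, similar in spirit to segments_to_lines()."""
    return list(zip(vertices[:-1], vertices[1:]))


def speed_kmh(start, end):
    """Segment length in metres divided by elapsed seconds, converted to km/h."""
    (x1, y1, m1), (x2, y2, m2) = start, end
    length_m = math.hypot(x2 - x1, y2 - y1)
    return length_m / (m2 - m1) * 3.6


def scale_linear(value, domain_min, domain_max, range_min, range_max):
    """Map value linearly from the domain onto the range, clamping at both ends."""
    if value <= domain_min:
        return range_min
    if value >= domain_max:
        return range_max
    share = (value - domain_min) / (domain_max - domain_min)
    return range_min + share * (range_max - range_min)


# Made-up trajectory vertices: (x in metres, y in metres, m in seconds).
trajectory = [(0.0, 0.0, 0.0), (100.0, 0.0, 10.0), (100.0, 300.0, 25.0)]

speeds = [speed_kmh(a, b) for a, b in segments(trajectory)]
ramp_positions = [scale_linear(speed, 0, 50, 0, 1) for speed in speeds]
```

The first segment covers 100 m in 10 s (36 km/h), which lands at 0.72 along the ramp. The second covers 300 m in 15 s (72 km/h), which is above the 50 km/h upper bound and is therefore clamped to the end of the ramp.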
Thanks a lot to @ for all the help figuring out the details!
While the following map might look just like the previous one in the end, note that we now only deal with the original single line feature:
Similar approaches can be used to label segments or positions along the trajectory without having to break the original feature. Thanks to the geometry generator functionality, we can make direct use of the LinestringM data model for trajectory visualization.
The 6th QGIS UK user group meeting in Scotland is happening on the 3rd November 2016. It is being hosted by the EDINA University of Edinburgh at the Informatics Forum and is sponsored by thinkWhere, Ordnance Survey, Angus Council and Cawdor Forestry. Tickets are available through Eventbrite.
The almost final programme of presentations and lightning talks is as follows:
- Phil Taylor (CEH) – How deep is your loch?
- Fiona Hemsley-Flint – QGIS server
- University of Edinburgh – packaging and deploying QGIS
- Anouk Lang – Mapping narrative: QGIS in the humanities classroom
- Art Lembo (Salisbury University, MD) – terrain analysis with massively parallel processing techniques (embarrassingly so)
- Neil Benny (thinkWhere) – finding the heart of Scotland / viewshed analysis
- Tom Chadwin – qgis2web and coding a QGIS plugin
- Pete Wells (Lutra) – WMTS previews and XYZ support
- Stephen Bathgate – decision support system in Forestry
- Tim Manners (Ordnance Survey) – Creating an indoor routable network with QGIS and pgRouting
- Andrew Whitelee – QGIS in forestry/ecology
- Ross McDonald (Angus Council) – Them thar hills: shaded, textured and blended
- Michal Michalski (The Origins of Doha and Qatar Project) – DOHA: Doha Online Historical Atlas
- eeGeo – Using QGIS to create 3D indoor maps
Doors open from 9:00. Registration shortly thereafter. Start and welcome at 9:45 and a planned finish at 16:30. Geobeers to follow.
A common use case of the QGIS TimeManager plugin is visualizing tracking data such as animal migration data. This post illustrates the steps necessary to create an animation from bird migration data. I’m using a dataset published on Movebank:
Fraser KC, Shave A, Savage A, Ritchie A, Bell K, Siegrist J, Ray JD, Applegate K, Pearman M (2016) Data from: Determining fine-scale migratory connectivity and habitat selection for a migratory songbird by using new GPS technology. Movebank Data Repository. doi:10.5441/001/1.5q5gn84d.
It’s a CSV file which can be loaded into QGIS using the Add delimited text layer tool. Once loaded, we can get started:
1. Identify time and ID columns
Especially if you are new to the dataset, have a look at the attribute table and identify the attributes containing timestamps and ID of the moving object. In our sample dataset, time is stored in the aptly named timestamp attribute and uses ISO standard formatting %Y-%m-%d %H:%M:%S.%f. This format is ideal for TimeManager and we can use it without any changes. The object ID attribute is titled individual-local-identifier.
The dataset contains 128 positions of 14 different birds. This means that there are rather long gaps between consecutive observations. In our animation, we’ll want to fill these gaps with interpolated positions to get uninterrupted movement traces.
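To get a feeling for what filling the gaps means, here is a small stand-alone Python sketch of linear interpolation between two observations, with timestamps parsed using the format mentioned above. It is an illustration under simple assumptions (straight-line interpolation of the raw coordinates, invented sample positions) and not TimeManager's actual implementation.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_time(text):
    """Parse a timestamp string in the dataset's format."""
    return datetime.strptime(text, TIME_FORMAT)


def interpolate(obs_before, obs_after, when):
    """Linearly interpolate an (x, y) position at time `when`."""
    (t0, x0, y0), (t1, x1, y1) = obs_before, obs_after
    fraction = (when - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return (x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0))


# Two made-up observations of one bird, four days apart: (time, lon, lat).
first = (parse_time("2016-04-01 00:00:00.000"), -60.0, -4.0)
second = (parse_time("2016-04-05 00:00:00.000"), -64.0, 4.0)

# One interpolated position per daily frame between the two observations.
frames = []
when = first[0]
while when <= second[0]:
    frames.append((when, interpolate(first, second, when)))
    when += timedelta(days=1)
```

With a four-day gap this gives five frames: the two observed positions plus three interpolated ones in between, so the point moves smoothly instead of jumping.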
2. Configuring TimeManager
To set up the animation, go to the TimeManager panel and click Settings | Add Layer. In the following dialog we can specify the time and ID attributes which we identified in the previous step. We also enable linear interpolation. The interpolation option will create an additional point layer in the QGIS project, which contains the interpolated positions.
When using the interpolation option, please note that it currently only works if the point layer is styled with a Single symbol renderer. If a different renderer is configured, it will fail to create the interpolation layer.
Once the layer is configured, the minimum and maximum timestamps will be displayed in the TimeManager dock right below the time slider. For this dataset, it makes sense to set the Time frame size, that is the time between animation frames, to one day, so we will see one frame per day:
Now you can test the animation by pressing the TimeManager’s play button. Feel free to add more data, such as background maps or other layers, to your project. Besides exploring the animated data in QGIS, you can also create a video to share your results.
3. Creating a video
To export the animation, click the Export video button. If you are using Linux, you can export videos directly from QGIS. On Windows, you first need to export the animation frames as individual pictures, which you can then convert to a video (for example using the free Windows Movie Maker application).
These are the basic steps to set up an animation for migration data. There are many potential extensions to this animation, including adding permanent traces of past movements. While this approach serves us well for visualizing bird migration routes, it is easy to imagine that other movement data would require different interpolation approaches. Vehicle data, for example, would profit from network-constrained interpolation between observed positions.
If you find the TimeManager plugin useful, please consider supporting its development or getting involved. Many features, such as interpolation, are weekend projects that are still in a proof-of-concept stage. In addition, we have the huge upcoming challenge of migrating the plugin to Python 3 and Qt5 to support QGIS3 ahead of us. Happy QGISing!