YK: Chapter 18: Displaying data
18 Displaying data
This chapter covers the ways in which data can be displayed and visualized within MediaWiki. With the exception of the Maps and DynamicPageList extensions, all the options in this chapter depend on Semantic MediaWiki. There are non-SMW-based data display tools for MediaWiki (a number of calendar extensions, for instance), but these are almost always hacks to some degree, since they don’t rely on data in any kind of standardized format. If you want calendars, charts and maps on your wiki, the Semantic MediaWiki approach is the most flexible, and the easiest one to maintain.
Semantic Result Formats
Semantic Result Formats (SRF) is an extension that contains a wide variety of result formats, or query formats, for Semantic MediaWiki. Essentially all of the important visualizations for SMW data, other than mapping, are contained in Semantic Result Formats. You can read more about it here:
https://www.mediawiki.org/wiki/Extension:Semantic_Result_Formats
The next few sections will cover formats defined by this extension. However, this book does not provide a comprehensive reference to all the parameters and customizations that each format can take (which in some cases is a lot). For that, there are two main options: use the page Special:Ask on your wiki (see here), assuming you have a wiki and SRF is installed on it, to see all the allowed parameters for each format and try them out on real data; or read the documentation on each format.
Not every format in SRF is enabled by default when it’s installed; most are, but five (’googlebar’, ’googlepie’, ’exhibit’, ’filtered’ and ’excel’) are not, for various reasons. To enable any of these formats, you just need to add it to the $srfgFormats global array. For example, to enable the ’googlebar’ format, add the following to LocalSettings.php, after the inclusion of SRF:
$srfgFormats[] = 'googlebar';
Calendars
Figure 18.1 Calendar created with the ’calendar’ format
After maps, calendars may be the second-most-common form of data visualization. The ’calendar’ format for SMW queries, defined in Semantic Result Formats, lets you display date-based data in a monthly calendar. Here’s an example of a call to display a calendar:
{{#ask:
|? Has date
|format=calendar
}}
This will display a calendar that could look like the one in Figure 18.1.
The navigation at the top lets the user move to other months; the current month being displayed is set in the URL’s query string.
There’s an inefficiency at the heart of the Calendar format: unlike other formats, it doesn’t stop querying after it gets a certain number of results. Instead, it goes through every result, in order to get all the ones that apply to the current month. If there are thousands of dates, or more, this could potentially slow down the system. Unfortunately, the Calendar format can’t query on only the dates within the specified month, because of how SMW works: all the querying is done before the format code kicks in. But there’s a workaround: just change the query to look like the following:
{{#ask:
[[Has date::>{{#calendarstartdate:}}]] [[Has date::<{{#calendarenddate:}}]]
|? Has date
|format=calendar
}}
#calendarstartdate and #calendarenddate are parser functions, defined by the Calendar format, that simply display the first and last dates of the month that the user is currently looking at. This can dramatically reduce the amount of querying work that needs to be done. (Note the colon contained in both calls; this is necessary so that the parser function will actually be called.)
You can display additional data for each calendar entry, just by adding more printouts to the query (like “?Has attendees”). And you can display all of that data via a template, using the “template=” parameter (see here).
And you can set the color that each entry is displayed with, using the parameter “color=”. This isn’t very interesting for a simple query, where every event would be displayed as the same color, but it can become quite useful when calendars are displayed using Semantic Compound Queries (see here).
There are more customizations possible for the calendar; you can see them all here:
https://semantic-mediawiki.org/wiki/Help:Calendar_format
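As a rough sketch of how the options described in this section can be combined, the following query restricts results to the displayed month, adds a second printout, and passes each entry through a template. The template name “Calendar entry” is a hypothetical placeholder, not something defined by SRF, and the color value is only an example.

```wikitext
<!-- "Calendar entry" is a hypothetical template name; it must exist on the wiki. -->
<!-- The color value is an arbitrary example. -->
{{#ask:
[[Has date::>{{#calendarstartdate:}}]] [[Has date::<{{#calendarenddate:}}]]
|? Has date
|? Has attendees
|format=calendar
|template=Calendar entry
|color=blue
}}
```

With a single query like this one, every entry gets the same color; the parameter matters more once several queries share one calendar.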
Charts and graphs
There are various formats defined in SRF that do a chart- or graph-style display:
jqplotchart: supports a number of different standard charting options (bar, pie, line and donut) using the jqPlot JavaScript library.
jqplotseries: similar to jqplotchart, but intended for multiple sets of data; adds additional “bubble” and “scatter” charting options.
googlebar, googlepie: bar and pie charts created via the Google Charts API (behind the scenes, the code sends the numbers to Google, and gets back an image).
d3chart: allows for more unconventional chart types (treemap, bubble, etc.) using the D3 JavaScript library.
dygraphs: meant for very large data sets. It gets its main data from a CSV file, but allows for adding annotations to specific points using semantic properties. It uses the dygraphs library.
sparkline: creates “sparklines”, small inline charts meant to just show a general trend, using the jquery.sparkline library.
timeseries: aggregates pages by a date property value, using the flot library.
graph: shows pages and the properties between them as an unordered “graph”, displayed with an image generated by the GraphViz application.
process: like the graph format, but shows a “process graph”, a specific type of graph used in workflow modeling.
Many of these charting and graphing formats are configurable: they can take parameters to set attributes such as the height and width, the chart title, the color scheme used, etc.
Figure 18.2 Various charting formats from the Semantic Result Formats extension, using (rather pointless) sample data
The default usage for formats like the jqPlot-based ones is to display a set of page names and a number for each, using a property of type “Number”. For instance, you could have the following:
{{#ask: |?Has number of employees |format=jqplotbar}}
However, the jqPlot-based formats, and other charting formats, can also display a “distribution” instead, and this is ultimately the more important usage. For the case above, what are the chances that you’ll have a semantic property that holds an exact count of employees in each department? It’s more likely that what you’ll have semantically stored is just pages for the employees themselves, and you’ll have to count them “manually” for each department. Adding “distribution=yes” to the query makes it display a count of the number of pages for each value instead of just listing the pages and their values. So you could instead have a query like:
{{#ask: |?Has department |format=jqplotbar |distribution=yes}}
Though the two queries look very different, they will actually produce the same result, assuming both work.
Mapping
Two MediaWiki extensions, Maps and Semantic Maps, provide the best mechanism for displaying coordinate-based data, and related functionality like geocoding of addresses. These two extensions were created together, and are meant to be used together: Maps can be used by itself, but Semantic Maps requires Maps. They support two mapping services at the moment: Google Maps and OpenLayers. (Another service, Yahoo! Maps, was supported at one time, but was dropped after Yahoo! itself abandoned the service.) More mapping services may eventually get added.
Google Maps probably needs no explanation: it’s the dominant mapping service on the web, and rightly so, since it’s extremely well done. (We are talking here about the Google Maps service, not the Google Maps website located at maps.google.com.)
OpenLayers is interesting, because it’s an open-source, generic framework for displaying any mapping data; in theory, even Google Maps can be displayed within OpenLayers. Usually, though, when OpenLayers is used, it’s used to display mapping data from OpenStreetMap. OpenStreetMap is a very exciting mapping service, which is completely open-source, and is generated collaboratively by users via a wiki interface (not MediaWiki; it uses its own custom software). Google Maps is free to most users, but with exceptions: non-public websites need to pay to use it, as do public websites that make massive use of it (the exact amount of traffic before Google requires payment seems to be in flux at the moment). For those cases, and others, the combination of OpenLayers and OpenStreetMap is an increasingly viable alternative.
This book, by necessity, skips over a lot of smaller details, and that’s certainly true of the sections about the Maps and Semantic Maps extensions: these two extensions have a variety of parameters and administrative settings that won’t be mentioned here. To see a complete overview (and a lot of interesting demos), please go to the extension homepages:
https://www.mediawiki.org/wiki/Extension:Maps
https://www.mediawiki.org/wiki/Extension:Semantic_Maps
The Maps extension
Maps allows for the display of maps, with or without points on them, in various, non-semantic ways. Maps defines six main functions, each of which is available as both a parser function and a tag function. One does the actual display of maps: display_map. The other five are utility functions, which do a variety of tasks related to coordinates: coordinates, distance, finddestination, geocode, and geodistance.
Let’s go over display_map first. Like all the others, it can be called as either #display_map or <display_map>. It can take the following parameters:
points=, or an unnamed parameter: the set of points to be displayed, separated by semicolons, taking the form of either addresses or coordinates. There is an entire syntax for these points (see below).
center= (or centre=): sets the center of the map. This can take in either coordinates or an actual address.
service=: sets the mapping service to be used; the current possible options are googlemaps, openlayers and osm (for the combination of OpenLayers and OpenStreetMap).
geoservice=: sets the geocoding service to be used, if any geocoding is to be done; the current possible options are geonames and google.
width=: sets the width of the map; can be in “px”, “em” or a percentage (if no unit is specified, pixels are assumed). The default is “100%”.
height=: sets the height of the map; works like the width parameter. The default is 350 pixels.
zoom=: sets the zoom level of the map. This is an integer, whose range of settings depends on the mapping service being used.
icon=: an image to show for all markers (this, like most parameters, is optional).
lines=: sets the group of points between which to draw lines, with groups separated by semicolons, and then points within groups separated by colons.
polygons=: same syntax as the “lines” parameter, but sets filled-in polygons.
The syntax for defining the parameter for the points, which can either be unnamed or named as points=, depends on whether “display_map” is called as a parser function or a tag. In both cases, if you simply want to display one point (the most common usage), and don’t want any specific formatting, this value can just be coordinates (like “43.9° N, 103.5° W”), or an address or descriptor (like “Eiffel Tower”). If you want to show more than one point, or have extra formatting, the full syntax is as follows:
For #display_map:
point 1~title 1~caption 1~marker 1;point 2~title 2~caption 2~marker 2
For <display_map>:
point 1|title 1|caption 1|marker 1
point 2|title 2|caption 2|marker 2
Here’s an example of a call to <display_map>:
<display_map zoom=11 width=475 height=350>
Nairobi National Park|Nairobi National Park|Kenya’s first national park.|Green marker.png
</display_map>
And the equivalent call to #display_map would be:
{{#display_map:Nairobi National Park~Nairobi National Park~Kenya’s first national park.~Green marker.png |zoom=11 |width=475 |height=350}}
Either call would produce the output seen in Figure 18.3. This image shows the map after the marker has been clicked on and then hovered over; the “title” field serves as both the title of the caption and the hover text.
Figure 18.3 Map created using #display_map, from the Maps extension
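For more than one point, the same “~” and “;” separators apply. The following is a minimal sketch based on the syntax described above; the second park and its caption are illustrative additions, and the zoom level is a guess that may need adjusting for the chosen mapping service.

```wikitext
<!-- Two points separated by a semicolon; each point is "location~title~caption~marker". -->
<!-- The second point and the zoom value are illustrative assumptions. -->
{{#display_map:Nairobi National Park~Nairobi National Park~Kenya’s first national park.~Green marker.png;Mount Kenya National Park~Mount Kenya National Park~A second example point.~Green marker.png |service=openlayers |zoom=7 |width=475 |height=350}}
```

Because both points are given as place names rather than coordinates, a call like this depends on geocoding being available on the wiki.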
In addition, Maps defines the following functions (all available as both tags and parser functions), which are helpful when dealing with geographical data:
coordinates: converts coordinates from one format to another (such as converting from degrees-minutes-seconds to a decimal number)
distance: converts a distance from one unit to another (such as miles to kilometers)
finddestination: displays the coordinates of a location, given an initial location, a bearing (an angle) and a distance
geocode: displays the coordinates of a location, given an address or place name
geodistance: displays the distance between two coordinates or place names
Semantic Maps
The Semantic Maps extension requires the extensions Semantic MediaWiki and Maps (it also requires Validator, which is also required by those other extensions). It defines a new SMW property type, “Geographic coordinate”, that is used for coordinates. It provides a way to display coordinate data from multiple pages on one map, through various query formats. And it also lets you use a map to enter coordinates in a Semantic Forms-based form.
Here’s an example of a simple query that displays a map:
{{#ask: [[Category:Parks]] |?Has coordinates |format=googlemaps}}
In order for this query to work, one or more pages in the category “Parks” need to have the property “Has coordinates” set for them, with valid coordinates; and the property “Has coordinates” needs to be defined as having type “Geographic coordinate”.
Queries with Semantic Maps can take in most of the same additional parameters that the display_map function from Maps can. Its set of allowed parameters is: center/centre, width, height, zoom, geoservice, template, icon, forceshow and showtitle. The last four parameters are new to Semantic Maps. The “template” parameter lets you set the layout of the caption for each point; see here for how it can be used. The “icon” parameter sets the icon/marker image to be used for the points. The “forceshow” parameter shows a map even if there are no results to display; by default it’s set to “yes”. And the “showtitle” parameter dictates whether page titles are shown for each point on the map; by default it’s set to “yes” as well.
There are various other parameters that are specific to each mapping format.
Here’s an example of a more complicated query:
{{#ask:
| ?Has coordinates
| ?Has location type
| ?Located in
| format=map
| template=Location Popup
| showtitle=no
| icon=Blue_marker.png
}}
This query uses a template, “Location Popup”, that formats the set of values for each point. There are three values for each point, but the page title is not one of them, because “showtitle” is set to “no”. And every point is shown with a custom icon: the uploaded image “Blue marker.png”.
The query formats that Semantic Maps supports are “map”, “googlemaps”, “openlayers” and “kml”. The first three are hopefully self-explanatory; they are identical to the “service” parameter values for the Maps extension. The last one, “kml”, exports the data in KML (Keyhole Markup Language), a standard file format for coordinate data.
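As a small illustration, switching the earlier simple map query to KML output should only require changing the format. This sketch assumes the same “Parks” category and “Has coordinates” property discussed above.

```wikitext
<!-- Same query as the simple map example, but exported as KML instead of displayed. -->
<!-- Assumes the "Parks" category and "Has coordinates" property from that example. -->
{{#ask: [[Category:Parks]] |?Has coordinates |format=kml}}
```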
Map form inputs
The other very useful feature of Semantic Maps is that it provides form inputs for map-based entry of coordinates within Semantic Forms forms. Since it’s usually unlikely that users will know the geographical coordinates of a place, a map lets users easily find the point, and automatically determine the coordinates.
Figure 18.4 shows how a map form input looks. Here, the user has entered “causeway bay, hong kong” into the address lookup field, and clicked on “Look up coordinates”. The map now shows a location on the map, and the coordinates field (which is the field whose value actually ends up on the resulting page) is set to match that location. Assuming the real location the user is looking for is somewhere around there, the user now has three options to try to pinpoint it: zoom in and click on the exact point on the map (the standard, and easiest, option); enter a more specific address in the address lookup field, and look it up again; or manually set the coordinates in the coordinates field (which would be an unlikely option).
Figure 18.4 Semantic Maps form input for Semantic Forms, using Google Maps
It’s worth noting that the address lookup field, below the coordinates, exists only for lookup purposes, and does not show up on the page that’s generated. So if you have a form where both the address and the coordinates are meant to be entered, the user may end up typing in the address twice.
The SF form input types that Semantic Maps defines are “map” (which displays whatever the default mapping service is), “googlemaps” and “openlayers”.
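In a form definition, a map input is selected with the usual Semantic Forms “input type” parameter. The following is a minimal sketch; the field name “Coordinates” is a hypothetical example, and would need to match a field in the template that the form populates.

```wikitext
<!-- "Coordinates" is a hypothetical field name in the form's template. -->
{{{field|Coordinates|input type=googlemaps}}}
```

Replacing “googlemaps” with “map” would, per the description above, use whatever the default mapping service is.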
Other visualizations and displays
There are various other visualizations provided by Semantic Result Formats. Here are the relevant format names, and a description of each:
timeline: displays the values for a “Date” property of the queried set of pages on a scrollable, JavaScript-based horizontal timeline.
eventline: similar to the “timeline” format, but displays multiple points if any page has multiple values for the Date property.
gallery: displays a gallery of images, similar to the display of MediaWiki’s <gallery> tag.
outline: splits up the set of queried pages by a set of properties that they have, grouping them so that the result resembles an outline.
array: has a display similar to the “list” format defined in Semantic MediaWiki, but allows for more customization of the separators used between page names, and between values.
hash: similar to the “array” format, but outputs display in a format that can be directly used by the MediaWiki extension HashTables (https://www.mediawiki.org/wiki/Extension:HashTables).
valuerank: similar to “tagcloud”, but simply shows the number of instances of each value, instead of changing the font size.
There are a variety of other result formats defined in Semantic Result Formats. Here is the current list, with descriptions:
eventcalendar: similar to the “calendar” format, but uses the FullCalendar JavaScript library to display a calendar, instead of HTML, which can make for a slicker user experience.
listwidget: splits up results into pages, with a clickable table of contents at the top; the most common use is to split up results alphabetically, by the first letter of their name.
pagewidget: intended to be a generic format for breaking up results into pages of display. It currently only supports one interface: “carousel”, which lets users use buttons to scroll back and forth through the different pages.
incoming: shows the incoming properties pointing to each page in the results.
Semantic Compound Queries
One thing the
Figure 18.5 Map created using Semantic Compound Queries and the ’googlemaps’ format
The first thing you would need to do is upload images for the icons you want to display. Let’s say you upload images for the two icons shown in Figure 18.5, and name them "Red cross.png" and "Dollar sign.png".
The call to
The syntax for
The two geographical-coordinates properties being displayed in the query are both called “Has coordinates” here, but they could theoretically have different names.
What if we want to add to our map a third set of points, for all businesses that aren’t banks, each point represented with a picture of a building? Thankfully,
;?Has date ;color=blue
;?Has deadline ;color=orange
Figure 18.6 shows what such a query could generate.
Figure 18.6 A calendar created using Semantic Compound Queries and the ’calendar’ format
Compound queries can also be used for more basic formats like tables and lists, though in practice that’s rarely done.
You can read more about Semantic Compound Queries here:
https://www.mediawiki.org/wiki/Extension:Semantic_Compound_Queries
DynamicPageList
https://www.mediawiki.org/wiki/Extension:DynamicPageList_(Wikimedia)
https://www.mediawiki.org/wiki/Extension:DynamicPageList_(third-party)
The two are fairly similar in their functionality. The Wikimedia version is the safer choice. For wikis where performance is not paramount, the “third-party” version is the more powerful, and seemingly more popular, choice.
Most notably, DPL is used to find intersections of categories, as well as to display all the pages in one category. For instance, to show a sorted list of pages about 18th-century composers, you might have the following:
<DynamicPageList>
category = Composers
category = 18th century births
order = ascending
</DynamicPageList>