YK: Chapter 18: Displaying data

18 Displaying data

This chapter covers the ways in which data can be displayed and visualized within MediaWiki. With the exception of the Maps and DynamicPageList extensions, all the options in this chapter depend on Semantic MediaWiki. There are non-SMW-based data display tools for MediaWiki (a number of calendar extensions, for instance), but these are almost always hacks to some degree, since they don’t rely on data in any kind of standardized format. If you want calendars, charts and maps on your wiki, the Semantic MediaWiki approach is the most flexible, and the easiest one to maintain.

Semantic Result Formats

Semantic Result Formats (SRF) is an extension that contains a wide variety of result formats, or query formats, for Semantic MediaWiki. Essentially all of the important visualizations for SMW data, other than mapping, are contained in Semantic Result Formats. You can read more about it here:

https://www.mediawiki.org/wiki/Extension:Semantic_Result_Formats

The next few sections will cover formats defined by this extension. However, this book does not provide a comprehensive reference to all the parameters and customizations that each format can take (in some cases there are many). For that, there are two main options: use the page Special:Ask on your wiki (see here), assuming you have a wiki with SRF installed on it, to see all the allowed parameters for each format and try them out on real data; or read the documentation on each format.

Not every format in SRF is enabled by default when it’s installed; most are, but five (’googlebar’, ’googlepie’, ’exhibit’, ’filtered’ and ’excel’) are not, for various reasons. To enable any of these formats, you just need to add it to the $srfgFormats global array. For example, to enable the ’googlebar’ format, add the following to LocalSettings.php, after the inclusion of SRF:

$srfgFormats[] = 'googlebar';

Calendars

Figure 18.1 Calendar created with the ’calendar’ format

After maps, calendars may be the second-most-common form of data visualization. The ’calendar’ format for SMW queries, defined in Semantic Result Formats, lets you display date-based data in a monthly calendar. Here’s an example of a call to display a calendar:

{{#ask:

|? Has date

|format=calendar

}}

This will display a calendar that could look like the one in Figure 18.1.

The navigation at the top lets the user move to other months; the current month being displayed is set in the URL’s query string.

There’s an inefficiency at the heart of the ’calendar’ format: unlike other formats, it doesn’t stop querying after it gets a certain number of results. Instead, it goes through every result, in order to get all the ones that apply to the current month. If there are thousands of dates, or more, this could potentially slow down the system. Unfortunately, the ’calendar’ format can’t query on only the dates within the specified month, because of how SMW works: all the querying is done before the format code kicks in. But there’s a workaround: just change the query to look like the following:

{{#ask:

[[Has date::>{{#calendarstartdate:}}]] [[Has date::<{{#calendarenddate:}}]]

|? Has date

|format=calendar

}}

#calendarstartdate and #calendarenddate are parser functions, defined by the ’calendar’ format, that simply display the first and last dates of the month that the user is currently looking at. Restricting the query to that range can dramatically reduce the amount of querying work that needs to be done. (Note the colon contained in both calls; it is necessary so that the parser function will actually be called.)

You can display additional data for each calendar entry, just by adding more printouts to the query (like “?Has attendees”). And you can display all of that data via a template, using the “template=” parameter (see here).

And you can set the color that each entry is displayed with, using the parameter “color=”. This isn’t very interesting for a simple query, where every event would be displayed as the same color, but it can become quite useful when calendars are displayed using Semantic Compound Queries (see here).
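As a rough sketch of how these options might combine, the query below adds a second printout and a color to a calendar. The condition [[Has date::+]] (any page that has a date value), the property “Has attendees” and the color value are illustrative assumptions, not part of the examples above:

```wikitext
<!-- Sketch only: the condition, "Has attendees" and the color value are illustrative. -->
{{#ask:
[[Has date::+]]
|? Has date
|? Has attendees
|format=calendar
|color=green
}}
```

With a single query like this, every entry gets the same color; the parameter becomes more useful once several queries are combined on one calendar.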

There are more customizations possible for the calendar; you can see them all here:

https://semantic-mediawiki.org/wiki/Help:Calendar_format

Charts and graphs

There are various formats defined in SRF that do a chart- or graph-style display:

jqplotchart - supports a number of different standard charting options (bar, pie, line and donut) using the jqPlot JavaScript library.

jqplotseries - similar to jqplotchart, but intended for multiple sets of data; adds additional “bubble” and “scatter” charting options.

googlebar, googlepie - bar and pie charts created via the Google Charts API (behind the scenes, the code sends the numbers to Google, and gets back an image).

d3chart - allows for more unconventional chart types (treemap, bubble, etc.) using the D3 JavaScript library.

dygraphs - meant for very large data sets. It gets its main data from a CSV file, but allows for adding annotations to specific points using semantic properties. It uses the dygraphs library.

sparkline - creates “sparklines”, small inline charts meant to just show a general trend, using the jquery.sparkline library.

timeseries - aggregates pages by a date property value, using the flot library.

graph - shows pages and the properties between them as an unordered “graph”, displayed with an image generated by the GraphViz application.

process - like the graph format, but shows a “process graph”, a specific type of graph used in workflow modeling.

Many of these charting and graphing formats are configurable: they can take parameters to set attributes such as the height and width, the chart title, the color scheme used, etc.

Figure 18.2 Various charting formats from the Semantic Result Formats extension, using (rather pointless) sample data

The default usage for formats like the jqPlot-based ones is to display a set of page names and a number for each, using a property of type “Number”. For instance, you could have the following:

{{#ask: |?Has number of employees |format=jqplotbar}}

However, the jqPlot-based formats, and other charting formats, can also display a “distribution” instead, and this is ultimately the more important usage. For the case above, what are the chances that you’ll have a semantic property that holds an exact count of employees in each department? It’s more likely that what you’ll have semantically stored is just pages for the employees themselves, and you’ll have to count them “manually” for each department. Adding “distribution=yes” to the query makes it display a count of the number of pages for each value, instead of just listing the pages and their values. So you could instead have a query like:

{{#ask: |?Has department |format=jqplotbar |distribution=yes}}

Though the two queries look very different, they will actually produce the same result, assuming both work.
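As a hedged sketch, the same distribution could presumably be drawn as a pie chart instead of a bar chart. The condition [[Has department::+]] (any page with a department value) and the “charttype=” parameter are assumptions not taken from the examples above; check Special:Ask on your wiki for the parameters that your version of SRF actually accepts:

```wikitext
<!-- Sketch only: the condition and the charttype parameter are assumptions. -->
{{#ask: [[Has department::+]]
|?Has department
|format=jqplotchart
|charttype=pie
|distribution=yes
}}
```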

Mapping

Two MediaWiki extensions, Maps and Semantic Maps, provide the best mechanism for displaying coordinate-based data, and related functionality like geocoding of addresses. These two extensions were created together, and are meant to be used together: Maps can be used by itself, but Semantic Maps requires Maps. They support two mapping services at the moment: Google Maps and OpenLayers. (Another service, Yahoo! Maps, was supported at one time, but was dropped after Yahoo! itself abandoned the service.) More mapping services may eventually get added.

Google Maps probably needs no explanation: it’s the dominant mapping service on the web, and rightly so, since it’s extremely well done. (We are talking here about the Google Maps service, not the Google Maps website located at maps.google.com.)

OpenLayers is interesting, because it’s an open-source, generic framework for displaying any mapping data; in theory, even Google Maps can be displayed within OpenLayers. Usually, though, when OpenLayers is used, it’s used to display mapping data from OpenStreetMap. OpenStreetMap is a very exciting mapping service, which is completely open-source, and is generated collaboratively by users via a wiki interface (not MediaWiki; it uses its own custom software). Google Maps is free to most users, but with exceptions: non-public websites need to pay to use it, as do public websites that make massive use of it (the exact amount of traffic before Google requires payment seems to be in flux at the moment). For those cases, and others, the combination of OpenLayers and OpenStreetMap is an increasingly viable alternative.

This book, by necessity, skips over a lot of smaller details, and that’s certainly true of the sections about the Maps and Semantic Maps extensions: these two extensions have a variety of parameters and administrative settings that won’t be mentioned here. To see a complete overview (and a lot of interesting demos), please go to the extension homepages:

https://www.mediawiki.org/wiki/Extension:Maps

https://www.mediawiki.org/wiki/Extension:Semantic_Maps

The Maps extension

Maps allows for the display of maps, with or without points on them, in various, non-semantic ways. Maps defines six main functions, each of which is available as both a parser function and a tag function. One does the actual display of maps: display_map. The other five are utility functions, which do a variety of tasks related to coordinates: coordinates, distance, finddestination, geocode, and geodistance.

Let’s go over display_map first. Like all the others, it can be called as either #display_map or <display_map>. It can take the following parameters:

points= , or an unnamed parameter - the set of points to be displayed, separated by semicolons, taking the form of either addresses or coordinates. There is an entire syntax for these points (see below).

center= (or centre=) - sets the center of the map. This can take in either coordinates or an actual address.

service= - sets the mapping service to be used; the current possible options are googlemaps, openlayers and osm (for the combination of OpenLayers and OpenStreetMap).

geoservice= - sets the geocoding service to be used, if any geocoding is to be done; the current possible options are geonames and google.

width= - sets the width of the map; can be in “px”, “em” or a percentage (if no unit is specified, pixels are assumed). The default is “100%”.

height= - sets the height of the map; works like the width parameter. The default is 350 pixels.

zoom= - sets the zoom level of the map. This is an integer, whose range of settings depends on the mapping service being used.

icon= - an image to show for all markers (this, like most parameters, is optional).

lines= - sets the group of points between which to draw lines, with groups separated by semicolons, and then points within groups separated by colons.

polygons= - same syntax as the “lines” parameter, but sets filled-in polygons.

The syntax for defining the parameter for the points, which can either be unnamed or named as points=, depends on whether “display_map” is called as a parser function or a tag. In both cases, if you simply want to display one point (the most common usage), and don’t want any specific formatting, this value can just be coordinates (like “43.9° N, 103.5° W”), or an address or descriptor (like “Eiffel Tower”). If you want to show more than one point, or have extra formatting, the full syntax is as follows:

For #display_map:

point 1~title 1~caption 1~marker 1;point 2~title 2~caption 2~marker 2

For <display_map>:

point 1|title 1|caption 1|marker 1

point 2|title 2|caption 2|marker 2

Here’s an example of a call to <display_map>:

<display_map zoom=11 width=475 height=350>

Nairobi National Park|Nairobi National Park|Kenya’s first national park.|Green marker.png

</display_map>

And the equivalent call to #display_map would be:

{{#display_map:Nairobi National Park~Nairobi National Park~Kenya’s first national park.~Green marker.png |zoom=11 |width=475 |height=350}}
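For more than one point, the parser-function syntax shown above extends with semicolons. The following is a sketch only: the places, titles and captions are made up, the marker field is left off on the assumption that trailing fields are optional, and it assumes the wiki’s geocoding service can resolve both place names:

```wikitext
<!-- Sketch only: two illustrative points, separated by a semicolon. -->
{{#display_map:Eiffel Tower~Eiffel Tower~Iron lattice tower in Paris.;Louvre Museum~Louvre~Art museum in Paris. |service=openlayers |width=600 |height=400}}
```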

Either call would produce the output seen in Figure 18.3. This image shows the map after the marker has been clicked on and then hovered over; the “title” field serves as both the title of the caption and the hover text.

Figure 18.3 Map created using #display_map, from the Maps extension

In addition, Maps defines the following functions (all available as both tags and parser functions), which are helpful when dealing with geographical data:

coordinates - converts coordinates from one format to another (such as converting from degrees-minutes-seconds to a decimal number)

distance - converts a distance from one unit to another (such as miles to kilometers)

finddestination - displays the coordinates of a location, given an initial location, a bearing (an angle) and a distance

geocode - displays the coordinates of a location, given an address or place name

geodistance - displays the distance between two coordinates or place names
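The calls below sketch what each of these utility functions might look like as a parser function. The parameter names used here (“format=”, “unit=”, “location=”, “bearing=”, “distance=”) and the positional arguments are assumptions based on the descriptions above, and should be checked against the Maps documentation for the installed version; the place names and values are purely illustrative:

```wikitext
<!-- Sketch only: parameter names are assumptions; check the Maps documentation. -->
{{#coordinates: 48° 51' 29" N, 2° 17' 40" E |format=float}}

{{#distance: 10 miles |unit=km}}

{{#finddestination: location=Brussels |bearing=90 |distance=50 km}}

{{#geocode: Eiffel Tower}}

{{#geodistance: Brussels |Paris}}
```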

Semantic Maps

The Semantic Maps extension requires the extensions Semantic MediaWiki and Maps (it also requires Validator, which is also required by those other extensions). It defines a new SMW property type, “Geographic coordinate”, that is used for coordinates. It provides a way to display coordinate data from multiple pages on one map, through various query formats. And it also lets you use a map to enter coordinates in a Semantic Forms-based form.

Here’s an example of a simple query that displays a map:

{{#ask:[[Category:Parks]] |?Has coordinates |format=googlemaps}}

In order for this query to work, one or more pages in the category “Parks” need to have the property “Has coordinates” set for them, with valid coordinates; and the property “Has coordinates” needs to be defined as having type “Geographic coordinate”.

Queries with Semantic Maps can take in most of the same additional parameters that the display_map function from Maps can. Its set of allowed parameters is: center/centre, width, height, zoom, geoservice, template, icon, forceshow and showtitle. Of these, “template”, “forceshow” and “showtitle” are new to Semantic Maps, while “icon” works as it does for display_map. The “template” parameter lets you set the layout of the caption for each point; see here for how it can be used. The “icon” parameter sets the icon/marker image to be used for the points. The “forceshow” parameter shows a map even if there are no results to display; by default it’s set to “yes”. And the “showtitle” parameter dictates whether page titles are shown for each point on the map; by default it’s set to “yes” as well.

There are various other parameters that are specific to each mapping format.
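As a sketch of how the shared parameters fit into a query, the following sets the map’s size and zoom level, and hides the map when there are no results. The specific values are illustrative, and the category “Parks” and property “Has coordinates” are assumed to exist on the wiki:

```wikitext
<!-- Sketch only: values are illustrative; the category and property are assumed to exist. -->
{{#ask: [[Category:Parks]]
|?Has coordinates
|format=googlemaps
|width=600
|height=400
|zoom=10
|forceshow=no
}}
```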

Here’s an example of a more complicated query:

{{#ask:

| ?Has coordinates

| ?Has location type

| ?Located in

| format=map

| template=Location Popup

| showtitle=no

| icon=Blue_marker.png

}}

This query uses a template, “Location Popup”, that formats the set of values for each point. There are three values for each point, but the page title is not one of them, because “showtitle” is set to “no”. And every point is shown with a custom icon: the uploaded image “Blue marker.png”.

The query formats that Semantic Maps supports are “map”, “googlemaps”, “openlayers” and “kml”. The first three are hopefully self-explanatory: “googlemaps” and “openlayers” match the corresponding “service” parameter values for the Maps extension, while “map” displays whatever the default mapping service is. The last one, “kml”, exports the data in KML (Keyhole Markup Language), a standard file format for coordinate data.

Map form inputs

The other very useful feature of Semantic Maps is that it provides form inputs for map-based entry of coordinates within Semantic Forms forms. Since users are unlikely to know the geographical coordinates of a place, a map lets them easily find the point, and determines the coordinates automatically.

Figure 18.4 shows how a map form input looks. Here, the user has entered “causeway bay, hong kong” into the address lookup field, and clicked on “Look up coordinates”. The map now shows a location, and the coordinates field (which is the field whose value actually ends up on the resulting page) is set to match that location. Assuming the real location the user is looking for is somewhere around there, the user now has three options to try to pinpoint it: zoom in and click on the exact point on the map (the standard, and easiest, option); enter a more specific address in the address lookup field, and look it up again; or manually set the coordinates in the coordinates field (which would be an unlikely option).

Figure 18.4 Semantic Maps form input for Semantic Forms, using Google Maps

It’s worth noting that the address lookup field, below the coordinates, exists only for lookup purposes, and does not show up on the page that’s generated. So if you have a form where both the address and the coordinates are meant to be entered, the user may end up typing in the address twice.

The SF form input types that Semantic Maps defines are “map” (which displays whatever the default mapping service is), “googlemaps” and “openlayers”.
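Within a form definition, such an input is selected in the field tag like any other Semantic Forms input type. The fragment below is a minimal sketch; the template name “Park” and the field name “Coordinates” are hypothetical:

```wikitext
<!-- Sketch only: the template "Park" and field "Coordinates" are made-up names. -->
{{{for template|Park}}}
'''Coordinates:''' {{{field|Coordinates|input type=googlemaps}}}
{{{end template}}}
```

Using “map” instead of “googlemaps” would leave the choice of service to the wiki’s default.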

Other visualizations and displays

There are various other visualizations provided by Semantic Result Formats. Here are the relevant format names, and a description of each:

timeline - displays the values for a “Date” property of the queried set of pages on a scrollable, JavaScript-based horizontal timeline.

eventline - similar to the “timeline” format, but displays multiple points if any page has multiple values for the Date property.

gallery - displays a gallery of images, similar to the display of the