YK: Chapter 17: Semantic Forms

Semantic Forms

Though there are many extensions that make use of Semantic MediaWiki, Semantic Forms is the most widely used. It provides a way to edit both template calls and sections within a page, where the templates are expected to in turn use Semantic MediaWiki to store their values. It thus complements SMW, by providing a structure for SMW’s storage capabilities.

This chapter begins with an explanation of how (and why) SMW and templates are used in conjunction, and then gives an in-depth listing of Semantic Forms’ syntax and features.

A template-based approach to SMW

We covered templates here, and Semantic MediaWiki in the previous chapter. Both are quite useful on their own; but it’s when the two are used together that the full power of both emerges. Templates without a storage system like SMW can provide structure to pages, and a nice standard display, but all that data stored within their fields just goes to waste: you can’t use or display it anywhere outside that page. Meanwhile, Semantic MediaWiki, when used by itself and outside of a structure like templates, is interesting but not very practical.

The naive approach to using SMW tags is to intersperse them among free text, like:

Bob works in [[Has department::Accounting]].

But there are a number of problems with this approach. Most obviously, it requires people to learn and understand a new syntax. The tag syntax is another bit of wikitext that users have to understand, even when they don’t plan to edit anything related to semantic properties. But more importantly, there’s lots of ambiguity about the actual data in question. What if Bob moves to a different department - is it enough to change the department name, or should there also be a property like “Had department”, pointing to the old value? And is there specific naming that should be used for each department? You could have software that provides autocompletion for semantic properties and their values, but it still won’t resolve all of the ambiguity. The main confusion springs from the fact that users can’t inherently know what the correct “data structure” should be for each page - the ideal set of semantic properties, and the expected value or values there should be for each. A template implicitly defines these things. Without a template, there is no easy way to define, or to clarify to users, which properties should be used and which shouldn’t. But a template serves as both the definition and the container for a data structure.

There’s another benefit to using templates: they can also set the relevant category or categories for a page. For MediaWiki installations that don’t use Semantic MediaWiki, categories can end up getting used for a large variety of purposes (see here). In SMW, the number of categories tends to be much smaller, but categories are still used to define a page’s type: whether a page represents a person, department, movie, fish, etc. A template can add such a category tag automatically, so that users don’t have to add it separately.

In short, Semantic MediaWiki provides meaning to templates, while templates provide structure to Semantic MediaWiki; it’s a combination that works very well together.

Let’s see how it works in practice. We’ll start with a simple example: a template that defines data for a page about an employee. We want this template to hold, for every employee, their phone number, email address and current position. Let’s say the template is called "Employee". A call to this template could look like:

{{Employee
|Phone number=
|Email address=
|Position=
}}

(The employee’s name is not included in the template because it will be the name of the page.) Here’s how the relevant part of the template definition would look:

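{| class="wikitable"
! Phone number
| [[Has phone number::{{{Phone number|}}}]]
|-
! Email address
| [[Has email address::{{{Email address|}}}]]
|-
! Position
| [[Has position::{{{Position|}}}]]
|}

[[Category:Employees]]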

This code defines an infobox-style table, with one row for each parameter; and it stores each parameter, using a semantic property, at the same time that it displays that parameter. And note also that the template sets a category tag for the page - every page that includes this template gets automatically added to the “Employees” category.

So now, adding the simple template call above to any page, with the relevant data filled out, will display the data, store it semantically for querying elsewhere, and add the page to the right category - all without any extra work needed on the user’s part.
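To make the "display, store and categorize" idea concrete, here is a toy Python model of what one call to the Employee template produces. It is only a sketch: the function name and the tuple format are made up for illustration, and the rule that blank fields store nothing is a simplifying assumption, not a description of how SMW is implemented.

```python
# Field -> property mapping, mirroring the Employee template described above.
FIELD_TO_PROPERTY = {
    "Phone number": "Has phone number",
    "Email address": "Has email address",
    "Position": "Has position",
}


def apply_employee_template(page_name, fields):
    """Return the (page, property, value) triples and the categories
    that a single Employee template call would yield (hypothetical model)."""
    triples = [
        (page_name, FIELD_TO_PROPERTY[name], value)
        for name, value in fields.items()
        # Assumption: unknown fields and blank values store nothing.
        if name in FIELD_TO_PROPERTY and value
    ]
    # The template itself adds the category; the user never types it.
    return triples, ["Employees"]


triples, categories = apply_employee_template(
    "Bob",
    {"Phone number": "555-0100", "Email address": "", "Position": "Accountant"},
)
for triple in triples:
    print(triple)
print(categories)
```

The point of the model is that the user supplies only the field values; the property names and the category both come from the template.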

What if we want to allow a field to hold more than one value? Of course, there’s nothing stopping users from just entering a comma-separated list of values for a template parameter, but then the semantic property won’t be set correctly, as we’ll see in a moment. For such a case, there’s the #arraymap function. #arraymap is actually defined by the Semantic Forms extension, which is covered in the next section. (It’s somewhat of an accident of history that it’s defined in Semantic Forms, since there’s nothing semantic or form-based about #arraymap, but nevertheless that’s where it is.) We’ll get to the full syntax of #arraymap (and its sibling, #arraymaptemplate) later in this chapter - for now, let’s just look at an example of how it’s used.

Say you want to add to the Employee template a new parameter, “Previous positions”, that holds a comma-separated list of all the positions the person previously held at the company. In other words, you want a template call to look like:

{{Employee
...
|Previous positions=Junior accountant, Accountant
}}

In the template, you could of course just attach a semantic property like “Has previous positions” to the “Previous positions” field - but it’s much preferable to separate out the values in the list, so that each is its own property value. (Someone at some point might want to query on anyone who has had the previous position “Accountant”, for example.) To do that, we use the #arraymap function. Here is how the relevant lines of the template would look:

! Previous positions
| {{#arraymap:{{{Previous positions|}}}|,|x|[[Has previous position::x]]}}

#arraymap splits up the value by the specified delimiter (in this case, a comma), and applies the same "mapping" (in this case, assigning a semantic property) to each resulting element.
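The split-and-map behavior can be sketched in a few lines of Python. This is a rough stand-in and not the extension's actual code: the function name and signature are made up, and the trimming of whitespace, the skipping of empty elements and the ", " output separator are assumptions of the sketch.

```python
def arraymap(value, delimiter, var, formula, new_delimiter=", "):
    """Rough stand-in for #arraymap: split, trim, substitute, rejoin."""
    parts = [part.strip() for part in value.split(delimiter)]
    # Plain string substitution of the variable into the formula;
    # empty elements are skipped so a blank parameter yields no output.
    mapped = [formula.replace(var, part) for part in parts if part]
    return new_delimiter.join(mapped)


result = arraymap(
    "Junior accountant, Accountant",
    ",",
    "x",
    "[[Has previous position::x]]",
)
print(result)
# [[Has previous position::Junior accountant]], [[Has previous position::Accountant]]
```

Because the sketch substitutes the variable by plain string replacement, a variable like "x" is only safe when that string appears nowhere else in the formula.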

Now, there may be more information you want displayed about each of those previous positions - the start and end date for each, for instance. If you want to store this kind of compound information semantically, you’ll need to use special syntax (see here). But in terms of simply placing these elements on the page, the template system can handle it quite well. To have such compound data, you just need to create a template that holds a single "row" of information, and then have repeated instances of it on the page. In this case, a good solution would be to have a template called "Position". It would eliminate the need for both the "Position" and "Previous positions" fields in the "Employee" template, and it could be called like the following:

{{Position
...
}}

Each employee page would then have an instance of this template for every position that the employee has had, past or present.

There are two standard ways in which calls to such templates can be added to a page. The first is to place the calls to the template after the main template, in this way:

{{Employee
...
}}
{{Position
...
}}
{{Position
...
}}
...

And the second is to place calls to the multiple-instance template within the call to the main template, like this:

{{Employee
...
|Positions={{Position ... }}{{Position ... }}
...
}}

We’ll see later in this chapter how Semantic Forms lets you create and edit pages with either of these structures.
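For a side-by-side view of the two page structures, here is a small Python sketch that builds the wikitext for each from the same data. It is only an illustration: the helper function is made up, and the "Title" field of the Position template is a hypothetical parameter name, since the text above does not spell out that template's fields.

```python
def template_call(name, fields):
    """Render a template call as wikitext, one parameter per line."""
    lines = ["{{" + name]
    lines += [f"|{key}={value}" for key, value in fields.items()]
    lines.append("}}")
    return "\n".join(lines)


employee = {"Phone number": "555-0100"}
# "Title" is a hypothetical field name used only for this sketch.
positions = [{"Title": "Junior accountant"}, {"Title": "Accountant"}]
position_calls = [template_call("Position", p) for p in positions]

# Structure 1: the Position calls follow the main template call.
separate = "\n".join([template_call("Employee", employee)] + position_calls)

# Structure 2: the Position calls sit inside a parameter of the main call.
embedded = template_call(
    "Employee", {**employee, "Positions": "".join(position_calls)}
)

print(separate)
print()
print(embedded)
```

Either way the page holds the same data; the difference is only whether the main template gets to control where and how the rows are displayed.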

In theory, you could keep going down multiple “levels” of data - if you have embedded template calls, they could themselves have template calls embedded within them, etc. For instance, for each position you could then store the set of bosses the employee worked for at the time, with a start and end date for each. This is not recommended, though - one level of additional data is usually all that works in practice, mostly because Semantic Forms doesn’t support having more than one level of embedding, but also because that’s just too much complexity for most users to deal with.

There’s one exception to that, though: if you just want to have a list of values for a field in one of those multiple-instance templates - for instance, if you just want to have a list of bosses for every position - that’s possible. Again, see here for a full explanation of handling compound data.

Introduction to Semantic Forms

So far we’ve covered the usage of Semantic MediaWiki and templates together, and why they’re such an important combination. Semantic Forms fills in the missing piece of the puzzle: an easy way to create and edit such pages. It lets you define forms for all your wiki’s different page structures, to let users then easily create and edit pages that contain template calls. The beauty of Semantic Forms is that users don’t have to know the syntax for template calls, and they don’t have to know the names of the templates to use, or the names of their fields. Really, they don’t even have to know that what they’re editing is a wiki page. All they have to do is fill out a form.

Semantic Forms, or SF as it’s abbreviated, makes use of the data structure defined via SMW to display more intelligent forms. The form input that is displayed for each template field is by default based on the SMW property, if one exists, that corresponds to that field. So, for instance, if a certain template field is stored using a property of type Date, then SF displays a date input to edit that field. And if the property is an enumeration, i.e. it has a pre-defined set of allowed values using “Allows value”, then the form displays a dropdown for that field, with that set of values. The form can be set to override these default input type options, but at its core, SF uses SMW to understand and thus enforce the desired data structure for the wiki.
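The selection logic just described can be summarized in a short Python sketch. This is a hypothetical model, not SF's code: the dictionary format, the function name and the fallback to a plain text input are assumptions of the sketch, and only the Date and "Allows value" rules come from the description above.

```python
def default_input_type(prop, override=None):
    """Pick a form input for a template field from its property (sketch)."""
    if override:
        # The form definition can always override the default.
        return override
    if prop.get("allowed_values"):
        # Enumeration: the property has "Allows value" entries.
        return "dropdown"
    if prop.get("type") == "Date":
        return "date"
    # Assumption: anything else falls back to a plain text input.
    return "text"


start_date = {"type": "Date"}
status = {"type": "Text", "allowed_values": ["Active", "On hold", "Done"]}

print(default_input_type(start_date))
print(default_input_type(status))
print(default_input_type(status, override="radiobutton"))
```

The design point is that the data structure lives in one place, the property definitions, and the forms follow it unless told otherwise.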

On a deeper level, SF and SMW complement each other in an important way, which is that together they can be used to simulate a standard database-backed website of the kind one sees all over the web. We’re used to seeing data in terms of a structured set of fields, used to editing such data with forms, and used to then seeing that data be aggregated and displayed in various ways. That’s true for product-review sites, self-publishing sites like Flickr, YouTube or blogs, or any of a wide variety of content-management systems that people use for their jobs. SF and SMW together let you mimic that kind of interface. The user, importantly, just sees the interface: they don’t have to think about the wikitext or semantic markup, and are freed to think about the important stuff, which is the data itself. There are in fact many cases where users of an SMW/SF-backed system aren’t aware that the system they’re dealing with is semantic, or even a wiki.

This brings up an obvious question: if it’s a great success for a wiki to mimic non-wiki software, why not just use that other software in the first place? The answer is that using a wiki has a number of big advantages. The primary one is that the system keeps a version history of every change - which means that you don’t have to fear random users going in and modifying your data, because any bad changes can always be reverted. That in turn means that suddenly you can open up your data to editing by everyone, which is actually a revolutionary change. In most database-backed systems, tight controls are placed over editing - regular users can usually only edit information that pertains to them, while the editing of general-use information is restricted to a very small group of people who can be trusted to not delete important data, either accidentally or on purpose. And if mistakes happen, they may require a concerted effort to fix - like going through old database backups. On the other hand, with a wiki, having everything editable by everyone is no problem at all - in fact, it’s the default state. So if you’re trying to create a set of general data, you’ve just seen your potential base of editors jump up from a handful of people to, theoretically, billions - or, more realistically, hundreds.

Even if you only want a small group of people to edit the data, though - for example, you run an internal knowledge base for a small team of people within a company - having the data stored in a wiki is helpful. Let’s say that, on a page for a project, there’s a detail that you disagree with. On a wiki, a quick check of the history page would let you see who added it, and when - or whether you in fact added it yourself and then forgot. With a non-wiki system, the only real option is to send out an email to the group and hope that someone remembers, which becomes more difficult the larger the data set, the larger the number of editors, and the longer the system has been in place.

There are two other advantages that an SMW/SF-based system has over non-wiki software. The first is the flexibility of the data structure, which springs from the fact that the data, and the data structure, in the wiki are all stored as text, and not in a relational database. Text is a very flexible medium, and changes to the data structure can be both easily done and easily undone. You can thus open up editing of the pages that define the data structure - forms, templates and the like - to everyone, without the fear of having drastic, irreversible changes made. In a conventional database-backed system, the editing of the data structure, i.e. the set of database tables and their fields, has to be restricted to a very small group of technical experts.

(To be fair, there’s a new wave of “document-oriented database” systems, like MongoDB, also referred to as “NoSQL” systems, that offer this same advantage of flexibility, though without the built-in interface tools.)

Getting started with Semantic Forms

The rest of this chapter provides a breakdown of the syntax and workflows you can use with Semantic Forms. But if you’re just getting started, the best approach is to use Semantic Forms’ helper pages to quickly create pages. There are essentially five good options:

Use the page Special:CreateClass to create everything at once - categories, properties, templates and forms.

Use the individual pages Special:CreateProperty, Special:CreateCategory, Special:CreateTemplate and Special:CreateForm to create the entire data structure. This is a more hands-on approach, which is less ideal for starting out, but it’s quite useful if you already have templates and categories in place (and possibly properties as well), and only want to create what is still missing.

Similar to the previous option, you can go to any specific uncreated property, category, template or form page, and click on the “create with form” tab, which displays a form that matches the form found in the relevant one of those four special pages.

Copy from an existing installation or package. If you see a data structure setup that you like elsewhere, you can copy and paste all the necessary files to your wiki. (Though it’s usually a good idea to get the other wiki’s permission first, even if legally it probably isn’t necessary.) And there may be a package of such pages, created with a generic purpose like project management in mind, that you want to copy onto your wiki. At the moment, the company semantic::apps offers such packages, though not for free - see here.

Use the Page Schemas extension. See here for an introduction to this extension; it lets you create a set of “schemas” for your data structure, from which forms, templates etc. can be automatically generated.

First, let’s look at Special:CreateClass. Figure 17.1 shows the interface that appears on that page.

Figure 17.1 Special:CreateClass page

Using this interface, you can define an entire “class” - a data structure to represent a single page type, which is composed of a template, a form, a category, and properties. Not every page type can be defined in this way - some pages will contain more than one standard template, for instance - but in many cases it’s a good starting point. The set of fields at the bottom is used to create the template, the form, and the properties.

Why bother creating the category? Because, in Semantic Forms, the category is where the connection between pages and their forms is defined, so that an “edit with form” tab shows up at the top of each page. This is done via the “Has default form” special property, which we’ll get to later.

Another option is to use the special pages Special:CreateProperty, Special:CreateCategory, Special:CreateTemplate and Special:CreateForm, all defined by Semantic Forms. These have the advantage of granularity - you can create, or regenerate, any specific pages - and they also have the advantage of additional fields that Special:CreateClass doesn’t offer.

For example, in Figure 17.2 you can see part of the helper form at Special:CreateForm - it lets you set all the allowed parameters for each form field, with the group of parameters based on the selected input type. We’ll get to all of these specific parameters in the next section.

Figure 17.2 Special:CreateForm page

By the way, you may find it odd that the pages Special:CreateProperty and Special:CreateTemplate are contained in Semantic Forms, since they have nothing to do with forms (other than the fact that they provide a helper form to generate pages - which is not the same thing). And the same argument could actually be made for Special:Templates, which Semantic Forms also provides. CreateTemplate and Templates are part of Semantic Forms because Semantic Forms is based around templates in a way that almost nothing else is among core MediaWiki and its extensions. For Special:CreateProperty, though, it’s really just an accident of history that it’s part of Semantic Forms and not Semantic MediaWiki, which would be the obvious home for it. It could be that in the future Special:CreateProperty will move to SMW.

Form definitions

Semantic Forms provides a full syntax for defining forms, which makes use of special tags contained within triple curly brackets. Pages that define forms should always go in the "Form:" namespace (or, for non-English-language wikis, its equivalent in another language). Such pages are not called forms, but rather "form-definition pages", to distinguish them from the actual corresponding forms that users see.

Before we define the syntax, here’s an example of the full contents of a form-definition page, for a “Project” form:


This is the 'Project' form.

To add a site with this form, enter its name below;

if a page with that name already exists, you will be sent to a form to edit that page.


{{#forminput:form=Project|autocomplete on category=Projects}}


Already, without getting into any of the specifics of the syntax, you can notice a few things:

The form-definition page serves a dual purpose: within the