Making sense of tagging

By now almost everyone and their dog is familiar with the Web 2.0 meme and its common attributes. One of the more prominent features is tagging: assigning free-text keywords to your photos, bookmarks and everything else.
This has many benefits, as you can generate nice tag clouds or find interesting bookmarks by tag subject.

But there are problems as well, most prominently the fact that my tag word may mean something rather different depending on context.

Over the past years I have been struggling with this problem, especially for tagging my photos. At first I cooked my own solution, based on a modified version of the Exif parser jhead (with added XML output) and a sticky ball of XSL transformation scripts (never published).
Then I switched to iPhoto. Adding tags is a real pain with iPhoto itself, but this problem is solved by the excellent Keyword Assistant. The problem, however, is still in making sense of those keywords. I mean, there must at least be an option to export this metadata together with the image files, for archival (I’m rather sure that the iPhoto 6 format will be forgotten in a mere 15 to 20 years from now).

There appear to be a couple of half-finished projects to export iPhoto metadata to RDF. This looks like a promising route, but for some reason these didn’t gain traction and seem to have been abandoned.

Of course, exporting just the tags does not give a definitive answer to what exactly these tags mean, especially a couple of years from now. Context matters very much: if I tag a photo with a certain keyword, this may well mean something different than the same keyword for, let’s say, a song.

So I conceived a very nice contextual tagging system, all in my head. Working title: TagLib. This would be a service-like application, always sitting in the background (or maybe running remotely as a web service) and waiting for tagging activity. Then, whenever a tag needs to be entered, all kinds of context would be considered. For instance, the kind of subject. When tagging a photo, the tag could be associated with the media type (photo) and time. The time could be compared with events in iCalendar and, if a matching event was found, the photo and event could be coupled. RDF would be the natural choice for the data format, which then naturally extends to related data, e.g. FOAF for people’s names and Dublin Core for lots of other metadata.
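To make the time-matching idea a bit more concrete, here is a minimal Python sketch. It is only an illustration, not an existing TagLib: the names Event, TagContext, find_matching_event and build_context are made up, and the calendar events are plain in-memory objects rather than anything parsed from a real iCalendar file.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """A calendar event (hypothetical stand-in for an iCalendar entry)."""
    summary: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class TagContext:
    """The context handed to the tagging interface (hypothetical)."""
    media_type: str
    taken_at: datetime
    event: Optional[Event]


def find_matching_event(taken_at: datetime, events: List[Event]) -> Optional[Event]:
    """Return the first event whose time span contains the timestamp."""
    for event in events:
        if event.start <= taken_at <= event.end:
            return event
    return None


def build_context(media_type: str, taken_at: datetime, events: List[Event]) -> TagContext:
    """Couple the object being tagged with a matching event, if there is one."""
    return TagContext(media_type, taken_at, find_matching_event(taken_at, events))


# Example data, invented for illustration.
events = [
    Event("Birthday party", datetime(2006, 5, 13, 15, 0), datetime(2006, 5, 13, 22, 0)),
]
ctx = build_context("photo", datetime(2006, 5, 13, 16, 30), events)
print(ctx.event.summary if ctx.event else "no matching event")
```

A real service would presumably store the resulting context as RDF; the sketch keeps it as plain objects to stay self-contained.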

I still think that such a tagging service would make a lot of sense. Especially if it were open and available for the general public to extend, you would get a kick start in assigning meaningful keywords to whatever you want to tag.

The working would be something along these lines:

  • start tagging operation (e.g. right click, context menu)
  • tagging interface invoked with context (object type, time, previous tagging)
  • suggested tags appear with auto-completion, based on context
  • user action: inspect context of suggested tag
  • when satisfied, apply tag
  • otherwise, create a personal “fork” for your context, e.g. by referring to a name in a FOAF file etc.

Example: the first time you enter the tag bush, you would be offered the choice between the president of the USA and a wilderness scene. Or maybe you know someone else by the name bush, and you point the tool to the bush in your address book (facilitated through FOAF or some other mechanism).
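The suggest-then-fork flow from the bush example could look roughly like this in Python. Again, this is a sketch under assumptions: Meaning, TagRegistry and their methods are hypothetical names, the registry lives in memory, and the "personal" entry is a plain description instead of a real FOAF reference.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Meaning:
    """One possible meaning of a tag word (hypothetical structure)."""
    label: str
    description: str
    source: str  # "shared" for public meanings, "personal" for your own fork


@dataclass
class TagRegistry:
    """In-memory store of tag meanings, keyed by lower-cased label."""
    meanings: Dict[str, List[Meaning]] = field(default_factory=dict)

    def add(self, meaning: Meaning) -> None:
        self.meanings.setdefault(meaning.label.lower(), []).append(meaning)

    def suggest(self, prefix: str) -> List[Meaning]:
        """Auto-completion: all meanings whose label starts with the prefix."""
        prefix = prefix.lower()
        result: List[Meaning] = []
        for label in sorted(self.meanings):
            if label.startswith(prefix):
                result.extend(self.meanings[label])
        return result

    def fork(self, label: str, description: str) -> Meaning:
        """Create a personal meaning when none of the suggestions fits."""
        meaning = Meaning(label, description, "personal")
        self.add(meaning)
        return meaning


registry = TagRegistry()
registry.add(Meaning("bush", "the president of the USA", "shared"))
registry.add(Meaning("bush", "a wilderness scene", "shared"))

before = registry.suggest("bu")  # the two shared meanings
mine = registry.fork("bush", "someone in my address book")
after = registry.suggest("bu")   # now includes the personal fork

for m in after:
    print(f"{m.label}: {m.description} ({m.source})")
```

In the imagined service, the suggestions would additionally be ranked by context (object type, time, previous tagging), which this sketch leaves out.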

This is all a rough concept, stuck at the thought model level. I would have kept it all to myself, if I had not come across an article by Tim Berners-Lee: “Using labels to give semantics to tags”. In short: applying well-defined (semantic) labels to liberally tagged objects, in order to give them presence in the semantic web context. In Tim BL’s words: “The concept of a label as a preset set of data which is applied to things and classes of things provides an intuitive user interface for a operation which should be simple for untrained users.”

Excellent, there’s still a way to go!