video metadata and the pursuit of classification
Sadly enough, the word “metadata” still acts on me like catnip on cats, and when the words “data licensing” put a seal on it, I feel impelled to go and poke my nose into other people’s business. The business in question was that of Re:Transmission, a gathering of independent video producers held in London, and their quest to adopt a common metadata standard for video distribution.
What they’re looking at is straightforward. As long as they can agree on a sensible “least useless thing” model of the properties of video data, it’s hard to see that it matters which format they prefer to use as a carrier. It’s the underlying metadata model that matters, not the specific exchange protocol or carrier format, which can easily be translated if wanted.
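To make the point about carriers concrete, here is a minimal sketch in Python. The record and its field names are my own invention for illustration, not anything Transmission has agreed on, and the licence URL is a placeholder. It shows one small model written out to two carriers, JSON and a flat XML document, and read back unchanged.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical minimal model: every field name here is an assumption,
# not part of any agreed Transmission schema.
record = {
    "title": "Example clip",
    "recorded": "2006-10-14T15:30:00Z",
    "latitude": 51.5074,
    "longitude": -0.1278,
    "duration_seconds": 312,
    "licence": "https://example.org/licence",  # placeholder URL
}


def to_json(rec):
    """Serialise the record to a JSON carrier."""
    return json.dumps(rec, sort_keys=True)


def to_xml(rec):
    """Serialise the same record to a flat XML carrier."""
    root = ET.Element("video")
    for key in sorted(rec):
        child = ET.SubElement(root, key)
        child.text = str(rec[key])
    return ET.tostring(root, encoding="unicode")


def from_xml(text, template):
    """Read the XML carrier back, using a template record for field types."""
    root = ET.fromstring(text)
    return {
        child.tag: type(template[child.tag])(child.text)
        for child in root
    }


# The model survives a trip through either carrier.
via_json = json.loads(to_json(record))
via_xml = from_xml(to_xml(record), record)
```

The only thing the two carriers have to share is the list of properties and their types; everything else is plumbing.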
The Transmission group have a good opportunity to establish clear rules about open data licensing. For a project which is all about re-use and the encouragement of public redistribution and distributed public archives, I hope they take it.
A recent post to the Transmission list by Minna Tarkka points up where I start to feel a little edgy about this discussion:
… we have been immersed in the …onerous, stringent… world of semantic web and listened carefully to the (partly academic) discussions of its possible merits visavis soft ontologies (self-organising maps) and folksonomic camps. we think that there are ways to combine the best of these approaches in creating flexible description schemas for audiovisual content
The overview document has some token discussion of “folksonomies”. The theoretical advocates of “folksonomies” have held them up against a straw-man version of the semantic web, the “wrong trousers” version which imposes rigorous taxonomical hierarchies onto data structures and insists on classification of media according to a domain-specific vocabulary. Metadata is metacrap, they cry, but as far as I’m concerned, folksonomy is folksonocrap as well.
Classification and categorisation are seriously overstressed when it comes to data discovery. We know this from web pages - indexes rely on inference and locality much more than they rely on any intentional, and necessarily unreliable, descriptions of web pages; so much metacrap. But what the semantic web (in a broad sense which can start to include constructs like Atom and GeoRSS) is good at is transmitting recordings of things which are definite, observable and relatively inarguable: what time a piece of media was recorded; the area in which it takes place in space; which people feature in it; meta properties of the media such as bitrate and duration.
For a network like Transmission - specifically directed towards “citizen journalism” and the recording of local events - timing, location and any details of protagonists should provide a lot more value in terms of “discoverability” - not just as an abstraction but also a local, immediate discoverability of events nearby in the world or elsewhere in similar circumstances that might matter to me.
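As a rough illustration of that kind of local, immediate discoverability, here is a small Python sketch that finds clips recorded near a place and close to a moment in time. The `Clip` fields, the 5 km radius and the one-day window are all assumptions of mine, and the distance uses the standard haversine formula on a spherical Earth, which is plenty accurate for "events nearby".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation


@dataclass
class Clip:
    # Hypothetical record: only the observable facts discussed above.
    title: str
    recorded: datetime
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby(clips, lat, lon, when, radius_km=5.0, window=timedelta(days=1)):
    """Return clips recorded within radius_km of a point and within window of a time."""
    return [
        clip for clip in clips
        if distance_km(lat, lon, clip.lat, clip.lon) <= radius_km
        and abs(clip.recorded - when) <= window
    ]
```

No vocabulary, controlled or otherwise, is involved: a query is just a point, a moment and two tolerances.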
I start to view category-based, keyword-based annotation as a big red herring. What about transcriptions and subtitles, or scene descriptions? If there’s an emphasis on making these available - and there seemed to be in the plans being drawn up by the network - how much more benefit to index them than to worry about classification schemes, whether “controlled” or freeform, at all?
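Indexing the words people actually said is not hard, either. The following is a minimal sketch, in Python, of an inverted index over transcripts: a map from each word to the set of clips whose transcript contains it. The clip identifiers and the crude tokeniser are assumptions for illustration; a real index would want stemming, stop words and some ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(transcripts):
    """Map each word to the set of clip ids whose transcript contains it.

    `transcripts` is a dict of hypothetical clip id -> transcript text.
    """
    index = defaultdict(set)
    for clip_id, text in transcripts.items():
        for word in tokenize(text):
            index[word].add(clip_id)
    return index


def search(index, query):
    """Return the clip ids whose transcripts contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Given two transcripts that both mention a town, a search for "town" finds both clips and a search for "town hall" finds only the one where the hall is mentioned, without anyone having had to tag either of them.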