Supertagging the Web

June 29th, 2008 by Greg Boutin

I have a confession to make. I am impatient. So much so that, even though computers made it into my life two decades ago or so, and the web a decade ago or so, I am frustrated every day with the state of computing. And something in particular has been bugging me for quite a while now. For the life of me, I can’t get why we’re still using folders to organize information. Folders force me to compartmentalize information in an exclusive, hierarchical way. Each file or, broadly speaking, piece of content can only be in one folder at a time, unless it’s copied. In fact, folders are just one very constrained type of tag. Very inefficient, very paper-like, very 20th century. Tags are so much better.
And for full disclosure, I’ve been working on an idea for a while. But, as I couldn’t take any talented programmer away from the yet-another mashup or Facebook app they were working on at the time, I am presenting an extended concept here, with three hopes: that it highlights some of the benefits I expect from the semantic web or from future forms of tagging; that it guides entrepreneurs to develop new value propositions based on those early-adopter needs; and lastly, maybe, that a talented programmer realizes the potential and contacts me to assess a partnership.
But wait, I expect you to tell me that there is one problem: tags are messy and time-consuming. Yes, I agree with the general consensus, but I’m not thinking about the type of tags you’ve used so far. What I’m talking about is something I’d like to call Supertags for now, as a blanket name only… the actual name probably wouldn’t involve the tainted word “tag”. So, those would be tags that I could use across the board, be it on Gmail, on Flickr, on Facebook, on del.icio.us, on my blog, and while I’m at it, in my laptop and cellphone applications too. This way, I could tag my emails about this article in Outlook with both “Idea for New venture” and “My blog posts”, instead of choosing between the two.
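To make the contrast with folders concrete, here is a minimal sketch in Python of a many-to-many tag index. Everything in it is an assumption made for illustration: the `TagIndex` class, its method names, and the `service:item` identifiers are hypothetical and do not correspond to any existing service or API.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical many-to-many index between items and tags.

    A folder lets an item live in exactly one place; here an item
    can carry any number of tags, and a tag can span services.
    """

    def __init__(self):
        self._items_by_tag = defaultdict(set)
        self._tags_by_item = defaultdict(set)

    def tag(self, item, *tags):
        """Attach one or more tags to an item."""
        for t in tags:
            self._items_by_tag[t].add(item)
            self._tags_by_item[item].add(t)

    def items(self, tag):
        """Return every item carrying the given tag, on any service."""
        return set(self._items_by_tag.get(tag, set()))

    def tags(self, item):
        """Return every tag attached to the given item."""
        return set(self._tags_by_item.get(item, set()))


index = TagIndex()
# The same email gets both tags instead of sitting in a single folder.
index.tag("outlook:email-42", "Idea for New venture", "My blog posts")
# An item from another service (made-up identifier) shares one of them.
index.tag("flickr:photo-7", "My blog posts")
```

Asking for `index.items("My blog posts")` would then return both the Outlook email and the Flickr photo, which is the cross-service retrieval a folder cannot offer.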
I could tag part of my information, such as a paragraph, and I could even tag words or expressions, telling the system that this is an address, or a location, or a date. I could find all the information I tagged with the same keywords on all my other services. It would all be synchronized. To help with all that, there would be a central tag manager, which I could access either from my laptop or on the web, as it would also synchronize itself. Just like I organize my folders today with MS file explorer, I could also organize my tags the way I want, with relationships that are nested or not. The system would know which tags are synonyms and which ones are closely related. It could infer things, for example that a date tagged as such in my document refers to a historical event, and it would find all documents related to that event when I ask it about this date. Or it could break down the content in Wikipedia and reconstitute new documents based on the tags in each paragraph. It could even merge my content and others’ based on our tags.
But wait, tagging all that information would be way too cumbersome for me. And I would want it to be completely tagged, at all the different possible levels: word, expression, paragraph, document, even group of documents. I would need automated tagging, or automated tag suggestion for the information I prefer to tag myself in an authoritative way. A lot of my information would be tagged automatically: for example, at the more granular level, the system would find dates, locations and names and tag them all by itself. At a more conceptual level, it would tag information based on an analysis of the information to be tagged, the tags I’m already using, similar information I’ve already tagged, and similar information others have tagged, in a wisdom-of-the-crowd way.
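The granular end of automated tagging can be sketched in a few lines. The Python example below is only an illustration under stated assumptions: it recognizes a single English date format ("June 29th, 2008") with a regular expression, and the function name `autotag_dates` and the shape of the returned records are hypothetical. Finding locations and names, or tagging at the conceptual level, would need far more than pattern matching.

```python
import re

# Assumption: only English month names followed by a day and a
# four-digit year are recognized, e.g. "June 29th, 2008".
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_PATTERN = re.compile(
    rf"\b(?:{MONTHS})\s+\d{{1,2}}(?:st|nd|rd|th)?,\s+\d{{4}}\b"
)


def autotag_dates(text):
    """Return one tag record per date found in the text.

    Each record keeps the character span, so the tag applies to the
    expression itself rather than to the whole document.
    """
    return [
        {"tag": "date", "start": m.start(), "end": m.end(), "text": m.group()}
        for m in DATE_PATTERN.finditer(text)
    ]


sample = "Posted on June 29th, 2008 by Greg Boutin."
spans = autotag_dates(sample)
```

Storing spans rather than bare strings is one possible design choice: it lets the same document carry tags at the word, expression and paragraph levels without the tags colliding.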
I could share my set of tags as desired, and all shared tags could be publicly synthesized into a giant, dynamic tag base that could tell us things like which tag is most used, when and where. One could also spot buzz trends with such a service. This central tag base would contribute to the autotagging capabilities, too. Because it’s about eliminating the noise, the system would be smart enough to determine which are the 1, 5 or 20 tags that best apply to my information, and to what extent each unique tag applies to this data. This is a parameter I could personalize.
The system would do that in a way that facilitates data organization and data retrieval, the two key uses for my Supertags. Organization must be logical and highly descriptive. Retrieval must be fast. Those are two often-conflicting priorities. Tags solve that for me, because I can tag things with some tags used for organization purposes, like “Renaissance music”, and others used for retrieval purposes, like “Work in progress”. My system would also recommend content tagged in similar ways by others, and help me connect with like-minded users based on our common tags.
Even better, I could link information using tags too. I could tag a date in a document to a paragraph in another, using the tag “connect”. Or two names with the tag “is the daughter of”. Another person could link those documents differently, based on their view of the world.
To summarize, my system would tag anything I ask it to, in an automated or semi-automated way, using tags that would be usable in, or at least exportable to, any service, offering me content suggestions and social connections based on my tags and a shared tag repository. Tags would be streamlined based on semantic capabilities, so that synonyms are linked as such. The system would link information together using tags, and it would let me organize those tags while synchronizing them with all my tag-using applications and channels.
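The idea of linking two pieces of information with a tag such as “is the daughter of” amounts to storing labeled connections between items. Here is a minimal Python sketch, with the caveat that the `Link` record, the `LinkStore` class, and the example names and user identifiers are all hypothetical, chosen only for illustration.

```python
from collections import namedtuple

# A link is a relationship tag connecting two items, plus the user
# who asserted it, since two people may link the same items differently.
Link = namedtuple("Link", ["source", "relation", "target", "author"])


class LinkStore:
    """Hypothetical store of tag-based links between items."""

    def __init__(self):
        self._links = []

    def link(self, source, relation, target, author):
        """Record that `author` connects `source` to `target` via `relation`."""
        self._links.append(Link(source, relation, target, author))

    def related(self, source, relation=None, author=None):
        """Return targets linked from `source`, optionally filtered."""
        return [
            l.target
            for l in self._links
            if l.source == source
            and (relation is None or l.relation == relation)
            and (author is None or l.author == author)
        ]


store = LinkStore()
# Made-up names and users, for illustration only.
store.link("Anna", "is the daughter of", "Maria", author="me")
store.link("doc1:date", "connect", "doc2:paragraph-3", author="me")
store.link("doc1:date", "connect", "doc5:paragraph-1", author="someone-else")
```

Keeping the author on every link is what would let another person connect the same documents differently: `store.related("doc1:date", "connect", author="me")` returns only my own connection.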
In a future version, the system would review information for me, in the way I normally would, and tag it for me, based on a complex understanding of my goals. No more semi-automated tagging. The step after that would be for the system to review information and use it better than I would, in a prescient way, based on my objectives (or perhaps on objectives that, in my own eyes, are even superior to the ones I had initially). Then I think I might finally stop being frustrated with computing. I’m only a little worried that computing will be frustrated with me, and start to think about its own objectives. I’ve watched Terminator a little too often.
For now, I know that this vision is idealistic. So much so, in fact, that making the multi-level relational autotagging work would pretty much complete the dream of the semantic web! So let’s do what many start-ups should learn to do better: sequence the go-to-market strategy. See my next post for a suggested first step in the direction of Supertags.
