Digg makes official its adoption of a 'semantic Web' standard

By Scott M. Fulton, III | Published May 2, 2008, 12:14 PM

It could be the very thing the Web has lacked all these years, even with its wealth of intermingled hyperlinks: a markup language for conclusively identifying context. Now, Digg is making the bold attempt to be its biggest "beta tester."

One of the principal deficiencies of HTML and XHTML as markup languages has been the absence of any genuine, built-in feature for explaining to indexing services, or even to browsers with intelligent features, exactly what a page contains at a granular level. Metadata could conceivably help categorize data, assuming everything on a page fell into the same category; but with more Web pages these days comprising whole blogs, whole-page metadata is rapidly becoming useless.

The W3C, the standards body in charge of maintaining and developing the language of the Web, has been addressing this problem for several years; but while it has been busy building "standards," by the objective definition, very few sites have actually tried implementing them. One major exception is Digg, the social news aggregator, which last month quietly began trials of a W3C standard for labeling contextual data at a very low level -- meaning, right next to the data itself.

The concept is called RDFa, and essentially it's a way to put W3C's existing RDF framework for describing resources to real-world use by embedding it in the attributes of XHTML elements. It borrows RDF's rather inspired way of explaining what an element means, or what the space being reserved for an element (say, from a database) should mean.

It calls for a bit of a stretch of the imagination, because it uses a loose metaphor from the realm of common grammar: All context -- everything that can symbolize relative relevance in text -- can be represented in terms of a something "x" that does a certain something "y" to a something "z." In this case, "x" is the subject of the relationship and "z" is the object, in the grammatical (not the programming) sense. The action of that relationship is the "y," which in RDF is called the predicate (note, not "verb").
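To make the grammar metaphor concrete, here is a minimal Python sketch of a triple as a three-part record. The Triple type, the describe helper, the example.org address, and the sample headline are all invented for illustration; none of them comes from RDF itself or from Digg.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject "x", predicate "y", object "z"."""
    subject: str
    predicate: str
    obj: str  # called obj to avoid shadowing Python's built-in `object`


def describe(triple: Triple) -> str:
    """Render a triple as a readable 'x --y--> z' line."""
    return f"{triple.subject} --{triple.predicate}--> {triple.obj}"


# Hypothetical statement: a page (x) has the title (y) "Example headline" (z).
headline = Triple(
    subject="http://example.org/story/1",
    predicate="dc:title",
    obj="Example headline",
)
print(describe(headline))
```

Any amount of context about a page can then be modeled as a plain list of such records, which is what makes the three-part form attractive to an indexing service.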

These three items together form what RDF calls a triple; and in RDFa notation, a triple can be embedded into an HTML element -- such as <P>, <H2>, <IMG>, or <SPAN> -- in such a way that it effectively describes the context of the element's contents. This works once you declare the namespaces of the vocabularies you're using in the XHTML for the page.

For example, a paragraph about a subject described by an online resource can include in its <P> tag an about attribute that points to the HTTP address of that resource, which can also be declared a member of a defined class. Then whenever the name of that person is referenced, the name can be placed in a <SPAN> element with a property attribute set to a value such as contact:name, where contact is the prefix of a declared vocabulary and name is a property that vocabulary defines. This way, an index or smart browser can detect when and where a paragraph is about a person whose name is indexed and catalogued by an outside resource.

The "triple" in this case is easy: The subject is the resource that the about attribute points to, the predicate is the naming property, and the object is the person's name as it appears in the text. Imagine if you ran a certain online resource -- say, a wiki/encyclopedia thingie of some sort -- and what an advantageous position you might be in.
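As a rough sketch of how an index might read such a paragraph, the Python below pulls triples out of about and property attributes using only the standard library's html.parser. It assumes the RDFa reading in which the resource named by about is the subject and the enclosed text is the object. The RDFaSketch class, the extract_triples helper, the sample markup, and the example.org address are hypothetical, and this covers only a small slice of what a real RDFa processor does (it expects well-formed, properly closed elements).

```python
from html.parser import HTMLParser


class RDFaSketch(HTMLParser):
    """Collects (subject, predicate, object) from `about` and `property`."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._stack = []      # one entry per open element: its `about` or None
        self._capture = None  # (subject, predicate, depth) while reading text
        self._buf = []

    def _current_subject(self):
        # The nearest enclosing element with an `about` supplies the subject.
        for about in reversed(self._stack):
            if about is not None:
                return about
        return None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        self._stack.append(attributes.get("about"))
        if "property" in attributes and self._capture is None:
            self._capture = (self._current_subject(),
                             attributes["property"], len(self._stack))
            self._buf = []

    def handle_data(self, data):
        if self._capture is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        # Close the capture when the element that opened it ends.
        if self._capture is not None and self._capture[2] == len(self._stack):
            subject, predicate, _ = self._capture
            self.triples.append((subject, predicate, "".join(self._buf).strip()))
            self._capture = None
        if self._stack:
            self._stack.pop()


def extract_triples(html):
    parser = RDFaSketch()
    parser.feed(html)
    parser.close()
    return parser.triples


# Invented markup in the shape the paragraph above describes.
PAGE = """<p about="http://example.org/people/alice">
  The talk was given by <span property="contact:name">Alice Example</span> last week.
</p>"""

for triple in extract_triples(PAGE):
    print(triple)
```

The stack is there so that a name nested several elements deep still attaches to the nearest enclosing about, which is the behavior the paragraph-and-span example relies on.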

Digg's involvement in all of this came by way of a very brief announcement on its company blog yesterday, where principal member Steve Williams wrote, "We've added RDFa, making Digg part of the 'semantic web' where Web pages become more sophisticated, beyond simply words and pictures."

Williams is an active proponent of Digg's involvement in new and emerging standards, as demonstrated by his announcement last January of Digg's entry into the DataPortability project, a gathering place for standards efforts in the field of data exchange, in which RSS and RDF figure prominently.

Other brief mentions on Digg's blogs over the past month have been the only indications the company has given the world of its direct -- and perhaps even principal -- involvement in RDF and RDFa, apart from what a simple check of the site's own source code reveals: attributes such as rel="dc:source" and property="dc:title" within <DIV> elements are now common. A few weeks ago, developer Bob DuCharme discovered these attributes and began playing with them to test their viability.
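Prefixed values such as dc:title are shorthand: the part before the colon stands for a vocabulary address declared elsewhere on the page, conventionally through an xmlns attribute. The Python sketch below shows that expansion under a few loudly stated assumptions. The markup is invented rather than copied from Digg's source, the PrefixScanner class and expand_rdfa_attributes helper are hypothetical, and the dc prefix is assumed to be bound to the Dublin Core element set address, which may not be the binding Digg actually uses.

```python
from html.parser import HTMLParser


class PrefixScanner(HTMLParser):
    """Finds rel/property values and expands their prefixes to full URIs."""

    RDFA_ATTRS = ("rel", "property")

    def __init__(self):
        super().__init__()
        self.prefixes = {}  # prefix -> vocabulary address, from xmlns:* attributes
        self.found = []     # (tag, attribute, expanded URI)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Record any vocabulary declared on this element.
        for name, value in attributes.items():
            if name.startswith("xmlns:") and value:
                self.prefixes[name[len("xmlns:"):]] = value
        # Expand prefixed values; plain values such as rel="stylesheet" are skipped.
        for name in self.RDFA_ATTRS:
            value = attributes.get(name)
            if value and ":" in value:
                prefix, local = value.split(":", 1)
                if prefix in self.prefixes:
                    self.found.append((tag, name, self.prefixes[prefix] + local))


def expand_rdfa_attributes(html):
    scanner = PrefixScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.found


# Invented markup; NOT Digg's actual source.
SNIPPET = """<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <div rel="dc:source" href="http://example.org/original-story">
    <span property="dc:title">A hypothetical headline</span>
  </div>
</div>"""

for tag, attribute, uri in expand_rdfa_attributes(SNIPPET):
    print(tag, attribute, uri)
```

Expanding the prefix is what lets two unrelated sites agree that their "title" means the same thing: both resolve to one shared address rather than to a local label.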

On his personal blog, DuCharme wrote, "The first few times I tried the RDFa Highlight bookmarklet, which puts red rectangles around all the parts of a Web page that have RDFa metadata assigned, I didn't think it was very useful; I thought, OK, red rectangles, what can I do with them? My experience with Digg changed my mind. A single button click gives a very quick and intuitive display of how much RDFa a page offers to work with."

The possibility exists for a kind of mega-meta-source to emerge from Digg, where interesting news topics are associated with cataloged resources. But for that to actually work, someone has to manage those resources -- and that effort will take a level of humanpower and resources of another kind (the kind symbolized with "$") that RDF won't provide even the most ambitious sites just on its own.

Comments

This is a significant milestone in the Semantic Web movement. I see RDFa and Microformats are important technologies that will define the success of the Semantic Web. http://tinyurl.com/4fcuor

How can you have an article about rdf and no mention of Apple?

If this works out it should prove to be rather interesting. I wonder if they have given any thought to RIAs (Rich Internet Applications) such as Flex or Flash? especially since they are becoming more and more common...

All hail! Good job! Welcome! Kudos!
