Readability Introduces Iris, Context of Content

April 17, 2012

Speaking of reading-it-later-type services, Readability has Introduced Iris: A Big Leap Forward in Drawing Meaning from the Web. The short version is this:

With Iris, we’ve built an engine that you might call abstract—inspired by IBM’s Watson, the machine that beat contestants on Jeopardy!, Iris’ first order of business is to figure what type of content source is at hand. It analyzes a page, determines the likely context based on a number of factors and extracts what a human would expect as meaningful information from that source. Each context is fully malleable, and can be modified and improved upon individually.

And further in:

Once the content type is determined, there’s still the complex task of knowing precisely what to tease out of a web resource. Even web articles—Readability’s wheelhouse—are comprised of much more than just a headline and body text. With Iris, Readability gains the ability to glean a whole new level of insight into what facets of a web resource matters to readers and developers: titles and headlines. Subheadlines. Lead images. Videos. Excerpts. Authors. Languages. Captions. Beyond just a great end-user experience, Iris represents a powerful bridge to the new ways content is being consumed beyond the browser.

So as I understand it, there are two parts to this. One is a justification for pointing shared links for Readability content back to the Readability site instead of the original page, something that blew up a bit in their faces in the continuing passive-aggressive battle between Readability and Instapaper. The second is that when you view articles on the Readability site, instead of just getting the content and nothing more, articles will contain other meaningful data gleaned from analysis of the article by the new Iris engine.
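
To make the "other meaningful data" idea a bit more concrete, here is a rough sketch of pulling a few of the facets mentioned above (title, author, lead image, language) out of a page. To be clear, this is not how Iris works; Readability hasn't published that. It's just my guess at the simplest possible version, using only Python's standard library and assuming the page exposes these facets through a `<title>` tag, a `lang` attribute, an `author` meta tag and an Open Graph `og:image` tag. The names `FacetParser` and `extract_facets` are made up for illustration.

```python
from html.parser import HTMLParser


class FacetParser(HTMLParser):
    """Hypothetical helper: collects a few reader-facing facets from HTML."""

    def __init__(self):
        super().__init__()
        self.facets = {
            "title": None,
            "author": None,
            "lead_image": None,
            "language": None,
        }
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            # Assumption: the page declares its language on the <html> tag.
            self.facets["language"] = attrs.get("lang")
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Meta tags use either name="..." or property="..." (Open Graph).
            key = attrs.get("name") or attrs.get("property")
            content = attrs.get("content")
            if key == "author":
                self.facets["author"] = content
            elif key == "og:image":
                self.facets["lead_image"] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.facets["title"] = (self.facets["title"] or "") + data


def extract_facets(html):
    """Return a dict of facets found in the given HTML string."""
    parser = FacetParser()
    parser.feed(html)
    parser.close()
    if parser.facets["title"] is not None:
        parser.facets["title"] = parser.facets["title"].strip()
    return parser.facets


if __name__ == "__main__":
    sample = """<html lang="en"><head><title>Hello Iris</title>
    <meta name="author" content="Jane Doe">
    <meta property="og:image" content="http://example.com/lead.jpg">
    </head><body><p>Body text.</p></body></html>"""
    print(extract_facets(sample))
```

A real engine would obviously have to cope with pages that don't mark any of this up so neatly, which is presumably where the "determine the context first" step comes in.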

Posted by Arcterex at April 17, 2012 10:45 AM