WeMedia: Search World | Trust, relevance and rights

Jim Kennedy, VP Strategy at The Associated Press, with Fabrice Florin (NewsTrust), Mary Hodder (Dabble) and Josh Cohen (Google News), talking about search relevancy, metadata and what needs to be done to move search, news aggregation and filtering forward.
Question: What about the metadata people and machines need to apply in search?
Mary: The tagging (organizer) class makes metadata you can hook into, but the larger issue is how to harness metadata from people. Publishers do not always understand that: it’s “World Cup 2006” vs. “the head butt video on YouTube.” So, how do you match up how people understand things? …This is a huge problem…
Jim: Is there a standard form all publishers can use?
Mary: That’s the problem right there… There is huge value in microformats. As Kevin Marks says, putting the most basic elements in place can be a bottom-up way to arrive at a basic set of metadata, instead of trying to tag everything via a standards body.
Fabrice: At NewsTrust, we look at whether an article is fair, well sourced etc. We also keep track of reputation for sources of content and rate the raters.
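One way to picture "rating the raters" is a reputation-weighted average, where each review counts in proportion to how much the reviewer is trusted. This is only a minimal sketch under that assumption; the panel did not describe how NewsTrust actually computes its scores, and the function name, score scale and weights below are hypothetical.

```python
from typing import List, Optional, Tuple


def weighted_rating(reviews: List[Tuple[float, float]]) -> Optional[float]:
    """Combine article scores, weighting each by the rater's reputation.

    `reviews` holds (score, rater_weight) pairs. Both the 1-5 score scale
    and the weights are illustrative assumptions, not NewsTrust's scheme.
    """
    total_weight = sum(weight for _, weight in reviews)
    if total_weight == 0:
        return None  # no trusted input yet, so no rating
    return sum(score * weight for score, weight in reviews) / total_weight


# A low-reputation rater gives 4.0; a high-reputation rater gives 2.0.
article_score = weighted_rating([(4.0, 1.0), (2.0, 3.0)])
print(article_score)
```

With these made-up numbers the trusted rater pulls the result toward the lower score, which is the point of tracking rater reputation at all.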
Jim: AP is looking to do more text and entity extraction on news stories as a form of taxonomy that drives richer information retrieval, and is testing a metadata/tagging structure with a subset of members. (Susan says: This reminds me of the goal of having web pages fully hyperlinked, back around 1996.)
Question: So what is Google doing around news metadata, standards?
Josh: Google’s challenge is one of scale. We come at it from an algo standpoint. Google is describing how they geocode their pages and weight that data as part of the relevancy process; publishers will use their own methods.
Jim: It’s not about SEO; it’s about true relevancy.
Josh: The more data we can get the better, but it’s not always feasible.
Fabrice: We also need to know the source of the metadata–who added these tags?
Mary: This kind of sourcing is amazingly tough, especially at the person level. (Susan says: And expensive!) If we have this we can ferret out linkspam, etc., but it is difficult to translate across all stakeholders and to execute.
Amy G (question): In the writing of the content, what can we do to connect better with indexing and metadata?
Jim: There’s a need for structure and a data model: headline, dateline, name, all in a specific place.
Others: Type of content (breaking, analysis etc.).
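As a rough illustration of what "structure and a data model" could mean in practice, here is a minimal sketch of a story record with each field in a fixed place. The class and field names are assumptions made for this example (with "name" read as a byline); they are not an AP or Google format.

```python
from dataclasses import dataclass, asdict
from enum import Enum


class ContentType(Enum):
    # The two content types mentioned by the panel; a real list would be longer.
    BREAKING = "breaking"
    ANALYSIS = "analysis"


@dataclass
class NewsItem:
    # Hypothetical field names: every story carries the same slots,
    # so an indexer always knows where to look.
    headline: str
    dateline: str
    byline: str
    content_type: ContentType
    body: str = ""

    def to_record(self) -> dict:
        """Flatten to a plain dict that could be serialized for an indexer."""
        record = asdict(self)
        record["content_type"] = self.content_type.value
        return record


item = NewsItem(
    headline="Example headline",
    dateline="MIAMI",
    byline="Example Reporter",
    content_type=ContentType.ANALYSIS,
)
record = item.to_record()
print(record)
```

The design choice here is simply that nothing is optional except the body, so a missing headline or dateline fails when the item is created instead of surfacing later as a gap in search results.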
Jim: How do community ranking news systems like Digg help or hinder with data and relevancy?
Josh: No one has cracked the nut, but there is value to it. (Susan: Translation: Google is not giving greater relevancy to this type of weighting.)
Fabrice: We need to find out if a person has the expertise to have an opinion on a piece of information. What research have they done? What history? What happy medium tells if you can trust a source or if they are gaming?
Nick, Reuters: Will there ever be a model that rewards revenue against trust?
(Susan says: This is a better panel, but the Twitter stream comments on it are also fascinating. That is the backchannel dialogue that should be happening right here in the session; a projection device could offer that.)
Mary: dataportability.org is something to watch…


Latest Comments

  1. Jean-Baptiste says:

    It seems very interesting. A shame I was in another session about the presidential campaign. Is there any sort of document about this subject?
    Jean-Baptiste
