Sponsors
  • Etelos
  • IBM
  • Microsoft
  • Adobe Systems, Inc.
  • Cynergy
  • Nokia
  • Openmaru Studio
  • WebEx
  • AOL
  • Citrix Systems
  • Coghead
  • Confident Technologies
  • Disney
  • EffectiveUI
  • F5 Networks
  • HCL Technologies
  • Intuit Quickbase
  • Oracle
  • S60
  • Salesforce.com
  • Spinscape
  • Sun Microsystems
  • Symphoniq Corporation
  • TeleAtlas
  • Yahoo! Inc.
  • Amazon Web Services
  • Atlassian Software Systems
  • awareness
  • BroadSoft
  • Curl
  • Denodo
  • Dixero
  • Force10 Networks
  • Humanix Inc.
  • Intel
  • JackBe
  • Jaduka
  • Jive Software
  • Juniper Networks
  • Kapow Technologies
  • Keynote Systems
  • Leverage Software
  • LiquidApps
  • Lithium Technologies
  • LongJump
  • Morfik
  • Mzinga
  • NeuStar
  • Octopz
  • ONEsite
  • OpSource
  • Panther Express
  • Profy
  • Real Time Content
  • Rearden
  • Rearden Commerce
  • Remy
  • Reply
  • spigit
  • StreamVerse, Inc.
  • StrikeIron
  • XBOSoft
  • Znak
  • O'Reilly Alpha Tech Ventures
  • Panorama Capital
  • ACM Queue
  • Berlin Partner
  • BlogHer
  • Business Marketing Association
  • Dr. Dobbs
  • Fast Company
  • GigaOM
  • Juniper Research
  • Mashable
  • MSDN Magazine
  • NewTeeVee
  • Revenue Magazine
  • TechNet
  • Technorati
  • Topix
  • Webware
  • Wired
  • WOW

Sponsor & Exhibitor Opportunities

Vicki Sanders
415-947-6107
vsanders@techweb.com

Media Sponsor Opportunities

Liliana Arancibia
415-947-6179
larancibia@cmp.com

Press/Media Inquiries

confpr@oreilly.com

or

Natalia Wodecki
415-947-6762
NWodecki@cmp.com

Contact Us

View a complete list of Web 2.0 Expo contacts.

Social Data: Collecting, Mining, and Using it in Your Applications

Toby Segaran (Google)
Development
Location: 2003

Huge sets of data are generated every day by people using online applications, whether they’re blogging, shopping, or just clicking on links. Many techniques for analyzing and interpreting these datasets exist in the fields of visualization, data-mining, and machine learning, making it possible to use this data to draw new conclusions, build predictive models, and make web applications smarter.

This talk will explore that idea through analyses of several different types of data:

  • Movie preference data from MovieLens
  • The top bloggers on Technorati
  • Personal ads on Craigslist
  • Home prices on Zillow
  • Messages from Google Groups
  • People on Hot Or Not

And several analysis techniques:

  • Collaborative filtering
  • Hierarchical and K-Means Clustering
  • Multidimensional Scaling
  • Classification and Regression Trees
  • Independent Feature Extraction
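
As a rough illustration of the first technique in the list, the sketch below shows one common form of collaborative filtering: predicting a rating as a similarity-weighted average of other users' ratings. It is not taken from the talk. The ratings, the user names, and the helper names `cosine_similarity` and `predict` are hypothetical, and the choice of cosine similarity over shared items is an assumption made for brevity.

```python
from math import sqrt

# Hypothetical ratings (user -> {movie: score}); not real MovieLens data.
ratings = {
    "ann": {"Alien": 5.0, "Brazil": 3.0, "Casablanca": 4.0},
    "bob": {"Alien": 4.0, "Brazil": 2.0, "Casablanca": 5.0, "Dune": 4.0},
    "cat": {"Alien": 1.0, "Brazil": 5.0, "Dune": 2.0},
}


def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(user, item, data):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item.
    """
    num = 0.0
    den = 0.0
    for other, theirs in data.items():
        if other == user or item not in theirs:
            continue
        sim = cosine_similarity(data[user], theirs)
        num += sim * theirs[item]
        den += sim
    return num / den if den else None


if __name__ == "__main__":
    # "ann" has not rated "Dune"; estimate it from "bob" and "cat".
    print(predict("ann", "Dune", ratings))
```

In this toy data "ann" rates films more like "bob" than like "cat", so the estimate for "Dune" is pulled toward bob's rating. A production recommender would normally also correct for each user's rating bias and handle sparse overlap more carefully.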

Segaran will show you some of the collected data, give an overview of how the algorithms work, and present some results along with ideas about how the various techniques could be incorporated into a web application.

Toby Segaran

Google

Toby Segaran is the author of the O’Reilly title “Programming Collective Intelligence”, Amazon’s top-selling AI book. He was formerly the Director of Software Development at Genstruct and is now a Data Magnate at Metaweb. He loves applying data-mining algorithms to everything from pharmaceutical trials to the Technorati Top 100.