Mining Wikipedia For Awesome Data

Neil Crosby

What’s this about then?
• There’s loads of groovy content on Wikipedia[citation needed].
• You are lazy.
• You want groovy content on your site.

Wikipedia has an API
• Who knew?
• http://en.wikipedia.org/w/api.php

API has lots of options

Param      Values             What does it do?
format     php, json, TODO    Output format.
redirects  0, 1               Redirect to good pages.
rvsection  0, 1, 2, 3, etc.   Page section to get data for.
action     query, parse       API method.

Getting WikiText? Easy
• http://en.wikipedia.org/w/api.php?format=php&action=query&titles=one+flew+over+the+cuckoo’s+nest&rvprop=content&prop=revisions&redirects=1

Searching? Harder
• Wikipedia doesn’t have a good search engine.

Use Yahoo! BOSS
• http://boss.yahooapis.com/ysearch/web/v1/site:en.wikipedia.org+one+flew+over+the+cuckoo’s+nest?appid=yourBOSSiD
• First result: http://en.wikipedia.org/wiki/One_Flew_Over_the_Cuckoo's_Nest_(film)

Then get WikiText
• http://en.wikipedia.org/w/api.php?format=php&action=query&titles=One_Flew_Over_the_Cuckoo's_Nest_(film)&rvprop=content&prop=revisions&redirects=1

The WikiText
'''''One Flew Over the Cuckoo's Nest''''' is a [[1975 in film|1975]] [[comedy-drama]] film [[film director|directed]] by [[Miloš Forman]]. The film is an adaptation of the 1962 novel ''[[One Flew Over the Cuckoo's Nest (novel)|One Flew Over the Cuckoo's Nest]]'' by [[Ken Kesey]]. The movie was the first to [[List of Big Five Academy Award winners and nominees|win all five]]...

But I wanted HTML!
• WikiText is no good for dumping into a website.
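
The WikiText request shown in the slides can be sketched in a few lines of Python. This is only an illustration, not the speaker's code: the function names are made up, it asks for format=json rather than the slides' format=php so the reply is easy to parse, and it assumes the reply nests the page text under query → pages → (page id) → revisions → "*", which is how the API's older JSON layout looks.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the slides.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_wikitext_url(title):
    """Build the query URL for a page's WikiText (hypothetical helper)."""
    params = {
        "format": "json",      # assumption: JSON instead of the slides' php
        "action": "query",
        "titles": title,
        "rvprop": "content",
        "prop": "revisions",
        "redirects": 1,
    }
    # urlencode percent-escapes apostrophes, brackets and spaces in the title.
    return API_ENDPOINT + "?" + urlencode(params)


def extract_wikitext(response):
    """Pull the WikiText out of a decoded reply, or return None.

    Assumes the reply shape described above; pages without a
    'revisions' entry (for example missing pages) are skipped.
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            return revisions[0].get("*")
    return None


def fetch_wikitext(title):
    """Fetch and decode the WikiText for one title (needs network access)."""
    request = urllib.request.Request(
        build_wikitext_url(title),
        # Placeholder User-Agent; replace with something that identifies you.
        headers={"User-Agent": "wikitext-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_wikitext(json.load(reply))
```

Calling fetch_wikitext("One_Flew_Over_the_Cuckoo's_Nest_(film)") should return text like the WikiText slide above, provided the reply really has the assumed shape. The title would come from the first Yahoo! BOSS result, which this sketch does not cover.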