Mining Wikipedia For Awesome Data

Neil Crosby

What’s this about then?
• There’s loads of groovy content on Wikipedia[citation needed].
• You are lazy.
• You want groovy content on your site.

Wikipedia has an API
• Who knew?
• It has lots of options:

Param      Values             What does it do?
format     php, json, TODO    Output format.
redirects  0, 1               Redirect to good pages.
rvsection  0, 1, 2, 3, etc.   Page section to get data for.
action     query, parse       API method.

Getting WikiText? Easy
• ’s+nest&rvprop=content&prop=revisions&redirects=1

Searching? Harder
• Wikipedia doesn’t have a good search engine.

Use Yahoo! BOSS
• ’s+nest?appid=yourBOSSiD
• First result: 's_Nest_(film)

Then get WikiText
• One_Flew_Over_the_Cuckoo's_Nest_(film)&rvprop=content&prop=revisions&redirects=1

The WikiText
'''''One Flew Over the Cuckoo's Nest''''' is a [[1975 in film|1975]] [[comedy-drama]] film [[film director|directed]] by [[Miloš Forman]]. The film is an adaptation of the 1962 novel ''[[One Flew Over the Cuckoo's Nest (novel)|One Flew Over the Cuckoo's Nest]]'' by [[Ken Kesey]]. The movie was the first to [[List of Big Five Academy Award winners and nominees|win all five]]...

But I wanted HTML!
• WikiText is no good for dumping into a website.
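
The "get WikiText" request from the earlier slides could be assembled roughly like this. It is only a sketch: the full URLs did not survive in the slides, so the endpoint, the `titles` parameter, and the helper name `build_wikitext_url` are assumptions for illustration. The remaining parameters (`action`, `prop`, `rvprop`, `redirects`, `rvsection`, `format`) are the ones the slides list.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the English Wikipedia API endpoint. The slides' own URL is truncated.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_wikitext_url(title, section=None, endpoint=API_ENDPOINT):
    """Build (but do not send) a query URL asking for a page's WikiText.

    `build_wikitext_url` is a hypothetical helper, not something from the slides.
    """
    params = {
        "action": "query",      # API method
        "prop": "revisions",    # ask for page revisions...
        "rvprop": "content",    # ...and include their content (the WikiText)
        "redirects": 1,         # follow redirects to the good page
        "format": "json",       # output format
        "titles": title,        # assumption: page title goes in `titles`
    }
    if section is not None:
        params["rvsection"] = section  # restrict to one page section
    return endpoint + "?" + urlencode(params)


# Build the URL for the film page found via search, first section only.
url = build_wikitext_url("One_Flew_Over_the_Cuckoo's_Nest_(film)", section=0)
query = parse_qs(urlsplit(url).query)
print(url)
```

Nothing is fetched here; the URL can be handed to whatever HTTP client the site already uses. Building the query string with `urlencode` takes care of the apostrophe and parentheses in the page title.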