Data Mining With Excel 2007 And SQL Server 2008

Mark Tabladillo Ph.D.
MTabladillo <(at)> solidq.com
November 10, 2008

Approach of this Presentation
• Emphasize
– Conceptual value of data mining
– Relationship of data mining to the real world
• Reserve
– Specific procedures and mechanics
– Specific mathematics
– Production implementation

© 2008 Mark Tabladillo Ph.D.

Introduction
• Microsoft Data Mining (MDM) is a major branch of SQL Server Analysis Services (SSAS)
• The technology is supported by a new language within SSAS called DMX (Data Mining Extensions)
• Currently, the two promoted interfaces are BIDS (Business Intelligence Development Studio) and Excel 2007
• SQL Server 2008 has some improvements over 2005, but the main technology is similar
• A major improvement for 2008 is the documentation (Books Online)
• Microsoft’s team releases technology information at

Outline
• Main Conclusions on Data Mining
• Data Mining Definition
• Microsoft Data Mining Fundamentals
• Overview of Microsoft Data Mining Algorithms
• Conclusion

Four Interactive Demos
• Card Sorting
• Demographic Profiles
• Sports (College Football)
• Money (American Economy)

Data Mining Definitions
• Data mining is the automatic or semi-automatic process of exploring data for meaningful or useful patterns.
• Data mining algorithms typically use estimation or optimization to achieve results (as opposed to only calculations).

Data Mining Provides Insight
• Business
– What reasons contribute to stock price changes?
– Why do longer-term jobless benefits hit a 25-year high?
• Entertainment
– Who is more likely to lose a civil lawsuit?
– How well will new DVD sales do in the next few months?
• Sports
– How much should a sports team offer for a proven free agent?
– What factors lead to winning a tennis championship?
• Technology
– How does Cisco know there are warning signals in the tech sector?
– What is the net loss in losing corporate secrets?
• Politics
– What priorities do American voters have for the new President?
– Why did a certain candidate win or lose a race?
• Science
– What factors contribute to ozone holes over the Antarctic?
– Why do we believe that Tyrannosaurus Rex had a good sense of smell?