Statistics Seminar

Date/Time: Monday, 29 Sep 2014 from 4:10 pm to 5:00 pm
Location: Morrill 2019
Contact: Jeanette La Grange
Channel: Statistics Department
"How ISU Became World Data Mining Champs," Data Mining Cup Team*, Department of Statistics.

ISU graduate students won the 15th Data Mining Cup, making Iowa State the first American university to finish first in the annual college data analysis competition. In this presentation, team members discuss the contest and the approaches that led to their winning solution. This year's competition focused on data from an online retailer offering free return shipping. Teams were given purchase return history and asked to predict which of 50,000 shipped purchases were returned by the customer. The team built its prediction models by learning to predict returns on historical purchases whose known return outcomes were withheld. The approach was first to characterize each order fully by "engineering" a useful set of purchase feature variables. These features were then passed to state-of-the-art multivariate statistical methods to predict the probability that a shipment would be returned. A novel use of likelihood ratio statistics made it possible to exploit the predictive information contained in categorical variables. The talk highlights the statistical learning solutions to the various challenges encountered.
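
The abstract does not say how the likelihood ratio statistics were built, so the following is only an illustrative sketch of one common way to turn a categorical variable into a numeric feature, and it should not be read as the team's actual method. It assumes each level is scored by the smoothed log ratio of how often it appears among returned purchases versus kept purchases. The function name, the smoothing parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def likelihood_ratio_encoding(categories, returned, smoothing=1.0):
    """Score each level by log[P(level | returned) / P(level | kept)].

    Hypothetical helper for illustration. Additive smoothing keeps the
    ratio finite for levels that appear in only one of the two groups.
    """
    returned_counts = Counter(c for c, r in zip(categories, returned) if r)
    kept_counts = Counter(c for c, r in zip(categories, returned) if not r)
    levels = set(categories)
    n_returned = sum(returned_counts.values())
    n_kept = sum(kept_counts.values())
    n_levels = len(levels)

    scores = {}
    for level in levels:
        p_returned = (returned_counts[level] + smoothing) / (n_returned + smoothing * n_levels)
        p_kept = (kept_counts[level] + smoothing) / (n_kept + smoothing * n_levels)
        scores[level] = math.log(p_returned / p_kept)
    return scores


# Made-up toy data: item size and whether the purchase was returned (1) or kept (0).
sizes = ["S", "S", "M", "M", "L", "L", "L", "L"]
returned = [1, 1, 1, 0, 0, 0, 0, 0]

size_scores = likelihood_ratio_encoding(sizes, returned)
for level in sorted(size_scores, key=size_scores.get, reverse=True):
    print(level, round(size_scores[level], 3))
```

A positive score marks a level that is relatively more common among returns, and a negative score marks one more common among kept purchases. In practice such scores would be computed on training data only, so that the withheld outcomes do not leak into the feature.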

*Team members: Guillermo Basulto-Elias, Fan Cao, Xiaoyue Cheng, Marius Dragomirolu, Jessica Hicks, Cory Lanker, Ian Mouzon, Lanfeng Pan, Xin Yiu