DOI: 10.1145/1390156.1390282
research-article

Detecting statistical interactions with additive groves of trees

Published: 05 July 2008

ABSTRACT

Discovering additive structure is an important step towards understanding a complex multi-dimensional function because it allows the function to be expressed as the sum of lower-dimensional components. When variables interact, however, their effects are not additive and must be modeled and interpreted simultaneously. We present a new approach for the problem of interaction detection. Our method is based on comparing the performance of unrestricted and restricted prediction models, where restricted models are prevented from modeling an interaction in question. We show that an additive model-based regression ensemble, Additive Groves, can be restricted appropriately for use with this framework, and thus has the right properties for accurately detecting variable interactions.
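The comparison of restricted and unrestricted models described above can be illustrated with a minimal sketch. It does not use Additive Groves: as a stand-in, it assumes a quadratic least-squares model whose restricted variant simply omits the cross term for the pair of variables under test. The function names (`design`, `holdout_rmse`, `interaction_gap`), the synthetic data, and the choice of held-out RMSE as the performance measure are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def design(X, i, j, allow_interaction):
    """Quadratic basis. When restricted, the x_i * x_j cross term is left out,
    so the model cannot represent an interaction between variables i and j."""
    n, d = X.shape
    pair = (min(i, j), max(i, j))
    cols = [np.ones(n)]
    cols += [X[:, k] for k in range(d)]
    cols += [X[:, k] ** 2 for k in range(d)]
    for a in range(d):
        for b in range(a + 1, d):
            if (a, b) == pair and not allow_interaction:
                continue  # the restriction: drop the tested cross term
            cols.append(X[:, a] * X[:, b])
    return np.column_stack(cols)

def holdout_rmse(X_tr, y_tr, X_te, y_te, i, j, allow_interaction):
    """Fit by least squares on the training split, score on the held-out split."""
    A_tr = design(X_tr, i, j, allow_interaction)
    A_te = design(X_te, i, j, allow_interaction)
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return float(np.sqrt(np.mean((A_te @ coef - y_te) ** 2)))

def interaction_gap(X_tr, y_tr, X_te, y_te, i, j):
    """Restricted error minus unrestricted error for the pair (i, j).
    A large positive gap suggests the pair interacts."""
    restricted = holdout_rmse(X_tr, y_tr, X_te, y_te, i, j, False)
    unrestricted = holdout_rmse(X_tr, y_tr, X_te, y_te, i, j, True)
    return restricted - unrestricted

# Synthetic data (hypothetical): x0 and x1 interact, x2 enters additively.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 3))
y = X[:, 0] * X[:, 1] + X[:, 2] + 0.05 * rng.normal(size=2000)
X_tr, X_te, y_tr, y_te = X[:1000], X[1000:], y[:1000], y[1000:]

gap_01 = interaction_gap(X_tr, y_tr, X_te, y_te, 0, 1)  # interacting pair
gap_02 = interaction_gap(X_tr, y_tr, X_te, y_te, 0, 2)  # additive pair
print(f"gap(x0, x1) = {gap_01:.3f}, gap(x0, x2) = {gap_02:.3f}")
```

In this toy setting the restricted model loses accuracy only for the pair that truly interacts. How large a gap should count as evidence of an interaction is a separate question that this sketch does not address.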

Published in

ICML '08: Proceedings of the 25th international conference on Machine learning
July 2008
1310 pages
ISBN: 9781605582054
DOI: 10.1145/1390156

Copyright © 2008 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
