Dynamic itemset counting and implication rules for market basket data

Authors:
Sergey Brin, Department of Computer Science, Stanford University and R&D Division, Hitachi America Ltd.
Rajeev Motwani, Department of Computer Science, Stanford University
Jeffrey D. Ullman, Department of Computer Science, Stanford University
Shalom Tsur, R&D Division, Hitachi America Ltd.

ACM SIGMOD Record, Volume 26, Issue 2, June 1997, pp 255–264. https://doi.org/10.1145/253262.253325

Published: 01 June 1997


Abstract

We consider the problem of analyzing market-basket data and present several important contributions. First, we present a new algorithm for finding large itemsets which uses fewer passes over the data than classic algorithms, and yet uses fewer candidate itemsets than methods based on sampling. We investigate the idea of item reordering, which can improve the low-level efficiency of the algorithm. Second, we present a new way of generating “implication rules,” which are normalized based on both the antecedent and the consequent and are truly implications (not simply a measure of co-occurrence), and we show how they produce more intuitive results than other methods. Finally, we show how different characteristics of real data, as opposed to synthetic data, can dramatically affect the performance of the system and the form of the results.
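
The abstract does not state the formula behind its implication rules. As a rough illustration only, the sketch below assumes the measure usually attributed to this paper, conviction, defined as P(A)·P(¬B) / P(A ∧ ¬B), and contrasts it with plain confidence on a toy set of baskets. The function names and the basket data are hypothetical and are not taken from the paper.

```python
import math
from fractions import Fraction

def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return Fraction(sum(1 for b in baskets if items <= set(b)), len(baskets))

def confidence(baskets, antecedent, consequent):
    """P(B | A); assumes the antecedent occurs in at least one basket."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

def conviction(baskets, antecedent, consequent):
    """Assumed definition: P(A) * P(not B) / P(A and not B).

    Returns math.inf when A never occurs without B.
    """
    p_a = support(baskets, antecedent)
    p_b = support(baskets, consequent)
    p_ab = support(baskets, set(antecedent) | set(consequent))
    p_a_not_b = p_a - p_ab  # baskets containing A but missing part of B
    if p_a_not_b == 0:
        return math.inf
    return p_a * (1 - p_b) / p_a_not_b

# Hypothetical toy data: five baskets.
baskets = [
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk"},
    {"bread"},
    {"bread", "eggs"},
]

conf_mb = confidence(baskets, {"milk"}, {"bread"})   # 2/3
conv_mb = conviction(baskets, {"milk"}, {"bread"})   # 3/5
conv_bm = conviction(baskets, {"bread"}, {"milk"})   # 4/5
conv_eb = conviction(baskets, {"eggs"}, {"bread"})   # math.inf
```

On this toy data the rule milk → bread has confidence 2/3, which looks respectable, but bread appears in 4 of 5 baskets anyway. Under the assumed definition its conviction is 3/5, below the value of 1 expected for independent items, and the reverse rule bread → milk scores 4/5, so the measure depends on direction as well as on how common the consequent is.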


Index Terms

1. Information systems
  1. Data management systems
    1. Database design and models
2. Mathematics of computing
  1. Mathematical software


Published in
ACM SIGMOD Record, Volume 26, Issue 2, June 1997, 583 pages. ISSN: 0163-5808. DOI: 10.1145/253262
Chairman: Sudha Ram (Univ. of Arizona, Tucson). Editor: Joan M. Peckham (Univ. of Rhode Island, Kingston).
SIGMOD '97: Proceedings of the 1997 ACM SIGMOD international conference on Management of data, June 1997, 594 pages. ISBN: 0897919114. DOI: 10.1145/253260
Editors: Joan M. Peckham (Univ. of Rhode Island, Kingston), Sudha Ram (Univ. of Arizona, Tucson), Michael Franklin
Copyright © 1997 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Qualifiers: article
