research-article

Query by output

Authors:
Quoc Trung Tran

National University of Singapore, Singapore, Singapore

National University of Singapore, Singapore, Singapore
View Profile

,
Chee-Yong Chan

National University of Singapore, Singapore, Singapore

National University of Singapore, Singapore, Singapore
View Profile

,
Srinivasan Parthasarathy

The Ohio State University, Columbus, USA

The Ohio State University, Columbus, USA
View Profile

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of dataJune 2009Pages 535–548https://doi.org/10.1145/1559845.1559902

Published:29 June 2009Publication History

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

Pages 535–548

ABSTRACT

It has recently been asserted that the usability of a database is as important as its capability. Understanding the database schema, the hidden relationships among attributes in the data all play an important role in this context. Subscribing to this viewpoint, in this paper, we present a novel data-driven approach, called Query By Output (QBO), which can enhance the usability of database systems. The central goal of QBO is as follows: given the output of some query Q on a database D, denoted by Q(D), we wish to construct an alternative query Q′ such that Q(D) and Q′ (D) are instance-equivalent. To generate instance-equivalent queries from Q(D), we devise a novel data classification-based technique that can handle the at-least-one semantics that is inherent in the query derivation. In addition to the basic framework, we design several optimization techniques to reduce processing overhead and introduce a set of criteria to rank order output queries by various notions of utility. Our framework is evaluated comprehensively on three real data sets and the results show that the instance-equivalent queries we obtain are interesting and that the approach is scalable and robust to queries of different selectivities.

References

P. Andritsos, R. J. Miller, and P. Tsaparas. Information-theoretic tools for mining database structure from large data sets. In SIGMOD, pages 731--742, 2004. Google ScholarDigital Library
C. Binnig, D. Kossmann, and E. Lo. Reverse query processing. In ICDE, pages 506--515, 2007.Google ScholarCross Ref
C. Binnig, D. Kossmann, E. Lo, and M. T. Ä Ozsu. QAGen: generating query-aware test databases. In SIGMOD, pages 341--352, 2007. Google ScholarDigital Library
S. Borzsonyi, D. Kossmann, and K. Stocker. The skyline operator. In ICDE, pages 421--430, 2001. Google ScholarDigital Library
N. Bruno, S. Chaudhuri, and D. Thomas. Generating queries with cardinality constraints for dbms testing. IEEE TKDE, 18(12):1721--1725, 2006. Google ScholarDigital Library
P. Buneman, J. Cheney, W.-C. Tan, and S. Vansummeren. Curated databases. In PODS, pages 1--12, 2008. Google ScholarDigital Library
T. Gaasterl, P. Godfrey, and J. Minker. An overview of cooperative answering. Journal of Intelligent Information Systems, (2):123--157, 1992.Google ScholarCross Ref
L. Getoor, B. Taskar, and D. Koller. Selectivity estimation using probabilistic models. In SIGMOD, pages 461--472, 2001. Google ScholarDigital Library
P. Godfrey, J. Gryz, and C. Zuzarte. Exploiting constraint-like data characterizations in query optimization. In SIGMOD, pages 582--592, 2001. Google ScholarDigital Library
H. V. Jagadish, A. Chapman, A. Elkiss, M. Jayapandian, Y. Li, A. Nandi, and C. Yu. Making database systems usable. In SIGMOD, pages 13--24, 2007. Google ScholarDigital Library
T. Johnson, A. Marathe, and T. Dasu. Database exploration and bellman. 26(3):34--39, 2003.Google Scholar
G. Koutrika, A. Simitsis, and Y. Ioannidis. Précis: The essence of a query answer. In ICDE, page 69, 2006. Google ScholarDigital Library
M. Mehta, R. Agrawal, and J. Rissanen. SLIQ: A fast scalable classifier for data mining. In EDBT, pages 18--32, 1996. Google ScholarDigital Library
C. Mishra, N. Koudas, and C. Zuzarte. Generating targeted queries for database testing. In SIGMOD, pages 499--510, 2008. Google ScholarDigital Library
A. Motro. Intensional answers to database queries. IEEE TKDE, 6(3):444--454, 1994. Google ScholarDigital Library
M. P.N. Tan and V.Kumar. Introduction to Data Mining. Addison-Wesley, 2006. Google ScholarDigital Library
N. Ramakrishnan, D. Kumar, B. Mishra, M. Potts, and R. F. Helm. Turning cartwheels: An alternating algorithm for mining redescriptions. In KDD, pages 266--275, 2004. Google ScholarDigital Library
J. Rissanen. Modeling by shortest data description. Automatica, 14:465--471, 1978.Google ScholarDigital Library
A. Simitsis, G. Koutrika, and Y. E. Ioannidis. Generalized précis queries for logical database subset creation. In ICDE, pages 1382--1386, 2007.Google Scholar
Q. T. Tran, C.-Y. Chan, and S. Parthasarathy. Query by output. Technical Report TRA4/09, National University of Singapore - School of Computing, April 2009.Google Scholar
P. Valduriez. Join indices. ACM Trans. Database Syst., 12(2):218--246, 1987. Google ScholarDigital Library
C. J. van Rijsbergen. Information Retireval. Butterworth, 1979. Google ScholarDigital Library
W. Wu, B. Reinwald, Y. Sismanis, and R. Manjrekar. Discovering topical structures of databases. In SIGMOD, pages 1019--1030, 2008. Google ScholarDigital Library
X. Xiao and Y. Tao. Output perturbation with query relaxation. Proc. VLDB Endow., 1(1):857--869, 2008. Google ScholarDigital Library
M. M. Zloof. Query by example. In AFIPS NCC, pages 431--438, 1975. Google ScholarDigital Library

Index Terms

Query by output
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
  2. Information systems applications
    1. Data mining
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)

Recommendations

Query reverse engineering

In this paper, we introduce a new problem termed query reverse engineering (QRE). Given a database D and a result table T-the output of some known or unknown query Q on D-the goal of QRE is to reverse-engineer a query Q' such that the output of query Q' ...
Read More
Interactive Query Synthesis from Input-Output Examples
SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data

This demo showcases Scythe, a novel query-by-example system that can synthesize expressive SQL queries from input-output examples. Scythe is designed to help end-users program SQL and explore data simply using input-output examples. From a web-browser, ...
Read More
View-based query containment
PODS '03: Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Query containment is the problem of checking whether for all databases the answer to a query is a subset of the answer to a second query. In several data management tasks, such as data integration, mobile computing, etc., the data of interest are only ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
June 2009
1168 pages
ISBN:9781605585512
DOI:10.1145/1559845
Editors:
Carsten Binnig,
Benoit Dageville,
General Chairs:
Uğur Çetintemel
Brown University, USA
,
Stan Zdonik
Brown University, USA
,
Program Chair:
Donald Kossmann
ETH Zurich, Switzerland
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 June 2009
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
at-least-one semantics
instance-equivalent queries
query by output
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate785of4,003submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 92
  Total Citations
  View Citations
- 1,263
  Total Downloads
- Downloads (Last 12 months)38
- Downloads (Last 6 weeks)8
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Query by output

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

ABSTRACT

References

Cited By

Index Terms

Recommendations

Query reverse engineering

Interactive Query Synthesis from Input-Output Examples

View-based query containment

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Query by output

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

ABSTRACT

References

Cited By

Index Terms

Recommendations

Query reverse engineering

Interactive Query Synthesis from Input-Output Examples

View-based query containment

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media