Abstract
Much of the scanning literature focuses on unusual clusters of a given type of event in a single sequence of trials or time period. In this chapter, we discuss approaches to simultaneously scan multiple series. In one set of problems, there are multiple series corresponding to the occurrence of different types of events over the same period of time; the researcher looks for multiple-type clusters allowing for lagged effects between the different types of events. In the second set of problems, one scans multiple series looking for the largest common perfect or almost perfect match between all or most of the series. This second set of problems is of importance to molecular biologists searching for strong homologies in DNA sequences. Some related problems in two-dimensional scanning are mentioned.
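The second set of problems can be illustrated with a minimal sketch that computes the observed statistic: the longest word occurring in all, or in at least a chosen number, of the sequences. It is a brute-force illustration and not the chapter's method, which concerns the probability distribution of such match lengths under a random model. The function name `longest_common_word` and the parameter `min_sequences` are hypothetical names introduced here. The sketch handles only perfect matches at arbitrary (non-aligned) positions, not the almost perfect matches also mentioned above.

```python
def longest_common_word(sequences, min_sequences=None):
    """Return (length, word) for the longest word found in at least
    `min_sequences` of the given sequences (default: all of them).

    Brute-force sketch: exact matches only, any position in each sequence.
    Ties are broken by taking the alphabetically first word.
    """
    if min_sequences is None:
        min_sequences = len(sequences)
    if not sequences or min_sequences < 1:
        return 0, ""

    max_len = max(len(s) for s in sequences)
    # Try word lengths from longest to shortest and stop at the first success.
    for m in range(max_len, 0, -1):
        # Every distinct word of length m that appears in any sequence.
        candidates = {
            s[i:i + m] for s in sequences for i in range(len(s) - m + 1)
        }
        # Keep the words shared by enough sequences.
        hits = sorted(
            w for w in candidates
            if sum(w in s for s in sequences) >= min_sequences
        )
        if hits:
            return m, hits[0]
    return 0, ""


dna = ["ACGTTGCA", "TTGCAACG", "GGTTGCTT"]
print(longest_common_word(dna))                   # match across all three
print(longest_common_word(dna, min_sequences=2))  # match across at least two
```

Relaxing `min_sequences` corresponds to scanning for a match among "most" rather than "all" of the series. Judging whether an observed length is unusually large still requires the distributional results the chapter discusses.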
Copyright information
© 1999 Springer Science+Business Media New York
About this chapter
Cite this chapter
Naus, J.I. (1999). Scanning Multiple Sequences. In: Glaz, J., Balakrishnan, N. (eds) Scan Statistics and Applications. Statistics for Industry and Technology. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-1-4612-1578-3_4
DOI: https://doi.org/10.1007/978-1-4612-1578-3_4
Publisher Name: Birkhäuser, Boston, MA
Print ISBN: 978-1-4612-7201-4
Online ISBN: 978-1-4612-1578-3
eBook Packages: Springer Book Archive