skip to main content
article
Free Access

Software profiling for hot path prediction: less is more

Published:01 November 2000Publication History
Skip Abstract Section

Abstract

Recently, there has been a growing interest in exploiting profile information in adaptive systems such as just-in-time compilers, dynamic optimizers and, binary translators. In this paper, we show that sophisticated software profiling schemes that provide highly accurate information in an offline setting are ill-suited for these dynamic code generation systems. We experimentally demonstrate that hot path predictions must be made early in order to control the rising cost of missed opportunity that result from the prediction delay. We also show that existing sophisticated path profiling schemes, if used in an online setting, offer no prediction advantages over simpler schemes that exhibit much lower runtime overheads.Based on these observation we developed a new low-overhead software profiling scheme for hot path prediction. Using an abstract metric we compare our scheme to path profile based prediction and show that our scheme achieves comparable prediction quality. In our second set of experiments we include runtime overhead and evaluate the performance of our scheme in a realistic application: Dynamo, a dynamic optimization system. The results show that our prediction scheme clearly outperforms path profile based prediction and thus confirm that less profiling as exhibited in our scheme will actually lead to more effective hot path prediction.

References

  1. 1 Ammons, G., Ball, T., and Larus, J.R. Exploiting hardware performance counters with flow and context sensitive profiling. In Proc. of the 1997 Conf. on Programming Language Design and Implementation, June 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2 Anderson, J.M., Bere, L.M., Dean, J., Ghemawat, S., Henzinger, M.R., Leung, S.A., Sites, R.L., Vandevoorde, M.T., Waldspurger, C.A., and Weihl, W.E. Continuous profiling: Where have all the cycles gone? In Proc. of the 16 th ACM Syrup. on Operating Systems Principles, St. Malo, France. October 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3 Bala, V., Duesterwald, E., and Banerjia, S. Transparent dynamic optimization: The design and implementation of Dynamo. Hewlett Packard Laboratories Technical Report HPL-1999-78. June 1999.Google ScholarGoogle Scholar
  4. 4 Bala, V., Duesterwald, E., and Banerjia, S. Dynamo: A transparent runtime optimization system. In Proc. of the 2000 Conf. on Programming Language Design and Implementation. Vancouver, B.C., June 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. 5 Ball, T. and Larus, J.R. Efficient path profiling. In Proc. of the 29 th Int. Symp. on Microarchitecture, Paris. 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. 6 Ball, T., Mataga, P. and Sagiv, M. Edge profiling versus path profiling: The showdown. In Proc. of the 25 th Symp. on Principles of Programming Languages, San Diego, CA, January 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7 Burke, M., Choi, J.-D., Fink, S., Grove, D., Hind, M., Sarkar, V., Serrano, M.J., Sreedhar, V.C., Srinivasa, H. The Jalapeno Dynamic Optimizing Compiler for Java. In Proc. of the 1999 ACM dava Grande Conference, San Francisco, CA. June 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. 8 Chang, P., Mahlke, S.A., and Hwu, W.M. Using profile information to assist classic code optimization. Software - Practice and Experience, Vol. 21, No. 12, December 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. 9 Calder, B. and Grunwald, D. Fast and accurate instruction fetch and branch prediction. In Proc. of the 21st Int. Syrup. on Computer Architecture. April 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. 10 Cmelik, R.F. and Keppel, D. Shade: a fast instruction set simulator for execution profiling. Technical Report UWCSE- 93-06-06, Dept. Comp. Science and Engineering, Univ. Washington. 1993.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. 11 Ebeioglu, K., Altrnan E., Sathaye, S., and Gschwind, M. Execution-based scheduling for VLIW architectures. In Proc. of Europar'99, Lecture Notes in Computer Science 1685, Springer-Verlag 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. 12 McFarling, S., and Hennesy, J. Reducing the cost of branches. In Proc. of the 13 th lnt. Syrup. on Computer Architecture. 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. 13 Merten, C.M., Trick, A., George, C.N., Gyllenhaal, J.C., and Hwu, W.-M.W. A hardware-driven profiling scheme for identifying program hot spots to support runtime optimization. In Proc. of the 26 th Int. Syrup. on Computer Architecture. Atlanta, Georgia. 1999, Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. 14 Pan, S, So, K., and Rahmeh, J. Improving the accuracy of dynamic branch prediction using branch correlation. In Proc. of the 5th Int. Conf. on Architectural Support for Programming Languages and Operating Systems. 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. 15 Rotenberg, E., Bennett, S., and Smith, J.E. Trace cache: a low latency approach to high bandwidth instruction fetching. In Proc. of the 29 th lnt. Symp. on Microarchitecture, Paris. 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. 16 Sannella, M., Maloney, J., Freeman-Benson, B., and Borning, A. Multi-way versus one-way constraints in user interfaces: experiences with the DeltaBlue algorithm. Software - Practice and Experience 23, 5 (May). 529-566. 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. 17 Sathaye, S., Ledak, P., LeBlanc, J., Kosonocky, S., Gschwind, M., Fritts, J., Filan, Z., Bright, A., AppenzeUer, D., Airman, E., and Agricola, C. BOA: Targeting multigigahertz with binary translation. In Proc. of the 1999 Workshop on Binary Translation, Newport Beach, CA., October 1999.Google ScholarGoogle Scholar
  18. 18 Smith, M. Private communication, March 2000.Google ScholarGoogle Scholar
  19. 19 Yeh, T. and Patt, Y. A comparison of dynamic branch predictors that use two levels of branch history. In Proc. of the 20 th Int. Symp. on Computer Architecture. 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. 20 Young, C. and Smith, M. Static correlated branch prediction. ACM Transactions on Programming Languages and Systems, Vol. 21, No. 5, September 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. 21 Zhang, X. et al. System support for automatic profiling and optimization. In Proc. of the 16 th ACM Symposium on Operating Systems Principles, St. Malo, France. Oct. 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Software profiling for hot path prediction: less is more

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM SIGPLAN Notices
        ACM SIGPLAN Notices  Volume 35, Issue 11
        Nov. 2000
        269 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/356989
        Issue’s Table of Contents

        Copyright © 2000 Authors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 November 2000

        Check for updates

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader