
Data analysis in HEP: A statistical toolkit

Conference Paper


Abstract


  • Statistical methods play a significant role throughout the life-cycle of HEP experiments and are an essential component of physics analysis. Only a few basic tools for statistical analysis were available in the public-domain FORTRAN libraries for HEP, and the situation has hardly changed even among the new generation of libraries. The present project, which is in progress, aims to develop an object-oriented software toolkit for statistical data analysis. In particular, the Statistical Comparison component of the toolkit provides algorithms for comparing data distributions in a variety of use cases typical of HEP experiments, such as regression testing (in various phases of the software life-cycle), validation of simulation through comparison to experimental data, comparison of expected versus reconstructed distributions, and comparison of data from different sources, for example different sets of experimental data, or experimental versus theoretical distributions. The toolkit contains a variety of goodness-of-fit tests, from chi-squared and Kolmogorov-Smirnov to less well-known but generally much more powerful tests such as Anderson-Darling, Lilliefors, Cramér-von Mises and Kuiper. Thanks to the component-based design and the use of the standard AIDA interfaces, this tool can be used by other data analysis systems or integrated into experimental software frameworks. We present the architecture of the system, the statistical methods implemented, and the results of its first applications to the validation of the Geant4 Simulation Toolkit and to experimental data analysis. Keywords: Statistics, Goodness-of-fit tests, Distributions comparison, Data analysis, Software.
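
As an illustration of the kind of comparison the abstract describes, the following minimal sketch computes two of the named two-sample statistics, Kolmogorov-Smirnov and Kuiper, from the empirical distribution functions of two unbinned samples. It is not the toolkit's code or API: the function names (`ecdf_differences`, `ks_statistic`, `kuiper_statistic`) and the choice of Python are assumptions made purely for illustration, and the sketch returns only the test statistics, not p-values.

```python
from bisect import bisect_right


def ecdf_differences(sample_a, sample_b):
    """Return F_a(x) - F_b(x) at every distinct pooled observation x.

    F_a and F_b are the empirical cumulative distribution functions
    of the two samples. (Hypothetical helper, not part of any toolkit.)
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    diffs = []
    for x in sorted(set(a) | set(b)):
        f_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        diffs.append(f_a - f_b)
    return diffs


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: D = max |F_a - F_b|."""
    return max(abs(d) for d in ecdf_differences(sample_a, sample_b))


def kuiper_statistic(sample_a, sample_b):
    """Two-sample Kuiper statistic: V = max(F_a - F_b) + max(F_b - F_a)."""
    diffs = ecdf_differences(sample_a, sample_b)
    d_plus = max(0.0, max(diffs))
    d_minus = max(0.0, -min(diffs))
    return d_plus + d_minus


if __name__ == "__main__":
    reference = [1, 2, 5, 6]  # made-up values, e.g. a reference distribution
    candidate = [3, 4]        # made-up values, e.g. a new software release
    print("KS D     =", ks_statistic(reference, candidate))
    print("Kuiper V =", kuiper_statistic(reference, candidate))
```

In this toy input the two empirical distribution functions cross, so the Kuiper statistic, which adds the largest deviation in each direction, is larger than the Kolmogorov-Smirnov statistic, which keeps only the single largest deviation.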

Publication Date


  • 2003

Citation


  • Donadio, S., Guatelli, S., Mascialino, B., Pfeiffer, A., Pia, M. G., Ribon, A., & Viarengo, P. (2003). Data analysis in HEP: A statistical toolkit. In IEEE Nuclear Science Symposium Conference Record Vol. 1 (pp. 412-416). doi:10.1109/nssmic.2003.1352074

Scopus Eid


  • 2-s2.0-11844290148

Start Page


  • 412

End Page


  • 416

Volume


  • 1
