
An Algorithm for Generalized Conversion to Normal Distribution for Independent and Identically Distributed Random Variables

Louie Resti Sandoval Rellon - University of Mindanao, Philippines

Abstract

The paper analyzes an efficient alternative to the Box-Cox and Johnson transformation-to-normality methods that operates under fairly general settings. The method hinges on two results in mathematical statistics: the probability integral transform, by which F(X) has a U(0,1) distribution when X is a random variable with continuous cumulative distribution function F, and the Box-Muller transformation of uniform random variables to standard normal random variables. Bounds for the Kolmogorov-Smirnov statistic between the distribution of the transformed observations and the normal distribution are provided by numerical simulation and by appealing to the Dvoretzky-Kiefer-Wolfowitz inequality.

Keywords: transformation to normality, Box-Cox method, Johnson method, inequalities
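The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose details are not given here. It assumes that the unknown F is replaced by the empirical distribution function (ranks divided by n + 1), that consecutive observations are paired to feed the Box-Muller formula, and that the data have no ties. The function names `ecdf_uniforms`, `box_muller`, `to_normal`, `ks_to_standard_normal`, and `dkw_epsilon` are hypothetical and chosen only for illustration.

```python
import math
import random


def ecdf_uniforms(xs):
    """Map each observation to rank/(n+1), an empirical-CDF value in (0, 1)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def box_muller(u1, u2):
    """Turn two uniforms on (0, 1) into two standard normal deviates."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


def to_normal(xs):
    """Assumed pairing scheme: consecutive ECDF values go through Box-Muller."""
    u = ecdf_uniforms(xs)
    zs = []
    for k in range(0, len(u) - 1, 2):
        zs.extend(box_muller(u[k], u[k + 1]))
    return zs  # with an odd sample size, the last observation is dropped


def ks_to_standard_normal(zs):
    """Kolmogorov-Smirnov distance between the ECDF of zs and the N(0,1) CDF."""
    n = len(zs)
    d = 0.0
    for i, z in enumerate(sorted(zs)):
        phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        d = max(d, (i + 1) / n - phi, phi - i / n)
    return d


def dkw_epsilon(n, alpha):
    """Smallest eps with 2*exp(-2*n*eps**2) <= alpha (DKW, Massart constant)."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


random.seed(0)
sample = [random.expovariate(1.0) for _ in range(2000)]  # skewed input data
z = to_normal(sample)
ks = ks_to_standard_normal(z)
print(len(z), ks, dkw_epsilon(len(z), 0.05))
```

The Dvoretzky-Kiefer-Wolfowitz bound is stated for an i.i.d. sample from F, so the value returned by `dkw_epsilon` is printed only as a reference scale for the simulated Kolmogorov-Smirnov distance. The rank-based uniforms used here are only approximately independent.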


