A PROPOSAL FOR THE ANALYSIS OF MAIN COMPONENTS IN THE PRESENCE OF NON-RANDOM VARIABLES

Juliana Vieira GOMES
Camila Rafaela Gomes DIAS
José Ivo RIBEIRO JUNIOR

Abstract

Exploratory principal component (PC) analysis does not require the variables to follow a multivariate normal distribution, nor even to be random. Variables that do not behave randomly can therefore also be included in this analysis. Thus, in order to carry out PC analysis with variables that may or may not be random, a correction of the matrix based on the coefficients of variation (Campana et al., 2010) was proposed by applying the method of Lenth (1989), whose new matrix was named . To verify its feasibility, ten data sets of the random variables Y1, Y2, Y3 and Y4 were simulated, with 10,000 values each, following a multivariate normal distribution. After the simulation, 0%, 1%, 2%, 3% and 4% of the random values of Y4 were replaced by outliers in order to break its randomness. Subsequently, response surface analyses were performed for eight mean absolute percentage errors, obtained for eight parameters related to the performance of the PC analysis, as a function of the percentage of Y4 values replaced by outliers (0, 1, 2, 3 and 4%) and the matrices used in the PC analysis. According to the results, it was concluded that, in the presence of only normal random variables,  it is the best matrix. On the other hand, when there are outliers, it is the most recommended.
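As a rough illustration of the simulation design described in the abstract, the following Python sketch draws Y1 to Y4 from a multivariate normal distribution, replaces a given percentage of the Y4 values with outliers, and runs a PC analysis on the ordinary correlation matrix. The means, covariances, the 10-standard-deviation outlier shift, and all function names are assumptions made for illustration; they are not taken from the article. The abstract also does not state how the method of Lenth (1989) enters the proposed correction, so the sketch shows only the standard pseudo standard error from that paper and does not reproduce the corrected matrix itself.

```python
import numpy as np


def simulate_with_outliers(n=10_000, pct_outliers=0.0, seed=0):
    """Draw Y1..Y4 from a multivariate normal and contaminate part of Y4."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(4)             # assumed means (not given in the abstract)
    cov = np.full((4, 4), 0.5)     # assumed common covariance of 0.5
    np.fill_diagonal(cov, 1.0)     # assumed unit variances
    y = rng.multivariate_normal(mean, cov, size=n)

    k = int(round(n * pct_outliers / 100))  # number of Y4 values to replace
    if k > 0:
        idx = rng.choice(n, size=k, replace=False)
        # Assumed outlier rule: shift the selected Y4 values by 10 standard deviations.
        y[idx, 3] = y[idx, 3] + 10.0
    return y


def pca_from_correlation(y):
    """PC analysis via the eigendecomposition of the sample correlation matrix."""
    corr = np.corrcoef(y, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals = eigvals[order]
    return eigvals, eigvecs[:, order], eigvals / eigvals.sum()


def lenth_pse(effects):
    """Pseudo standard error of Lenth (1989) for a set of effect estimates."""
    abs_e = np.abs(np.asarray(effects, dtype=float))
    s0 = 1.5 * np.median(abs_e)
    # Trim effects that look active (>= 2.5 * s0) before taking the median again.
    return 1.5 * np.median(abs_e[abs_e < 2.5 * s0])


if __name__ == "__main__":
    for pct in (0, 1, 2, 3, 4):
        y = simulate_with_outliers(pct_outliers=pct)
        _, _, explained = pca_from_correlation(y)
        print(pct, np.round(explained, 3))
```

Comparing the proportion of variance explained by each PC across the five contamination levels gives a first impression of how sensitive the correlation-based analysis is to the Y4 outliers under these assumed settings.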

How to Cite
GOMES, J. V., DIAS, C. R. G., & RIBEIRO JUNIOR, J. I. (2022). A PROPOSAL FOR THE ANALYSIS OF MAIN COMPONENTS IN THE PRESENCE OF NON-RANDOM VARIABLES. Brazilian Journal of Biometrics, 40(3). https://doi.org/10.28951/bjb.v40i3.551

References

CAMPANA, A. C. M.; RIBEIRO JÚNIOR, J. I.; NASCIMENTO, M. Uma proposta de transformação de dados para a análise de componentes principais. Revista Brasileira de Biometria, v. 28, p. 1-15, 2010.

FERREIRA, D. F. Estatística multivariada. 2.ed. Lavras: Editora UFLA, 2009. 676p.

HOTELLING, H. Review of the triumph of mediocrity in business. Journal of the American Statistical Association, v. 28, p. 463-465, 1933.

JOHNSON, R. A.; WICHERN, D. W. Applied multivariate statistical analysis. 5.ed. New Jersey: Prentice Hall, 2002. 767p.

LAWSON, J. SAS macros for analysis of unreplicated 2^k and 2^(k-p) designs with a possible outlier. Journal of Statistical Software, v. 25, p. 1-17, 2008.

LENTH, R. V. Quick and easy analysis of unreplicated factorials. Technometrics, v. 31, p. 469-473, 1989.

MINGOTI, S. A. Análise de dados através de métodos de estatística multivariada: uma abordagem aplicada. Belo Horizonte: Editora UFMG, 2007. 297p.

R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing, 2020. URL https://www.r-project.org