Barros, E. A. C. & Mazucheli, J. 2005. Um estudo sobre o tamanho e poder dos testes t-Student e Wilcoxon. Acta Scientiarum: Technology 27(1), 23–32.

Benavoli, A., Corani, G., Demšar, J. & Zaffalon, M. 2017. Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis. Journal of Machine Learning Research 18(1), 1–36.

Berben, L., Sereika, S. M. & Engberg, S. 2012. Effect size estimation: methods and examples. International Journal of Nursing Studies 49(8), 1039–1047.

Bertsimas, D. & Dunn, J. 2017. Optimal classification trees. Machine Learning 106(7), 1039–1082.

Breiman, L. 2001. Random forests. Machine Learning 45(1), 5–32.

Bussab, W. O. & Morettin, P. 2010. Estatística Básica, 6a. edição. Editora Saraiva.

Cardoso, D. O., Gama, J. & França, F. M. 2017. Weightless neural networks for open set recognition. Machine Learning 106(9–10), 1547–1567.

Cohen, J. 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd edition. Erlbaum.

Cousins, S. & Taylor, J. S. 2017. High-probability minimax probability machines. Machine Learning 106(6), 863–886.

Cover, T. & Hart, P. 1967. Nearest neighbor pattern classification. IEEE Transactions on Information Theory 13(1), 21–27.

Dheeru, D. & Taniskidou, E. K. 2017. UCI machine learning repository. http://archive.ics.uci.edu/ml.

du Plessis, M. C., Niu, G. & Sugiyama, M. 2017. Class-prior estimation for learning from positive and unlabeled data. Machine Learning 106(4), 463–492.

Fern, E. F. & Monroe, K. B. 1996. Effect-size estimates: issues and problems in interpretation. Journal of Consumer Research 23(2), 89–105.

Fisher, R. A. 1925. Statistical Methods for Research Workers. Springer.

Fritz, C. O., Morris, P. E. & Richler, J. J. 2012. Effect size estimates: current use, calculations, and interpretation. Journal of Experimental Psychology: General 141(1), 2–18.

Gomes, H. M., Bifet, A., Read, J., Barddal, J. P., Enembreck, F., Pfahringer, B., Holmes, G. & Abdessalem, T. 2017. Adaptive random forests for evolving data stream classification. Machine Learning 106(9–10), 1469–1495.

Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E. & Tatham, R. L. 2009. Análise multivariada de dados. Bookman Editora.

Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J. & Schölkopf, B. 1998. Support vector machines. IEEE Intelligent Systems and Their Applications 13(4), 18–28.

Huang, K. H. & Lin, H. T. 2017. Cost-sensitive label embedding for multi-label classification. Machine Learning 106(9–10), 1725–1746.

Japkowicz, N. & Shah, M. 2011. Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press.

Júnior, P. R. M., de Souza, R. M., Werneck, R. d. O., Stein, B. V., Pazinato, D. V., de Almeida, W. R., Penatti, O. A., Torres, R. d. S. & Rocha, A. 2017. Nearest neighbors distance ratio open-set classifier. Machine Learning 106(3), 359–386.

Kim, D. & Oh, A. 2017. Hierarchical Dirichlet scaling process. Machine Learning 106(3), 387–418.

Kline, R. B. 2004. Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research. American Psychological Association.

Kotłowski, W. & Dembczyński, K. 2017. Surrogate regret bounds for generalized classification performance metrics. Machine Learning 106(4), 549–572.

Krijthe, J. H. & Loog, M. 2017. Projected estimators for robust semi-supervised classification. Machine Learning 106(7), 993–1008.

Langley, P., Iba, W. & Thompson, K. 1992. An analysis of Bayesian classifiers. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), California, AAAI Press, 90, 223–228.

Mena, D., Montañés, E., Quevedo, J. R. & Del Coz, J. J. 2017. A family of admissible heuristics for A* to perform inference in probabilistic classifier chains. Machine Learning 106(1), 143–169.

ML Journal. 2017. Machine Learning 106(1–12). https://link.springer.com/journal/10994/106/1

Nakagawa, S. & Cuthill, I. C. 2007. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biological Reviews 82(4), 591–605.

Neumann, N. M., Plastino, A., Junior, J. A. P. & Freitas, A. A. 2018. Is p-value < 0.05 enough? Two case studies in classifiers evaluation (in Portuguese). In Anais do XV Encontro Nacional de Inteligência Artificial e Computacional, SBC, 94–103.

Osojnik, A., Panov, P. & Džeroski, S. 2017. Multi-label classification via multi-target regression on data streams. Machine Learning 106(6), 745–770.

Snyder, P. & Lawson, S. 1993. Evaluating results using corrected and uncorrected effect size estimates. The Journal of Experimental Education 61(4), 334–349.

Sullivan, G. M. & Feinn, R. 2012. Using effect size-or why the p-value is not enough. Journal of Graduate Medical Education 4(3), 279–282.

Suzumura, S., Ogawa, K., Sugiyama, M., Karasuyama, M. & Takeuchi, I. 2017. Homotopy continuation approaches for robust SV classification and regression. Machine Learning 106(7), 1009–1038.

Tomczak, M. & Tomczak, E. 2014. The need to report effect size estimates revisited. An overview of some recommended measures of effect size. Trends in Sport Sciences 21(1), 19–25.

Wasserstein, R. L. & Lazar, N. A. 2016. The ASA’s statement on p-values: context, process, and purpose. The American Statistician 70(2), 129–133.

Witten, I. H., Frank, E., Hall, M. A. & Pal, C. J. 2016. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.

Wu, Y. P. & Lin, H. T. 2017. Progressive random k-labelsets for cost-sensitive multi-label classification. Machine Learning 106(5), 671–694.

Xuan, J., Lu, J., Zhang, G., Da Xu, R. Y. & Luo, X. 2017. A Bayesian nonparametric model for multi-label learning. Machine Learning 106(11), 1787–1815.

Yu, F. & Zhang, M. L. 2017. Maximum margin partial label learning. Machine Learning 106(4), 573–593.

Zaidi, N. A., Webb, G. I., Carman, M. J., Petitjean, F., Buntine, W., Hynes, M. & De Sterck, H. 2017. Efficient parameter learning of Bayesian network classifiers. Machine Learning 106(9–10), 1289–1329.