Grzymala-Busse, Jerzy W.
2017-11-09
2017-11-09
2016-02-25
Grzymala-Busse, J. W., & Mroczek, T. (2016). A comparison of four approaches to discretization based on entropy. Entropy, 18(3), 69.
https://hdl.handle.net/1808/25317
We compare four discretization methods, all based on entropy: the original C4.5 approach to discretization, two globalized methods, known as equal interval width and equal frequency per interval, and a relatively new method for discretization called multiple scanning using the C4.5 decision tree generation system. The main objective of our research is to compare the quality of these four methods using two criteria: an error rate evaluated by ten-fold cross-validation and the size of the decision tree generated by C4.5. Our results show that multiple scanning is the best discretization method in terms of the error rate and that decision trees generated from datasets discretized by multiple scanning are simpler than decision trees generated directly by C4.5 or generated from datasets discretized by both globalized discretization methods.
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license.
https://creativecommons.org/licenses/by/4.0/
Data mining
Discretization
Numerical attributes
Entropy
A Comparison of Four Approaches to Discretization Based on Entropy †
Article
10.3390/e18030069
openAccess