Publication Server of the University Library of Marburg

Title: Machine Learning Methods for Fuzzy Pattern Tree Induction
Author: Senge, Robin
Other contributors: Hüllermeier, Eyke (Prof. Dr.)
Published: 2014
URI: https://archiv.ub.uni-marburg.de/diss/z2014/0391
URN: urn:nbn:de:hebis:04-z2014-03918
DOI: https://doi.org/10.17192/z2014.0391
DDC: Computer science
Title (trans.): Maschinelles Lernen für Fuzzy Pattern Tree Induktion
Publication date: 2014-09-25
License: https://rightsstatements.org/vocab/InC-NC/1.0/

Document

Keywords:
Fuzzy-Menge, Induktion, Fuzzy-Set, Fuzzy Pattern Tree, Maschinelles Lernen, Supervised Learning, Machine Learning, Fuzzy-Logik, Fuzzy-Logic, Überwachtes Lernen, Induction

Summary:
This thesis elaborates on a novel approach to fuzzy machine learning, that is, the combination of machine learning methods with mathematical tools for modeling and information processing based on fuzzy logic. More specifically, the thesis is devoted to so-called fuzzy pattern trees, a model class that has recently been introduced for representing dependencies between input and output variables in supervised learning tasks, such as classification and regression. Due to its hierarchical, modular structure and the use of different types of (nonlinear) aggregation operators, a fuzzy pattern tree has the ability to represent such dependencies in a very flexible and compact way, thereby offering a reasonable balance between accuracy and model transparency. The focus of the thesis is on novel algorithms for pattern tree induction, i.e., for learning fuzzy pattern trees from observed data. In total, three new algorithms are introduced and compared to an existing method for the data-driven construction of pattern trees. While the first two algorithms are mainly geared toward an improvement of predictive accuracy, the last one focuses on efficiency aspects and seeks to make the learning process faster. The description and discussion of each algorithm is complemented with theoretical analyses and empirical studies in order to show the effectiveness of the proposed solutions.
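The model class described above can be illustrated with a small sketch. This is a hypothetical, minimal rendering of the general idea (leaves carry fuzzified input attributes as membership degrees in [0, 1]; inner nodes combine their subtrees with aggregation operators such as t-norms, t-conorms, and averaging operators), not the induction algorithms or implementation from the thesis; all names are invented for illustration.

```python
# Illustrative sketch of a fuzzy pattern tree (hypothetical API, not the
# thesis implementation). Leaves return the membership degree of a
# fuzzified input attribute; inner nodes aggregate their children's
# degrees with a (possibly nonlinear) aggregation operator.

def tnorm_min(a, b):
    # minimum t-norm: conjunction-like aggregation
    return min(a, b)

def tconorm_max(a, b):
    # maximum t-conorm: disjunction-like aggregation
    return max(a, b)

def weighted_avg(a, b, w=0.5):
    # compensatory averaging operator
    return w * a + (1.0 - w) * b

class Leaf:
    """Leaf node: looks up a fuzzified attribute (a degree in [0, 1])."""
    def __init__(self, attribute):
        self.attribute = attribute
    def evaluate(self, x):
        return x[self.attribute]

class Node:
    """Inner node: aggregates the membership degrees of two subtrees."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def evaluate(self, x):
        return self.op(self.left.evaluate(x), self.right.evaluate(x))

# Example tree: AVG( MIN(a, b), MAX(c, d) )
tree = Node(weighted_avg,
            Node(tnorm_min, Leaf("a"), Leaf("b")),
            Node(tconorm_max, Leaf("c"), Leaf("d")))

# A sample whose attributes are already fuzzified to membership degrees.
sample = {"a": 0.9, "b": 0.6, "c": 0.2, "d": 0.7}
print(round(tree.evaluate(sample), 3))  # min -> 0.6, max -> 0.7, avg -> 0.65
```

The hierarchical, modular structure is what the abstract refers to: each subtree is itself an interpretable fuzzy pattern, and swapping the operator at an inner node changes the logical character of the combination without touching the rest of the tree.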



* The document is freely available on the internet - notes on usage rights