Publication Server of the University Library of Marburg

Title: Learning nonlinear monotone classifiers using the Choquet Integral
Author: Fallah Tehrani, Ali
Further contributors: Hüllermeier, Eyke (Prof. Dr.)
Published: 2014
URI: https://archiv.ub.uni-marburg.de/diss/z2014/0358
URN: urn:nbn:de:hebis:04-z2014-03580
DOI: https://doi.org/10.17192/z2014.0358
DDC: Computer science
Title (trans.): Einsatz des Choquet Integrals für das maschinelle Lernen monotoner Klassifikationsmodelle
Publication date: 2014-07-09
License: https://rightsstatements.org/vocab/InC-NC/1.0/

Document

Keywords:
Choquet integral, monotone classifier, logistic regression

Summary:
In recent years, learning predictive models that guarantee a monotone relationship between input and output variables has received growing attention in machine learning. Especially for flexible nonlinear models, ensuring monotonicity poses a major implementation challenge. This thesis uses the Choquet integral as a mathematical basis for developing new models for nonlinear classification tasks. Beyond its established role as a flexible aggregation function in multi-criteria decision making, the formalism thereby becomes an important tool for machine learning models. In addition to reconciling monotonicity and flexibility in a mathematically elegant way, the Choquet integral offers means to quantify interactions between groups of input attributes, from which interpretable models can be obtained. The thesis develops concrete methods for learning with the Choquet integral based on two different approaches: maximum likelihood estimation and structural risk minimization. While the first approach leads to a generalization of logistic regression, the second is realized by means of support vector machines. In both cases, the learning problem essentially reduces to identifying the parameters of a fuzzy measure for the Choquet integral. The exponential number of degrees of freedom required to model all subsets of attributes poses particular challenges with respect to runtime complexity and generalization performance. Against this background, both approaches are evaluated empirically and analyzed theoretically. Moreover, suitable techniques for complexity reduction and model regularization are proposed and investigated. The experimental results, also on challenging benchmark problems, compare very favorably with state-of-the-art methods and highlight the usefulness of combining the monotonicity and flexibility of the Choquet integral in different machine learning approaches.
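The following minimal Python sketch (not taken from the thesis) illustrates the two building blocks mentioned above: the discrete Choquet integral with respect to a capacity (fuzzy measure), and a logistic probability model built on top of it, in the spirit of the maximum-likelihood approach that generalizes logistic regression. The function names choquet_integral and choquistic_probability, the parameters gamma and beta, and the hand-chosen 2-additive example capacity are illustrative assumptions; the 2-additive Möbius representation only hints at the complexity-reduction idea discussed in the abstract.

import numpy as np
from itertools import chain, combinations

def choquet_integral(x, mu):
    # Discrete Choquet integral of x (attribute values scaled to [0, 1]) with
    # respect to a capacity mu: a dict mapping frozensets of attribute indices
    # to [0, 1], monotone, with mu(empty set) = 0 and mu(all attributes) = 1.
    order = [int(i) for i in np.argsort(x)]     # indices sorted by increasing value
    result, prev = 0.0, 0.0
    for k, idx in enumerate(order):
        level_set = frozenset(order[k:])         # attributes with value >= x[idx]
        result += (x[idx] - prev) * mu[level_set]
        prev = x[idx]
    return result

def choquistic_probability(x, mu, gamma, beta):
    # P(y = 1 | x) = sigmoid(gamma * (C_mu(x) - beta)): a logistic model whose
    # utility part is the Choquet integral (gamma: precision, beta: threshold).
    return 1.0 / (1.0 + np.exp(-gamma * (choquet_integral(x, mu) - beta)))

# Illustrative 2-additive capacity over 3 attributes, given by its Moebius
# transform: only singletons and one pair carry mass, so the number of free
# parameters grows quadratically rather than exponentially in the attributes.
moebius = {frozenset({0}): 0.3, frozenset({1}): 0.2, frozenset({2}): 0.1,
           frozenset({0, 1}): 0.4}              # positive interaction of attributes 0 and 1
all_subsets = chain.from_iterable(combinations(range(3), r) for r in range(4))
mu = {frozenset(s): sum(v for t, v in moebius.items() if t <= frozenset(s))
      for s in all_subsets}                     # capacity = sum of Moebius masses of subsets

x = np.array([0.7, 0.4, 0.9])
print(choquet_integral(x, mu))                              # 0.54
print(choquistic_probability(x, mu, gamma=5.0, beta=0.5))   # about 0.55

In an actual learning procedure, the Möbius masses (subject to monotonicity and normalization constraints) together with gamma and beta would be fitted to training data, for example by maximizing the likelihood; the sketch only evaluates a fixed, hand-chosen capacity.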

* The document is freely accessible on the Internet - information on the terms of use