Label Ranking with Probabilistic Models

Cheng, Weiwei

Title:	Label Ranking with Probabilistic Models
Author:	Cheng, Weiwei
Other contributors:	Hüllermeier, Eyke (Prof.)
Published:	2012
URI:	https://archiv.ub.uni-marburg.de/diss/z2012/0493
DOI:	https://doi.org/10.17192/z2012.0493
URN:	urn:nbn:de:hebis:04-z2012-04934
DDC:	Computer science
*Title (trans.):*	Label Ranking mit probabilistischen Modellen
Publication date:	2012-07-11
License:	https://rightsstatements.org/vocab/InC-NC/1.0/

Document

Keywords:
Label ranking, Probabilistic models, Artificial intelligence (AI), Machine learning, Preference learning

Summary:
This thesis focuses on a particular prediction task known as label ranking. In short, label ranking can be viewed as an extension of the conventional classification problem. Given a query (e.g., from a customer) and a predefined set of candidate labels (e.g., AUDI, BMW, VW), classification requires a single label (e.g., BMW) as the prediction, whereas label ranking requires a complete ranking of all labels (e.g., BMW > VW > AUDI). Since predictions of this kind are useful in many real-world problems, label ranking methods can be applied in several areas, including information retrieval, customer preference learning, and e-commerce. This thesis presents a selection of label ranking methods that combine machine learning with statistical ranking models. We concentrate on two statistical ranking models, the Mallows model and the Plackett-Luce model, and on two machine learning techniques, instance-based learning and generalized linear models.
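The difference between classification and label ranking can be illustrated with a minimal Python sketch based on the Plackett-Luce model. It is not taken from the thesis: the function name `plackett_luce_probability` and the skill values are hypothetical and chosen only for illustration. Under the Plackett-Luce model, a ranking is built by repeatedly picking the next label with probability proportional to its positive skill parameter among the labels not yet ranked.

```python
from itertools import permutations


def plackett_luce_probability(ranking, skill):
    """Probability of `ranking` (best label first) under a Plackett-Luce model.

    `skill` maps each label to a positive parameter (hypothetical values here).
    """
    prob = 1.0
    remaining = sum(skill[label] for label in ranking)
    for label in ranking:
        # Choose `label` among the labels that have not been ranked yet.
        prob *= skill[label] / remaining
        remaining -= skill[label]
    return prob


# Assumed skill parameters for a single query (e.g. one customer).
skill = {"BMW": 3.0, "VW": 2.0, "AUDI": 1.0}

# Classification: predict a single label.
top_label = max(skill, key=skill.get)

# Label ranking: predict the most probable complete ordering of all labels.
best_ranking = max(
    permutations(skill),
    key=lambda r: plackett_luce_probability(r, skill),
)

print(top_label)
print(" > ".join(best_ranking))
print(plackett_luce_probability(best_ranking, skill))
```

With these assumed parameters, the ranking BMW > VW > AUDI has probability 3/6 · 2/3 · 1 = 1/3. Enumerating all permutations is only feasible for a handful of labels; for the Plackett-Luce model the most probable ranking is obtained directly by sorting the labels by decreasing skill.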

The document is freely accessible on the Internet - notes on usage rights