/master_thesis

My research and findings on multilingual classification of Wikipedia articles


Master's thesis

Title: Multilingual classification in Wikipedia

Wikipedia is the largest encyclopedia ever created and one of the most important resources on the Web. A key challenge is characterizing its content and inferring the general topic of a page. The problem becomes even harder when topics must be inferred in a language-independent way, with the additional constraint that a semantic concept should always be mapped to the same topic, regardless of the language used to describe it. In this thesis, I study and analyze the problem as a whole, define how data coming from different Wikipedia language editions can be merged into a single graph, and develop a framework for the multilingual classification of Wikipedia pages.
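
To make the idea of merging language editions into a single graph more concrete, here is a minimal Python sketch. It is not the method used in the thesis. It assumes that pages connected by interlanguage links describe the same concept, collapses them into one node with a union-find structure, and then counts hyperlinks between the resulting concept nodes. The function name `merge_editions`, the input format, and the sample pages are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge_editions(pages, langlinks, links):
    """Collapse pages from several language editions into concept nodes.

    Assumed (hypothetical) input format:
      pages     -- iterable of (lang, title) tuples
      langlinks -- iterable of (page_a, page_b) pairs naming the same concept
      links     -- iterable of (source_page, target_page) hyperlinks
    Returns (concept_of, edges): a page -> representative page mapping and
    a dict counting hyperlinks between distinct concept nodes.
    """
    parent = {p: p for p in pages}

    def find(p):
        # Follow parent pointers to the representative, halving the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in langlinks:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller tuple as representative so the result
            # does not depend on the order of the interlanguage links.
            parent[max(ra, rb)] = min(ra, rb)

    edges = defaultdict(int)
    for src, dst in links:
        u, v = find(src), find(dst)
        if u != v:  # ignore links that stay inside one concept
            edges[(u, v)] += 1

    concept_of = {p: find(p) for p in parent}
    return concept_of, dict(edges)


# Illustrative data only, not taken from the thesis.
pages = [("en", "Rome"), ("it", "Roma"), ("en", "Italy"), ("it", "Italia")]
langlinks = [
    (("en", "Rome"), ("it", "Roma")),
    (("en", "Italy"), ("it", "Italia")),
]
links = [
    (("en", "Rome"), ("en", "Italy")),
    (("it", "Roma"), ("it", "Italia")),
]

concept_of, edges = merge_editions(pages, langlinks, links)
```

In this sketch the English and Italian pages for the same city end up in one node, so a classifier operating on the merged graph would assign them the same topic by construction. Real interlanguage links can be inconsistent, which this simple merge does not handle.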