Multi-objective topic modelling resources
Here are resources associated with the paper:
"Multi-Objective Topic Modelling", by Khalifa, Corne, Chantler and
Halley.
Please find a PDF of a self-archived preliminary version of the paper here.
The paper can be cited as:
Khalifa, O., Corne, D., Chantler, M., Halley, F. (2013)
Multi-Objective Topic Modelling,
in: Evolutionary Multi-Criterion Optimization (EMO 2013), (Eds:
Purshouse, Fleming, Fonseca, Greco, Shaw), Springer LNCS, 15pp, 2013,
to appear.
Please keep checking back for a better description and the inclusion of
anything that we're told is missing!
Source code
Binaries and corpora
How to use the binaries: java -jar MOEATM.jar
Generating the Wikipedia index for PMI calculations:
The index can be generated using the following steps:
- download a Wikipedia copy from here:
http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
- extract the downloaded file and get rid of all the Wikipedia XML format
tags (there are plenty of tools for this on the internet) in order to
produce one large plain text file comprising all Wikipedia articles.
- in order to create the Wikipedia index, this code is required:
http://www.macs.hw.ac.uk/~ok32/WikiLucene_src.zip
http://www.macs.hw.ac.uk/~ok32/WikiLucene.jar
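
The extraction and tag-removal step in the list above is left to third-party tools. As a rough illustration only, here is a minimal Python sketch of one way it could be done; it is not the tool the authors used. It assumes the dump follows the usual MediaWiki export layout, in which each article body sits in a <text> element, and the function names (iter_article_text, strip_tags, dump_to_plain_text) are hypothetical. It removes XML and inline HTML-style tags only; wiki markup such as [[links]] and {{templates}} is left in place and would need a dedicated tool.

```python
import bz2
import re
import xml.etree.ElementTree as ET

# Matches anything that looks like an inline tag, e.g. <ref name="x"> or </b>.
TAG_RE = re.compile(r"<[^>]+>")


def local_name(tag):
    """Drop the XML namespace prefix, e.g. '{ns}text' -> 'text'."""
    return tag.rsplit("}", 1)[-1]


def iter_article_text(stream):
    """Yield the raw body of every <text> element, parsing the XML as a stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "text" and elem.text:
            yield elem.text
        elif name == "page":
            # The page is finished: drop its children to limit memory use.
            elem.clear()


def strip_tags(text):
    """Remove inline tags and collapse all runs of whitespace to single spaces."""
    return " ".join(TAG_RE.sub(" ", text).split())


def dump_to_plain_text(dump_path, out_path):
    """Write one line of tag-free text per article from a .xml.bz2 dump."""
    with bz2.open(dump_path, "rb") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for body in iter_article_text(src):
            dst.write(strip_tags(body) + "\n")
```

With the dump downloaded, a call such as dump_to_plain_text("enwiki-latest-pages-articles.xml.bz2", "wikipedia.txt") would produce the single plain text file; the output file name here is an arbitrary choice. Reading the compressed file directly avoids having to store the much larger uncompressed XML on disk.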