The project's goal is to obtain, process, and apply machine learning algorithms to Wikipedia articles. First, selected articles from Wikipedia are downloaded and saved. Second, a corpus is generated, that is, the collection of all text documents. Third, each document's text is preprocessed, e.g. by removing stop words and symbols, and then tokenized. Fourth, the tokenized text […]
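The preprocessing step (the third step above) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual implementation: the function name `preprocess`, the tiny `STOP_WORDS` set, and the ASCII-only symbol filter are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# a real pipeline would load a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "in", "on", "to", "by"}


def preprocess(text: str) -> list[str]:
    """Lowercase the text, strip symbols, tokenize, and drop stop words."""
    lowered = text.lower()
    # Replace every character that is not an ASCII letter, digit, or
    # whitespace with a space (a simplification: accented letters are lost).
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # Tokenize by splitting on runs of whitespace.
    tokens = cleaned.split()
    # Keep only tokens that are not stop words.
    return [token for token in tokens if token not in STOP_WORDS]


if __name__ == "__main__":
    sample = "The cat sat on the mat, and purred!"
    print(preprocess(sample))
```

Applied to every document in the corpus, a function like this would yield one token list per article, which later steps can consume.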