Basic Text Mining using Python

Assignment for the Intelligent Systems course of the EIT Digital Data Science master's programme at UPM

Abstract

This project performs a basic analysis of a provided textual corpus on head and neck cancer medication. First, the dataset is preprocessed by filtering on the seer stage field and creating additional columns. Next, a basic word cloud is created and its results are discussed, followed by research into more advanced techniques for word cloud generation. The approaches used include TextRank, MultipartiteRank, TopicRank, PositionRank, YAKE, TF-IDF, SingleRank and a custom text rank. The implementation is provided as a Jupyter Notebook.
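
To illustrate one of the listed approaches, the following is a minimal sketch of TF-IDF term weighting using only the Python standard library. It is not the notebook's implementation: the toy documents in `DOCS` and the helper names `tokenize`, `tfidf_scores` and `top_terms` are hypothetical and chosen only for illustration, and the weighting uses the plain `tf * log(N / df)` form without smoothing.

```python
import math
import re
from collections import Counter

# Hypothetical toy documents; the real corpus is not reproduced here.
DOCS = [
    "cisplatin radiation therapy for head and neck cancer",
    "cetuximab with radiation for neck cancer",
    "cisplatin dose reduced after toxicity",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(docs, k=3):
    """Sum each term's weight over the corpus and return the k heaviest terms."""
    totals = Counter()
    for doc_scores in tfidf_scores(docs):
        totals.update(doc_scores)
    return totals.most_common(k)
```

The summed weights form a term-to-weight mapping that a word cloud can be drawn from instead of raw counts, for instance with `WordCloud.generate_from_frequencies` if the `wordcloud` package is the library in use.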

Authors