Welcome to the DARA Big Data project on Natural Language Processing and sentiment analysis of COVID19-related data from Twitter! From these tutorials you will learn i) how to collect data from Twitter and refine your search for Tweets, ii) how to clean and prepare your Twitter data for NLP purposes, iii) how to use existing libraries to perform sentiment analysis, and iv) how to use machine learning to obtain sentiment predictions.
The DARA Big Data hackathon is designed to help you improve your data science skills in a friendly and supportive environment. Students should have a basic working knowledge of Python (including the scipy and numpy libraries) - but you do not have to be an expert to take part and enjoy yourself!
The easiest way to get all of the lecture and tutorial material is to clone this repository. To do this you need git installed on your laptop. If you're working on Linux you can install git using apt (you might need to use sudo):
> apt install git
You can then clone the repository by typing:
> git clone https://github.com/darabigdata/COVID19_Twitter_Project
To update your clone if changes are made, use:
> cd COVID19_Twitter_Project/
> git pull
Then make sure you have the right Python libraries for the tutorials. They can all be installed using pip and the requirements.txt file in the repo:
> pip install -r requirements.txt
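After installing the requirements, a quick way to confirm that your environment is usable is to check that the libraries can be found by Python. The snippet below is a minimal sketch of such a check, not part of the repository: the helper name `check_libraries` is hypothetical, and the list only contains numpy and scipy because those are the two libraries named above. The actual requirements.txt may list more packages, which you can add to the list yourself.

```python
import importlib.util


def check_libraries(names):
    """Return the names from `names` that Python cannot find in this environment.

    Hypothetical helper for illustration; intended for top-level package names.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: numpy and scipy are the libraries to check (named in the text above).
    missing = check_libraries(["numpy", "scipy"])
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are installed.")
```

If anything is reported as missing, re-running the pip command with the requirements.txt file from inside the cloned repository is the first thing to try.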
