Documentation for Twitter US Airline Sentiment Analysis Repository
This repository contains the code for a sentiment analysis project that predicts the sentiment of tweets about US airlines. The dataset contains over 14,000 tweets, each with its text and a sentiment label: positive, negative, or neutral.
Dependencies: The following dependencies must be installed to run the code in this repository:
Python 3, NumPy, pandas, NLTK
The repository contains the following files and directories:
Twitter US Airline Sentiment.ipynb: The main Jupyter Notebook containing the code for the sentiment analysis project.
The first step in the sentiment analysis process is preprocessing the data. This involves cleaning and preparing the data for analysis. The following steps are performed in the preprocessing phase:
1. Loading the data into a pandas DataFrame
2. Removing any irrelevant columns
3. Removing any missing values
4. Converting the sentiment labels to numerical values
5. Tokenizing and stemming the text of the tweets
6. Creating a bag-of-words representation of the tweets
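The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the in-memory sample data, the column names (tweet_id, airline_sentiment, text), the label encoding, and the choice of NLTK's TweetTokenizer and PorterStemmer are all assumptions made for the example.

```python
from collections import Counter

import pandas as pd
from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

# Hypothetical stand-in for the real dataset; the column names are assumptions.
raw = pd.DataFrame(
    {
        "tweet_id": [1, 2, 3, 4],
        "airline_sentiment": ["negative", "positive", "neutral", None],
        "text": [
            "@airline my flight was delayed again",
            "@airline thanks for the great service!",
            "@airline what time does boarding start?",
            "@airline lost my bags",
        ],
    }
)

# Keep only the columns needed for modelling, then drop rows with missing values.
df = raw[["airline_sentiment", "text"]].dropna().reset_index(drop=True)

# Convert the sentiment labels to integers (this particular encoding is an assumption).
label_map = {"negative": 0, "neutral": 1, "positive": 2}
df["label"] = df["airline_sentiment"].map(label_map)

# TweetTokenizer is rule-based, so it needs no extra NLTK data downloads.
tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True)
stemmer = PorterStemmer()


def tokenize_and_stem(text):
    """Lowercase, drop @handles, keep alphabetic tokens, and stem them."""
    return [stemmer.stem(tok) for tok in tokenizer.tokenize(text) if tok.isalpha()]


df["tokens"] = df["text"].apply(tokenize_and_stem)

# Bag of words: one token-to-count mapping per tweet.
df["bow"] = df["tokens"].apply(Counter)
```

With the real data, the DataFrame would be loaded from the dataset file rather than built inline; the remaining steps would stay the same in outline.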
Once the data has been preprocessed, the next step is to train a sentiment analysis model on the data. This project uses the Natural Language Toolkit (NLTK) library in Python to perform sentiment analysis.
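This document does not state which NLTK model the notebook trains. As one possible illustration, the sketch below assumes NLTK's NaiveBayesClassifier, which takes a list of (feature dictionary, label) pairs. The training examples and the bag_of_words helper are hypothetical and are not taken from the dataset.

```python
from nltk import NaiveBayesClassifier


def bag_of_words(tokens):
    """Binary bag-of-words featureset in the dict form NLTK classifiers expect."""
    return {token: True for token in tokens}


# Hypothetical, already tokenized and stemmed examples (not from the dataset).
train_data = [
    (["flight", "delay", "again"], "negative"),
    (["lost", "my", "bag"], "negative"),
    (["terribl", "delay", "rude", "staff"], "negative"),
    (["great", "servic", "thank"], "positive"),
    (["love", "the", "crew", "thank"], "positive"),
    (["what", "time", "is", "board"], "neutral"),
]

# Pair each featureset with its label and train the classifier.
train_set = [(bag_of_words(tokens), label) for tokens, label in train_data]
classifier = NaiveBayesClassifier.train(train_set)

# Classify a new, preprocessed tweet.
prediction = classifier.classify(bag_of_words(["delay", "again"]))
```

In practice the labeled tweets would be split into training and test sets, so that the model is evaluated on tweets it has not seen during training.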
The performance of the sentiment analysis model was evaluated using the following metrics:
Accuracy
The model achieved an accuracy of 52%.
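Accuracy is the fraction of tweets whose predicted label matches the true label. The following is a small sketch of that calculation; the function name and the sample labels are hypothetical and do not reproduce the notebook's results.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for four tweets; two of the four predictions are correct.
y_true = ["negative", "negative", "positive", "neutral"]
y_pred = ["negative", "positive", "positive", "negative"]
score = accuracy(y_true, y_pred)
```

For a trained NLTK classifier, nltk.classify.accuracy(classifier, test_set) computes the same quantity directly from a list of (featureset, label) pairs.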
This sentiment analysis project demonstrates the use of the Natural Language Toolkit (NLTK) library to predict the sentiment of tweets about US airlines. At 52% accuracy on a three-class task, the results show that NLTK can be used to build a working baseline for sentiment analysis, though the model leaves considerable room for improvement.