Description: In this assignment, you will write a Python program that reads and tokenizes the data. In the training data, the first column is the reviewer ID, the second column indicates whether the review is fake or true, the third column indicates whether the review is positive or negative, and the rest of the line is the review text. Your task is to learn, from the review text, whether a review is fake or true and whether it is positive or negative.
Input Data
Your first task is to read the data into Python objects.
- Extract the labels
  ['Fake', 'Neg']
- Extract each review
  I was very disappointed with … the chain’s reputation.
- Tokenize the sentences
- Store the extracted data to lists
- Repeat it for all the data
- Print out the first and the last labels from your stored list
- Print out the first and the last tokenized reviews from your stored list
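The steps above could be sketched roughly as follows. This is only a starting point, not a reference solution. It assumes the columns are separated by whitespace, that each line holds one review, and that splitting on whitespace is an acceptable tokenizer; the function names parse_line, load_data, and main are hypothetical.

```python
import sys


def parse_line(line):
    """Split one line into (labels, tokens), or return None if malformed.

    Assumed layout (whitespace-separated): reviewer id, fake/true label,
    positive/negative label, then the review text.
    """
    parts = line.split(maxsplit=3)
    if len(parts) < 4:
        return None  # blank line or missing columns
    _reviewer_id, truth_label, sentiment_label, review = parts
    # Naive whitespace tokenization (an assumption; punctuation stays attached).
    tokens = review.split()
    return [truth_label, sentiment_label], tokens


def load_data(path):
    """Read the whole file and return parallel lists of labels and token lists."""
    labels, reviews = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parsed = parse_line(line)
            if parsed is None:
                continue
            labels.append(parsed[0])
            reviews.append(parsed[1])
    return labels, reviews


def main(argv):
    if len(argv) != 2:
        print("usage: python learn.py training-text.txt")
        return
    labels, reviews = load_data(argv[1])
    if not labels:
        print("no reviews found")
        return
    # First and last labels, then first and last tokenized reviews.
    print(labels[0], labels[-1])
    print(reviews[0], reviews[-1])


if __name__ == "__main__":
    main(sys.argv)
```

Splitting with maxsplit=3 keeps the review text in one piece, so it can be tokenized separately from the id and label columns. If the data turns out to be tab-separated or to need punctuation-aware tokens, only parse_line has to change.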
To Run:
>> python learn.py training-text.txt