Pig programming

PROBLEM 1
Select the frequent words (those whose count is greater than or equal to 50,000).
Display the frequent words in descending order of count.
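
The selection logic of Problem 1 can be sketched outside Pig to check a script's output on a small sample. This is a plain-Python reference, not Pig Latin, and it rests on assumptions the problem statement does not confirm: the input is taken to be a flat sequence of word occurrences, "descending order" is read as descending by count, and the names frequent_words and min_count are hypothetical.

```python
from collections import Counter


def frequent_words(words, min_count=50_000):
    """Return (word, count) pairs with count >= min_count, largest count first.

    Assumption: `words` is an iterable of individual word occurrences.
    If the real data set already stores (word, count) pairs, skip the Counter step.
    """
    counts = Counter(words)
    # Keep only words that reach the threshold (greater than or equal to).
    frequent = [(word, n) for word, n in counts.items() if n >= min_count]
    # Sort by count, descending; ties are broken alphabetically for stable output.
    return sorted(frequent, key=lambda pair: (-pair[1], pair[0]))


# Tiny sample with a lowered threshold so the behaviour is easy to inspect.
sample = ["the"] * 5 + ["pig"] * 3 + ["of"] * 3 + ["rare"]
print(frequent_words(sample, min_count=3))
```

In a Pig script the same three steps would typically map to a grouping with a count, a FILTER on that count, and an ORDER ... DESC before the final DUMP.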
PROBLEM 2
Group the words by their length (hint: use the built-in function SIZE) and count the words in each group.
For example,
(2,1096049) means that there are 1096049 occurrences of words that have two characters.
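
As a rough cross-check for Problem 2, the grouping can be reproduced in plain Python on a small sample. This is only a sketch under stated assumptions: the input is taken to be a sequence of word occurrences, Python's len stands in for Pig's SIZE, and the function name count_by_length is hypothetical.

```python
from collections import Counter


def count_by_length(words):
    """Return (length, occurrences) pairs, ordered by word length.

    Assumption: `words` is an iterable of individual word occurrences,
    so a word that appears twice contributes 2 to its length group.
    """
    # len(word) plays the role that SIZE plays in the Pig version.
    counts = Counter(len(word) for word in words)
    return sorted(counts.items())


sample = ["to", "be", "or", "not", "to", "be"]
print(count_by_length(sample))
```

Each output pair has the same shape as the (2,1096049) example: the word length first, then the number of occurrences with that length.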
Problem 3 is based on the dataset nyc_taxi_data_2014.csv.gz.
PROBLEM 3
Find the effect of passenger_count on trip_distance, fare_amount, and tip_rate.
a) Create a new data set, records2, that has passenger_count, trip_distance, fare_amount, 
and tip_rate (tip_amount/total_amount).
b) Filter records2 by passenger_count (0 < passenger_count < 10) and name the resulting 
data set records3.
c) Group records3 by passenger_count. 
d) Display the average trip_distance, average fare_amount, and average tip_rate for each 
passenger_count group.
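
Steps a) to d) can be sketched in plain Python as a reference for validating a Pig script on a handful of rows. This is a minimal sketch, not Pig Latin, and several details are assumptions: each row is taken to be a tuple (passenger_count, trip_distance, fare_amount, tip_amount, total_amount) already parsed from the CSV, rows with a total_amount of zero get a tip_rate of None and are left out of the tip_rate average, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def tip_rate(tip_amount, total_amount):
    """tip_amount / total_amount, or None when total_amount is zero (assumption)."""
    return tip_amount / total_amount if total_amount else None


def averages_by_passenger_count(rows):
    """Map passenger_count -> (avg trip_distance, avg fare_amount, avg tip_rate)."""
    # a) records2: keep the three fields and derive tip_rate.
    records2 = [(p, dist, fare, tip_rate(tip, total))
                for p, dist, fare, tip, total in rows]
    # b) records3: keep rows with 0 < passenger_count < 10.
    records3 = [r for r in records2 if 0 < r[0] < 10]
    # c) group records3 by passenger_count.
    groups = defaultdict(list)
    for record in records3:
        groups[record[0]].append(record)
    # d) average each field per group; undefined tip rates are skipped.
    result = {}
    for passengers, members in sorted(groups.items()):
        rates = [m[3] for m in members if m[3] is not None]
        result[passengers] = (
            mean([m[1] for m in members]),
            mean([m[2] for m in members]),
            mean(rates) if rates else None,
        )
    return result


sample_rows = [
    (1, 2.0, 6.0, 2.0, 8.0),
    (1, 4.0, 10.0, 1.0, 4.0),
    (0, 1.0, 1.0, 0.0, 1.0),   # dropped by the filter in step b)
]
print(averages_by_passenger_count(sample_rows))
```

Keeping records2 and records3 as separate intermediate values mirrors the named relations the problem asks for, which makes it easier to compare each stage with the corresponding Pig relation.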