In this assignment, you will use MRJob to find the words that are made up of the same letters, e.g. (act, cat).
The data is stored in data.txt (MRJob reads the file and sends its contents to the mapper). Note that not every word has a match.
Here is what you need to do:
- Convert each word to lower case
- Sort the letters of the lowercased word and use the result as the key; the word itself is the value
- In the reducer, gather the values that share the same key
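The steps above can be sketched in plain Python as follows. This is only an illustration of the mapper and reducer logic, not the required solution: in an actual MRJob script these two functions would become the `mapper(self, _, line)` and `reducer(self, key, values)` methods of an `MRJob` subclass, and MRJob would do the grouping itself. The helper `run_locally` is a hypothetical stand-in for that grouping step. Dropping repeated words and skipping groups with a single word are assumptions, based on the note that not all words have a match.

```python
from collections import defaultdict


def mapper(_, line):
    """Emit (sorted lowercase letters, original word) for one input line."""
    word = line.strip()
    if word:  # skip blank lines
        yield "".join(sorted(word.lower())), word


def reducer(key, values):
    """Emit the words that share a key, if there is more than one distinct word."""
    # Assumption: an exact repeat of a word does not count as a match.
    words = list(dict.fromkeys(values))  # drop duplicates, keep first-seen order
    if len(words) > 1:
        yield "Output", words


def run_locally(lines):
    """Hypothetical stand-in for the grouping MRJob does between mapper and reducer."""
    groups = defaultdict(list)
    for line in lines:
        for key, word in mapper(None, line):
            groups[key].append(word)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


if __name__ == "__main__":
    sample = ["act", "Tames", "dog", "teams", "cat", "tofu", "tofu"]
    for label, words in run_locally(sample):
        print(label, words)
```

Note that only the key is lowercased; the value keeps the word as it appears in the input, so a capitalized word such as Tames is grouped with teams but still printed with its capital letter.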
Input Text
act
takes
big
cause
Tames
expel
dog
dig
listen
vase
flow
race
stressed
cheater
meats
tofu
desserts
kitchen
silent
night
maple
teams
knee
heart
mates
baker
care
thicken
part
Earth
keen
wolf
break
study
save
God
builder
mining
thing
tofu
trap
sauce
read
dare
stake
cat
dusty
data
learning
teacher
rebuild
Expected Output
"Output"
["baker","break"]
["cheater","teacher"]
["race","care"]
["cause","sauce"]
["act","cat"]
["read","dare"]
["heart","Earth"]
["takes","stake"]
["Tames","meats","teams","mates"]
["vase","save"]
["part","trap"]
["builder","rebuild"]