Apache Spark Distributed Application, using PySpark in Google Colab.

Develop an Apache Spark application per the provided specifications, using PySpark in Google Colab.

Details

Use the following as a reference:

Create a new notebook in Google Colab
Download the dataset and upload it to the “Files” section of your Colab notebook (the upload may take a few minutes)
Read the Crunchbase Orgs dataset into a Spark DataFrame

Find all entities whose name starts with the letter “F” (e.g. Facebook):
- print the count and show() the resulting Spark DataFrame
Find all entities located in New York City:
- print the count and show() the resulting Spark DataFrame
Add a “Blog” column to the DataFrame with the row entries set to 1 if the “domain” field contains “blogspot.com”, and 0 otherwise.
- show() only the records with the “Blog” field marked as 1
Find all entities whose names are palindromes (the name reads the same forward and backward, e.g. madam):
- print the count and show() the resulting Spark DataFrame