The Dominant APIs of Spark: Datasets, DataFrames, and RDDs

When working with Spark, we often come across three APIs: DataFrames, Datasets, and RDDs. In this blog, I will discuss the three in terms of performance and optimization. Spark lets you convert seamlessly between DataFrames, Datasets, and RDDs. Under the hood, the RDD is the foundation on which both DataFrames and Datasets are built. The origins of the three are described here: http://ow.ly/tJLt50aXUNE
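To make the conversions between the three APIs concrete, here is a minimal sketch in Scala. It assumes Spark 2.x or later running in local mode; the `Person` case class, the sample rows, and the application name are hypothetical and only serve the illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object ApiConversions {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this demo.
    val spark = SparkSession.builder()
      .appName("api-conversions")
      .master("local[*]")
      .getOrCreate()

    // Brings toDF(), as[T], and the case-class encoders into scope.
    import spark.implicits._

    // Start from an RDD of plain Scala objects.
    val rdd: RDD[Person] =
      spark.sparkContext.parallelize(Seq(Person("Ann", 31), Person("Bob", 45)))

    val df: DataFrame = rdd.toDF()            // RDD -> DataFrame (untyped rows with a schema)
    val ds: Dataset[Person] = df.as[Person]   // DataFrame -> Dataset (typed view of the same data)
    val dfAgain: DataFrame = ds.toDF()        // Dataset -> DataFrame
    val typedRdd: RDD[Person] = ds.rdd        // Dataset -> RDD of Person
    val rowRdd: RDD[Row] = df.rdd             // DataFrame -> RDD of Row

    // The typed API allows ordinary Scala lambdas over Person fields.
    ds.filter(_.age > 40).show()

    spark.stop()
  }
}
```

Note that a DataFrame converts to an `RDD[Row]`, while a Dataset converts to an RDD of its element type, so the choice of starting point decides whether the type information survives the round trip.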
