Tag Archives: Spark

Video: Developing a Movie recommendation engine with Spark

How Apache Kafka is transforming Hadoop, Spark, Storm

Video: Apache Kafka with Spark Streaming: Real Time Analytics Redefined

Resource Allocation Configuration for Spark on YARN

via Resource Allocation Configuration for Spark on YARN | MapR.

In this blog post, I will explain the resource allocation configurations for Spark on YARN, describe the yarn-client and yarn-cluster modes, and include examples.

Using Spark to Create APIs in Scala

via Using Spark to Create APIs in Scala | Nordic APIs.

In our previous piece, we discussed the strengths of the Java language within the Spark framework, highlighting the ways Java Spark increases simplicity, encourages good design, and allows for ease of development.

In this piece we continue our coverage of Spark, a micro framework well suited to defining and dispatching routes to functions that handle requests made to your web API’s endpoints. We’re going to examine the counterpoint to Java Spark: Scala Spark. We’ll discuss the origin, methodologies, and applications of Scala, as well as some use cases where Scala Spark is highly effective.

Next-gen Data Analysis Framework for Telemetry

The easier it is to get answers, the more questions will be asked. In that spirit, Mark Reid and I have been working for a while now on a new analysis infrastructure to make it as easy as possible for engineers to get answers to data-related questions. Our shiny new analysis infrastructure is based […]


Synonyms fun with Spark Word2Vec

Spark MLlib implements the Skip-gram approach of Word2Vec. With Skip-gram we want to predict a window of words given a single word. This is part of the work I have done with PySpark in an IPython notebook. This outputs: And then to visualize it, with matplotlib and the WordCloud package. WordCloud is expecting a document to […]
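
To make the Skip-gram idea above concrete, here is a minimal pure-Python sketch of how (center word, context word) training pairs can be formed from a window around each word. It is not taken from the linked post and does not use Spark MLlib; the function name `skipgram_pairs`, the `window` parameter, and the sample sentence are assumptions chosen for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for each token and its neighbours.

    Hypothetical helper for illustration only: for every position i, each
    token within `window` positions of i (excluding i itself) is paired
    with the center token.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # left edge of the window
        hi = min(len(tokens), i + window + 1)   # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Sample sentence (an assumption, not data from the original post).
sentence = "spark mllib implements word2vec".split()
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A Skip-gram model is then trained to predict the context word from the center word for each such pair; a larger `window` yields more pairs per word.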