Spark can run on Hadoop 2's YARN cluster manager and can read any existing Hadoop data. It is designed to run programs faster by keeping more of the working data in memory rather than writing it to disk between steps. Spark's developers claim that it runs up to 100 times faster than Hadoop MapReduce in memory, or up to 10 times faster on disk.

Spark has its own machine learning library, MLlib, which can be called from Java, Scala or Python.
