What is Apache Spark 🌱

  • Apache Spark is a data processing framework that can quickly process very large data sets and distribute that work across multiple computers, either on its own or in tandem with other distributed computing tools.
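  • Spark splits data into partitions held in the RAM of a cluster of computers and processes those partitions in parallel. Each machine works on the partitions in its own memory; there is no single shared RAM, and Spark moves data between machines over the network when a computation needs it.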
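
The partition-then-process-in-parallel idea can be sketched in plain Python. This is only an illustration with the standard library, not Spark's API: the function names (`split_into_partitions`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical, and threads on one machine stand in for the separate computers of a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a list into strided slices, one per (hypothetical) worker."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def sum_of_squares(partition):
    """Work done independently on a single partition."""
    return sum(x * x for x in partition)


def parallel_sum_of_squares(data, num_partitions=4):
    """Process every partition in parallel, then combine the partial results."""
    partitions = split_into_partitions(data, num_partitions)
    # Assumption: threads stand in for cluster machines; a real cluster
    # would keep each partition in a different machine's memory.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_results = list(pool.map(sum_of_squares, partitions))
    return sum(partial_results)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1, 101))))
```

Each worker only ever sees its own partition, and only the small partial results are brought together at the end, which is roughly the pattern that lets this style of processing scale across machines.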
