Spark 101 for Scala Users

A quick hands-on intro to Spark for Scala users.

Here’s a link to the PDF of the slides I presented.

Running Zeppelin via a Docker container

docker run --name zeppelin \
  -p 8080:8080 -p 4040:4040 \
  -v $HOME/spark/data:/data \
  -v $HOME/spark/logs:/logs \
  -v $HOME/spark/notebook:/notebook \
  -e ZEPPELIN_NOTEBOOK_DIR='/notebook' \
  -e ZEPPELIN_LOG_DIR='/logs' \
  -e ZEPPELIN_INTP_JAVA_OPTS="-Dspark.driver.memory=4G" \
  -e ZEPPELIN_INTP_MEM="-Xmx4g" \
  -d apache/zeppelin:0.9.0 /zeppelin/bin/

Running Spark via a Docker container

docker run --name spark -v $HOME/spark/data:/data -p 4040:4040 -it mesosphere/spark bin/spark-shell

For a basic Spark SBT project


// build.sbt
import Dependencies._

ThisBuild / scalaVersion     := "2.12.11"
ThisBuild / version          := "0.1.0-SNAPSHOT"
ThisBuild / organization     := "com.example"
ThisBuild / organizationName := "Meetup Spark Example"
ThisBuild / scalacOptions ++= Seq("-language:higherKinds")

lazy val root = (project in file("."))
  .settings(
    name := "SparkCatScratch",
    libraryDependencies ++= Seq(scalaTest % Test, sparkCore, sparkSQL, catsCore, catsFree, catsMTL)
  )

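// Runs when the SBT console starts: import Cats and create a local SparkSession
initialCommands in console :=
  """
    |import cats._, cats.implicits._, org.apache.spark.sql.SparkSession
    |val spark = SparkSession.builder().master("local").getOrCreate()
    |""".stripMargin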

cleanupCommands in console := "spark.close"


// project/Dependencies.scala
import sbt._

object Dependencies {

  val sparkVersion = "2.4.5"
  val catsVersion = "2.0.0"

  lazy val scalaTest = "org.scalatest" %% "scalatest" % "3.0.8"
  lazy val sparkCore = "org.apache.spark" %% "spark-core" % sparkVersion
  lazy val sparkSQL = "org.apache.spark" %% "spark-sql" % sparkVersion
  lazy val catsCore = "org.typelevel" %% "cats-core" % catsVersion
  lazy val catsFree = "org.typelevel" %% "cats-free" % catsVersion
  lazy val catsMTL = "org.typelevel" %% "cats-mtl-core" % "0.7.0"
}

Starting Spark in the SBT console:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").getOrCreate()
val sc = spark.sparkContext

My “Instant Pot” Experience

Months ago, my husband and I were watching TV (I think it might have been CBS Sunday Morning) and there was a short segment about a new “must-have” kitchen item: a 6-in-1 gadget that replaces your rice cooker, slow cooker, pressure cooker and all those sorts of things. I told Bill that this gadget would be a great candidate for my Christmas list. Long story short: the device was one of the presents under the Christmas tree, and I figured I would share my experiences and learnings so far.

Funny Bug of the Day (Java)

This took a little while to figure out!

Date startDate = new Date();
Date endDate = new Date(startDate.getTime() + (24 * 3600000 * 42));

This was expected to give a startDate of right now (Feb 20, 2013 5:17:10 PM) and an endDate six weeks later (Apr 3, 2013 6:17:10 PM), but instead the end date was computed to be earlier than the start date: Feb 13, 2013, in fact!
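The full post is not reproduced here, so the following is only a sketch of one likely explanation: the product 24 * 3600000 * 42 is evaluated in 32-bit int arithmetic and wraps around before it is added to the long returned by getTime(). The class and method names (DateOverflowDemo, intOffsetMillis, longOffsetMillis, timeUnitOffsetMillis) are hypothetical and exist only for illustration.

```java
import java.util.Date;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the original post.
class DateOverflowDemo {

    // Same expression as in the snippet above. All operands are ints, so the
    // true product (3,628,800,000) exceeds Integer.MAX_VALUE (2,147,483,647)
    // and wraps around to a negative number.
    static int intOffsetMillis() {
        return 24 * 3600000 * 42;
    }

    // Making the first operand a long forces the whole product into
    // 64-bit arithmetic, so nothing wraps.
    static long longOffsetMillis() {
        return 24L * 3600000 * 42;
    }

    // Alternative that avoids hand-written millisecond arithmetic entirely.
    static long timeUnitOffsetMillis() {
        return TimeUnit.DAYS.toMillis(42);
    }

    public static void main(String[] args) {
        Date startDate = new Date();

        // The wrapped offset is negative (roughly 7.7 days), so the "end"
        // date lands about a week before the start date.
        Date wrongEnd = new Date(startDate.getTime() + intOffsetMillis());
        Date rightEnd = new Date(startDate.getTime() + longOffsetMillis());

        System.out.println("int offset (ms):  " + intOffsetMillis());
        System.out.println("long offset (ms): " + longOffsetMillis());
        System.out.println("start:     " + startDate);
        System.out.println("wrong end: " + wrongEnd);
        System.out.println("right end: " + rightEnd);
    }
}
```

If this is indeed the cause, the wrapped offset of about minus 7.7 days would account for an end date of Feb 13 when starting from the evening of Feb 20. Changing a single literal to 24L would be enough to fix the original line.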