Introduction to Cascalog

1 What’s Cascalog?

\begin{quote} “Cascalog is a fully-featured data processing and querying library for Clojure. The main use cases for Cascalog are processing “Big Data” on top of Hadoop or doing analysis on your local computer from the Clojure REPL. Cascalog is a replacement for tools like Pig, Hive, and Cascading.

Cascalog operates at a significantly higher level of abstraction than a tool like SQL. More importantly, its tight integration with Clojure gives you the power to use abstraction and composition techniques with your data processing code just like you would with any other code. It’s this latter point that sets Cascalog far above any other tool in terms of expressive power.” \end{quote}

2 Features

a data processing and querying library for Clojure
runs on Hadoop, but also locally from the Clojure REPL
based on Cascading
tight integration with Clojure

3 Programming Cascalog

DEMO TIME!
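
As a starting point for the demo, here is a minimal sketch of a Cascalog query that can be evaluated at the REPL. It assumes Cascalog is on the classpath. The `people` dataset and the `young-people` function are hypothetical names made up for this illustration and do not come from the original slides. The sketch relies on two things Cascalog provides in `cascalog.api`: an in-memory sequence of tuples can serve as a generator, and the `??<-` operator runs a query and returns its results as a Clojure sequence.

```clojure
(use 'cascalog.api)

;; Hypothetical in-memory dataset of [name age] tuples.
;; Cascalog accepts a plain Clojure sequence of tuples as a generator.
(def people
  [["alice" 28]
   ["bob"   33]
   ["chris" 40]
   ["david" 25]])

;; Hypothetical helper: define and run a query in one step.
;; [?name]                declares the output variables,
;; (people ?name ?age)    binds each tuple's fields to logic variables,
;; (< ?age 30)            filters with an ordinary Clojure function.
(defn young-people []
  (??<- [?name]
        (people ?name ?age)
        (< ?age 30)))

;; Evaluate at the REPL; the order of the result tuples is not guaranteed.
(young-people)
```

The filter is the plain Clojure `<` function, which is the kind of tight integration with Clojure listed under Features. To print the results instead of returning them, the same query can be written with `?<-` and a `(stdout)` sink.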

4 Thanks!

Questions?

5 Resources

https://github.com/nathanmarz/cascalog
Nathan Marz: Introducing Cascalog - A Clojure-based Query Language for Hadoop
Nathan Marz: New Cascalog Features
Backtype: Why Yieldbot chose Cascalog over Pig
Factual: Clojure on Hadoop - A New Hope
Stuart Sierra: Functional Relational Programming with Cascalog
LinkedIn Tech Talk: Clojure at Backtype (Video)
Nathan Marz: Cascalog - Making Data Processing Fun Again (Video, Clojure/conj 2011)
