hadoop-streaming

Hadoop Streaming with Python, for DC Python

Using Python with Hadoop Streaming

## Prerequisites

  • Java
  • Hadoop

## Setup

  1. Run ./get_data.sh to download and unpack the DC Payment Card transaction data
  2. Set HADOOP_HOME in ./mr_dc_payment_cards.sh to the path of your local Hadoop installation
  3. Run ./mr_dc_payment_cards.sh
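
For readers new to Hadoop Streaming, here is a minimal sketch of the kind of mapper and reducer a job like this might use. It is not the code shipped in this repository: the file name `streaming_sketch.py`, the function names, and the column layout (a grouping key in column 0 and a dollar amount in column 1) are all assumptions made for illustration. Adjust `key_col` and `amount_col` to match the real transaction file.

```python
#!/usr/bin/env python3
"""Illustrative Hadoop Streaming mapper/reducer (hypothetical, not the repo's code)."""
import csv
import sys
from itertools import groupby


def map_lines(lines, key_col=0, amount_col=1):
    """Yield (key, amount) pairs from CSV lines.

    The column positions are assumptions; rows that do not parse
    (headers, blank lines, malformed records) are skipped.
    """
    for row in csv.reader(lines):
        try:
            amount = float(row[amount_col].replace("$", "").replace(",", ""))
            yield row[key_col].strip(), amount
        except (IndexError, ValueError):
            continue


def reduce_pairs(lines):
    """Sum amounts per key from tab-separated "key<TAB>value" lines.

    Relies on the input being sorted by key, which Hadoop Streaming
    guarantees for reducer input.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(float(value) for _, value in group)


def main(argv):
    """Run as a mapper or reducer depending on the first argument."""
    role = argv[1] if len(argv) > 1 else ""
    if role == "map":
        results = map_lines(sys.stdin)
    elif role == "reduce":
        results = reduce_pairs(sys.stdin)
    else:
        print("usage: streaming_sketch.py map|reduce", file=sys.stderr)
        return 1
    for key, value in results:
        print(f"{key}\t{value:.2f}")
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Because Hadoop Streaming only pipes lines through stdin and stdout, a sketch like this can be tried locally without a cluster, for example `cat data.csv | python3 streaming_sketch.py map | sort | python3 streaming_sketch.py reduce`, where `data.csv` stands in for whatever file `./get_data.sh` unpacks. Keeping both roles in one file is a convenience for the sketch; separate mapper and reducer scripts work equally well.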