kennyzli/io.tusk

# DataStream: powering big data with functional programming

Big data is always awesome, but writing Hadoop jobs with mappers, reducers, and combiners is terrible. Why don't we make our distributed Hadoop life easier and give the power back to Java developers? This project is inspired by Twitter's Scalding: https://github.com/twitter/scalding

Scala is a nice language to work with, but it is not widely adopted by the Java development community.

DataStream's purpose is to take advantage of the functional programming features in Java 8 and make your big data life easier.

DataStream is based on the Cascading framework: http://www.cascading.org/

For more details, send me an email at kenny.zlee@gmail.com. This project is still at an early stage, and I need to work with a team to make it perfect. The interface is likely to change. I hope to have a first minor release in a few months.

Here is an example that runs DataStream locally. :)

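    // Build a stream from a local CSV file (`builder` is assumed to have been created beforehand).
    DataStream<StreamData> stream = builder.source(new URI("data/input/sample.csv")).build();
    // Apply the lambda to the "county" field, naming the result "OtherCountry",
    // then write the output with "," as the delimiter.
    stream.mapTo("county", "OtherCountry", x -> x + ":newData")
          .writeTo(new URI("data/output/mapped.dat"), ",");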

Believe me, you might need to write over 1k lines of code to do exactly the same thing with MapReduce, and a couple of hundred lines of code in Cascading as well.
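To make the intent of the `mapTo` call above more concrete, here is a minimal in-memory sketch of the same kind of field mapping using only the standard Java 8 streams API. It is not DataStream's implementation and does not touch Hadoop or Cascading. The class `MapFieldSketch`, the helpers `mapField` and `toLines`, and the sample value `Kent` are hypothetical names introduced for illustration. The sketch also assumes that the mapped value is added as a new field next to the original one; whether DataStream keeps or replaces the source field is not stated here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of the DataStream API.
class MapFieldSketch {

    // Copies each row and adds a `target` field holding fn(row[source]).
    // Assumption: the source field is kept alongside the new one.
    static List<Map<String, String>> mapField(List<Map<String, String>> rows,
                                              String source,
                                              String target,
                                              Function<String, String> fn) {
        return rows.stream().map(row -> {
            Map<String, String> out = new LinkedHashMap<>(row);
            out.put(target, fn.apply(row.get(source)));
            return out;
        }).collect(Collectors.toList());
    }

    // Joins the field values of each row, in insertion order, with the delimiter.
    static List<String> toLines(List<Map<String, String>> rows, String delimiter) {
        return rows.stream()
                   .map(row -> String.join(delimiter, row.values()))
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("county", "Kent");

        List<Map<String, String>> mapped =
            mapField(Arrays.asList(row), "county", "OtherCountry", x -> x + ":newData");

        // Prints: Kent,Kent:newData
        toLines(mapped, ",").forEach(System.out::println);
    }
}
```

The lambda `x -> x + ":newData"` is the same one passed to `mapTo`; the difference is that this sketch runs on a small in-memory list, while DataStream is meant to run the mapping as a Cascading flow.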

## About me: Kenny Li (Zhenqi Li)

[About me](https://www.linkedin.com/profile/view?id=48862722&trk=hp-identity-photo)
