spark-rdd

Here are 24 public repositories matching this topic...

mahmoudparsian / pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark

big-data spark pyspark spark-dataframes big-data-analytics data-algorithms spark-rdd

Updated Jan 20, 2023
Jupyter Notebook

mahmoudparsian / big-data-mapreduce-course

Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University

Updated Dec 3, 2024
HTML

Thomas-George-T / Movies-Analytics-in-Spark-and-Scala

Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.

scala movies big-data spark hadoop analytics movielens-data-analysis shell-script dataframes movielens-dataset rdd case-study spark-sql spark-programs spark-dataframes big-data-analytics spark-scala big-data-projects spark-rdd

Updated May 19, 2021
Scala

yennanliu / spark-etl-pipeline

Various stream and batch data processing demos with Apache Spark in Scala 🚀

docker dockerfile scala twitter spark apache-spark sbt pipeline stream-processing sbt-plugin spark-streaming sbt-assembly spark-sql spark-dataframes spark-batch spark-rdd

Updated Feb 28, 2020
Scala

nipunmanral / Community-Detection-In-Graphs

Implementation of the Girvan-Newman algorithm to detect communities in graphs using the Yelp dataset

data-mining community-detection map-reduce betweenness breadth-first-search social-graph yelp-dataset girvan-newman spark-rdd

Updated Jul 16, 2019
Python

Ren294 / Log-Analysis-Project

This project builds a scalable log analytics pipeline using the Lambda architecture for real-time and batch processing of NASA server logs.

data-science big-data cassandra apache-spark hive hadoop grafana data-engineering spark-streaming apache-kafka apache-nifi powerbi spark-sql big-data-analytics hadoop-hdfs cassandra-driver spark-rdd

Updated Sep 16, 2024
Python

MaxineXiong / Item-based-collaborative-filtering

This project uses PySpark DataFrames and PySpark RDDs to implement item-based collaborative filtering. By calculating cosine similarity scores or identifying the movies with the highest number of shared viewers, the system recommends 10 movies similar to a given target movie that align with users' preferences.

python spark apache-spark collaborative-filtering pyspark movie-recommendation spark-dataframes spark-rdd

Updated Jun 29, 2024
Jupyter Notebook
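
The item-based approach described for this repository can be illustrated with a minimal sketch. This is not code from the repository: it uses plain Python rather than PySpark so that it runs without a cluster, and the function names `item_similarities` and `top_similar` and the `(user, movie, rating)` input shape are assumptions made for illustration. Cosine similarity is computed here only over users who rated both movies, which is one common variant.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity for every movie pair that shares at least one viewer.

    ratings: iterable of (user, movie, rating) tuples (hypothetical input shape).
    Returns {(movie_a, movie_b): score} with movie_a < movie_b.
    """
    by_user = defaultdict(dict)
    for user, movie, rating in ratings:
        by_user[user][movie] = rating

    dot = defaultdict(float)
    sq_a = defaultdict(float)
    sq_b = defaultdict(float)
    for movies in by_user.values():
        # Each user contributes one term per pair of movies they rated.
        for a, b in combinations(sorted(movies), 2):
            dot[(a, b)] += movies[a] * movies[b]
            sq_a[(a, b)] += movies[a] ** 2
            sq_b[(a, b)] += movies[b] ** 2

    scores = {}
    for pair, numerator in dot.items():
        denominator = sqrt(sq_a[pair]) * sqrt(sq_b[pair])
        scores[pair] = numerator / denominator if denominator else 0.0
    return scores


def top_similar(scores, target, n=10):
    """Return up to n (movie, score) pairs most similar to target."""
    related = []
    for (a, b), score in scores.items():
        if a == target:
            related.append((b, score))
        elif b == target:
            related.append((a, score))
    return sorted(related, key=lambda item: item[1], reverse=True)[:n]
```

In a Spark version, the per-user pairing step would typically be a self-join on the user key followed by a grouped aggregation; the arithmetic stays the same.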

MaxineXiong / Degrees-of-Separation-with-Breadth-first-Search

This project uses PySpark RDDs and the breadth-first search (BFS) algorithm to find the shortest path and degrees of separation between two given Marvel superheroes based on their appearances together in the same comic books, letting users discover connections between their favourite superheroes in the Marvel universe.

python spark apache-spark pyspark breadth-first-search bfs-algorithm marvel-characters spark-rdd degrees-of-separation

Updated Jun 29, 2024
Jupyter Notebook
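
The breadth-first search idea behind this repository can be sketched in a few lines. This is not the repository's code: it is plain Python on an in-memory adjacency dictionary rather than an RDD, and the function name `degrees_of_separation` and the graph layout are assumptions made for illustration.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Return the number of hops on a shortest path from start to target.

    graph: dict mapping a node to an iterable of its neighbours
           (hypothetical layout: hero -> heroes sharing a comic book).
    Returns None when target cannot be reached from start.
    """
    if start == target:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return distance + 1
            if neighbour not in visited:
                # Mark on enqueue so each node is expanded at most once.
                visited.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None
```

A distributed version usually replaces the queue with repeated passes over the whole graph, expanding one frontier level per pass, because an RDD cannot be mutated in place.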

adityajn105 / Apache-Spark-Tutorials

Apache Spark is a big data analysis framework.

spark bigdata pyspark spark-ml spark-rdd spark-tutorials

Updated Apr 11, 2019
Jupyter Notebook

ricardoariasalazar / Flights-Delay

In this project, we visualize, manipulate, model, and stream historical flight-delay data using Spark RDD, Spark SQL, and Kafka.

pyspark kafka-streams spark-sql big-data-analytics spark-rdd

Updated Jan 5, 2022
Jupyter Notebook

manojpawar94 / Spark-Scala-Examples

I have implemented sample programs using Apache Spark. The programs are built on the concepts of Spark RDD and Spark SQL DataFrames.

spark apache-spark spark-sql spark-rdd

Updated Aug 31, 2021
Scala

nikhilkumawat03 / Extracting-Relevant-Document

Contains projects based on Big Data

hadoop java-8 mapreduce spark-sql spark-rdd

Updated Feb 15, 2020
Java

mohammad-safari / spark-hadoop-exercise

Spark and Hadoop exercise for a cloud computing course - AUT 1402-1403 fall

big-data spark hadoop hdfs mapreduce spark-sql spark-dataframes hadoop-yarn spark-rdd

Updated Feb 1, 2024
Jupyter Notebook

ShreeshaN / SparkBigDataTutorials

Demonstration of basic data transformations using Spark RDD and Spark DataFrame in Scala

spark spark-sql spark-scala spark-rdd scala-sbt spark-sql-udf

Updated Nov 18, 2022
Scala

firedent / Data-curation-and-indexing-with-ElasticSearch

This program processes legal reports via Stanford CoreNLP and indexes them in ElasticSearch

elasticsearch json scala xml spark-rdd

Updated Dec 4, 2019
Scala

vaibhav50596 / DeerfootTrailAnalysis

The goal is to train a linear regression model to predict Deerfoot commute times given weather and accident conditions using Spark RDD and MLlib

spark spark-mllib spark-rdd

Updated Apr 12, 2020
Jupyter Notebook

demanejar / spark-rdd

Spark RDD basics

spark project spark-rdd

Updated Jul 15, 2021
Java

on2e / ntua-atdb

Advanced Topics in Databases course project - NTUA ECE - 2022-23

apache-spark pyspark spark-dataframes advanced-database apache-hadoop ntua-ece spark-rdd

Updated Mar 30, 2023
Python

RiccardoRevalor / Spark

Spark exercises

spark pyspark spark-sql spark-rdd

Updated Nov 22, 2024
Jupyter Notebook

contactsunny / spring-spark-s3-file-read

A POC written in Java using the Spring framework, which uses Apache Spark to read a file from Amazon S3 FS and counts the number of lines in the file.

java spark apache-spark spring spring-boot poc spark-rdd spark-s3 thetechcheck rdd-s3 spark-rdd-s3

Updated May 30, 2018
Java
