<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>CMPT 733: Big Data Science (Spring 2019) </title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="">
<!-- Latest compiled and minified CSS -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<style>
body {
padding-top: 20px;
}
.constainer {
margin-top: 20px;
}
.top-buffer { margin-top:40px; }
a {
color: #00BFFF;
}
a:visited {
color: #00BFFF;
}
mark {
background: #FF9;
}
b {
font-weight: 700;
}
</style>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-112163654-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-112163654-1');
</script>
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<h2 id="cmpt843"><a href = "https://sfu-db.github.io/bigdata-cmpt733" target="_blank">CMPT 733: Big Data Science (Spring 2019)</a></h2>
<h3 id="project-showcase"><b>Project Showcase</b></h3>
<hr>
<div class="container">
<div class="row">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/VLd-LnQkUPY" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Predicting Stable Portfolios Using Machine Learning</strong> [<a href = "https://github.com/mrafayaleem/equity-portfolio-prediction" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/predicting-stable-portfolios-using-machine-learning-f2e27d6dbbec" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/KiranA-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Kiran, Nandita Dwivedi and Muhammad Rafay Aleem</i></small>
<p class="text-muted"> <small>We aim to make the process of portfolio management better and simpler by using predictive modeling and deep learning techniques. We generate stable portfolios based on predicted stock prices for the next quarter.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/FiyLI8CX0Sk" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Measuring observable influence and impact of scientific research beyond academia.</strong> [<a href = "https://github.com/hhwangS27/cmpt733_proj" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/measuring-observable-influence-and-impact-of-scientific-research-beyond-academia-c00372c96a76" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/VermaKW-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Chhavi Verma, Shray Khanna, Honghui Wang </i></small>
<p class="text-muted"> <small>In this project, we observed the real-world impact of Genome BC’s academic publications by tracing references to them in downstream documents. We present the insights as python-igraph network graphs and Plotly bar graphs, which help us understand the connections between Genome BC publications and the downstream documents. The more impactful a publication is, the more connections it has; and the greater the depth of those connections, the more influential the publication. We also identified the top influencers of Genome BC publications at different depths.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/fTjW-ydDby0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Real-time Cryptocurrency Prediction and Analysis Platform</strong> [<a href = "https://github.com/ycxmichael/CMPT733_CryptoCurrency_Project" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/cryptomania-f7069ce9f374" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/LiWYZ-report.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Chengxi Li, Haopeng Wang, Michael Yang, Hao Zheng</i></small>
<p class="text-muted"> <small>In this project, we build a one-stop web application for cryptocurrency enthusiasts. By fetching data through APIs, our web application provides continuously updated, comprehensive information about cryptocurrencies. By integrating cryptocurrency price data with news sentiment analysis and social status information, we build a deep learning model that predicts coin prices or binary returns to help investors with their decisions. By further integrating the model with our real-time pipeline, our web application provides a dynamic prediction curve every hour, minute or even second.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/EsmHvD_fj6g" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Developing an NLP based PR platform for the Canadian Elections</strong> [<a href = "https://github.com/theIps/Fall-Detection-Using-Wearable-Sensor-Data" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/developing-a-nlp-based-pr-platform-for-the-canadian-elections-d63ebed6b2f3" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/SunnakRT-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Abhishek Sunnak, Sri Gayatri Rachakonda, Oluwaseyi Talabi</i></small>
<p class="text-muted"> <small>In this project, we developed an NLP-based application that analyzes the sentiment and bias of news articles and tweets related to the Canadian 2019 elections to understand public opinion of each candidate. We also analyzed the approval ratings of the top 3 candidates across different provinces. We used the latest NLP techniques to train deep learning models for sentiment and bias analysis to classify news and tweets about the election. Using these results, we developed an interactive dashboard that gives a PR manager a visual platform for gaining insights into the public’s perception and the media coverage of a candidate. This project can be further extended to any public relations team for their candidates.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/oa-dE6WKqjY" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Music Analysis &amp; Recommendation System (M.A.R.S)</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/kmasrani/MARS" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/m-a-r-s-music-analysis-recommendation-system-f30424c2c362" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/KohliM-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i> Kashish Kohli, Kanksha Masrani</i></small>
<p class="text-muted"> <small>We aim to draw inferences from our data that would help companies and startups fare better in the music market. In this regard, we extracted several findings and useful results by mining insights from songs of the last 5 decades. First, these insights are used to build a Popularity Predictor module, which predicts from a song’s metadata alone whether it will be famous; its accuracy is 70%. Second, we created a Sentiment Analyzer, which reports the sentiment of the music listened to in major countries. This can help artists shape their music to create songs that resonate with a wider audience. Third, we created a Recommendation Engine, which suggests other songs suited to a user’s musical taste. It also suggests the top songs likely to be on the charts at that time, similar to what Netflix and Hulu do.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/ouCQtGzxTME" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>CRYPTOIntel - Digging Deep Into The Crypto World</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/tkapoor/cryptointel" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/cryptointel-353beb13756b?sk=db8c092e157055b47d5665cd52c3784b" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/KapoorPI-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Tushar Chand Kapoor, Syed Ikram, Mehak Parashar</i></small>
<p class="text-muted"> <small>CRYPTOIntel is a one-stop dashboard that brings together information about cryptocurrencies. Inquisitive users can find answers to their cryptocurrency questions on CRYPTOIntel.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/4DaB2CKG-dk" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Distributed News Monitor System</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/dxiang/news-monitor-system/tree/master" target="_blank"/>Code</a>, <a href = "https://medium.com/sfu-big-data/distributed-news-monitor-system-5921c10fa432" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/XiangHZX-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Dao Xiang, Yi Xiao, Hang Hu, Shi Heng Zhang</i></small>
<p class="text-muted"> <small>Now that social media has become the most cost-efficient way for people to communicate, it is intriguing to analyze people’s reactions to a popular news post while filtering out false information online. We therefore designed the Distributed News Monitor System, which focuses on news content to alert the public to fake news and analyzes public opinion from Twitter comments on the news. A deep learning model is deployed to assess the integrity of a news item from its content. Big data streaming analysis extends the system to real-time news monitoring, guiding people to think more deeply about the news and make their own judgement. The system combines advanced modelling, real-time analytics, and scalability.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/b5wkez0ENVk" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Herald: Know the Stock Movement Before It Happens</strong> [<a href = "https://github.com/ChangshengYan/StockPredict" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/MaZYC-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/MaZYC-report.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Andong Ma, Angel Zhang, Changsheng Yan, and Denise Chen</i></small>
<p class="text-muted"> <small>In this project, we build a data science pipeline for stock movement prediction and a real-time prediction web platform. Specifically, we perform topic modeling on news articles to discover the general topics discussed in the news, and we visualize the frequent words of positive and negative news with t-SNE to observe their similarity. In addition, we apply NLP methods to generate word embeddings and build deep learning models on the crawled news and stock data to predict future stock prices and price trends. Our CNN+RNN model performs best, reaching an accuracy of 58% and outperforming the baseline models (52% - 57% accuracy). One of our most important final products is a web application with two main goals: acquiring the latest news, Twitter, and stock data from different sources, and processing it in real time to predict future stock prices. The application is designed for investors, who can read insightful news and tweets associated with each stock ticker and use the predicted stock price as a reference for their investment decisions.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/beyPiOhJXwg" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>End-to-End Solution For IoT devices Predictive Maintenance and Management</strong> [<a href = "https://github.com/harrisonxia/Predictive-Maintenance" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/XiaWL-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/XiaWL-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Chuangxin Xia, Risheng Wang, Yifan Li</i></small>
<p class="text-muted"> <small>In this project, we aim to deliver an end-to-end solution for IoT device predictive maintenance and management. We performed ETL and EDA using PySpark, and incorporated feature selection and anomaly detection on top of a prediction model built as an LSTM RNN. These layers help keep our predictions precise. Our dashboard console communicates with a live Node.js server and a live model served on Google Cloud Machine Learning Engine, while providing an interactive user experience and easy-to-interpret data visualization. The whole pipeline was built with flexibility and scalability in mind.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/7khgvYT3dO8" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>TradeSpade - Price Signal Forecast for Financial Assets</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/cmpt-733-tradespade/tradespade" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/BejjuSRP-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/BejjuSRP-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Anurag Bejju, Rishabh Singh, Nikitha Ravi, Manan Parasher</i></small>
<p class="text-muted"> <small>TradeSpade is a one-stop solution that assists day traders with intraday trading by predicting Buy and Sell signals, helping them maximize profits and make optimized decisions. It targets both traditional and exploratory stock and cryptocurrency traders with a robust web application that helps them make data-driven decisions. It actively supports novice traders by providing intuitive financial predictions based on historical and contextual information collected over the past year. As part of this project, we have also depicted the influence of social media and everyday news on market fluctuations.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/rZVwo9eNmG4" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Duplicate Questions across multiple Question-Answering Forums</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/vma19/finding-duplicate-questions-across-multiple-question-answering-forums.git" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/ZolaktafM-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/ZolaktafM-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Neda Zolaktaf and Vaishnavi Malhotra</i></small>
<p class="text-muted"> <small>In this project, we worked on six individual datasets (Quora, Apple, Android, Sprint, Superuser, and AskUbuntu) and one integrated dataset (Quora and AskUbuntu) to tackle the problem of duplicate question detection across multiple question-answering platforms.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/H1bPJaeIvJI" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>FootWizard - Predict The Unpredictable</strong> [<a href = "https://github.com/sagarparikh2013/FootWizard" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/ParikhSS-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/ParikhSS-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Sagar Parikh, Chirayu Shah, Abhi Savaliya</i></small>
<p class="text-muted"> <small>We aimed to predict the outcomes of EPL matches from past records: winning streaks, head-to-head results and overall ratings. We implemented the models using machine learning techniques and found that SVM gave the best accuracy (61%), ahead of the other 4 techniques. So far, we have predicted football match outcomes. Betting is the next challenge, as it involves predicting matches with higher accuracy and predicting the dynamic odds in real time. We plan to recommend the best platform to maximize betting profits.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/2mCF2DcIeGI" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Skills Job Advisor</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/skillful/skills-job-advisor" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/ChopraTKU-poster.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/ChopraTKU-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Bhuvan Chopra, Btara Truhandarien, Grace Kingsly, Mohammad Ullah</i></small>
<p class="text-muted"> <small>In this project, we demonstrate the techniques we used to tackle a challenging problem: advising people on the skills they should cultivate to become more suitable for a particular job. Our data reflects real-world scenarios and people: it comes from resumes of people who have worked in one of 15 different jobs. With information retrieval techniques such as TF-IDF, we build a corpus of related skills for a particular job. Attempts at job normalization, required by our initial modeling plan, yielded unsatisfactory results and ultimately led us to a different kind of model. The final model is a K-Means clustering model whose data points are document and word embeddings of job titles and skills. Though rudimentary, this model lets us give basic advice on the skills that need to be cultivated, by comparing a given skill set with the skills within a particular job cluster.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/sHkFyeWxPU4" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Metro Vancouver Housing Market Analysis and Prediction</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/nmisra/eigentum/" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/KrishnaH-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/KrishnaH-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Krishna Chaitanya Gopaluni, Nitin Misra, Harish Bhargav Dasika, Manjur Rahaman</i></small>
<p class="text-muted"> <small>Vancouver is always in a bubble. Potential buyers take the current rising price trend for granted when they invest, but prices may suddenly fall, and it can then take a very long time to see a return on investment. With this in mind, we set the following goals: (1) identifying bubble-prone areas in Metro Vancouver; (2) predicting housing prices based on current trends; (3) predicting future trends in HPI benchmark prices; (4) predicting a property’s price range from a property image. In the end, the project can warn potential buyers about bubble-prone areas and help them make informed decisions based on predicted future trends.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/zVFIBfbuDhM" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Real-time Cryptocurrency Analysis (financial-analysis)</strong> [<a href = "https://github.com/MohammadMazraeh/realtime-crypto-analysis" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/RenaniCM-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/RenaniCM-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Fatemeh Renani, Jaskaran Kaur Cheema, Mohammad Mazraeh</i></small>
<p class="text-muted"> <small>Stock price forecasting is a popular and important topic in financial and academic studies, and the cryptocurrency market is no exception. In this project, we created a general, scalable platform for real-time cryptocurrency price prediction. The platform receives news and price history as input and performs feature extraction, feature aggregation, and price movement prediction. Finally, the platform outputs the predicted Bitcoin price movement for the next minute. At each stage of the pipeline, data is read from one Kafka stream and the new data is written to another. Hence, other cryptocurrencies can easily be integrated using this architecture to produce a real-time, robust and accurate cryptocurrency price prediction project!</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/vPjfGbmhX4A" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Internet Media Influence</strong> [<a href = "https://github.com/lmbkv/imi_733" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/DalawatAB-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/DalawatAB-report.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Aroun Amitabh Dalawat, Aisuluu Alymbekova, Shreejata Bhattacharjee</i></small>
<p class="text-muted"> <small>Internet media platforms have evolved from low-quality entertainment content into global media and tech companies whose articles go viral and strongly influence people’s opinions all around the world. Every company needs efficient marketing to thrive, grow and communicate effectively with its potential customers. With the rapid growth of the e-commerce segment, the influence of internet media platforms can be leveraged as a strong marketing tool to promote goods. Hence, platforms such as BuzzFeed, BestProducts.com, etc. can be used for digital advertising in e-commerce. These are the websites people turn to for information, opinions and even suggestions about the kinds of products they want to buy or should buy. The project focuses on two things in particular: first, identifying and evaluating the impact of internet media platforms on e-commerce; second, developing a tool that automates the creation of articles for internet media platforms. For example, from the point of view of a BuzzFeed employee, the tool reduces the time and labor spent manually searching for potential products to feature in articles and writing a description for each of them. The practical application of this project is therefore in the digital marketing sphere.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/V635gdcw1h0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Intelligent Travel Recommendation System</strong> [<a href = "https://github.com/sthandap199/ITRS" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/VenkateswaranST-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/VenkateswaranST-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Savitaa Venkateswaran, Subikshaa Senthilkumar, Sachin Prabhu Thandapani</i></small>
<p class="text-muted"> <small>Our project provides a tailor-made travel plan for users based on their travel details (destination, budget, and start and end dates) and their preferences for attraction categories, hotel amenities and cuisine type. It significantly reduces the time spent planning a satisfactory vacation.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/tFp5J7_7bPY" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>News Sentiment Tracker: A Targeted Opinion Mining Interface</strong> [<a href = "https://github.com/andrewjwesson/news-sentiment-tracker" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/WessonR-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/WessonR-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Andrew Wesson, Prashanth Rao</i></small>
<p class="text-muted"> <small>In this project, we developed an end-to-end NLP-based application that automatically detects fine-grained sentiment towards a specific target query (such as a person, event, product or organization) in news articles. We applied novel combinations of techniques from big data, NLP and time series visualization to provide the end user targeted insights into press coverage on a specific entity. Our system was shown to identify large-scale shifts in sentiment in news coverage towards a target reliably, and we foresee numerous commercial applications that could benefit from this approach and help guide the relevant personnel in making data-driven decisions.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/ucRwm4dBVGs" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>VANREAPER - Vancouver Real Estate Analysis and Predictions</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/quad_squad/housing_market_analysis" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/AhujaHKS-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/AhujaHKS-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Chirag Ahuja, Ekramul Hoque, Pavan Kosaraju, Rohith Sooram</i></small>
<p class="text-muted"> <small>VANREAPER is an online tool that aims to improve how people in Vancouver buy and sell homes by giving them the information they need to decide before making a purchase. In this project, we scraped data from multiple sources: property tax data from the BC Assessment Authority, property listings from REW, school ratings from the Fraser Institute and historical interest rates from the Bank of Canada. After collecting the data, we applied time series models (ARIMA, LSTM), regression models (GBTR, Linear), and recommendation models (KNN). Finally, all the statistical models were serialized, persisted and deployed in an interactive Flask web application.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/8l-cNF2IGjU" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Predicting Stock Prices using Social Media</strong> [<a href = "https://github.com/gauravprachchhak/Stock-Brokers" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/GajjarPBD-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/GajjarPBD-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Mihir Gajjar, Gaurav Prachchhak, Tommy Betz, Veekesh Dhununjoy</i></small>
<p class="text-muted"> <small>We predict the future closing stock price using historical stock data in combination with the sentiment of news articles and Twitter data. We collected the historical stock price, Twitter and news data by web scraping and from various data sources. In the preprocessing stage, we filter out unwanted records and carry out aggregations to extract useful features. We perform sentiment analysis with TextBlob on the news and Twitter data to generate weighted average sentiments, and use various statistical techniques to generate new features from the stock data. We combine all the features on the ‘Date’ field and validate their usefulness with correlation analysis. After feature engineering, we feed these features into our LSTM model to predict future closing stock prices. The best results were obtained by using the stock data along with the news data. When Twitter sentiments were included, the error was higher, which suggests that many of the tweets were not directly related to Apple’s success and can interfere with predictions.</small></p> <br/>
</div>
</div>
<div class="row top-buffer">
<div class="col-md-4"> <iframe width="336" height="189" src="https://www.youtube.com/embed/oOdUFaErtY8" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> </div>
<div class="col-md-8"> <strong>Assessment and Visual Analysis of Trends using Article Reviews</strong> [<a href = "https://csil-git1.cs.surrey.sfu.ca/rajendra/avatar" target="_blank"/>Code</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/reports/KhanRM-report.pdf" target="_blank"/>Report</a>, <a href = "https://github.com/sfu-db/bigdata-cmpt733/blob/cmpt733-2019sp/posters/KhanRM-poster.pdf" target="_blank"/>Poster</a>]<br>
<small><i>Jamshed Khan, Padmanabhan Rajendrakumar, Jaideep Misra</i></small>
<p class="text-muted"> <small>In the age of Big Data, an estimated 2.5 quintillion bytes of data are generated every day, and a huge amount of this is textual. With scores of documents available on the web and more pouring in day after day, how can one get a general summary without diving in and reading every word? Searching for insights in such an enormous amount of information can become very tedious and time-consuming.</small></p> <br/>
</div>
</div>
</div>
<div class="row"><h4> </h4><hr><p class="text-center"> © Jiannan Wang 2019</p></div>
</div>
</body>
</html>