inf-2202-f17/p3

Third mandatory assignment: techniques for working with cloud-scale datasets

In this exercise set you will get hands-on experience with techniques and concepts that often turn up when working with cloud systems and data. You will be using Azure Data Lake, a public cloud offering based on previously internal Microsoft technology. Nearly every developer at the Microsoft office here in Tromsø has worked with these tools at some point. Worldwide, this technology, together with other systems, powers the reporting and logging pipelines for most (if not all) large Microsoft services.

The full assignment text is in p3.pdf in the P3 folder. The repo also contains the slides from the presentation of this assignment.

Practical info

  • The deadline is Friday, November 3rd, at 23:00.
  • The report should not be more than 3 pages long.
  • We expect a serious report and discussion of your performance evaluation and of the other points the assignment text instructs you to consider. Follow the advice from Åge's and Steffen's lectures. Skip the technical background unless it pertains to your interpretation of the numbers.
  • We expect you to run your code in the cloud. The Data Lake free trial asks for a credit card, but spending real money after the trial credits run out is opt-in, so you need not worry about your money. That said, do keep an eye on your credits.
