You've been invited to complete this data science assignment.
We have deliberately designed this assignment to be open-ended in order to leave plenty of room for you to make choices.
Please document your code. We do not expect you to spend days on this or to build a full-blown application.
Good luck!
In this repository you'll find a data directory with files containing energy meter data.
Each row corresponds to one meter and one time interval, and gives the energy consumed and the energy generated (or curtailed) in watt-hours.
Please build a solution that does the following:
Part I: Produce a table showing the mean energy, total energy, and total generation in kilowatt-hours for each meter in every calendar month from January 2020 to April 2021.
Part II: Using any simple model of your choice, forecast the total monthly generation for each meter for May through August 2021.
Please provide your code and the results of this assignment.
Part III: Please describe how you would turn this into a scalable system that could run on millions of rows, and how you would run and maintain it in production.
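
To make the expected shape of Parts I and II concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not a reference solution: the row layout (meter_id, ISO timestamp, consumed_wh, generated_wh) and both function names are hypothetical, "mean energy" is read as the mean consumed energy per interval, and the seasonal-naive forecast (reuse the same calendar month from the previous year) is just one possible "simple model".

```python
from collections import defaultdict
from datetime import datetime

WH_PER_KWH = 1000.0  # the data is in watt-hours; the report is in kilowatt-hours


def monthly_summary(rows):
    """Aggregate interval readings into per-meter, per-month kWh figures.

    rows: iterable of (meter_id, iso_timestamp, consumed_wh, generated_wh).
    This layout is an assumption; adapt it to the real files in the data directory.
    """
    buckets = defaultdict(lambda: {"n": 0, "consumed_wh": 0.0, "generated_wh": 0.0})
    for meter_id, ts, consumed_wh, generated_wh in rows:
        t = datetime.fromisoformat(ts)
        b = buckets[(meter_id, t.year, t.month)]
        b["n"] += 1
        b["consumed_wh"] += consumed_wh
        b["generated_wh"] += generated_wh
    return {
        key: {
            # Assumption: "mean energy" = mean consumed energy per interval.
            "mean_energy_kwh": b["consumed_wh"] / b["n"] / WH_PER_KWH,
            "total_energy_kwh": b["consumed_wh"] / WH_PER_KWH,
            "total_generation_kwh": b["generated_wh"] / WH_PER_KWH,
        }
        for key, b in sorted(buckets.items())
    }


def seasonal_naive_forecast(summary, year, months):
    """Forecast monthly generation as the value seen in the same month one year earlier.

    Meter/month pairs with no observation in the previous year are skipped.
    """
    meters = sorted({meter_id for meter_id, _, _ in summary})
    forecast = {}
    for meter_id in meters:
        for month in months:
            previous = summary.get((meter_id, year - 1, month))
            if previous is not None:
                forecast[(meter_id, year, month)] = previous["total_generation_kwh"]
    return forecast


# Tiny made-up example, for illustration only.
rows = [
    ("m1", "2020-05-01T00:00:00", 500.0, 1000.0),
    ("m1", "2020-05-01T00:30:00", 1500.0, 3000.0),
    ("m1", "2020-06-01T00:00:00", 2000.0, 0.0),
]
summary = monthly_summary(rows)
forecast = seasonal_naive_forecast(summary, 2021, [5, 6, 7, 8])
```

In this sketch the summary is keyed by (meter_id, year, month), so the forecast can look up the prior-year month directly. For Part III, the same group-and-sum step maps naturally onto a distributed group-by, which is one reason to keep the aggregation separate from the model.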
We are looking forward to your solution!