puradev/data-engineering-interview

data-engineering-interview

2023 Data Engineering Interview

Task: Retrieve and process data from the RandomUser API

Instructions:

  1. You are required to write a Python script that retrieves data from the RandomUser API.
  2. The RandomUser API (https://randomuser.me/) provides a RESTful interface for generating random user data.
  3. The API documentation can be found at https://randomuser.me/documentation.
  4. Your script should retrieve the data, process it, and store it in a local file in a suitable format.
  5. Your script should perform appropriate error checking and exception handling.
  6. Your script should be well-structured, modular, and include appropriate comments.
  7. Use any Python libraries or frameworks you deem necessary to accomplish the task.
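
As an illustration of instructions 1 and 5, the following is a minimal sketch of a fetch step with basic error handling, using only the standard library. It is not a reference solution. The endpoint URL, the `results` query parameter, and the `results` / `error` keys in the response are assumptions based on a reading of the RandomUser documentation and should be verified there; `ApiError`, `parse_payload`, and `fetch_users` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against https://randomuser.me/documentation.
API_URL = "https://randomuser.me/api/"


class ApiError(RuntimeError):
    """Raised when the API cannot be reached or returns an unusable payload."""


def parse_payload(raw: bytes) -> list:
    """Decode a response body and return its list of user records.

    Assumes the body is a JSON object with a 'results' list, and that
    failures are reported under an 'error' key.
    """
    try:
        payload = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise ApiError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ApiError("response is not a JSON object")
    if "error" in payload:
        raise ApiError(f"API reported an error: {payload['error']}")
    results = payload.get("results")
    if not isinstance(results, list):
        raise ApiError("response has no 'results' list")
    return results


def fetch_users(count: int = 10, timeout: float = 10.0) -> list:
    """Request `count` users and return the raw records."""
    query = urllib.parse.urlencode({"results": count})
    try:
        with urllib.request.urlopen(f"{API_URL}?{query}", timeout=timeout) as resp:
            body = resp.read()
    except OSError as exc:  # URLError, HTTPError, and socket timeouts are OSErrors
        raise ApiError(f"request failed: {exc}") from exc
    return parse_payload(body)
```

Keeping the parsing in its own function means the error paths can be exercised with canned bytes, without touching the network.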

Requirements:

  1. Your script should retrieve a list of random users from the API.
  2. For each user, extract the following information:
    • First name
    • Last name
    • Gender
    • Email address
    • Date of birth
    • Phone number
    • Nationality
  3. Store the extracted information in a local file in a suitable format (e.g., CSV, JSON, Parquet, etc.).
  4. Your script should handle pagination if the API response is paginated.
  5. Create a separate Markdown file that includes:
    • Overview
    • Setup Details & Instructions
    • Known Limitations or Inefficiencies
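
Requirements 2 and 3 could be approached along the following lines. This is a sketch under stated assumptions, not the expected answer: the nested keys (`name.first`, `name.last`, `dob.date`, `nat`) reflect a reading of the RandomUser response format and should be checked against the documentation; CSV is only one of the suitable formats; and `FIELDS`, `extract_user`, and `write_csv` are hypothetical names.

```python
import csv

# Output column names are this sketch's own choice.
FIELDS = [
    "first_name", "last_name", "gender", "email",
    "date_of_birth", "phone", "nationality",
]


def extract_user(raw: dict) -> dict:
    """Flatten one raw user record into the seven required fields.

    Missing keys become None instead of raising, so a single malformed
    record does not abort the whole run.
    """
    name = raw.get("name") or {}
    dob = raw.get("dob") or {}
    return {
        "first_name": name.get("first"),
        "last_name": name.get("last"),
        "gender": raw.get("gender"),
        "email": raw.get("email"),
        "date_of_birth": dob.get("date"),  # assumed to be an ISO 8601 string
        "phone": raw.get("phone"),
        "nationality": raw.get("nat"),
    }


def write_csv(users, handle) -> int:
    """Write extracted users to an open text handle; return the row count."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for raw in users:
        writer.writerow(extract_user(raw))
        count += 1
    return count
```

A caller might open the output with `open("users.csv", "w", newline="", encoding="utf-8")` and pass the handle to `write_csv`. Accepting a handle rather than a path keeps the function easy to test against an in-memory buffer.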

Evaluation Criteria:

  1. Ability to interact with a public API and retrieve data.
  2. Correct extraction and processing of the required information.
  3. Proper error handling and exception management.
  4. Suitable storage of the data in a local file.
  5. Code structure, organization, and comments.
  6. Efficient and effective handling of pagination (if applicable).
  7. Overall code quality, readability, and adherence to best practices.
  8. Extensibility of the code.
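
For criterion 6, one possible shape for a pagination loop is sketched below. It assumes the API accepts `page`, `results`, and `seed` query parameters, and that reusing the same seed keeps pages consistent between requests; both points should be confirmed in the documentation. `iter_users` and the injected `fetch_page` callable are hypothetical names, and the default seed is arbitrary.

```python
from typing import Callable, Iterator


def iter_users(
    fetch_page: Callable[[dict], list],
    total: int,
    page_size: int = 100,
    seed: str = "interview",
) -> Iterator[dict]:
    """Yield up to `total` users by requesting successive pages.

    `fetch_page` takes a dict of query parameters and returns the list of
    user records for that page. Injecting it keeps this loop independent
    of any particular HTTP client.
    """
    if total < 0 or page_size < 1:
        raise ValueError("total must be >= 0 and page_size must be >= 1")
    yielded = 0
    page = 1
    while yielded < total:
        batch = fetch_page({"page": page, "results": page_size, "seed": seed})
        if not batch:
            break  # an empty page means no more data; avoid looping forever
        for user in batch:
            if yielded >= total:
                return
            yield user
            yielded += 1
        page += 1
```

Because the function is a generator, records can be streamed straight to disk without holding every page in memory.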

Submission Guidelines:

  1. Clone this repository locally.
  2. Place your folders and files in a ZIP archive.
  3. Send an email to data.engineering@pura.com with the ZIP archive attached.
  4. Please let your company contact or recruiter know that you've finished the exercise.
  5. You have five (5) full business days to work on this (e.g., if you receive it on Monday at 12 pm, submit it by the following Monday at 12 pm).

Some Notes on Time:

  • You're free to use as much time as you deem necessary to work on this within the assignment window. Please note, however, that a marquee skill in software engineering is recognizing diminishing marginal returns, i.e., knowing when to stop.
  • Additionally, please note that the amount of time spent working on this does not necessarily correlate with an increased probability of moving on in the interview process. Your work will be judged on the criteria listed above.
