12_nosql-challenge

Deliverable(s)

Deliverable 1: A Jupyter notebook containing code that imports the data and sets up and updates the uk_food database.
NoSQL Setup

Deliverable 2: A Jupyter notebook containing code that performs the exploratory analysis queries in the database.
NoSQL Analysis

Instructions

The UK Food Standards Agency evaluates various establishments across the United Kingdom, and gives them a food hygiene rating. You've been contracted by the editors of a food magazine, Eat Safe, Love, to evaluate some of the ratings data in order to help their journalists and food critics decide where to focus future articles.

Requirements

Part 1: Database and Jupyter Notebook Set Up

Include the 'mongoimport' command text you used to import establishments.json in a markdown cell at the beginning of your Jupyter notebook file
The 'mongoimport' command text correctly drops any existing establishments collection before importing establishments.json into MongoDB
The database is named 'uk_food' and the collection is named 'establishments'
Correctly imports 'PyMongo' and 'Pretty Print'
An instance of the 'Mongo Client' is created
Lists the databases you have in Mongo, which includes 'uk_food'
Lists the collection(s) in the 'uk_food' database, which includes 'establishments' in the output
Uses 'find_one()' and 'pprint' to display one document in the 'establishments' collection
The 'establishments' collection is assigned to a variable

Part 2: Update the Database

The supplied data for the "Penang Flavours" restaurant is correctly inserted into the 'establishments' collection
A query is performed to find the 'BusinessTypeID' for "Restaurant/Cafe/Canteen" and returns only the 'BusinessTypeID' and 'BusinessType' fields
The "Penang Flavours" document is updated with the correct value for 'BusinessTypeID'
A query is correctly performed to delete all the documents in the collection where "Dover Local Authority" is the value for 'LocalAuthorityName'
A 'count_documents()' check is performed before and after the removal of the Dover documents to ensure the documents were removed
An 'update_many()' query is performed to convert the latitude and longitude fields from strings to decimal numbers and 'RatingValue' to integers

Part 3: Exploratory Analysis

Question 1: Which establishments have a hygiene score equal to 20?

Answer: There are 41 establishments with a hygiene score of 20.

A query is correctly performed to find the establishments with a hygiene score of 20
'count_documents()' is used to list the correct number of documents (answer: 41)
The first result is printed using 'pprint'
The results are converted to a Pandas DataFrame and displays the first 10 rows

Question 2: Which establishments in London have a RatingValue greater than or equal to 4?

Answer: There are 33 establishments with London as the Local Authoriy and a Rating Value greater than or equal to 4.

A query is correctly performed to find the establishments in London with a 'RatingValue' greater than or equal to 4
The query uses the '$regex' operator to locate the London establishments
'count_documents()' is used to list the correct number of documents (answer: 33)
The first result is printed using 'pprint'
The results are converted to a Pandas DataFrame and displays the first 10 rows

Question 3: What are the top 5 establishments with a RatingValue of 5, sorted by lowest hygiene score, nearest to the new restaurant added, "Penang Flavours"?

Answer: BusinessName: Howe and Co Fish and Chips - Van 17

RatingValue: 5 Hygiene: 0

Answer: BusinessName: Atlantic Fish Bar

RatingValue: 5 Hygiene: 0

Answer: BusinessName: Plumstead Manor Nursery

RatingValue: 5 Hygiene: 0

Answer: BusinessName: Iceland

RatingValue: 5 Hygiene: 0

Answer: BusinessName: Volunteer

RatingValue: 5 Hygiene: 0

A query is correctly performed to find the establishments within 0.01 degree of the "Penang Flavours" restaurant
The query also limits the results to establishments with a 'RatingValue' of 5
The query uses the 'sort()' method in 'PyMongo' to sort in ascending order on the hygiene score
The query uses the 'limit()' method in 'PyMongo' to limit the results to 5
All five results are printed using 'pprint'
The results are converted to a Pandas DataFrame and displayed

Question 4: How many establishments in each Local Authority area have a hygiene score of 0? Sort the results from highest to lowest, and print out the top ten local authority areas.

Answer: 55 establishments

An aggregation pipeline is built to include a 'match' query, 'group', and 'sort'
The 'match' query matches documents with a hygiene score of 0
The 'group' step of the pipeline is grouped on 'LocalAuthorityName' and counts the number of documents
The 'sort' step of the pipeline sorts the count of the documents in descending order
The aggregation pipeline is correctly sent to the 'aggregate()' method
The results from the aggregation query is cast as a list and then saved to a variable
The first ten results are printed using 'pprint'
The results are converted to a Pandas DataFrame and displays the first 10 rows

Deployment and Submission

Submit a link to a GitHub repository that’s cloned to your local machine and contains your files
Use the command line to add your files to the repository
Include appropriate commit messages in your files

Comments

Be well commented with concise, relevant notes that other developers can understand

====================================================================

CODING_PROCESS

I referenced the class activity files for this module to understand syntax and code structure. These resources provided a foundation for resolving various challenges encountered during the assignment.

Setup File

To update two fields (latitude and longitude) and convert their values to doubles, I looked up the correct setup for the code:

   establishments.update_many( {},    [{
  '$set': {
      'geocode.longitude': {'$toDouble': '$geocode.longitude'},
      'geocode.latitude': {'$toDouble': '$geocode.latitude'}
  }    }])

To inspect the data types of specific fields within the collection, I retrieved one document and iterated through its fields:

  document = uk_food.establishments.find_one()

  for field, value in document.items():
  print(f"{field}: {type(value).__name__}")

To look at specific field dataypes in a cllection, I reasearch online what code to use:

  document = uk_food.establishments.find_one()

  for field, value in document.items():
  print(f"{field}: {type(value).__name__}")

Analysis File

While constructing hygiene_count, I encountered asyntax issue, which I resolved by putting the hygiene_query in parenthesis:
```
  hygiene_count = establishments.count_documents(hygiene_query)
```

When building the long_lat_query, I had trouble with the code structure and how best to create it:

  long_lat_query = {'geocode.longitude': {'$gte': longitude - degree_search, '$lte': longitude + degree_search},
          'geocode.latitude': {'$gte': latitude - degree_search, '$lte': latitude + degree_search},
          'RatingValue' : 5}

To group the results by LocalAuthorityName, I initially encountered errors due to omitting the underscore in _id. After troubleshooting, I used the following corrected code:

  match_hygiene_score = {'$match': {'scores.Hygiene': {'$eq': 0}}}

  group_local_authority = {
      '$group': {
          '_id': '$LocalAuthorityName', 
          'count': { '$sum': 1 }}}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
Resources		Resources
.gitignore		.gitignore
NoSQL_analysis.ipynb		NoSQL_analysis.ipynb
NoSQL_setup.ipynb		NoSQL_setup.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

12_nosql-challenge

Deliverable(s)

Instructions

Requirements

Part 1: Database and Jupyter Notebook Set Up

Part 2: Update the Database

Part 3: Exploratory Analysis

Question 1: Which establishments have a hygiene score equal to 20?

Question 2: Which establishments in London have a RatingValue greater than or equal to 4?

Question 3: What are the top 5 establishments with a RatingValue of 5, sorted by lowest hygiene score, nearest to the new restaurant added, "Penang Flavours"?

Question 4: How many establishments in each Local Authority area have a hygiene score of 0? Sort the results from highest to lowest, and print out the top ten local authority areas.

Deployment and Submission

Comments

CODING_PROCESS

Setup File

Analysis File

About

Releases

Packages

Languages

wrighang/12_nosql-challenge

Folders and files

Latest commit

History

Repository files navigation

12_nosql-challenge

Deliverable(s)

Instructions

Requirements

Part 1: Database and Jupyter Notebook Set Up

Part 2: Update the Database

Part 3: Exploratory Analysis

Question 1: Which establishments have a hygiene score equal to 20?

Question 2: Which establishments in London have a RatingValue greater than or equal to 4?

Question 3: What are the top 5 establishments with a RatingValue of 5, sorted by lowest hygiene score, nearest to the new restaurant added, "Penang Flavours"?

Question 4: How many establishments in each Local Authority area have a hygiene score of 0? Sort the results from highest to lowest, and print out the top ten local authority areas.

Deployment and Submission

Comments

CODING_PROCESS

Setup File

Analysis File

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages