Skip to content

Fix #21356: JSON nested_to_record Silently Drops Top-Level None Values #21363

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 13 commits into from
Jun 8, 2018

Conversation

daminisatya
Copy link
Contributor

@daminisatya daminisatya commented Jun 7, 2018

@daminisatya
Copy link
Contributor Author

@WillAyd Kindly review, this is my first PR to pandas!

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good for a first PR. Just some minor comments on the comments. Otherwise can you add a whatsnew note for 0.24?

@@ -242,11 +242,13 @@ def test_missing_field(self, author_missing_data):
# unnest records where only the first record has a None value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can delete this comment (tbh I don't really understand what it means even with the current behavior)

@@ -351,7 +353,7 @@ def test_json_normalize_errors(self):
errors='raise'
)

def test_nonetype_dropping(self):
def test_nonetype(self):
# GH20030: Checks that None values are dropped in nested_to_record
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can certainly delete this comment, though replace with a reference to the GitHub issue you are working on

@@ -238,15 +238,16 @@ def test_non_ascii_key(self):
tm.assert_frame_equal(result, expected)

def test_missing_field(self, author_missing_data):
# GH20030: Checks for robustness of json_normalize - should
# unnest records where only the first record has a None value
# GH20030: Checks for robustness of json_normalize
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd Can you verify this comment once again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the GH issue number is sufficient (like you have below)

# GH20030: Checks that None values are dropped in nested_to_record
# to prevent additional columns of nans when passed to DataFrame
def test_nonetype(self):
# GH21356
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, i just added the issue number.

@@ -80,6 +80,8 @@ Documentation Changes
Bug Fixes
~~~~~~~~~

- The top level None value is not dropped but rather preserved along with lower levels for consistency (:issue:`21356`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the bugfix in the whats new note.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change of plans here - can you add to the "Fixed Regressions" section in v0.23.1 introduced via #21368?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also needs to mention json_normalize, add double backticks on None.

finally, as a user this is not really clear on what has changed, can you re-word a bit.

Copy link
Contributor Author

@daminisatya daminisatya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the requested changes.

@jorisvandenbossche jorisvandenbossche added the IO JSON read_json, to_json, json_normalize label Jun 7, 2018
@jorisvandenbossche jorisvandenbossche added this to the 0.23.1 milestone Jun 7, 2018
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor edits otherwise lgtm

@@ -80,6 +80,8 @@ Documentation Changes
Bug Fixes
~~~~~~~~~

- The top level None value is not dropped but rather preserved along with lower levels for consistency (:issue:`21356`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change of plans here - can you add to the "Fixed Regressions" section in v0.23.1 introduced via #21368?

@@ -238,15 +238,16 @@ def test_non_ascii_key(self):
tm.assert_frame_equal(result, expected)

def test_missing_field(self, author_missing_data):
# GH20030: Checks for robustness of json_normalize - should
# unnest records where only the first record has a None value
# GH20030: Checks for robustness of json_normalize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the GH issue number is sufficient (like you have below)

@codecov
Copy link

codecov bot commented Jun 7, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@636dd01). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #21363   +/-   ##
=========================================
  Coverage          ?   91.85%           
=========================================
  Files             ?      153           
  Lines             ?    49562           
  Branches          ?        0           
=========================================
  Hits              ?    45525           
  Misses            ?     4037           
  Partials          ?        0
Flag Coverage Δ
#multiple 90.25% <ø> (?)
#single 41.86% <ø> (?)
Impacted Files Coverage Δ
pandas/io/json/normalize.py 96.87% <ø> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 636dd01...b7bda86. Read the comment docs.

@@ -80,6 +80,8 @@ Documentation Changes
Bug Fixes
~~~~~~~~~

- The top level None value is not dropped but rather preserved along with lower levels for consistency (:issue:`21356`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also needs to mention json_normalize, add double backticks on None.

finally, as a user this is not really clear on what has changed, can you re-word a bit.

def test_nonetype_dropping(self):
# GH20030: Checks that None values are dropped in nested_to_record
# to prevent additional columns of nans when passed to DataFrame
def test_nonetype(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this test name more informative

@jreback
Copy link
Contributor

jreback commented Jun 8, 2018

@WillAyd if you have a few, can you rebase this

@WillAyd
Copy link
Member

WillAyd commented Jun 8, 2018

@jreback think its good to go. First time doing that so let me know if anything looks off

@jorisvandenbossche jorisvandenbossche merged commit ff26632 into pandas-dev:master Jun 8, 2018
@jorisvandenbossche
Copy link
Member

@daminisatya Thanks a lot!

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Jun 12, 2018
TomAugspurger pushed a commit that referenced this pull request Jun 12, 2018
david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
IO JSON read_json, to_json, json_normalize
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JSON nested_to_record Silently Drops Top-Level None Values
6 participants