Skip to content

Add raw json output to JsonLdExtractor #103

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

Granitosaurus
Copy link

Sometimes jsonld schema is prefered in raw json format rather than python dict - this PR implementes as_json kwarg bool to determine whether to return python dictionary or json string.

Some use cases:

1. Use alternative json library
2. Use json.loads kwargs
3. Use object hook functions

My personal use case is to strip away @ keys with object hook:

def strip_jsonld(data):
    return {k: v for k, v in data.items() if not k.startswith('@')}

data = json.loads(script, object_hook=strip_jsonld)

@codecov
Copy link

codecov bot commented Jan 25, 2019

Codecov Report

Merging #103 into master will decrease coverage by 0.32%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #103      +/-   ##
==========================================
- Coverage    87.3%   86.98%   -0.33%     
==========================================
  Files          11       11              
  Lines         457      461       +4     
  Branches       97       98       +1     
==========================================
+ Hits          399      401       +2     
- Misses         52       53       +1     
- Partials        6        7       +1
Impacted Files Coverage Δ
extruct/jsonld.py 100% <100%> (ø) ⬆️
extruct/rdfa.py 91.3% <60%> (-8.7%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3ab5592...ff22c72. Read the comment docs.

Copy link
Member

@lopuhin lopuhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Granitosaurus I think this is a useful feature to have (although #69 still has it's place), left some comments below.

Copy link
Member

@lopuhin lopuhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Granitosaurus thanks 👍 Left a few minor comments. Also would be nice to mention this feature in the README.

granitosaurus added 2 commits January 29, 2019 01:39
@Granitosaurus
Copy link
Author

Also added this feature to RDFA extractor and updated readme with short examples.

@Gallaecio
Copy link
Member

This looks mostly ready. What about adding a test covering the usage of parse_json=False?

@Granitosaurus
Copy link
Author

Sorry that it took me so long to attend to this but the tests gave me a bit of an headache :D

Should be good to go now!

@Gallaecio
Copy link
Member

TestRDFa.test_wikipedia_xhtml_rdfa_no_prefix seems to be failing after your changes.

@Granitosaurus
Copy link
Author

TestRDFa.test_wikipedia_xhtml_rdfa_no_prefix seems to be failing after your changes.

It has been failing master for me too; I actually update the fixtures in this PR to prevent it failing but the test just seems to be flawed in some sense - it keeps either breakin on travis or on my machine locally. 🤷‍♂️

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants