Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Python API: Allow to cache object metadata and location #2009

Merged

Conversation

AymericDu
Copy link
Member

SUMMARY

Allow to cache object metadata

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME
  • Python API
SDS VERSION
openio 4.8.3.dev5

@AymericDu AymericDu requested review from murlock and fvennetier March 25, 2020 21:08
@AymericDu AymericDu force-pushed the python-api-allow-to-cache-object-metadata branch from e2147d0 to 8984c29 Compare March 26, 2020 09:05
@AymericDu AymericDu changed the title Python API: Allow to cache object metadata Python API: Allow to cache object metadata and location Mar 26, 2020
@codecov-io
Copy link

codecov-io commented Mar 26, 2020

Codecov Report

Merging #2009 into 4.x will decrease coverage by 0.15%.
The diff coverage is 95.84%.

Impacted file tree graph

@@            Coverage Diff             @@
##              4.x    #2009      +/-   ##
==========================================
- Coverage   80.84%   80.69%   -0.14%     
==========================================
  Files         359      375      +16     
  Lines       69015    70709    +1694     
==========================================
+ Hits        55788    57053    +1265     
- Misses      13227    13534     +307     
- Partials        0      122     +122     
Impacted Files Coverage Δ
oio/common/storage_functions.py 90.07% <50.00%> (-1.02%) ⬇️
oio/common/cache.py 93.75% <93.75%> (ø)
oio/container/client.py 84.45% <95.00%> (+0.53%) ⬆️
tests/functional/api/test_objectstorage.py 97.46% <98.43%> (+0.02%) ⬆️
oio/api/object_storage.py 80.82% <100.00%> (+0.50%) ⬆️
meta0v2/meta0_remote.c 74.29% <0.00%> (-8.57%) ⬇️
tests/functional/crawler/test_integrity_crawler.py 78.85% <0.00%> (-3.84%) ⬇️
meta1v2/meta1_prefixes.c 75.78% <0.00%> (-3.09%) ⬇️
oio/common/http_urllib3.py 85.00% <0.00%> (-2.50%) ⬇️
oio/conscience/client.py 81.82% <0.00%> (-2.47%) ⬇️
... and 52 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35366c4...f780d55. Read the comment docs.

cid = cid or cid_from_name(account, container)
cid = cid.upper()
return '/'.join(("meta", cid, obj))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be we should add a helper function _del_metadata_cache since lot of code is related to remove entry from cache

chunks[float(position)] = chunk

if meta is None or chunks is None:
meta = kwargs.get('meta')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why saving meta if line below it is overwritten ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an oversight, this line should have been deleted.

meta = None
chunks = None
cache = kwargs.get('cache')
if cache is not None and not version:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if cache and not version ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cache (dict) is empty, this condition remains true.

cid=None):
if not obj:
raise ValueError('Missing object name to use the cache')
cid = cid or cid_from_name(account, container)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to measure impact of cid_from_name since it will always compute a SHA256

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cid_from_name is already used in _object_prepare and object_fetch computes it (unless there was the cid)

else:
chunk_method = meta['chunk_method']
storage_method = STORAGE_METHODS.load(chunk_method)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be nice to split this code in few function but I don't have idea

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the cache management to container/client.py, so finally this code is no longer modified.

@AymericDu AymericDu force-pushed the python-api-allow-to-cache-object-metadata branch 4 times, most recently from 3e4f9f1 to f780d55 Compare March 27, 2020 10:38
content_meta = content_meta.copy()
if properties:
metadata['properties'] = content_meta['properties']
content_meta['properties'] = dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this. Why is there metadata['properties'] and metadata['meta']['properties']?

By the way, when reading this, it's hard to get what is content_meta and what is metadata...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_value['meta']['properties'] is always an empty dictionary. While cache_value['properties'] only exists if the object properties have been retrieved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_value['meta']['properties'] is always an empty dictionary.

Then why do we need it?

@AymericDu AymericDu force-pushed the python-api-allow-to-cache-object-metadata branch from f780d55 to 83e7f88 Compare March 27, 2020 15:15
@AymericDu AymericDu force-pushed the python-api-allow-to-cache-object-metadata branch from 83e7f88 to 6a58d85 Compare March 27, 2020 15:26
@@ -89,12 +91,15 @@ def __init__(self, namespace, logger=None, perfdata=None, **kwargs):
conf = {"namespace": self.namespace}
self.logger = logger or get_logger(conf)
self.perfdata = perfdata
self.cache = cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this field. Also, I don't think cache needs to appear in this method's signature. Adding 'cache' to EXTRA_KEYWORDS would do the trick.

content_meta = content_meta.copy()
if properties:
metadata['properties'] = content_meta['properties']
content_meta['properties'] = dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_value['meta']['properties'] is always an empty dictionary.

Then why do we need it?

@fvennetier fvennetier merged commit 9ee11ba into open-io:4.x Mar 31, 2020
@AymericDu AymericDu deleted the python-api-allow-to-cache-object-metadata branch April 2, 2020 07:52
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants