-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Authentication and read endpoints #2
Conversation
- some wording changes, mainly to push some issues to the forum
- organise now since there'll probably be a few modules
Hi @lognaturel , this is about ready for a closer look.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SO exciting to see all this come together. Liking many of the decisions you took here. I have a few questions inline. I'm also curious to hear more about why you chose to start with the designed calls rather than the dynamic query builder.
@samwel2000, @Thaliehln your thoughts on the initial shape of this are also very much appreciated. 🙏
README.md
Outdated
Available endpoints on `Client`: | ||
|
||
- Projects | ||
- read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you pick this read
/read_all
convention (e.g. is it common in similar Python libs)? How do you imagine it extending to creation? How do you see these calls interacting with the dynamic query builder we've discussed?
I'm not sure I would have separated Submissions and OData in this custom part of the API (vs. the dynamic query builder). I don't think "OData" is particularly meaningful to users -- it's just a means to get a json representation of their data. Do you think that separation is important? The OData call could be something like client.submissions.get_json
I'm asking mostly to have a record of the design decisions so we can continue being consistent!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you pick this read/read_all convention
It's following CRUD which is a plain and protocol-agnostic way to refer to manipulating objects. Any of the below may be IO targets for pyodk (e.g. cache or helpers to SQL [sqlite or postgres] or files).
CRUD | HTTP | SQL | Files |
---|---|---|---|
create | post | insert | write |
read | get | select | read |
update | put/patch | update | write/append |
delete | delete | delete | delete |
The all
suffix is in preference to 1) a 5th verb list
, or 2) a suffix many
which implies specifying more than one ID to read (but it accepts no parameters and returns all IDs). Specific formats get a suffix too e.g. read_xml
, read_all_table
. I've started mapping this out in the file at docs/api_index.csv
.
separated Submissions and OData in this custom part of the API
I agree the name OData
isn't very obvious, but I used it to follow the central docs. The exercise of making the API index doc helped clarify that these fit well under forms
and submissions
so I've moved them and deleted the odata
endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to propose a more freeform and descriptive approach to naming. With this part of the library's API we're not aiming for 1:1 parity with Central's API or full coverage so I think we should aim for as descriptive of names as we can and to also carefully design what parameters can be passed in.
The guiding question I'd like to use for naming is something like
How do people talk about the action that’s being performed? How do ODK docs and Central frontend talk about it?
E.g. we have the concept of “submission edits” so use client.submissions.edit rather than update. In the case of these reads, I think someone would say something like "ok, now I need to get this form's metadata". That leads me to client.forms.get_metadata(projectid)
and client.forms.get_metadata(projectid, formid)
.
This we mean we value expressiveness and consistency with ODK concepts over consistency between method names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me to get this merged with the current names and then to focus on naming in another PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think someone would say something like "ok, now I need to get this form's metadata".
Right, but before that I think they'd say "let me list all of the forms in this project". So maybe it's client.forms.list(projectid)
and client.forms.get_metadata(projectid, formid, extended=False)
.
I know I just said I'd value consistency between method names less than other things but it would be nice if this were consistent across resources. I think the following work equally well:
client.submissions.list(projectid, formid, extended=False)
,client.submissions.get_metadata(projectid, formid, extended=False)
client.app_users.list(projectid, extended=False)
(no way to get metadata for a single one)
And wonderfully enough, that's exactly what the API docs say: "listing all xyz" and "getting xyz metadata".
@lindsay-stevens this is already exciting, so far I like the decision of the design (The client is a context manager who does login and the Central API endpoints are attached to it), I just have a few comments.
|
- "odata" not particularly meaningful so move endpoints to form (service, metadata) and submission (tables). - test_auth tests were manipulating default files so fixed to use temp files instead.
Thanks for the review @lognaturel and comments @samwel2000 !
Main reasons: 1) the style of services and entities added here are building blocks for it, 2) it was lower down the priority list 😄, 3) limiting scope for this PR, 4) I'm not sure about the exact implementation strategy for it yet. |
- fluent style for chaining calls through the object hierarchy - remove "Entity" suffixes to avoid confusion with upcoming ODK feature with the same name, or weirdness like an "EntityEntity" - add pydantic for decent validation and de/serialization
- rename methods: read -> get etc. following feedback (naming). - subclass requests.Session to make error handling cleaner - build URL from base_url to make get/post/etc simpler - add client open/close following feedback (indentation).
- following feedback, it is not a desired approach.
- default is otherwise requests version
Thanks, @lindsay-stevens! Here's a quick summary of some of the behind-the-scenes conversations @lindsay-stevens an I had:
I'm going to merge this now and follow up with a couple of small proposed changes. Let's try to get a first release out next week. @samwel2000, please chime in here or file issues if you have any suggestions to make. |
[ ] Maybe some API path building fanciness like BirdyWhy is this the best possible solution? Were any other approaches considered?
Simple structure for reading use cases. The Client is a context manager which does login on entry. The session token is cached in a file so multiple client instances could share the same token. Service objects for the Central API endpoints are attached to the Client object. Service classes retrieve data and (generally) return it Python objects with JSON data mapped into them.
Avoided fanciness in the structure at this stage: 1) in the interests of time, 2) the initial target audience is programmers so they probably don't need much help, 3) simple stuff is easier to maintain.
About the data model classes
Initially I looked at
dataclasses
to take care of the boilerplate around the domain data types (currentlyProject
,Form
,Submission
). Althoughdataclasses
has the benefit of being built-in, it became awkward when considering things such as:I then looked at
attrs
, which was OK but was a bit verbose when it came to defining validations and data conversions, because it ignores type annotations so you have to write it out again via the fields API. Serialisation was not really included, except forasdict
andastuple
, where again defining how to convert types back to JSON is an exercise for the user.I then looked at
marshmallow
, which was better for serialisation, but ultimately it's a library about validating against a schema, i.e. the class defined for the data schema isn't used as an instance containing the actual data. So you have to write another dataclass that looks the same, to put the data in after it's been validated, or use a third-party library to generate such a class.Lastly I looked at
pydantic
, which uses type annotations to work out what to validate (type, optional or not) and how to convert data for de/serialisation e.g. string to date incoming, date to string outgoing. It can also generate a json schema for the class, but at the moment that doesn't work with the manager field (it probably can, somehow).About the fluent interface
The approach is to put an attribute
m
on the data model class, which is a model manager (like indjango
). The manager remembers the project_id and/or form_id, and is a namespace for services with those defaults set, thereby making it easier to interact with related endpoints further down the object hierarchy.For example, given a
Project
instance, we may want to manipulate relatedForm
s (or otherProject
s), or given aForm
we may want manipulate relatedSubmission
s (or otherForm
s). So a user can whip through the hierarchy like this:What are the regression risks?
Hopefully none, it's brand new!
Does this change require updates to documentation?
Yes and the Readme is updated.
Before submitting this PR, please make sure you have:
tests
nosetests
and verified all tests passblack pyodk tests
to format code