-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Allow clustering of a nested path #1007
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sweet!
'texts': [ | ||
{ | ||
'text': 'Can you provide a short summary of the following text', | ||
'text_cluster': {'cluster_id': 0, 'membership_prob': 1.0, 'topic': 'summarization'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my understanding where does "_cluster" come from? maybe this should be text.cluster
if auto-gen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's avoid having periods because that makes it unclear if cluster
is a child of text
, or a column name with a dot in it. If we do a "dot", we will now forever have to escape it with double quotes in order to select that field in text form
select_rows(['texts.*."text.cluster"'])
vs simply select_rows(['texts.*.text_cluster'])
. Obviously a tuple will work, but the correct tuple will be tricky to write when using public api in notebooks.
Clustering of
something.*.*
is not yet supported. The path must end in a field name for now.