Product spec for Views #22
@dmos62 @ghislaineguerin @mathemancer @pavish @seancolsen @silentninja I've assigned all of you to review. As usual, please unassign yourself when you have left feedback and don't intend to leave any more or if everything looks good to you.
## Filters
Views can have filters applied. Unlike Tables, view filters are not necessarily related to the columns that are present in the view.

Using the example table above, imagine a view created from the query `SELECT ID, Title FROM Movies WHERE Year > 2000;`. This returns a view that is filtered by Year even though Year is not a column in the View.
Do we intend to show this filter in the 'filters dropdown' on the frontend?
I think it's best if we consider this just as part of the query for the view, and not as a filter that can be manipulated.
Reasoning:
- The view is specifically created by applying this filter on the parent table.
- When the user wants to apply filters on the view, I assume they'll want to only further filter the view with the columns shown rather than being able to edit the value for Year.
- Based on our UX, it would not be possible to add this filter for Year back if the user removes it, unless they close and reopen the view.
- It would also confuse non-technical users if a random filter is present on a view which does not contain that column.
> the 'filters dropdown' on the frontend?
> [...]
> Based on our UX

Please don't assume any particular design for Views as yet. This spec is aimed at clarifying product requirements which will then influence the requirements for design.
e.g. see this UX for adding a filter in Chart.io https://chartio.com/docs/visual-sql/start-a-query/visual-mode/#add-a-filter.
Even if we don't allow the user to manipulate the filters, it seems like it would be useful to show the filter applied (without them having to look at the query or understand SQL), otherwise non-technical users may get confused about why the View is not showing all the data they expect. And if we're figuring out how to represent the filter, that's most of the work, so why not let them change it?
Okay, that makes sense.
I've been explaining to people lately that tables are a structured representation of raw data and views are reports generated from those tables. When I look at views as reports, it kind of makes them seem immutable, which is true from a db standpoint.
But considering that we intend to represent views as mutable entities on the frontend, allowing users to edit the base filter makes sense.
I'm only concerned about how well we would be able to represent the differences between tables and views to the user.
Argh, I couldn't see this conversation while doing my own review, so there's some overlap. To reiterate what I said in other spots, I think we're missing clarity between the filter that defines a view, and a filter applied to a view. I think we need both, and since from SQLAlchemy a view is a table (more-or-less), filtering a previously created view is the same as filtering a table. I think we should be crystal clear on this point, lest we create confusion for users.
This applies to all transformations that can be used to define a view.
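To make the distinction concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than PostgreSQL or SQLAlchemy. The sample rows and the view name `RecentMovies` are hypothetical; only the `Movies` query comes from the spec. The first filter (`Year > 2000`) defines the view, while the second is applied to the already-created view exactly as it would be to a table.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (ID INTEGER PRIMARY KEY, Title TEXT, Year INTEGER);
    INSERT INTO Movies (ID, Title, Year) VALUES
        (1, 'Alien', 1979),
        (2, 'Amelie', 2001),
        (3, 'Arrival', 2016);
    -- The filter that DEFINES the view; Year is not one of its columns.
    CREATE VIEW RecentMovies AS
        SELECT ID, Title FROM Movies WHERE Year > 2000;
""")

# Column names exposed by the view (Year is absent).
cursor = conn.execute("SELECT * FROM RecentMovies")
view_columns = [description[0] for description in cursor.description]

# Rows produced by the defining filter alone.
all_rows = conn.execute(
    "SELECT ID, Title FROM RecentMovies ORDER BY ID"
).fetchall()

# A filter APPLIED TO the view, which can only use the view's own columns.
filtered_rows = conn.execute(
    "SELECT ID, Title FROM RecentMovies WHERE Title LIKE ? ORDER BY ID",
    ("Arr%",),
).fetchall()

print(view_columns, all_rows, filtered_rows)
```

The second query never touches `Year`; changing that condition would mean changing the view's definition, which is the separate operation discussed in this thread.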
"Saved" filters are part of the query. "Unsaved" filters are only visible in the UI.
I don't see how this would get rid of the concept of views. Could you elaborate?
Views are more than just saved filters. They can be referenced by other queries/views and can have their own set of filters/sorts; they are hierarchical structures that are closer to tables in their characteristics.
If we call them unsaved filters, it would be misleading, as `unsaved filter` sounds like something that cannot be extended: something that either can or can't be persisted and, if persisted, can be concatenated to existing filters.
In the case of very deep Views (table1 -> view1 -> view2), that hierarchy needs to be shown, and users should be able to determine where operations like filter/sort took place.
Views are saved filters, among other things. Views can be used as:
Hierarchy and composition are the reason why I would not call them saved filters.
@silentninja said
> Hierarchy and composition are the reason why I would not call them saved filters.

This is a succinct way to put it; I agree.
I don't think we can rely on users always naming views well. I'm also not sure how this point relates to the product spec or filters.
Views should be treated as similar to tables, so a view name could help with understanding what data the view holds instead of having to look at where or how it was generated. This is more of a convention than an accurate description. If views are named properly, we don't have to worry about showing how a view was generated unless required, in which case we could use complex diagrams like dependency graphs.
Generally, when working with a `View`, I wouldn't worry about how it was generated; I would rather focus on what to do with the data it holds. This again is a convention, so it wouldn't apply to all situations, just as you mentioned.
I'm not sure what you mean here by "a dependency chart" or "cluttered filters". Could you explain or maybe do a quick wireframe of what you're imagining for both of those things?
As for cluttered filters, I don't have a visualisation other than our existing filter dropdown, but the reason I think it would be cluttered is that, in the case of a deep view, we could end up with too many filters in a list (maybe along with the name of the entity they came from).
It might be useful to look through the Chart.io docs while reviewing the spec; I took some inspiration from them. They have a good example of breaking down SQL query building visually, which is essentially what we're doing to create and interact with Views:
I think most of this makes sense. I've noted some spots that may be some trouble in specific comments.
# Introduction

Fundamentally, **Views** are saved database queries. This means that in order to work with Views in Mathesar, we need to translate every concept that can be used in [PostgreSQL queries](https://www.postgresql.org/docs/14/queries.html) to our end users in a user-friendly way.
This isn't quite true. You could query a view, and further modify it (e.g., by adding filters, joining, choosing a subset of columns, etc.) without knowing how the view was created. This would give us some flexibility for working with views previously defined by some DB query.
I figured we'd be able to break down views previously defined by some DB query into the concepts defined in this spec (as long as we have access to the query, which I assume we would if we had access to the view). Is this inaccurate?
That might be possible, but would be very (very) difficult under all but the simplest circumstances. Moreover, my point is that even if we could do that, we don't need to in order to be able to work with a view. Having access to the underlying query isn't necessary, since we can treat a view as a table for any of the operations we support.
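As a rough illustration of this point, the sketch below uses Python's built-in `sqlite3` module (an assumption made for brevity; it is not how Mathesar talks to PostgreSQL). The helper names `quote_ident` and `fetch_rows`, the view name `RecentMovies`, and the sample rows are all hypothetical. The same helper filters and sorts a table and a view, and it never reads the view's defining query.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def fetch_rows(conn, relation, columns, where_sql="", params=(), order_by=None):
    """Select from a table or a view by name.

    The relation is treated as opaque: if it is a view, its defining
    query is never inspected.
    """
    column_list = ", ".join(quote_ident(column) for column in columns)
    sql = f"SELECT {column_list} FROM {quote_ident(relation)}"
    if where_sql:
        sql += f" WHERE {where_sql}"
    if order_by:
        sql += f" ORDER BY {quote_ident(order_by)}"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (ID INTEGER PRIMARY KEY, Title TEXT, Year INTEGER);
    INSERT INTO Movies (ID, Title, Year) VALUES
        (1, 'Alien', 1979),
        (2, 'Amelie', 2001),
        (3, 'Arrival', 2016),
        (4, 'Up', 2009);
    CREATE VIEW RecentMovies AS
        SELECT ID, Title FROM Movies WHERE Year > 2000;
""")

# The same helper works for the table and for the view.
table_rows = fetch_rows(conn, "Movies", ["Title"], '"Year" > ?', (2000,), order_by="Title")
view_rows = fetch_rows(conn, "RecentMovies", ["Title"], '"Title" LIKE ?', ("A%",), order_by="Title")

print(table_rows)
print(view_rows)
```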
I'm making the following assumptions in the rest of the spec about how we want to work with Views in Mathesar.

- We **do not** need to support creating or editing Views based on every conceivable database query in Mathesar. We will be focusing on allowing common use cases.
- We **do** need to support viewing Views based on any conceivable database query correctly, even if they can't be edited. Users should be able to connect a database with existing Views to Mathesar and have those Views show up correctly.
Does this imply being able to view the definition of the view somehow, or just the actual tabular output? I.e., if they combine something we don't support with something we do (e.g., a filter we understand), do we need to try to pick out the filter and show that in the UI?
Yes, that's the idea. If there's a filter we don't understand, I think we would show it as an unknown filter in the UI and not allow them to edit that part.
- At the moment, we **only** care about the final output of the views. If a view uses a subquery, CTE, union, intersection, etc. internally, we will not be representing those to the user in the UI (unless they look at the underlying SQL query).
To double-check: I thought if a user creates a view, the creation would be visible (insofar as it was created in Mathesar). I.e., if they filter and group a table to create the view, they'd be able to see what those elements were in the UI. Is this incorrect?
That is correct. But if a view was created outside of Mathesar and involved a CTE creating some derived columns that don't reflect in the final tabular output of the view, the user will not see anything about those derived columns.
dateCreated: 2022-01-13T19:49:54Z
---

Here's how I think we should model views in our API and UI. Each heading represents an attribute of Views.
On most/all of the sections in this page, we have the possibility for confusion between the transformations that create a view, and then further transformations applied to that view. For example, I can create a view by filtering based on some row. Then, looking at that view, I can filter to further reduce the dataset without saving the result as a view. I think this multiple-level transformation isn't very clear from the way these sections are laid out. This is probably going to be challenging to portray in a sensible manner.
Note that I'm thinking in the abstract here, not necessarily in our current UI (I don't recall if we have that functionality or not). I do think that being able to manipulate / sort / group / filter the data in a persisted view would be useful for most users, though, and that comes with the problem I mentioned.
Please see comment about "saved" and "unsaved" filters above.
### Data Sources
- **Definition**: This is the set of source columns that are used to generate the data in the current View column.
- **Allowed values**: references to other Table or View columns, including other columns in the same View.
- **Optional**: This could be empty for purely calculated columns (e.g. using the Postgres `random()` function and putting the output in a column).

### Data Formula
I'm not sure Data Sources and Data Formula should be separate. The formula must include the sources (if there are any) to make sense.
I like how you made that illustration, Kriti. Creative. I'm leaning towards dropping Data Sources, and querying the Data Formula for what columns it references. That way there would be a single source of truth for this. Having Data Sources at the same level of interface as Data Formula is a bit in conflict with the fact that one is derived from the other.
We won't always have a Formula, e.g. a view that's built on the query `SELECT Title, Release Year, Rating FROM Movies;` won't have any Formula for `Title`, `Release Year`, or `Rating`. It will have a Source for each column, though.
We also may not have a Source for a column with a Formula, e.g. a column might be generated using the `RANDOM()` function, which would be the Formula. There's no source data for it, though.
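One way to picture the two cases is a small data model in which both attributes are optional. This is only a sketch: the class names `ColumnRef` and `ViewColumn`, the field names, the `Lucky Number` column, and the type strings are hypothetical and are not taken from the spec or from Mathesar's code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ColumnRef:
    """Reference to a column in a Table or View (hypothetical shape)."""
    relation: str
    column: str


@dataclass
class ViewColumn:
    """A View column whose Sources and Formula are each optional."""
    name: str
    data_type: str
    sources: List[ColumnRef] = field(default_factory=list)
    formula: Optional[str] = None


# Source but no Formula: the column is selected as-is from Movies.
title = ViewColumn(
    name="Title",
    data_type="text",
    sources=[ColumnRef("Movies", "Title")],
)

# Formula but no Source: the values are purely calculated.
lucky_number = ViewColumn(
    name="Lucky Number",
    data_type="double precision",
    formula="RANDOM()",
)

print(title)
print(lucky_number)
```

Keeping both fields optional covers the two examples above; whether Sources should instead be derived from the Formula, as suggested earlier in this thread, is a separate design question that this sketch does not settle.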
## Columns
- **For alpha release**: Users should be able to see all columns associated with a view. Each column should show:
    - Data Type (non-editable)
    - Data Sources (editable if there's no Formula)
I don't understand the goal of making the data sources editable. Would the intention be to drop a different column in in place of the one shown? Wouldn't it be simpler / smoother to remove that column and add a different one?
I think that if users want to reuse a view with a different table, this might come in handy. I think removing/adding a new column is probably easier with SQL than it is through the interface.
Now I definitely don't understand. Do you want them to use sort of the same processing, but on a different data set? I.e., maybe apply the same sequence of filters and grouping? Do you have an example of what you're envisioning there (i.e., with actual table names and fake data)?
I don't think we can reuse views with different tables, at least not in any easy way that's worth doing for the alpha release.
I'll remove data sources being editable; it would be better to drop and create new columns.
- **For alpha release**:
    - Users should be able to see the rows associated with a view.
    - If a cell is a direct representation of a record, users should be able to edit that record via that cell.
    - If a column is a direct representation of a record, users should be able to add a new record via that cell.
To double-check: the intent would be to pop up some "record input" UI when trying to edit the cell, correct? I.e., since the entire table wouldn't be there, they'd need some interface to input the rest of the row in the associated table. What if the column is the join column? Would they edit the record in both tables, or just one? (Or does that not count as a direct representation?)
Yes, that's the idea.
I think if it's a join column, it should be updated in both tables. I'll specify.
Actually, on second thought, join columns should not count as direct representations. Only columns with a single data source count.
- Users should be able to see what filters are applied to their View.
- Users should be able to edit and delete filters applied to their view, including basic use cases for columns that are not visible in the View.
I don't quite understand this. Does this refer to editing the underlying query via the filter UI, or filtering the already-created view?
Both, I'll add more details about "saved" vs. "unsaved".
- Users should be able to see what sorts are applied to their View.
- Users should be able to edit and delete sorts applied to their view, including basic use cases for columns that are not visible in the View.
Same question as filters
3e2ee13 to 95b0c95
Here's how I think we should model view columns in our API and UI. Each heading represents an attribute of a View Column.

### Data Type
- **Definition**: This is the final data type of the content of the column after any computations etc. are applied.
We cannot always determine the exact data type of a view's column, as it could come from a function that returns a polymorphic type. So we should take this into consideration and support polymorphic column types.
You can create a function like that, and even use it to generate the column, but if PostgreSQL can't figure out a non-pseudo-type for the column, it'll throw an error. All polymorphic types are pseudo types.
https://www.postgresql.org/docs/13/datatype-pseudo.html
So, you can't have a column of type, e.g., ANYARRAY
- **Required**. Data type should always be set; at the very least, we can treat unknown data types as text.

### Sources
- **Definition**: This is the set of source columns that are used to generate the data in the current View column.
Will it refer only to the view's immediate parent, ignoring the ancestors?
Yes, it will reference the immediate parent. If the parent is another view, you'll have to go look at that view to find the source.
I do like the idea of a dependency graph eventually but I don't think it's worth doing before the alpha release.
## Filters
- **For alpha release**:
    - Users should be able to see what filters are applied to their View.
    - Users should be able to edit and delete filters applied to their view in the UI, including basic use cases for columns that are not visible in the View.
What does "including basic use cases for columns that are not visible in the View" mean?
Assuming `SELECT ID, Title FROM Movies WHERE Year > 2000;` is the query for the view, here the "filter" would be `Year > 2000`. Users should be able to edit that even though `Year` is not a visible column in the View.
@silentninja @mathemancer I've updated the spec to have a new page for modeling View Queries. Queries have their own filters, sorts, and aggregations, which should only be accessible when you're editing the query (if the query is editable). This is separate from filters, sorts, and groups on view data. The "saved filter" concept has been removed, although there will still be functionality to apply whatever filters you have in the UI to the view query.
I'm going to go ahead and merge this. I'll open a new discussion once it's up on the wiki for any further thoughts.
I was thinking through how to represent Views in Mathesar and ended up having a lot of thoughts on the data model and how to translate DB queries to Views. I figured we could have the most granular discussion here instead of a GitHub Discussion.
Previous discussion and context
Additional notes
I also fixed a bug in the organize images script; it couldn't handle images with spaces in the filename.