Product spec for Views #22

kgodey · 2022-01-13T20:57:30Z

I was thinking through how to represent Views in Mathesar and ended up having a lot of thoughts on the data model and how to translate DB queries to Views. I figured we could have the most granular discussion here instead of a GitHub Discussion.

Previous discussion and context

Meta issue for current Views issues: Query Builder and Views design mathesar#442
Column types for Views mathesar#838

Additional notes

I also fixed a bug in the organize images script: it couldn't handle images with spaces in the filename.

kgodey · 2022-01-13T20:59:15Z

@dmos62 @ghislaineguerin @mathemancer @pavish @seancolsen @silentninja I've assigned all of you to review. As usual, please unassign yourself when you have left feedback and don't intend to leave any more or if everything looks good to you.

pavish · 2022-01-13T21:12:19Z

product/specs/2022-01-views/02-modeling-views.md

+## Filters
+Views can have filters applied. Unlike Tables, view filters are not necessarily related to the columns that are present in the view.
+
+Using the example table above, imagine a view created from the query `SELECT ID, Title FROM Movies WHERE Year > 2000;`. This returns a view that is filtered by Year even though Year is not a column in the View.


Do we intend to show this filter in the 'filters dropdown' on the frontend?

I think it's best if we consider this just as part of the query for the view, and not as a filter that can be manipulated.

Reasoning:

The view is specifically created by applying this filter on the parent table.

When the user wants to apply filters on the view, I assume they'll want to only further filter the view with the columns shown rather than being able to edit the value for Year.

Based on our UX, it would not be possible to add this filter for Year back if the user removes it, unless they close and reopen the view.

It would also confuse non-technical users if a random filter is present on a view which does not contain that column.

> the 'filters dropdown' on the frontend?
> [...]
> Based on our UX

Please don't assume any particular design for Views as yet. This spec is aimed at clarifying product requirements which will then influence the requirements for design.

e.g. see this UX for adding a filter in Chart.io https://chartio.com/docs/visual-sql/start-a-query/visual-mode/#add-a-filter.

Even if we don't allow the user to manipulate the filters, it seems like it would be useful to show the filter applied (without them having to look at the query or understand SQL), otherwise non-technical users may get confused about why the View is not showing all the data they expect. And if we're figuring out how to represent the filter, that's most of the work, so why not let them change it?

Okay, that makes sense.

I've been explaining to people lately that tables are a structured representation of raw data and views are reports generated from those tables. When I look at views as reports, it kind of makes them seem immutable, which is true from a db standpoint.

But considering that we intend to represent views as mutable entities on the frontend, allowing users to edit the base filter makes sense.

I'm only concerned about how well we would be able to represent the differences between tables and views to the user.

Argh, I couldn't see this conversation while doing my own review, so there's some overlap. To reiterate what I said in other spots, I think we're missing clarity between the filter that defines a view and a filter applied to a view. I think we need both, and since, from SQLAlchemy's perspective, a view is a table (more or less), filtering a previously created view is the same as filtering a table. I think we should be crystal clear on this point, lest we create confusion for users.

This applies to all transformations that can be used to define a view.

"Saved" filters are part of the query. "Unsaved" filters are only visible in the UI.

I don't see how this would get rid of the concept of views. Could you elaborate?

Views are more than just saved filters. They can be referenced by other queries/views and can have their own set of filters/sorts; they are hierarchical structures that are closer to a table in their characteristics.

Calling them unsaved filters would be misleading: an unsaved filter sounds like something that cannot be extended, something that either is persisted or isn't and, if persisted, can be concatenated to existing filters.

In the case of very deep Views (table1 -> view1 -> view2), they need to show that hierarchy, and users should be able to determine where operations like filter/sort took place.

Views are saved filters, among other things. Views can be used as:

Hierarchy and composition are the reason why I would not call them saved filters.

@silentninja said

Hierarchy and composition are the reason why I would not call them saved filters.

This is a succinct way to put it; I agree.

I don't think we can rely on users always naming views well. I'm also not sure how this point relates to the product spec or filters.

Views should be treated as similar to a table, so a view's name could help with understanding what data the view holds instead of having to look at where or how it was generated. This is more of a convention than an accurate description. If views are named properly, we don't have to worry about showing how the view was generated unless required, in which case we could use complex diagrams like dependency graphs.

Generally, when working with a View I wouldn't worry about how it was generated; I would rather focus on what to do with the data it holds. Again, this is a convention, so it wouldn't apply to all situations, just like you mentioned.

I'm not sure what you mean here by "a dependency chart" or "cluttered filters". Could you explain or maybe do a quick wireframe of what you're imagining for both of those things?

Dependency graph:

As for cluttered filters, I don't have a visualisation other than our existing filter dropdown, but the reason I think it would be cluttered is that, in the case of a deep view, we could end up with too many filters in a list (maybe along with the entity name they came from).

product/specs/2022-01-views/02-modeling-views.md

product/specs/2022-01-views/03-modeling-view-columns.md

product/specs/2022-01-views/04-ui-requirements-for-views.md

kgodey · 2022-01-13T23:11:39Z

It might be useful to look through the Chart.io docs while reviewing the spec; I took some inspiration from them. They have a good example of breaking down SQL query building visually, which is essentially what we're doing to create and interact with Views:

mathemancer

I think most of this makes sense. I've noted some spots that may be some trouble in specific comments.

mathemancer · 2022-01-19T10:15:28Z

product/specs/2022-01-views.md

+
+# Introduction
+
+Fundamentally, **Views** are saved database queries. This means that in order to work with Views in Mathesar, we need to translate every concept that can be used in [PostgreSQL queries](https://www.postgresql.org/docs/14/queries.html) to our end users in a user-friendly way.


This isn't quite true. You could query a view, and further modify it (e.g., by adding filters, joining, choosing a subset of columns, etc.) without knowing how the view was created. This would give us some flexibility for working with views previously defined by some DB query.

I figured we'd be able to break down views previously defined by some DB query into the concepts defined in this spec (as long as we have access to the query, which I assume we would if we had access to the view). Is this inaccurate?

That might be possible, but would be very (very) difficult under all but the simplest circumstances. Moreover, my point is that even if we could do that, we don't need to in order to be able to work with a view. Having access to the underlying query isn't necessary, since we can treat a view as a table for any of the operations we support.
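To make the "treat a view as a table" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than PostgreSQL or Mathesar's actual code. The `movies` table, the `recent_movies` view, and the `query_relation` helper are all hypothetical names made up for illustration. The helper filters, sorts, and picks columns from a relation by name, and never inspects how the view was defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INTEGER, rating REAL);
    INSERT INTO movies (title, year, rating) VALUES
        ('Alpha', 1995, 7.1),
        ('Beta', 2004, 8.3),
        ('Gamma', 2011, 6.4),
        ('Delta', 2019, 8.9);
    -- Pretend this view was created outside our tool; we never read its definition.
    CREATE VIEW recent_movies AS
        SELECT id, title, rating FROM movies WHERE year > 2000;
""")


def query_relation(conn, relation, columns, where=None, params=(), order_by=None):
    """Select from a table or a view by name; both are handled identically.

    Identifiers are interpolated directly, so this sketch assumes trusted
    relation and column names. Only filter values go through parameters.
    """
    sql = f"SELECT {', '.join(columns)} FROM {relation}"
    if where:
        sql += f" WHERE {where}"
    if order_by:
        sql += f" ORDER BY {order_by}"
    return conn.execute(sql, params).fetchall()


# Further filter and sort the view without knowing it filters on year.
top_recent = query_relation(
    conn, "recent_movies", ["title"],
    where="rating > ?", params=(8.0,), order_by="rating DESC",
)

# The same helper works unchanged on the underlying table.
old_movies = query_relation(conn, "movies", ["title"], where="year < ?", params=(2000,))
```

The only thing the caller needs is the relation's name and its output columns, which is the flexibility described above for views created by some arbitrary DB query.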

mathemancer · 2022-01-19T14:38:38Z

product/specs/2022-01-views/01-assumptions.md

+I'm making the following assumptions in the rest of the spec about how we want to work with Views in Mathesar.
+
+- We **do not** need to support creating or editing Views based on every conceivable database query in Mathesar. We will be focusing on allowing common use cases.
+- We **do** need to support viewing Views based on any conceivable database query correctly, even if they can't be edited. Users should be able to connect a database with existing Views to Mathesar and have those Views show up correctly.


Does this imply being able to view the definition of the view somehow, or just the actual tabular output? I.e., if they combine something we don't support with something we do (e.g., a filter we understand), do we need to try to pick out the filter and show that in the UI?

Yes, that's the idea. If there's a filter we don't understand, I think we would show it as an unknown filter in the UI and not allow them to edit that part.

mathemancer · 2022-01-19T14:41:35Z

product/specs/2022-01-views/01-assumptions.md

+
+- We **do not** need to support creating or editing Views based on every conceivable database query in Mathesar. We will be focusing on allowing common use cases.
+- We **do** need to support viewing Views based on any conceivable database query correctly, even if they can't be edited. Users should be able to connect a database with existing Views to Mathesar and have those Views show up correctly.
+- At the moment, we **only** care about the final output of the views. If a view uses a subquery, CTE, union, intersection, etc. internally, we will not be representing those to the user in the UI (unless they look at the underlying SQL query).


To double-check: I thought if a user creates a view, the creation would be visible (insofar as it was created in Mathesar). I.e., if they filter and group a table to create the view, they'd be able to see what those elements were in the UI. Is this incorrect?

That is correct. But if a view was created outside of Mathesar and involved a CTE creating some derived columns that don't reflect in the final tabular output of the view, the user will not see anything about those derived columns.

mathemancer · 2022-01-19T14:47:33Z

product/specs/2022-01-views/02-modeling-views.md

+dateCreated: 2022-01-13T19:49:54Z
+---
+
+Here's how I think we should model views in our API and UI. Each heading represents an attribute of Views.


On most/all of the sections in this page, we have the possibility for confusion between the transformations that create a view, and then further transformations applied to that view. For example, I can create a view by filtering based on some row. Then, looking at that view, I can filter to further reduce the dataset without saving the result as a view. I think this multiple-level transformation isn't very clear from the way these sections are laid out. This is probably going to be challenging to portray in a sensible manner.

Note that I'm thinking in the abstract here, not necessarily in our current UI (I don't recall if we have that functionality or not). I do think that being able to manipulate / sort / group / filter the data in a persisted view would be useful for most users, though, and that comes with the problem I mentioned.

Please see comment about "saved" and "unsaved" filters above.

mathemancer · 2022-01-19T14:53:30Z

product/specs/2022-01-views/03-modeling-view-columns.md

+### Data Sources
+ - **Definition**: This is the set of source columns that are used to generate the data in the current View column.
+- **Allowed values**: references to other Table or View columns, including other columns in the same View.
+- **Optional**: This could be empty for purely calculated columns (e.g. using the Postgres `random()` function and putting the output in a column)
+
+### Data Formula


I'm not sure Data Sources and Data Formula should be separate. The formula must include the sources (if there are any) to make sense.

The formula would include the sources, yes.

I'm borrowing Element's UI to illustrate the idea; imagine Matrix channels are data sources.

Formula would be something like:

Sources would just be a list of the variables used in the formula:

I like how you made that illustration, Kriti. Creative. I'm leaning towards dropping Data Sources, and querying the Data Formula for what columns it references. That way there would be a single source of truth for this. Having Data Sources at the same level of interface as Data Formula is a bit in conflict with the fact that one is derived from the other.

We won't always have a Formula, e.g. a view that's built on the query `SELECT Title, Release Year, Rating FROM Movies;` won't have any Formula for Title, Release Year or Rating. It will have a Source for each column, though.

We also may not have a Source for a column with a Formula - e.g. a column might be generated using the RANDOM() function, which would be the Formula. There's no source data for it, though.
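One way to picture the two attributes as independent and individually optional is the sketch below. It is not taken from the spec or from Mathesar's code: the `ViewColumn` and `ColumnRef` classes, their field names, and the `Decade` column are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ColumnRef:
    """Hypothetical reference to a column of a Table or View."""
    relation: str
    column: str


@dataclass
class ViewColumn:
    """Hypothetical model of a View column with optional sources and formula."""
    name: str
    data_type: str
    sources: List[ColumnRef] = field(default_factory=list)  # may be empty
    formula: Optional[str] = None                            # may be absent


# A source but no formula: the column is passed through unchanged.
title = ViewColumn("Title", "text", sources=[ColumnRef("Movies", "Title")])

# A formula but no source: the value is purely calculated.
lucky = ViewColumn("Lucky Number", "double precision", formula="random()")

# Both: a calculation over a source column (made-up example).
decade = ViewColumn(
    "Decade", "integer",
    sources=[ColumnRef("Movies", "Release Year")],
    formula='("Release Year" / 10) * 10',
)
```

In this sketch neither attribute can be derived from the other in every case, which is the argument for keeping both even though a formula, when present, mentions its sources.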

mathemancer · 2022-01-19T15:00:45Z

product/specs/2022-01-views/04-ui-requirements-for-views.md

+## Columns
+- **For alpha release**: Users should be able to see all columns associated with a view. Each column should show:
+	- Data Type (non-editable)
+	- Data Sources (editable if there's no Formula)


I don't understand the goal of making the data sources editable. Would the intention be to swap in a different column in place of the one shown? Wouldn't it be simpler / smoother to remove that column and add a different one?

I think that if users want to reuse a view with a different table, this might come in handy. I think removing/adding a new column is probably easier with SQL than it is through the interface.

Now I definitely don't understand. Do you want them to use sort of the same processing, but on a different data set? I.e., maybe apply the same sequence of filters and grouping? Do you have an example of what you're envisioning there (i.e., with actual table names and fake data)?

I don't think we can reuse views with different tables, at least not in any easy way that's worth doing for the alpha release.

I'll remove the requirement that data sources be editable; it would be better to drop and create new columns.

mathemancer · 2022-01-19T15:04:10Z

product/specs/2022-01-views/04-ui-requirements-for-views.md

+- **For alpha release**: 
+	- Users should be able to see the rows associated with a view.
+	- If a cell is a direct representation of a record, users should be able to edit that record via that cell.
+	- If a column is a direct representation of a record, users should be able to add a new record via that cell.


To double-check: the intent would be to pop up some "record input" UI when trying to edit the cell, correct? I.e., since the entire table wouldn't be there, they'd need some interface to input the rest of the row in the associated table. What if the column is the join column? Would they edit the record in both tables, or just one? (Or does that not count as a direct representation?)

Yes, that's the idea.

I think if it's a join column, it should be updated in both tables. I'll specify.

Actually, on second thought, join columns should not count as direct representations. Only columns with a single data source count.
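The rule in the last comment can be sketched as a small predicate. This is only an illustration: the function name, the `(relation, column)` tuple shape, and the extra condition that a direct representation has no formula are assumptions, not something the spec states.

```python
from typing import Optional, Sequence, Tuple

ColumnRef = Tuple[str, str]  # (relation name, column name); hypothetical shape


def is_direct_representation(
    sources: Sequence[ColumnRef], formula: Optional[str] = None
) -> bool:
    """Return True when a view column mirrors exactly one source column.

    Assumption: a column with a formula is treated as computed, so it is
    not considered directly editable even if it has a single source.
    """
    return len(sources) == 1 and formula is None


# A plain pass-through column has one source.
plain = is_direct_representation([("Movies", "Title")])

# A join column draws on a column from each joined table, so it has two sources.
join_column = is_direct_representation([("Movies", "Studio ID"), ("Studios", "ID")])

# A purely calculated column has no source at all.
computed = is_direct_representation([], formula="random()")
```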

mathemancer · 2022-01-19T15:05:03Z

product/specs/2022-01-views/04-ui-requirements-for-views.md

+	- Users should be able to see what filters are applied to their View.
+	- Users should be able to edit and delete filters applied to their view, including basic use cases for columns that are not visible in the View.


I don't quite understand this. Does this refer to editing the underlying query via the filter UI, or filtering the already-created view?

Both, I'll add more details about "saved" vs. "unsaved".

mathemancer · 2022-01-19T15:05:31Z

product/specs/2022-01-views/04-ui-requirements-for-views.md

+	- Users should be able to see what sorts are applied to their View.
+	- Users should be able to edit and delete sorts applied to their view, including basic use cases for columns that are not visible in the View.


Same question as for filters.

silentninja · 2022-01-24T08:53:30Z

product/specs/2022-01-views/03-modeling-view-columns.md

+Here's how I think we should model view columns in our API and UI. Each heading represents an attribute of a View Column.
+
+### Data Type
+- **Definition**: This is the final data type of the content of the column after any computations etc. are applied.


We cannot always determine the exact data type of a view's column, as it could come from a function that returns a polymorphic type. We should take this into consideration and support polymorphic column types.

You can create a function like that, and even use it to generate the column, but if PostgreSQL can't figure out a non-pseudo-type for the column, it'll throw an error. All polymorphic types are pseudo types.

https://www.postgresql.org/docs/13/datatype-pseudo.html

So, you can't have a column of type, e.g., `ANYARRAY`.

silentninja · 2022-01-24T09:12:34Z

product/specs/2022-01-views/03-modeling-view-columns.md

+- **Required**. Data type should always be set, at the very least, we can treat unknown data types as text.
+
+### Sources
+ - **Definition**: This is the set of source columns that are used to generate the data in the current View column.


Will it refer only to the view's immediate parent, ignoring further ancestors?

Yes, it will reference the immediate parent. If the parent is another view, you'll have to go look at that view to find the source.
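As a rough sketch of what "go look at that view" could mean programmatically, the snippet below walks a chain such as table1 -> view1 -> view2 one immediate parent at a time until it reaches base table columns. The `IMMEDIATE_SOURCES` mapping and the `resolve_base_columns` function are hypothetical; nothing here reflects how Mathesar stores sources.

```python
# Each view column maps to its immediate sources only, as described above.
# Columns of base tables do not appear as keys.
IMMEDIATE_SOURCES = {
    ("view2", "title"): [("view1", "title")],
    ("view1", "title"): [("table1", "title")],
    ("view2", "lucky"): [],  # purely calculated column with no source
}


def resolve_base_columns(relation, column, immediate_sources):
    """Follow immediate-parent references until only base table columns remain."""
    key = (relation, column)
    if key not in immediate_sources:
        # Not a known view column, so treat it as a base table column.
        return {key}
    base = set()
    for parent_relation, parent_column in immediate_sources[key]:
        base |= resolve_base_columns(parent_relation, parent_column, immediate_sources)
    return base


title_origin = resolve_base_columns("view2", "title", IMMEDIATE_SOURCES)
lucky_origin = resolve_base_columns("view2", "lucky", IMMEDIATE_SOURCES)
```

Storing only the immediate parent keeps each view's metadata local, and a walk like this is roughly the traversal a dependency graph would need later.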

I do like the idea of a dependency graph eventually but I don't think it's worth doing before the alpha release.

silentninja · 2022-01-24T10:39:22Z

product/specs/2022-01-views/04-ui-requirements-for-views.md

+## Filters
+- **For alpha release**:
+	- Users should be able to see what filters are applied to their View.
+	- Users should be able to edit and delete filters applied to their view in the UI, including basic use cases for columns that are not visible in the View.


What does "including basic use cases for columns that are not visible in the View" mean?

Assuming `SELECT ID, Title FROM Movies WHERE Year > 2000;` is the query for the view, here the "filter" would be `Year > 2000`. Users should be able to edit that even though Year is not a visible column in the View.
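The example can be reproduced with Python's built-in `sqlite3` module (used here as a stand-in for PostgreSQL; the sample rows and the view name `RecentMovies` are made up). The view exposes only `ID` and `Title`, yet its rows depend on `Year`. Editing that filter amounts to redefining the view, shown here as drop and recreate because SQLite has no `CREATE OR REPLACE VIEW`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (ID INTEGER PRIMARY KEY, Title TEXT, Year INTEGER);
    INSERT INTO Movies (Title, Year) VALUES
        ('Old Film', 1990),
        ('Newer Film', 2005),
        ('Newest Film', 2020);
    CREATE VIEW RecentMovies AS
        SELECT ID, Title FROM Movies WHERE Year > 2000;
""")

cursor = conn.execute("SELECT * FROM RecentMovies ORDER BY ID")
view_columns = [description[0] for description in cursor.description]
rows_before = cursor.fetchall()

# "Editing the filter" on the hidden Year column means changing the view's query.
conn.executescript("""
    DROP VIEW RecentMovies;
    CREATE VIEW RecentMovies AS
        SELECT ID, Title FROM Movies WHERE Year > 2010;
""")
rows_after = conn.execute("SELECT * FROM RecentMovies ORDER BY ID").fetchall()
```

`Year` never appears in `view_columns`, so a UI that lists only the view's output columns would have no place to show this filter unless the view's query is modeled separately.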

kgodey · 2022-01-24T21:18:17Z

@silentninja @mathemancer I've updated the spec to have a new page for modeling View Queries. Queries have their own filters, sorts, and aggregations, which should only be accessible when you're editing the query (if the query is editable). This is separate from filters, sorts, and groups on view data.

The "saved filter" concept has been removed, although there will still be functionality to apply whatever filters you have in the UI to the view query.

kgodey · 2022-01-24T22:46:19Z

I'm going to go ahead and merge this. I'll open a new discussion once it's up on the wiki for any further thoughts.

Initial commit of Views spec.

4f711ea

kgodey requested a review from a team as a code owner January 13, 2022 20:57

github-actions bot requested review from eito-fis, ghislaineguerin, mathemancer, pavish, seancolsen and silentninja and removed request for a team January 13, 2022 20:57

kgodey requested review from dmos62 and a team and removed request for eito-fis January 13, 2022 20:57

kgodey assigned seancolsen, kgodey, ghislaineguerin, silentninja, mathemancer, pavish and dmos62 Jan 13, 2022

kgodey added the status: review In review label Jan 13, 2022

pavish reviewed Jan 13, 2022

kgodey mentioned this pull request Jan 14, 2022

Design for List data type mathesar-foundation/mathesar#978

Closed

seancolsen approved these changes Jan 18, 2022

seancolsen removed their assignment Jan 18, 2022

pavish removed their assignment Jan 18, 2022

mathemancer reviewed Jan 19, 2022

ghislaineguerin approved these changes Jan 19, 2022

kgodey unassigned ghislaineguerin Jan 20, 2022

Clarified Views spec based on review comments.

a634ef0

kgodey requested a review from mathemancer January 20, 2022 21:51

Fixed image links

49149f1

kgodey force-pushed the views_spec branch 2 times, most recently from 3e2ee13 to 95b0c95 Compare January 20, 2022 22:29

Fixed issue with incorrectly identifying unused images

a3af069

kgodey force-pushed the views_spec branch from 1515cac to a3af069 Compare January 20, 2022 22:35

Organize images

1b15c78

dmos62 removed their request for review January 21, 2022 12:19

kgodey unassigned dmos62 Jan 22, 2022

silentninja reviewed Jan 24, 2022

kgodey and others added 6 commits January 24, 2022 15:49

Added view query modeling.

91e8fac

Merge branch 'views_spec' of github.com:centerofci/mathesar-wiki into…

28a8baf

… views_spec

Organize images

35e2683

Revert "Organize images"

dd659bb

This reverts commit 35e2683.

Revert "Organize images"

062bd76

This reverts commit 1b15c78.

Organize images

494f1cd

silentninja removed their assignment Jan 24, 2022

kgodey and others added 2 commits January 24, 2022 16:14

Clarified UI requirements for Views

2a59099

Organize images

6ca5c8d

kgodey and others added 6 commits January 24, 2022 17:11

Added wireframes to illustrate UI ideas.

ce3941b

Organize images

f6b26e9

Fixed images.

cc4e6d0

Organize images

601ae44

Revert "Organize images"

bde64db

This reverts commit 601ae44.

Fixed images

500bd75

kgodey merged commit 5639935 into master Jan 24, 2022

kgodey deleted the views_spec branch January 24, 2022 22:34

