Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Consider pagination of canned queries #856

Open
simonw opened this issue Jun 20, 2020 · 3 comments
Open

Consider pagination of canned queries #856

simonw opened this issue Jun 20, 2020 · 3 comments
Labels

Comments

@simonw
Copy link
Owner

simonw commented Jun 20, 2020

The new canned_queries() plugin hook from #852 combined with plugins like https://github.com/simonw/datasette-saved-queries could mean that some installations end up with hundreds or even thousands of canned queries. I should consider pagination or some other way of ensuring that this doesn't cause performance problems for Datasette.

@bram2000
Copy link

bram2000 commented May 18, 2021

Hi Simon, I'm using a canned query to do some geospatial stuff, but it maxes out at 1000 rows returned. I can't see any Link headers to follow to get the next page of data. Is there any way currently to work around this 1000 row limit for canned queries?

Thanks,
Jon

@simonw
Copy link
Owner Author

simonw commented May 18, 2021

Tables and views get "stream all rows" at the moment, so one workaround is to define a SQL view for your query - this only works for queries that don't take any parameters though (although you may be able to define a view and then pass it extra fields using the Datasette table interface, like on https://latest.datasette.io/fixtures/paginated_view?content_extra__contains=9)

I've explored this problem in a bit more detail in https://githu.com/simonw/django-sql-dashboard and I think I have a pattern that could work.

For your canned query, you could implement the pattern yourself by setting up two canned queries that look something like this:

https://github-to-sqlite.dogsheep.net/github?sql=select+rowid%2C+sha%2C+author_date+from+commits+order+by+rowid+limit+1000

select rowid, sha, author_date from commits order by rowid limit 1000

That gets you the first set of 1,000 results. The important thing here is to order by a unique column, in this case rowid - because then subsequent pages can be loaded by a separate canned query that looks like this:

select rowid, sha, author_date from commits where rowid > :after order by rowid limit 1000

https://github-to-sqlite.dogsheep.net/github?sql=select+rowid%2C+sha%2C+author_date+from+commits+where+rowid+%3E+%3Aafter+order+by+rowid+limit+1000&after=1000

You then need to write code which knows how to generate these queries - start with the first query with no where clause (or if you are using rowid you can just use the second query and pass it ?after=0 for the first call) - then keep calling the query passing in the last rowid you recieved as the after parameter.

Basically this is an implementation of keyset pagination with a smart client. When Datasette grows the ability to do this itself it will work by executing this mechanism inside the Python code, which is how the "stream all rows" option for tables works at the moment.

@bram2000
Copy link

Thanks Simon this is working very well.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants