Suggest to use hash index, instead of btree index in migration. #94
Conversation
Oops, let me fix this :)
IMO the added complexity in the installation command and keeping two migration files is a bit too much here. How about a single migration along these lines:

```ruby
create_table :solid_cache_entries do |t|
  t.binary :key, null: false, limit: 1024, index: { unique: true, using: :hash }
  t.binary :value, null: false, limit: 512.megabytes
  t.datetime :created_at, null: false
  # Uncomment the btree index below if you want to use `SolidCache::Entry.delete_matched`
  # t.index :key, unique: true, using: :btree
end
```
Thanks, @rafaelsales, I agree with your suggestion. Just for visibility's sake, I'll post @djmb's response from Slack.

I'll wait for a final decision from @djmb and then I'll fix this PR. But it seems reasonable to completely remove support for #delete_matched and have only one migration that uses a hash index in PG, but btree in MySQL.
Small note on this, per Postgres' documentation: "Currently, only B-tree indexes can be declared unique." So a hash index cannot have uniqueness enforcement.
@djmb it would be great to hear which direction you want to take with the "hash index" for PG. I'll be happy to adjust this PR accordingly. Not sure if there is a point in fixing the current implementation.
Hi @skatkov! Sorry for the delay on this!

I'm planning to add a separate Maybe we could still make it a hash index on postgresql, but that would depend on whether we need it to also be a unique index. We could keep the unique index on

Also, the performance difference between btree and hash indexes may not be as significant for a 64-bit integer column compared to a 1K blob.

Another consideration is the changes needed to allow us to estimate the cache size. If those work, then the This would require an index on

I'll be working on this over the next few weeks, but it needs some investigation. Once we have the columns and indexes in place for everything, we can see where a postgresql hash index fits into the mix.
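For the 64-bit integer column idea mentioned above, here is a minimal sketch of how a binary key could be condensed into a BIGINT-sized value. The `key_hash` helper name and the hashing scheme are my assumptions for illustration, not solid_cache's actual API:

```ruby
require "digest"

# Hypothetical helper (name and scheme are assumptions): derive a signed
# 64-bit integer from a binary cache key, suitable for a BIGINT column
# that could then carry a hash (or btree) index.
def key_hash(key)
  # First 8 bytes of a SHA-256 digest, read as a signed 64-bit big-endian
  # integer, so it fits a Postgres/MySQL BIGINT.
  Digest::SHA256.digest(key).byteslice(0, 8).unpack1("q>")
end

key_hash("views/products/1")  # deterministic for a given key
```

A lookup would then hash the requested key and filter on the integer column, where equality comparisons are cheap regardless of index type.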
Minor indent-related issue
```diff
@@ -0,0 +1,11 @@
+class CreateSolidCacheEntries < ActiveRecord::Migration[7.0]
+  def change
```
Why 4 spaces instead of 2?
```diff
@@ -0,0 +1,11 @@
+class CreateSolidCacheEntries < ActiveRecord::Migration[7.0]
+  def change
```
same
```diff
+      t.binary :value, null: false, limit: 512.megabytes
+      t.datetime :created_at, null: false
+
+      t.index :key, unique: true, using: :hash
```
Error encountered with message:

```
PG::FeatureNotSupported: ERROR: access method "hash" does not support unique indexes
```

From the docs (https://www.postgresql.org/docs/16/indexes-unique.html): "Currently, only B-tree indexes can be declared unique."
Though I attempted to work around it by removing `unique: true`, I would encounter another error, `ArgumentError - No unique index found for key`, from https://github.com/rails/rails/blob/v7.1.2/activerecord/lib/active_record/insert_all.rb#L161 (upserts need a unique index to resolve conflicts, and Postgres hash indexes can't be unique).

Edit: And then I read the comments above and saw it's already reported, sorry~
I'll drop this for now. Happy to take another stab at this once @djmb decides on the approach he wants to take here.
Good day. Thanks for the amazing gem.

I experimented with storing a cache in a relational database as well. From my findings, it seems like a good idea to suggest using a hash index instead of a btree index as a possible alternative.
A hash index doesn't have to descend through the tree to find the right entry; instead it finds the cache entry in O(1). This small improvement can yield a 40-60% lookup speed improvement, though it depends on the size of the cache.
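The O(1) vs O(log n) difference can be illustrated in plain Ruby, using a Hash as a stand-in for a hash index and binary search over sorted keys as a stand-in for a btree. This is an analogy only, not a database benchmark:

```ruby
require "benchmark"

keys = (0...200_000).map { |i| format("cache/key/%08d", i) }
hash_index  = keys.to_h { |k| [k, true] }  # hash-index analogue: O(1) lookup
sorted_keys = keys.sort                    # btree analogue: O(log n) binary search

lookups = keys.sample(20_000)

Benchmark.bm(10) do |x|
  x.report("hash:")  { lookups.each { |k| hash_index[k] } }
  x.report("btree:") { lookups.each { |k| sorted_keys.bsearch { |e| k <=> e } } }
end
```

In a real database the gap also depends on page layout, key size, and caching, which is why the measured improvement varies with cache size.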
This optimization brings only one drawback: the #delete_matched method can't be used, since a hash index can't serve the prefix scan it relies on. So I'm raising an error if a hash index is detected. But to be frank, not a lot of people use this method anyway, so it seems like a manageable drawback.
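A minimal sketch of such a guard, where the `Index` struct stands in for the metadata ActiveRecord's `connection.indexes` would return (names here are illustrative, not the PR's actual code):

```ruby
# Stand-in for ActiveRecord index metadata (connection.indexes returns
# objects responding to #columns and #using, among others).
Index = Struct.new(:columns, :using)

# Hypothetical guard: delete_matched needs a prefix/range scan on :key,
# which a hash index cannot serve, so raise early if one is detected.
def assert_delete_matched_supported!(indexes)
  key_index = indexes.find { |i| i.columns == ["key"] }
  if key_index && key_index.using == :hash
    raise NotImplementedError,
          "delete_matched is not supported with a hash index on :key"
  end
end

assert_delete_matched_supported!([Index.new(["key"], :btree)])  # passes silently
```

Failing fast like this turns a silently wrong (or unindexed, table-scanning) query into an explicit, documented limitation.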
Just to add a little more context to this PR, I'm linking an interesting benchmark that compares btree with hash indexes:
https://evgeniydemin.medium.com/postgresql-indexes-hash-vs-b-tree-84b4f6aa6d61