Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

exec_batch does not release GVL. #287

Open
ioquatix opened this issue Apr 3, 2020 · 9 comments
Open

exec_batch does not release GVL. #287

ioquatix opened this issue Apr 3, 2020 · 9 comments

Comments

@ioquatix
Copy link

ioquatix commented Apr 3, 2020

#!/usr/bin/env ruby

require 'sqlite3'

Thread.new do
	while true
		sleep 0.1
		$stdout.write '.'
	end
end

s = "SELECT 1;"*500000
db = SQLite3::Database.new(':memory:')

sleep 1

$stdout.write '>'
db.execute_batch2(s)
$stdout.write '<'

sleep 1

Gives

koyoko% ./sqlite3.rb
..........>.<..........

Expected:

koyoko% ./sqlite3.rb
..........>............<..........

Or something to that effect.

cc @ko1 @tenderlove @jeremyevans

@tenderlove
Copy link
Member

execute_batch2 calls back in to Ruby for each row.

We would have to unlock the GVL before calling sqlite3_exec, then re-acquire the lock each time we process a row. AFAIK releasing / acquiring the lock isn't free, so this could slow down overall throughput (especially if done for every row). On top of that, I don't think anyone actually uses this API, let alone in a multi-threaded application. Another challenge is that SQLite3 is only optionally thread safe, so we would have to add conditionals around whether or not the release the GVL (and those conditionals would have to exactly match SQLite3 which lets you pick thread safetyness at compile and/or runtime). Finally, if someone were actually using this method in a threaded environment, and we made a mistake with the GVL lock / unlock we could corrupt their data.

For the above reasons, (high risk, low reward) I'm not particularly motivated to fix this. So if someone wants to try tackling it, that would be great. Otherwise I'm going to wait for user demand.

@ioquatix
Copy link
Author

ioquatix commented Apr 18, 2020

I think what you proposed is the correct approach as locking the VM is definitely the wrong approach. The overhead of the per-row lock is to be worn by the user of the interface and is a natural consequence of the GVL.

Sqlite3 should not be used on multiple threads IMHO, it should be a separate connection/instance per thread. Does that work? Otherwise, a per-connection mutex would make sense to me. What happens if the row callback tries to execute another query?

@tenderlove
Copy link
Member

I think what you proposed is the correct approach as locking the VM is definitely the wrong approach.

I don't understand? My proposal is to do nothing unless there is demand for the feature by people that use it.

@tenderlove
Copy link
Member

I'm also open to deprecating and deleting the method too. But AFAIK it's not hurting any users the way it's implemented today

@ioquatix
Copy link
Author

ioquatix commented Apr 18, 2020

We would have to unlock the GVL before calling sqlite3_exec, then re-acquire the lock each time we process a row.

This implementation is the correct one. However, I agree it's inefficient.

If sqlite3 is not thread safe, you should protect it with a mutex, which would also prevent using the same connection for a 2nd query while enumerating the results of a different query.

@ioquatix
Copy link
Author

Do any of the methods in this code base release the GVL?

@bradgessler
Copy link

Greetings! I'm working on an project at https://github.com/fly-apps/railsite that would make Sqlite a real option for smaller production Rails deployments, which would ideally work with a threaded Puma server.

Since it's been a while, here's what the results look like in ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21] with sqlite3 1.5.4.

.........>.<.........

This confirms this is still an issue with the latest versions of Ruby and the Gem.

My question is whether or not the flag at

# Was sqlite3 compiled with thread safety on?
def self.threadsafe?; threadsafe > 0; end
would solve the thread safety problems discussed above. If so, then this may be a documentation issue? I'm guessing there's more work that needs to be done around implementing a Mutex in Rubyland.

@dbackeus
Copy link

dbackeus commented Dec 20, 2023

AFAICT this is not only an issue with execute_batch but any execution using this gem. This effectively prevents any kind of Thread / Fiber based concurrency from working as expected.

Now that "SQLite in production" is trending, this is going to pose obvious performance issues for anyone using thread / fiber based servers / job processors like puma / falcon / sidekiq etc for their apps.

The extralite gem is an alternative SQLite client which releases the GVL during blocking, see note on concurrency here: https://github.com/digital-fabric/extralite?tab=readme-ov-file#concurrency. It is both significantly faster than this gem in general and doesn't have concurrency issues. Maybe that could be a source of inspiration for how to solve this? (I'd love to help out myself, but alas I have no experience programming in C)

I'm currently messing with a POC for allowing extralite to be used with ActiveRecord's SQLite adapter: fractaledmind/activerecord-enhancedsqlite3-adapter#2. The implementation hacky right now, but it works and solves the concurrency issues. Would be nice if this gem could solve this as well, to avoid the rigamarole of migrating the backing client for ActiveRecord.

@jhawthorn
Copy link
Member

It is both significantly faster than this gem in general and doesn't have concurrency issues. Maybe that could be a source of inspiration for how to solve this? (I'd love to help out myself, but alas I have no experience programming in C)

I took a quick look and unfortunately, no, that does not look like an implementation to look up to. It seems like by default they release the GVL for 1 in 1000 rows returned which is essentially just an overly complex way to call Thread.pass while actually holding the GVL for 99.9+% of work.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants