Add built in connection pool #108

Merged
merged 9 commits
Jun 10, 2018

Conversation

@fishcakez (Member) commented Dec 11, 2017

Adds a built-in connection pool that increases throughput by 3-7% for most single queries and load levels in Ecto. Other workloads may see 0-10% improvements.

The pool uses an ETS queue and some other ETS trickery to avoid tracking busy connections/clients. The latter feature (if this were the only pool style) would allow us to stop using the pdict, so that we can ensure no data leaks for non-closure transactions.

We might be able to get better performance by removing either or both of these ETS use cases.

IO overload handling is better (and simpler) than the Sojourn pool's. In future we could improve the case where the database is down or on the other side of a netsplit, because we can know when there are no connections.

CPU overload handling is not as good as the Sojourn pool's, but could be improved by limiting the queue size and dropping all new checkouts above that size.

There is no overflow. This is something we could do in future if we have a simple and good algorithm.

Idle ping handling is improved over other pools because pings will never synchronize and happen as rarely as possible.

Timeout handling is improved over other pools because an absolute deadline can be used (useful for multiple-repo or multi-service calls) and the queue time is included in the timeout if not absolute. Also, queue timeouts are DBConnection.ConnectionError errors, like in the Sojourn pool, so they are easier to handle.

Process hibernation occurs efficiently and does not affect performance. It could be more efficient if we hibernated on backoff.

Overhead for pool_size: 1 is insignificant compared to DBConnection.Connection, so this pool could replace it.
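
As a rough sketch of the queuing idea described above (table and variable names are illustrative, not the PR's actual code): waiting callers live in an ordered_set ETS table keyed by {monotonic_time, unique_integer, from}, so the pool serves them in FIFO order via :ets.first/1 without tracking busy clients in its own state.

# Illustrative only: an ordered_set keeps keys sorted, so the oldest waiting
# caller is always returned by :ets.first/1.
queue = :ets.new(:checkout_queue, [:ordered_set, :protected])

enqueue = fn from ->
  :ets.insert(queue, {{System.monotonic_time(), System.unique_integer(), from}})
end

dequeue = fn ->
  case :ets.first(queue) do
    :"$end_of_table" ->
      :empty

    key ->
      :ets.delete(queue, key)
      {:dequeued, key}
  end
end

For example, enqueue.({self(), make_ref()}) followed by dequeue.() returns the oldest waiting caller first.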

@josevalim (Member)

This looks great! What is the plan here? Do we plan to remove support for at least the Sojourn pool then? I think we should come up with a plan to remove code before adding even more. :)

I also recall you saying this supports ownership out of the box. Should we force this pool to be used whenever we want the ownership style tests?

@fishcakez (Member Author)

What is the plan here?

There isn't a plan here yet, just that I knew we could have a faster pool.

Do we plan to remove support for at least the Sojourn pool then?

I would like to do this very much. I would actually like to remove all the pools and just use this one and ownership. Or some other big simplification.

An alternative would be to always supervise connections away from the actual pool process, like this PR does, so that most of the logic lives in the same shared off-tree worker, with a light message-passing/queuing loop in the pooled worker. By connecting in another process we can improve the queuing in all existing pools (except Sojourn, which already does it). I think it would also allow us to simplify Ownership by not having a nested pool and always running pool_size proxy processes.

I also recall you saying this supports ownership out of the box. Should we force this pool to be used whenever we want the ownership style tests?

🤔 this doesn't actually work with ownership because of #106 (n.b. this issue is trivial to solve for most cases by having a watcher per pool type). If we stopped allowing custom pool behaviors, it could work by using the same message-passing semantics, i.e. a contract on messages rather than functions. This would also solve the issue of having to pass the pool option.

@josevalim (Member)

I would like to do this very much. I would actually like to remove all the pools and just use this one and ownership. Or some other big simplification.

👍

If we stopped allowing custom pool behaviors it could work by using the same message passing semantics

👍

@fishcakez (Member Author)

We should add an expire_timeout or similar option that disconnects a connection on checkin/poll if it is older than a timeout.
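
A minimal sketch of what that check could look like (the option and function names here are hypothetical, not an existing DBConnection API):

defmodule ExpirySketch do
  # Hypothetical: decide on checkin/poll whether a connection started more than
  # expire_timeout milliseconds ago should be disconnected rather than reused.
  def action(started_at, expire_timeout, now \\ System.monotonic_time(:millisecond)) do
    if now - started_at > expire_timeout, do: :disconnect, else: :checkin
  end
end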

@michalmuskala (Member) commented Jan 30, 2018

Today @DavidAntaramian asked on Slack about gathering some statistics about the pool, in particular stats about the saturation and load of the pool for optimising and monitoring the application.

If we're designing a new pool, maybe it would make sense to also include some APIs to gather this data or at least provide some hooks so it could be integrated with some metrics library?

@michalmuskala (Member)

Also, regarding overflow: I think that for database connections a way to dynamically change the pool size would actually be more useful than poolboy-style overflow, because of the connection cost.

If we had a way to gather metrics, it would become possible to have a component regulating the pool size (this could be integrated as an external library), growing it during overload and decreasing during quiet time, but smoothly.

@fishcakez (Member Author) commented Jan 30, 2018

If we're designing a new pool, maybe it would make sense to also include some APIs to gather this data or at least provide some hooks so it could be integrated with some metrics library?

Yes, definitely, but I think we can worry about metrics after the implementation. It will be easier to do once we have removed poolboy/sojourn; we do not need to get ahead of ourselves :D.

If we had a way to gather metrics, it would become possible to have a component regulating the pool size (this could be integrated as an external library), growing it during overload and decreasing during quiet time, but smoothly.

This is what the Sojourn pool does already FWIW (and provides hookable metrics). I am slightly 👎 on any overflow feature to be honest because of the complexity.

Not required to monitor as give away and heir handles tracking caller
@fishcakez (Member Author)

@michalmuskala I removed the monitoring as it's not required. I'm unsure how small the performance gain from this is, but it simplifies the code slightly.

@michalmuskala (Member)

@fishcakez I got some failures when I tried the changes:

07:34:36.676 [error] GenServer #PID<0.187.0> terminating
** (CaseClauseError) no case clause matching: {-576460751509, #Reference<0.4040485667.3719692292.85506>}
    (db_connection) lib/db_connection/connection_pool.ex:155: DBConnection.ConnectionPool.handle_info/2
    (stdlib) gen_server.erl:616: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:686: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message: {:timeout, #Reference<0.4040485667.3719561220.85547>, {-576460750509, -576460751509}}
State: {:ready, #Reference<0.4040485667.3719692292.85497>, %{delay: 0, idle: #Reference<0.4040485667.3719561220.85548>, idle_interval: 1000, interval: 1000, next: -576460750509, poll: #Reference<0.4040485667.3719561220.85547>, slow: false, target: 50}}

@fishcakez (Member Author)

Oh, there was a subtle bug in the old code that didn't surface until that change.

@michalmuskala (Member)

Here are some benchmarks with raw db_connection, without actually connecting to a database:

https://gist.github.com/michalmuskala/e503d650718d122fbdb478372b83c46a

@michalmuskala (Member)

And here are some with postgres connecting to a real database:

https://gist.github.com/michalmuskala/00d163ddd0c7eff1770b27cd28ad8dfb

@redink commented Apr 3, 2018

Any updates?

@fishcakez (Member Author)

I've moved to a raw message instead of a GenServer call. This should be ready to merge; work on changes to support the raw message, instead of an option, can follow in later PRs.

def handle_info({:db_connection, from, {:checkout, now, queue?}}, {:busy, queue, _} = busy) do
  case queue? do
    true ->
      :ets.insert(queue, {{now, System.unique_integer(), from}})
Member

What if, instead of unique integers, we held something like a logical clock in the server? Wouldn't it be slightly less overhead?

Member Author

Why would it be less overhead? I think the code is simpler without changing the server's state.

Member

I'm not sure how expensive the unique_integer() call is - that's all

Member Author

The zero-arity one is very efficient, so we should be good. Perhaps if it produces bignums it might make other parts slower. {monotonic_time, unique_integer} is the preferred way to create a unique timestamp; we can move this to the caller if it proves to be cheaper.
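
For illustration, the pattern being referred to (generic names, not the PR's exact code):

# A monotonic timestamp plus System.unique_integer/0 yields a key that is both
# time-ordered and unique, even if two checkouts read the same monotonic time.
now = System.monotonic_time()
key = {now, System.unique_integer()}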

      checkout_holder(holder, from, queue) and :ets.delete(queue, key)
      {:noreply, ready}
    :"$end_of_table" ->
      handle_info(checkout, put_elem(ready, 0, :busy))
Member

I think it would be clearer if we had handle_info(checkout, {:busy, queue, codel}) here. I think the put_elem call is a bit distracting

Member Author

put_elem is faster; we could have helpers for updating the state and inline them.

Member

I don't think the perf difference will be noticeable here 😛

Member Author

Will refactor this part for clarity and avoid some pattern matching.

for {sent, from} <- :ets.select(queue, select_slow) do
  drop(time - sent, from)
end
:ets.select_delete(queue, [{match, guards, [true]}])
Member

I wonder if it would be better to select the full key in the previous pass and then do the deletes by key instead.

Member Author

By better do you mean faster or clearer? I think ETS should be able to optimize this so it's faster: since it's an ordered set, the select_delete becomes O(n), where n is the number of entries to delete, whereas deleting each key would be O(n log(m)). However, the delete-by-key code is likely clearer.
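
For reference, a rough sketch of the delete-by-key alternative discussed here (the match spec and helper names are assumptions, not the PR's code):

# Hypothetical: select the full {sent, unique, from} keys of the slow entries,
# notify each caller, then delete each entry by its key (O(log m) per delete).
for {sent, _unique, from} = key <- :ets.select(queue, select_slow_keys) do
  drop(time - sent, from)
  :ets.delete(queue, key)
end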

@josevalim (Member) commented Apr 25, 2018 via email

@fishcakez (Member Author)

Merging to unblock work on moving ownership to use this pool. We can improve this pool further later on.
