-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Smarter hibernate #103
Smarter hibernate #103
Conversation
alias TestPool, as: P | ||
alias TestAgent, as: A | ||
|
||
@tag :idle_hibernate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should have the :idle_timeout
tag because it relies on idle timeout being active. The tags are used for test filtering, so if a pool doesn't support idle timeout it can't hibernate after an idle timeout. If there is a custom pool that needs this tag then we can leave it on. I think we should also move this test to the idle_test.exs file.
opts = [agent: agent, parent: self(), idle_timeout: 50, idle_hibernate: true] | ||
{:ok, pool} = P.start_link(opts) | ||
assert_receive {:hi, conn} | ||
assert_receive {:pong, ^conn} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can test the process hibernates in someway.
integration_test/tests.exs
Outdated
@@ -13,3 +13,4 @@ Code.require_file "cases/queue_test.exs", __DIR__ | |||
Code.require_file "cases/stream_test.exs", __DIR__ | |||
Code.require_file "cases/transaction_execute_test.exs", __DIR__ | |||
Code.require_file "cases/transaction_test.exs", __DIR__ | |||
Code.require_file "cases/hibernate_test.exs", __DIR__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alphabetically please because once we had a test file that wasn't being tested and that makes it easier to locate ;)
@@ -153,7 +155,7 @@ defmodule DBConnection.Connection do | |||
end) | |||
done_lock(regulator, lock) | |||
{timeout, backoff} = Backoff.backoff(backoff) | |||
{:backoff, timeout, %{s | lock: nil, backoff: backoff}} | |||
maybe_hibernate(idle_hibernate, {:backoff, timeout, %{s | lock: nil, backoff: backoff}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should test this part with the other backoff handling
I added a way for test the hibernate, I have no idea but only this one. If you have better one, give me help please. Thanks. |
@redink we could read the memory in the callback and send it in the message to the test process and then in a recursive loop wait for it to go down. Alternatively in a similar recursive loop we could wait for the We should consider limiting the loop by a time limit using |
{:ok, _} = P.start_link(opts) | ||
assert_receive {:error, conn} | ||
assert_receive {:hi, ^conn} | ||
|
||
if test_hibernate? and Mix.env() != :sojourn do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the :sojourn
check is required then it means we are missing some code for the sojourn pool. We shouldn't have a Mix.env
call, this context should be handled by tags or config.
We are missing the hibernate at this location:
db_connection/lib/db_connection/connection.ex
Line 544 in 03ea53f
continue_ask(%{s | idle_time: 0, state: state}) |
ping
succeeds and the :sbroker
async requests has been sent).
lib/db_connection/connection.ex
Outdated
{:noreply, s, idle_timeout} | ||
defp handle_timeout(%{client: nil, idle_timeout: idle_timeout, | ||
idle_hibernate: idle_hibernate} = s) do | ||
Process.send_after(self(), :timeout, idle_timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use a timer then we need to be careful to cancel the timer and/or have the timeout message uniquely identifiable (include reference) because the :timeout
could arrive late and trigger early timeout and out of sync timeouts.
This module includes few examples of :erlang.start_timer
to do this in similar way. Given timers have better behavior here (timers give stricter timeout than a timeout return value) its a good change. This might complicate code, I think that might be why didn't implemented like that originally.
Seems like we really need gen_statem
here but cant until otp 20+.
{:ok, _} = P.start_link(opts) | ||
assert_receive {:error, conn} | ||
assert_receive {:hi, ^conn} | ||
|
||
if test_hibernate? and Mix.env() != :sojourn do | ||
assert {:current_function, {:erlang, :hibernate, 3}} == | ||
Process.info(conn, :current_function) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may have a race condition as we can't guarantee the process has hibernated because of the async nature of processes. However we can leave this until its observed.
|
||
opts = [agent: agent, parent: self(), backoff_min: 10] | ||
defp execute_test_backoff_after_failed(agent, opts, test_hibernate? \\ false) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If theres enough logic to share a function we could try to simplify the test when hibenating and not share function. We might have overlapping tests.
lib/db_connection/connection.ex
Outdated
hibernate_timer: hibernate_timer} = s) do | ||
cancel_timer(hibernate_timer) | ||
hibernate_timer = Process.send_after(self(), :timeout, idle_timeout) | ||
# hibernate_timer = start_timer(self(), idle_timeout, :trigger_hibernate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NEED HELP !
The test will failed if I use start_timer
, but it works well if I use Process.send_after/3
.
The error information:
1) test ping after idle timeout using hibernate (TestIdle)
integration_test/cases/idle_test.exs:15
No message matching {:pong, ^conn} after 500ms.
The following variables were pinned:
conn = #PID<0.561.0>
The process mailbox is empty.
code: execute_test_case(agent, opts, true)
stacktrace:
integration_test/cases/idle_test.exs:53: TestIdle.execute_test_case/3
integration_test/cases/idle_test.exs:18: (test)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the timer arrives we need to call the logic that is currently done by handle_info(:timeout, ...)
because the timer is replacing that timeout. The reason it fails is because we don't call apply(mod, :ping, [state])
anymore.
lib/db_connection/connection.ex
Outdated
|
||
defp cancel_timer(nil), do: :ok | ||
defp cancel_timer(timer) do | ||
case :erlang.cancel_timer(timer) do | ||
false -> flush_timer(timer) | ||
false -> :erlang.read_timer(timer) and flush_timer(timer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read_timer
will always be false
here so wont flush, we don't need this part.
lib/db_connection/connection.ex
Outdated
@@ -593,11 +605,15 @@ defmodule DBConnection.Connection do | |||
defp start_timer(pid, timeout) do | |||
:erlang.start_timer(timeout, self(), {__MODULE__, pid, timeout}) | |||
end | |||
defp start_timer(_, :infinity, _), do: nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets reuse the existing timer function by calling start_timer(:idle, timeout)
so the message carries some information!
lib/db_connection/connection.ex
Outdated
@@ -606,6 +622,8 @@ defmodule DBConnection.Connection do | |||
receive do | |||
{:timeout, ^timer, {__MODULE__, _, _}} -> | |||
:ok | |||
{:timeout, ^timer, :trigger_hibernate} -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't need this clause see above!
lib/db_connection/connection.ex
Outdated
@@ -109,6 +109,7 @@ defmodule DBConnection.Connection do | |||
|
|||
s = %{mod: mod, opts: opts, state: nil, client: :closed, broker: broker, | |||
regulator: regulator, lock: nil, queue: queue, timer: nil, | |||
hibernate_timer: nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use idle_timer
as the timer is for idle and not hibernating, but hibernate is a possible side effect of going idle.
@@ -10,6 +10,7 @@ defmodule BackoffTest do | |||
execute_test_backoff_after_failed(agent, opts) | |||
end | |||
|
|||
@tag :idle_hibernate_backoff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not require this tag, all pools should support this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually lets keep this for now, ignore the above comment.
integration_test/cases/idle_test.exs
Outdated
@@ -11,7 +11,7 @@ defmodule TestIdle do | |||
execute_test_case(agent, opts) | |||
end | |||
|
|||
@tag :idle_timeout | |||
@tag :idle_hibernate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should support this for all pools that can do idle_timeout
, I left a comment on how to achieve this previously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually lets keep this for now, ignore the above comment.
lib/db_connection/connection.ex
Outdated
hibernate_timer: hibernate_timer} = s) do | ||
cancel_timer(hibernate_timer) | ||
hibernate_timer = Process.send_after(self(), :timeout, idle_timeout) | ||
# hibernate_timer = start_timer(self(), idle_timeout, :trigger_hibernate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the timer arrives we need to call the logic that is currently done by handle_info(:timeout, ...)
because the timer is replacing that timeout. The reason it fails is because we don't call apply(mod, :ping, [state])
anymore.
lib/db_connection/connection.ex
Outdated
@@ -380,7 +386,7 @@ defmodule DBConnection.Connection do | |||
do_handle_info(msg, s) | |||
end | |||
def handle_info(:timeout, %{idle_hibernate: idle_hibernate} = s) do | |||
maybe_hibernate(idle_hibernate, {:noreply, s}) | |||
maybe_hibernate(idle_hibernate, {:noreply, %{s | hibernate_timer: nil}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should never reach this handle_info/2
clause. I think we can delete this clause.
lib/db_connection/connection.ex
Outdated
@@ -355,6 +356,11 @@ defmodule DBConnection.Connection do | |||
end | |||
end | |||
|
|||
def handle_info({:timeout, timer, :trigger_hibernate}, | |||
%{idle_hibernate: idle_hibernate} = s) when is_reference(timer) do | |||
maybe_hibernate(idle_hibernate, {:noreply, %{s | hibernate_timer: nil}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't hibernate here, first we need to call the current logic in handle_info(:timeout, ..)
that follows this clause: https://github.com/elixir-ecto/db_connection/pull/103/files/5b3c7cbd2333f15b8ca30bd06bcd9533820c2db0..e1590494cd81e376261aa7d9592a709c13028e85#diff-794f06ecb2c5f02e8a11e735c3ee2339R364
@redink I left some comments: the main takeaway is that the timer is for |
|
@redink the last changes look good. It might be that |
Hello, Any updates ? Or any more comments ? |
@redink there is a performance regression so need to look in it. |
Closing in favor of #108. |
from #101