Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Smarter hibernate #103

Closed
wants to merge 9 commits into from
Closed

Smarter hibernate #103

wants to merge 9 commits into from

Conversation

redink
Copy link

@redink redink commented Nov 20, 2017

from #101

alias TestPool, as: P
alias TestAgent, as: A

@tag :idle_hibernate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should have the :idle_timeout tag because it relies on idle timeout being active. The tags are used for test filtering, so if a pool doesn't support idle timeout it can't hibernate after an idle timeout. If there is a custom pool that needs this tag then we can leave it on. I think we should also move this test to the idle_test.exs file.

opts = [agent: agent, parent: self(), idle_timeout: 50, idle_hibernate: true]
{:ok, pool} = P.start_link(opts)
assert_receive {:hi, conn}
assert_receive {:pong, ^conn}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can test the process hibernates in someway.

@@ -13,3 +13,4 @@ Code.require_file "cases/queue_test.exs", __DIR__
Code.require_file "cases/stream_test.exs", __DIR__
Code.require_file "cases/transaction_execute_test.exs", __DIR__
Code.require_file "cases/transaction_test.exs", __DIR__
Code.require_file "cases/hibernate_test.exs", __DIR__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alphabetically please because once we had a test file that wasn't being tested and that makes it easier to locate ;)

@@ -153,7 +155,7 @@ defmodule DBConnection.Connection do
end)
done_lock(regulator, lock)
{timeout, backoff} = Backoff.backoff(backoff)
{:backoff, timeout, %{s | lock: nil, backoff: backoff}}
maybe_hibernate(idle_hibernate, {:backoff, timeout, %{s | lock: nil, backoff: backoff}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test this part with the other backoff handling

@redink
Copy link
Author

redink commented Nov 21, 2017

I added a way for test the hibernate, I have no idea but only this one.

If you have better one, give me help please. Thanks.

@fishcakez
Copy link
Member

@redink we could read the memory in the callback and send it in the message to the test process and then in a recursive loop wait for it to go down. Alternatively in a similar recursive loop we could wait for the Process.info(self(), current_function) to change to the appropriate function.

We should consider limiting the loop by a time limit using System.monotonic_time. Also we could sleep between each check to allow other processes to run.

{:ok, _} = P.start_link(opts)
assert_receive {:error, conn}
assert_receive {:hi, ^conn}

if test_hibernate? and Mix.env() != :sojourn do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the :sojourn check is required then it means we are missing some code for the sojourn pool. We shouldn't have a Mix.env call, this context should be handled by tags or config.

We are missing the hibernate at this location:

continue_ask(%{s | idle_time: 0, state: state})
(i.e. after ping succeeds and the :sbroker async requests has been sent).

{:noreply, s, idle_timeout}
defp handle_timeout(%{client: nil, idle_timeout: idle_timeout,
idle_hibernate: idle_hibernate} = s) do
Process.send_after(self(), :timeout, idle_timeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use a timer then we need to be careful to cancel the timer and/or have the timeout message uniquely identifiable (include reference) because the :timeout could arrive late and trigger early timeout and out of sync timeouts.

This module includes few examples of :erlang.start_timer to do this in similar way. Given timers have better behavior here (timers give stricter timeout than a timeout return value) its a good change. This might complicate code, I think that might be why didn't implemented like that originally.

Seems like we really need gen_statem here but cant until otp 20+.

{:ok, _} = P.start_link(opts)
assert_receive {:error, conn}
assert_receive {:hi, ^conn}

if test_hibernate? and Mix.env() != :sojourn do
assert {:current_function, {:erlang, :hibernate, 3}} ==
Process.info(conn, :current_function)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may have a race condition as we can't guarantee the process has hibernated because of the async nature of processes. However we can leave this until its observed.


opts = [agent: agent, parent: self(), backoff_min: 10]
defp execute_test_backoff_after_failed(agent, opts, test_hibernate? \\ false) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If theres enough logic to share a function we could try to simplify the test when hibenating and not share function. We might have overlapping tests.

hibernate_timer: hibernate_timer} = s) do
cancel_timer(hibernate_timer)
hibernate_timer = Process.send_after(self(), :timeout, idle_timeout)
# hibernate_timer = start_timer(self(), idle_timeout, :trigger_hibernate)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NEED HELP !

The test will failed if I use start_timer, but it works well if I use Process.send_after/3.

The error information:


  1) test ping after idle timeout using hibernate (TestIdle)
     integration_test/cases/idle_test.exs:15
     No message matching {:pong, ^conn} after 500ms.
     The following variables were pinned:
       conn = #PID<0.561.0>
     The process mailbox is empty.
     code: execute_test_case(agent, opts, true)
     stacktrace:
       integration_test/cases/idle_test.exs:53: TestIdle.execute_test_case/3
       integration_test/cases/idle_test.exs:18: (test)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the timer arrives we need to call the logic that is currently done by handle_info(:timeout, ...) because the timer is replacing that timeout. The reason it fails is because we don't call apply(mod, :ping, [state]) anymore.


defp cancel_timer(nil), do: :ok
defp cancel_timer(timer) do
case :erlang.cancel_timer(timer) do
false -> flush_timer(timer)
false -> :erlang.read_timer(timer) and flush_timer(timer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_timer will always be false here so wont flush, we don't need this part.

@@ -593,11 +605,15 @@ defmodule DBConnection.Connection do
defp start_timer(pid, timeout) do
:erlang.start_timer(timeout, self(), {__MODULE__, pid, timeout})
end
defp start_timer(_, :infinity, _), do: nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets reuse the existing timer function by calling start_timer(:idle, timeout) so the message carries some information!

@@ -606,6 +622,8 @@ defmodule DBConnection.Connection do
receive do
{:timeout, ^timer, {__MODULE__, _, _}} ->
:ok
{:timeout, ^timer, :trigger_hibernate} ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't need this clause see above!

@@ -109,6 +109,7 @@ defmodule DBConnection.Connection do

s = %{mod: mod, opts: opts, state: nil, client: :closed, broker: broker,
regulator: regulator, lock: nil, queue: queue, timer: nil,
hibernate_timer: nil,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use idle_timer as the timer is for idle and not hibernating, but hibernate is a possible side effect of going idle.

@@ -10,6 +10,7 @@ defmodule BackoffTest do
execute_test_backoff_after_failed(agent, opts)
end

@tag :idle_hibernate_backoff
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not require this tag, all pools should support this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually lets keep this for now, ignore the above comment.

@@ -11,7 +11,7 @@ defmodule TestIdle do
execute_test_case(agent, opts)
end

@tag :idle_timeout
@tag :idle_hibernate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should support this for all pools that can do idle_timeout, I left a comment on how to achieve this previously.

Copy link
Member

@fishcakez fishcakez Nov 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually lets keep this for now, ignore the above comment.

hibernate_timer: hibernate_timer} = s) do
cancel_timer(hibernate_timer)
hibernate_timer = Process.send_after(self(), :timeout, idle_timeout)
# hibernate_timer = start_timer(self(), idle_timeout, :trigger_hibernate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the timer arrives we need to call the logic that is currently done by handle_info(:timeout, ...) because the timer is replacing that timeout. The reason it fails is because we don't call apply(mod, :ping, [state]) anymore.

@@ -380,7 +386,7 @@ defmodule DBConnection.Connection do
do_handle_info(msg, s)
end
def handle_info(:timeout, %{idle_hibernate: idle_hibernate} = s) do
maybe_hibernate(idle_hibernate, {:noreply, s})
maybe_hibernate(idle_hibernate, {:noreply, %{s | hibernate_timer: nil}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should never reach this handle_info/2 clause. I think we can delete this clause.

@@ -355,6 +356,11 @@ defmodule DBConnection.Connection do
end
end

def handle_info({:timeout, timer, :trigger_hibernate},
%{idle_hibernate: idle_hibernate} = s) when is_reference(timer) do
maybe_hibernate(idle_hibernate, {:noreply, %{s | hibernate_timer: nil}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fishcakez
Copy link
Member

@redink I left some comments: the main takeaway is that the timer is for idling and for hibernating. The hibernation occurs after the idle handling (calling the ping callback), and it is possible we won't hibernate. The new option is to go idle after we ping or after we fail to connect.

@redink
Copy link
Author

redink commented Nov 21, 2017

sojourn test still have problem, I will work on it later.

@fishcakez
Copy link
Member

@redink the last changes look good. It might be that Sojourn doesn't play well with hibernate because of how the pool works. I suggest we don't implement hibernate for it yet and add the @idle_hibernate tag for sojourn tests.

@redink
Copy link
Author

redink commented Nov 27, 2017

Hello, Any updates ? Or any more comments ?

@fishcakez
Copy link
Member

@redink there is a performance regression so need to look in it.

@fishcakez
Copy link
Member

Closing in favor of #108.

@fishcakez fishcakez closed this Jan 7, 2018
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants