Problem when a memcached server changes IP address #68

Closed
jorcasso opened this issue Dec 21, 2017 · 15 comments · Fixed by #81

@jorcasso

jorcasso commented Dec 21, 2017

We have to configure our client with failureMode=true, and we are getting errors like this when a memcached server goes down and comes back with a different IP:

net.rubyeye.xmemcached.exception.MemcachedException: Session(192.168.1.41:11211) has been closed
	at net.rubyeye.xmemcached.impl.MemcachedConnector.send(MemcachedConnector.java:512)
	at net.rubyeye.xmemcached.XMemcachedClient.sendCommand(XMemcachedClient.java:317)
	at net.rubyeye.xmemcached.XMemcachedClient.fetch0(XMemcachedClient.java:644)
	at net.rubyeye.xmemcached.XMemcachedClient.get0(XMemcachedClient.java:1085)
	at net.rubyeye.xmemcached.XMemcachedClient.get(XMemcachedClient.java:1043)
	at net.rubyeye.xmemcached.XMemcachedClient.get(XMemcachedClient.java:1054)
	at net.rubyeye.xmemcached.XMemcachedClient.get(XMemcachedClient.java:1076)

The old IP was 192.168.1.41 and the new one is 192.168.1.44. When the server recovers with the new IP we can see logs like:

com.google.code.yanf4j.core.impl.AbstractController:? Add a session: 192.168.1.44:11211

However, the client is still using sessions with the old IP that are closed.

I have been debugging a bit and found that the class net.rubyeye.xmemcached.impl.MemcachedConnector has an attribute called sessionMap that contains sessions for both the old and the new IP, because the new sessions do not override the old ones. All of those sessions are then passed to the session locator in the method updateSessions(). I think the session locator should receive only the sessions with the new IP.
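
To make the suspected mechanism concrete, here is a minimal sketch using only the Java standard library. It is not xmemcached's code: the `Session` class, the `openSessions` helper, and the map layout are hypothetical stand-ins, and the assumption is that sessions are keyed by their resolved address. Under that assumption the old and new IPs are different keys, so the closed session for 192.168.1.41 stays in the map unless something filters it out before the locator sees it.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StaleSessionSketch {
    /** Hypothetical stand-in for a connection; not an xmemcached class. */
    static final class Session {
        final InetSocketAddress remote;
        final boolean closed;

        Session(InetSocketAddress remote, boolean closed) {
            this.remote = remote;
            this.closed = closed;
        }
    }

    /** Builds an address from raw bytes, so no DNS lookup is performed. */
    static InetSocketAddress addr(int a, int b, int c, int d) {
        try {
            byte[] raw = {(byte) a, (byte) b, (byte) c, (byte) d};
            return new InetSocketAddress(InetAddress.getByAddress(raw), 11211);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // only thrown for a bad array length
        }
    }

    /** Returns the sessions a locator should see: open ones only. */
    static List<Session> openSessions(Map<InetSocketAddress, Session> sessionMap) {
        List<Session> open = new ArrayList<>();
        for (Session s : sessionMap.values()) {
            if (!s.closed) {
                open.add(s);
            }
        }
        return open;
    }

    /** Replays the scenario from the report and lists the usable addresses. */
    static List<String> demo() {
        Map<InetSocketAddress, Session> sessionMap = new LinkedHashMap<>();
        InetSocketAddress oldAddr = addr(192, 168, 1, 41);
        InetSocketAddress newAddr = addr(192, 168, 1, 44);
        sessionMap.put(oldAddr, new Session(oldAddr, true));  // closed, but never removed
        sessionMap.put(newAddr, new Session(newAddr, false)); // different key, so no override

        List<String> usable = new ArrayList<>();
        for (Session s : openSessions(sessionMap)) {
            usable.add(s.remote.getAddress().getHostAddress() + ":" + s.remote.getPort());
        }
        return usable;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Filtering on the closed flag is only one possible remedy; removing the old entry when the node re-resolves would work as well, and the sketch does not claim either is what the library does.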

Please could you take a look?

@killme2008 killme2008 added the bug label Dec 22, 2017
@killme2008 killme2008 self-assigned this Dec 22, 2017
@killme2008
Owner

I will look into it, thanks.

@flozano

flozano commented Jan 10, 2018

Is there any progress on this issue?

@killme2008
Owner

I am sorry, I haven't had much time for this project lately, but I will try to look into it this weekend.

killme2008 added a commit that referenced this issue Feb 22, 2018
@killme2008
Owner

I've fixed this issue; it happens in failure mode. The new release will be delivered ASAP.

@killme2008 killme2008 modified the milestones: 2.4.1, 2.4.2 Feb 22, 2018
@killme2008
Owner

Released 2.4.2; it may take some time to be synced to the Maven Central repository.

https://github.com/killme2008/xmemcached/releases/tag/xmemcached-2.4.2

@saschat
Contributor

saschat commented Apr 9, 2018

I am not convinced this is fixed. I just tried version 2.4.2 with both failureMode=true and failureMode=false. When I change the DNS record of the server to point to a different IP and then restart the old server (52.57.227.183), the client does not connect to the new server (52.59.110.98). Here is the log output:

252885 [Xmemcached-Reactor-2] ERROR net.rubyeye.xmemcached.impl.MemcachedHandler - XMemcached network layout exception
java.io.IOException: Connection reset by peer
	at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
	at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
	at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:197)
	at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:382)
	at com.google.code.yanf4j.nio.impl.NioTCPSession.readFromBuffer(NioTCPSession.java:209)
	at com.google.code.yanf4j.nio.impl.AbstractNioSession.onRead(AbstractNioSession.java:196)
	at com.google.code.yanf4j.nio.impl.AbstractNioSession.onEvent(AbstractNioSession.java:339)
	at com.google.code.yanf4j.nio.impl.SocketChannelController.dispatchReadEvent(SocketChannelController.java:56)
	at com.google.code.yanf4j.nio.impl.NioController.onRead(NioController.java:159)
	at com.google.code.yanf4j.nio.impl.Reactor.dispatchEvent(Reactor.java:328)
	at com.google.code.yanf4j.nio.impl.Reactor.run(Reactor.java:183)
252913 [Xmemcached-Reactor-2] INFO com.google.code.yanf4j.core.impl.AbstractController - Remove a session: 52.57.227.183:11211
Memcached error during get or set: There is no available connection at this moment
... (2 seconds later)
Memcached error during get or set: There is no available connection at this moment
254920 [Heal-Session-Thread] INFO com.google.code.yanf4j.core.impl.AbstractController - Trying to connect to 52.59.110.98:11211 for 1 times
254944 [Xmemcached-Reactor-0] INFO com.google.code.yanf4j.core.impl.AbstractController - Add a session: 52.59.110.98:11211
254945 [Xmemcached-Reactor-0] WARN com.google.code.yanf4j.core.impl.AbstractController - Memcached node mc3.c1.ec2.test62.memcachier.com/52.57.227.183:11211 is resolved into mc3.c1.ec2.test62.memcachier.com/52.59.110.98:11211.
255014 [Xmemcached-Reactor-3] ERROR net.rubyeye.xmemcached.impl.MemcachedHandler - XMemcached network layout exception
java.io.IOException: Connection reset by peer
	at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
	at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
	at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:197)
	at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:382)
	at com.google.code.yanf4j.nio.impl.NioTCPSession.readFromBuffer(NioTCPSession.java:209)
	at com.google.code.yanf4j.nio.impl.AbstractNioSession.onRead(AbstractNioSession.java:196)
	at com.google.code.yanf4j.nio.impl.AbstractNioSession.onEvent(AbstractNioSession.java:339)
	at com.google.code.yanf4j.nio.impl.SocketChannelController.dispatchReadEvent(SocketChannelController.java:56)
	at com.google.code.yanf4j.nio.impl.NioController.onRead(NioController.java:159)
	at com.google.code.yanf4j.nio.impl.Reactor.dispatchEvent(Reactor.java:328)
	at com.google.code.yanf4j.nio.impl.Reactor.run(Reactor.java:183)
255015 [Xmemcached-Reactor-3] INFO com.google.code.yanf4j.core.impl.AbstractController - Remove a session: 52.59.110.98:11211

Note: I use the binary protocol with authentication. Could it be that XMemcached does not authenticate with the new server?

@killme2008
Owner

@saschat Have you made sure the DNS changes took effect on your client machine? Also, the JVM has a DNS cache, which you may have to disable; see:

https://stackoverflow.com/questions/1256556/any-way-to-make-java-honor-the-dns-caching-timeout-ttl
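
As a minimal sketch of the JVM setting that link is about: the positive DNS cache lifetime is controlled by the `networkaddress.cache.ttl` security property, which can be set programmatically with `java.security.Security.setProperty`. The 60-second value and the helper name below are arbitrary choices for illustration, and the call should run early in startup, before the first host name lookup, for it to take effect reliably.

```java
import java.security.Security;

public class DnsCacheTtlSketch {
    /** Sets the JVM-wide TTL, in seconds, for successful DNS lookups. */
    static void setDnsCacheTtlSeconds(int seconds) {
        // "networkaddress.cache.ttl" is a standard java.security property;
        // a value of 0 disables caching and -1 caches forever.
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) {
        setDnsCacheTtlSeconds(60); // illustrative value, not a recommendation
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

The same property can instead be set in the JVM's `java.security` configuration file, which avoids depending on call order inside the application.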

@saschat
Contributor

saschat commented Apr 10, 2018

@killme2008 yes, the DNS changes take effect. See the log message I posted above. It removes the session with IP 52.57.227.183 and then adds a session with IP 52.59.110.98.

@killme2008
Owner

@saschat It looks like the client resolved the server to the new IP address and connected to it, but the connection was lost again. If you suspect an authentication problem, can you try it without authentication?

@saschat
Contributor

saschat commented Apr 10, 2018

That will be difficult. I work for MemCachier and we do not operate servers without authentication.

Note that it is not an authentication problem in the sense that the credentials do not work. If I restart the app it connects (and authenticates) to the new server. I think that maybe when XMemcached is trying to heal the session it does not authenticate or has a different problem regarding authentication.

The reason I say that is because when I connect to the server the first time I get a log message that it authenticated before it adds the session. Here is the log message for a normal operation:

284 [Thread-5] INFO net.rubyeye.xmemcached.auth.AuthTask - Authentication to mc1.c1.ec2.test62.memcachier.com/52.57.227.183:11211 successfully
284 [Xmemcached-Reactor-0] INFO com.google.code.yanf4j.core.impl.AbstractController - Add a session: 52.57.227.183:11211

But in the logs I posted earlier there is no message regarding authentication.
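
The hypothesis above can be sketched as follows. This is purely illustrative and does not describe xmemcached's internals: `Connector`, `establish`, `connect`, and `heal` are hypothetical names. The idea is that if the first connect and the healing reconnect share one setup path, authentication cannot be skipped on either of them, and each "add session" event is preceded by an "auth" event for the same address.

```java
import java.util.ArrayList;
import java.util.List;

public class ReconnectAuthSketch {
    /** Hypothetical connector; none of these names are xmemcached APIs. */
    static final class Connector {
        final List<String> events = new ArrayList<>();

        /** Single setup path shared by the first connect and by healing. */
        private void establish(String address) {
            events.add("auth " + address);        // authenticate first
            events.add("add session " + address); // only then expose the session
        }

        void connect(String address) {
            establish(address);
        }

        void heal(String address) {
            establish(address); // same path, so auth also runs on reconnect
        }
    }

    /** Connects to the old address, then heals to the new one. */
    static List<String> demo() {
        Connector connector = new Connector();
        connector.connect("52.57.227.183:11211");
        connector.heal("52.59.110.98:11211");
        return connector.events;
    }

    public static void main(String[] args) {
        for (String event : demo()) {
            System.out.println(event);
        }
    }
}
```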

@killme2008
Owner

@saschat All right, I will try to reproduce it on my machine this weekend or later, but it may take some time. I am really busy these days, sorry.

@killme2008 killme2008 reopened this Apr 10, 2018
@saschat
Contributor

saschat commented Apr 10, 2018

@killme2008 Thanks!

I will also look into it and if I figure it out I will submit a PR.

@saschat
Contributor

saschat commented Apr 11, 2018

@killme2008 I found the bug. I will try to find a good way to solve it and submit a PR.

@killme2008
Owner

@saschat Great! You are welcome to submit a PR.

Raiv added a commit to Raiv/xmemcached that referenced this issue Mar 4, 2021
@Raiv
Contributor

Raiv commented Mar 4, 2021

This issue appears again due to a minor bug in initialization order (see my PR #129).

killme2008 added a commit that referenced this issue Sep 22, 2021