
ReconnectingClient prints "Lost connection" for old nodes when another one is added - while requests go through just fine #119

@alexeyOnGitHub


I assume this is a logging issue rather than a bug that affects client behavior.
I see it regularly:

Say I boot up two Memcached nodes, A and B. They are used by several app nodes (3 in my latest fault + perf test).
The total traffic sent to Memcached from these app nodes is 3000 RPS.
Then I boot up one more Memcached node (C) and add it to the load balancer.
We use a 3-second DNS refresh period in the Memcached client settings, so the client notices the change in the Memcached node list pretty quickly.

I see "Successfully connected" and "client connected: BinaryMemcacheClient(com.spotify.folsom.ketama.SrvKetamaClient" in the logs, so that part is fine.

But even though the Memcached client keeps working fine (I can see requests in the Memcached tracking dashboards, and there are no timeouts in the app), I see these worrying messages in the logs about 60 seconds after the client picks up the new node. The messages refer to the original Memcached nodes A and B, but not to C.

INFO [2019-02-12 14:33:10,641] [folsom-default-scheduled-executor] [com.spotify.folsom.reconnect.ReconnectingClient] [] - Lost connection to A:11211
INFO [2019-02-12 14:33:10,641] [folsom-default-scheduled-executor] [com.spotify.folsom.reconnect.ReconnectingClient] [] - Lost connection to A:11211
INFO [2019-02-12 14:33:10,643] [folsom-default-scheduled-executor] [com.spotify.folsom.reconnect.ReconnectingClient] [] - Lost connection to B:11211
INFO [2019-02-12 14:33:10,643] [folsom-default-scheduled-executor] [com.spotify.folsom.reconnect.ReconnectingClient] [] - Lost connection to B:11211

We use 2 connections per host, which is why there are two "Lost connection" lines per node.

In other words, ReconnectingClient prints "Lost connection" for the old nodes after its node list was refreshed and another node was picked up.

I am guessing there is no problem with the current client connections and that these messages are delayed cleanup messages from the old connections. If that is the case, it would be nice to make the logging more explicit. "Lost connection" looks scary.
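
To make the request concrete, here is a minimal sketch of what more explicit logging could look like. It assumes the guess above is right, that is, that the old connections are closed on purpose after the node list refresh. Every name in it (DisconnectLogSketch, DisconnectReason, Connection, retire, onDisconnect) is hypothetical and is not taken from Folsom's code or API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; none of these types exist in Folsom. */
final class DisconnectLogSketch {

  /** Why a connection went away. */
  enum DisconnectReason { RETIRED_AFTER_REFRESH, UNEXPECTED }

  /** Builds a log line that says whether the close was intentional. */
  static String describe(String address, DisconnectReason reason) {
    if (reason == DisconnectReason.RETIRED_AFTER_REFRESH) {
      return "Closed retired connection to " + address
          + " (replaced after node list refresh)";
    }
    return "Lost connection to " + address;
  }

  /** One connection to one node; remembers whether it was retired on purpose. */
  static final class Connection {
    final String address;
    private boolean retired;

    Connection(String address) {
      this.address = address;
    }

    /** Marks the connection as intentionally replaced by a newer one. */
    void retire() {
      retired = true;
    }

    /** Returns the line that would be logged when this connection closes. */
    String onDisconnect() {
      return describe(address,
          retired ? DisconnectReason.RETIRED_AFTER_REFRESH : DisconnectReason.UNEXPECTED);
    }
  }

  /** Simulates the cleanup after a refresh: retire each old connection, then close it. */
  static List<String> retireAndClose(List<Connection> oldConnections) {
    List<String> lines = new ArrayList<>();
    for (Connection c : oldConnections) {
      c.retire();
      lines.add(c.onDisconnect());
    }
    return lines;
  }

  public static void main(String[] args) {
    // Two connections per host, as in the setup described in this issue.
    List<Connection> old = Arrays.asList(
        new Connection("A:11211"), new Connection("A:11211"),
        new Connection("B:11211"), new Connection("B:11211"));
    for (String line : retireAndClose(old)) {
      System.out.println("INFO  " + line);
    }
    // A connection that drops without being retired keeps the alarming wording.
    System.out.println("WARN  " + new Connection("C:11211").onDisconnect());
  }
}
```

With wording like this, the four lines for A and B would read as routine cleanup, and "Lost connection" would be reserved for drops nobody asked for.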
