HttpContactPointBootstrap always dead after probing timeout #209

@Roiocam

Explain

During cluster bootstrapping, a child actor is created to handle HTTP probing. This actor uses the probingFailureTimeout setting as its deadline:

/**
 * If probing keeps failing until the deadline triggers, we notify the parent,
 * such that it rediscover again.
 */
private var probingKeepFailingDeadline: Deadline = settings.contactPoint.probingFailureTimeout.fromNow

At the same time, the same probingFailureTimeout setting is also used as the timeout for the probing request's future:

log.debug("Probing [{}] for seed nodes...", probeRequest.uri)
val reply = http.singleRequest(probeRequest, settings = connectionPoolWithoutRetries).flatMap(handleResponse)
val afterTimeout = after(settings.contactPoint.probingFailureTimeout, context.system.scheduler)(replyTimeout)
Future.firstCompletedOf(List(reply, afterTimeout)).pipeTo(self)

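Both the request timeout and the deadline end up in the same failure handler. Because they share one duration, and the deadline is set when the actor is created (before the request is sent), the deadline is already overdue by the time a timed-out probe fails. As you can see, the else branch is therefore never executed after a probing timeout: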

case Status.Failure(cause) =>
  log.warning("Probing [{}] failed due to: {}", probeRequest.uri, cause.getMessage)
  if (probingKeepFailingDeadline.isOverdue()) {
    log.error("Overdue of probing-failure-timeout, stop probing, signaling that it's failed")
    context.parent ! BootstrapCoordinator.Protocol.ProbingFailed(contactPoint, cause)
    context.stop(self)
  } else {
    // keep probing, hoping the request will eventually succeed
    scheduleNextContactPointProbing()
  }
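
To make the timing concrete, here is a minimal sketch that models the timeline with plain durations instead of real actors. The object `ProbeTimeline`, the method `overdueWhenTimeoutFires`, and the `startupDelay` parameter are hypothetical names for illustration only; the sketch assumes the deadline starts counting at actor creation and the request is sent slightly later.

```scala
import scala.concurrent.duration._

object ProbeTimeline {
  /**
   * Models one probe that fails by timing out.
   *
   * @param failureTimeout duration of the deadline (probingFailureTimeout in the issue)
   * @param requestTimeout duration after which the probe request times out
   * @param startupDelay   assumed gap between actor creation and sending the request
   * @return true if the deadline is already overdue when the timeout failure arrives
   */
  def overdueWhenTimeoutFires(
      failureTimeout: FiniteDuration,
      requestTimeout: FiniteDuration,
      startupDelay: FiniteDuration = 1.millis): Boolean = {
    val deadlineAt = failureTimeout              // measured from actor creation (t = 0)
    val failureAt = startupDelay + requestTimeout // when the timeout failure is delivered
    failureAt > deadlineAt                       // same strictness as Deadline.isOverdue()
  }

  def main(args: Array[String]): Unit = {
    // One shared setting: the timeout failure always lands after the deadline.
    println(overdueWhenTimeoutFires(5.seconds, 5.seconds))
    // Two settings (hypothetical): the first timeout still leaves room to retry.
    println(overdueWhenTimeoutFires(10.seconds, 5.seconds))
  }
}
```

Under these assumptions, a shared duration means the first timed-out probe already finds the deadline overdue, while a shorter request timeout leaves time for further probes.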

Discuss

I think we may need two separate settings, one for the deadline and one for the request timeout. Then, when the contact point node suffers from network latency, the HttpContactPointBootstrap actor would not need to be destroyed and recreated so frequently. At least we would have some buffer time.
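
A rough sketch of what splitting the setting could look like, using only the Scala standard library. Everything here is an assumption for discussion: `probeRequestTimeout`, `ContactPointTimeouts`, and `ProbeDecision` are hypothetical names, not existing settings or classes; only `probingFailureTimeout` comes from the snippets above.

```scala
import scala.concurrent.duration._

// Hypothetical settings holder: `probeRequestTimeout` is an invented name for a
// per-request timeout that is separate from the overall failure deadline.
final case class ContactPointTimeouts(
    probingFailureTimeout: FiniteDuration,
    probeRequestTimeout: FiniteDuration) {
  // The request timeout must be shorter, otherwise a timed-out probe
  // would again find the deadline overdue.
  require(
    probeRequestTimeout < probingFailureTimeout,
    "probeRequestTimeout must be shorter than probingFailureTimeout")
}

object ProbeDecision {
  sealed trait Decision
  case object KeepProbing extends Decision
  case object StopAndNotifyParent extends Decision

  // Mirrors the Status.Failure branch: stop only once the deadline has passed.
  def onFailure(deadline: Deadline): Decision =
    if (deadline.isOverdue()) StopAndNotifyParent else KeepProbing
}
```

With, say, a 10 second deadline and a 2 second request timeout, a single slow response would lead to `KeepProbing` rather than stopping the actor, which is the buffer described above.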

wdyt @pjfanning @He-Pin @mdedetrich @samueleresca
