ConnectionError: ('Connection aborted.', BadStatusLine("''",)) – OpenStack

I’ve been working through the OSP10 -> OSP13 fast-forward upgrade process in my home lab. Unfortunately, I kept running into this error while preparing for the upgrade:

Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. 
Execution ID: 3fb62e82-9025-4057-a5ff-8e2189e42a99
Plan updated.
Processing templates in the directory /tmp/tripleoclient-rjkHaX/tripleo-heat-templates
Unable to establish connection to https://192.168.0.162:13989/v2/action_executions: ('Connection aborted.', BadStatusLine("''",))

This error arises from httplib in the Python standard library. In short, it’s telling you that the remote end closed the connection without sending a valid HTTP status line. In this case the status line it read was an empty string, meaning nothing came back at all.
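To see the same failure outside of OpenStack, here is a minimal sketch using only the standard library. It is an illustration, not anything taken from the TripleO client: the helper names are made up, and it uses Python 3, where httplib has become http.client and the exception raised is RemoteDisconnected, a subclass of BadStatusLine. A throwaway local server accepts one connection, reads the request, and closes the socket without replying, which is roughly what the client sees when the proxy drops the connection.

```python
import http.client
import socket
import threading


def close_without_reply(listener):
    """Accept one connection, read the request, then close without answering."""
    conn, _ = listener.accept()
    conn.recv(4096)  # consume the request so the close is a clean shutdown
    conn.close()


def fetch_from_silent_server():
    """Return the exception http.client raises when no status line arrives."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=close_without_reply, args=(listener,))
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/v2/action_executions")
        client.getresponse()  # fails: the peer closed before sending anything
    except http.client.BadStatusLine as exc:
        return exc
    finally:
        client.close()
        worker.join()
        listener.close()
    return None
```

Calling fetch_from_silent_server() should hand back a BadStatusLine instance, the same class of error that the requests library wraps into the ‘Connection aborted.’ message above.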

The culprit in this instance is haproxy and its timeout for the server side of the connection. Here, the Mistral API is waiting for a response to a RabbitMQ message it passed to the Mistral engine. Because this is a virtual undercloud running on a heavily oversubscribed KVM host, the Mistral engine takes longer than the haproxy server timeout to finish and return a response to the Mistral API.

haproxy then times out waiting to receive data from the server and so kills the client connection as well. The client interprets this as ‘Connection aborted’.

The solution: increase the server timeout in /etc/haproxy/haproxy.cfg. It defaults to 2m; on my KVM-based deployment things take longer to run, so I’ve raised it to 5m:

defaults 
 log  global
 maxconn  4096
 mode  tcp
 retries  3
 timeout  http-request 10s
 timeout  queue 2m
 timeout  connect 10s
 timeout  client 2m
 timeout  server 5m
 timeout  check 10s

Reload the haproxy service (systemctl reload haproxy) to make the config change take effect. haproxy will now wait five minutes for the server side of a connection to return data to the client before terminating the connection.
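If you want to double-check what the defaults section actually contains after editing, a small script can pull the timeouts out of the file. This is only a rough sketch: defaults_timeouts is a hypothetical helper, it recognises just the common section keywords, and it ignores the fact that individual listen, frontend, or backend sections can override these values.

```python
# Section keywords this sketch recognises; haproxy has others that are ignored here.
SECTION_KEYWORDS = {"global", "defaults", "frontend", "backend", "listen"}


def defaults_timeouts(config_text):
    """Return {timeout name: value} for 'timeout' lines in defaults sections."""
    timeouts = {}
    in_defaults = False
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        words = line.split()
        if words[0] in SECTION_KEYWORDS:
            # A new section starts; only 'defaults' is of interest.
            in_defaults = words[0] == "defaults"
            continue
        if in_defaults and words[0] == "timeout" and len(words) >= 3:
            timeouts[words[1]] = words[2]
    return timeouts


# A few lines from the config shown earlier in this post.
SAMPLE = """
defaults
 mode  tcp
 timeout  client 2m
 timeout  server 5m
"""

print(defaults_timeouts(SAMPLE)["server"])  # prints 5m
```

To run it against the real file, read /etc/haproxy/haproxy.cfg and pass its contents in place of SAMPLE.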

You may also need to increase the DEFAULT/rpc_response_timeout parameter in /etc/mistral/mistral.conf; mine is set to 240 seconds (default is 60 seconds).
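The Mistral setting can be checked in a similar way. The sketch below reads it with Python’s configparser rather than oslo.config (which is what Mistral itself uses), so treat it as an approximation: rpc_response_timeout here is a hypothetical helper, and the fallback of 60 is simply the default quoted above.

```python
import configparser


def rpc_response_timeout(conf_text, default=60):
    """Return DEFAULT/rpc_response_timeout in seconds from mistral.conf text."""
    # strict=False tolerates repeated options; interpolation=None leaves any
    # '%' characters elsewhere in the file alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return int(parser.defaults().get("rpc_response_timeout", default))


# Example contents matching the value used in this post.
SAMPLE = """
[DEFAULT]
rpc_response_timeout = 240
"""

print(rpc_response_timeout(SAMPLE))  # prints 240
```

For the real file, pass in the contents of /etc/mistral/mistral.conf instead of SAMPLE.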

The result:

Started Mistral Workflow tripleo.package_update.v1.package_update_plan. Execution ID: 1dafecc3-aa65-4ad9-b9fb-5fd74ef2c463
Waiting for messages on queue 'tripleo' with no timeout.
2019-01-29 10:27:19Z [DeploymentServerBlacklistDict]: CREATE_COMPLETE  state changed
2019-01-29 10:27:20Z [overcloud-ServiceNetMap-a45lzxobzujw]: UPDATE_IN_PROGRESS  Stack UPDATE started
2019-01-29 10:27:20Z [RabbitCookie]: UPDATE_IN_PROGRESS  state changed
2019-01-29 10:27:20Z [RabbitCookie]: UPDATE_COMPLETE  state changed
