Keystone, LDAP domains, and “An Error Occurred Authenticating”

When integrating LDAP with OpenStack Keystone, you might see the following error when you attempt to sign in through Horizon:

“An error occurred authenticating. Please try again later”

You will also see HTTP 500 response codes when attempting to use the CLI.

This has bitten me a couple of times. Three things to check:

  • If using LDAPS, is the CA for the LDAP server present in the Keystone container? If not, make sure it’s part of the CAMap that is copied onto the host.
  • Does the password for Keystone’s LDAP user (i.e. the one it binds with to conduct searches) contain any $ symbols? Oslo Config treats these as substitution variables, so when it attempts to read the password from /etc/keystone/domains/keystone.<domain>.conf it will raise an exception. Escape any $ symbols in your TripleO template like so: “pa\\$sw0rd”. Note: an unescaped dollar sign breaks authentication for every domain, so check this even if you aren’t attempting to sign in to an LDAP-backed domain.
  • Are the credentials for the Keystone user correct? Attempt an authenticated bind, similar to the one below, to be sure. -W prompts for the password, -D specifies the distinguished name you are binding with, -H gives the host with protocol, and the search filter comes at the end:
ldapsearch -W -H ldaps://my.ldap.host -D "uid=openstack,cn=users,cn=accounts,dc=my,dc=ldap,dc=host" "uid=someuser"
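
To illustrate the second check, here is a minimal Python sketch. It is not Oslo Config’s own code: it uses the standard library’s string.Template, which follows the same PEP 292 style of $name substitution that, as far as I know, Oslo Config’s interpolation is modelled on. The helper names has_unescaped_dollar and pep292_expand are hypothetical, and the regex is only a rough heuristic for spotting a risky password before it goes into a template.

```python
import re
from string import Template

# Rough heuristic: a "$" that is neither preceded by a backslash
# nor part of a doubled "$$" pair.
_UNESCAPED_DOLLAR = re.compile(r"(?<![\\$])\$(?!\$)")


def has_unescaped_dollar(value):
    """Return True if the value contains a "$" that looks unescaped."""
    return _UNESCAPED_DOLLAR.search(value) is not None


def pep292_expand(value):
    """Expand a value with stdlib PEP 292 substitution and no variables defined.

    This only mimics the class of failure described above; the real
    exception raised by Oslo Config will have a different type.
    """
    try:
        return Template(value).substitute({})
    except (KeyError, ValueError) as exc:
        return "error: " + type(exc).__name__


if __name__ == "__main__":
    for candidate in ("pa$sw0rd", "pa$$sw0rd", "pa\\$sw0rd"):
        print(repr(candidate), has_unescaped_dollar(candidate))
```

Note that string.Template only understands the doubled “$$” form of escaping. The backslash form shown in the bullet above is what worked for me with Oslo Config, so the sketch treats both as escaped.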

OpenStack Queens split control plane gotcha #3 – split OVN services

So here’s the third gotcha I’ve run into. For background, I’m trying to deploy a ‘split’ OpenStack control plane in OpenStack Queens using ML2+OVN – 3x nodes running all Pacemaker-managed services, and 3x nodes running the non-Pacemaker services (e.g. Keystone, Neutron Server, etc).

The problem is that the Neutron API service – which deploys the neutron-server-ovn image – requires the NeutronCorePlugin service. This in turn runs a puppet manifest that will fail because of a missing piece of hieradata – ovn::northbound::port.

neutron-server-ovn (the OS::TripleO::Services::NeutronAPI service) needs the northbound port in order to set up the logical flows in the northbound DB. When running in a monolithic controller model this is already available, thanks to the OS::TripleO::Services::OVNDBs service.

In a split model OVNDBs is split away from NeutronAPI, necessitating a change.

Read more “OpenStack Queens split control plane gotcha #3 – split OVN services”

OpenStack Queens, Split Deployment Gotcha #2: role ordering for split control plane

The previous post described how, without the ‘primary’ and ‘controller’ tags on the role running haproxy, your overcloud_endpoint.pem certificate file is never copied to those nodes, causing haproxy startup to fail.

This post documents a second gotcha – the ordering of your split roles in roles_data.yaml determines whether some of your bootstrap tasks are run.

Read more “OpenStack Queens, Split Deployment Gotcha #2: role ordering for split control plane”

OpenStack role tags: ‘primary’ and ‘controller’

You’ll see these tags in the roles_data.yaml file and might be wondering what they’re for. This post answers that question, but also outlines a ‘gotcha’ where the NodeTLSData resource will not be created for a role if that role does not have the primary and controller tags set.

This applies to OpenStack Queens – in Rocky the NodeTLSData resource was changed to use Ansible for deployment of the public TLS certificate, and therefore this restriction doesn’t apply anymore.

Read more “OpenStack role tags: ‘primary’ and ‘controller’”

Applying TLS Everywhere to an existing OpenStack 13 (Queens) cloud

TLS-Everywhere was introduced in the Queens cycle to provide TLS security over pretty much all communication paths within OpenStack. Not just the public endpoints – that’s been present for a while – but also the internal endpoints, admin endpoints, RabbitMQ bus and Galera replication/connections too.

Unfortunately, out of the box you cannot apply the TLS everywhere environment files on an existing OSP13 cloud and expect it to just work. The TLS everywhere feature in Queens, and indeed Rocky, is based on the assumption that you are deploying a fresh cloud.

After some work over the last few days with some colleagues, there’s a solution to applying TLS-everywhere retrospectively on an OSP13 deployment. But be warned: it’s messy.

Read more “Applying TLS Everywhere to an existing OpenStack 13 (Queens) cloud”

OpenStack, Identity Management/IPA and TLS-Everywhere

novajoin is a WSGI application that serves dynamic vendordata to overcloud nodes (and instances, if you wish) via the cloud-init process. Its purpose is to register a host to an IPA server and create any necessary services in IPA so certificates for them can be created on the hosts.

The purpose of this post is to describe how the “TLS Everywhere” functionality in OSP13 onwards operates. In particular, I wanted to answer these questions:

  • What does novajoin do?
  • How and when does novajoin register hosts in IPA?
  • What changes does novajoin make to IPA?
  • When does the host enrol to IPA, and how does it get its configuration?
Read more “OpenStack, Identity Management/IPA and TLS-Everywhere”

ConnectionError: ('Connection aborted.', BadStatusLine("''",)) – OpenStack

I’m attempting to run through the OSP10 -> OSP13 fast-forward upgrade process on my home lab. Unfortunately I kept running into an error when prepping for the fast forward upgrade:

Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. 
Execution ID: 3fb62e82-9025-4057-a5ff-8e2189e42a99
Plan updated.
Processing templates in the directory /tmp/tripleoclient-rjkHaX/tripleo-heat-templates
Unable to establish connection to https://192.168.0.162:13989/v2/action_executions: ('Connection aborted.', BadStatusLine("''",))

This error arises from httplib in the Python standard library. In short, it’s telling you that the remote end of the connection terminated without sending an HTTP status code. In this case it’s reporting an empty string.
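
To make that concrete, here is a small self-contained sketch that reproduces the same condition. It is an illustration only, written for Python 3, where httplib became http.client. The local server simply hangs up after reading the request; it stands in for whatever service closed the connection, and the function names are hypothetical.

```python
import http.client
import socket
import threading


def serve_and_hang_up(listener):
    """Accept one connection, read the request headers, then close silently."""
    conn, _ = listener.accept()
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    conn.close()  # no status line is ever sent back


def request_against_silent_server():
    """Return the name of the exception raised by the client, or None."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(
        target=serve_and_hang_up, args=(listener,), daemon=True
    )
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/v2/action_executions")
        client.getresponse()
    except http.client.BadStatusLine as exc:
        # On Python 3 this is RemoteDisconnected, a BadStatusLine subclass.
        return type(exc).__name__
    finally:
        client.close()
        worker.join(timeout=5)
        listener.close()
    return None


if __name__ == "__main__":
    print(request_against_silent_server())
```

The client side did nothing wrong here, which is the point: when you see this error, the place to look is the logs of the service on the other end of the connection.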

Read more “ConnectionError: ('Connection aborted.', BadStatusLine("''",)) – OpenStack”

Neutron security groups and OVS, Part 3: tracing OVS’s hooks and claws…

So far we’ve looked at:

  • What tap interfaces are and why the VMs require them for network connectivity. (Part 1)
  • How security groups are implemented as iptables rules. (Part 2)
  • The implementation detail that iptables is just a front-end to the netfilter framework within the kernel, a framework that operates at layer 3.

None of that explains why we need the Linux bridge in the middle, however.

Read more “Neutron security groups and OVS, Part 3: tracing OVS’s hooks and claws…”

Neutron security groups and OVS, part 2: security groups implementation

Security groups provide IP traffic filtering for your VM instances. You can specify ingress and egress rules and filter traffic based on port, source address, destination address, etc. Here’s a shot from my lab, with some basic security group rules assigned to my demo project:

Under the hood, when using the iptables_hybrid firewall driver, these are all implemented as iptables rules on every compute node where an instance is running with this security group assigned.

Read more “Neutron security groups and OVS, part 2: security groups implementation”

Neutron security groups and OVS, part 1: tap interfaces and VM connectivity

Today I did a little digging into the implementation of security groups when using Open vSwitch. In particular, I was curious about this: why is it that security groups require the creation of a Linux bridge on the compute node? Why can’t we just attach the VM directly to the OVS integration bridge (br-int) and set iptables rules on the VM interface like we otherwise would?

This applies when using the iptables_hybrid firewall driver for Neutron with the ML2+OVS subsystem. If you use the openvswitch firewall driver, these firewall rules are implemented entirely as OpenFlow rules that use the conntrack module in the kernel.

This was originally going to be one post but I ended up rambling on for so long I opted to split it into a few related posts. This is the first!

Read more “Neutron security groups and OVS, part 1: tap interfaces and VM connectivity”