vmware-tanzu-labs/cf-mgmt

SAML with LDAP groups

pcf-user opened this issue · 24 comments

Our vmware Tanzu version: 2.9.X

We use SAML as the "authentication and enterprise SSO" provider and use cf-mgmt in "SAML only" mode --> https://github.com/vmwarepivotallabs/cf-mgmt/blob/master/docs/config/README.md#saml-configuration

So far so good.

Does "SAML with LDAP groups" enable us to use LDAP groups/users to authenticate to "cf api" while the "authentication and enterprise SSO" provider is still SAML? https://github.com/vmwarepivotallabs/cf-mgmt/blob/master/docs/config/README.md#saml-configuration-with-ldap-group-lookups

Got this error when we ran "cf-mgmt update-org-users".

The LDAP credentials are good, and the 'ldapsearch' command with those credentials comes back just fine.

error: LDAP Result Code 200 "Network Error": remote error: tls: handshake failure

Hi @pcf-user!

Does "SAML with LDAP groups" enable us to use LDAP groups/users to authenticate to "cf api" while the "authentication and enterprise SSO" provider is still SAML?

The "SAML with LDAP groups" option means it uses SAML for authentication, but it allows you to define access to orgs/spaces using the users in your LDAP groups, assuming they match the users in your SAML provider. You can control which attribute of the LDAP user matches the SAML username by checking your ldap.yml configuration.

For example, if I have an LDAP group called "app-team", then I can use cf-mgmt-config update-space --org=ORG --space=SPACE --developer-ldap-group=app-team to add app-team to the space-developers LDAP group in my ORG/SPACE/spaceConfig.yml.

The next time the pipeline ran and updated space users, it would ask the configured LDAP who is in the "app-team" group, and then add those users individually as space-developers in CF. It would be the equivalent of going through each member of the ldap group and running cf set-space-role user1@app-team.com ORG SPACE SpaceDeveloper.
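The expansion described above can be pictured with a minimal Go sketch. It is not cf-mgmt's actual implementation: the `roleAssignment` type, the `expandGroup` function, and the member names are hypothetical and only illustrate how one LDAP group turns into one role assignment per member.

```go
package main

import "fmt"

// roleAssignment is a hypothetical record of one user receiving one space role.
type roleAssignment struct {
	User, Org, Space, Role string
}

// expandGroup turns the members of a single LDAP group into per-user role
// assignments, mirroring one `cf set-space-role` call per group member.
func expandGroup(members []string, org, space, role string) []roleAssignment {
	out := make([]roleAssignment, 0, len(members))
	for _, m := range members {
		out = append(out, roleAssignment{User: m, Org: org, Space: space, Role: role})
	}
	return out
}

func main() {
	// Placeholder members; in practice these would come from the LDAP lookup.
	members := []string{"user1@app-team.com", "user2@app-team.com"}
	for _, a := range expandGroup(members, "ORG", "SPACE", "SpaceDeveloper") {
		fmt.Printf("cf set-space-role %s %s %s %s\n", a.User, a.Org, a.Space, a.Role)
	}
}
```

The group itself never exists in CF in this picture; only the individual users and their roles do, which is why the LDAP usernames have to line up with the SAML usernames.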

As for the TLS handshake error, that's a little trickier to debug. My first recommendations would be to double check your ldap.yml to ensure it all looks good and to make sure that any changes to your ldap.yml have been committed and pushed if you're using a Concourse pipeline. It may also be worthwhile to use a tool like yq to parse your ldap.yml to ensure the fields are formatted correctly. A field like the ca-cert can easily get malformed.

You can check out how the LDAP connection is made in case that helps you debug the problem as well.

Hope that helps!

@pcf-user If you set insecure_skip_verify: true, does it work? If so, this suggests there is an issue with the CA chain, and that would help with troubleshooting.

With LOG_LEVEL=debug

with use_tls=false

bash-4.4# cf-mgmt update-org-users
2020/11/04 21:56:18 D1104 21:56:18.656439 42 initialize.go:52] Using Version: [1.0.47], Commit: [1ed639f] of cf-mgmt
2020/11/04 21:56:18 D1104 21:56:18.893669 42 connection.go:68] Connecting to ldap.OUR-DOMAIN.com:636
LDAP Request: (Universal, Constructed, Sequence and Sequence of) Len=72 ""
MessageID: (Universal, Primitive, Integer) Len=1 "1"
Bind Request: (Application, Constructed, 0x00) Len=67 ""
Version: (Universal, Primitive, Integer) Len=1 "3"
User Name: (Universal, Primitive, Octet String) Len=45 "uid=OUR_LDAP_USER,ou=OUR_OU,dc=OUR-DOMAIN,dc=com"
Password: (Context, Primitive, 0x00) Len=15 "PASSWORD"
2020/11/04 21:56:18 flags&startTLS = 0
2020/11/04 21:56:18 1: returning
2020/11/04 21:56:18 1: waiting for response
2020/11/04 21:56:18 Sending message 1
2020/11/04 22:02:18 reader error: unexpected EOF
2020/11/04 22:02:18 Sending quit message and waiting for confirmation
2020/11/04 22:02:18 Shutting down - quit message received
2020/11/04 22:02:18 Closing channel for MessageID 1
2020/11/04 22:02:18 Closing network connection
2020/11/04 22:02:18 1: got response 0x0
error: cannot bind with uid=OUR_LDAP_USER,ou=OUR_OU,dc=OUR-DOMAIN,dc=com: unable to read LDAP response packet: unexpected EOF

with use_tls=true

bash-4.4# cf-mgmt update-org-users
2020/11/04 22:02:06 D1104 22:02:06.371107 25 initialize.go:52] Using Version: [1.0.47], Commit: [1ed639f] of cf-mgmt
2020/11/04 22:02:06 D1104 22:02:06.572658 25 connection.go:68] Connecting to ldap.OUR-DOMAIN.com:636
error: LDAP Result Code 200 "Network Error": remote error: tls: handshake failure

Just a heads up @pcf-user, those screenshots can be pretty easily manipulated to extract some information about your configuration, even with the blacked out bits. I'd highly recommend you remove the posted screenshots and replace them with code blocks that have the text redacted/changed manually. Or make sure that when you do redact the information, you're doing so with a tool that is completely opaque.

We're still looking into the issue and hopefully can propose a solution, just wanted to let you know ASAP!

@ryanmattcollins : Thank you. I removed the screenshots.

So since it's failing with use_tls=false, my guess is there is an issue with the bindDN or bindPassword since the error message is being produced from these lines in the code:

if err = connection.Bind(config.BindDN, config.BindPassword); err != nil {
	connection.Close()
	return nil, fmt.Errorf("cannot bind with %s: %v", config.BindDN, err)
}

I noticed in your ldap.yml there was no bindPassword field though, is it possible you're missing that field?

@ryanmattcollins. We are passing LDAP_PASSWORD from pipeline --> https://github.com/vmwarepivotallabs/cf-mgmt/blob/fc3f42e5ceab28f52df100c9b7c7d25e0105544e/generated/files/pipeline.yml#L242

Also, ldap.yml doesn't have any bindPassword entry --> https://github.com/vmwarepivotallabs/cf-mgmt/blob/master/docs/config/README.md#saml-configuration-with-ldap-group-lookups

It shouldn't have one either if it is to be on Github or some other code hosting sites.

For additional context, you can see the TLS certificate chain that is being presented by the LDAP server with the following command.

openssl s_client -host your.ldap.server -port 636 -showcerts </dev/null 2>/dev/null

Also, since 636 is a TLS endpoint, you need to keep TLS enabled. So the settings to check whether it's just a certificate trust problem would be the following in ldap.yml:

use_tls: true
insecure_skip_verify: true
ldapPort: 636

If the above works, then the problem is that the following field contains incorrect information:

ca_cert: |

If you want to test without TLS, then you need to use port 389 (the typical non-TLS LDAP port) with the following settings:

use_tls: false
ldapPort: 389 # or whatever your non-TLS LDAP port is
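The pairing between port and TLS setting can be sanity-checked before connecting. The Go sketch below is only an illustration: `checkLDAPTransport` is a hypothetical helper, not part of cf-mgmt, and it assumes the conventional ports (636 for LDAP over TLS, 389 for plaintext LDAP), which your server may not follow.

```go
package main

import "fmt"

// checkLDAPTransport returns a warning when the port and TLS setting look
// mismatched under the conventional port assignments, or "" when they agree.
// Assumption: 636 is LDAP over TLS and 389 is plaintext LDAP.
func checkLDAPTransport(port int, useTLS bool) string {
	switch {
	case port == 636 && !useTLS:
		return "port 636 normally expects TLS; a plaintext bind is likely to end in an unexpected EOF"
	case port == 389 && useTLS:
		return "port 389 is normally plaintext; a direct TLS handshake to it is likely to fail"
	default:
		return ""
	}
}

func main() {
	// The combination from the debug log earlier in this thread.
	if w := checkLDAPTransport(636, false); w != "" {
		fmt.Println("warning:", w)
	}
}
```

This matches the earlier debug output, where use_tls=false against port 636 ended with "unexpected EOF" rather than a bind error from the directory.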

OK. I get a cert chain and cert from the SSL command. Should I use that cert or the chain in my ldap.yml?

You can use the entire chain, but you should only really need the root CA part of it. I would start with the entire chain and then remove certificates from the top until it stops working, to see what you really need to complete the chain.

Did this configuration work?

use_tls: true
insecure_skip_verify: true
ldapPort: 636

@calebwashburn

Had set configuration to this and used root and chain certs as seen with the openssl command. Still failing with the same TLS error.
use_tls: true
insecure_skip_verify: true
ldapPort: 636

The only other thing I can think of is that the DNS name you are using to connect is not one of the certificate's subject alternative names.

openssl s_client -host "ldap.yourserver.com" -port 636 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM | openssl x509 -text -noout

Run the command above and make sure the DNS name you connect with is present in the certificate output.

@calebwashburn : We are checking this.

We rechecked all our settings. They all appear to be OK. We set "insecure_skip_verify: true" as well. None of this worked. It is still failing with that TLS error.

We did some additional testing and here is what we found.

With ldapsearch and no cert, works

With python and no cert, works

We used the Go ldap module with Dial() method and it doesn't work

Hi,

Updating this issue. We recreated the ldap connection portion:

package main

import (
	"crypto/tls"
	"log"

	"github.com/go-ldap/ldap"
)

func main() {
	ldapURL := "ldaps://ldapserver.example.com:636"
	l, err := ldap.DialURL(ldapURL, ldap.DialWithTLSConfig(&tls.Config{InsecureSkipVerify: true}))
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()
}

This resulted in output similar to that described above:
LDAP Result Code 200 "Network Error": remote error: tls: handshake failure

Am I missing additional parameters to pass in to the ldap client?

This is the code that cf-mgmt runs, which is slightly different from the above.

package main

import (
	"crypto/tls"
	"fmt"
	"log"

	"github.com/go-ldap/ldap"
)

func main() {
	ldapURL := "ldap.yourdomain.com:636"
	connection, err := ldap.DialTLS("tcp", ldapURL, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		log.Fatal(err)
	} else {
		fmt.Println("Connection is successful")
	}
	if connection != nil {
		connection.Close()
	}
}

To test that you have the right certificate, you can do this:

package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"log"

	"github.com/go-ldap/ldap"
)

var CACert string = `
-----BEGIN CERTIFICATE-----
<the root ca>
-----END CERTIFICATE-----
`

func main() {
	host := "ldap.yourserver.com"
	ldapURL := fmt.Sprintf("%s:636", host)
	rootCAs, _ := x509.SystemCertPool()
	if rootCAs == nil {
		rootCAs = x509.NewCertPool()
	}

	// Append our cert to the system pool
	if ok := rootCAs.AppendCertsFromPEM([]byte(CACert)); !ok {
		log.Println("No certs appended, using system certs only")
	}

	// Trust the augmented cert pool in our client
	tlsConfig := &tls.Config{
		RootCAs:    rootCAs,
		ServerName: host,
	}

	connection, err := ldap.DialTLS("tcp", ldapURL, tlsConfig)
	if err != nil {
		log.Fatal(err)
	} else {
		fmt.Println("Connection is successful")
	}
	if connection != nil {
		connection.Close()
	}
}

Got it to work but we had to specify the min/max tls versions.

l, err := ldap.DialTLS("tcp", ldapURL, &tls.Config{InsecureSkipVerify: true, MinVersion: tls.VersionTLS12, MaxVersion: tls.VersionTLS12,})

The settings below lead to TLS handshake errors:

l, err := ldap.DialTLS("tcp", ldapURL, &tls.Config{InsecureSkipVerify: true, MinVersion: tls.VersionTLS12, MaxVersion: tls.VersionTLS13,})

@calebwashburn : We see that you updated the code to use a min and max TLS version. When are we planning to release a new binary with this updated code?

I'll do that now

With the release of v1.0.48 this can probably be marked closed. Thanks for your incredible support @calebwashburn.