
Tuesday, November 22, 2011

LDAP Tool of the Day - getrootDSE

I'm an LDAP guy.  I'm not even sure what that means, but I am one.  I spend a lot of my work time looking at LDAPs.  For the purists, I look at directories.  LDAP is just an interface to the directory.  If I look at the protocol with Wireshark, does that mean I am looking at LDAPs?  'Cause I do that too.  Can you really look at a directory?  I've never been to our data centers.  Where was I?

There are a lot of great tools for working with LDAP, but there is always room for one more, right?  A common task for me is looking at the contents of the root DSE and, if SSL is in use, verifying the SSL certificate being used.

For those not familiar with the root DSE, it is an entry offered by all LDAP servers.  Its DN is null or empty, depending on how you interpret the RFCs.  It almost always accepts unauthenticated connections and lists information about the contents and capabilities of the LDAP server.  It will usually list the supported LDAP controls, the authentication types offered, and often the naming contexts it holds.  Different vendors list different data, and it is this data that I am often interested in.

Here are a few typical entries:
Active Directory Domain Controller 

dn:
currentTime: 20111123000138.0Z
subschemaSubentry: CN=Aggregate,CN=Schema,CN=Configuration,DC=example,DC=org
dsServiceName: CN=NTDS Settings,CN=mydomcontr08,CN=Servers,CN=Food,CN=Sites,CN=Co
 nfiguration,DC=example,DC=org
namingContexts: DC=example,DC=org
namingContexts: CN=Configuration,DC=example,DC=org
namingContexts: CN=Schema,CN=Configuration,DC=example,DC=org
defaultNamingContext: DC=example,DC=org
schemaNamingContext: CN=Schema,CN=Configuration,DC=example,DC=org
configurationNamingContext: CN=Configuration,DC=example,DC=org
rootDomainNamingContext: DC=example,DC=org
supportedControl: 1.2.840.113556.1.4.319
supportedControl: 1.2.840.113556.1.4.801
supportedControl: 1.2.840.113556.1.4.473
supportedControl: 1.2.840.113556.1.4.528
supportedControl: 1.2.840.113556.1.4.417
supportedControl: 1.2.840.113556.1.4.619
supportedControl: 1.2.840.113556.1.4.841
supportedControl: 1.2.840.113556.1.4.529
supportedControl: 1.2.840.113556.1.4.805
supportedControl: 1.2.840.113556.1.4.521
supportedControl: 1.2.840.113556.1.4.970
supportedControl: 1.2.840.113556.1.4.1338
supportedControl: 1.2.840.113556.1.4.474
supportedControl: 1.2.840.113556.1.4.1339
supportedControl: 1.2.840.113556.1.4.1340
supportedControl: 1.2.840.113556.1.4.1413
supportedControl: 2.16.840.1.113730.3.4.9
supportedControl: 2.16.840.1.113730.3.4.10
supportedControl: 1.2.840.113556.1.4.1504
supportedControl: 1.2.840.113556.1.4.1852
supportedControl: 1.2.840.113556.1.4.802
supportedControl: 1.2.840.113556.1.4.1907
supportedControl: 1.2.840.113556.1.4.1948
supportedControl: 1.2.840.113556.1.4.1974
supportedControl: 1.2.840.113556.1.4.1341
supportedControl: 1.2.840.113556.1.4.2026
supportedLDAPVersion: 3
supportedLDAPVersion: 2
supportedLDAPPolicies: MaxPoolThreads
supportedLDAPPolicies: MaxDatagramRecv
supportedLDAPPolicies: MaxReceiveBuffer
supportedLDAPPolicies: InitRecvTimeout
supportedLDAPPolicies: MaxConnections
supportedLDAPPolicies: MaxConnIdleTime
supportedLDAPPolicies: MaxPageSize
supportedLDAPPolicies: MaxQueryDuration
supportedLDAPPolicies: MaxTempTableSize
supportedLDAPPolicies: MaxResultSetSize
supportedLDAPPolicies: MaxNotificationPerConn
supportedLDAPPolicies: MaxValRange
highestCommittedUSN: 124867805
supportedSASLMechanisms: GSSAPI
supportedSASLMechanisms: GSS-SPNEGO
supportedSASLMechanisms: EXTERNAL
supportedSASLMechanisms: DIGEST-MD5
dnsHostName: mydomcontr08.example.ORG
ldapServiceName: example.ORG:mydomcontr08$@example.ORG
serverName: CN=mydomcontr08,CN=Servers,CN=Food,CN=Sites,CN=Configuration,DC=examp
 le,DC=org
supportedCapabilities: 1.2.840.113556.1.4.800
supportedCapabilities: 1.2.840.113556.1.4.1670
supportedCapabilities: 1.2.840.113556.1.4.1791
supportedCapabilities: 1.2.840.113556.1.4.1935
isSynchronized: TRUE
isGlobalCatalogReady: TRUE
domainFunctionality: 2
forestFunctionality: 2
domainControllerFunctionality: 3

Oracle Virtual Directory
dn:
namingContexts: ou=Groups,dc=example,dc=com
namingContexts: ou=admins,dc=example,dc=com
namingContexts: ou=employees,dc=example,dc=com
namingContexts: ou=IDMUsers,dc=idm.example,dc=com
namingContexts: ou=partners,dc=example,dc=com
namingContexts: OU=portal users,dc=example,dc=com
namingContexts: dc=example,dc=com
namingContexts: ou=OIDUsers,dc=idm.example,dc=com
objectClass: top
subschemaSubEntry: cn=schema
supportedSASLMechanisms: EXTERNAL
supportedSASLMechanisms: CRAM-MD5
supportedLDAPVersion: 2
supportedLDAPVersion: 3
supportedExtension: 1.3.6.1.4.1.1466.20037

Oracle Internet Directory
dn:
supportedsaslmechanisms: DIGEST-MD5
supportedldapversion: 2
supportedldapversion: 3
supportedextension: 2.16.840.1.113894.1.9.1
supportedextension: 1.3.6.1.4.1.1466.20037
supportedcontrol: 2.16.840.1.113730.3.4.2
supportedcontrol: 2.16.840.1.113894.1.8.1
supportedcontrol: 2.16.840.1.113894.1.8.2
supportedcontrol: 2.16.840.1.113894.1.8.3
supportedcontrol: 2.16.840.1.113894.1.8.4
supportedcontrol: 2.16.840.1.113894.1.8.5
supportedcontrol: 2.16.840.1.113894.1.8.6
supportedcontrol: 2.16.840.1.113894.1.8.7
supportedcontrol: 1.2.840.113556.1.4.473
supportedcontrol: 1.2.840.113556.1.4.319
supportedcontrol: 2.16.840.1.113894.1.8.14
supportedcontrol: 2.16.840.1.113894.1.8.16
supportedcontrol: 2.16.840.1.113894.1.8.23
supportedcontrol: 2.16.840.1.113894.1.8.29
subschemasubentry: cn=subschemasubentry
subregistrysubentry: cn=subregistrysubentry
subconfigsubentry: cn=subconfigsubentry
pwdpolicysubentry: cn=default,cn=pwdPolicies,cn=Common,cn=Products,cn=OracleCont
 ext
orclupgradeinprogress: FALSE
orcltimelimit: 3600
orclstatsperiodicity: 60
orclstatslevel: 0
orclstatsflag: 0
orclsizelimit: 100000
orclsimplemodchglogattributes: uniquemember
orclsimplemodchglogattributes: member
orclsimplemodchglogattributes: orcluserapplnprovstatus
orclsimplemodchglogattributes: orcluserapplnprovstatusdesc
orclsimplemodchglogattributes: orcluserprovfailurecount
orclservermode: rw
orclreplicaid: prdoidx401_poid1
orclreplagreements: cn=replication configuration
orcloptcontainsquery: 0
orclnormdn:: IA==
orclmaxtcpidleconntime: 120
orclmatchdnenabled: 0
orcllegacyoidsyncagent: cn=odisrv+orclhostname=prdoidx001,cn=Registered Instance
 s,cn=Directory Integration Platform,cn=Products,cn=OracleContext
orcllegacyoidsyncagent: cn=odisrv+orclhostname=prdoidx401,cn=Registered Instance
 s,cn=Directory Integration Platform,cn=Products,cn=OracleContext
orcllegacyoidsyncagent: cn=odisrv+orclhostname=prdoidx002,cn=Registered Instance
 s,cn=Directory Integration Platform,cn=Products,cn=OracleContext
orcllegacyoidsyncagent: cn=odisrv+orclhostname=prdoidx402,cn=Registered Instance
 s,cn=Directory Integration Platform,cn=Products,cn=OracleContext
orcleventlevel: 0
orclentrylevelaci: access to entry by * (browse, noadd, nodelete)
orclentrylevelaci: access to attr=(orclaci,orclguname,orclgupassword,orclprname,
 orclprpassword,orclcryptoscheme,orclsuname,orclsupassword) by * (none)
orclentrylevelaci: access to attr=(*) by * (search, read, nowrite, nocompare)
orclentrylevelaci: access to attr=(*) AppendToAll by group="cn=directoryadmingro
 up,cn=oracle internet directory" (search,read,write,compare)
orclentrylevelaci: access to entry AppendToAll by group="cn=directoryadmingroup,
 cn=oracle internet directory" (browse,add,delete)
orclentrylevelaci: access to attr=(orclstatsflag, orclstatsperiodicity,orclevent
 level) by dn="cn=emd admin,cn=oracle internet directory" (search,read,write,com
 pare) by * (search,read)
orclenablegroupcache: 1
orclecachemaxsize: 10000000
orclecachemaxentries: 25000
orclecacheenabled: 1
orcldirectoryversion: OID 10.1.4.3.0
orcldiprepository: FALSE
orcldebugop: 511
orcldebugflag: 0
orclcatalogentrydn: cn=catalogs
orclauditlevel: 0
orclanonymousbindsflag: 1
matchingrules: distinguishedNameMatch
matchingrules: caseIgnoreMatch
matchingrules: caseExactMatch
matchingrules: numericStringMatch
matchingrules: telephoneNumberMatch
changestatus: cn=changestatus
changelog: cn=changelog
authpassword;oid: {SASL/MD5}sHex432oGONWYembe52eKA==
authpassword;oid: {SASL/MD5-DN}UpdstrkdNdL5mxyQ8wFP5iQ==
authpassword;oid: {SASL/MD5-U}m0/awjpasdf346gaKaIHs9UQ==
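
If you want to compare dumps like these in a script rather than by eye, a small parser goes a long way. The sketch below is only an illustration under a few assumptions: it handles a single entry, it unfolds lines that start with a space (like the wrapped dsServiceName above), it decodes "attr:: value" as base64 (like orclnormdn above), and it lower-cases attribute names because the vendors clearly don't agree on case. The function name parse_ldif_entry is mine, not part of any tool mentioned here.

```python
import base64
from collections import defaultdict


def parse_ldif_entry(text):
    """Parse one LDIF entry into {lower-cased attribute name: [values]}."""
    # Step 1: unfold. A line starting with a single space continues the
    # previous line; the leading space itself is dropped.
    logical_lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and logical_lines:
            logical_lines[-1] += raw[1:]
        elif raw.strip():
            logical_lines.append(raw)

    # Step 2: split each logical line into attribute name and value.
    entry = defaultdict(list)
    for line in logical_lines:
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "attr: value" line, skip it
        if rest.startswith(":"):
            # "attr:: <base64>" marks a base64-encoded value.
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        entry[name.lower()].append(value)
    return dict(entry)


sample = """dn:
dsServiceName: CN=NTDS Settings,CN=mydomcontr08,CN=Servers,CN=Food,CN=Sites,CN=Co
 nfiguration,DC=example,DC=org
namingContexts: DC=example,DC=org
namingContexts: CN=Configuration,DC=example,DC=org
orclnormdn:: IA==
"""

root_dse = parse_ldif_entry(sample)
print(root_dse["namingcontexts"])
print(root_dse["dsservicename"])
```

With the entries in dictionaries, checking whether two servers advertise the same supportedControl values becomes a set comparison instead of a scrolling exercise.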


One can get all of this via the command line with ldapsearch.  For Windows, I use the OpenDS version.

>ldapsearch -h host.name.org  -p 389 -w "" -b "" -s base  objectclass=*

I often forget the command, and if you need SSL, you need to add -Z -X.  In most contexts I'd complain about -X, as it accepts any SSL certificate.  In this case that is fine, because I mean to investigate the cert as well.

This gets me the LDAP info, but then I need to use openssl to get the SSL and cert info.

Using
>openssl s_client -connect ovd.internal.example.com:636
I get the connect info:

Loading 'screen' into random state - done
CONNECTED(0000017C)
depth=3 CN = Example USA Root CA
verify error:num=19:self signed certificate in certificate chain
verify return:0
---
Certificate chain
 0 s:/C=US/ST=Washington/L=Bothell/O=Example USA, Inc./OU=Internal Systems/CN=ovd.internal.Example.com
   i:/DC=org/DC=Example/CN=Example USA Issuer CA 02
 1 s:/DC=org/DC=Example/CN=Example USA Issuer CA 02
   i:/CN=Example USA Intermediate CA 01
 2 s:/CN=Example USA Intermediate CA 01
   i:/CN=Example USA Root CA
 3 s:/CN=Example USA Root CA
   i:/CN=Example USA Root CA
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIGFTCCBP2gAwIBAgIKN3rWsgABAAWq7TANBgkqhkiG9w0BAQUFADBSMRMwEQYK
CZImiZPyLGQBGRYDb3JnMRcwFQYKCZImiZPyLGQBGRYHZ3NtMTkwMDEiMCAGA1UE
AxMZVC1Nb2JpbGUgVVNBIElzc3VlciBDQSAwMjAeFw0xMTAxMzAwMTU2MjlaFw0x
MjAxMzAwMTU2MjlaMIGQMQswCQYDVQQGEwJVUzETMBEGA1UECBMKV2FzaGluZ3Rv
.
.
hpw6x12bNa5bqIzyCm70ENEWSZkVAIiYPgNlJEs4AjmzgHo9ixPqsRmzSbmryaAg
8Pw6PUYagF+4soVfKRTZm62m+6qHlmnsDeGjKh6YR7QSwySTH+LV0VXuPBecM7T9
tf90c4mUD/jKnk9o0up0yTxXDf/WKNQ9SHKkzxvqJ8+7DcVw5kjBe+x8H1/HK420
skqyrngmSrL/3xjN/9JdXJp3WRVa+2dYAg==
-----END CERTIFICATE-----
subject=/C=US/ST=Washington/L=Bothell/O=Example USA, Inc./OU=Internal Systems/CN=ovd.internal.Example.com
issuer=/DC=org/DC=Example/CN=Example USA Issuer CA 02
---
No client certificate CA names sent
---
SSL handshake has read 6038 bytes and written 368 bytes
---
New, TLSv1/SSLv3, Cipher is EDH-RSA-DES-CBC3-SHA
Server public key is 1024 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : EDH-RSA-DES-CBC3-SHA
    Session-ID: 4ECC3B5303F3EAB7AFDD9452D7671A08CA4E345DF07F7FF3A76A3B9C62B2DA10
    Session-ID-ctx:
    Master-Key: B074BCE20BBF51B4EF420994309A4CC3DD85DB48F9CB6C5305F984A936FD6B659588C942B63FBC0228EF570D7E05777F
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    Start Time: 1322007377
    Timeout   : 300 (sec)
    Verify return code: 19 (self signed certificate in certificate chain)
---

Last, I have to cut and paste the certificate into a file and use openssl to read it.
>openssl asn1parse -in "cert.pem"

    0:d=0  hl=4 l=1557 cons: SEQUENCE
    4:d=1  hl=4 l=1277 cons: SEQUENCE
    8:d=2  hl=2 l=   3 cons: cont [ 0 ]
   10:d=3  hl=2 l=   1 prim: INTEGER       :02
   13:d=2  hl=2 l=  10 prim: INTEGER       :377AD6B200010005AAED
   25:d=2  hl=2 l=  13 cons: SEQUENCE
   27:d=3  hl=2 l=   9 prim: OBJECT        :sha1WithRSAEncryption
   38:d=3  hl=2 l=   0 prim: NULL
   40:d=2  hl=2 l=  82 cons: SEQUENCE
   42:d=3  hl=2 l=  19 cons: SET
   44:d=4  hl=2 l=  17 cons: SEQUENCE
   46:d=5  hl=2 l=  10 prim: OBJECT        :domainComponent
   58:d=5  hl=2 l=   3 prim: IA5STRING     :org
   63:d=3  hl=2 l=  23 cons: SET
   65:d=4  hl=2 l=  21 cons: SEQUENCE
   67:d=5  hl=2 l=  10 prim: OBJECT        :domainComponent
   79:d=5  hl=2 l=   7 prim: IA5STRING     :gsm1900
   88:d=3  hl=2 l=  34 cons: SET
   90:d=4  hl=2 l=  32 cons: SEQUENCE
   92:d=5  hl=2 l=   3 prim: OBJECT        :commonName
   97:d=5  hl=2 l=  25 prim: PRINTABLESTRING   :Example USA Issuer CA 02
  124:d=2  hl=2 l=  30 cons: SEQUENCE
  126:d=3  hl=2 l=  13 prim: UTCTIME       :110130015629Z
  141:d=3  hl=2 l=  13 prim: UTCTIME       :120130015629Z
  156:d=2  hl=3 l= 144 cons: SEQUENCE
  159:d=3  hl=2 l=  11 cons: SET
  161:d=4  hl=2 l=   9 cons: SEQUENCE
  163:d=5  hl=2 l=   3 prim: OBJECT        :countryName
  168:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :US
  172:d=3  hl=2 l=  19 cons: SET
  174:d=4  hl=2 l=  17 cons: SEQUENCE
  176:d=5  hl=2 l=   3 prim: OBJECT        :stateOrProvinceName
  181:d=5  hl=2 l=  10 prim: PRINTABLESTRING   :Washington
  193:d=3  hl=2 l=  16 cons: SET
  195:d=4  hl=2 l=  14 cons: SEQUENCE
  197:d=5  hl=2 l=   3 prim: OBJECT        :localityName
  202:d=5  hl=2 l=   7 prim: PRINTABLESTRING   :Bothell
  211:d=3  hl=2 l=  27 cons: SET
  213:d=4  hl=2 l=  25 cons: SEQUENCE
  215:d=5  hl=2 l=   3 prim: OBJECT        :organizationName
  220:d=5  hl=2 l=  18 prim: PRINTABLESTRING   :Example USA, Inc.
  240:d=3  hl=2 l=  25 cons: SET
  242:d=4  hl=2 l=  23 cons: SEQUENCE
  244:d=5  hl=2 l=   3 prim: OBJECT        :organizationalUnitName
  249:d=5  hl=2 l=  16 prim: PRINTABLESTRING   :Internal Systems
  267:d=3  hl=2 l=  34 cons: SET
  269:d=4  hl=2 l=  32 cons: SEQUENCE
  271:d=5  hl=2 l=   3 prim: OBJECT        :commonName
  276:d=5  hl=2 l=  25 prim: PRINTABLESTRING   :ovd.internal.Example.com
  303:d=2  hl=3 l= 159 cons: SEQUENCE
  306:d=3  hl=2 l=  13 cons: SEQUENCE
  308:d=4  hl=2 l=   9 prim: OBJECT        :rsaEncryption
  319:d=4  hl=2 l=   0 prim: NULL
  321:d=3  hl=3 l= 141 prim: BIT STRING
  465:d=2  hl=4 l= 816 cons: cont [ 3 ]
  469:d=3  hl=4 l= 812 cons: SEQUENCE
  473:d=4  hl=2 l=  29 cons: SEQUENCE
  475:d=5  hl=2 l=   3 prim: OBJECT        :X509v3 Subject Key Identifier
  480:d=5  hl=2 l=  22 prim: OCTET STRING      [HEX DUMP]:0414D5FBEBB564FC0855035A02C36F05D3BE6AB6D990
  504:d=4  hl=2 l=  31 cons: SEQUENCE
  506:d=5  hl=2 l=   3 prim: OBJECT        :X509v3 Authority Key Identifier
  511:d=5  hl=2 l=  24 prim: OCTET STRING      [HEX DUMP]:30168014688A27CD6281B170FAC4A241E1F84927278B3A00
  537:d=4  hl=4 l= 305 cons: SEQUENCE
  541:d=5  hl=2 l=   3 prim: OBJECT        :X509v3 CRL Distribution Points
  546:d=5  hl=4 l= 296 prim: OCTET STRING      [HEX DUMP]:  
  846:d=4  hl=4 l= 294 cons: SEQUENCE
  850:d=5  hl=2 l=   8 prim: OBJECT        :Authority Information Access
  860:d=5  hl=4 l= 280 prim: OCTET STRING      [HEX DUMP]:
 1144:d=4  hl=2 l=  12 cons: SEQUENCE
 1146:d=5  hl=2 l=   3 prim: OBJECT        :X509v3 Basic Constraints
 1151:d=5  hl=2 l=   1 prim: BOOLEAN       :255
 1154:d=5  hl=2 l=   2 prim: OCTET STRING      [HEX DUMP]:3000
 1158:d=4  hl=2 l=  11 cons: SEQUENCE
 1160:d=5  hl=2 l=   3 prim: OBJECT        :X509v3 Key Usage
 1165:d=5  hl=2 l=   4 prim: OCTET STRING      [HEX DUMP]:030205A0
 1171:d=4  hl=2 l=  62 cons: SEQUENCE
 1173:d=5  hl=2 l=   9 prim: OBJECT        :1.3.6.1.4.1.311.21.7
 1184:d=5  hl=2 l=  49 prim: OCTET STRING      [HEX DUMP]:302F06272B0601040182371508AFAB1B85DD9D4F82E999398785C52C83F1EE
23778AC8DE812D86D8DA8E14020164020106
 1235:d=4  hl=2 l=  19 cons: SEQUENCE
 1237:d=5  hl=2 l=   3 prim: OBJECT        :X509v3 Extended Key Usage
 1242:d=5  hl=2 l=  12 prim: OCTET STRING      [HEX DUMP]:300A06082B06010505070301
 1256:d=4  hl=2 l=  27 cons: SEQUENCE
 1258:d=5  hl=2 l=   9 prim: OBJECT        :1.3.6.1.4.1.311.21.10
 1269:d=5  hl=2 l=  14 prim: OCTET STRING      [HEX DUMP]:300C300A06082B06010505070301
 1285:d=1  hl=2 l=  13 cons: SEQUENCE
 1287:d=2  hl=2 l=   9 prim: OBJECT        :sha1WithRSAEncryption
 1298:d=2  hl=2 l=   0 prim: NULL
 1300:d=1  hl=4 l= 257 prim: BIT STRING
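
The cut-and-paste step can be scripted. Here is a minimal sketch using only Python's standard ssl module, assuming the server speaks SSL directly on the port (LDAPS on 636, not StartTLS on 389). The function names are my own for illustration. Note that no CA bundle is passed, so nothing is validated, which is the same spirit as -X above: the point is to look at whatever cert the server hands out.

```python
import hashlib
import ssl


def fetch_ldaps_certificate(host, port=636):
    """Return the server's leaf certificate as PEM text.

    No CA bundle is passed, so the certificate is NOT validated. That is
    deliberate here: the goal is to inspect whatever is presented.
    """
    return ssl.get_server_certificate((host, port))


def sha1_fingerprint(pem_text):
    """SHA-1 hash of the DER-encoded certificate, as upper-case hex."""
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha1(der_bytes).hexdigest().upper()


def save_certificate(host, path="cert.pem", port=636):
    """Fetch the certificate, write it to `path`, return its fingerprint."""
    pem_text = fetch_ldaps_certificate(host, port)
    with open(path, "w") as handle:
        handle.write(pem_text)
    return sha1_fingerprint(pem_text)


# Example (needs network access to the server, so it is left commented out):
# print(save_certificate("ovd.internal.example.com"))
```

The saved cert.pem can then be fed to openssl exactly as above. This only grabs the leaf certificate, not the whole chain that s_client prints.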


I can get 98% of what I need in one command with my new tool:


>getrootdse myORG.org 636 ssl


Performing a RootDSE search ...
supportedSASLMechanisms   is GSSAPI
supportedSASLMechanisms   is GSS-SPNEGO
supportedSASLMechanisms   is EXTERNAL
supportedSASLMechanisms   is DIGEST-MD5
defaultNamingContext   is DC=myORG,DC=org
domainControllerFunctionality   is 3
ldapServiceName   is myORG.org:myDomcontr01$@myORG.ORG
supportedLDAPVersion   is 3
supportedLDAPVersion   is 2
dsServiceName   is CN=NTDS Settings,CN=myDomcontr01,CN=Servers,CN=myORG-West,CN=Sites,CN=Configurati
on,DC=myORG,DC=org
subschemaSubentry   is CN=Aggregate,CN=Schema,CN=Configuration,DC=myORG,DC=org
supportedLDAPPolicies   is MaxPoolThreads
supportedLDAPPolicies   is MaxDatagramRecv
supportedLDAPPolicies   is MaxReceiveBuffer
supportedLDAPPolicies   is InitRecvTimeout
supportedLDAPPolicies   is MaxConnections
supportedLDAPPolicies   is MaxConnIdleTime
supportedLDAPPolicies   is MaxPageSize
supportedLDAPPolicies   is MaxQueryDuration
supportedLDAPPolicies   is MaxTempTableSize
supportedLDAPPolicies   is MaxResultSetSize
supportedLDAPPolicies   is MaxNotificationPerConn
supportedLDAPPolicies   is MaxValRange
isSynchronized   is TRUE
dnsHostName   is myDomcontr01.myORG.org
supportedControl   is 1.2.840.113556.1.4.319
supportedControl   is 1.2.840.113556.1.4.801
supportedControl   is 1.2.840.113556.1.4.473
supportedControl   is 1.2.840.113556.1.4.528
supportedControl   is 1.2.840.113556.1.4.417
supportedControl   is 1.2.840.113556.1.4.619
supportedControl   is 1.2.840.113556.1.4.841
supportedControl   is 1.2.840.113556.1.4.529
supportedControl   is 1.2.840.113556.1.4.805
supportedControl   is 1.2.840.113556.1.4.521
supportedControl   is 1.2.840.113556.1.4.970
supportedControl   is 1.2.840.113556.1.4.1338
supportedControl   is 1.2.840.113556.1.4.474
supportedControl   is 1.2.840.113556.1.4.1339
supportedControl   is 1.2.840.113556.1.4.1340
supportedControl   is 1.2.840.113556.1.4.1413
supportedControl   is 2.16.840.1.113730.3.4.9
supportedControl   is 2.16.840.1.113730.3.4.10
supportedControl   is 1.2.840.113556.1.4.1504
supportedControl   is 1.2.840.113556.1.4.1852
supportedControl   is 1.2.840.113556.1.4.802
supportedControl   is 1.2.840.113556.1.4.1907
supportedControl   is 1.2.840.113556.1.4.1948
supportedControl   is 1.2.840.113556.1.4.1974
supportedControl   is 1.2.840.113556.1.4.1341
supportedControl   is 1.2.840.113556.1.4.2026
isGlobalCatalogReady   is TRUE
forestFunctionality   is 2
supportedCapabilities   is 1.2.840.113556.1.4.800
supportedCapabilities   is 1.2.840.113556.1.4.1670
supportedCapabilities   is 1.2.840.113556.1.4.1791
supportedCapabilities   is 1.2.840.113556.1.4.1935
highestCommittedUSN   is 376377966
rootDomainNamingContext   is DC=myORG,DC=org
schemaNamingContext   is CN=Schema,CN=Configuration,DC=myORG,DC=org
namingContexts   is DC=myORG,DC=org
namingContexts   is CN=Configuration,DC=myORG,DC=org
namingContexts   is CN=Schema,CN=Configuration,DC=myORG,DC=org
configurationNamingContext   is CN=Configuration,DC=myORG,DC=org
serverName   is CN=myDomcontr01,CN=Servers,CN=myORG-West,CN=Sites,CN=Configuration,DC=myORG,DC=org


currentTime   is 20111123004547.0Z
domainFunctionality   is 2


RootDSE search completed.


SSL for encryption is enabled
SSL information:
    cipher strength: 128
    exchange strength: 1024
    protocol: Tls1Client
    hash strength: 160
    algorithm: Aes128
    KeyExAlgo: 41984


The certificate did NOT validate correctly
The cert information is:
  Subject: CN=myDomcontr01.myORG.org
 Issuer: CN=myORG USA Issuer CA 06 v1, DC=myORG, DC=org
 Expires: 8/21/2012 6:46:46 AM
 Hash: 6D8F0501B7881A0DCCC84E1DCF4E1DF0646A4479
 Public Key: 30818902818100C9D8ADE08D8CC893934C95AFF45DCFAB317B83CD0A93D659B181B8AB476D49954F94E2EE148C9A095C86592DCA458
B488DB3D5BDE5F14EAD3FBBB0D15A6DB1B48B587EB13984B15D27B2BEF4AF421BE8861B4A0C704A5510C5A2D431202675D65F9455573BDA2083D1DCD
6A2541FDA6CD6205FFACE670467366F9FC763B5C8B50203010001
 Serial: 1BC1C68D000000005EC9


Here's the C# code:
https://s3.amazonaws.com/markgamache/doWork.cs

And the .NET 2.0 assembly:
https://s3.amazonaws.com/markgamache/GetRootDSE.exe

Enjoy.

Monday, December 6, 2010

Tuning F5 Big-IP Performance to Ruin Your Performance

Or Configuring the F5 Big-IP to shoot you in the foot.
Ever have bizarre connection problems when a client hits a Big-IP, but not when it hits the pool members directly? Is it happening to you right now and you don’t even know it? If your application is not stateless, this post may help you a lot. If nothing else, it covers some fundamentals that we've all let rust. Also, just because you aren't seeing the problem doesn't mean it is not there…
For years, we have seen a few LDAP connection problems that could not be explained, but the volume was so low that no one was willing to take the time to work with us to find out what was going on. With the addition of some new high-volume systems, the problem has become more frequent and has had greater impact, probably due to poor LDAP exception handling. Our service is an LDAP with approximately 20 million objects in it, each with around 28 attributes. We service 17 to 20 million LDAP requests a day. That is an average of about 231 LDAP requests per second at the high end, if you do the math, but as most hours are slow, our peak is closer to 600 per second. This service is actually 3 LDAP servers fronted by a single Big-IP HA pair.
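
For anyone who wants to check the math, here it is as a few lines of Python. Nothing here is measured; it just divides the daily request counts quoted above by the number of seconds in a day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(requests_per_day):
    """Average requests per second, assuming traffic spread evenly over a day."""
    return requests_per_day / SECONDS_PER_DAY


for requests_per_day in (17_000_000, 20_000_000):
    rate = average_rate(requests_per_day)
    print(f"{requests_per_day:,} requests/day is about {rate:.0f} requests/second")
```

The even-spread assumption is exactly what real traffic violates, which is why the peak sits well above the average.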

Before we dig in, the F5 Big-IP or Local Traffic Manager (LTM) is a modern marvel. The people who build these things and make the “magic” they do are amazing. They can solve some really complex availability and scaling challenges. Like most marvels, when applied wrong, they may not help, or even do harm. Try taking 20,000 grams of Amoxicillin for a headache. :-P
The Big-IP was originally created to load balance traffic to web sites that take very high volumes of traffic. If you have gotten this far, you probably already know that the Big-IP is optimized to spread the traffic to a pool of servers so that they all get a portion of the traffic. The tricky part is that there are hundreds of levers, knobs and switches, most of which default to the most common use case, HTTP traffic.
While HTTP uses TCP as its transport, HTTP IS STATELESS. HTTP does not care if you have already opened a TCP socket or not. It doesn’t care if the socket the browser uses is dead. That is, your web browser is coded to deal with every possible TCP use case quickly and efficiently. On the other hand, some Application Layer Protocols, such as LDAP, are stateful and use the connection-oriented nature of TCP to maintain state. This is at the heart of the problems I saw last week (and ongoing for years…).

LDAP, in the most common use case, requires a client connection to bind to the server using a username and password. The entire session is authenticated in that user’s context and all reads and writes are filtered through an access control list in the directory. Just a reminder: an LDAP server is a TCP listener that translates the LDAP protocol to an X.500 directory. An LDAP server is not a directory in and of itself; it is just a port and a set of communications rules optimized for access to an X.500 directory.
Here is a typical LDAP session with 192.168.0.15 as the server:
1 14:48:40.484934 192.168.0.20 192.168.0.15 TCP 12403 > ldap [SYN] Seq=4167455526 Win=65535 Len=0 MSS=1260
2 14:48:40.523621 192.168.0.15 192.168.0.20 TCP ldap > 12403 [SYN, ACK] Seq=4126863585 Ack=4167455527 Win=3780 Len=0
3 14:48:40.523683 192.168.0.20 192.168.0.15 TCP 12403 > ldap [ACK] Seq=4167455527 Ack=4126863586 Win=65535 Len=0
4 14:48:40.603744 192.168.0.20 192.168.0.15 LDAP bindRequest(1) "mgamach@testldap.org" simple
5 14:48:40.642792 192.168.0.15 192.168.0.20 LDAP bindResponse(1) success
6 14:48:40.649008 192.168.0.20 192.168.0.15 LDAP searchRequest(2) "DC=testldap,DC=org" baseObject
7 14:48:40.675407 192.168.0.15 192.168.0.20 LDAP searchResEntry(2) "DC=testldap,DC=org" | searchResDone(2) success
8 14:49:12.076343 192.168.0.20 192.168.0.15 LDAP unbindRequest(8)
9 14:49:12.076718 192.168.0.20 192.168.0.15 TCP 12403 > ldap [FIN, ACK] Seq=4167456246 Ack=4127490647 Len=0
10 14:49:12.100687 192.168.0.15 192.168.0.20 TCP ldap > 12403 [FIN, ACK] Seq=4127490647 Ack=4167456246 Len=0
11 14:49:12.100768 192.168.0.20 192.168.0.15 TCP 12403 > ldap [ACK] Seq=4167456247 Ack=4127490648 Len=0
12 14:49:12.105203 192.168.0.15 192.168.0.20 TCP ldap > 12403 [FIN, ACK] Seq=4127490647 Ack=4167456247 Len=0
The things worth noting are that A) the bind can’t occur until after the TCP 3-way handshake, B) when the client sends the unbind the server responds with a FIN, starting the TCP session tear-down, and C) this is one TCP session for the whole LDAP session. An LDAP client can run multiple LDAP sessions at once, in which case each would have its own ephemeral port. While some LDAP server and client code can perform multiple asynchronous calls inside one TCP session, it is not very common and often not done well by the client code developers. By far, the most common method is to treat each LDAP session as a synchronous session: request, then reply, then request, then reply, and so on until the unbind.
If the TCP session fails for any reason, the LDAP session is gone; there is no way to revive it or reconnect to it. Hopefully the client app and underlying systems are written well enough to deal with a failure and retry in a well considered manner. This is often not the case; consider the Java bug that does not inform the LDAP stack of a TCP reset for 15 seconds. That is one very painful timeout if a customer is waiting.
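
To make "retry in a well considered manner" a bit more concrete, here is a rough sketch of the idea: since a dead TCP session takes the LDAP session with it, each retry opens a brand new session (connect plus bind) rather than trying to reuse the old one. Everything here is hypothetical scaffolding: open_session and operation are placeholders for whatever your LDAP library provides, and the attempt count and delays are made-up numbers.

```python
import time


def run_with_fresh_session(open_session, operation, attempts=3, base_delay=0.2):
    """Run operation(session); on a connection error, retry on a NEW session.

    open_session is a hypothetical callable that does the TCP connect and
    LDAP bind and returns an object with a close() method.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            session = open_session()
            try:
                return operation(session)
            finally:
                session.close()  # never reuse a session after a failure
        except ConnectionError as error:  # includes ConnectionResetError
            last_error = error
            if attempt < attempts - 1:
                # Back off a little longer after each failure.
                time.sleep(base_delay * (2 ** attempt))
    raise last_error
```

Retrying is only safe for operations that can be repeated without harm, such as searches; a failed write needs more thought than this.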
Back to the problem that we were seeing… At an application level, our internal customers were seeing various connection errors in their logs. Different OSs and JVM versions seemed to throw slightly different errors, adding a bit to the confusion. It took quite some time to get the data we needed to diagnose the issue. We needed a packet capture on both sides of the Big-IP as well as time-stamped failure logs and IPs from the apps. With such high-volume traffic, trying to find an error without somewhere to start is like trying to find one special needle in a needle stack, while wearing someone else’s bifocals.
Having the data in hand and using several Wireshark tricks that I will blog on later, I found the root cause. The Big-IP config we were given was not vetted for awesome. We assumed that the networking team that creates the VIP and configures the pool analyzed our app and set up a config to optimize our performance. In fact, the config seemed a bit random.
We had already visited the config once, as the F5 was introducing a 13x delay in response times. This turned out to be a TCP profile that was set to WAN optimized. The settings in question optimized for bandwidth savings over speed. Ooops! The culprit that we also fixed, prior to this, was that the TCP idle timeout on the F5 did not match that of our LDAP servers. The F5 was set to close idle connections after 5 minutes, while the LDAP servers allow 2 hours. Finally, there isn’t just one TCP profile; there is one on the server (S-NAT) side and one on the client (VIP) side. Ours had different idle timeouts, leaving the F5 stuck between two parties who expected the session to be held open for different amounts of time. With those finds and fixes in place, we still see about a 0.5% failure rate, which is totally unacceptable with such a high-volume service. That is still an average of more than once per second.
Here’s the last bit of fundamentals needed to understand why we were still seeing failures. There was only one IP in our F5 S-NAT pool, which means that it could, theoretically, have 65536 simultaneous connections to our pool members. However, the default setting for the use of those ports does not force the use of them all. The F5 setting TM.PreserveClientPort, by default, has the S-NAT IP use the same ephemeral port as the upstream side of the conversation, meaning the client to VIP. The laws of TCP/IP say that the S-NAT IP can only make one connection from one port to a given server IP and port. This means if two of our clients make calls using the same ephemeral port at the same time, the one with the established connection stays live and the other gets a nice TCP reset in its Christmas stocking. This is a bit more confusing for the app because the F5 accepts the VIP-side TCP handshake while it is still getting ready to try the S-NAT-side connection. Only then does it realize that the port is taken and it can’t connect. Once that happens, the F5 sends an unexpected TCP reset to the client when it is expecting to be able to do an LDAP bind.
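
Here is that collision as a toy model in Python. To be clear about the assumptions: this is not how the F5 is implemented, it only mimics the behavior described above, with a single S-NAT address, a single destination, and the client's source port copied to the server side. The class name and the client IPs are invented for illustration.

```python
class PortPreservingSnat:
    """Toy model: one S-NAT address that reuses the client's source port."""

    def __init__(self):
        # Server-side source port -> client IP currently holding it.
        self.ports_in_use = {}

    def connect(self, client_ip, client_port):
        """Return 'established' or 'reset' for a new client connection."""
        if client_port in self.ports_in_use:
            # The same port is already in use on the server side, so the
            # second client cannot be mapped and gets a reset instead.
            return "reset"
        self.ports_in_use[client_port] = client_ip
        return "established"

    def close(self, client_port):
        """Free the port when the holding connection goes away."""
        self.ports_in_use.pop(client_port, None)


snat = PortPreservingSnat()
results = [
    snat.connect("10.0.0.1", 61325),  # first client takes port 61325
    snat.connect("10.0.0.2", 61325),  # second client, same port
    snat.connect("10.0.0.2", 61326),  # same client, different port
]
print(results)
```

The second connection is refused even though it comes from a different client, because only the port is carried through to the server side.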

At this point, you are thinking, “but Mark, what are the odds of two of your clients using the same ephemeral port at the same time? They must be crazy low”. Well, there’s more. Different OSs limit the number of possible ephemeral ports that can be used. BSD, for instance, used to only use 1024 to 4999, a range of 3976 ports. In our case most of our clients are on Solaris 10, which allows anything above 32768. We have around 20 clients that pseudo-randomly pick a port between 32768 and 65535. This is actually a probability problem exactly like the Birthday Problem. We just have 20 people (servers) with 32768 days (ports) to choose from instead of 365. We are looking to know the probability of any 2 of the servers using the same port. I will ignore the complexity of adding time as a factor. That is, some clients may hold a port open for longer than others, depending on the operations being performed. If we just assume that all the machines connect once using a random port, then our odds come up as about 0.6%, which is right next to our observed impact. As we will see in a minute, this number may or may not make sense. It is really just to make the point that a collision is a lot more likely than might seem intuitive.
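
The birthday math is easy to check in a few lines of Python. This uses the same simplifying assumption as above: every client picks one port, uniformly at random, and time is ignored. The function name is my own.

```python
def collision_probability(clients, ports):
    """Probability that at least two of `clients` pick the same one of `ports`."""
    # Chance that everyone lands on a distinct port: the first client has all
    # ports free, the second has one fewer, and so on.
    p_all_distinct = 1.0
    for already_taken in range(clients):
        p_all_distinct *= (ports - already_taken) / ports
    return 1.0 - p_all_distinct


print(f"20 clients, 32768 ports: {collision_probability(20, 32768):.2%}")
print(f"23 people, 365 days:     {collision_probability(23, 365):.2%}")
```

The first line comes out at a bit under 0.6%, and the second is the classic "23 people gives you better than even odds" result, included as a sanity check that the function does what it claims.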
While there are still a couple more factors to look at, let’s look at a failure in action.
1 0.000000 192.168.113.59 192.168.0.10 TCP 61325 > ldap [SYN] Seq=2258030186 Win=32768 Len=0 MSS=1460 WS=0
2 0.000005 192.168.0.10 192.168.113.59 TCP ldap > 61325 [SYN, ACK] Seq=2422958290 Ack=2258030187 Win=4380 Len=0 MSS=1460 WS=0
3 0.042001 192.168.113.59 192.168.0.10 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
4 0.042010 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [SYN] Seq=4287903691 Win=4380 Len=0 MSS=1460 WS=0
5 0.042013 192.168.0.10 192.168.113.59 TCP ldap > 61325 [ACK] Seq=2422958291 Ack=2258030284 Win=4477 Len=0
6 0.042734 10.99.26.25 10.99.248.20 TCP ovsessionmgr > 61325 [SYN, ACK] Seq=1298286822 Ack=4287903692 Win=49640 Len=0 MSS=1460 WS=0
7 0.042738 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [ACK] Seq=4287903692 Ack=1298286823 Win=4380 Len=0
8 0.042741 10.99.248.20 10.99.26.25 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
9 0.043455 10.99.26.25 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=1298286823 Ack=4287903789 Win=49543 Len=0
10 0.043699 10.99.26.25 10.99.248.20 LDAP bindResponse(1) success
11 0.043703 192.168.0.10 192.168.113.59 LDAP bindResponse(1) success
12 0.043706 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [ACK] Seq=4287903789 Ack=1298286837 Win=4394 Len=0
13 0.088130 192.168.113.59 192.168.0.10 LDAP searchRequest(2) "mobile=2065551234,ou=subscribers,c=us,dc=testLDAP,dc=com" baseObject
14 0.088135 10.99.248.20 10.99.26.25 LDAP searchRequest(2) "mobile=2065551234,ou=subscribers,c=us,dc=testLDAP,dc=com" baseObject
15 0.088138 192.168.0.10 192.168.113.59 TCP ldap > 61325 [ACK] Seq=2422958305 Ack=2258030416 Win=4609 Len=0
16 0.088866 10.99.26.25 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=1298286837 Ack=4287903921 Win=49640 Len=0
17 0.089591 10.99.26.25 10.99.248.20 LDAP searchResEntry(2) "mobile=2065551234,ou=subscribers,c=us,dc=testLDAP,dc=com"
18 0.089595 192.168.0.10 192.168.113.59 LDAP searchResEntry(2) "mobile=2065551234,ou=subscribers,c=us,dc=testLDAP,dc=com"
19 0.089598 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [ACK] Seq=4287903921 Ack=1298286926 Win=4483 Len=0
20 0.089600 10.99.26.25 10.99.248.20 LDAP searchResDone(2) success
21 0.089602 192.168.0.10 192.168.113.59 LDAP searchResDone(2) success
22 0.089605 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [ACK] Seq=4287903921 Ack=1298286940 Win=4497 Len=0
23 0.130614 192.168.113.59 192.168.0.10 TCP 61325 > ldap [ACK] Seq=2258030416 Ack=2422958408 Win=32768 Len=0
24 0.130861 192.168.113.59 192.168.0.10 LDAP unbindRequest(3)
25 0.130865 10.99.248.20 10.99.26.25 LDAP unbindRequest(3)
26 0.130867 192.168.0.10 192.168.113.59 TCP ldap > 61325 [ACK] Seq=2422958408 Ack=2258030452 Win=4645 Len=0
27 0.131347 192.168.113.59 192.168.0.10 TCP 61325 > ldap [FIN, ACK] Seq=2258030452 Ack=2422958408 Win=32768 Len=0
28 0.131351 192.168.0.10 192.168.113.59 TCP ldap > 61325 [ACK] Seq=2422958408 Ack=2258030453 Win=4645 Len=0
29 0.131354 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [FIN, ACK] Seq=4287903957 Ack=1298286940 Win=4497 Len=0
30 0.131356 10.99.26.25 10.99.248.20 TCP ovsessionmgr > 61325 [FIN, ACK] Seq=1298286940 Ack=4287903957 Win=49640 Len=0
31 0.131359 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr [FIN, ACK] Seq=4287903957 Ack=1298286941 Win=4497 Len=0
32 0.131362 192.168.0.10 192.168.113.59 TCP ldap > 61325 [FIN, ACK] Seq=2422958408 Ack=2258030453 Win=4645 Len=0
33 0.132083 10.99.26.25 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=1298286941 Ack=4287903958 Win=49640 Len=0
34 0.132088 10.99.26.25 10.99.248.20 TCP [TCP Dup ACK 33#1] ovsessionmgr > 61325 [ACK] Seq=1298286941 Ack=4287903958 Win=49640 Len=0
35 0.172359 192.168.113.59 192.168.0.10 TCP 61325 > ldap [ACK] Seq=2258030453 Ack=2422958409 Win=32768 Len=0
36 56.717368 192.168.113.58 192.168.0.10 TCP 61325 > ldap [SYN] Seq=3620104483 Win=32768 Len=0 MSS=1460 WS=0
37 56.717374 192.168.0.10 192.168.113.58 TCP ldap > 61325 [SYN, ACK] Seq=3332499193 Ack=3620104484 Win=4380 Len=0 MSS=1460 WS=0
38 56.758628 192.168.113.58 192.168.0.10 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
39 56.758638 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [SYN] Seq=258910960 Win=4380 Len=0 MSS=1460 WS=0
40 56.758640 192.168.0.10 192.168.113.58 TCP ldap > 61325 [ACK] Seq=3332499194 Ack=3620104581 Win=4477 Len=0
41 56.759124 10.99.26.22 10.99.248.20 TCP ovsessionmgr > 61325 [SYN, ACK] Seq=3696686467 Ack=258910961 Win=49640 Len=0 MSS=1460 WS=0
42 56.759128 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [ACK] Seq=258910961 Ack=3696686468 Win=4380 Len=0
43 56.759131 10.99.248.20 10.99.26.22 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
44 56.759605 10.99.26.22 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=3696686468 Ack=258911058 Win=49543 Len=0
45 56.760583 10.99.26.22 10.99.248.20 LDAP bindResponse(1) success
46 56.760588 192.168.0.10 192.168.113.58 LDAP bindResponse(1) success
47 56.760591 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [ACK] Seq=258911058 Ack=3696686482 Win=4394 Len=0
48 56.802823 192.168.113.58 192.168.0.10 LDAP modifyRequest(2) "mobile=4255552345,ou=subscribers,c=us,dc=testLDAP,dc=com"
49 56.802830 10.99.248.20 10.99.26.22 LDAP modifyRequest(2) "mobile=4255552345,ou=subscribers,c=us,dc=testLDAP,dc=com"
50 56.802833 192.168.0.10 192.168.113.58 TCP ldap > 61325 [ACK] Seq=3332499208 Ack=3620104727 Win=4623 Len=0
51 56.803309 10.99.26.22 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=3696686482 Ack=258911204 Win=49640 Len=0
52 56.814303 10.99.26.22 10.99.248.20 LDAP modifyResponse(2) success
53 56.814309 192.168.0.10 192.168.113.58 LDAP modifyResponse(2) success
54 56.814312 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [ACK] Seq=258911204 Ack=3696686496 Win=4408 Len=0
55 56.856552 192.168.113.58 192.168.0.10 LDAP unbindRequest(3)
56 56.856557 10.99.248.20 10.99.26.22 LDAP unbindRequest(3)
57 56.856560 192.168.0.10 192.168.113.58 TCP ldap > 61325 [ACK] Seq=3332499222 Ack=3620104763 Win=4659 Len=0
58 56.857061 10.99.26.22 10.99.248.20 TCP ovsessionmgr > 61325 [FIN, ACK] Seq=3696686496 Ack=258911240 Win=49640 Len=0
59 56.857066 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [ACK] Seq=258911240 Ack=3696686497 Win=4408 Len=0
60 56.857068 192.168.0.10 192.168.113.58 TCP ldap > 61325 [FIN, ACK] Seq=3332499222 Ack=3620104763 Win=4659 Len=0
61 56.857269 192.168.113.58 192.168.0.10 TCP 61325 > ldap [FIN, ACK] Seq=3620104763 Ack=3332499222 Win=32768 Len=0
62 56.898034 192.168.113.58 192.168.0.10 TCP 61325 > ldap [ACK] Seq=3620104764 Ack=3332499223 Win=32768 Len=0
63 57.381196 192.168.113.58 192.168.0.10 TCP 61325 > ldap [FIN, ACK] Seq=3620104763 Ack=3332499223 Win=32768 Len=0
64 57.381200 192.168.0.10 192.168.113.58 TCP ldap > 61325 [ACK] Seq=3332499223 Ack=3620104764 Win=4659 Len=0
65 57.381204 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [FIN, ACK] Seq=258911240 Ack=3696686497 Win=4408 Len=0
66 57.381931 10.99.26.22 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=3696686497 Ack=258911241 Win=49640 Len=0
67 70.622841 192.168.113.38 192.168.0.10 TCP 61325 > ldap [SYN] Seq=2324271576 Win=32768 Len=0 MSS=1460 WS=0
68 70.622846 192.168.0.10 192.168.113.38 TCP ldap > 61325 [SYN, ACK] Seq=151461930 Ack=2324271577 Win=4380 Len=0 MSS=1460 WS=0
69 70.664112 192.168.113.38 192.168.0.10 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
70 70.664121 10.99.248.20 10.99.26.22 TCP [TCP Port numbers reused] 61325 > ovsessionmgr [SYN] Seq=273484448 Win=4380 Len=0 MSS=1460 WS=0
71 70.664124 192.168.0.10 192.168.113.38 TCP ldap > 61325 [ACK] Seq=151461931 Ack=2324271674 Win=4477 Len=0
72 71.664114 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [SYN] Seq=273484448 Win=4380 Len=0 MSS=1460 WS=0
73 72.864329 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [SYN] Seq=273484448 Win=4380 Len=0 MSS=1460 WS=0
74 74.063807 10.99.248.20 10.99.26.22 TCP 61325 > ovsessionmgr [SYN] Seq=273484448 Win=4380 Len=0 MSS=1460
75 75.264023 192.168.0.10 192.168.113.38 TCP ldap > 61325 [RST, ACK] Seq=151461931 Ack=2324271674 Win=4477 Len=0
76 95.794344 192.168.113.37 192.168.0.10 TCP 61325 > ldap [SYN] Seq=136650233 Win=32768 Len=0 MSS=1460 WS=0
77 95.794349 192.168.0.10 192.168.113.37 TCP ldap > 61325 [SYN, ACK] Seq=3257078415 Ack=136650234 Win=4380 Len=0 MSS=1460 WS=0
78 95.835606 192.168.113.37 192.168.0.10 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
79 95.835616 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [SYN] Seq=3796674135 Win=4380 Len=0 MSS=1460 WS=0
80 95.835619 192.168.0.10 192.168.113.37 TCP ldap > 61325 [ACK] Seq=3257078416 Ack=136650331 Win=4477 Len=0
81 95.836114 10.99.26.28 10.99.248.20 TCP ovsessionmgr > 61325 [SYN, ACK] Seq=3190364148 Ack=3796674136 Win=49640 Len=0 MSS=1460 WS=0
82 95.836118 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [ACK] Seq=3796674136 Ack=3190364149 Win=4380 Len=0
83 95.836121 10.99.248.20 10.99.26.28 LDAP bindRequest(1) "uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple
84 95.836578 10.99.26.28 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=3190364149 Ack=3796674233 Win=49543 Len=0
85 95.837317 10.99.26.28 10.99.248.20 LDAP bindResponse(1) success
86 95.837320 192.168.0.10 192.168.113.37 LDAP bindResponse(1) success
87 95.837323 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [ACK] Seq=3796674233 Ack=3190364163 Win=4394 Len=0
88 95.879556 192.168.113.37 192.168.0.10 LDAP searchRequest(2) "mobile=3605551234,ou=subscribers,c=us,dc=testLDAP,dc=com" baseObject
89 95.879562 10.99.248.20 10.99.26.28 LDAP searchRequest(2) "mobile=3605551234,ou=subscribers,c=us,dc=testLDAP,dc=com" baseObject
90 95.879565 192.168.0.10 192.168.113.37 TCP ldap > 61325 [ACK] Seq=3257078430 Ack=136650861 Win=5007 Len=0
91 95.880280 10.99.26.28 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=3190364163 Ack=3796674763 Win=49640 Len=0
92 95.881746 10.99.26.28 10.99.248.20 LDAP searchResEntry(2) "mobile=3605551234,ou=subscribers,c=us,dc=testLDAP,dc=com"
93 95.881751 192.168.0.10 192.168.113.37 LDAP searchResEntry(2) "mobile=3605551234,ou=subscribers,c=us,dc=testLDAP,dc=com"
94 95.881754 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [ACK] Seq=3796674763 Ack=3190364902 Win=5133 Len=0
95 95.881756 10.99.26.28 10.99.248.20 LDAP searchResDone(2) success
96 95.881758 192.168.0.10 192.168.113.37 LDAP searchResDone(2) success
97 95.881761 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [ACK] Seq=3796674763 Ack=3190364916 Win=5147 Len=0
98 95.922763 192.168.113.37 192.168.0.10 TCP 61325 > ldap [ACK] Seq=136650861 Ack=3257079183 Win=32768 Len=0
99 95.923035 192.168.113.37 192.168.0.10 LDAP unbindRequest(3)
100 95.923038 10.99.248.20 10.99.26.28 LDAP unbindRequest(3)
101 95.923040 192.168.0.10 192.168.113.37 TCP ldap > 61325 [ACK] Seq=3257079183 Ack=136650897 Win=5043 Len=0
102 95.923504 192.168.113.37 192.168.0.10 TCP 61325 > ldap [FIN, ACK] Seq=136650897 Ack=3257079183 Win=32768 Len=0
103 95.923507 192.168.0.10 192.168.113.37 TCP ldap > 61325 [ACK] Seq=3257079183 Ack=136650898 Win=5043 Len=0
104 95.923509 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [FIN, ACK] Seq=3796674799 Ack=3190364916 Win=5147 Len=0
105 95.923511 10.99.26.28 10.99.248.20 TCP ovsessionmgr > 61325 [FIN, ACK] Seq=3190364916 Ack=3796674799 Win=49640 Len=0
106 95.923515 10.99.248.20 10.99.26.28 TCP 61325 > ovsessionmgr [FIN, ACK] Seq=3796674799 Ack=3190364917 Win=5147 Len=0
107 95.923517 192.168.0.10 192.168.113.37 TCP ldap > 61325 [FIN, ACK] Seq=3257079183 Ack=136650898 Win=5043 Len=0
108 95.924223 10.99.26.28 10.99.248.20 TCP ovsessionmgr > 61325 [ACK] Seq=3190364917 Ack=3796674800 Win=49640 Len=0
109 95.924228 10.99.26.28 10.99.248.20 TCP [TCP Dup ACK 108#1] ovsessionmgr > 61325 [ACK] Seq=3190364917 Ack=3796674800 Win=49640 Len=0
110 95.964510 192.168.113.37 192.168.0.10 TCP 61325 > ldap [ACK] Seq=136650898 Ack=3257079184 Win=32768 Len=0
First off, all of this traffic is from ephemeral port 61325, whether it is from the client or the S-NAT side of the F5. All the traffic to and from the ldap port is on the VIP side of the F5. All of the traffic to and from the ovsessionmgr port is on the S-NAT side of the F5, to the LDAP servers.
LDAP VIP: 192.168.0.10
S-NAT IP: 10.99.248.20
LDAP Servers: 10.99.26.22, 10.99.26.25, 10.99.26.28
The rest of the IPs are clients.
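As a rough aid for reading the capture, the address list above can be turned into a small classifier that labels each frame as VIP side or S-NAT side. This is only a sketch: the function names `parse_frame` and `classify` are made up for illustration, and it assumes the six-column summary layout shown in the capture above.

```python
# Addresses taken from the list above.
VIP = "192.168.0.10"
SNAT = "10.99.248.20"
SERVERS = {"10.99.26.22", "10.99.26.25", "10.99.26.28"}


def parse_frame(line):
    """Split one summary line into (number, time, src, dst, protocol, info)."""
    number, time, src, dst, proto, info = line.split(None, 5)
    return int(number), float(time), src, dst, proto, info


def classify(src, dst):
    """Label a frame by which side of the F5 it was seen on."""
    if VIP in (src, dst):
        return "VIP side"
    if SNAT in (src, dst) and (src in SERVERS or dst in SERVERS):
        return "S-NAT side"
    return "unknown"


sample = [
    '3 0.042001 192.168.113.59 192.168.0.10 LDAP bindRequest(1) '
    '"uid=appBindAcct,ou=applications,c=us,dc=testLDAP,dc=com" simple',
    "4 0.042010 10.99.248.20 10.99.26.25 TCP 61325 > ovsessionmgr "
    "[SYN] Seq=4287903691 Win=4380 Len=0 MSS=1460 WS=0",
]

for line in sample:
    number, _, src, dst, _, _ = parse_frame(line)
    print(number, classify(src, dst))
```

Anything that touches the VIP is client traffic; anything between the S-NAT IP and a pool member is the F5 talking to an LDAP server.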
Also note that this capture only lasts 95 seconds, yet we see four different clients use the same ephemeral port in that short time. This 13 KB capture is, of course, massively filtered. The actual 95-second window is about 250 MB.
During this time, there is only one TCP reset sent (Frame 75), so we will analyze what happens there first, then go back to why there aren’t more failures.
In frames 55 and 56, we see a client send an unbind to the F5 and the F5 sends the unbind to the LDAP server. This starts the end of the two TCP sessions.
Frames 58 to 66 are the proper teardowns of the TCP sessions on both sides of the F5. This means that the client, the F5 VIP side, the F5 S-NAT side and the server all agree the TCP sessions are over.
Frame 67 is a different client starting a new TCP handshake with the VIP. It is using the same ephemeral port, 61325. As TCP segments can arrive out of order, we can see that 68 and 71 complete the TCP handshake, then the client attempts to bind in 69. Keep in mind that the data will be delivered from the IP stack to the app in order.
Frame 70 is the S-NAT trying to establish a TCP session with a pool member, 10.99.26.22. There is no SYN/ACK segment in reply, so the S-NAT IP tries again in 72, 73 and 74. After the four failures to establish the S-NAT side TCP session, the F5 gives up and has to do something with the established connection on the VIP side, so it sends the TCP reset in frame 75.
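To make the retry pattern easier to see, here is a small sketch that works out the gaps between the four unanswered SYNs and the time until the reset. The timestamps are copied from frames 70, 72, 73, 74 and 75 of the capture; the helper name `gaps` is just for illustration.

```python
# (frame number, capture time in seconds) for the unanswered SYNs to 10.99.26.22
syn_attempts = [(70, 70.664121), (72, 71.664114), (73, 72.864329), (74, 74.063807)]
# The reset the F5 finally sends to the client on the VIP side
reset = (75, 75.264023)


def gaps(events):
    """Seconds between consecutive events, rounded to milliseconds."""
    times = [t for _, t in events]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


retry_gaps = gaps(syn_attempts)
time_to_reset = round(reset[1] - syn_attempts[0][1], 3)

print("gaps between SYNs:", retry_gaps)
print("first SYN to reset:", time_to_reset)
```

The SYNs are spaced roughly 1.0, 1.2 and 1.2 seconds apart, and the reset follows about 4.6 seconds after the first attempt. Throughout that time the client is sitting on a VIP-side connection that it believes is healthy.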
“But wait”, you cry, “In Frame 66 we saw the final ACK of the session close with 10.99.26.22”. WAIT indeed, TIME_WAIT to be more exact. RFC 793 says that the TCP session, while closed in a data transmission sense, is still left locked in a state called TIME_WAIT. This is designed to make sure that any segments that were out of order make it to their destination before the socket is completely closed and refuses the data. It also makes sure that there is enough time for the final ACK of the close process to arrive. After all, you can’t ACK that ACK or it’s not the final ACK.
Here’s the rub: TIME_WAIT was defined as 2 times the Maximum Segment Lifetime (MSL) in RFC 793. This value was, as the RFC puts it, “arbitrary”. To be more exact, it made sense based on the networking technologies when the RFC was written in 1981. This value, which is 4 minutes (2 x an MSL of 2 minutes), was appropriate for the slow networks of the day. Since then, some vendors have decided this is too long. There is good reason for this. If a server needs to make a large number of outbound connections, this 4-minute penalty before the port can be reused can lead to port exhaustion. This value is tunable at the OS level. The default for Windows is still 4 minutes. Our LDAP servers have a 60-second TIME_WAIT, as part of the Solaris settings. TIME_WAIT means different things to the client and server. The client cannot make any calls out on that ephemeral port, not to any server, until the wait is over. The server cannot accept traffic from the source host on that same source port until the wait is over. The server could accept new traffic from the same source host, if it were on a new ephemeral port.
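The server-side rule described above can be sketched as a toy model. This is not how any real TCP stack is implemented; `TimeWaitTable` is a made-up class that only remembers when a (source IP, source port) pair was last closed, and the round timestamps are chosen for illustration.

```python
MSL_SECONDS = 2 * 60                  # MSL of 2 minutes, per RFC 793
TIME_WAIT_RFC793 = 2 * MSL_SECONDS    # 2 x MSL = 240 seconds = 4 minutes


class TimeWaitTable:
    """Toy model: a server tracking recently closed (source IP, source port) pairs."""

    def __init__(self, time_wait):
        self.time_wait = time_wait
        self.closed = {}

    def close(self, src_ip, src_port, now):
        """Record that the session from this source IP and port closed at `now`."""
        self.closed[(src_ip, src_port)] = now

    def accepts(self, src_ip, src_port, now):
        """A new connection is accepted unless the same pair is still in TIME_WAIT."""
        closed_at = self.closed.get((src_ip, src_port))
        return closed_at is None or now - closed_at >= self.time_wait


server = TimeWaitTable(time_wait=60)           # 60 seconds, as on our LDAP servers
server.close("10.99.248.20", 61325, now=57.0)

print(server.accepts("10.99.248.20", 61325, now=70.0))   # same host, same port, too soon
print(server.accepts("10.99.248.20", 61326, now=70.0))   # same host, new ephemeral port
print(server.accepts("10.99.248.20", 61325, now=120.0))  # same port, after the wait
```

Only the first check fails. The same host on a different ephemeral port, or the same port after the wait has expired, is fine.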
Between frames 66 and 74, the final ACK and the final SYN attempt, we only see about 17 seconds. The LDAP server is clearly in its 60-second TIME_WAIT and properly refusing the connection. Now you’re confused and so was I. Well, there are a couple more F5 TCP Profile settings that come into play. First, the TIME_WAIT is settable in the TCP profile and defaults to 2 seconds!! The client and server don’t agree, by way of negotiation, on a TIME_WAIT; they both have their own. This mismatch makes collisions possible. This setting is great to have, but it must be set for your use case. The default is not right for our use case. Second, and even more interesting/scary, is the TIME_WAIT Recycle option, which defaults to on. This behavior is described as TIME-WAIT Truncation in RFC 1644. This allows the S-NAT side to cut the TIME_WAIT short if it receives a packet wanting the port, just like we see in frame 72. In this case, the setting doesn’t matter, as the F5 2-second timer is long over. This setting makes a lot of sense, performance wise, IF the S-NAT does not try to hit a pool member that is still in TIME_WAIT. In our case, we see that all 3 pool members were hit in that 95-second window. As we only have one S-NAT IP to work with and the F5 can’t guess at the TIME_WAIT on the pool members, we are set up for failure.
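The mismatch between the two timers can be checked against the capture timestamps. The sketch below uses the time of the final ACK in frame 66 and the first new SYN in frame 70, together with the two TIME_WAIT values quoted in the text; the variable names are only illustrative.

```python
F5_TIME_WAIT = 2.0        # F5 TCP profile default, per the text
SERVER_TIME_WAIT = 60.0   # Solaris setting on the LDAP servers, per the text

last_ack = 57.381931      # frame 66: final ACK of the old session with 10.99.26.22
new_syn = 70.664121       # frame 70: first SYN reusing port 61325 to the same server

elapsed = new_syn - last_ack

# Each side applies its own timer; nothing is negotiated between them.
f5_considers_port_free = elapsed >= F5_TIME_WAIT
server_considers_port_free = elapsed >= SERVER_TIME_WAIT

print(f"elapsed since close: {elapsed:.2f} s")
print("F5 considers the port free:", f5_considers_port_free)
print("server considers the port free:", server_considers_port_free)
```

About 13 seconds have passed, so the F5 is happy to reuse the port while the server still has roughly 47 seconds of TIME_WAIT left for it.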
The moral of the story is, all the F5's performance settings have tradeoffs. By keeping TIME_WAIT low, the F5 can save on memory by dumping the session early, and avoid port exhaustion, but unless you have the proper number of S-NAT IPs, you are shooting yourself in the foot. By setting the F5 TIME_WAIT to match the TIME_WAIT of your servers, you avoid the risk of collisions, but you tie up the ports and RAM on the F5. In essence, adding IPs to the S-NAT IP pool lets us use probability to avoid most collisions.
The bottom line is, make sure your F5 settings match your use case. You may also want to read this article from F5 on things that may cause the F5 to send a TCP reset.
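To put a rough number on that probability argument, here is a deliberately simplified model. It assumes the F5 picks the S-NAT IP and the pool member uniformly and independently for each new connection, which is an assumption made for illustration and not a description of how the F5 actually selects them. Under that assumption, a connection that reuses a source port lands on the same (S-NAT IP, server) pair as one particular earlier connection with probability 1 / (S-NAT IPs x pool members).

```python
from fractions import Fraction


def collision_chance(snat_ips, pool_members):
    """Chance that a reused source port hits the same (S-NAT IP, server) pair
    as one specific earlier connection, under the uniform-choice assumption."""
    return Fraction(1, snat_ips * pool_members)


POOL_MEMBERS = 3  # the three LDAP servers in this capture

for snat_ips in (1, 2, 4, 8):
    chance = collision_chance(snat_ips, POOL_MEMBERS)
    print(f"{snat_ips} S-NAT IP(s): {chance} ({float(chance):.1%})")
```

With a single S-NAT IP and three pool members, the toy model gives a one in three chance per reuse, which fits with seeing a failure in such a short window. Each additional S-NAT IP shrinks that chance proportionally.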
Lessons learned:
Configuring a load balancer well requires great knowledge about not just networking, but the nature of the application being load balanced.
F5 has done such a great job optimizing the defaults for HTTP that most customers may never even see problems, if they are there at all.
Fundamentals like TCP session teardown and TCP state are worth knowing more about. There is no better reference than TCP/IP Illustrated by Stevens.
