A surprising number of foolish Slashdotters have pointed out that my latest work, breaking the NTLM and LM handshakes and phishing for users’ NT hashes, is totally irrelevant and has been for 12ish years.
As a fan of debate, I’ll start with points that are interesting but have no real bearing on the topic.
- Slashdotters are clearly not qualified to make this assessment. Their Appeal to Authority fails.
- Microsoft wouldn’t issue an advisory and Fix-It if it weren’t relevant. My Appeal to Authority is better than theirs. ;-)
- The existence of a newer protocol, Kerberos, does not make NTLM simply disappear.
- Wikipedia actually doesn’t say NTLM is long dead. Wikipedia as an appeal to authority is a joke. I link to it regularly, not for its completeness, but because it is written for a layman audience. It is a great place to start if you don’t know something.
- I’m not a Linux fanboy looking to disgrace MS. I’m a long time MCSE and even gave MS some props in my post.
- Finally, my work is what it is, probably the last nail, of hundreds, in the coffin. I make no claim to it being inspired by God.
Now to the relevant details... When MS introduced Active Directory in Windows 2000, they implemented Kerberos 5 as the default authentication protocol FOR DOMAIN ACCOUNTS. This is a pretty important requirement. If a machine is not domain joined, or the account is not a domain account, Kerberos is not an option. The upside here is that when machines are in workgroups, it is much less likely that the accounts will have any sort of value off of the host. However, this does not stop the SSPI from trying to authenticate using your cached account credentials when accessing resources that are not on the host. This means a workgroup host could still be vulnerable to my “Send the Hash” attack if not properly configured.
Even for machines that are domain joined, while Kerberos is the default, NTLM is used in several situations:
- If the service is not Kerberos enabled (Kerberized). Maybe it runs on an NT server?
- The service/server does not have a Service Principal Name (SPN) registered.
- The service/server has duplicate SPNs registered.
- When accessing the system by IP rather than name.
- Improperly built clusters.
- 3rd party system implemented incorrectly.
- When accessing data across forests, using an older domain type trust.
- When the client can’t access a KDC/DC, such as when it is outside the firewall.
- When the KDC/DC is behind NAT.
Before getting in depth with a couple of these cases, I’ll make a generalization: “Kerberos is very hard to get right, except under simple conditions.”
Outside the Firewall or Behind NAT
MS’s implementation of Kerberos requires that the clients, servers, and KDC/DCs all be on the same routed network with AD integrated DNS or DDNS that allows the DCs to register SRV records. The clients must be able to find and access the KDCs to get Kerberos tickets. I am not going to cover all the details of Kerberos, but this is a key difference. With NTLM, the server you want to access does the job of finding a DC and getting the DC to validate the challenge/response after your client has done its handshake. The resource server passes the challenge and response to the DC over RPC using packet privacy and gets back a pass/fail and a list of group memberships which it uses to build the user’s access token. This is super simple and easy when the client is outside your firewall. You only need to open one port, the application's port.
If you intend to make Windows Kerberos work across NAT or behind a firewall, prepare for pain. Each Windows client has a component called the dcLocator. Its exact operations vary slightly from version to version of Windows. You might think you just need to open up TCP88 to a KDC and you’re set. You might get a pony in the mail too.
I’ll blog on the exact details at some point, but the dcLocator first needs to find the KDC DNS SRV records in the _msdcs.domainname.org zone. Right off the bat, this means that you need split DNS, as the answers inside your firewall will not be the same IPs as outside. Once you have your external DNS zone and main DC records, the client will ping all the DCs and select the fastest to respond. If ICMP is blocked, nothing proceeds. The client then sends a CLDAP query (connectionless LDAP over 389 UDP) to the fastest DC, asking which AD site the client is a member of. This query is not answered in the traditional way, based on the filter. Instead, AD maps the source IP to an AD subnet, which maps to an AD site, and the LDAP search response contains that site. If your client is behind NAT, then the source IP will likely be a SNAT address. From this response, the dcLocator does a second DNS SRV query to get the DCs that are in the AD site returned from the CLDAP query. The dcLocator then pings each of those DCs and queries the first to respond. If the response is satisfactory, this becomes the default DC for a period of time, which can vary by OS version.

Now we are ready to do Kerberos. Some versions of Windows try UDP88 first, and when they get back the “response too large” error they try the TCP88 route. If UDP is blocked, these versions may not try TCP88 even if it is open. I will not be swearing to this in court, as it has been over a year since I configured this type of scenario and I am writing this without a net.
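To make the site lookup a bit more concrete, here is a rough Python sketch of the two pieces the dcLocator depends on: the SRV names it asks DNS for, and the subnet-to-site mapping AD applies to the source IP it sees. The subnets, the site names "HQ" and "Branch", and the function names are made up for illustration. Falling back to a default site when no subnet matches is a simplification on my part, not a statement of exactly what every DC does.

```python
import ipaddress

# Hypothetical subnet-to-site table; the real one lives in AD Sites and Services.
SUBNET_TO_SITE = {
    ipaddress.ip_network("10.1.0.0/16"): "HQ",
    ipaddress.ip_network("10.2.0.0/16"): "Branch",
}
DEFAULT_SITE = "Default-First-Site-Name"  # assumed fallback for unmapped addresses


def site_for_source_ip(source_ip: str) -> str:
    """Map the IP the DC sees (possibly a SNAT address) to a site name."""
    addr = ipaddress.ip_address(source_ip)
    matches = [net for net in SUBNET_TO_SITE if addr in net]
    if not matches:
        return DEFAULT_SITE
    # Prefer the most specific (longest prefix) matching subnet.
    best = max(matches, key=lambda net: net.prefixlen)
    return SUBNET_TO_SITE[best]


def domain_srv_name(domain: str) -> str:
    """SRV name for the first, domain-wide DC lookup."""
    return f"_ldap._tcp.dc._msdcs.{domain}"


def site_srv_name(site: str, domain: str) -> str:
    """SRV name for the second, site-specific DC lookup."""
    return f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}"


if __name__ == "__main__":
    # An internal client lands in its real site...
    print(site_srv_name(site_for_source_ip("10.2.3.4"), "domainname.org"))
    # ...while a client seen through NAT lands wherever the SNAT address maps.
    print(site_srv_name(site_for_source_ip("203.0.113.7"), "domainname.org"))
```

The point of the second call is that the DC only ever sees the translated address, so the site it hands back is decided by your NAT, not by where the client really sits.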
This means that for Kerberos to work outside the firewall or behind NAT, you need to:
- Set up split DNS
- Create at least one domain-level SRV record pointing to the external IP address
- Create a site DNS SRV record for EVERY DC in the default site, pointing to the external IP address
- Open port 389 UDP
- Open port 88 UDP
- Open port 88 TCP
- Open ICMP
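If you want to sanity check a firewall change request against the list above, something like the following Python sketch would do. It only covers the port and protocol items, not the DNS work. The rule format, a set of (protocol, port) tuples, and the function name are my own invention for illustration, not any firewall's real syntax.

```python
# The port/protocol items from the checklist above, with a short reason each.
REQUIRED_RULES = {
    ("udp", 389): "CLDAP site lookup",
    ("udp", 88): "Kerberos over UDP",
    ("tcp", 88): "Kerberos over TCP",
    ("icmp", None): "DC ping",
}


def missing_rules(open_rules):
    """Return the reasons for every required rule not present in open_rules.

    open_rules is a set of (protocol, port) tuples; ICMP uses port None.
    """
    return sorted(
        reason for rule, reason in REQUIRED_RULES.items() if rule not in open_rules
    )


if __name__ == "__main__":
    # The "just open TCP88" plan from earlier in the post.
    for reason in missing_rules({("tcp", 88)}):
        print("still blocked:", reason)
```

Compare that with NTLM, where the equivalent table has a single entry: the application's own port.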
NTLM is a lot easier to use in both NAT and outside the firewall scenarios.
Messed up SPN Scenarios
A Service Principal Name (SPN) maps an instance of a service running on a server to the account that it runs under. Kerberos uses shared secrets, between each party and the KDC/DC, to allow for authentication and key exchange. When a client wants to use a service, it asks the KDC for a ticket to the specific service/server combination. The ticket is encrypted by the KDC so that only the service can open it. If there is no SPN registered for the service, a ticket can’t be granted. If two accounts have the SPN registered, the ticket cannot be granted. If the service is running under the context of a different account, the ticket cannot be decrypted.
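The three failure cases above boil down to a lookup from SPN to owning account. Here is a toy Python model of that lookup. It is not how the KDC is implemented, the account names are hypothetical, and comparing SPNs case-insensitively is an assumption of the sketch.

```python
from collections import defaultdict


def build_spn_index(accounts):
    """Build SPN -> set of accounts from {account_name: [spn, ...]}."""
    index = defaultdict(set)
    for account, spns in accounts.items():
        for spn in spns:
            index[spn.lower()].add(account)
    return index


def ticket_outcome(index, spn, service_account):
    """Classify what happens when a client asks for a ticket to spn.

    service_account is the account the service actually runs under.
    """
    owners = index.get(spn.lower(), set())
    if len(owners) == 0:
        return "no-spn"          # nothing registered: no ticket
    if len(owners) > 1:
        return "duplicate-spn"   # two accounts claim it: no ticket
    if service_account not in owners:
        return "cannot-decrypt"  # ticket is keyed to a different account
    return "ok"


if __name__ == "__main__":
    index = build_spn_index({
        "svc_web": ["HTTP/www.domain.com"],
        "svc_a": ["HTTP/app.domain.com"],
        "svc_b": ["HTTP/app.domain.com"],
    })
    print(ticket_outcome(index, "HTTP/www.domain.com", "svc_web"))
    print(ticket_outcome(index, "HTTP/app.domain.com", "svc_a"))
```

Every outcome other than "ok" is a case where Kerberos is off the table and the client ends up negotiating NTLM instead.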
Many web admins just allow their appPools
to run as local system or network service. This means they are running
under the machine account, which is a domain account and has an SPN or at least
can have one. In order to make your cluster work with Kerberos, all appPools
must run under the same domain account and usually an SPN must be manually
created by an admin. MS has made some changes
to IIS to make this easier, but in many cases it is out of the
SQL Server tries to register its SPN whenever it comes online. If you are following best practices and using a domain account to run SQL Server, then it will try to register the SPN on that account. This is a proper configuration; however, most accounts do not have the SELF write SPN permission (ACE), so the registration fails. This means that Kerberos won't work and NTLM is negotiated.
Connection by IP Address
Connecting by IP fails because no SPN is registered that uses the IP, such as HTTP/10.0.0.5. Rather, the SPN is HTTP/www.domain.com. This can be overcome by registering the IP SPN, but that does not scale well. I’ve thought about building a simple tool to do this, but my list of to-dos is long.
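A quick Python sketch of why the IP case falls through, assuming the client derives the SPN from whatever host the user typed. The function names and the registered-SPN set are hypothetical, and real clients do more than this (canonicalization, for one).

```python
import ipaddress
from urllib.parse import urlsplit


def spn_for_url(url: str) -> str:
    """Build the HTTP SPN a client would ask for, from the host in the URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return f"HTTP/{host}"


def is_ip_literal(host: str) -> bool:
    """True if host is an IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def likely_auth(url: str, registered_spns) -> str:
    """Guess the negotiated protocol: Kerberos only if the SPN is registered."""
    return "Kerberos" if spn_for_url(url) in registered_spns else "NTLM"


if __name__ == "__main__":
    registered = {"HTTP/www.domain.com"}  # hypothetical directory contents
    for url in ("http://www.domain.com/app", "http://10.0.0.5/app"):
        print(url, "->", spn_for_url(url), "->", likely_auth(url, registered))
```

Same server, same service, but the second URL asks for an SPN nobody registered, so the handshake drops to NTLM.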
In the end, the threat and attack models are very different for enterprise and standalone users.
The enterprise is likely to block many outgoing ports, making it less likely that “send the hash” attacks will succeed. This however doesn’t mean there is no risk from NTLMv1. With any type of foothold, the attack is VERY effective. All it takes is one compromised host, one badge not checked, a network port activated outside of your secure areas, etc.
Enterprise users spend time in hotels, airports, hospitals, etc. These types of places are ripe for the picking.
For workgroup members, the threats vary considerably. The user may spend much of their time behind NAT, which may make it hard for an attacker who steals a hash to use it. The user may also spend time in coffee shops, etc., making them ripe for attack.
While Kerberos is more secure than NTLMv2, it is not really fair to say it is better. They both have pros and cons. NTLM is VERY commonly used today both by design and due to it being a fallback for failed Kerberos. Depending on your systems’ settings, you may be sending LM and NTLMv1. ALL VERSIONS of Windows still accept LM and NTLM by default, so you may still be allowing the issue. You may not be initiating dirty handshakes, but you will accept them if offered.
Finally, the fix is very simple. There is almost no downside to making the change. If, due to some crazy twist of fate, you have a system that only works with NTLMv1 or LM, that system will break.
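This post does not spell out the setting, so the following is an assumption on my part: the knob in question is the LAN Manager authentication level (the LmCompatibilityLevel value under HKLM\SYSTEM\CurrentControlSet\Control\Lsa). The Python sketch below just encodes what each documented level sends and accepts, so you can see which levels still initiate or tolerate the dirty handshakes. The function name and return shape are mine.

```python
def describe_lm_level(level: int) -> dict:
    """Summarize what a given LmCompatibilityLevel (0-5) sends and accepts."""
    if not 0 <= level <= 5:
        raise ValueError("LmCompatibilityLevel must be between 0 and 5")

    # What the machine offers when it acts as a client.
    if level <= 1:
        sends = {"LM", "NTLMv1"}
    elif level == 2:
        sends = {"NTLMv1"}
    else:
        sends = {"NTLMv2"}

    # What is accepted on the validating (DC) side.
    accepts = {"LM", "NTLMv1", "NTLMv2"}
    if level >= 4:
        accepts.discard("LM")
    if level >= 5:
        accepts.discard("NTLMv1")

    return {"sends": sends, "accepts": accepts}


def is_clean(level: int) -> bool:
    """True only if LM and NTLMv1 are neither sent nor accepted."""
    info = describe_lm_level(level)
    return not ({"LM", "NTLMv1"} & (info["sends"] | info["accepts"]))


if __name__ == "__main__":
    for level in range(6):
        print(level, describe_lm_level(level), "clean" if is_clean(level) else "dirty")
```

Note that level 3 already stops you from sending LM and NTLMv1, but only level 5 stops you from accepting them.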