HTTPS, Fragmentation, and MTU Size

I faced a situation where a server had been migrated to a public cloud provider, and suddenly certain services were no longer working. Looks like we’ve got an MTU problem!

In particular, we found that accessing a particular third-party finance service over SSL was failing. We also found that we could not access their website.

On deeper inspection, we also found that we were having trouble getting to various other websites, but only the ones that used HTTPS.

This error only happened when we used our WAN connection to the provider. If we used the provider’s native internet link, everything was fine.

What Was The Issue?

I’ll get straight to the point on this one, and keep it brief. We found that the Don’t Fragment (DF) bit was set on the SSL traffic.

It seems that the websites that were failing had a lot of data to transfer, so they sent full-sized packets, and those were larger than the MTU on our WAN.

Packets need to fragment if they are over the MTU size

This meant that the packet needed to be fragmented, but the DF bit prevented it. So, the result was that the packet was dropped.
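To make that decision concrete, here is a minimal sketch of what a router does with an IPv4 packet that arrives for a link with a smaller MTU. The function name and the sizes in the example are my own illustration, not taken from any real device.

```python
def forward_decision(packet_len: int, df_set: bool, link_mtu: int) -> str:
    """Return what a router does with an IPv4 packet bound for a link.

    packet_len and link_mtu are in bytes and include the IP header.
    """
    if packet_len <= link_mtu:
        # The packet fits, so the DF bit does not matter.
        return "forward"
    if df_set:
        # Too big and not allowed to fragment: the packet is dropped.
        # The router should also send back an ICMP "fragmentation needed"
        # message (type 3, code 4). If that message is filtered on the way
        # back, the sender never learns that it must use smaller packets.
        return "drop"
    # Too big, but fragmentation is permitted.
    return "fragment"


# Illustrative sizes only: a 1500-byte packet meeting a 1400-byte link.
print(forward_decision(1500, df_set=True, link_mtu=1400))   # drop
print(forward_decision(1500, df_set=False, link_mtu=1400))  # fragment
print(forward_decision(1200, df_set=True, link_mtu=1400))   # forward
```

Small packets get through either way, which would explain why only the sites sending a lot of data failed.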

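I haven’t been able to find anything to say that SSL/HTTPS traffic in particular should have the DF bit set, although others report having the same issue. More likely it comes from the operating system rather than the webserver or browser: most TCP stacks set DF on outgoing packets so that Path MTU Discovery (RFC 1191) can work. HTTPS may simply expose the problem sooner, because the TLS handshake sends large packets (the certificate chain) early in the connection.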

How is This Resolved?

The problem was clearly on the WAN. Part of the WAN design was an encrypted GRE tunnel over a private link.

Looking at the tunnel, it was easy to see that the MTU values were incorrect, specifically the MPLS MTU size in this case.
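Every layer of encapsulation takes bytes away from the room left for the original packet. The sketch below shows the arithmetic. The GRE figure (20-byte outer IPv4 header plus a 4-byte basic GRE header) and the 4 bytes per MPLS label are standard sizes. The encryption overhead is an assumed placeholder, since it depends on the cipher and mode in use, and the 1500-byte link MTU and two-label stack are also assumptions rather than values from our network.

```python
GRE_OVERHEAD = 20 + 4        # outer IPv4 header + basic GRE header (no key/sequence)
MPLS_LABEL_SIZE = 4          # bytes per MPLS label
ASSUMED_CRYPTO_OVERHEAD = 56 # placeholder only; real value depends on the cipher/mode
TCP_IP_HEADERS = 20 + 20     # IPv4 + TCP headers without options


def effective_mtu(link_mtu: int, mpls_labels: int, crypto_overhead: int) -> int:
    """Largest inner IP packet that fits through the tunnel unfragmented."""
    overhead = GRE_OVERHEAD + crypto_overhead + mpls_labels * MPLS_LABEL_SIZE
    return link_mtu - overhead


def max_tcp_mss(mtu: int) -> int:
    """Largest TCP payload per packet for a given IP MTU."""
    return mtu - TCP_IP_HEADERS


# Assumed example: 1500-byte link, two MPLS labels, placeholder crypto overhead.
mtu = effective_mtu(1500, mpls_labels=2, crypto_overhead=ASSUMED_CRYPTO_OVERHEAD)
print(mtu)               # 1412
print(max_tcp_mss(mtu))  # 1372
```

If the configured MTU is higher than the figure this kind of sum produces, full-sized packets with DF set will be dropped somewhere along the path.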

The resolution included finding the correct size using ping and then altering the configuration to match.
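Finding the size with ping amounts to sending packets with the DF bit set and narrowing down the largest one that still gets a reply. Here is a rough sketch of that search. It assumes a Linux host with iputils ping, where -M do prohibits fragmentation and -s sets the ICMP payload size. Other platforms use different flags. The function names, the search bounds, and the 1412-byte fake path are mine, for illustration only.

```python
import subprocess

ICMP_OVERHEAD = 20 + 8  # IPv4 header + ICMP echo header


def ping_probe(host: str, packet_size: int) -> bool:
    """Send one DF-set ping whose total IP size is packet_size bytes.

    Assumes Linux iputils ping; adjust the flags for other systems.
    """
    payload = packet_size - ICMP_OVERHEAD
    result = subprocess.run(
        ["ping", "-M", "do", "-s", str(payload), "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_path_mtu(probe, low: int = 576, high: int = 1500) -> int:
    """Binary search for the largest packet size for which probe(size) is True."""
    if not probe(low):
        raise ValueError(f"even {low}-byte packets are not getting through")
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid fits; the answer is mid or larger
        else:
            high = mid - 1   # mid is too big; the answer is smaller
    return low


# Dry run against a fake path that passes anything up to 1412 bytes.
print(find_path_mtu(lambda size: size <= 1412))  # 1412
```

Against a real path, the call would look something like find_path_mtu(lambda size: ping_probe("192.0.2.1", size)), where the address is a placeholder for a host on the far side of the tunnel.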

After this, everything started working correctly. It shows how important it is to understand how MTU affects the network.