April 4, 2011 7:50 PM
The problem is sometimes what you think it can't be
I had one of those odd problems today. You know the sort: something stops working for no apparent reason. There are two identical machines sitting next to each other and you can ping them both, but you can connect a terminal session to one and not the other.
I thought about it for a while and decided that the most likely issue was the firewall. After all, it couldn't be a routing issue (pings worked fine, and traceroutes showed the same path to both devices) and I knew the devices were working normally as I could connect to the supposedly inaccessible machine from other offices, just not the one I was sitting in.
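A successful ping only proves that ICMP gets through. It says nothing about whether a TCP session to a particular port will. As a rough sketch of how you might compare the two machines at the TCP level, something like the following would do. The function names are made up for illustration, and the port is an assumption (22 for SSH, 23 for telnet, or whatever your terminal session actually uses).

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


def compare_hosts(hosts, port, timeout=3.0):
    """Map each host to whether a TCP connection to the given port succeeded."""
    return {host: tcp_port_open(host, port, timeout) for host in hosts}
```

Calling `compare_hosts(["server-a.example", "server-b.example"], 22)` (hypothetical host names) from the affected office and again from another office shows whether the failure follows the machine or the path. It won't tell you which box in the path is swallowing the packets, though.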
The firewall guys were a bit puzzled: they saw the pings, and they saw the terminal session to the accessible server. But they weren't even seeing the packets destined for the inaccessible one - and if the firewall isn't receiving them, it can't be the thing dropping them.
That was Plan A out of the window then - a bit of logical thought was called for. The WAN optimiser seemed an unlikely culprit, so the obvious next step was the LAN switch/router. But we stopped and thought about the problem for a while, and figured that it made sense not to discount anything we hadn't actually ruled out. So we turned off the WAN optimisers, which meant traffic was forwarded in its raw form.
And everything started working. And carried on working when we restarted the optimisation service.
We can't explain why the optimisers decided to prohibit access to one machine and not to the one sitting directly above it - but they did. So we've raised a ticket with the vendors, and have been reminded that just because a cause is improbable doesn't mean it's impossible.