Not the usual back-end trouble

The physical network's forensic files find an unlikely culprit for the company's server failure

In the late 90s I worked for a midsize ISP in the midwestern United States. Like any ISP at that time, we had dialup service, which meant we had a consumer-oriented tech support number. A friend of one of my co-workers was hired for dialup tech support. We'll call him Jake. He had some computer experience but no specific qualifications to do more than what he was hired to do.

Jake proved himself fairly quickly, and within a few months he had risen through the ranks and was managing the tech support department. Everybody was pretty impressed with him overall, since it was rare for anyone with a clue or any ambition to work in dialup support.

Jake's prior job was a bit more ... active than this one, and the fact that he sat in a chair all day answering the phone started to manifest itself physically. The poor guy must have gained 30 or 40 pounds in a matter of a few months. Perils of the job, I guess.

I was the systems lead at that time. One day, the NOC contacted me to let me know our main user server was down. This affected shell accounts, personal Web sites, and customer e-mail service. We found the system up and talking to the network, but it was throwing errors left and right about the /home filesystem, which is where everything and anything that mattered on that machine lived.

After some remote tinkering, we decided to take a look at the machine itself. It was some kind of SPARC pizza-box machine, with an external disk array attached via SCSI cable. We quickly found the problem: the SCSI cable was dangling, barely hanging on to the socket. We got the disks back online, and after some filesystem repairs, everything was back in order.

We were interested to find out what had happened, though. Cables don't just miraculously work themselves loose from their sockets. After some investigation, we found out the only person who had carded into the datacenter near the time of the outage was Jake. Since he had no business with the user server, we talked to him to find out what had happened. It turned out that he had been checking up on something in the next row, in the rack just behind the user server. The aforementioned SCSI cable was at just about the same height as his rear end ...

After that, whenever a server went down, we would joke that someone must have "pulled a Jake" on us again.

Some years later, Jake had moved into Network Engineering, and he was tasked with doing a UPS bypass test during a facility audit. The UPS had a big dial on the front with several positions (on, bypass, and so forth). He turned the dial from "on" to "bypass" at the proper time, and was suddenly standing in a very dark, quiet room. Nobody had told him that you couldn't turn the dial that way; you had to turn it the other way, through every other position, to execute a bypass.

The next day we all asked him if he'd turned the dial with his rear end ...
