A Reboot Fixed It

From UntangleWiki

It has become common behavior in today's IT world to reboot servers when something isn't working as expected. I myself am as guilty of this as anyone. The joke goes that if a user complains about an IT problem, the IT admin should first ask whether they have rebooted twice.

This seems to have evolved from historical reliability issues in Microsoft's line of Windows operating systems. These OSes and the applications that run on them seem to have so many minor issues affecting long-term uptime and reliability that it is impossible to expect users to troubleshoot them. Given this, it has become common to expect to reboot occasionally or whenever problems arise, and that usually "fixes" the situation.

However, administrators I know from a Unix/Linux background tend never to reboot. In fact, many of them seem to hold their servers' uptimes as a point of pride. When something goes wrong, instead of rebooting immediately they figure out what is wrong and fix it, usually without a reboot!

The problem arises now that Linux is becoming mainstream and accessible to all sorts of users. Given our Windows-bred behavior, if a problem arises with a Linux server we just bounce the box and hope it's fixed. If someone asks about the issue, we quickly declare "A reboot fixed it" and move on with our lives. If I received a nickel each time I heard this, I could start my own venture capital fund.

The reality is that you should not have to reboot Linux machines. If you are having to frequently reboot your Linux/Unix servers, you probably *do* have a problem, and the reboot is not "fixing" anything; it's only removing the symptoms for a period of time. Unless you are prepared to reboot continuously and deal with other possible side effects, I'd advise you to do a few things when problems are encountered:

  1. DO NOT REBOOT
  2. Classify the problem very specifically. The problem isn't that "My computer doesn't work" or "The Internet is Down." What specifically isn't working?
  3. Collect as much relevant information as you can and then some. Often "irrelevant" information is quite relevant as the core cause is not always where one expects.
  4. Troubleshoot the issue. If you are unsure how to troubleshoot the issue, see step 1: DO NOT REBOOT. If you want to fix the issue, go find someone who can help you, describe the problem very specifically, and give them the information you gathered. If you do reboot, you've erased all the symptoms and made finding the problem much more difficult.
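
As a rough illustration of step 3, the information gathering can be scripted so that a snapshot of the machine's state is saved before anyone touches it. The Python sketch below is only one possible approach and is not part of Untangle. The function names, the output file layout, and the choice of commands (`uptime`, `free`, `df`, `ps`, `dmesg`) are assumptions made for the example; substitute whatever is relevant to the problem at hand.

```python
import datetime
import pathlib
import shutil
import subprocess

# Assumed set of read-only commands for illustration; adjust for your system.
DEFAULT_COMMANDS = {
    "uptime": ["uptime"],
    "memory": ["free", "-m"],
    "disk": ["df", "-h"],
    "processes": ["ps", "aux"],
    "kernel_log": ["dmesg"],
}


def run_command(argv, timeout=30):
    """Run one command and return its output, or a note saying why it did not run."""
    if shutil.which(argv[0]) is None:
        return f"[skipped: {argv[0]} not found]\n"
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError) as exc:
        return f"[failed: {exc}]\n"
    # Keep stderr as well: error messages are often the interesting part.
    return result.stdout + result.stderr


def collect_snapshot(out_dir, commands=None):
    """Write the output of every command to one timestamped file and return its path."""
    commands = DEFAULT_COMMANDS if commands is None else commands
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    out_path = pathlib.Path(out_dir) / f"snapshot-{stamp}.txt"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w") as fh:
        for label, argv in commands.items():
            fh.write(f"===== {label}: {' '.join(argv)} =====\n")
            fh.write(run_command(argv))
            fh.write("\n")
    return out_path
```

Calling `collect_snapshot("diagnostics")` would write a file such as `diagnostics/snapshot-<timestamp>.txt`. Save it somewhere that survives a reboot, since some systems clear temporary directories at boot, and hand it to whoever helps you troubleshoot.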

This also applies to Untangle. Untangle is based on Linux and you should never have to reboot it. If you are having to reboot it, you have an issue. It could be a hardware issue or a software issue, or it might not be related to Untangle at all. Rebooting will not solve the issue; it will only make troubleshooting the issue harder.