There are two elements to good troubleshooting – preparation and technique. Preparation comes in the form of documentation, change control, and understanding of the environment. The second part, technique, is just as important.
There are a number of methods to tackle the same problem. To be honest, Cisco doesn’t promote a specific approach for the CCNP TSHOOT exam. The important part is that you are consistent and your troubleshooting methodology follows a structured approach.
What Cisco calls structured troubleshooting simply means you use a system to solve a problem: collect information about the problem, form a hypothesis, and then test it. The structured approach is also helpful when your hypothesis fails, because a failed test may rule out many other scenarios and will likely point to the next hypothesis to test. The recovery time for a structured troubleshooting approach is usually much less than for randomly changing configurations or settings in a hurry to get things working.
There are several structured troubleshooting approaches, with these being the most common:
Start with the OSI physical layer and work your way up.
Start with the OSI application layer and work your way down.
Consider the path a packet would take from source to destination, checking each node/device/configuration along the way.
Compare the configuration that is currently running against what the expected configuration should be.
Move a device to see if the problem moves with it.
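The configuration comparison approach from the list above lends itself to a quick script. Here is a minimal sketch in Python using the standard library's difflib module. The function name config_diff and both configuration snippets are made up for illustration; in practice you would load the expected configuration from your documentation or archive and the running one from the device.

```python
import difflib
from typing import List


def config_diff(expected: str, running: str) -> List[str]:
    """Return unified-diff lines between the expected and running configs."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            running.splitlines(),
            fromfile="expected",
            tofile="running",
            lineterm="",
        )
    )


# Hypothetical configuration snippets, for illustration only.
EXPECTED = """interface GigabitEthernet0/1
 ip address 10.0.0.1 255.255.255.0
 no shutdown"""

RUNNING = """interface GigabitEthernet0/1
 ip address 10.0.0.1 255.255.255.0
 shutdown"""

if __name__ == "__main__":
    # Lines starting with "-" exist only in the expected config,
    # lines starting with "+" exist only in the running config.
    for line in config_diff(EXPECTED, RUNNING):
        print(line)
```

An empty result means the two configurations match line for line, which lets you rule out configuration drift and move on to the next hypothesis.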
Use the Scientific Method
The first step whenever you encounter a technical problem is to define the problem. This will involve collecting input from those experiencing the issue directly – things like “the Internet is down…” or “my email is slow…” or “I can’t get to my Facebook account when I should be processing TPS reports”… You get the idea. Keep in mind that they are describing symptoms – it’s your job to determine the problem behind the symptoms.
After you have identified the problem, it’s time to trim it down. What’s the scope? How many users are affected? What changed? When did it happen? Is it a constant problem or intermittent?
Now this is where your tool bag of structured troubleshooting methodologies should come out. Try one that you think best matches your hypothesis of the root issue and work your way through it. Did your test work? If not, continue through the layers, the path, or whatever approach you are using.
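To make "continue through the layers" concrete, here is a rough sketch in Python of walking the OSI layers in order and stopping at the first failed test. Everything in it is an assumption for illustration: the function name first_failing_layer is made up, and the lambda checks stand in for real tests you would run yourself (link status, ping, a TCP connection attempt, and so on).

```python
from typing import Callable, Dict, Optional

# OSI layers, ordered from bottom (physical) to top (application).
OSI_LAYERS = [
    "physical",
    "data link",
    "network",
    "transport",
    "session",
    "presentation",
    "application",
]

Check = Callable[[], bool]


def first_failing_layer(
    checks: Dict[str, Check], bottom_up: bool = True
) -> Optional[str]:
    """Run the per-layer checks in order and return the first layer that fails.

    Layers without a check are skipped. Returns None if every check passes.
    """
    order = OSI_LAYERS if bottom_up else list(reversed(OSI_LAYERS))
    for layer in order:
        check = checks.get(layer)
        if check is not None and not check():
            return layer
    return None


# Hypothetical test results: the link is up, but routing is broken,
# so the application on top of it fails as well.
example_checks = {
    "physical": lambda: True,
    "data link": lambda: True,
    "network": lambda: False,
    "application": lambda: False,
}
```

With these made-up results, a bottom-up walk stops at the network layer, while a top-down walk (bottom_up=False) stops at the application layer first. That matches the two approaches: top-down confirms the symptom the user reported, and bottom-up gets you closer to the underlying fault.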
When a test succeeds and you have confirmed that you have in fact found the root cause, make sure to communicate the problem and recovery to all stakeholders and update any necessary documentation. These are small, simple tasks – but they are rarely done consistently.
If a configuration change was the culprit, think about your current change control policy and ask if it needs to be updated.