CCNP Course Institute in Delhi

Tuesday, December 14, 2010

Troubleshooting Methodology CCIE Security Training Institute in Delhi Gurgaon India

Network Bulls
www.networkbulls.com
Best Institute for CCNA CCNP CCSP CCIP CCIE Training in India
M-44, Old Dlf, Sector-14 Gurgaon, Haryana, India
Call: +91-9654672192


The responsibilities of a network administrator boil down to four essential measurements: maximize performance and
availability; minimize cost and time-to-repair.
This chapter focuses on minimizing time-to-repair. The time it takes to restore functionality is predicated on two things:
preparation and technique. The previous chapter spoke about the elements of preparation, such as documentation and
scheduled preventative maintenance. This chapter focuses on the techniques that you can apply to minimize downtime.
Each of the troubleshooting practices described in this chapter assumes that good documentation exists and that appropriate
tools are available. Troubleshooting is much more frustrating and time-consuming when the necessary preparation
has not been done.
Principles
The scientific method is commonly described as a six-step process:
1. Define the problem.
2. Gather information.
3. Hypothesize.
4. Test hypothesis.
5. Analyze test.
6. Interpret results and, if necessary, generate a new hypothesis.
NOTE: Cisco's Troubleshooting test doesn't assume a specific approach. Many different approaches might be successful in specific situations. The test does advocate a structured approach to troubleshooting, based on the scientific method.
© 2010 Pearson Education, Inc. All rights reserved. This publication is protected by copyright. Please see page 69 for more details.
CCNP TSHOOT 642-832 Quick Reference by Brent Stewart
www.CareerCert.info
CHAPTER 2
Troubleshooting Methodology
The first step—problem description—is usually accomplished when a user reports a problem. The initial problem description
tends to be vague or overly general (“The Internet is down!”). A troubleshooter’s initial response should therefore be
to gather more information and build a more specific description. You can determine symptoms by talking to the user, by
personal observation, or by referring to management systems such as NetFlow, syslog, and SNMP monitors.
When you have an adequate description of the problem, you can form a hypothesis. A hypothesis is a potential cause
whose symptoms would match those observed. A good hypothesis usually suggests a way to support or refute
itself. For instance, if you suspect that the WAN connection is down, looking at the interface status or pinging a remote
device would test that theory.
Test results will either support or refute a theory. A single test result cannot prove a theory; it can only support it. For example,
ping might be used to test a WAN connection. A ping timeout cannot, by itself, be considered definitive. The target might
be shut down or have a firewall that drops ICMP. Test results should be confirmed through a number of different lines of
evidence. If the tests contradict the hypothesis, start over with a new theory.
After a hypothesis is accepted as a reasonable explanation, you can take action to fix the problem. Of course, any action
is another type of test. If the action doesn’t fix the problem, simply develop a new hypothesis and repeat the process.
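The hypothesize-and-test cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hypothesis` class, the `first_supported` function, and the hard-coded `observed` values are hypothetical stand-ins, and real checks would query devices or monitoring tools instead of a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """A candidate cause plus the independent checks that would support it."""
    name: str
    checks: List[Callable[[], bool]]  # each returns True if it supports the theory


def first_supported(hypotheses: List[Hypothesis]) -> Optional[Hypothesis]:
    """Return the first hypothesis that every one of its checks supports."""
    for hypothesis in hypotheses:
        # One passing check only supports a theory, so require all of them.
        if hypothesis.checks and all(check() for check in hypothesis.checks):
            return hypothesis
    return None  # every theory was contradicted: gather more information


# Hypothetical observations standing in for real test results.
observed = {"wan_interface_up": True, "remote_ping_ok": False, "dns_resolves": False}

candidates = [
    Hypothesis("WAN link down", [lambda: not observed["wan_interface_up"],
                                 lambda: not observed["remote_ping_ok"]]),
    Hypothesis("DNS failure", [lambda: not observed["dns_resolves"],
                               lambda: observed["wan_interface_up"]]),
]

result = first_supported(candidates)
print(result.name if result else "no hypothesis supported")
```

Here the failed ping alone would fit the WAN theory, but the interface status contradicts it, so the sketch moves on to the next hypothesis. Acting on the accepted hypothesis is still a further test, as the text notes.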
Structured Troubleshooting
The term structured troubleshooting describes any systematic way of collecting information, forming a hypothesis, and
testing. In a structured approach, each unsuccessful test rules out entire classes of possible solutions and gracefully
suggests the next hypothesis. An unstructured—random—approach usually takes much longer and is less likely to be
successful.
A number of techniques have been used successfully; their common feature is a rigorous and thoughtful approach that
collects and analyzes data:
• Top down: Start at the OSI application layer and drill down.
• Bottom up: Start with the OSI physical layer and work up.
• Divide and conquer: Start at the network layer and follow the evidence, developing specific tests of each hypothesis.
• Follow path: Consider the packet's perspective and examine the devices and processes the packet encounters moving
through the network. Understand the order of operations within each device to do this.
• Spot difference: Compare the configuration to an older version or to that of a similar device. Diff and WinDiff are
tools that make this comparison easy.
• Move the problem: Swap components to see if the problem moves with a device.
There isn’t a single “best method,” although a given technician might find one more intuitive or more suitable for a given
problem. It’s a good idea to be familiar with each technique and to change approaches if necessary.
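The spot-the-difference technique from the list above can also be scripted. The following is a minimal sketch using Python's standard `difflib` module rather than Diff or WinDiff; the `config_diff` function name and the sample configuration text (interface name and address) are assumptions made for illustration.

```python
import difflib
from typing import List


def config_diff(old: str, new: str) -> List[str]:
    """Return only the removed ('-') and added ('+') lines between two configs."""
    diff = list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="known-good", tofile="current", lineterm="", n=0,
    ))
    # The first two lines are the '---'/'+++' file headers; '@@' lines mark hunks.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical configuration snippets for the same interface.
known_good = """interface GigabitEthernet0/1
 ip address 10.1.1.1 255.255.255.0
 no shutdown"""

current = """interface GigabitEthernet0/1
 ip address 10.1.1.1 255.255.255.0
 shutdown"""

changes = config_diff(known_good, current)
for line in changes:
    print(line)
```

Comparing against an archived known-good configuration narrows attention to the lines that changed, which is often a quicker starting point than reading the whole configuration.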
Two troubleshooting tactics need special mention. Most technicians build up a reservoir of experience, which gives them
an intuition about the solution to a given problem. This can be incredibly impressive when it works; the trick is to not let
this become a series of random stabs when it doesn’t work.
FIGURE 2-1 The OSI Model
7 Application
6 Presentation
5 Session
4 Transport
3 Network
2 Data Link
1 Physical
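As a rough sketch of how the layer numbers in Figure 2-1 drive the top-down and bottom-up techniques, the Python below runs a set of per-layer checks in order and stops at the first failure. The `first_failing_layer` function and the hard-coded check results are hypothetical; real checks might inspect interface status, ARP tables, or application responses.

```python
from typing import Callable, List, Optional, Tuple

# (OSI layer number, layer name, check that returns True when the layer looks healthy)
LayerCheck = Tuple[int, str, Callable[[], bool]]


def first_failing_layer(checks: List[LayerCheck],
                        bottom_up: bool = True) -> Optional[Tuple[int, str]]:
    """Run checks in layer order and return the first (layer, name) that fails."""
    ordered = sorted(checks, key=lambda item: item[0], reverse=not bottom_up)
    for layer, name, check in ordered:
        if not check():
            return layer, name
    return None  # every layer passed


# Hypothetical results: a routing fault breaks layer 3 and everything above it.
checks: List[LayerCheck] = [
    (1, "Physical", lambda: True),
    (2, "Data Link", lambda: True),
    (3, "Network", lambda: False),
    (7, "Application", lambda: False),
]

print(first_failing_layer(checks, bottom_up=True))
print(first_failing_layer(checks, bottom_up=False))
```

Bottom up stops at the lowest broken layer, which is usually closest to the root cause; top down stops at the symptom the user sees and must keep drilling down from there.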
Networkers also look for things that happened at about the same time, on the theory that similar timing implies causation.
This thinking is a logical error: post hoc ergo propter hoc. Sometimes timing does provide a clue, but large networks
have many things happening simultaneously every second. This troubleshooting method can easily provide a false
lead.
The Troubleshooting Method
Troubleshooting a network falls into a series of steps that mirror the scientific method.
The first step in troubleshooting is to define the problem. Some users, for instance, might report that “The Internet is
down,” when what they mean is “My e-mail is taking a long time to download.” Some users over-generalize or exaggerate
for effect, but most users lack the technical sophistication to tell which symptoms are relevant. Always start the troubleshooting
process by gathering a detailed description of the problem. Ask questions to gather details, such as the names
and locations of affected devices. One good way to gather details is to ask about how the problem can be duplicated.
(“So, if I browse the web I’ll see this problem?”)
After defining the problem, gather information about the problem. What is the scope? What other devices or locations are
affected? When did it start? How can you test the problem?
As information is gathered, one or more theories might begin to form. Develop tests that confirm or refute the theories,
and work to find the root cause. Tests can be as simple as pings or as complex as implementing a configuration change;
the tests should be aimed at separating valid theories from invalid ones.
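One way to think about choosing tests that separate theories is to write down what each theory predicts for each test. The sketch below is illustrative only; the function names, the two example theories, and their predicted outcomes are assumptions, not part of any Cisco tool.

```python
from typing import Dict, List

# predictions[test][hypothesis] = result the test should give if the hypothesis is true
Predictions = Dict[str, Dict[str, bool]]


def discriminating_tests(predictions: Predictions) -> List[str]:
    """Tests whose predicted result differs between at least two hypotheses."""
    return [test for test, by_hyp in predictions.items()
            if len(set(by_hyp.values())) > 1]


def surviving_hypotheses(predictions: Predictions,
                         observed: Dict[str, bool]) -> List[str]:
    """Hypotheses not contradicted by any observed test result."""
    names = {h for by_hyp in predictions.values() for h in by_hyp}
    return sorted(
        h for h in names
        # A missing prediction is treated as "not contradicted".
        if all(predictions[t].get(h, outcome) == outcome
               for t, outcome in observed.items())
    )


# Hypothetical theories and the outcomes each one predicts.
predictions: Predictions = {
    "browse web by hostname succeeds": {"WAN link down": False, "DNS failure": False},
    "ping remote IP succeeds": {"WAN link down": False, "DNS failure": True},
}

useful = discriminating_tests(predictions)
remaining = surviving_hypotheses(predictions, {"ping remote IP succeeds": True})
print(useful, remaining)
```

Browsing by hostname fails under both theories, so it cannot tell them apart; pinging by IP address can, which makes it the more valuable test to run first.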
When the testing process is complete, take a moment to consider the results. Do the results suggest a configuration or
hardware change? Is the problem resolved? If not, reconsider the problem description and the original hypothesis. Either
the problem was not completely and accurately described, or the hypothesis was incorrect and needs to be revisited.
When the problem is resolved, take some time to consider the changes. The state of the network and the problem resolution
need to be communicated, and documentation might need to be updated. Beyond these obvious steps, consider whether
the same problem could exist in other parts of the network. If the problem was in the configuration, think through the
configuration template used in your network and determine if the fix needs to be repeated preemptively on other devices.
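Checking whether a fix needs to be repeated elsewhere can be as simple as scanning stored configurations for the corrected line. The sketch below assumes configurations are already available as text; the `devices_missing_line` function, the device names, and the choice of `service timestamps log datetime msec` as the corrective line are illustrative assumptions.

```python
from typing import Dict, List


def devices_missing_line(configs: Dict[str, str], required_line: str) -> List[str]:
    """List devices whose configuration lacks the line that fixed the problem."""
    wanted = required_line.strip()
    return sorted(
        name for name, text in configs.items()
        if wanted not in {line.strip() for line in text.splitlines()}
    )


# Hypothetical device names and configuration excerpts.
configs = {
    "edge-rtr-1": "hostname edge-rtr-1\nservice timestamps log datetime msec\n",
    "edge-rtr-2": "hostname edge-rtr-2\n",
    "core-sw-1": "hostname core-sw-1\n service timestamps log datetime msec\n",
}

needs_fix = devices_missing_line(configs, "service timestamps log datetime msec")
print(needs_fix)
```

The resulting list is a starting point for a change request, not a reason to push the fix everywhere without review.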
Each organization has its own specific methods for working through the break/fix cycle. The important points here are to
work logically and methodically, and to view each problem as an opportunity to perfect the larger network.
Integrating Troubleshooting into Maintenance
Every interaction with the network is an opportunity to learn. Smart organizations capture information learned to solve
similar problems and to help understand the network in the future. Change control and documentation are the two principal
ways that feedback from network changes is incorporated into the maintenance cycle, as shown in Figure 2-2.
Preventative maintenance is ongoing, but changing conditions or reported problems create the need to make a change.
Troubleshooting identifies the corrective action to upgrade or repair the network. Throughout these processes, regular
communication with end users is critical to understand the problem and to gather feedback on the solution.
Communication with end users, within the team, and with management is pervasive throughout the cycle.
FIGURE 2-2 Maintenance Cycle (elements shown: Change Required, Troubleshooting, Communication, Change Control, Preventative Maintenance, Documentation)
Change control is a process found in many organizations with large networks. The change-control process is a formal
communication process for requesting and receiving permission. Change control provides an opportunity for management
and peers to be aware of and consent to the proposed change. The change process encourages the network technician to take
a deliberate and thoughtful approach. Finally, the change process creates a record of the change that can be incorporated
in documentation.
After a change is made and an issue is resolved, updating documentation must be seen as a part of the clean-up process.
Most organizations have records including IP addresses, inventory, configurations, and topology; changes need to be added to these
records. If the change is sufficiently broad, it might also need to be incorporated into standards and templates so that
other devices can be preemptively upgraded. As records and standards change, team members need to be educated on the
changes.
A baseline is a reading of the critical parameters of the network (such as latency and utilization) over a period of time.
The baseline serves as a record of normal behavior to help identify how performance has changed. Updating baseline
information is part of the documentation process.
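To show how a baseline helps identify changed performance, the sketch below summarizes a set of latency samples and flags readings that fall far outside them. The sample values, the function names, and the three-standard-deviation tolerance are assumptions chosen for illustration; a real baseline would come from measurements collected over days or weeks.

```python
from statistics import mean, stdev
from typing import List, Tuple


def build_baseline(samples: List[float]) -> Tuple[float, float]:
    """Summarize normal behavior as (average, standard deviation)."""
    return mean(samples), stdev(samples)


def is_abnormal(value: float, baseline: Tuple[float, float],
                tolerance: float = 3.0) -> bool:
    """Flag a reading more than `tolerance` standard deviations from the average."""
    average, spread = baseline
    return abs(value - average) > tolerance * spread


# Hypothetical latency readings in milliseconds taken during normal operation.
latency_ms = [20.0, 22.0, 21.0, 19.0, 23.0, 20.0, 22.0, 21.0]
baseline = build_baseline(latency_ms)

print(is_abnormal(45.0, baseline))
print(is_abnormal(23.0, baseline))
```

After a change is made, the samples should be collected again so that the stored baseline reflects the new normal.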
Think about troubleshooting as a holistic process. Approach each issue with a rational, evidence-based philosophy, make
thoughtful changes, and communicate with all the invested groups often.
