When Good Networks Go Bad

10/3/2016 8:10:07 AM

Part I
 
An Ethernet network is a complex system that typically includes equipment from a variety of vendors, as well as a variety of software programs to manage it. As a network grows, it becomes even more complex. Cabling may include CAT5e or higher twisted-pair copper, fiber, and coax. Hardware may include switches, routers, media converters, serial devices, and even telecom devices such as VDSL, DS3/E3 and T1/E1 equipment. Although the IEEE 802.3 working group has created standards to help all types of network equipment “talk” to each other, numerous problems still occur on a daily basis.

Problems can occur anywhere in the network.  In this article I am going to focus on just a few of the issues that could be slowing a network down, creating instability, or actually crashing the network. 

Looking at it from the worm’s-eye view
Having spent several years troubleshooting Ethernet network devices in LANs and WANs myself, I want to share some troubleshooting tips for Layer 1, the Physical Layer (hardware). First, it’s worth mentioning that network folks often start troubleshooting at Layer 2, Layer 3, or above. But the best practice is to start at Layer 1, with the hardware.

We have all chuckled at stories about customers who call in about equipment that isn’t working, and it turns out that they haven’t plugged it in. Incredibly, this actually happens.  So the first step is checking the connections and the cabling. 

Many network folks have spare cables.  If so, they can replace a suspected segment with a new one as a simple test – provided that this is possible. But it’s more likely that network technicians will find that they need some kind of test equipment. Test equipment can be expensive, but some of it is worth the investment.  It can save time and money.  Cable testers, fiber optic meters and other simple devices can diagnose a suspected bad line quickly. A laptop also makes a good testing tool, whether in a lab or the field, as its interfaces can provide another way of establishing and testing connectivity.

If the cabling checks out, the next step is checking the network equipment itself. If you suspect that a network device is the root cause of a problem, start by looking at the LEDs. They will typically indicate link, activity, duplex mode, speed and alarms. Like traffic signals, LED colors have specific meanings.  They may be red, amber or green, but they’ll all tell you something about the device status. Many network devices also have diagnostic features that can be enabled to display status with an LED. If a fault has occurred, it may turn an LED on or cause it to change color. Quite often, simply inspecting the LEDs can let you diagnose a problem quickly. Spare network devices can also be used to help diagnose a “bad” segment, if they are available. There are many kinds of devices on a network, of course, so you can’t possibly carry around enough inventory to assess every possibility.  

The next step can be performing a loopback on the network device itself, depending upon its interfaces. Let’s say that it’s a copper-to-fiber network device. Connect the copper port to a laptop interface that supports the same speed, and loop the fiber port back onto itself. If the copper LED shows the interface as active, and the fiber port LED lights up, that is a pretty good indication that the problem you’re investigating is not caused by that device. T1, DS3 and VDSL devices offer an array of alarms to indicate where a fault is occurring, and they can provide a loopback test when enabled on the device itself.

The best practice is to test the cables or devices in a lab environment. The simpler the test bed, the better the chances of determining the problem without impacting the network’s performance. You may have to schedule service downtime, but that is better than a complete out-of-service condition.

The Dangers of Testing in an Active Network
In an active network, the danger in testing, or in setting up a network device for testing, is that it can cause a network loop. A network loop occurs when network switches or hubs are connected to themselves, or to one another more than once. When switches are connected this way, network packets can bounce between two switches indefinitely.
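
The definition above can be turned into a quick check on a list of cabled links. This is only a minimal sketch: the switch names and the `find_loop_links` helper are made up for illustration, and it assumes every link is a plain switch-to-switch connection with nothing on the network blocking redundant paths.

```python
def find_loop_links(links):
    """Return the links that close a loop in a switch topology.

    `links` is a list of (switch_a, switch_b) cable connections. A link closes
    a loop when its two ends are already connected by earlier links, which
    covers a switch cabled to itself and two switches cabled together twice.
    """
    parent = {}  # union-find forest: switch -> parent switch

    def root(node):
        # Follow parent pointers to the representative of this node's group.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # shorten the path as we go
            node = parent[node]
        return node

    loops = []
    for a, b in links:
        root_a, root_b = root(a), root(b)
        if root_a == root_b:
            loops.append((a, b))      # both ends already connected: a loop
        else:
            parent[root_a] = root_b   # merge the two groups
    return loops


# Hypothetical cabling: sw1 and sw2 are connected twice.
print(find_loop_links([("sw1", "sw2"), ("sw2", "sw3"), ("sw1", "sw2")]))
```

Any link the function reports is one whose removal would break a loop, which makes it a reasonable place to start unplugging.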

If a switch doesn’t know which port to use to send a packet out, it will send the packet out of all of its ports, except for the port that was the source of the packet. (The switch is trying to make sure that the packet gets to its destination.) Once the switch learns which port a computer is connected to, it will stop using every port to send packets to that computer. It will only send packets through the appropriate port. This is a standard procedure that can optimize the network. But it breaks down when there is a loop: a flooded packet comes back around the loop and is flooded again, so it keeps circulating and consuming bandwidth.
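
The flood-then-learn behavior can be sketched in a few lines. This is a toy model rather than how any particular switch is implemented; the `LearningSwitch` class, the port numbers and the `pc-a`/`pc-b` names are all hypothetical.

```python
class LearningSwitch:
    """Toy model of a switch that floods until it learns where a device is."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # source address -> port it was last seen on

    def receive(self, in_port, src, dst):
        """Process one packet and return the list of ports it is sent out of."""
        self.table[src] = in_port  # learn which port the sender is on
        if dst in self.table:
            out_port = self.table[dst]
            # Never send a packet back out of the port it arrived on.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: flood out of every port except the source port.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(ports=[1, 2, 3, 4])
print(switch.receive(1, "pc-a", "pc-b"))  # [2, 3, 4]  pc-b unknown, so flood
print(switch.receive(2, "pc-b", "pc-a"))  # [1]        pc-a was learned on port 1
print(switch.receive(1, "pc-a", "pc-b"))  # [2]        pc-b is now known on port 2
```

In a loop, the first (flooded) case is the one that causes trouble, because each flooded copy arrives back at the switch on another port and is treated as a new packet.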

Telecom equipment usually has a (fault) loopback function. If the equipment is not switched out of the loopback test setting, data will not pass as expected. Always check the equipment to make sure that it hasn’t been left in loopback mode.

To PING or not to PING…
If the network is running slowly, you can PING the device in question. This assumes that the device is manageable and has been assigned an IP address. You can use a Command Line Interface on the PC to do this. Make sure that the device is connected to a PC, and that the expected LEDs are on. Remember that a managed device often has a default IP address until you replace it with a real, available IP address within your network.
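
If you check the same devices often, the PING can be scripted. The sketch below simply wraps the operating system's own `ping` command; the helper names are hypothetical, and it assumes `ping` is on the PATH and accepts `-n` (Windows) or `-c` (Linux and macOS) for the number of echo requests.

```python
import platform
import subprocess


def build_ping_command(host, count=4, system=None):
    """Build the ping command line for the current (or given) operating system."""
    system = system or platform.system()
    # Windows ping takes -n for the number of echo requests; Linux and macOS use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def is_reachable(host, count=4):
    """Return True if the ping command exits with status 0."""
    result = subprocess.run(
        build_ping_command(host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Example (placeholder address, substitute the device's real IP):
# is_reachable("192.168.1.10")
```

Treat the result as a rough signal only. The exit status reflects what the local `ping` program reports, so a slow or lossy link is better judged from the round-trip times and loss figures in the full output.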

There are caveats. Just because you can PING an IP address, don’t assume only one device has that exact address. Duplicate IP addresses occur on networks more often than you might think. Many network administrators prefer to assign static IP addresses. If they don’t manage the IP addresses carefully, duplicate IP addresses are almost inevitable. As a result, when two devices have the same IP address, one device may respond to the first PING, and the other may respond to a subsequent PING. A successful PING only confirms that something answered at that IP address. It does not confirm that the IP address has been assigned to just one device.
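
One way to spot a duplicate is to compare the hardware (MAC) addresses seen for each IP address. The following is a minimal sketch under that assumption: the `find_duplicate_ips` helper and the sample addresses are invented for illustration, and gathering the (IP, MAC) pairs, for example from ARP tables, is left to whatever tools you have on hand.

```python
from collections import defaultdict


def find_duplicate_ips(observations):
    """Map each IP address seen with more than one MAC address to those MACs.

    `observations` is an iterable of (ip, mac) pairs.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        macs_by_ip[ip].add(mac.lower())  # normalize case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Made-up sample data: 192.168.1.20 shows up with two different MAC addresses.
seen = [
    ("192.168.1.20", "00:11:22:33:44:55"),
    ("192.168.1.21", "00:11:22:33:44:66"),
    ("192.168.1.20", "00:AA:BB:CC:DD:EE"),
]
print(find_duplicate_ips(seen))
```

An address that appears in the result deserves a closer look, although some redundancy setups legitimately move one IP address between devices.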

DHCP (Dynamic Host Configuration Protocol) can be used instead of static IP addresses, but many administrators don’t like that option. They feel that static IP equals reliability. If they used DHCP, however, it would avoid most duplicate IP issues, because the DHCP server hands out each address to only one device at a time. When a device no longer needs its address, the address is released back to the available pool until it is needed again.
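
The pool idea can be illustrated with a toy model. This is not the DHCP protocol itself (there are no lease timers or network messages here); the `LeasePool` class and the client names are hypothetical and only show why a single authority handing out addresses avoids duplicates.

```python
class LeasePool:
    """Toy address pool: each address is leased to at most one client at a time."""

    def __init__(self, addresses):
        self.free = list(addresses)
        self.leases = {}  # client id -> leased address

    def request(self, client):
        """Lease an address to the client, reusing its current lease if it has one."""
        if client in self.leases:
            return self.leases[client]
        if not self.free:
            raise RuntimeError("address pool exhausted")
        address = self.free.pop(0)
        self.leases[client] = address
        return address

    def release(self, client):
        """Return the client's address to the pool so it can be leased again."""
        address = self.leases.pop(client, None)
        if address is not None:
            self.free.append(address)


pool = LeasePool(["192.168.1.100", "192.168.1.101"])
print(pool.request("pc-a"))  # 192.168.1.100
print(pool.request("pc-b"))  # 192.168.1.101
pool.release("pc-a")
print(pool.request("pc-c"))  # 192.168.1.100, reused after pc-a released it
```

Because every assignment goes through the one pool, two clients never hold the same address at the same time, which is the bookkeeping that manual static assignment has to get right by hand.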

Once you have established that the hardware is working as expected, you can move on to device configuration. Physical device failures do occur, but you’ll find that most network issues are caused by bad device configurations. Speed, mode and duplex settings are the keys, and they will be the topic of Part II. Until then, keep Part I in your troubleshooting toolkit.
 
