
Distributed Computing Department
Computational Research Division
Lawrence Berkeley National Laboratory


Methods for Network Analysis and Troubleshooting

Background information for understanding the tools netest and pipechar (paper; download binaries)

by Jin Guojun

Introduction

There are three methods for measuring the bandwidth of a given network: sender only (SO), sender and receiver paired (SRP), and receiver only (ROPP).
This document gives an overview of the strengths and weaknesses of each method and links to the LBNL implementations, netest and pipechar, which use the first two methods listed above for network troubleshooting and analysis.

Sender only method

The sender-based or "sender only" (SO) method is the most complicated and powerful of the three, and a great variety of probing algorithms can be derived from it. One limitation is that SO methods have difficulty producing accurate measurements of the exact bandwidth of each network segment; what they yield instead is a precise measure of the proportional bandwidth. Because they can probe hop by hop, SO methods are especially well suited to dynamically detecting the bottleneck in a communication path. The same method can also detect the patterns of dropped packets that cause poor performance. To do so, however, the algorithm must transmit a large amount of data over an already distressed network for an extended period of time, which adds congestion and makes the result inaccurate. Unless there is no way to access the remote host, SO methods are not recommended for diagnosing network problems of this type. Instead, SO methods should be used to locate the problem spot and to provide essential information on the state of the network. Detailed analysis of network problems is better performed by sender-receiver paired methods.
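To illustrate how proportional measurements can still locate a bottleneck, here is a minimal Python sketch. It is not pipechar's algorithm; the function name, the hop labels, and the sample ratios are all hypothetical. It assumes a probe has already produced one relative bandwidth figure per hop, and optionally that the absolute capacity of a single hop is known from some other source.

```python
def locate_bottleneck(relative_bw, known_hop=None, known_mbps=None):
    """Return (bottleneck_hop, scaled) from per-hop proportional bandwidth.

    relative_bw: dict mapping hop label -> relative bandwidth (any unit).
    known_hop / known_mbps: optional calibration point; if given, every
    hop is scaled to Mb/s using that single known capacity.
    """
    if not relative_bw:
        raise ValueError("need at least one hop")
    # The bottleneck is the hop with the smallest proportional bandwidth;
    # this holds whatever the unknown absolute scale is.
    bottleneck = min(relative_bw, key=relative_bw.get)
    scaled = None
    if known_hop is not None and known_mbps is not None:
        factor = known_mbps / relative_bw[known_hop]
        scaled = {hop: bw * factor for hop, bw in relative_bw.items()}
    return bottleneck, scaled


# Hypothetical ratios for a four-hop path, calibrated on a known first hop.
hops = {"hop1": 1.0, "hop2": 10.0, "hop3": 0.45, "hop4": 1.55}
bottleneck, scaled = locate_bottleneck(hops, known_hop="hop1", known_mbps=100.0)
```

The point of the sketch is only that ranking hops needs ratios, not absolute figures; absolute values appear only once an outside calibration point is supplied.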

Sender and receiver paired method

Sender and receiver paired (SRP) methods have the same power as SO methods, with more features. To use this method, programs must be installed and running on both endpoints of the connection. Because the SRP sender sends data directly from source to destination with a TTL (time to live) large enough to traverse the intervening path, this method reports characteristics of the entire path, as opposed to the hop-by-hop reporting that SO performs. Because the software is deployed on both ends, SRP can easily see any packet dropped in either direction on the given network, and it can vividly depict what happens on the problem network link. In 1991, netest, a tool using this mechanism, was developed for network problem analysis.
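As a rough illustration of why having software on both ends makes drops easy to see, the following Python sketch compares the sequence numbers one end sent with those the other end received. It is not taken from netest; the function name and the sample logs are hypothetical, and it assumes each probe packet carries a sequence number that both ends record.

```python
def drop_report(sent_seqs, received_seqs):
    """Compare sequence numbers logged by the sending and receiving ends.

    Returns the sorted list of lost sequence numbers and the list of
    loss bursts as (first_lost, last_lost) tuples.
    """
    received = set(received_seqs)
    lost = sorted(s for s in set(sent_seqs) if s not in received)
    bursts = []
    for seq in lost:
        if bursts and seq == bursts[-1][1] + 1:
            bursts[-1] = (bursts[-1][0], seq)   # extend the current burst
        else:
            bursts.append((seq, seq))           # start a new burst
    return lost, bursts


# Hypothetical logs: forward direction (A -> B) and reverse (B -> A).
forward_lost, forward_bursts = drop_report(range(10), [0, 1, 2, 6, 7, 9])
reverse_lost, reverse_bursts = drop_report(range(5), [0, 1, 2, 3, 4])
```

Running the same comparison once per direction is what lets a paired tool tell forward loss from reverse loss, which a sender-only probe cannot separate.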

Receiver only method

The receiver-based or "receiver only packet pair" (ROPP) method is passive; that is, the software runs only on the receiving end. This method depends on a pair of packets of the same size that are transmitted back-to-back and received in the same order without separation. This requires that the measured network be healthy, that is, that there be no congestion on the network under test. Otherwise, the FSE (addressed in the formal paper) will occur and the measurement will be reported incorrectly. Because of these limitations, the ROPP mechanism can be used only for measuring static bandwidth bottlenecks. Other drawbacks of the ROPP method are that it requires kernel modification (because the packets can be seen only at the kernel level), and that it cannot be configured to measure a specific path, because it does not control which source sends it data. The advantage of the ROPP method is that it does not affect network traffic at all during the measurement. ROPP is therefore an excellent mechanism for monitoring incoming traffic and configuring TCP to open an appropriately sized receive window. Because we have developed algorithms that use the SRP and SO methods to quickly measure the static bandwidth of each segment, the ROPP method is less used for network analysis.
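The text above does not give a formula, so the following Python sketch uses the commonly cited packet-pair estimate as an assumption: the bottleneck bandwidth is taken to be the packet size divided by the gap between the two arrivals. The function name and the sample values are hypothetical.

```python
def packet_pair_bandwidth(packet_bytes, arrival_first, arrival_second):
    """Estimate bottleneck bandwidth in bits per second from one packet pair.

    packet_bytes: size of each packet (both must be the same size).
    arrival_first, arrival_second: receive timestamps in seconds.
    """
    gap = arrival_second - arrival_first
    if gap <= 0:
        raise ValueError("second packet must arrive after the first")
    # Assumed model: the bottleneck spaces the pair by the time it needs
    # to serialize one packet, so size / gap approximates its bandwidth.
    return packet_bytes * 8 / gap


# Hypothetical pair: two 1500-byte packets arriving 120 microseconds apart.
estimate_bps = packet_pair_bandwidth(1500, 0.0, 0.000120)
```

Under this model, anything that changes the gap between the two packets after the bottleneck changes the estimate, which is consistent with the requirement that the measured network be free of congestion.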

netest and pipechar

When compared side by side, the SO and SRP mechanisms are the better methods for network analysis and monitoring. Many existing tools, such as ping and traceroute, are based on SO and SRP technologies. We have developed an SRP-based tool, netest, which performs detailed network analysis and simulation. In addition, we now have an SO-based tool, pipechar, which was extracted from the NCSD (Network Characterization Service daemon) to assist netest by quickly pinning down the problem router and finding the "hidden" switch (unresponsive router). Together, netest and pipechar can be used as components of a real-time network analysis and monitoring toolkit. The tools do, however, have distinct and rather complex options, which makes them easier to use when distributed as separate codebases.

Download

The binaries are available for downloading. Since pipechar reports dynamic network characteristics, you may see different results from time to time. This version reports how reliable the information is.

Attention: Please see the NCSD HighLight for important information about understanding this service.
A key issue in using pipechar is that you should NEVER run multiple instances of pipechar on a host. It is a desktop tool extracted from the NCSD, and only one pipechar can be started from a host at a time. Running multiple instances from a single host simultaneously will cause them to hang each other. Pipechar is not designed for that job, nor for programming and development. It is a desktop tool that simulates an NCS service to perform a single, quick task. For programming and development, as well as for querying multiple paths from one host, you have to use NCS.

Because many people are confused on this subject, I would like to add this section to explain it further. This is NOT an open issue. It is a network layer (L3) vs. transport layer (L4) issue. Multiplexing capability exists at L4 only. Because all switches and routers are link layer (L2) devices, network layer devices, or both, and the control messages arrive via ICMP at L3, there is no way for multiple pipechar programs running on one host to share and demultiplex a single incoming ICMP stream. An incoming ICMP packet is delivered to all open RAW sockets.
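The delivery behaviour described above can be pictured with a small simulation. This Python sketch opens no real sockets; `RawSocketModel`, `deliver`, and the probe identifiers are hypothetical names used only to model one incoming ICMP stream being copied to every open raw socket on a host.

```python
from collections import namedtuple

IcmpReply = namedtuple("IcmpReply", ["probe_id", "arrival_time"])


class RawSocketModel:
    """Toy model of one open raw ICMP socket (not a real socket)."""

    def __init__(self, owner_id):
        self.owner_id = owner_id
        self.queue = []


def deliver(open_sockets, reply):
    # Model of the behaviour described above: the incoming ICMP packet
    # is copied to every open raw socket, not only to the interested one.
    for sock in open_sockets:
        sock.queue.append(reply)


# Three hypothetical probing processes on one host, each with its own socket.
sockets = [RawSocketModel(owner_id=i) for i in range(3)]

# Twelve replies arrive in total, four addressed to each process.
for t in range(12):
    deliver(sockets, IcmpReply(probe_id=t % 3, arrival_time=t * 0.001))

for sock in sockets:
    own = [r for r in sock.queue if r.probe_id == sock.owner_id]
    foreign = len(sock.queue) - len(own)
    print(f"process {sock.owner_id}: {len(own)} own replies, {foreign} foreign")
```

In this model every process has to read and discard the replies meant for the others, so the filtering work per process grows with the number of probing processes on the host.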

Question:  traceroute is also a network layer tool. How does it work if I run multiple traceroute instances simultaneously from a host?

Answer:  traceroute does not time the incoming ICMP packets; it only needs to know what type each incoming ICMP packet is. In this case, traceroute can afford to spend time tossing the ICMP packets that do not belong to it. PxCHAR is different: if it spends a lot of time filtering out irrelevant ICMP packets, the timing is ruined. That is it. Happy :-)
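A toy timing model makes the difference concrete. The Python sketch below is not how traceroute or pipechar is implemented; it assumes, purely for illustration, that a tool timestamps a reply when it reads the packet at user level and that every packet read, kept or tossed, costs a fixed amount of processing time. The function name and all numbers are hypothetical, and times are integer microseconds.

```python
def observed_times(queue, owner_id, per_packet_cost):
    """Return (true_arrival, observed_time) for each reply owned by owner_id.

    queue: list of (probe_id, arrival_time) in arrival order, microseconds.
    per_packet_cost: assumed microseconds of user-level work per packet
    read, whether the packet is kept or tossed.
    """
    clock = 0
    results = []
    for probe_id, arrival in queue:
        # A packet cannot be read before it arrives, nor before the
        # process has finished with the packets ahead of it.
        clock = max(clock, arrival)
        if probe_id == owner_id:
            results.append((arrival, clock))   # timestamp taken at read time
        clock += per_packet_cost
    return results


# One reply for process 0 arriving at t = 10000 us on a quiet host.
alone = observed_times([(0, 10000)], owner_id=0, per_packet_cost=100)

# The same reply, queued behind five foreign replies that arrived at 9900 us.
busy = observed_times([(1, 9900)] * 5 + [(0, 10000)],
                      owner_id=0, per_packet_cost=100)

timing_error_us = busy[0][1] - busy[0][0]
```

In this model the reply type is read correctly in both cases, which is all traceroute needs, while the measured arrival time on the busy host is off by the time spent on foreign packets.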

The tutorial covers the basic usage of these tools.
Updated Friday, 15-Apr-2005 14:01:38 PDT