pathchar notes
(cobbled-notes-until-we-build-a-better-manpage,
with contributions from Becca Nitzan and
Evi Nemeth.
Addendum to caida's
pathchar page.)
Pathchar was written by Van Jacobson of
LBL's
Network Research Group.
Select way-alpha-release binaries:
ftp://ftp.ee.lbl.gov/pathchar.
Interpreting pathchar output
Part I: first line in output
The first line output looks as follows:
doing 24 probes at each of 64 to 1500 by 92
      ^^                   ^^    ^^^^    ^^
      a)                   b)    c)      d)
where
a) number of probes sent at each packet size; that is, each query
goes through a series of different packet sizes.
b) smallest packet size
c) largest packet size
d) packet size increment
So for this example output (the default),
each hop gets roughly (1500 - 64)/92 * 24 queries (about 375).
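As a rough check of that arithmetic, the sketch below lists the packet sizes implied by the first line and multiplies by the probe count. It assumes pathchar starts at the smallest size and never exceeds the largest; these notes do not confirm how the endpoints are handled, so the exact count may differ slightly from the quick estimate above. The function name is made up for illustration.

```python
def probe_sizes(min_size=64, max_size=1500, step=92):
    """Packet sizes from min_size upward in steps of `step`.

    Assumption: the first size is min_size and no size exceeds max_size.
    """
    return list(range(min_size, max_size + 1, step))


sizes = probe_sizes()
probes_per_size = 24  # value a) in the first output line
total = len(sizes) * probes_per_size
print(len(sizes), "sizes,", total, "probes per hop")
```

Under these assumptions the step of 92 yields 16 distinct sizes, the largest being 1444 bytes.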
Part II: dynamic output during execution
Pathchar's execution consists of assessing, in order,
each link along the path to the specific destination.
While pathchar is running, for the particular link
it is assessing at the time, it posts a series of
4 numbers on the screen for each of the
probes sent to that hop.
Example:
1:    24      156         0      0
hop:  round#  size-bytes  drops  rtt
where
. round is which round of probes it is sending
  (counting down from the number of rounds requested, 24 by default,
  to 1).
. packet size is the size of the packet used for that run
  (default ranges from 64 to 1500, or can be set on the command line).
. drops is the number of packets sent to this hop so far that have been dropped.
. rtt is average rtt as measured by ping [@@not sure that's right]
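If you want to log or plot the running status lines, a minimal parsing sketch follows. The field names are taken from the legend above; the type and helper names are hypothetical, and the units of the rtt field are whatever pathchar prints (not stated in these notes).

```python
from typing import NamedTuple


class ProbeStatus(NamedTuple):
    hop: int
    round_no: int   # counts down from the requested number of rounds
    size: int       # packet size in bytes
    drops: int      # packets to this hop dropped so far
    rtt: float      # rtt field, in the units pathchar prints


def parse_status(line: str) -> ProbeStatus:
    """Split a line such as '1: 24 156 0 0' into named fields.

    Anything after the first four numbers is ignored.
    """
    hop_part, rest = line.split(":", 1)
    round_no, size, drops, rtt = rest.split()[:4]
    return ProbeStatus(int(hop_part), int(round_no), int(size), int(drops), float(rtt))


status = parse_status("1: 24 156 0 0")
print(status)
```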
Part III: Pathchar execution output descriptions
3 paloalto-br1.bbnplanet.net (131.119.0.193)
| 38 Mb/s, 25.7 ms (63.7 ms), +q 1.24 ms (5.87 KB) *13
bandwidth propagation (rtt) queueing (queue size) *hinge
4 anl-atms.es.net (134.55.24.2)
. bandwidth available (not maximum, although pathchar
is estimating the maximum one can get at this time,
because it does many timed samples and essentially takes
the lowest RTT delay ones to make its bandwidth estimations)
. propagation delay for link, e.g., here, between hop 3 & 4.
. rtt not including q delay.
  Includes propagation delay, input delay and cpu queues
  on far hop (4), and the output queue for the time exceeded response.
  Of these the cpu queue is filtered out on the next hop,
  but the input/output queues carry forward incorrectly.
  @@check with van. More sophisticated deconvolution is
  required to separate the two.
  Should be close to maximum ping-rtt from origin pathchar host to this hop (4).
. queue delays
  of: hop 3's output q, hop 4's input and output q,
  or: Q-delay = (Q1+Q2+Q3) delays.
. hinge
  ratio of the interquartile distance (IQD)
  to the median (the IQD is the robust statistics
  equivalent of the standard deviation,
  so this ratio is the equivalent
  of the standard error of the mean).
  For normal `statistical' fluctuations,
  it should be around 1, and pathchar prints it
  if it's larger.
  In this case (*13) it implies that at least 25% of
  the probes saw a queue larger than 16 ms (13 x 1.24 ms);
  statistically it's essentially the same as
  having 1/4 of the data more than 13 standard
  deviations from the mean (unusual, perhaps some
  bimodal delay process due to router idiosyncrasy).
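To make the hinge definition concrete, here is a small sketch that computes the interquartile distance over the median for a list of samples. Which samples pathchar actually feeds into this ratio is not specified in these notes, so the input here is just illustrative data and the function name is made up.

```python
import statistics


def hinge_ratio(samples):
    """Interquartile distance (Q3 - Q1) divided by the median."""
    q1, _q2, q3 = statistics.quantiles(samples, n=4)
    return (q3 - q1) / statistics.median(samples)


ratio = hinge_ratio([1, 2, 3, 4, 5, 6, 7])
print(ratio)

# The 16 ms figure in the example is the queue delay times the hinge:
print(1.24 * 13)  # ms
```

A ratio near 1 is the unremarkable case the notes describe; the *13 in the example is far outside it.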
    ----------          ------------
   | rtr hop 3|        | rtr hop 4  |
   |          |<-------|UUUU:Q3     |
->-|---Q1:UUUU|------->|Q2:UUUU     |
    ----------          ------------
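The bandwidth note above says pathchar essentially keeps the lowest-RTT samples. A minimal sketch of that idea follows, assuming (this is not confirmed by these notes or by pathchar source) that the minimum RTT per packet size is fit with a straight line and that the link bandwidth comes from the change in slope between consecutive hops. All names and the synthetic data are hypothetical.

```python
def min_rtt_by_size(samples):
    """Keep the smallest RTT seen for each packet size.

    samples: iterable of (size_bytes, rtt_seconds) pairs.
    """
    best = {}
    for size, rtt in samples:
        if size not in best or rtt < best[size]:
            best[size] = rtt
    return best


def slope(points):
    """Least-squares slope of rtt versus size, in seconds per byte."""
    xs, ys = zip(*sorted(points.items()))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def link_bandwidth_bps(near_samples, far_samples):
    """Bandwidth of the link between two hops from the change in slope."""
    extra = slope(min_rtt_by_size(far_samples)) - slope(min_rtt_by_size(near_samples))
    return 8 / extra  # 8 bits per byte / (seconds per byte)


# Synthetic data: a 10 Mb/s path to the near hop, then a 1 Mb/s link,
# with some samples delayed by made-up queueing.
sizes = (64, 500, 1000, 1500)
queueing = (0.003, 0.0, 0.001)
near = [(s, 0.001 + s * 8 / 10e6 + q) for s in sizes for q in queueing]
far = [(s, 0.002 + s * 8 / 10e6 + s * 8 / 1e6 + q) for s in sizes for q in queueing]
bw = link_bandwidth_bps(near, far)
print(f"{bw / 1e6:.2f} Mb/s")
```

The minimum filter only works if at least one probe per size sees empty queues, which is one way to read the advice elsewhere in these notes about raising the probe count on busy paths.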
Part IV: command line arguments
-a:
-A:
-d:
-D: filename for debug output
-f: initial hop #
-F: probefilter
-i: intersampletime
-l: max ttl
-L: locality
-m: maxsize maximum packet size in bytes
    (if absent, pathchar determines path MTU)
-M: minsize default: smallest possible
    (want a large differential between max and min size;
    otherwise a node with a queue will impact later nodes)
-n: don't dns-resolve
-p: port @@?
-q: #queries, default 32
-Q: bytes,
    if (-), packet size increment per query, defaults to 92.
    if (+), number of sizes, defaults to 32
-s: lsrr?
-S: fit spacing
-t: tos
-v: verbose mode
-V: verbose?
-w: seconds wait time
notes:
If a queue at one hop disappears at the next hop, it is likely the CPU queue
at the first hop.
Part V: sample output
riesling ~ 79% 14:24: pathchar ka9q.ampr.org
pathchar to ka9q.ampr.org (129.46.90.35)
mtu limitted to 1500 bytes at local host
doing 32 probes at each of 64 to 1500 by 44
0 192.172.226.24 (192.172.226.24)
| 9.3 Mb/s, 269 us (1.83 ms)
1 pinot (192.172.226.1)
| 85 Mb/s, 245 us (2.46 ms), 1% dropped
2 sdscdmz-fddi.cerf.net (198.17.46.153)
| 45 Mb/s, -13 us (2.70 ms)
3 qualcomm-sdsc-ds3.cerf.net (134.24.47.200)
| 8.8 Mb/s, 1 us (4.07 ms)
4 krypton-e2.qualcomm.com (192.35.156.2)
| 5.2 Mb/s, 1.02 ms (8.42 ms)
5 ascend-max.qualcomm.com (129.46.54.31)
| 53.2 Kb/s, 4.20 ms (243 ms)
6 karnp50.qualcomm.com (129.46.90.33)
| 12 Mb/s, -172 us (243 ms), +q 8.96 ms (13.0 KB) *3, 6% dropped
7 unix.ka9q.ampr.org (129.46.90.35)
7 hops, rtt 11.1 ms (243 ms), bottleneck 53.2 Kb/s, pipe 4627 bytes
riesling ~ 80% 15:30:
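The per-link lines in this sample can be cross-checked against the summary line. The sketch below pulls the bandwidth and latency out of each `|` line; in this one sample the summary's bottleneck equals the smallest link bandwidth, and its rtt of 11.1 ms equals twice the sum of the per-link latencies. That pathchar computes the summary this way is an inference from this sample, not something the notes state. The regular expression only covers the units that appear here.

```python
import re

UNIT_TO_MS = {"us": 0.001, "ms": 1.0}
UNIT_TO_BPS = {"Kb/s": 1e3, "Mb/s": 1e6}

LINK_RE = re.compile(r"^\|\s*([\d.]+)\s*([KM]b/s),\s*(-?[\d.]+)\s*(us|ms)\b")


def parse_link(line):
    """Return (bandwidth_bps, latency_ms) for a '| ...' link line, else None."""
    m = LINK_RE.match(line.strip())
    if not m:
        return None
    bw, bw_unit, lat, lat_unit = m.groups()
    return float(bw) * UNIT_TO_BPS[bw_unit], float(lat) * UNIT_TO_MS[lat_unit]


sample = """\
| 9.3 Mb/s, 269 us (1.83 ms)
| 85 Mb/s, 245 us (2.46 ms), 1% dropped
| 45 Mb/s, -13 us (2.70 ms)
| 8.8 Mb/s, 1 us (4.07 ms)
| 5.2 Mb/s, 1.02 ms (8.42 ms)
| 53.2 Kb/s, 4.20 ms (243 ms)
| 12 Mb/s, -172 us (243 ms), +q 8.96 ms (13.0 KB) *3, 6% dropped
"""
links = [parse_link(line) for line in sample.splitlines()]
bottleneck = min(bw for bw, _ in links)
one_way_ms = sum(lat for _, lat in links)
print(f"bottleneck {bottleneck / 1e3:.1f} Kb/s, 2 x latency sum {2 * one_way_ms:.1f} ms")
```

Note that the small negative latencies (-13 us, -172 us) are kept as printed; they are estimation noise and are included in the sum.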
Part VI: Important interpretation Notes
[from van]:
Getting reasonable estimates
It takes many probes for pathchar to ascertain the bandwidth
of a fast link (>50 Mb/s). Since we don't yet have a method for
pathchar to automatically decide that its fit estimate
is `good enough', you need to manually set the
number of probes (via the -q flag)
based on a guess about the path bandwidth & workload.
The default of 32 probes leads to ~2 minutes/hop of probing,
sufficient for links up to 10 Mb/s.
Rough guidelines:
- use -q 64 for relatively quiet paths at fddi speed or slower.
- use -q 128 for busy paths at fddi speed or slower (especially
  for a fast link beyond a busy, slower link),
  or for quiet paths faster than fddi.
- use -q 256 or -q 512 for links faster than
  fddi on a busy path (i.e., busy not on the fast link(s) but on upstream
  links -- the most damage is done by slow busy links upstream
  of the fast link you're trying to measure).
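The guidelines above can be written down as a small lookup, sketched here. The function and its arguments are hypothetical helpers, not part of pathchar; the only outside fact used is FDDI's nominal 100 Mb/s rate, and "busy" remains your own guess about the path.

```python
FDDI_MBPS = 100  # nominal FDDI line rate


def suggest_q(fastest_link_mbps, busy):
    """Map the rough guidelines onto a value for the -q flag.

    fastest_link_mbps: guessed speed of the fastest link on the path.
    busy: True if the path (especially slower upstream links) carries load.
    """
    faster_than_fddi = fastest_link_mbps > FDDI_MBPS
    if faster_than_fddi and busy:
        return 256  # the notes suggest 256 or 512 in this case
    if faster_than_fddi or busy:
        return 128
    return 64


print(suggest_q(10, busy=False))
print(suggest_q(155, busy=True))
```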
parenthetical stickiness on one hop
If at a given node, pathchar prints something like
9: 6 1208 28 102 ->206.34.78.27 (54358)
it means that the source address has changed in the
ICMP time exceeded reply.
Pathchar remembers the first address it encounters
for this hop, and prints this message (and otherwise
ignores the response) whenever a reply comes from a different address.
The number in parentheses is the total number of
responses from the 'wrong' address(es).
Once this hop completes, pathchar prints out
the list of 'wrong' addresses & the
number of replies from each, together with the
estimates for the first address.
If the first address pathchar sees belongs to
a low-probability alternate path,
it can take a long time to do a
full cycle of probes to it (the round number
would decrement slowly).
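When scanning a long run for this condition, the `->address (count)` suffix can be picked out with a short pattern, sketched below. The pattern is based only on the single example line above, so other forms of the message may not match; the helper name is hypothetical.

```python
import re

ALT_RE = re.compile(r"->\s*(\S+)\s+\((\d+)\)\s*$")


def alternate_reply(line):
    """Return (address, total_wrong_replies) if the line has a '->' suffix."""
    m = ALT_RE.search(line)
    return (m.group(1), int(m.group(2))) if m else None


print(alternate_reply("9: 6 1208 28 102 ->206.34.78.27 (54358)"))
print(alternate_reply("1: 24 156 0 0"))
```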
last updated 28 may 97, kc@nlanr.net
questions, feedback: info@caida.org