Archive for March, 2009

Dell Wireless 5720 Verizon EVDO GPS

March 17th, 2009

So I was playing poker this evening with a friend of mine (who is also an IT professional) and he mentioned that his built-in Dell EVDO card (the Verizon version) had GPS capabilities.  I was not aware of this, so I asked some additional questions, as it would be nice to have a GPS in my laptop when using Google Earth, etc.  As it turned out, he has the exact same card as I do, except his is in an “E” series laptop rather than my D630.  His card did not even have an active service account, but the GPS still worked.

Device Properties

My friend’s Dell Mobile Broadband Utility had an extra button that mine did not: “GPS Status”!  I was a bit miffed since I had just done a complete laptop rebuild including all the latest and greatest drivers and Dell Wireless software.  We compared all our driver and application versions, and it turned out I actually had a slightly newer software build and equivalent driver versions.

Dell Mobile Broadband Card Utility

I pulled out my trusty friend Google and came across this.  I flipped the registry setting as described in the article and re-launched the Dell Mobile Broadband Utility.  Just as expected, a couple of new COM port drivers were installed and the GPS button appeared!

GPS Status

I have no idea why this was enabled on his laptop but not mine, but I am glad turning it on was so simple!  Now I need to find some good apps to make use of it.  Google Earth 5.0 sort of works with it, but it does not always show the dot marking where I am (it does know where I am, however, as it centers the view properly).

P.S. Here are the software and driver versions I am running:

Dell Mobile Broadband Card Utility Version

-Eric

eprosenx Network, Telecom, Wireless

Review of ClearWire WiMax in Portland

March 17th, 2009

I got a friend of mine to bring over his ClearWire WiMax CPE (Customer Premises Equipment) box this evening so that I could play with it.  I put it through some basic tests, but I did not have as long to play with it as I would have liked.  A few key points are below:

  • I was testing the fixed wireless version, not the mobile one.  His CPE device was manufactured by Motorola.
  • I had five bars of signal out of five at my house (dropping to four briefly).  There is at least one ClearWire tower within a few blocks of my house, and I tested it on my dining room table near a large window.  They are deployed in 2.5 GHz spectrum, so the signal does not penetrate buildings all that well.
  • Ping times to a known host increased dramatically during a speed test.  I don’t think it is doing fair queueing (on upload, download, or both).
  • The speed tests I ran came out at exactly three megabits download (which is what his service plan is for).  Upload was more variable, though it was clearly bumping up against the 768k subscribed limit.
  • I did not like the web-based admin GUI on the Motorola box.  It is pretty cheesy and does not have many options (and some of the ones it does have are disabled so the customer can’t mess with them).
  • Traceroutes out to the Internet did not work (they died at the Motorola 192.168.15.1 IP), though my friend said he has seen them work before on a couple of other people’s ClearWire connections.  I am not sure if this is something that has changed recently, or if something weird is up with my laptop (traceroutes on my other Internet connections are fine though).
  • Ping times to my office averaged 105ms, compared to about 62ms across my 802.11g to my FiOS connection and on to my office (most of that is probably the 802.11g).  On my Verizon EVDO card the ping times are around 155ms.
  • I confirmed that it does support a 1500 byte MTU (no PPPoE reduced frame size BS).
  • The CPE device is in NAT mode and you can’t turn it off!  Your machine can’t sit directly out on the Internet (although you can map ports through).  The box is supposed to support UPnP dynamic translations, but my friend’s Xbox can’t seem to make them work, so he had to manually map ports.
  • Anecdotally, my friend reports that VoIP in his Xbox games sucks from time to time.  He is in an apartment complex without the greatest signal, however.

Here is a traceroute from my hosting account down in California to the IP address the CPE device was assigned.  I am unsure how many router hops are in their network beyond where this timed out since traceroute appears to be blocked (or maybe that was timing out at the CPE device’s WAN IP, I can’t be sure).

                             My traceroute  [v0.71]
riddler (0.0.0.0)                                      Sun Mar 15 19:46:03 2009
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                       Packets               Pings
 Host                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. ip-67-205-28-1.dreamhost.com      3.8%    26    0.4  10.5   0.4 159.9  33.1
 2. ge-0-1-0.405.ar1.LAX3.gblx.net    0.0%    26   31.2   3.8   0.6  31.2   7.6
 3. 162.97.117.186                    0.0%    26   26.9  28.9  26.8  69.4   8.5
 4. 64-13-49-225.war.clearwire-dns.n  0.0%    26   59.8  29.8  26.9  59.8   8.4
 5. 64.13.115.162                     0.0%    26   27.4  33.5  27.4  89.2  14.2
 6. ???

Below is a screenshot of one SpeedTest I ran.  I find that Integra Telecom has some fast test servers local here in Portland (though odds are you are bouncing off Seattle or California to get to them).

ClearWire Integra Speedtest

If I get some time later with his ClearWire box I will do some more in-depth testing.  I would also like to test the mobile version of ClearWire.  I am sure I will get my hands on one of the USB dongles at some point soon.  ;-)

My overall feeling is that the fixed base station version of ClearWire WiMax in Portland is faster than Verizon’s EVDO and has better latency characteristics.  I would certainly not prefer it over my Verizon FiOS, however, and I would even venture to say that I might choose a cable modem or DSL over the WiMax (assuming I was close enough to the Central Office to get good DSL from a provider that is not oversubscribed).

-Eric

eprosenx Network, Telecom, Wireless

Debug Mode on Dell Verizon Wireless Data Card

March 17th, 2009

So if you are like me and want to actually know what the signal strength and noise levels are of your connection, rather than just how many “bars” some marketing guy has determined should be displayed, you need to know how to get access to the RF engineering screens on whatever device you use.

In the case of the Dell Wireless 5720 card built into my Latitude D630, the trick is to launch the Dell Mobile Broadband Card Utility and type ##debug while at the main screen.  This will bring up a Debug Info screen with all sorts of interesting information (most of which I don’t have a clue about).

Dell Verizon Wireless 5720 Debug

I could swear this Dell utility is similar to one I have used before by Sierra Wireless.  I am wondering if that is just because the base program is now written by Novatel and Sierra uses their chipsets?

-Eric

eprosenx Network, Telecom, Wireless

DNS Troubleshooting Using DIG

March 16th, 2009

Troubleshooting DNS can sometimes seem like black magic, but once you understand some of the fundamentals, and learn to use a good tool (such as DIG), it becomes much easier.

Let’s say you just made a DNS change for www.bitplumber.net to move the server from one IP to another, but you now find that requests are still going to the old IP.  You pull out your trusty friend “dig” and take a look.

C:\Users\eric.rosenberry>dig www.bitplumber.net

; <<>> DiG 9.6.0-P1 <<>> www.bitplumber.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1318
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.bitplumber.net.            IN      A

;; ANSWER SECTION:
www.bitplumber.net.     10651   IN      A       67.136.229.212

;; Query time: 27 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Sun Mar 15 00:15:25 2009
;; MSG SIZE  rcvd: 52
C:\Users\eric.rosenberry>

You can see here that by simply running “dig www.bitplumber.net” it went out to your default DNS server (in my case the crummy consumer grade NAT/switch/access point my Verizon FiOS came with), which is running on 192.168.1.1.  It did a standard lookup, which asks for an “A” record (and returns any “CNAME” it has to follow along the way).  In my case it determined that there was an A record pointing at 67.136.229.212.

Note the “10651” number between www.bitplumber.net and the IP address.  That is the TTL: the amount of time in seconds that this record remains valid.  The fact that it is not some convenient round number is a good indication to me that this DNS server is not “authoritative” for the domain and is just serving out a cached entry (that it recursively found during some previous query).  If you run the same command a few seconds later the number will have decremented (to 10648, say), as the server is always tracking how long its cached entries remain valid.
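As a rough illustration of reading that TTL programmatically, here is a minimal Python sketch that splits one ANSWER SECTION line into its fields and compares the TTL against the value the zone is configured with.  The `Record` type and both function names are made up for this example, and the 14400 figure is the TTL the authoritative servers hand out for this record in the next query; a TTL lower than that only suggests a cached answer, it does not prove it.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    value: str


def parse_answer_line(line: str) -> Record:
    """Split one line from dig's ANSWER SECTION into its five fields."""
    name, ttl, rclass, rtype, value = line.split(None, 4)
    return Record(name, int(ttl), rclass, rtype, value.strip())


def looks_cached(record: Record, configured_ttl: int) -> bool:
    """A TTL below the zone's configured value suggests a cached answer."""
    return record.ttl < configured_ttl


cached = parse_answer_line(
    "www.bitplumber.net.     10651   IN      A       67.136.229.212"
)
print(cached.ttl, looks_cached(cached, configured_ttl=14400))  # 10651 True
```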

Since 67.136.229.212 is the old IP address for www.bitplumber.net, we need to go take a look at what the “authoritative” DNS servers for the domain are serving out to make sure the correct IP will be provided to users once the old IP times out of caches.  To do this we are going to tell dig which server we want it to ask.  Since I own the domain, I know the authoritative servers are ns1.dreamhost.com, ns2.dreamhost.com, and ns3.dreamhost.com.  I run dig with the @ns1.dreamhost.com parameter to get it to query ns1.dreamhost.com instead of 192.168.1.1.

C:\Users\eric.rosenberry>dig www.bitplumber.net @ns1.dreamhost.com

; <<>> DiG 9.6.0-P1 <<>> www.bitplumber.net @ns1.dreamhost.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 643
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.bitplumber.net.            IN      A

;; ANSWER SECTION:
www.bitplumber.net.     14400   IN      A       75.119.216.176

;; Query time: 48 msec
;; SERVER: 66.33.206.206#53(66.33.206.206)
;; WHEN: Sun Mar 15 00:34:46 2009
;; MSG SIZE  rcvd: 52
C:\Users\eric.rosenberry>

Note that you do have to be careful specifying a DNS server by name rather than IP as you could get unexpected results if the name does not resolve to the server you intended.

You can see that ns1.dreamhost.com is serving out 75.119.216.176 as the IP for www.bitplumber.net and that each time you query it, the answer always comes back with a validity period of 14400 seconds.  To ensure my DNS change propagated to all three servers, I will run the command against each of the three DreamHost servers that are designated authoritative to make sure they are in sync as to what IP they are serving out.
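If you collect the answers from each authoritative server, the comparison itself is easy to automate.  The sketch below is only an illustration: `find_disagreements` is a made-up helper, and the answers shown for ns2 and ns3 are hypothetical sample data (only the ns1 answer comes from the query above).  Gathering the answers, for example by running dig against each server, is left out.

```python
from collections import Counter


def find_disagreements(answers):
    """Return servers whose record set differs from the majority answer.

    `answers` maps a server name to the set of IPs it returned.
    """
    if not answers:
        return {}
    frozen = {server: frozenset(ips) for server, ips in answers.items()}
    majority, _ = Counter(frozen.values()).most_common(1)[0]
    return {server: set(ips) for server, ips in frozen.items() if ips != majority}


# ns1 is from the dig output above; ns2 and ns3 are invented for the example,
# with ns3 pretending to still serve the old address.
answers = {
    "ns1.dreamhost.com": {"75.119.216.176"},
    "ns2.dreamhost.com": {"75.119.216.176"},
    "ns3.dreamhost.com": {"67.136.229.212"},
}
print(find_disagreements(answers))  # {'ns3.dreamhost.com': {'67.136.229.212'}}
```

An empty result means every server returned the same set of addresses.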

To actually troubleshoot domain lookup issues, you should really start at the root of the Internet DNS hierarchy, rather than skipping ahead as I did in the example above.  To check the entire resolving path, first try querying one of the root servers (e.g. a.root-servers.net):

C:\Users\eric.rosenberry>dig www.bitplumber.net @a.root-servers.net

; <<>> DiG 9.6.0-P1 <<>> www.bitplumber.net @a.root-servers.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1831
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 14
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.bitplumber.net.            IN      A

;; AUTHORITY SECTION:
net.                    172800  IN      NS      D.GTLD-SERVERS.net.
net.                    172800  IN      NS      G.GTLD-SERVERS.net.
net.                    172800  IN      NS      C.GTLD-SERVERS.net.
net.                    172800  IN      NS      H.GTLD-SERVERS.net.
net.                    172800  IN      NS      F.GTLD-SERVERS.net.
net.                    172800  IN      NS      E.GTLD-SERVERS.net.
net.                    172800  IN      NS      I.GTLD-SERVERS.net.
net.                    172800  IN      NS      A.GTLD-SERVERS.net.
net.                    172800  IN      NS      L.GTLD-SERVERS.net.
net.                    172800  IN      NS      B.GTLD-SERVERS.net.
net.                    172800  IN      NS      M.GTLD-SERVERS.net.
net.                    172800  IN      NS      J.GTLD-SERVERS.net.
net.                    172800  IN      NS      K.GTLD-SERVERS.net.

;; ADDITIONAL SECTION:
A.GTLD-SERVERS.net.     172800  IN      A       192.5.6.30
A.GTLD-SERVERS.net.     172800  IN      AAAA    2001:503:a83e::2:30
B.GTLD-SERVERS.net.     172800  IN      A       192.33.14.30
B.GTLD-SERVERS.net.     172800  IN      AAAA    2001:503:231d::2:30
C.GTLD-SERVERS.net.     172800  IN      A       192.26.92.30
D.GTLD-SERVERS.net.     172800  IN      A       192.31.80.30
E.GTLD-SERVERS.net.     172800  IN      A       192.12.94.30
F.GTLD-SERVERS.net.     172800  IN      A       192.35.51.30
G.GTLD-SERVERS.net.     172800  IN      A       192.42.93.30
H.GTLD-SERVERS.net.     172800  IN      A       192.54.112.30
I.GTLD-SERVERS.net.     172800  IN      A       192.43.172.30
J.GTLD-SERVERS.net.     172800  IN      A       192.48.79.30
K.GTLD-SERVERS.net.     172800  IN      A       192.52.178.30
L.GTLD-SERVERS.net.     172800  IN      A       192.41.162.30

;; Query time: 58 msec
;; SERVER: 198.41.0.4#53(198.41.0.4)
;; WHEN: Sun Mar 15 00:41:47 2009
;; MSG SIZE  rcvd: 505
C:\Users\eric.rosenberry>

The response from the root servers almost never changes since all they are doing these days is directing you to the DNS servers responsible for your top level domain.  .com and .net are controlled by the “GTLD” servers (run by VeriSign).  So the root servers just tell you to go ask one of the “GTLD” servers.

C:\Users\eric.rosenberry>dig www.bitplumber.net @a.gtld-servers.net

; <<>> DiG 9.6.0-P1 <<>> www.bitplumber.net @a.gtld-servers.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1585
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.bitplumber.net.            IN      A

;; AUTHORITY SECTION:
bitplumber.net.         172800  IN      NS      ns1.dreamhost.com.
bitplumber.net.         172800  IN      NS      ns2.dreamhost.com.
bitplumber.net.         172800  IN      NS      ns3.dreamhost.com.

;; ADDITIONAL SECTION:
ns1.dreamhost.com.      172800  IN      A       66.33.206.206
ns2.dreamhost.com.      172800  IN      A       208.96.10.221
ns3.dreamhost.com.      172800  IN      A       66.33.216.216

;; Query time: 220 msec
;; SERVER: 192.5.6.30#53(192.5.6.30)
;; WHEN: Sun Mar 15 00:51:06 2009
;; MSG SIZE  rcvd: 151
C:\Users\eric.rosenberry>

The GTLD servers tell me that ns1.dreamhost.com, ns2.dreamhost.com, and ns3.dreamhost.com are who I must then go ask to get the actual answer for www.bitplumber.net.  At this point you would continue as described above.
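The chain of referrals just walked through (root, then GTLD, then the authoritative server) can be pictured as a simple loop.  This toy Python sketch does not send any DNS queries; it follows a hard-coded table of responses copied from the dig output above, and the `RESPONSES` table and `resolve` function are names invented for the illustration.

```python
# Canned responses copied from the dig output above; a real resolver would
# send a query over the network at each step instead of reading a table.
RESPONSES = {
    "a.root-servers.net": ("referral", "a.gtld-servers.net"),
    "a.gtld-servers.net": ("referral", "ns1.dreamhost.com"),
    "ns1.dreamhost.com": ("answer", "75.119.216.176"),
}


def resolve(start_server, max_hops=10):
    """Follow referrals from start_server until a server returns an answer."""
    path = [start_server]
    server = start_server
    for _ in range(max_hops):
        kind, value = RESPONSES[server]
        if kind == "answer":
            return value, path
        # A referral names the next server to ask.
        server = value
        path.append(server)
    raise RuntimeError("too many referrals")


ip, path = resolve("a.root-servers.net")
print(" -> ".join(path), "=>", ip)
# a.root-servers.net -> a.gtld-servers.net -> ns1.dreamhost.com => 75.119.216.176
```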

Another important option in DIG is telling it what type of record you are looking for.  By default it looks for A records (following any CNAMEs it finds), but another extremely common lookup is for MX records.  That can be accomplished as follows:

C:\Users\eric.rosenberry>dig mx rosenberry.org

; <<>> DiG 9.6.0-P1 <<>> mx rosenberry.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 492
;; flags: qr rd ra; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 7

;; QUESTION SECTION:
;rosenberry.org.                        IN      MX

;; ANSWER SECTION:
rosenberry.org.         1800    IN      MX      30 aspmx3.googlemail.com.
rosenberry.org.         1800    IN      MX      30 aspmx5.googlemail.com.
rosenberry.org.         1800    IN      MX      30 aspmx4.googlemail.com.
rosenberry.org.         1800    IN      MX      30 aspmx2.googlemail.com.
rosenberry.org.         1800    IN      MX      20 alt2.aspmx.l.google.com.
rosenberry.org.         1800    IN      MX      10 aspmx.l.google.com.
rosenberry.org.         1800    IN      MX      20 alt1.aspmx.l.google.com.

;; ADDITIONAL SECTION:
aspmx5.googlemail.com.  247     IN      A       74.125.45.27
aspmx4.googlemail.com.  3462    IN      A       66.249.93.27
aspmx2.googlemail.com.  3540    IN      A       209.85.135.27
alt2.aspmx.l.google.com. 180    IN      A       74.125.79.114
aspmx.l.google.com.     91      IN      A       209.85.199.27
alt1.aspmx.l.google.com. 127    IN      A       216.239.59.27
aspmx3.googlemail.com.  3530    IN      A       209.85.199.27

;; Query time: 87 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Sun Mar 15 00:54:50 2009
;; MSG SIZE  rcvd: 323
C:\Users\eric.rosenberry> 

You can see that I use Google to host my rosenberry.org email.

There are many more things that DIG can do, but the ones described above are the most common options I use.  Hopefully this will come in handy the next time you run into DNS troubles!

-Eric

eprosenx Network

My favorite IP address 4.2.2.1

March 15th, 2009

The most common question I need to answer when I am at the console of a machine is: “Does this machine have basic network connectivity?”  While there are a lot of ways to check this, I find the most convenient cross-platform way is the ping command.  The question then becomes, what should I ping?

The answer depends on what you are trying to accomplish.  In most cases, I am troubleshooting this host’s connectivity to the network (not the core network itself), so any IP on the internal LAN or out on the Internet will do (most networks I work on do have Internet connectivity such that I can ping Internet hosts).

Many would say you should start with the most basic test and try to ping your gateway address (which I agree with), but this requires some thought on my part to figure out what subnet I am on and then type out some IP like 192.168.1.1 or 172.16.2.1.  I find these IPs difficult to type, and so I generally go to my favorite standby, 4.2.2.1.

The IP 4.2.2.1 is wonderful since it happens to be a DNS server for Level 3 Communications (or maybe Verizon; I am not quite sure which, since the IP space is registered to L3 but the reverse DNS points at gtei.net, which rolled into Verizon, and back in the day I am pretty sure it was a GTE DNS server).  Since it is a DNS server for one of the largest networks in the world it is *always* available, and I think it is actually implemented as an “anycast” IP, which means there are many servers around the world serving out responses to that IP and you will be routed to whichever is closest network-wise.

Not only is it a great host to ping to check network connectivity, but it is also a recursive DNS server that will respond to queries from any host on the Internet!  This is useful when you are on some network somewhere and you don’t know what DNS servers to use (or don’t want to use the local ones for some reason), and just need something as a temporary solution.

Now please don’t go ping flooding these DNS servers or using them for all your network’s recursive DNS queries.  I very much appreciate that Level 3 lets these servers respond to recursive queries (which I don’t think they did at one time in the past) and I don’t want to give them a reason to turn it off!

It’s also worth noting that 4.2.2.2 and 4.2.2.3 respond similarly.  It might actually be faster to type 4.2.2.2 than 4.2.2.1, but I am so used to the latter that it is what I use.
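Part of what makes ping “cross-platform” in practice is remembering that the count flag differs between systems.  Here is a small Python sketch of a wrapper that builds the right command line; `ping_command` and `is_reachable` are made-up helper names.  It assumes the stock ping binaries, where Windows uses `-n` for the count and Linux/macOS use `-c`, and it treats exit status 0 as success, which is only an approximation (Windows ping, for example, can exit 0 after “Destination host unreachable” replies).

```python
import platform
import subprocess


def ping_command(host, count=4, system=None):
    """Build a ping command line for the given OS (defaults to the local one)."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def is_reachable(host, count=4):
    """Return True if the local ping command exits with status 0."""
    result = subprocess.run(ping_command(host, count), capture_output=True)
    return result.returncode == 0


print(ping_command("4.2.2.1", system="Windows"))  # ['ping', '-n', '4', '4.2.2.1']
print(ping_command("4.2.2.1", system="Linux"))    # ['ping', '-c', '4', '4.2.2.1']
```

Calling `is_reachable("4.2.2.1")` would run the actual ping; please keep the count low, for the reasons given above.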

-Eric

eprosenx Network, Telecom

Getting DIG to work on Windows

March 14th, 2009

As a network engineer, there are a number of tools that are absolutely critical to my job and that I use on a daily basis.  One of those tools, named “dig”, is included as part of the BIND package from the Internet Systems Consortium (ISC).  For those not familiar with “dig”, it is a command line query tool used to troubleshoot DNS issues.

Due to my need to support a number of specialty applications, I run a Windows PC as my main work laptop.  This is unfortunate as Microsoft’s command line diagnostic tools (such as nslookup) are quite weak.  While I could SSH into a Linux/Solaris/whatever host each time I need to troubleshoot something, I often find myself on foreign networks with only my trusty laptop available.  A number of years ago I figured out how to get dig running on my Windows machines and have never looked back.

To run dig on a Windows box you could install the full Windows version of BIND (which includes dig), but that is overkill if all you want is dig.  Instead, I recommend downloading the latest stable precompiled Windows binaries of BIND, extracting the .zip, and then copying the files listed below into your c:\windows directory (really you could put them anywhere in your PATH, but c:\windows is easy and I doubt the filenames will ever conflict).

  • dig.exe
  • libisc.dll
  • libdns.dll
  • libeay32.dll
  • libbind9.dll
  • libisccfg.dll
  • liblwres.dll

Voila!  You can now run dig from the command prompt!

C:\Users\eric.rosenberry>dig bitplumber.net

; <<>> DiG 9.6.0-P1 <<>> bitplumber.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1886
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;bitplumber.net.                        IN      A

;; ANSWER SECTION:
bitplumber.net.         14400   IN      A       75.119.216.176

;; Query time: 210 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Sun Mar 15 00:05:54 2009
;; MSG SIZE  rcvd: 48
C:\Users\eric.rosenberry>
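If dig refuses to start, the usual culprit is a DLL that did not get copied.  As a quick sanity check, here is a minimal Python sketch that reports which of the seven files listed above are missing from a directory.  The `missing_files` helper is invented for this example, and `C:\Windows` is simply the directory suggested in this post; substitute wherever you put the files.

```python
from pathlib import Path

# The seven files listed above.
REQUIRED_FILES = [
    "dig.exe", "libisc.dll", "libdns.dll", "libeay32.dll",
    "libbind9.dll", "libisccfg.dll", "liblwres.dll",
]


def missing_files(directory, required=REQUIRED_FILES):
    """Return the required file names that are not present in directory."""
    folder = Path(directory)
    return [name for name in required if not (folder / name).is_file()]


# An empty list means everything dig needs is in place.
print(missing_files(r"C:\Windows"))
```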

-Eric

eprosenx Microsoft, Network

How to view MTU size in Windows Vista

March 13th, 2009

It has always frustrated me that ipconfig /all in Windows does not show me the currently active MTU size on a given adapter (as the equivalent command does in most OSes).  Up until now I was not aware of any way to get this information (beyond maybe looking up the setting in the registry).

I just discovered the following command in Windows Vista (you have to be running command prompt as admin):

C:\Windows\system32>netsh interface ipv4 show subinterface

   MTU  MediaSenseState   Bytes In  Bytes Out  Interface
------  ---------------  ---------  ---------  -------------
4294967295                1          0      41533  Loopback Pseudo-Interface 1
  1300                1    7853050    1403618  Wireless Network Connection
  1300                5          0      23616  Local Area Connection
  1500                5          0          0  Bluetooth Network Connection
C:\Windows\system32>

Note that my MTU sizes on my LAN and Wireless cards are 1300, because the Cisco VPN client is installed, and I believe it sets them low to avoid PMTU issues.

This command does not work in Windows XP or Windows Server 2003 as it complains that RRAS is not running (I suspect it would work if RRAS *was* running).
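If you want to pull the MTU values out of that output in a script, the table is simple enough to parse by hand.  The Python sketch below is just an illustration: `parse_subinterfaces` is a made-up name, and it assumes the column layout shown above (four numeric columns followed by the interface name), which may differ on other Windows versions or locales.

```python
SAMPLE = """\
   MTU  MediaSenseState   Bytes In  Bytes Out  Interface
------  ---------------  ---------  ---------  -------------
4294967295                1          0      41533  Loopback Pseudo-Interface 1
  1300                1    7853050    1403618  Wireless Network Connection
  1300                5          0      23616  Local Area Connection
  1500                5          0          0  Bluetooth Network Connection
"""


def parse_subinterfaces(output):
    """Map interface name -> MTU from `netsh interface ipv4 show subinterface`."""
    mtus = {}
    for line in output.splitlines():
        fields = line.split(None, 4)
        # Data rows have four numeric columns followed by the interface name;
        # the header and separator rows fail this check and are skipped.
        if len(fields) == 5 and all(f.isdigit() for f in fields[:4]):
            mtus[fields[4].strip()] = int(fields[0])
    return mtus


mtus = parse_subinterfaces(SAMPLE)
print(mtus["Wireless Network Connection"])   # 1300
print(mtus["Bluetooth Network Connection"])  # 1500
```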

-Eric

eprosenx Microsoft, Network

Troubleshooting connectivity issues with multiple BGP connected ISP’s

March 11th, 2009
I was rolled out of bed early this morning by a page from my traffic graphing software, PRTG, alerting me that one of the F5 OIDs I had it monitoring was reporting higher than normal average connection times.  I have it monitoring this F5 pool because it serves out a very latency-sensitive SOAP-based application that must be available 24×7 on the Internet.
 
As it turned out, the average connection times were being driven abnormally high because one of my ISP’s upstream ISPs had a router failure in California and was blackholing some traffic.  This must have been causing either half-open TCP connections or delayed/dropped connections that were taking a long time to time out, hence driving up the average connect time.
 
This gets me to the question I am attempting to address today: I am having connectivity issues to some sites on the Internet; how do I determine where the issue lies?  Specifically, I am going to describe some steps to troubleshoot this in a hypothetical redundant router and ISP configuration below, however, most of these steps are applicable to single router and even single ISP architectures.

 

Figure 1

In Figure 1, routers INET-A and INET-B are responsible for running BGP and are each peered with two ISPs.  They receive full Internet routes from each ISP and share that info through iBGP on the 5.5.5.0 subnet between them.  They advertise 6.6.6.0/24 out to the Internet through all four ISPs.

In my scenario described above, I knew I had a problem, but I did not even know what other places on the Internet I had lost connectivity to, or was having difficulty communicating with.  After looking at my handy dandy network graphs, which include a graph of ping time to my favorite DNS server 4.2.2.1 (no kidding, that’s a real DNS server), I discovered that 4.2.2.1 was unreachable.  I also found that pings to one of my customers (which I ICMP ping every 30 seconds and log to a graph) were failing as well (this becomes important later).

In my specific situation, I took a shortcut to issue resolution by looking at the bandwidth utilization graphs for each of my ISPs and noticing that one was showing a dropoff in inbound traffic.  I guessed (correctly) that the issue was with that ISP, and I “AS-prepended” my outgoing advertisements, as well as incoming routes, to effectively disable use of that ISP.  This resolved the issue, so I contacted the ISP and went back to bed.
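For anyone unfamiliar with why AS-prepending steers traffic away from an ISP, here is a deliberately simplified Python sketch.  It models only one step of BGP best-path selection (prefer the shortest AS path); real routers compare other attributes such as local preference first, and remote networks are free to override path length with their own policy.  The function names are invented, and the AS numbers are hypothetical values from the range reserved for documentation, not anything from my network.

```python
OUR_AS = 64496  # hypothetical AS number for our network


def best_route(routes):
    """Pick the route with the shortest AS path (ties go to the first listed)."""
    return min(routes, key=lambda route: len(route["as_path"]))


def with_prepend(route, own_as, times):
    """Return a copy of the route with own_as repeated `times` extra times
    at the origin end of the path, as a remote network would see it."""
    return {**route, "as_path": route["as_path"] + [own_as] * times}


# How a remote network might see our 6.6.6.0/24 through two of our ISPs.
via_isp_a = {"via": "ISP-A", "as_path": [64497, OUR_AS]}
via_isp_b = {"via": "ISP-B", "as_path": [64498, OUR_AS]}

print(best_route([via_isp_a, via_isp_b])["via"])  # ISP-A

# Prepend our AS three extra times on the advertisement sent to ISP-A.
padded_a = with_prepend(via_isp_a, OUR_AS, 3)
print(best_route([padded_a, via_isp_b])["via"])   # ISP-B
```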

If I were not in a situation where I could shut down an ISP without negative impact, or I needed to run down the trouble specifics, my next step would be to determine the network path to and from one of the unreachable (or partially unreachable) IP addresses.  It is very important to realize that when looking at routes on the Internet it is normal for traffic flows in one direction to follow one path, and traffic flows in the other direction to follow a completely separate path.  A disruption in either the forward or return path can cause similar issues!

While you could run a traceroute from some host inside the firewall (assuming your PIX/ASA would allow the return packets), I recommend running it from your Internet router (INET-A in this case), since it often has more interesting information (like AS numbers).  As you can see below, I have specified the source interface for the IP packets to originate from, rather than letting the router choose which source IP to use.  This is very important: left to its own devices, INET-A will use 1.1.1.2, 2.2.2.2, or 5.5.5.1 depending on which outbound route it chooses.  If the router chooses 1.1.1.2 as the source IP, that will force the packets to return through ISP-A, rather than possibly returning through a different ISP (which may be where the problem you are attempting to identify lies).  Specifying the source interface forces the source to 6.6.6.2, which is in the same /24 as our firewalls (which is what we care about), and as such will have the same return-path routing behavior.

INET-A#traceroute ip 4.2.2.1 source gigabitEthernet 0/0/0

Type escape sequence to abort.
Tracing the route to vnsc-pri.sys.gtei.net (4.2.2.1)

  1 gi0-0-5.inet-b.example.com (x.x.x.x) 0 msec 1 msec 0 msec
  2 cust01.pdx03.atlas.cogentco.com (38.104.103.43) [AS 174] 0 msec 1 msec 1 msec
  3 gi3-1.102.core01.pdx01.atlas.cogentco.com (38.112.37.217) [AS 174] 0 msec 1 msec 1 msec
  4 po10-0.core01.sfo01.atlas.cogentco.com (154.54.3.133) [AS 174] 14 msec 15 msec 15 msec
  5 te3-1.mpd01.sfo01.atlas.cogentco.com (154.54.3.102) [AS 174] 15 msec 15 msec 15 msec
  6 te4-4.mpd01.sjc01.atlas.cogentco.com (154.54.2.54) [AS 174] 16 msec 16 msec 16 msec
  7 te4-4.mpd01.sjc03.atlas.cogentco.com (154.54.6.238) [AS 174] 38 msec 17 msec 17 msec
  8 te-3-3.car3.SanJose1.Level3.net (4.68.110.137) [AS 3356] 61 msec 188 msec 221 msec
  9 vlan79.csw2.SanJose1.Level3.net (4.68.18.126) [AS 3356] 18 msec 26 msec 19 msec
 10 ge-11-0.core1.SanJose1.Level3.net (4.68.123.38) [AS 3356] 18 msec 18 msec 19 msec
 11 vnsc-pri.sys.gtei.net (4.2.2.1) [AS 3356] 18 msec 19 msec 19 msec
INET-A#

As you can see, when this traceroute is run from INET-A, the Cisco implementation helpfully adds the AS number of each hop’s IP based on the data it has in its BGP table.  The above traceroute example does not show any issues at the time I ran it, but it does show the “forward” path to 4.2.2.1.

To really troubleshoot this, we also need to know what the return path from 4.2.2.1 to us is.  This is where in my example, I had a customer IP that I had lost connectivity to, and I could have called them up and gotten them to send me a traceroute from their network to mine to show the reverse path.

Even with both a forward and reverse path traceroute in hand, the issue might not be immediately obvious.  Traceroutes generally send only three packets to each hop along the way, and they are small packets (usually 64 bytes or so).  If you are fighting intermittent packet loss, it might not be obvious at which hop the packets are lost.  Also, say you have an errored circuit (like a T-1) somewhere in the path.  Very frequently 64 byte packets will make it through 99% of the time but when you send 1500 byte packets they drop 50% of the time.

In the event of intermittent packet loss you might want to quantify just how much is being lost.  To do this run the following command:

INET-A#ping ip 4.2.2.1 size 1500 df-bit repeat 100

Type escape sequence to abort.
Sending 100, 1500-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (100/100), round-trip min/avg/max = 19/20/24 ms
INET-A#

In this command, I make the assumption that the entire forward and reverse path supports 1500 byte packets.  This is the default size for Ethernet networks, and all carriers I am aware of support 1500 byte frames.  This might not work if you are on some form of DSL line or something running PPPoE that reduces your MTU slightly.  I set the do-not-fragment bit with the df-bit option to ensure the pings don’t get fragmented along the path, which could give misleading results.  If 1500 does not work for you, try something slightly smaller.
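One thing to watch if you repeat this test from a regular host instead of the router: as I understand it, the IOS `size` value is the size of the whole IP packet, while the size option of the ping that ships with Linux or Windows counts only the ICMP payload.  The arithmetic is sketched below in Python; the function name is made up, and the constants assume a plain IPv4 header with no options and the usual 8 bytes of PPPoE overhead.

```python
IPV4_HEADER = 20     # bytes, assuming no IP options
ICMP_HEADER = 8      # bytes for the ICMP echo header
PPPOE_OVERHEAD = 8   # bytes: 6-byte PPPoE header plus 2-byte PPP protocol ID


def icmp_payload_for(ip_packet_size):
    """ICMP payload size that produces an IP packet of the given total size."""
    return ip_packet_size - IPV4_HEADER - ICMP_HEADER


print(icmp_payload_for(1500))                   # 1472, for a 1500 byte MTU
print(icmp_payload_for(1500 - PPPOE_OVERHEAD))  # 1464, for a 1492 byte PPPoE MTU
```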

You can run this command, with thousands of pings if necessary, to get some statistics on the amount of packet loss you are seeing.  Also, if you do get packet loss, try pinging the IP of the router one hop away from your destination (as revealed by the first traceroute you ran), and keep repeating this test with each IP closer and closer to your network, until you find where packets are no longer being lost.  At this point you have some idea between which routers the issue lies (note that the issue could also be on the return path from the router hop that is timing out).
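When you are running thousands of pings, counting the marks by eye gets old quickly.  Here is a minimal Python sketch that computes the loss percentage from the row of marks IOS prints, where “!” is a reply and “.” is a timeout.  `loss_percent` is a made-up helper, and it deliberately ignores the other status characters IOS can print (such as “U” for unreachable), so treat it as a starting point only.

```python
def loss_percent(ping_marks):
    """Compute packet loss from IOS ping marks ('!' = reply, '.' = timeout)."""
    replies = ping_marks.count("!")
    timeouts = ping_marks.count(".")
    total = replies + timeouts
    if total == 0:
        raise ValueError("no ping results found")
    return 100.0 * timeouts / total


print(loss_percent("!" * 100))    # 0.0
print(loss_percent("!!!!.!!!!."))  # 20.0
```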

While the process I just described works, there actually is a much easier way.  If you have a Linux box handy on your network, there is a package called “mtr” or “My Traceroute”.  If you run “mtr 4.2.2.1” in this example, it will run a traceroute to that destination, and then immediately start pinging each hop along the way repeatedly until told to stop.  In the example below I only let it run for six or seven iterations, but you get the idea.

                             My traceroute  [v0.71]
riddler (0.0.0.0)                                      Wed Mar 11 01:05:26 2009
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                       Packets               Pings
 Host                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. ip-67-205-28-1.dreamhost.com      0.0%     7    0.6  35.7   0.4 127.8  60.3
 2. border21.ge4-3.newdream-8.lax.pn  0.0%     7    0.6   0.5   0.4   0.6   0.0
 3. core1.po1-20g-bbnet1.lax.pnap.ne  0.0%     7    0.6   0.5   0.4   0.6   0.1
 4. te-2-2.car1.LosAngeles1.Level3.n  0.0%     6   60.5  35.9   0.5 147.7  59.6
 5. vnsc-pri.sys.gtei.net             0.0%     6    0.9   0.9   0.4   1.6   0.5

At this point I feel I must point out that this is not an exact science.  The amount of time a router takes to respond to traceroute/ICMP requests (and whether it even responds at all) is not necessarily any indicator of its health or ability to deliver packets.  You will sometimes find traceroutes where some hops in the middle have horrible response times, or time out altogether, and yet the end-to-end traffic is fine.  This is likely because the major router manufacturers’ implementations treat responding to ping packets as one of the lowest priorities for CPU resources.  If a router gets busy doing its primary job, which is to forward packets, it is not going to mess around responding to pings.  Also, if a router is in the middle of scavenging its BGP table for entries with unreachable next-hops (which it does frequently), it may respond slowly to pings, even though it is still forwarding packets at wire rate.

In an optimal world, you would be able to run mtr (or a similar tool) simultaneously from both ends of the connection and compare the results.  This would give you the best data possible to diagnose your issue.  Hopefully the issue will lie with one of your ISPs, or one of your customers’ ISPs, and you can contact them directly to resolve it.  If the issue lies elsewhere on some intermediary backbone provider, it might be very difficult to open a support case with them (you will most likely need to escalate through your ISP).
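One rule of thumb that follows from this: loss at a middle hop only matters if it carries through to the destination.  The Python sketch below encodes that heuristic for a list of per-hop loss percentages such as the Loss% column in mtr.  It is my own simplification with a made-up function name, not something mtr does for you, and it cannot tell forward-path loss from return-path loss.

```python
def first_suspect_hop(loss_by_hop):
    """Return the 1-based number of the first hop from which loss persists
    all the way to the final hop, or None if the destination shows no loss."""
    if not loss_by_hop or loss_by_hop[-1] == 0:
        return None
    suspect = len(loss_by_hop)
    # Walk backwards from the destination while every hop still shows loss.
    for index in range(len(loss_by_hop) - 1, -1, -1):
        if loss_by_hop[index] > 0:
            suspect = index + 1
        else:
            break
    return suspect


# Loss only at the first hop, clean after that: probably just a router
# that deprioritizes ICMP replies.
print(first_suspect_hop([3.8, 0, 0, 0, 0]))            # None
# Loss starts at hop 3 and continues to the end: look at hop 3.
print(first_suspect_hop([0, 0, 12.0, 10.0, 11.0]))     # 3
```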

Hopefully this gives you a few tools in your arsenal the next time you come across a connectivity issue such as I have described!

-Eric

eprosenx Cisco, Network

Gigabit Point-to-point Wireless

March 7th, 2009

For a couple of years now I have been aware of a startup company called BridgeWave that has come out with some very cool radios.  These devices provide 100 megabit or gigabit point-to-point connectivity using the (unlicensed) 60 GHz band or (licensed) 80 GHz band.  With ranges from half a mile to 6 miles (depending on the unit), these radios are a great building-to-building bridge solution for LANs, or last mile access solution for Internet.

If you have line-of-sight between two places you need connectivity, and can get roof rights on both, these can provide a very cost-effective solution.  Their newest product, the SLE100, provides full-duplex 100 megabit connectivity with an MSRP of only $9,995.  This unit is also powered by PoE, so the only cable you need to run to each unit is a CAT5 cable.

For those that have security requirements, they now have models (or add-on features) to enable link layer AES encryption.

There are of course pros and cons to using wireless, but if engineered and installed properly these links can beat the availability of landline-based service (i.e. it is hard to put a backhoe through your wireless beam).

P.S.  I also have good things to say about a company called Freewire Broadband, who is a local reseller and installer of BridgeWave radios.  I have not personally used them, but their staff is very knowledgeable and friendly.

-Eric

eprosenx Wireless

Portland OR Telecom and Colo Providers

March 7th, 2009

I have created a permanent page on the site that I will keep updated as a reference to all of the various colocation and telecommunications options in Portland, Oregon.  This is all information that I have acquired over the years that I suspect may come in handy for others.  If you have any questions/comments/corrections, feel free to post them on the site or shoot me an email!

-Eric

eprosenx Colocation, Telecom