Archive

Archive for the ‘Cisco’ Category

Cisco ASA 8.0.5 TFTP Unspecified Error using PumpKIN

December 10th, 2009 2 comments

I have run into a problem on two separate ASA’s now downloading code to them using the PumpKIN TFTP server.  It get’s part way through the download and dies (at different places each time so it’s an intermittent error).

I was running a 7.0.8 release on these devices, and then upgraded to 8.0.5 (copying the file off a PumpKIN TFTP server with no issue), but then I was reloading 8.0.5 onto the devices (while booting off 8.0.5 already) and it could not access the exact same file properly.

ciscoasa# copy tftp flash
ciscoasa# copy tftp flash Address or name of remote host []? 10.100.99.13 Source filename []? asa805-k8.bin Destination filename [asa805-k8.bin]? Accessing tftp://10.100.99.13/asa805-k8.bin...!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! WARNING: TFTP download incomplete! ! %Error reading tftp://10.100.99.13/asa805-k8.bin (Unspecified Error) ciscoasa#

I ended up using the SolarWinds TFTP server instead and it worked like a champ.  I am not sure what the issue is here, but it looks like some kind of bug in PumpKIN or in the ASA code (or some combination thereof).

-Eric

Categories: Cisco, Network Tags:

Cisco AnyConnect Split-DNS resolution not working in Snow Leopard 10.6

November 20th, 2009 2 comments

I just upgraded my Cisco AnyConnect client on my ASA 5510 to 2.4.0202 hoping that the VPN would work for my users with Mac OS 10.6 Snow Leopard, but it would appear they are having DNS resolution issues.  I use the Split-DNS functionality of the ASA/Anyconnect client to only send DNS queries to the across-the-vpn DNS servers for a couple of domain names.

My brief testing has shown that all DNS queries are being sent to the remote hosts local DNS servers rather than sending them to the corporate DNS servers for the Split-DNS domains.

I found Cisco bug ID CSCtc54466 that describes this issue.  It describes this issue as being with Mac OS X 10.6 and they claim the issue is with Apples mDNS code.  They say it is “likely to be fixed in Mac OS X 10.6.3”.

In the meantime they claim you can “Restart the mDNSResponder service”.  I am assuming you would need to restart this service each time you VPN in?  I have not yet looked into how to restart that service yet either.  I will edit this post once I figure it out.

-Eric

Categories: Apple, Cisco, Network Tags:

Advanced PING usage on Cisco, Juniper, Windows, Linux, and Solaris

September 15th, 2009 1 comment

As a network engineer, one of the most common utilities I use is the ping command.  While in its simplest form it is a very valuable tool, there is much more knowledge that can be gleaned from it by specifying the right parameters.

Ping on Cisco routers

On modern Cisco IOS versions the ping tool has quite a few options, however, this was not always the case.

Compare the options to ping in IOS 12.1(14)

EXRTR1#ping ip 4.2.2.1 ?
  <cr>

EXRTR1#ping ip 4.2.2.1

To that in IOS 12.4(24)T

plunger#ping ip 4.2.2.1 ?
  data      specify data pattern
  df-bit    enable do not fragment bit in IP header
  repeat    specify repeat count
  size      specify datagram size
  source    specify source address or name
  timeout   specify timeout interval
  validate  validate reply data
  <cr>

plunger#ping ip 4.2.2.1
plunger#ping ip 4.2.2.1 ?
data      specify data pattern
df-bit    enable do not fragment bit in IP header
repeat    specify repeat count
size      specify datagram size
source    specify source address or name
timeout   specify timeout interval
validate  validate reply data
<cr>
plunger#ping ip 4.2.2.1

When I am running a basic connectivity test between two points on a network I will generally not specify any options to ping (i.e. “ping 4.2.2.1”), however, once I have verified connectivity I will most often then want to verify what MTU size the path will support without fragmentation, and then also run an extended ping process with a thousand or more pings to test the reliability/bandwidth/latency characteristics of the link.

Here is an example of the most basic form.  Note that by default it is sending 100 byte frames:

plunger#ping 4.2.2.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/26/28 ms
plunger#

If I am working on an Ethernet network (or PPP link), it is most common that my target goal is for 1500 byte frames to make it through.  I will use the “size” parameter to force ping to generate 1500 byte frames (note that in Cisco land this means 1472 byte ICMP payloads plus 8 bytes ICMP header and 20 bytes IP header).  I also use the df-bit flag to set the DO NOT FRAGMENT bit on the generated packets.  This will allow me to ensure that the originating router (or some other router in the path), is not fragmenting the packets for me.

plunger#ping 4.2.2.1 size 1500 df-bit 

Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/26/28 ms
plunger#

If the first ping command worked, but the command above did not, then try backing down the size until you find a value that works.  Note that one common example of a smaller MTU is 1492 which is caused by the 8 bytes of overhead in PPPoE connections.

The next command to try is to send a large number of pings of the maximum MTU your link can support.  This will help you identify packet loss issues and is just a good way to generate traffic on the link to see how much bandwidth you can push (if your monitoring the link with another tool).  I have frequently identified bad WAN circuits using this method.  Note that looking at the Layer 1/2 error statistics before doing this (and perhaps clearing them), and then looking at them again afterwards (on each link in the path!) is often a good idea.

plunger#ping 192.168.0.10 size 1500 df-bit repeat 1000

Type escape sequence to abort.
Sending 1000, 1500-byte ICMP Echos to 192.168.0.10, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (1000/1000), round-trip min/avg/max = 1/2/4 ms
plunger#

Now the final ping parameter I often end up using is the “source” option.  A good example is when you have a router with a WAN connection on one side that has a /30 routing subnet on it, plus then an Ethernet connection with a larger subnet for your users devices.  Say that users on the Ethernet are reporting that they can not ping certain locations on the WAN, though you can ping it just fine from the router.  This is often because the return path from the device you are pinging back to your users subnet on the Ethernet is not being routed properly, but the IP your router has on the /30 WAN subnet is being routed correctly.  The key here is that by default a Cisco router will originate packets from the IP of the Interface it is going to be sending the traffic out (based on it’s routing tables).

To test from the interface your router has in the users subnet, use the source command like this:

plunger#ping 4.2.2.1 source fastEthernet 0/0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
Packet sent with a source address of 173.50.158.74
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/25/28 ms
plunger#

Note that you can also use these command together depending on what you are trying to do:

plunger#ping 192.168.0.10 size 1500 df-bit source FastEthernet 0/0 repeat 100

Type escape sequence to abort.
Sending 100, 1500-byte ICMP Echos to 192.168.0.10, timeout is 2 seconds:
Packet sent with a source address of 173.50.158.74
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (100/100), round-trip min/avg/max = 1/2/4 ms
plunger#

Ping on Juniper routers

The options to ping on Juniper (version 9.3R4.4 in this case) are quite extensive:

root@INCSW1> ping ?
Possible completions:
  <host>               Hostname or IP address of remote host
  bypass-routing       Bypass routing table, use specified interface
  count                Number of ping requests to send (1..2000000000 packets)
  detail               Display incoming interface of received packet
  do-not-fragment      Don't fragment echo request packets (IPv4)
  inet                 Force ping to IPv4 destination
  inet6                Force ping to IPv6 destination
  interface            Source interface (multicast, all-ones, unrouted packets)
  interval             Delay between ping requests (seconds)
  logical-system       Name of logical system
+ loose-source         Intermediate loose source route entry (IPv4)
  no-resolve           Don't attempt to print addresses symbolically
  pattern              Hexadecimal fill pattern
  rapid                Send requests rapidly (default count of 5)
  record-route         Record and report packet's path (IPv4)
  routing-instance     Routing instance for ping attempt
  size                 Size of request packets (0..65468 bytes)
  source               Source address of echo request
  strict               Use strict source route option (IPv4)
+ strict-source        Intermediate strict source route entry (IPv4)
  tos                  IP type-of-service value (0..255)
  ttl                  IP time-to-live value (IPv6 hop-limit value) (hops)
  verbose              Display detailed output
  vpls                 Ping VPLS MAC address
  wait                 Delay after sending last packet (seconds)
root@INCSW1> ping

While there are a lot more options here, I am generally trying to test the same types of things.  A very important note however is that in the Juniper world, the size parameter is the payload size and does not include the 8 byte ICMP header and 20 byte IP header.   The command below is the same as specifying 1500 bytes in Cisco land.

root@INCSW1> ping 4.2.2.1 size 1472 do-not-fragment
PING 4.2.2.1 (4.2.2.1): 1472 data bytes
1480 bytes from 4.2.2.1: icmp_seq=0 ttl=53 time=25.025 ms
1480 bytes from 4.2.2.1: icmp_seq=1 ttl=53 time=24.773 ms
1480 bytes from 4.2.2.1: icmp_seq=2 ttl=53 time=24.757 ms
1480 bytes from 4.2.2.1: icmp_seq=3 ttl=53 time=25.045 ms
1480 bytes from 4.2.2.1: icmp_seq=4 ttl=53 time=24.911 ms
1480 bytes from 4.2.2.1: icmp_seq=5 ttl=53 time=25.152 ms
^C
--- 4.2.2.1 ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max/stddev = 24.757/24.944/25.152/0.145 ms

root@INCSW1>

Here is an example of sending lots of pings quickly on the Juniper to test link reliability:

root@INCSW1> ping 10.0.0.1 size 1472 do-not-fragment rapid count 100
PING 10.0.0.1 (10.0.0.1): 1472 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--- 10.0.0.1 ping statistics ---
100 packets transmitted, 100 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.230/2.429/5.479/1.228 ms

root@INCSW1>

Ping on Windows XP/Vista/2003/2008

While not as full featured, the Windows ping tool can at least set the packet size and the do-not-fragment bit:

Microsoft Windows [Version 6.0.6001]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

C:\Users\eric.rosenberry>ping ?
^C
C:\Users\eric.rosenberry>ping

Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS]
            [-r count] [-s count] [[-j host-list] | [-k host-list]]
            [-w timeout] [-R] [-S srcaddr] [-4] [-6] target_name

Options:
    -t             Ping the specified host until stopped.
                   To see statistics and continue - type Control-Break;
                   To stop - type Control-C.
    -a             Resolve addresses to hostnames.
    -n count       Number of echo requests to send.
    -l size        Send buffer size.
    -f             Set Don't Fragment flag in packet (IPv4-only).
    -i TTL         Time To Live.
    -v TOS         Type Of Service (IPv4-only).
    -r count       Record route for count hops (IPv4-only).
    -s count       Timestamp for count hops (IPv4-only).
    -j host-list   Loose source route along host-list (IPv4-only).
    -k host-list   Strict source route along host-list (IPv4-only).
    -w timeout     Timeout in milliseconds to wait for each reply.
    -R             Use routing header to test reverse route also (IPv6-only).
    -S srcaddr     Source address to use.
    -4             Force using IPv4.
    -6             Force using IPv6.

C:\Users\eric.rosenberry>

So your basic ping in Windows claims to send 32 bytes of data (I have not verified this), but I suspect that is 32 bytes of payload, plus 8 bytes ICMP, and 20 bytes of IP for a total of 60 bytes.

C:\Users\eric.rosenberry>ping 4.2.2.1

Pinging 4.2.2.1 with 32 bytes of data:
Reply from 4.2.2.1: bytes=32 time=27ms TTL=53
Reply from 4.2.2.1: bytes=32 time=25ms TTL=53
Reply from 4.2.2.1: bytes=32 time=28ms TTL=53
Reply from 4.2.2.1: bytes=32 time=27ms TTL=53

Ping statistics for 4.2.2.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 25ms, Maximum = 28ms, Average = 26ms

C:\Users\eric.rosenberry>

So a common set of flags I use will be to create full 1500 byte frames (note that it takes 1472 as the parameter for this) and then tell it not to fragment (-f) and to repeat until stopped (-t).

C:\Users\eric.rosenberry>ping 4.2.2.1 -l 1472 -f -t

Pinging 4.2.2.1 with 1472 bytes of data:
Reply from 4.2.2.1: bytes=1472 time=29ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=30ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=31ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=29ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=30ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=28ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=29ms TTL=53

Ping statistics for 4.2.2.1:
    Packets: Sent = 7, Received = 7, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 28ms, Maximum = 31ms, Average = 29ms
Control-C
^C
C:\Users\eric.rosenberry>

Ping on Linux

Hey look, it is a ping command that is not ambiguous about what size frames it is generating!!!  It clearly shows that the payload is 56 bytes, but that the full frame is 84.  Note that this is from an Ubuntu 2.6.27-7-generic kernel box.

ericr@eric-linux:~$ ping 4.2.2.1
PING 4.2.2.1 (4.2.2.1) 56(84) bytes of data.
64 bytes from 4.2.2.1: icmp_seq=1 ttl=52 time=21.8 ms
64 bytes from 4.2.2.1: icmp_seq=2 ttl=52 time=21.4 ms
64 bytes from 4.2.2.1: icmp_seq=3 ttl=52 time=21.4 ms
64 bytes from 4.2.2.1: icmp_seq=4 ttl=52 time=21.6 ms
64 bytes from 4.2.2.1: icmp_seq=5 ttl=52 time=21.8 ms
64 bytes from 4.2.2.1: icmp_seq=6 ttl=52 time=21.8 ms
^C
--- 4.2.2.1 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5019ms
rtt min/avg/max/mdev = 21.435/21.700/21.897/0.190 ms
ericr@eric-linux:~$

To set the packet size use the -s flag (it is asking for payload size, so 1472 will create a 1500 byte frame).  Now if you want to turn off fragmentation by setting the do-not-fragment bit (DF), the parameter is a bit more obscure “-M on”.  Here is an example using both:

ericr@eric-linux:~$ ping 4.2.2.1 -s 1472 -M do
PING 4.2.2.1 (4.2.2.1) 1472(1500) bytes of data.
1480 bytes from 4.2.2.1: icmp_seq=1 ttl=52 time=23.8 ms
1480 bytes from 4.2.2.1: icmp_seq=2 ttl=52 time=24.1 ms
1480 bytes from 4.2.2.1: icmp_seq=3 ttl=52 time=31.4 ms
1480 bytes from 4.2.2.1: icmp_seq=4 ttl=52 time=23.7 ms
1480 bytes from 4.2.2.1: icmp_seq=5 ttl=52 time=23.5 ms
^C
--- 4.2.2.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4017ms
rtt min/avg/max/mdev = 23.589/25.369/31.469/3.057 ms
ericr@eric-linux:~$

And there is another highly useful parameter that we have not seen yet in any of our previous ping utilities.  The linux ping has a “flood” option that will send pings as fast as the machine can generate them.  This is great for testing network links capacity, but can make for unhappy network engineers if you use it inappropriately.  Note that you must be root to use the -f flag.  Output is only shown when packets are dropped:

ericr@eric-linux:~$ sudo ping 10.0.0.1 -s 1472 -M do -f
PING 10.0.0.1 (10.0.0.1) 1472(1500) bytes of data.
.^C
--- 10.0.0.1 ping statistics ---
2763 packets transmitted, 2762 received, 0% packet loss, time 6695ms
rtt min/avg/max/mdev = 2.342/2.374/3.204/0.065 ms, ipg/ewma 2.424/2.384 ms
ericr@eric-linux:~$

Ping on Solaris

Here is the ping options from a Solaris 10 box (I forget what update this super-secret kernel number decodes too):

SunOS dbrd02 5.10 Generic_125100-07 sun4v sparc SUNW,Sun-Fire-T200

I find the basic ping command in Solaris to be annoying:

[erosenbe: dbrd02]/export/home/erosenbe> ping 4.2.2.1
4.2.2.1 is alive
[erosenbe: dbrd02]/export/home/erosenbe>

I want Ping to tell me something more useful than that a host is alive.  Come on Sun, like round trip time at least?  Maybe send a few additional pings than just one?  The -s command makes this operate more like the ping command in other OS’s:

[erosenbe: dbrd02]/export/home/erosenbe> ping -s 4.2.2.1
PING 4.2.2.1: 56 data bytes
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=0. time=21.6 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=1. time=21.5 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=2. time=21.6 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=3. time=21.4 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=4. time=21.1 ms
^C
----4.2.2.1 PING Statistics----
5 packets transmitted, 5 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 21.1/21.5/21.6/0.22
[erosenbe: dbrd02]/export/home/erosenbe>

With the Solaris built in Ping tool you can specify the packet size, but it is very annoying that you can’t set the do-not-fragment bit.  Come on SUN, didn’t you like invent networking???  So in this example I had it send multiple pings, and I told it to use a size of 1500 but I could not set the DF bit so the OS must have fragmented the packets before sending.  I got responses back that claim to be 1508 bytes which I am assuming means that the 1500 bytes specified was the payload amount and the returned number of bytes includes the 8 byte ICMP header, but not the 20 byte IP header…  Go SUN.

[erosenbe: dbrd02]/export/home/erosenbe> ping -s 4.2.2.1 1500
PING 4.2.2.1: 1500 data bytes
1508 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=0. time=36.7 ms
1508 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=1. time=23.2 ms
^C
----4.2.2.1 PING Statistics----
2 packets transmitted, 2 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 23.2/30.0/36.7/9.6
[erosenbe: dbrd02]/export/home/erosenbe>

Conclusion

Well, I hope this rundown on a number of different ping tools is useful to folks out there and as always, if you have any comments/questions/corrections please leave a comment!

-Eric

Categories: Cisco, Linux, Network, Sun Tags:

Are blade servers right for my environment?

July 15th, 2009 No comments

IT like most industries has it’s “fad”s.  Whether it be virtualization, or SAN’s, or blade servers.  Granted these three technologies play really nicely together, but once in a while you need to get off the bandwagon for a moment and think about what these technologies really do for us. While they are very cool overall and can make an extremely powerful team, as with anything, there is a right place, time, and situation/environment for their use.  Blades are clearly the “wave of the future” in many respects, but you must be cautious about the implications of implementing them today.

Please do not read this article and come away thinking I am “anti-blade” as that is certainly not the case.  I just feel they are all too often pushed into service in situations they are not the correct solution for and would like to point out some potential pitfalls.

Lifecycle co-termination

When you buy a blade center, one of the main selling points is that the network, SAN, and KVM infrastructure is built in.  This is great in terms of ease of deployment and management, however, on the financial side of things you must realize that the life span of these items is not normally the same.  When buying servers I typically expect them to be in service for 4 years, KVM’s (while becoming less utilized actually), can last much longer under most circumstances (barring changes in technology from PS/2 to USB, etc…), network switches I expect to use in some capacity or another for seven years, and SAN switches will probably have a similar life-cycle to the Storage Arrays they are attached to which I generally target at 5 year life spans.

So what does this mean?  Well, if your servers are showing their age in 4 years you are likely to end up replacing the entire blade enclosure at that point which includes the SAN and network switches.  It is possible the vendor will still sell blades that will fit in that enclosure, however, you are likely to be wanting a SAN or network upgrade before the end of those second set of servers life-cycles which will likely result in whole new platforms being purchased anyway.

Vendor lock

You have just created vendor lock such that with all the investment in enclosures you can’t go buy someone elses servers (this really sucks when your vendor fails to innovate on a particular technology).  All the manufacturers realize this situation exists and will surely use it to their advantage down the road.  It is hard to threaten not to buy Dell blades to put in your existing enclosures when that would mean throwing away your investment in SAN and network switches.

San design

Think about your SAN design – Most shops hook servers to a SAN switch which is directly attached to the storage array their data lives on.  Blade enclosures encourage the use of many more smaller SAN switches which often requires hooking the blade enclosure switches to other aggregation SAN switches which are then hooked to the Storage Processor.  This increases the complexity, increases failure points, decreases MTBF, and increases vendor lock.  Trunking SAN switches together from different vendors can be problematic and may require putting them in a compatibility mode which turns off useful features.

Vendor compatibility

Vendor compatibility becomes a huge issue- Say that you buy a blade enclosure today with 4 gig Brocade SAN switches in it for use with your existing 2 gig Brocade switches attached to an EMC Clarion CX500, but then next year you want to replace that with a Hitachi array attached to new Cisco SAN switches.  There are still many interop issues between SAN switch vendors that make trunking switches problematic.  If you had bought physical servers you may have just chosen to re-cable the servers over to the new Cisco switches directly.

Loss of flexibility

Another pitfall that I have seen folks fall into with blade servers is the loss of flexibility that comes with having a stand alone physical server.  You can’t hook up that external hard drive array full of cheap disks directly to the server, or hook up that network heartbeat crossover cable for your cluster, or add an extra NIC or two to a given machine that needs to be directly attached to some other network (that is not available as a VLAN within your switch)….

Inter-tying dependencies

You are creating dependencies on the common enclosure infrastructure so for full redundancy you need servers in multiple blade enclosures.  The argument that the blade enclosures are extremely redundant does not completely hold water to me.  I have needed to completely power cycle entire blade enclosures before to recover from certain blade management module failures.

Provisioning for highest common denominator

You must provision the blade enclosure for the maximum amount of SAN connectivity, network connectivity, and redundancy that is required on any one server within the enclosure.  Say for instance you have a authentication server that is super critical, but not resource intensive.  This requires your blade center to have fully redundant power supplies, network switches, etc…  Then say you have a different server that needs four 1 gig network interfaces, and yet another DB server that needs only two network interfaces, but it needs four HBA connections to the SAN.  You now need an enclosure that has four network switches and four SAN switches in it just to satisfy the needs of three “special case” servers.  In the case of the Dell M1000 blade enclosures, this configuration would be impossible since they can only have six SAN/Network modules total.

Buying un-used infrastructure

If you purchase a blade center that is not completely full of blades then you are wasting infrastructure resources in the form of unused network ports, SAN ports, power supply, and cooling capacity.  Making the ROI argument for blade centers is much easier if you have need to purchase full enclosures.

Failing to use existing infrastructure

Most environments have some amount of extra capacity on their existing network and SAN switches, as when they were purchased, they planned for the future (probably not with blade enclosures in mind).  Spending money to re-purchase SAN and network hardware within a blade enclosure to allow the use of blades can kill the cost advantages of going with a blade solution.

Moving from “cheap” disks to expensive SAN disks

You typically can not put many local disks into blades.  This is in many cases a huge loss as not everything needs to be on the SAN (and in fact, certain things would be very stupid to put on the SAN such as SWAP files).  I find that these days many people overlook the wonders of locally attached disk.  It is the *cheapest* form of disk you can buy and also can be extremely fast!  If your application does not require any of the advanced features a SAN can provide then DONT PUT IT ON THE SAN!

Over-buying power

In facilities where you are charged for power by the circuit the key is to manage your utilization such that your un-used (but paid for) power capacity is kept to a minimum.  With a blade enclosure, on day 1 you must provide (in this example) two 30 amp circuits for your blade enclosure, even though you are only putting in 4 out of a possible 16 severs.  You are going to be paying for those circuits even though you are nowhere near fully utilizing them.  The Dell blade enclosures as an example require two three phase 30 amp circuits for full power (though depending on the server configurations you put in them you can get away with dual 30 amp 208v circuits).

Think about the end of the life-cycle

You can’t turn off the power to a blade enclosure until the last server in that enclosure is decommissioned.  You also need to maintain support and maintenance contracts on the SAN switches, network switches, and enclosure until the last server is no longer mission critical.

When are blades the right tools for the job?

  • When your operational costs of operations and maintenance personnel far outweigh the cost inefficiencies of blades.
  • When you are buying enough servers that you can purchase *full* blade enclosures that have similar connectivity and redundancy requirements (i.e. each needs two 1 gig network ports and two 4 gig SAN connections).
  • When you absolutely need the highest density of servers offered (note that most datacenters in operation today can’t handle the density of power required and heat that blades can put out).

An example of a good use of blades would be a huge Citrix farm, or VMWare farms, or in some cases webserver farms (though I would argue very large web farms that can scale out easily should be on some of the cheapest hardware you can buy which typically does not include blades).

Another good example would be compute farms (say even lucene cache engines) – as long as you have enough nodes to be able to fill enclosures with machines that have the same connectivity and redundancy requirements.

Conclusion

While blades can be great solutions, they need to be implemented in the right environments for the right reasons.  It may indeed be the case that the savings in operational costs of employees to setup, manage, and maintain your servers far outweighs all of the points raised above, but it is important to factor all of these into your purchase decision.

As always, if you have any feedback or comments, please post below or feel free to shoot me an email.

-Eric

Categories: Cisco, Dell, HP, IBM, Network, Sun, Systems Tags:

Cisco Netflow to tell who is using Internet bandwidth

July 4th, 2009 1 comment

When working with telecom circuits that are slow and “expensive” (relative to lan circuits), the question frequently comes up “What is using up all of our bandwidth?”.  Many times this is asked because an over-subscribed WAN or Internet circuit is inducing latency/packet drops in mission critical applications such as Citrix or VoIP.  In other cases a company may be paying for a “burstable” Internet connection whereby they are paying for a floor of 10 megabits, but they can utilize up to 30 megabits and just be billed for the overage (at the 95th percentile generally).

So how do you tell which user/server/application is chewing up your Internet or WAN circuits?  Well Cisco has implemented a technology called “netflow” that allows your router to keep statistics on each TCP or UDP “flow” and then periodically shove that data into a logging packet and ship it off to some external server.  On this server you can run one of a variety of different software packages to analyze the data and understand what is using up your network bandwidth.

The question is, what software package should you utilize?  I have not gone and evaluated all of the available options, but I do have experience with a couple of them.  I have used Scrutinizer from Plixer in the past and not been very impressed.  Part of it may have been that the machine it was running on was not very fast, but I just did not like the interface or capabilities much.

More recently I have downloaded and run NetFlow Analyzer from ManageEngine and I have been very impressed!  It is free for only two interfaces and they have an easy-to download and install demo that will run unlimited interfaces for 30 days.  It runs on Linux or Windows (I tried the Linux version) and is is dirt simple to install and configure.  There really is nothing of note to configure on the server itself, you just need to point your router at the server’s IP and it will automatically start generating graphs for you.

I should also mention that Paessler has some kind of netflow capabilities (in PRTG), but I have not checked it out.  I note it here since I use their snmp monitoring software extensively and I have been happy with it.

To get your router to send NetFlow data to a collector, you need to set a couple of basic settings (including which version of NetFlow to use and where to send the packets), and then enable sending flows for traffic on all interfaces.  Note that it used to be you could only collect netflow data upon ingress to an interface and so in order to collect data on bi-directional traffic you needed to enable it on every single router interface in order to see the traffic in the opposite direction.  This was done with the “ip route-cache flow” command on each interface.

Now “ip route-cache flow” has been replaced with “ip flow ingress” and you can also issue “ip flow egress” command if you were to not wanting to monitor all router interfaces.  I have just stuck with issuing “ip flow ingress” on all my interfaces since I wanted to see all traffic anyway (and I am not quite sure what would happen if you issue both commands on two interfaces and then had traffic flow between them, it might double count those flows).

Here are the exact commands I used on plunger to ship data to Netflow Analyzer 7:

plunger#conf t

Enter configuration commands, one per line.  End with CNTL/Z.

plunger(config)#ip flow-cache timeout active 1

plunger(config)#ip flow-export version 5

plunger(config)#ip flow-export destination x.x.x.x 9996

plunger(config)#int fastEthernet 0/0

plunger(config-if)#ip flow ingress

plunger(config-if)#int fastEthernet 0/1

plunger(config-if)#ip flow ingress

plunger(config-if)#end

plunger#write mem

Building configuration…

[OK]

plunger#exit

Happy NetFlowing!

-Eric

Categories: Cisco, Network Tags:

Finally Got My Cisco ASA 5510 AnyConnect Essentials License

June 23rd, 2009 5 comments

After waiting for several weeks for Cisco to fufill my license code order (for ASA-AC-E-5510) I finally got the code in an email today!  Using the product authorization key to generate an activation key for my specific device was easy on the Cisco licensing web site.

I used the “activation key” command to plug it into my ASA 5510 and now I have the “AnyConnect Essentials” feature enabled when I do a “show ver”.

I never rebooted the device to activate the new license, however, I have not tested more than one user logged into it so I have no proof it works as of yet.

My ASA 5510 has been running the 8.2 code now for 26 days as my production corporate firewall without a hitch so I give it the thumbs up.  I did have one hiccup while rebooting the box after uploading the 8.2 code, but I actually think that 8.0.4 crashed when I asked it to reboot, rather than the 8.2 code failing to come up properly (I was not on the console however when I did this so I really have no proof, I ended up power-cycling the box).

I do have to gripe that the AnyConnect client has had issues on my Windows Vista laptop a number of times, though According to Cisco this may be due to Windows bugs relating to sleeping my laptop (which I do multiple times a day).  I get the dreaded “The vpn client driver has encountered an error” message.

Perhaps one other thing worth noting is that 8.2 created a new “coredumpinfo” folder on the internal flash file system with a file in it called coredump.cfg.  This file seems to somehow update it’s timestamp every time you do a show run and so It messes up my RANCID process which grabs the config and file system directory listings every 30 minutes and diff’s them for me.  This causes RANCID to email me every half hour with useless data that this file changed.

P.S.  The AnyConnect Essentials license key for my ASA 5510 was only $108 from my CDW rep including shipping (which was email btw…)

-Eric

UPDATE 6/24/09:

I forgot to mention that I am running this 5510 with only the stock 256 megs of RAM without issue.  There is reference in the release notes of possibly needing more RAM on that platform for 8.2 depending on what you are doing.  My RAM utilization actually went down between 8.0.4 and 8.2, though I also made some config changes around the same time so YMMV.

Also, there is some reference in the ASDM GUI about needing to reboot after applying a new activation key so I may need to do that…  Still have not tested it yet since my Vista laptop is being dumb.

Categories: Cisco, Network Tags:

Cisco ASA 5510 8.2 AnyConnect License Price ASA-AC-E-5510

May 13th, 2009 5 comments

As a follow-up to a previous post, I am happy to report that Cisco has finally posted the bits to the ASA 8.2 code online for download.  I have been looking forward to this, as this release includes a new license model for the AnyConnect VPN client called “Cisco AnyConnect Essentials”.

While I still can’t find any written reference (on the Cisco price list or elsewhere) for how much the AnyConnect VPN client is going to cost, I have confirmed that the previous rumor of it being “next to free” is indeed true.  Cisco is only charging $150 for the AnyConnect VPN Essentials license on a 5510 which will give you up to 250 simultaneous users!  (that is about as close-too-free as Cisco gets)

This is the answer you are looking for to deal with 64 bit client support!  A coworker of mine even told me today that the AnyConnect client works in his Windows 7 Beta 2 machine (which surprised me, I suspect under-the-hood the Windows 7 networking stack is very similar to Windows Vista).

The part number you need for an ASA 5510 is ASA-AC-E-5510=.  If you need the part numbers for other models check out the release announcement.

There is some reference in the release notes to possibly needing more ram in the ASA 5510 platforms (I am not yet sure if this will impact me, I am not doing a ton of stuff on my ASA 5510 but yet I run near 80% RAM utilization on version 8.0.4).  It is worth noting that there is annoying footnote that says the 256 -> 512 meg of RAM upgrade won’t be available till June…

Also, I have been told that the Botnet detection feature will be $460 a year.  This is part number ASA5510-BOT-1YR= for the ASA 5510.

I will write up another post once I install the 8.2 code somewhere.

-Eric

UPDATE: 5/18/09

I am getting conflicting information from my VAR than I got directly from Cisco.  They say MSRP is $350 right now and it won’t be available till late this month or early June.  CDW has it posted for $232.99 without any special pricing discounts you may have with them.  Availability says to call…

UPDATE: 5/29/09

The CDW site now shows that the ASA-AC-E-5510 part is $101.99.  It still says availability is “call”…

And for those of you looking for the part numbers you need to purchase the AnyConnect Essentials for your model of ASA, here they are:

  • AnyConnect Essentials VPN License – ASA 5505 (25 Prs) – ASA-AC-E-5505=
  • AnyConnect Essentials VPN License – ASA 5510 (250 Prs) – ASA-AC-E-5510=
  • AnyConnect Essentials VPN License – ASA 5520 (750 Prs) – ASA-AC-E-5520=
  • AnyConnect Essentials VPN License – ASA 5540 (2500 Prs) – ASA-AC-E-5540=
  • AnyConnect Essentials VPN License – ASA 5550 (5000 Prs) – ASA-AC-E-5550=
  • AnyConnect Essentials VPN License – ASA 5580 (10K Prs) – ASA-AC-E-5580=
Categories: Cisco, Network Tags:

Cisco PIX/ASA VPN Client for 64 Bit Windows

April 23rd, 2009 3 comments

For quite some time now I have been annoyed at Cisco for not releasing a 64 bit edition of their IPSec VPN client.  As far as I am concerned, their plan has been to force everyone over to the AnyConnect VPN client (SSL VPN) which does support 64 bit clients (i.e. Windows Vista 64 bit).

Oh, and by-the-way the AnyConnect Premium client costs $1,250 (MSRP) for a 10 concurrent user license on a 5510, where as the IPSEC client is FREE for unlimited users.

On a recent trip to Costco I was amazed at what percentage of the systems now being sold were coming with 64bit Windows Vista.  There are becoming more-and-more home users that can’t VPN in anymore with the IPSEC client.

While I am still unhappy with Cisco for artificially forcing people over to AnyConnect client, there is some amount of relief in sight to this issue.

Cisco just announced this week at RSA some new features that will be included in the ASA 8.2 code.  One of which is a new AnyConnect license level “AnyConnect Essentials” which according to my sources will be “Almost Free” (you can decide what that means for yourself).  This license will provide basic VPN  access, but not include the clientless web portal stuff or Cisco Secure Desktop (basically the stuff that I would rather not have to support anyway).

The other cool feature I am looking forward to is a new Botnet detection capability.  Basically the ASA will periodically download signature files from Cisco that tell it what traffic to look for.  If the ASA observes internal machines connecting out to known Botnet controllers it will be able to report on them.  There will be a yearly fee to enable this, so I am curious to see what they charge.

No word yet on availability dates for 8.2 or official pricing.  You can find more details on what is in 8.2 here.

-Eric

Categories: Cisco, Network Tags:

Cisco PIX/ASA ASDM Troubles with Java

March 31st, 2009 No comments

Over the past few months I have been annoyed by ASDM bug CSCsv12681 which causes pretty much all versions of ASDM to fail on any machine that gets upgraded to a recent Java version:

Error while loading ASDM: “Unconnected sockets not implemented”

Symptom:
While loading ASDM, a dialog is displayed that says:
“ASDM cannot be loaded. Click OK to exit ASDM.
Unconnected sockets not implemented”

This occurs when using Java 6 Update 10 or later.

Conditions:
ASDM version 5.0 or later running on ASA, PIX or FWSM and using Java 6
Update 10 or later.

Workaround:
Use Java 6 Update 7.

I have a couple Pix devices running 8.0(4) code that I have not been able to manage with ASDM 6.1.3 since I re-built my laptop with Windows Vista and the latest JVM.  Today I decided to fix this by upgrading to ASDM interim release 6.1(5)51.  After the upgrade, I ran into a different issue where the software would load (progress bar would finish), but it would get stuck trying to open the main window and just sit there spinning.

I went to Cisco’s download page for the pix and looked for even newer interim releases (the 6.1(5)51 version had been listed on the main download page) and discovered bug CSCsw43498:

ASDM is not working with Java 1.6.0_11 and Vista OS. 

Symptom:

ASDM on Vista OS with Java 1.6.0_11 unable to load

Conditions:

Windows Vista OS and Java 1.6.0_10/Java 1.6.0_11

Workaround:
1. Right-click on the Cisco ASDM Launcher shortcut.
2. Select “Property”
3. Click on “Compatibility”
4. Click on the box “Disable Visual Theme”
5. Restart ASDM

Alternatively, users can downgrade to Java version 1.6.0_07 (download from http://java.sun.com/products/archive/j2se/6u7/

I opted for the workaround rather than running an even more bleeding edge interim release 6.1(5)57.  Note that it did not work the first time, so I set it “for all users” rather than just for myself and then it seemed to work.

ASDM Vista Compatibility Mode

ASDM Vista Compatibility Mode

-Eric

UPDATE: 5/30/09

Cisco has release ASDM 6.2 which I believe fixes all the bugs listed above and it works with PIX/ASA code 8.0.2, 8.0.2, 8.0.4 and 8.2.  Check out the release notes for the full compatability guide.  This release corresponds with version 8.2 of the ASA code which has a number of new exciting features including the AnyConnect Essentials license pack which supports 64 bit Windows.

Categories: Cisco, Network Tags:

Cisco 3750 Delayed Power On

March 31st, 2009 7 comments

Last week I un-racked a Cisco 3750 (WS-C3750G-24TS) from our datacenter and brought it over to our office.  This device had been running reliably at the datacenter for over a year.  When I went to power it back up at the office, the device acted DOA.  No lights, no fans, nothing.

I tested the power cord on another 3750 switch to make sure the power was good and it worked fine.  So I plugged it back into the dead 3750 and tried re-seating the cord, etc, to no avail.  At this point I picked up the phone and called Cisco to get an RMA issued.  About half way through my conversation with the call-taker, I looked down on my desk and realized the switch had booted.  I continued with the Cisco case as the last thing I needed was an intermittent switch.

I then tried un-plugging the switch and plugging it back in several times, and it would boot every time.  Thinking it might be a capacitor charge issue, I un-plugged it while I went to lunch, and then re-connected it when I got back and it took several minutes again to power on.  I have since reproduced this a third time, in which it took over five minutes to boot!  I actually have it on video just in case I need to prove it to Cisco.

This was such a weird failure mechanism that I figured I should share in the hopes that someone else might find it a useful data point.

-Eric

Categories: Cisco, Network Tags: