Cisco AnyConnect Split-DNS resolution not working in Snow Leopard 10.6

November 20th, 2009 2 comments

I just upgraded the Cisco AnyConnect client on my ASA 5510 to 2.4.0202, hoping that the VPN would work for my users on Mac OS X 10.6 Snow Leopard, but it appears they are having DNS resolution issues.  I use the Split-DNS functionality of the ASA/AnyConnect client to send DNS queries for a couple of domain names to the across-the-VPN DNS servers.

My brief testing has shown that all DNS queries are being sent to the remote host’s local DNS servers, rather than to the corporate DNS servers for the Split-DNS domains.

I found Cisco bug ID CSCtc54466, which describes this issue.  It pins the problem on Mac OS X 10.6, claims the issue is in Apple’s mDNS code, and says it is “likely to be fixed in Mac OS X 10.6.3”.

In the meantime they claim you can “Restart the mDNSResponder service”.  I am assuming you would need to restart this service each time you VPN in?  I have not yet looked into exactly how to restart it; I will edit this post once I figure it out.
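
My first guess would be that since launchd supervises the daemon (and should respawn it automatically), simply killing it counts as a “restart”; I have not verified that this actually cures the Split-DNS issue, though:

sudo killall mDNSResponder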

-Eric

Categories: Apple, Cisco, Network Tags:

What type of server rack/cabinet should I buy?

October 17th, 2009 2 comments

Over the years I have run into any number of problems physically mounting servers and equipment into racks/cabinets/enclosures.  This is often a major headache, as it is not easy to swap out your enclosure without taking everything offline, and oftentimes (in the case of colocation facilities) it is simply not an option.

I was asked yesterday by a colocation facility for my advice on what types of new cabinets they should buy, which started me thinking.  I decided to post my current recommendations online in the hopes that it is of use to others:

  • Any server cabinet you buy absolutely 100% must have front to back cooling with fully perforated doors (none of this lexan crud with small holes and fans).  Every last square inch of the front and back needs to be perforated.  Period.
  • Don’t even think about using a cabinet with a bottom-to-top cooling design.  If your colo provider tries to give you one of these, run away screaming.  That design was intended for telco style equipment, and what happens is the gear in the top of the rack bakes.  Modern gear needs much more airflow than that model can provide.
  • Make sure the cabinet is deep enough for everything you intend to put in it (including bezels on the front and cables on the back).  This is the largest problem I run into with colocation facilities that have old racks.  Equipment has gotten longer, but a lot of facilities don’t want to spend the money to upgrade (and the longer cabinets take up more floor space).  From a quick survey of rack manufacturers, it looks like 42″ is the new standard depth that should work with pretty much everything.
  • Make sure any cabinet you buy has standard mount points for vertical-mount PDUs (Power Distribution Units).  With density increasing, vertical PDUs are the only way to go (and they put the power strip right where you need it, so you can use extremely short power cables).
  • The industry standard is now to have square holes rather than round holes.  This keeps you from stripping out threading and ruining an entire rack rail.  You can put cage nuts in the holes if you need threads.  (As a side note, I have seen at least three types of round hole racks, two with different types of threading, and one with no threading at all – I am glad these are all going away – except in two post racks where threaded holes are still standard)
  • Vertical wire management channels, chases, and brackets are a plus.  Think about how you are going to run your power, network, fiber, etc… cables.
  • Make sure the cabinet is built heavy-duty enough to handle the increased density of modern equipment.  Older cabinets were not designed for today’s weight loads.
  • The cabinets you get need to have proper heavy-duty bolt down points for earthquake and stability purposes (so they don’t tip over on you when you pull out servers).  Think about how this will work in the context of raised floors (if you have raised floors).
  • Decide if you want combo dials on the cabinets, or keyed entry.  I personally think colo facilities should offer both and let the customer decide.
  • Your standard height cabinet is 42 rack units.  I don’t see any reason to deviate from this unless you can’t get something that tall into the building.  They also make taller ones, but who really thinks lifting servers above your head is a good idea, OSHA-wise?
  • Standard width these days is 24 inches.  If this is your own personal datacenter you could consider wider cabinets to provide a little more wiring space, but 24 is the industry norm (note that regardless of cabinet width, the rail width needs to be 19″ which is the standard).
  • Some cabinets come with split rear doors for reduced clearance which I find to be very convenient in many cases.  I really like the Dell ones.
  • The doors need to be very easy to remove and put back on (by ONE person) without hassle (like little nylon washers that fall out and get lost).  The doors should not bow or flex such that lining up the pins is a pain in the butt.  Dell gets good marks here too.
  • When you go to put equipment in your cabinet, if it has adjustable rails, make sure to adjust them properly BEFORE you install all your equipment.  Most server equipment can accept a certain range of depths these days so pick a depth that fits all your gear.

As usual, please post below if you have any comments/questions or shoot me an email!

-Eric

Categories: Uncategorized Tags:

Review of ATT Wireless HSDPA and Verizon Wireless EVDO Rev A.

September 16th, 2009 2 comments

Today I brought home a new ATT USBConnect Mercury card to test out the service in comparison to my trusty Dell Wireless 5720 EVDO Rev-A card built into my Latitude D630 (which is about 18 months old now).

Not wanting to pollute my primary work laptop with extra cruft, I installed the ATT card software on a spare Dell D800 I had lying around for test purposes.

Here is a screenshot of the ATT Communication Manager as connected from the master bedroom of my house in Beaverton/Hillsboro Oregon:

ATT Communication Manager

As you can see, the signal is decent, but not right under a tower.

The first test, as always, is to ping my favorite IP address.  Sorry for the long paste here, but it is important to see how the latency varies over time.  From this test (and numerous others earlier in the day from work) I was not very impressed with the latency of the card or its consistency (jitter).

C:\Documents and Settings\Administrator>ping 4.2.2.1 -t
Pinging 4.2.2.1 with 32 bytes of data:
Reply from 4.2.2.1: bytes=32 time=318ms TTL=53
Reply from 4.2.2.1: bytes=32 time=388ms TTL=53
Reply from 4.2.2.1: bytes=32 time=386ms TTL=53
Reply from 4.2.2.1: bytes=32 time=345ms TTL=53
Reply from 4.2.2.1: bytes=32 time=384ms TTL=53
Reply from 4.2.2.1: bytes=32 time=422ms TTL=53
Reply from 4.2.2.1: bytes=32 time=311ms TTL=53
Reply from 4.2.2.1: bytes=32 time=339ms TTL=53
Reply from 4.2.2.1: bytes=32 time=338ms TTL=53
Reply from 4.2.2.1: bytes=32 time=336ms TTL=53
Reply from 4.2.2.1: bytes=32 time=365ms TTL=53
Reply from 4.2.2.1: bytes=32 time=323ms TTL=53
Reply from 4.2.2.1: bytes=32 time=362ms TTL=53
Reply from 4.2.2.1: bytes=32 time=320ms TTL=53
Reply from 4.2.2.1: bytes=32 time=409ms TTL=53
Reply from 4.2.2.1: bytes=32 time=397ms TTL=53
Reply from 4.2.2.1: bytes=32 time=276ms TTL=53
Reply from 4.2.2.1: bytes=32 time=285ms TTL=53
Reply from 4.2.2.1: bytes=32 time=353ms TTL=53
Reply from 4.2.2.1: bytes=32 time=322ms TTL=53
Reply from 4.2.2.1: bytes=32 time=350ms TTL=53
Reply from 4.2.2.1: bytes=32 time=309ms TTL=53
Reply from 4.2.2.1: bytes=32 time=417ms TTL=53
Reply from 4.2.2.1: bytes=32 time=266ms TTL=53
Reply from 4.2.2.1: bytes=32 time=304ms TTL=53
Reply from 4.2.2.1: bytes=32 time=515ms TTL=53
Reply from 4.2.2.1: bytes=32 time=94ms TTL=53
Reply from 4.2.2.1: bytes=32 time=102ms TTL=53
Reply from 4.2.2.1: bytes=32 time=101ms TTL=53
Reply from 4.2.2.1: bytes=32 time=99ms TTL=53
Reply from 4.2.2.1: bytes=32 time=1056ms TTL=53
Reply from 4.2.2.1: bytes=32 time=369ms TTL=53
Reply from 4.2.2.1: bytes=32 time=353ms TTL=53
Reply from 4.2.2.1: bytes=32 time=431ms TTL=53
Reply from 4.2.2.1: bytes=32 time=390ms TTL=53
Reply from 4.2.2.1: bytes=32 time=308ms TTL=53
Reply from 4.2.2.1: bytes=32 time=337ms TTL=53
Ping statistics for 4.2.2.1:
    Packets: Sent = 37, Received = 37, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 94ms, Maximum = 1056ms, Average = 345ms
Control-C
^C
C:\Documents and Settings\Administrator>

The next test up was to see how fast the download/upload performance would be.  For this purpose I used Speedtest.net:

Speedtest.net ATT Test From Home

Hmm, that is certainly nothing to write home about, but not absolutely horrible for a mobile broadband card.

After checking latency and bandwidth, I moved on to test the maximum MTU size the connection would support.  Some applications are finicky about MTU sizes (especially UDP based ones).  I found that on ATT Wireless, the largest packet size it would support was 1450 bytes (note that in the Windows ping tool below you specify this as an ICMP payload size of 1422, to which it adds 8 bytes of ICMP header and 20 bytes of IP header).

C:\Documents and Settings\Administrator>ping 4.2.2.1 -f -l 1422
Pinging 4.2.2.1 with 1422 bytes of data:
Reply from 4.2.2.1: bytes=1422 time=1021ms TTL=53
Reply from 4.2.2.1: bytes=1422 time=167ms TTL=53
Reply from 4.2.2.1: bytes=1422 time=178ms TTL=53
Reply from 4.2.2.1: bytes=1422 time=167ms TTL=53
Ping statistics for 4.2.2.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 167ms, Maximum = 1021ms, Average = 383ms
C:\Documents and Settings\Administrator>

The Verizon test

Here is the same ping test from the verizon card in my Dell Latitude D630:

C:\Users\eric.rosenberry>ping 4.2.2.1 -t

Pinging 4.2.2.1 with 32 bytes of data:
Reply from 4.2.2.1: bytes=32 time=87ms TTL=49
Reply from 4.2.2.1: bytes=32 time=87ms TTL=49
Reply from 4.2.2.1: bytes=32 time=90ms TTL=49
Reply from 4.2.2.1: bytes=32 time=87ms TTL=49
Reply from 4.2.2.1: bytes=32 time=85ms TTL=49
Reply from 4.2.2.1: bytes=32 time=92ms TTL=49
Reply from 4.2.2.1: bytes=32 time=88ms TTL=49
Reply from 4.2.2.1: bytes=32 time=86ms TTL=49
Reply from 4.2.2.1: bytes=32 time=90ms TTL=49
Reply from 4.2.2.1: bytes=32 time=84ms TTL=49
Reply from 4.2.2.1: bytes=32 time=89ms TTL=49
Reply from 4.2.2.1: bytes=32 time=88ms TTL=49
Reply from 4.2.2.1: bytes=32 time=86ms TTL=49
Reply from 4.2.2.1: bytes=32 time=91ms TTL=49
Reply from 4.2.2.1: bytes=32 time=87ms TTL=49
Reply from 4.2.2.1: bytes=32 time=113ms TTL=49
Reply from 4.2.2.1: bytes=32 time=97ms TTL=49
Reply from 4.2.2.1: bytes=32 time=95ms TTL=49
Reply from 4.2.2.1: bytes=32 time=85ms TTL=49
Reply from 4.2.2.1: bytes=32 time=90ms TTL=49
Reply from 4.2.2.1: bytes=32 time=88ms TTL=49
Reply from 4.2.2.1: bytes=32 time=88ms TTL=49
Reply from 4.2.2.1: bytes=32 time=92ms TTL=49
Reply from 4.2.2.1: bytes=32 time=92ms TTL=49
Reply from 4.2.2.1: bytes=32 time=87ms TTL=49
Reply from 4.2.2.1: bytes=32 time=85ms TTL=49
Reply from 4.2.2.1: bytes=32 time=89ms TTL=49
Reply from 4.2.2.1: bytes=32 time=86ms TTL=49
Reply from 4.2.2.1: bytes=32 time=89ms TTL=49
Reply from 4.2.2.1: bytes=32 time=86ms TTL=49
Reply from 4.2.2.1: bytes=32 time=89ms TTL=49
Reply from 4.2.2.1: bytes=32 time=85ms TTL=49
Ping statistics for 4.2.2.1:
    Packets: Sent = 32, Received = 32, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 84ms, Maximum = 113ms, Average = 89ms
Control-C
^C
C:\Users\eric.rosenberry>

Note how much lower that is than the ATT HSDPA card!  An average of 89ms vs. 345ms!  Verizon’s max ping time is nearly the same as ATT’s best!

How about the speed test?

Verizon Wireless Speedtest From Home

It is still not great, but it is a bit better than ATT.  I often wonder how many of these speed test sites actually have enough bandwidth to provide realistic tests?

When testing the MTU size capabilities, I was pleasantly surprised to find that the Verizon EVDO card allowed full 1500 byte packets!

Wrap up

I feel I must point out the issues with my testing before drawing any conclusions:

  • My EVDO card is built into my Dell D630, with diversity antennas spatially separated in the screen that are probably larger than the antenna in the ATT dongle.  I could get a card that works with ATT in my D630 to do more apples-to-apples testing.
  • I only tested from a single location.  Wireless service is incredibly location dependent.  ATT could happen to have weaker signal at this particular location than Verizon (though I actually got similar results at my office earlier in the day in downtown Portland which provides a second data point).
  • Load on my particular tower may happen to have been heavy at the time of my testing, so these types of results are only useful in aggregate when enough samples are collected to be statistically significant.

It is also worth noting that, anecdotally, browsing on the ATT card was painfully slow, whereas browsing on the Verizon card was pretty good.  I started this blog post on the ATT card and then switched over to the Verizon card to finish it.

Other things that can impact performance include:

  • Type of gear deployed on the tower you are connecting to (is it HSDPA capable)
  • Signal strength
  • Number of carrier channels deployed on the tower in question
  • Number of other users on the tower (density)
  • The time of day (rush hour gets a lot of use)
  • The back-haul capacity from the tower (is it connected by one or two T-1’s, or do they have DS-3/OC-3 back-hauls?)

So my conclusion?  If purchasing a broadband card for myself, I would definitely choose the Verizon card over the ATT card, no questions asked.  In Portland, Oregon, Verizon Wireless has a network that is extremely difficult to beat!

-Eric

Categories: Uncategorized Tags:

Advanced PING usage on Cisco, Juniper, Windows, Linux, and Solaris

September 15th, 2009 1 comment

As a network engineer, one of the most common utilities I use is the ping command.  While in its simplest form it is a very valuable tool, there is much more knowledge that can be gleaned from it by specifying the right parameters.

Ping on Cisco routers

On modern Cisco IOS versions the ping tool has quite a few options; however, this was not always the case.

Compare the options to ping in IOS 12.1(14)

EXRTR1#ping ip 4.2.2.1 ?
  <cr>

EXRTR1#ping ip 4.2.2.1

To that in IOS 12.4(24)T

plunger#ping ip 4.2.2.1 ?
  data      specify data pattern
  df-bit    enable do not fragment bit in IP header
  repeat    specify repeat count
  size      specify datagram size
  source    specify source address or name
  timeout   specify timeout interval
  validate  validate reply data
  <cr>

plunger#ping ip 4.2.2.1

When I am running a basic connectivity test between two points on a network, I will generally not specify any options to ping (i.e. “ping 4.2.2.1”).  Once I have verified connectivity, I will most often want to verify what MTU size the path will support without fragmentation, and then run an extended ping with a thousand or more pings to test the reliability/bandwidth/latency characteristics of the link.

Here is an example of the most basic form.  Note that by default it is sending 100 byte frames:

plunger#ping 4.2.2.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/26/28 ms
plunger#

If I am working on an Ethernet network (or PPP link), my target goal is most commonly for 1500 byte frames to make it through.  I will use the “size” parameter to force ping to generate 1500 byte packets (note that in Cisco land this means a 1472 byte ICMP payload plus 8 bytes of ICMP header and 20 bytes of IP header).  I also use the df-bit flag to set the DO NOT FRAGMENT bit on the generated packets.  This lets me ensure that the originating router (or some other router in the path) is not fragmenting the packets for me.

plunger#ping 4.2.2.1 size 1500 df-bit 

Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/26/28 ms
plunger#

If the first ping command worked but the command above did not, try backing down the size until you find a value that works.  Note that one common example of a smaller MTU is 1492, caused by the 8 bytes of overhead on PPPoE connections.
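
Rather than hunting for the working size by hand, IOS can also sweep a range of sizes for you via the interactive extended ping (type “ping” with no arguments and answer the prompts).  This is a sketch from memory, so the exact prompts may differ slightly between IOS versions:

plunger#ping
Protocol [ip]:
Target IP address: 4.2.2.1
Repeat count [5]: 1
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: yes
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]: y
Sweep min size [36]: 1400
Sweep max size [18024]: 1500
Sweep interval [1]:

With a repeat count of 1, this sends one DF-flagged ping at each size from 1400 to 1500 bytes; the size where the “!”s turn into “.”s (or “M”s) is where the path MTU lies.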

The next thing to try is sending a large number of pings of the maximum MTU your link can support.  This will help you identify packet loss issues, and it is just a good way to generate traffic on the link to see how much bandwidth you can push (if you’re monitoring the link with another tool).  I have frequently identified bad WAN circuits using this method.  Note that looking at the Layer 1/2 error statistics before doing this (and perhaps clearing them), and then looking at them again afterwards (on each link in the path!), is often a good idea.

plunger#ping 192.168.0.10 size 1500 df-bit repeat 1000

Type escape sequence to abort.
Sending 1000, 1500-byte ICMP Echos to 192.168.0.10, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (1000/1000), round-trip min/avg/max = 1/2/4 ms
plunger#

Now the final ping parameter I often end up using is the “source” option.  A good example is when you have a router with a WAN connection on one side that has a /30 routing subnet on it, plus an Ethernet connection with a larger subnet for your users’ devices.  Say that users on the Ethernet report they cannot ping certain locations on the WAN, though you can ping those locations just fine from the router.  This is often because the return path from the device you are pinging back to the users’ subnet on the Ethernet is not being routed properly, while the IP your router has on the /30 WAN subnet is being routed correctly.  The key here is that by default a Cisco router will originate packets from the IP of the interface it is going to send the traffic out of (based on its routing table).

To test from the interface your router has in the users subnet, use the source command like this:

plunger#ping 4.2.2.1 source fastEthernet 0/0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.2.2.1, timeout is 2 seconds:
Packet sent with a source address of 173.50.158.74
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/25/28 ms
plunger#

Note that you can also use these commands together, depending on what you are trying to do:

plunger#ping 192.168.0.10 size 1500 df-bit source FastEthernet 0/0 repeat 100

Type escape sequence to abort.
Sending 100, 1500-byte ICMP Echos to 192.168.0.10, timeout is 2 seconds:
Packet sent with a source address of 173.50.158.74
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (100/100), round-trip min/avg/max = 1/2/4 ms
plunger#

Ping on Juniper routers

The options to ping on Juniper (version 9.3R4.4 in this case) are quite extensive:

root@INCSW1> ping ?
Possible completions:
  <host>               Hostname or IP address of remote host
  bypass-routing       Bypass routing table, use specified interface
  count                Number of ping requests to send (1..2000000000 packets)
  detail               Display incoming interface of received packet
  do-not-fragment      Don't fragment echo request packets (IPv4)
  inet                 Force ping to IPv4 destination
  inet6                Force ping to IPv6 destination
  interface            Source interface (multicast, all-ones, unrouted packets)
  interval             Delay between ping requests (seconds)
  logical-system       Name of logical system
+ loose-source         Intermediate loose source route entry (IPv4)
  no-resolve           Don't attempt to print addresses symbolically
  pattern              Hexadecimal fill pattern
  rapid                Send requests rapidly (default count of 5)
  record-route         Record and report packet's path (IPv4)
  routing-instance     Routing instance for ping attempt
  size                 Size of request packets (0..65468 bytes)
  source               Source address of echo request
  strict               Use strict source route option (IPv4)
+ strict-source        Intermediate strict source route entry (IPv4)
  tos                  IP type-of-service value (0..255)
  ttl                  IP time-to-live value (IPv6 hop-limit value) (hops)
  verbose              Display detailed output
  vpls                 Ping VPLS MAC address
  wait                 Delay after sending last packet (seconds)
root@INCSW1> ping

While there are a lot more options here, I am generally trying to test the same types of things.  A very important note, however: in the Juniper world, the size parameter is the payload size and does not include the 8 byte ICMP header and 20 byte IP header.  The command below is the equivalent of specifying 1500 bytes in Cisco land.

root@INCSW1> ping 4.2.2.1 size 1472 do-not-fragment
PING 4.2.2.1 (4.2.2.1): 1472 data bytes
1480 bytes from 4.2.2.1: icmp_seq=0 ttl=53 time=25.025 ms
1480 bytes from 4.2.2.1: icmp_seq=1 ttl=53 time=24.773 ms
1480 bytes from 4.2.2.1: icmp_seq=2 ttl=53 time=24.757 ms
1480 bytes from 4.2.2.1: icmp_seq=3 ttl=53 time=25.045 ms
1480 bytes from 4.2.2.1: icmp_seq=4 ttl=53 time=24.911 ms
1480 bytes from 4.2.2.1: icmp_seq=5 ttl=53 time=25.152 ms
^C
--- 4.2.2.1 ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max/stddev = 24.757/24.944/25.152/0.145 ms

root@INCSW1>

Here is an example of sending lots of pings quickly on the Juniper to test link reliability:

root@INCSW1> ping 10.0.0.1 size 1472 do-not-fragment rapid count 100
PING 10.0.0.1 (10.0.0.1): 1472 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--- 10.0.0.1 ping statistics ---
100 packets transmitted, 100 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.230/2.429/5.479/1.228 ms

root@INCSW1>

Ping on Windows XP/Vista/2003/2008

While not as full featured, the Windows ping tool can at least set the packet size and the do-not-fragment bit:

Microsoft Windows [Version 6.0.6001]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

C:\Users\eric.rosenberry>ping ?
^C
C:\Users\eric.rosenberry>ping

Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS]
            [-r count] [-s count] [[-j host-list] | [-k host-list]]
            [-w timeout] [-R] [-S srcaddr] [-4] [-6] target_name

Options:
    -t             Ping the specified host until stopped.
                   To see statistics and continue - type Control-Break;
                   To stop - type Control-C.
    -a             Resolve addresses to hostnames.
    -n count       Number of echo requests to send.
    -l size        Send buffer size.
    -f             Set Don't Fragment flag in packet (IPv4-only).
    -i TTL         Time To Live.
    -v TOS         Type Of Service (IPv4-only).
    -r count       Record route for count hops (IPv4-only).
    -s count       Timestamp for count hops (IPv4-only).
    -j host-list   Loose source route along host-list (IPv4-only).
    -k host-list   Strict source route along host-list (IPv4-only).
    -w timeout     Timeout in milliseconds to wait for each reply.
    -R             Use routing header to test reverse route also (IPv6-only).
    -S srcaddr     Source address to use.
    -4             Force using IPv4.
    -6             Force using IPv6.

C:\Users\eric.rosenberry>

So your basic ping in Windows claims to send 32 bytes of data.  I have not verified this, but I suspect that is 32 bytes of payload, plus 8 bytes of ICMP and 20 bytes of IP, for a total of 60 bytes.

C:\Users\eric.rosenberry>ping 4.2.2.1

Pinging 4.2.2.1 with 32 bytes of data:
Reply from 4.2.2.1: bytes=32 time=27ms TTL=53
Reply from 4.2.2.1: bytes=32 time=25ms TTL=53
Reply from 4.2.2.1: bytes=32 time=28ms TTL=53
Reply from 4.2.2.1: bytes=32 time=27ms TTL=53

Ping statistics for 4.2.2.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 25ms, Maximum = 28ms, Average = 26ms

C:\Users\eric.rosenberry>

So a common set of flags I use is to create full 1500 byte packets (note that it takes 1472 as the parameter for this), tell it not to fragment (-f), and repeat until stopped (-t).

C:\Users\eric.rosenberry>ping 4.2.2.1 -l 1472 -f -t

Pinging 4.2.2.1 with 1472 bytes of data:
Reply from 4.2.2.1: bytes=1472 time=29ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=30ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=31ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=29ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=30ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=28ms TTL=53
Reply from 4.2.2.1: bytes=1472 time=29ms TTL=53

Ping statistics for 4.2.2.1:
    Packets: Sent = 7, Received = 7, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 28ms, Maximum = 31ms, Average = 29ms
Control-C
^C
C:\Users\eric.rosenberry>

Ping on Linux

Hey look, a ping command that is not ambiguous about what size packets it is generating!  It clearly shows that the payload is 56 bytes and that the full packet is 84.  Note that this is from an Ubuntu box running a 2.6.27-7-generic kernel.

ericr@eric-linux:~$ ping 4.2.2.1
PING 4.2.2.1 (4.2.2.1) 56(84) bytes of data.
64 bytes from 4.2.2.1: icmp_seq=1 ttl=52 time=21.8 ms
64 bytes from 4.2.2.1: icmp_seq=2 ttl=52 time=21.4 ms
64 bytes from 4.2.2.1: icmp_seq=3 ttl=52 time=21.4 ms
64 bytes from 4.2.2.1: icmp_seq=4 ttl=52 time=21.6 ms
64 bytes from 4.2.2.1: icmp_seq=5 ttl=52 time=21.8 ms
64 bytes from 4.2.2.1: icmp_seq=6 ttl=52 time=21.8 ms
^C
--- 4.2.2.1 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5019ms
rtt min/avg/max/mdev = 21.435/21.700/21.897/0.190 ms
ericr@eric-linux:~$

To set the packet size use the -s flag (it is asking for payload size, so 1472 will create a 1500 byte packet).  Now if you want to turn off fragmentation by setting the do-not-fragment (DF) bit, the parameter is a bit more obscure: “-M do”.  Here is an example using both:

ericr@eric-linux:~$ ping 4.2.2.1 -s 1472 -M do
PING 4.2.2.1 (4.2.2.1) 1472(1500) bytes of data.
1480 bytes from 4.2.2.1: icmp_seq=1 ttl=52 time=23.8 ms
1480 bytes from 4.2.2.1: icmp_seq=2 ttl=52 time=24.1 ms
1480 bytes from 4.2.2.1: icmp_seq=3 ttl=52 time=31.4 ms
1480 bytes from 4.2.2.1: icmp_seq=4 ttl=52 time=23.7 ms
1480 bytes from 4.2.2.1: icmp_seq=5 ttl=52 time=23.5 ms
^C
--- 4.2.2.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4017ms
rtt min/avg/max/mdev = 23.589/25.369/31.469/3.057 ms
ericr@eric-linux:~$

And there is another highly useful parameter that we have not seen in any of the previous ping utilities.  The Linux ping has a “flood” option that will send pings as fast as the machine can generate them.  This is great for testing a network link’s capacity, but can make for unhappy network engineers if you use it inappropriately.  Note that you must be root to use the -f flag.  Output is only shown when packets are dropped:

ericr@eric-linux:~$ sudo ping 10.0.0.1 -s 1472 -M do -f
PING 10.0.0.1 (10.0.0.1) 1472(1500) bytes of data.
.^C
--- 10.0.0.1 ping statistics ---
2763 packets transmitted, 2762 received, 0% packet loss, time 6695ms
rtt min/avg/max/mdev = 2.342/2.374/3.204/0.065 ms, ipg/ewma 2.424/2.384 ms
ericr@eric-linux:~$

Ping on Solaris

Here are the ping options from a Solaris 10 box (I forget what update this super-secret kernel number decodes to):

SunOS dbrd02 5.10 Generic_125100-07 sun4v sparc SUNW,Sun-Fire-T200

I find the basic ping command in Solaris to be annoying:

[erosenbe: dbrd02]/export/home/erosenbe> ping 4.2.2.1
4.2.2.1 is alive
[erosenbe: dbrd02]/export/home/erosenbe>

I want ping to tell me something more useful than that a host is alive.  Come on Sun, how about round trip time at least?  Maybe send a few more pings than just one?  The -s flag makes it operate more like the ping command in other OS’s:

[erosenbe: dbrd02]/export/home/erosenbe> ping -s 4.2.2.1
PING 4.2.2.1: 56 data bytes
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=0. time=21.6 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=1. time=21.5 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=2. time=21.6 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=3. time=21.4 ms
64 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=4. time=21.1 ms
^C
----4.2.2.1 PING Statistics----
5 packets transmitted, 5 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 21.1/21.5/21.6/0.22
[erosenbe: dbrd02]/export/home/erosenbe>

With the Solaris built-in ping tool you can specify the packet size, but it is very annoying that you can’t set the do-not-fragment bit.  Come on Sun, didn’t you practically invent networking?  So in this example I had it send multiple pings with a specified size of 1500, but since I could not set the DF bit, the OS must have fragmented the packets before sending.  I got responses back that claim to be 1508 bytes, which I assume means the 1500 bytes specified was the payload amount and the returned byte count includes the 8 byte ICMP header, but not the 20 byte IP header…  Go Sun.

[erosenbe: dbrd02]/export/home/erosenbe> ping -s 4.2.2.1 1500
PING 4.2.2.1: 1500 data bytes
1508 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=0. time=36.7 ms
1508 bytes from vnsc-pri.sys.gtei.net (4.2.2.1): icmp_seq=1. time=23.2 ms
^C
----4.2.2.1 PING Statistics----
2 packets transmitted, 2 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 23.2/30.0/36.7/9.6
[erosenbe: dbrd02]/export/home/erosenbe>

Conclusion

Well, I hope this rundown on a number of different ping tools is useful to folks out there and as always, if you have any comments/questions/corrections please leave a comment!

-Eric

Categories: Cisco, Linux, Network, Sun Tags:

Are blade servers right for my environment?

July 15th, 2009 No comments

IT, like most industries, has its fads, whether it be virtualization, or SANs, or blade servers.  Granted, these three technologies play really nicely together, but once in a while you need to get off the bandwagon for a moment and think about what these technologies really do for us.  While they are very cool overall and can make an extremely powerful team, as with anything, there is a right place, time, and situation/environment for their use.  Blades are clearly the “wave of the future” in many respects, but you must be cautious about the implications of implementing them today.

Please do not read this article and come away thinking I am “anti-blade”, as that is certainly not the case.  I just feel they are all too often pushed into service in situations they are not the correct solution for, and I would like to point out some potential pitfalls.

Lifecycle co-termination

When you buy a blade center, one of the main selling points is that the network, SAN, and KVM infrastructure is built in.  This is great in terms of ease of deployment and management; however, on the financial side of things you must realize that the life spans of these items are not normally the same.  When buying servers I typically expect them to be in service for 4 years.  KVMs (while becoming less utilized) can last much longer under most circumstances (barring changes in technology from PS/2 to USB, etc…).  Network switches I expect to use in some capacity or another for seven years, and SAN switches will probably have a life-cycle similar to the storage arrays they are attached to, which I generally target at 5 year life spans.

So what does this mean?  Well, if your servers are showing their age in 4 years you are likely to end up replacing the entire blade enclosure at that point which includes the SAN and network switches.  It is possible the vendor will still sell blades that will fit in that enclosure, however, you are likely to be wanting a SAN or network upgrade before the end of those second set of servers life-cycles which will likely result in whole new platforms being purchased anyway.

Vendor lock

You have just created vendor lock: with all the investment in enclosures, you can’t go buy someone else’s servers (this really sucks when your vendor fails to innovate on a particular technology).  All the manufacturers realize this situation exists and will surely use it to their advantage down the road.  It is hard to threaten not to buy Dell blades for your existing enclosures when that would mean throwing away your investment in SAN and network switches.

San design

Think about your SAN design.  Most shops hook servers to a SAN switch which is directly attached to the storage array their data lives on.  Blade enclosures encourage the use of many smaller SAN switches, which often requires hooking the blade enclosure switches to other aggregation SAN switches, which are then hooked to the storage processor.  This increases complexity, adds failure points, decreases MTBF, and increases vendor lock.  Trunking SAN switches together from different vendors can be problematic and may require putting them in a compatibility mode which turns off useful features.

Vendor compatibility

Vendor compatibility becomes a huge issue.  Say that you buy a blade enclosure today with 4 gig Brocade SAN switches in it for use with your existing 2 gig Brocade switches attached to an EMC CLARiiON CX500, but then next year you want to replace that with a Hitachi array attached to new Cisco SAN switches.  There are still many interop issues between SAN switch vendors that make trunking switches problematic.  If you had bought physical servers, you could have simply re-cabled them over to the new Cisco switches directly.

Loss of flexibility

Another pitfall that I have seen folks fall into with blade servers is the loss of flexibility that comes with having a standalone physical server.  You can’t hook up that external hard drive array full of cheap disks directly to the server, or hook up that network heartbeat crossover cable for your cluster, or add an extra NIC or two to a given machine that needs to be directly attached to some other network (one that is not available as a VLAN within your switch).

Inter-tying dependencies

You are creating dependencies on the common enclosure infrastructure, so for full redundancy you need servers in multiple blade enclosures.  The argument that the blade enclosures are extremely redundant does not completely hold water for me.  I have needed to completely power cycle entire blade enclosures before to recover from certain blade management module failures.

Provisioning for highest common denominator

You must provision the blade enclosure for the maximum amount of SAN connectivity, network connectivity, and redundancy that is required by any one server within the enclosure.  Say for instance you have an authentication server that is super critical, but not resource intensive.  This requires your blade center to have fully redundant power supplies, network switches, etc…  Then say you have a different server that needs four 1 gig network interfaces, and yet another DB server that needs only two network interfaces, but four HBA connections to the SAN.  You now need an enclosure that has four network switches and four SAN switches in it just to satisfy the needs of three “special case” servers.  In the case of the Dell M1000 blade enclosures, this configuration would be impossible since they can only hold six SAN/network modules total.

Buying un-used infrastructure

If you purchase a blade center that is not completely full of blades, then you are wasting infrastructure resources in the form of unused network ports, SAN ports, power supply, and cooling capacity.  Making the ROI argument for blade centers is much easier if you need to purchase full enclosures.

Failing to use existing infrastructure

Most environments have some amount of extra capacity on their existing network and SAN switches, since when those were purchased they were sized for future growth (probably not with blade enclosures in mind).  Spending money to re-purchase SAN and network hardware within a blade enclosure just to enable blades can kill the cost advantages of going with a blade solution.

Moving from “cheap” disks to expensive SAN disks

You typically cannot put many local disks into blades.  This is in many cases a huge loss, as not everything needs to be on the SAN (in fact, certain things, such as swap files, would be very stupid to put on the SAN).  I find that these days many people overlook the wonders of locally attached disk.  It is the *cheapest* form of disk you can buy and can also be extremely fast!  If your application does not require any of the advanced features a SAN can provide, then DON’T PUT IT ON THE SAN!

Over-buying power

In facilities where you are charged for power by the circuit, the key is to manage your utilization such that your unused (but paid for) power capacity is kept to a minimum.  With a blade enclosure, on day 1 you must provide (in this example) two 30 amp circuits for your blade enclosure, even though you are only putting in 4 out of a possible 16 servers.  You are going to be paying for those circuits even though you are nowhere near fully utilizing them.  The Dell blade enclosures, as an example, require two three-phase 30 amp circuits for full power (though depending on the server configurations you put in them, you can get away with dual 30 amp 208v circuits).

Think about the end of the life-cycle

You can’t turn off the power to a blade enclosure until the last server in that enclosure is decommissioned.  You also need to maintain support and maintenance contracts on the SAN switches, network switches, and enclosure until the last server is no longer mission critical.

When are blades the right tools for the job?

  • When the operational costs of your operations and maintenance personnel far outweigh the cost inefficiencies of blades.
  • When you are buying enough servers that you can purchase *full* blade enclosures that have similar connectivity and redundancy requirements (i.e. each needs two 1 gig network ports and two 4 gig SAN connections).
  • When you absolutely need the highest density of servers offered (note that most datacenters in operation today can’t handle the density of power required and heat that blades can put out).

An example of a good use of blades would be a huge Citrix farm, or VMWare farms, or in some cases webserver farms (though I would argue very large web farms that can scale out easily should be on some of the cheapest hardware you can buy, which typically does not include blades).

Another good example would be compute farms (say, even Lucene cache engines), as long as you have enough nodes to be able to fill enclosures with machines that have the same connectivity and redundancy requirements.

Conclusion

While blades can be great solutions, they need to be implemented in the right environments for the right reasons.  It may indeed be the case that the savings in operational costs of employees to set up, manage, and maintain your servers far outweighs all of the points raised above, but it is important to factor all of these into your purchase decision.

As always, if you have any feedback or comments, please post below or feel free to shoot me an email.

-Eric

Categories: Cisco, Dell, HP, IBM, Network, Sun, Systems Tags:

Cisco Netflow to tell who is using Internet bandwidth

July 4th, 2009 1 comment

When working with telecom circuits that are slow and “expensive” (relative to LAN circuits), the question frequently comes up: “What is using up all of our bandwidth?”  Many times this is asked because an over-subscribed WAN or Internet circuit is inducing latency/packet drops in mission critical applications such as Citrix or VoIP.  In other cases a company may be paying for a “burstable” Internet connection whereby they pay for a floor of 10 megabits, but can utilize up to 30 megabits and just be billed for the overage (generally at the 95th percentile).
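
As a quick aside on what “95th percentile” billing means: the provider samples your throughput (typically every 5 minutes), sorts the month’s samples, throws away the top 5%, and bills on the highest remaining sample.  Given a file with one Mbps reading per line, a back-of-the-napkin version looks something like this (my sketch, not any provider’s actual formula):

sort -n samples_mbps.txt | awk '{ v[NR] = $1 } END { print v[int(NR * 0.95)] " Mbps" }'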

So how do you tell which user/server/application is chewing up your Internet or WAN circuits?  Cisco has implemented a technology called “NetFlow” that allows your router to keep statistics on each TCP or UDP “flow”, periodically shove that data into a logging packet, and ship it off to an external server.  On that server you can run one of a variety of software packages to analyze the data and understand what is using up your network bandwidth.

The question is, what software package should you utilize?  I have not gone and evaluated all of the available options, but I do have experience with a couple of them.  I have used Scrutinizer from Plixer in the past and not been very impressed.  Part of it may have been that the machine it was running on was not very fast, but I just did not like the interface or capabilities much.

More recently I have downloaded and run NetFlow Analyzer from ManageEngine, and I have been very impressed!  It is free for up to two interfaces, and they have an easy-to-download-and-install demo that will run unlimited interfaces for 30 days.  It runs on Linux or Windows (I tried the Linux version) and is dirt simple to install and configure.  There really is nothing of note to configure on the server itself; you just need to point your router at the server’s IP and it will automatically start generating graphs for you.

I should also mention that Paessler has some kind of netflow capabilities (in PRTG), but I have not checked it out.  I note it here since I use their snmp monitoring software extensively and I have been happy with it.

To get your router to send NetFlow data to a collector, you need to set a couple of basic settings (including which version of NetFlow to use and where to send the packets), and then enable sending flows for traffic on all interfaces.  Note that it used to be that you could only collect NetFlow data upon ingress to an interface, so in order to capture bi-directional traffic you needed to enable it on every single router interface to see the traffic in the opposite direction.  This was done with the “ip route-cache flow” command on each interface.

Now “ip route-cache flow” has been replaced with “ip flow ingress”, and you can also issue the “ip flow egress” command if you do not want to monitor all router interfaces.  I have just stuck with issuing “ip flow ingress” on all my interfaces, since I wanted to see all traffic anyway (and I am not quite sure what would happen if you issued both commands on two interfaces and then had traffic flow between them; it might double count those flows).

Here are the exact commands I used on plunger to ship data to Netflow Analyzer 7:

plunger#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
plunger(config)#ip flow-cache timeout active 1
plunger(config)#ip flow-export version 5
plunger(config)#ip flow-export destination x.x.x.x 9996
plunger(config)#int fastEthernet 0/0
plunger(config-if)#ip flow ingress
plunger(config-if)#int fastEthernet 0/1
plunger(config-if)#ip flow ingress
plunger(config-if)#end
plunger#write mem
Building configuration…
[OK]
plunger#exit

Happy NetFlowing!

-Eric

Categories: Cisco, Network Tags:

Finally Got My Cisco ASA 5510 AnyConnect Essentials License

June 23rd, 2009 5 comments

After waiting several weeks for Cisco to fulfill my license code order (for ASA-AC-E-5510), I finally got the code in an email today!  Using the product authorization key to generate an activation key for my specific device was easy on the Cisco licensing web site.

I used the “activation-key” command to plug it into my ASA 5510, and now I have the “AnyConnect Essentials” feature enabled when I do a “show ver”.
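
For anyone who has not done this before, the command just takes the hex words from the licensing email (the hostname and values below are made-up placeholders, not a real key):

asa5510(config)# activation-key 0x12345678 0x9abcdef0 0x12345678 0x9abcdef0 0x12345678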

I never rebooted the device to activate the new license; however, I have not tested with more than one user logged in, so I have no proof it works as of yet.

My ASA 5510 has been running the 8.2 code now for 26 days as my production corporate firewall without a hitch so I give it the thumbs up.  I did have one hiccup while rebooting the box after uploading the 8.2 code, but I actually think that 8.0.4 crashed when I asked it to reboot, rather than the 8.2 code failing to come up properly (I was not on the console however when I did this so I really have no proof, I ended up power-cycling the box).

I do have to gripe that the AnyConnect client has had issues on my Windows Vista laptop a number of times, though according to Cisco this may be due to Windows bugs related to sleeping my laptop (which I do multiple times a day).  I get the dreaded “The vpn client driver has encountered an error” message.

Perhaps one other thing worth noting is that 8.2 created a new “coredumpinfo” folder on the internal flash file system with a file in it called coredump.cfg.  This file seems to somehow update its timestamp every time you do a “show run”, which messes up my RANCID process that grabs the config and file system directory listings every 30 minutes and diffs them for me.  This causes RANCID to email me every half hour with useless data saying that this file changed.

P.S.  The AnyConnect Essentials license key for my ASA 5510 was only $108 from my CDW rep including shipping (which was email btw…)

-Eric

UPDATE 6/24/09:

I forgot to mention that I am running this 5510 with only the stock 256 megs of RAM without issue.  There is reference in the release notes of possibly needing more RAM on that platform for 8.2 depending on what you are doing.  My RAM utilization actually went down between 8.0.4 and 8.2, though I also made some config changes around the same time so YMMV.

Also, there is some reference in the ASDM GUI about needing to reboot after applying a new activation key so I may need to do that…  Still have not tested it yet since my Vista laptop is being dumb.

Categories: Cisco, Network Tags:

Sun SPARC Ultra 25 Boot Fails at Probing I/O buses

May 27th, 2009 3 comments

So I have burned *way* too many hours on and off over the last couple of weeks trying to get an Ultra 25 SPARC box I inherited working.  This box came to me with the video card not in the machine and some comment about it not working.

After putting the video card back in the box, when I booted it would not give any video output to the monitor.  I hooked into the serial console (9600-8-N-1, of course) and it appeared to be hanging, with the last output being: Probing I/O buses

reset reason: 0000.0000.0000.0004
@(#)OBP 4.25.9 2007/08/23 14:17 Sun Ultra 25 Workstation
Clearing TLBs
Power-On Reset
Membase: 0000.0000.0000.0000
MemSize: 0000.0000.0004.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB (small-footprint mode) Done
MMUs ON
Init Fire JBUS Control Register… 
Find dropin, Copying Done, Size 0000.0000.0000.7260
PC = 0000.07ff.f000.6178
PC = 0000.0000.0000.6228
Find dropin, Copying Done, Size 0000.0000.0001.1440
Diagnostic console initialized
Configuring system memory & CPU(s)

CPU 0 Memory Configuration: Valid
CPU 0 Bank 0 1024 MB Bank 1 <empty> Bank 2 1024 MB Bank 3 <empty>

reset reason: 0000.0000.0000.0005
@(#)OBP 4.25.9 2007/08/23 14:17 Sun Ultra 25 Workstation
Clearing TLBs
Loading Configuration

Membase: 0000.0002.0000.0000
MemSize: 0000.0000.4000.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB Done
MMUs ON
Init Fire JBUS Control Register… 
Block Scrubbing Done
Find dropin, Copying Done, Size 0000.0000.0000.7260
PC = 0000.07ff.f000.6178
PC = 0000.0000.0000.6228
Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.4530
Diagnostic console initialized
System Reset: CPU Reset (SPOR)
Probing system devices
jbus at 0,0 SUNW,UltraSPARC-IIIi (1336 MHz @ 8:1, 1 MB) memory-controller
jbus at 1,0 Nothing there
jbus at 1c,0 Nothing there
jbus at 1d,0 Nothing there
jbus at 1e,0 pci
jbus at 1f,0 pci
Loading Support Packages: kbd-translator obp-tftp SUNW,i2c-ram-device SUNW,fru-device SUNW,asr
Loading onboard drivers: ebus i2c i2c i2c ppm
/ebus@1f,464000: flashprom rtc serial serial env-monitor i2c power
/ebus@1f,464000/i2c@3,80: gpio temperature temperature temperature front-io-fru-prom sas-backplane-fru-prom dimm-spd psu-fru-prom hardware-monitor
/i2c@1f,520000: dimm-spd dimm-spd dimm-spd dimm-spd
/i2c@1f,530000: motherboard-fru-prom gpio clock-generator
/i2c@1f,462020: nvram idprom
Probing memory
CPU 0 Bank 0 base          0 size 1024 MB
CPU 0 Bank 2 base  200000000 size 1024 MB
Probing I/O buses

Based on the fact that it stopped working at “Probing I/O buses” and there was the possibility of an issue with the video card, I tried removing the card and booting headless.  In this configuration the system came up fine, with access from the serial port.

I eventually discovered that the issue was an impacted pin in the external dongle that splits the high density dual-DVI port into two separate DVI ports.  The important note here for anybody searching for this issue is that when you have a video card in the machine, the last thing you will see on the serial console is “Probing I/O buses”, since once it finds the video card, all further output is redirected to the video console.  So if you don’t get any output on the screen, make sure to double check your video dongle, cables, and monitor!

Also, another unexpected behavior I ran into while troubleshooting: if I left the USB keyboard hooked to the machine while booting, it would assign that as the input device, and it would not accept input on the serial console, even though that is where all output was going!  It is very odd typing on a keyboard and having your output go to a serial console…

-Eric

Categories: Sun Tags:

Host/System and Device/Router Naming Standards

May 21st, 2009 No comments

At each organization I am exposed to, it is interesting to see the various naming schemes that have been employed over time.  I most often find a hodgepodge of different naming standards that have been poorly followed.  A well-thought-out naming standard will make a huge difference in the ease of maintaining your environment.

So how should you come up with a device naming standard?  I won’t profess to give you a one-size-fits-all solution, but instead I will outline a number of the pitfalls to device naming that I have run into in order to help you devise your own convention.

Uses for a name

In IT, device names serve three primary roles:

  • They are a unique identifier used to define a device (note that a MAC address or serial number could be used as a unique ID, though it provides no other information about the device and is difficult for humans to work with).
  • When entered into DNS, they provide an easy way to connect to a given device by typing its name from scratch, or device names may be selected from a list in a program such as an SSH client.
  • When you see a device name in a log or on a document, the name should make obvious what the device in question is and convey critical information about it.

Naming goals

  • Names should be as short as possible, easy to type and read, but with enough information to be unique and descriptive.
  • Make things as intuitive as possible.  If you have an IT contractor working in your environment, it should be pretty obvious to them what various servers do based solely on the machine names.
  • Your naming system should be flexible enough to allow for growth.

Naming structure

  • Generally you should start the name with the most significant identifier and work your way through to the least significant identifier.  This makes sorting useful (see the worked example after this list).
  • Think about how long each field in the name should be.  It needs to be long enough to hold unique entries for as many items of that type as will likely be utilized, using the character set defined for that field (i.e. if you have a two digit alpha field for site code, you can have a max of 676 sites, though if you want them to be intuitive you probably don’t want to use the XZ designator).  A numeric-only field has fewer options; 0-9 only yields 10 possibilities per digit.
  • Within a name you might choose to include delimiters between fields in order to separate them, or just for stylistic reasons.  This makes names longer to type (and sometimes too long to fit in documentation, etc…), but they are often worthwhile from a readability standpoint.  PRF5A is a lot harder to read than PR-F5-A.  Most special characters are banned from device names, though dash “-” seems pretty well supported.
  • You can only have one variable length field in a name, unless you are using delimiters, or adjacent fields are obviously separate because some are alpha-only and others are numeric-only.
  • Note that not everything needs to have names of the same length – It is ok to name one server PDXFILE1 and another PDXSAN1.
  • Not everything needs to follow exactly the same nomenclature – routers and network hardware can follow one standard, while servers may follow another.  THIS IS OK!  As long as they don’t conflict…
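
To make the structure rules above concrete, here is how a name might decompose under one hypothetical scheme (the fields and codes are examples I made up, not a standard):

PDXQAWEB01
  PDX = site code (Portland headquarters)
  QA  = environment (vs. PRD, DEV, STG)
  WEB = server role
  01  = instance number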

Know your organization

  • Think about how your company will grow.  Might you ever have more than one VMWare server?
  • Unless there is no way your business will ever have more than one site (what if you were acquired) I highly recommend your names start with a site code (more on this below).
  • Not everybody has the same needs!  You don’t have to force the same scheme on every organization!  A small manufacturing company has different needs from a global multinational.  You can get away with much simpler names in a small company than in a huge multinational corporation.

Who is your audience?

  • Names should be descriptive to your audience.  Who is your audience?  Users?  IT staff?
  • In an optimal world, machine names should not be seen by users.  In end-user facing situations I recommend using CNAMEs wherever possible to alias “service names” to “server names” (i.e. webmail.bitplumber.net could be CNAME’d to pdxmail1.bitplumber.net; see the sketch after this list).  Note that this often falls down in Windows, since Outlook for instance insists on showing the user the *real* server name…  The same goes for file server names.
  • Internet facing services should never expose machine names to users.  Users are likely connecting to a firewall and/or load balancer first anyway, so this is easy to hide.
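
In BIND zone file terms, the service-name aliasing above looks something like the following (a minimal sketch reusing the names from the example; the IP is a placeholder):

; zone file fragment for bitplumber.net
pdxmail1   IN  A      192.0.2.25     ; the real server
webmail    IN  CNAME  pdxmail1       ; the service name users see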

High-level recommendations

  • Don’t name things nonsensical names; this is not 1990 (yeah, I know I broke this rule when naming plunger.bitplumber.net)
  • Avoid putting unnecessary junk in server names.  I don’t really care what the model number of the server is (in most cases), or even whether it is a VMWare guest or a physical server (this matters less and less as time goes on).
  • Don’t put the version number of software in the name, as you will likely upgrade it!  (I have seen servers named Win2k that were running Windows 2003 Server.)
  • If the server might end up running multiple applications don’t put the name of one piece of software in the name, call it an application server or something…  (I have seen a server named backupexec that was running netbackup…)
  • In a software development shop (or even a non-software shop), you will likely have multiple copies of similar environments for testing purposes.  PRODUCTION, QA, DEVELOPMENT, STAGING , etc…  This is a good thing to include in the name as you typically have similar server names in each and you don’t want to inadvertantly make a change in Production when you intended to make it in QA.
  • Usually it makes sense to end service names with a number, as you might have multiple servers performing the same function; even if you only have a single server in that function today, you might later move it to another physical server which you designate with a different number.  Many environments put two digits on the end of servers, but how often do you really have more than 9 servers of the same type at one site?  It may be fine for some servers to have a single-digit number on the end while others have two digits (see the sketch below).
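
Putting those recommendations together, here is a minimal sketch of a name builder (the environment tags and roles are hypothetical examples, not a prescribed standard):

    # A minimal sketch: site + environment + role + number.
    # The environment tags below are hypothetical examples only.
    ENV_TAGS = {"PROD": "", "QA": "QA", "DEV": "DEV", "STG": "STG"}

    def build_name(site, env, role, number):
        """e.g. build_name('PDX', 'QA', 'WEB', 1) -> 'PDXQAWEB1'."""
        if env not in ENV_TAGS:
            raise ValueError("unknown environment: " + env)
        return ("%s%s%s%d" % (site, ENV_TAGS[env], role, number)).upper()

    print(build_name("PDX", "PROD", "FILE", 1))  # PDXFILE1 (production gets no tag)
    print(build_name("PDX", "QA", "WEB", 1))     # PDXQAWEB1

Note the deliberate choice to leave production untagged: it keeps the most-typed names shortest, at the cost of production being the implicit default.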

Site codes

In most organizations I recommend the use of site codes as even single-site companies often end up with remote sales offices, disaster recovery datacenters, etc…

The goal with site codes is to choose an identifier that both people from the site in question and people far away can easily recognize as referring to a given location.  I have often struggled with this, as there is no standard and there is plenty of potential for confusion and overlap.

You must decide how long you want your site codes to be.  I know Intel used to use two character codes.  Many organizations choose three character codes, which conveniently enough correspond with airport codes.

There are a couple of issues with airport codes, however:

  • With some airport codes, it is not obvious which city they serve
  • You will often have multiple sites within the serving area of a single airport

Note that not all site codes have to be the same length (depending on your name structure).  At the last company I worked for, I gave the large headquarters site in each region a three character code, and the smaller satellite sites got five character codes beginning with the three characters of the region in which they were located; i.e. PDX was the headquarters site and PDXPC was the Pacific Center satellite site.  (A sketch of this structure follows.)
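
Here is a minimal sketch of that variable-length structure (PDX and PDXPC are the examples from above; the other codes are hypothetical placeholders):

    # A minimal sketch of region/satellite site codes.
    SITES = ["PDX", "PDXPC", "SEA", "SEABV"]   # SEA/SEABV are hypothetical

    def region_of(site_code):
        """A satellite code's first three characters name its region."""
        return site_code[:3]

    # Plain sorting groups each region's satellites right after its hub.
    for code in sorted(SITES):
        kind = "headquarters" if len(code) == 3 else "satellite of " + region_of(code)
        print(code, "-", kind)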

A few other notes

Two situations to consider: you name a device after a department, the department moves elsewhere physically, but the device stays…  Or you name a device after a building, the company moves to another facility and takes the device along, and the device keeps its name.  Sometimes you must decide what a device will stay sticky with: the company/department, or the physical facility.

What is the timespan that your naming scheme must be good for?  I doubt a single-site company is going to become a multinational overnight…  Your average IT device lasts 3-7 years, so your naming scheme can easily change at replacement time to handle growth.

You might need to consider naming of devices with multiple network interfaces, each with different IP’s.

  • Windows is dumb and by default wants to register every interface under the same name in DNS.  This can lead to issues if not all networks are directly reachable by all hosts accessing the device.
  • Solaris is interesting in that it wants each interface named differently.  In this case I recommend making the main server name map to the “primary” interface (i.e. probably the one you set the default gateway on) and then use <hostname>-xx for additional interfaces where -xx is something like -bk for backups, etc…
  • Routers should have different forward and reverse names for each interface, plus forward and reverse names for a loopback IP.  (i.e. fa0-0.plunger.bitplumber.net and fa0-1.plunger.bitplumber.net and just plain plunger.bitplumber.net for the loopback IP)
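
A minimal sketch of generating those per-interface names (plunger and the interfaces follow the example above; note that DNS labels cannot contain “/”, so fa0/0 becomes the label fa0-0):

    # A minimal sketch of per-interface DNS naming for a router.
    DOMAIN = "bitplumber.net"

    def interface_fqdn(hostname, interface=None):
        """Loopback gets the bare host name; other interfaces get a prefix."""
        if interface is None:
            return hostname + "." + DOMAIN
        label = interface.replace("/", "-").lower()
        return label + "." + hostname + "." + DOMAIN

    print(interface_fqdn("plunger"))           # plunger.bitplumber.net (loopback)
    print(interface_fqdn("plunger", "fa0/0"))  # fa0-0.plunger.bitplumber.net
    print(interface_fqdn("plunger", "fa0/1"))  # fa0-1.plunger.bitplumber.net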

In one environment I have worked in, we name all of our iLO’s, ILOM’s, DRAC’s, etc… <hostname>-SC (SC = service controller).  This makes it easy to go log in to one in an emergency.  Just don’t accidentally cross the DNS entries, or else you might power cycle the wrong box!

You must be careful not to use special characters in device names.  Note that different devices and directory systems may have different “special characters”.  Think about Windows names, Unix names, router names, DNS names, WINS names, etc…  Each type of name has different restrictions on which characters and symbols are allowed, and on the minimum and maximum lengths.  Some names may be case sensitive, but most are not.
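
As an illustration, here is a minimal sketch of a conservative check: letters, digits, and hyphens only with no leading or trailing hyphen (the classic DNS hostname rule), capped at 63 characters per the DNS label limit, plus an optional 15 character cap for NetBIOS/WINS compatibility.  Treat the exact policy as an example to adapt, not a complete rule set:

    import re

    # Letters, digits, and interior hyphens; 1-63 characters total.
    LABEL_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")

    def safe_everywhere(name, netbios=True):
        """True if the name is conservative enough for most systems."""
        if not LABEL_RE.match(name):
            return False
        if netbios and len(name) > 15:
            return False
        return True

    print(safe_everywhere("PDX-FILE1"))           # True
    print(safe_everywhere("pdx_file1"))           # False - underscore is unsafe
    print(safe_everywhere("PDXVERYLONGSERVER1"))  # False - over 15 for WINS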

I personally find uppercase names easier to read in documentation and on screen, but that is in many cases a matter of personal preference, and in others may be enforced by the system in/on which the name is set (i.e. DNS).

IP addressing in relation to names

This is a topic worthy of another complete blog post, but I will point out just a couple of key recommendations here.

Since private IP address space is “free” and “plentiful”, I generally build my subnets with plenty of IP space so that I can space machines widely and align their last number with their server number.  Most often I will use /23 subnets for servers and clients, which gives me 512 IP’s (minus a few for network, broadcast, and default gateway).  As an example, you could have a server called PDXESX1 with an IP of 10.111.2.21 and another called PDXESX2 with IP 10.111.2.22, PDXESX3 as 10.111.2.23, etc…
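
A minimal sketch of that alignment using Python’s standard ipaddress module (the subnet and names are the examples from above; the offset of 20 is just for illustration):

    import ipaddress

    # The example /23 from above: 512 addresses total.
    net = ipaddress.ip_network("10.111.2.0/23")
    print(net.num_addresses)               # 512

    # Align each server's last number with its host number: PDXESX1 -> .21
    BASE_OFFSET = 20                       # hypothetical offset for this block
    for n in range(1, 4):
        ip = net.network_address + BASE_OFFSET + n
        print("PDXESX%d -> %s" % (n, ip))  # 10.111.2.21, .22, .23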

On a somewhat unrelated note, in my opinion the default gateway should always be the lowest usable IP in the range, because that is intuitive for anyone who follows after you.  Along the same lines, I am a fan of always making my DNS servers .11 and .12 in a given subnet (or .11 in one subnet and .11 in another subnet).

Is this the right time to change?

Is change really needed?  Or is it simply change for change’s sake?

The natural tendency for each new “owner” of a network is to want to do things their way with a naming standard that makes sense to them.  Don’t keep changing your naming schemes!  Even if the existing one is not perfect, it may be better overall just to leave it as is!

You generally want to avoid changing a machine’s name after it has been set – the name gets referenced all over the place, and unless your process for changing it is perfect, it will get missed somewhere and cause confusion down the road…  Think about all of the places you might have to change the name:

  • On the machine itself (hostname, hosts files, application configurations…)
  • In your ip address spreadsheets
  • In your inventory system
  • In DNS entries (including CNAME’s that reference the host name)
  • On the labels stuck to the machine physically
  • Your labels in the network switch (and supporting documentation)
  • Labels on the cables attached to the server – network, power, etc…
  • In your monitoring software
  • On your kvm switch
  • In the description fields on your remote power cycle devices (PDU’s)
  • On your network diagrams and documentation
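
If you do rename something, a minimal sanity check like the sketch below (a hypothetical helper, using only Python’s standard library) can at least catch the DNS items on that list: it verifies that the forward lookup of the new name and the reverse lookup of its address agree, which flags stale A or PTR records left behind:

    import socket

    def check_rename(new_name):
        """Flag stale DNS records after a rename by round-tripping the name."""
        address = socket.gethostbyname(new_name)
        reverse_name, _, _ = socket.gethostbyaddr(address)
        if reverse_name.lower() != new_name.lower():
            print("stale DNS? %s -> %s -> %s" % (new_name, address, reverse_name))
            return False
        return True

    check_rename("pdxfile1.bitplumber.net")   # example name from this post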

Final thoughts

While this may be a bit overwhelming, it is crucial to consider all of these aspects ahead of time so you can avoid needing to change your standard down the road.  I hope this overview of the naming pitfalls I have run into during my career helps you avoid the same mistakes!

As always, if you have any additional comments, feel free to post them here, or shoot me an email and I may include them in a future post.

-Eric

Categories: Uncategorized Tags:

Cisco ASA 5510 8.2 AnyConnect License Price ASA-AC-E-5510

May 13th, 2009 5 comments

As a follow-up to a previous post, I am happy to report that Cisco has finally posted the bits for the ASA 8.2 code online for download.  I have been looking forward to this, as this release includes a new license model for the AnyConnect VPN client called “Cisco AnyConnect Essentials”.

While I still can’t find any written reference (on the Cisco price list or elsewhere) to how much the AnyConnect VPN client is going to cost, I have confirmed that the previous rumor of it being “next to free” is indeed true.  Cisco is only charging $150 for the AnyConnect Essentials license on a 5510, which will give you up to 250 simultaneous users!  (That is about as close-to-free as Cisco gets.)

This is the answer you are looking for to deal with 64-bit client support!  A coworker of mine even told me today that the AnyConnect client works on his Windows 7 Beta 2 machine (which surprised me, though I suspect that under the hood the Windows 7 networking stack is very similar to Windows Vista’s).

The part number you need for an ASA 5510 is ASA-AC-E-5510=.  If you need the part numbers for other models check out the release announcement.

There is some reference in the release notes to possibly needing more RAM on the ASA 5510 platform (I am not yet sure whether this will impact me; I am not doing a ton of stuff on my ASA 5510, yet I run near 80% RAM utilization on version 8.0.4).  It is worth noting there is an annoying footnote that says the 256 -> 512 meg RAM upgrade won’t be available till June…

Also, I have been told that the Botnet detection feature will be $460 a year.  This is part number ASA5510-BOT-1YR= for the ASA 5510.

I will write up another post once I install the 8.2 code somewhere.

-Eric

UPDATE: 5/18/09

I am getting information from my VAR that conflicts with what I got directly from Cisco.  They say MSRP is $350 right now and it won’t be available till late this month or early June.  CDW has it posted for $232.99 before any special pricing discounts you may have with them.  Availability says to call…

UPDATE: 5/29/09

The CDW site now shows that the ASA-AC-E-5510 part is $101.99.  It still says availability is “call”…

And for those of you looking for the part numbers you need to purchase the AnyConnect Essentials for your model of ASA, here they are:

  • AnyConnect Essentials VPN License – ASA 5505 (25 Prs) – ASA-AC-E-5505=
  • AnyConnect Essentials VPN License – ASA 5510 (250 Prs) – ASA-AC-E-5510=
  • AnyConnect Essentials VPN License – ASA 5520 (750 Prs) – ASA-AC-E-5520=
  • AnyConnect Essentials VPN License – ASA 5540 (2500 Prs) – ASA-AC-E-5540=
  • AnyConnect Essentials VPN License – ASA 5550 (5000 Prs) – ASA-AC-E-5550=
  • AnyConnect Essentials VPN License – ASA 5580 (10K Prs) – ASA-AC-E-5580=
Categories: Cisco, Network Tags: