Wonders of Telecom: CCIE Troubleshooting

CCIE Troubleshooting: Part 1 (Is it really my fault?)

There are two types of troubleshooting that you’ll run into on the CCIE lab:
1. The “Proctor is Evil” Troubleshooting
2. The “Self-Induced” Troubleshooting
The latter type is by far the more time-consuming, but also the more important. Basically, you messed something up, so you have to fix it (at least if you want the points)! It is the more time-consuming because it could be ANY silly mistake or combination of silly mistakes along the way, and there is no predicting what kinds of things can be done to mess with your own head!
The most important rule with this kind of troubleshooting is time management. Set a time limit of 15 minutes. If you can’t figure something out in 15 minutes (no, I don’t care how “close” you think you are!) go do something else. Whether this involves a bathroom break, a soda/snack break, standing on your head on the high-quality lab chairs or simply moving on to another “service” or “security” task of your lab makes no difference. The idea is to separate your brain from staring at the same thing over and over.
The longer you stare at something, the more you see what you want and not what’s really there. Most self-induced errors are really small, and fairly inane. You know. Those “DUH!” moments once we figure it out. But lost time is the price we pay for silly mistakes. Avoid it!
Anyway… on to the more exciting things. The unpredictable nature of the “Proctor is Evil” Troubleshooting. Having started my training career specializing in the old CIT (Cisco Internetwork Troubleshooting) class, I can greatly appreciate some of the humorous things that MAY get thrown into lab exams. The question becomes, if I don’t know what they are and there are many different things that could go wrong… What the heck do I do about it?!?!
Excellent question! Process, my dear. It’s all about process.

We’ve had many different posts and blogs about the things you should be doing in your lab exam. You know… The stuff about reading your exam, re-drawing your diagrams, L2 diagramming, etc. Been there, done that. But then there’s troubleshooting. Some things are easy to spot. Others are not! Some things may be obvious, others are just plain obnoxious.
The one thing to remember is that during the lab exam, you are there for your proctor’s entertainment. As long as this goes through your head, you’ll understand why some things are done. It’s all in good fun! And once you pass the CCIE Lab exam, it’s much easier to laugh about it all!
Start with the basics. IP addressing. You are working on your diagram anyway before your lab. Seriously. Troubleshooting and Diagramming all in one step! This is a good time to quickly verify things. Here are a few different commands to think about, and some differences between them:
1. show ip interface brief | exclude unassigned
2. show interface | include Internet
3. show run | include interface|ip address
Each is slightly different.

Rack1R3(config)#do sh ip int br | exc una
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            145.1.36.3      YES manual up                    up
FastEthernet0/1            145.1.3.3       YES manual up                    up
Serial1/2.13               145.1.13.3      YES manual up                    up
Serial1/3.23               145.1.23.3      YES manual up                    up
Loopback0                  150.1.3.3       YES manual up                    up
Loopback111                145.1.133.3     YES manual up                    up
Rack1R3(config)#

This gives me simple IP addresses to check against.

Rack1R3(config)#do sh int | in Internet
Internet address is 145.1.36.3/24
Internet address is 145.1.3.3/24
Internet address is 145.1.13.3/24
Internet address is 145.1.23.3/24
Internet address is 150.1.3.3/24
Internet address is 145.1.133.3/24
Rack1R3(config)#

This gives me IP addresses with VLSM masks that may well be present in my diagram. More information to check. Context can be provided by:

Rack1R3(config)#do sh int | in Internet|is (up|down)
FastEthernet0/0 is up, line protocol is up
Internet address is 145.1.36.3/24
FastEthernet0/1 is up, line protocol is up
Internet address is 145.1.3.3/24
Serial0/0/0 is administratively down, line protocol is down
Serial0/1/0 is administratively down, line protocol is down
Serial0/2/0 is administratively down, line protocol is down
Serial0/2/1 is administratively down, line protocol is down
Serial1/0 is administratively down, line protocol is down
Serial1/1 is administratively down, line protocol is down
Serial1/2 is up, line protocol is up
Serial1/2.13 is up, line protocol is up
Internet address is 145.1.13.3/24
Serial1/3 is up, line protocol is up
Serial1/3.23 is up, line protocol is up
Internet address is 145.1.23.3/24
NVI0 is up, line protocol is up
Loopback0 is up, line protocol is up
Internet address is 150.1.3.3/24
Loopback111 is up, line protocol is up
Internet address is 145.1.133.3/24
Rack1R3(config)#

(I think this is the best option, by the way!)

Rack1R3(config)#do sh run | in interface|ip address
ip telnet source-interface Loopback111
ip ftp source-interface Loopback0
interface Loopback0
ip address 150.1.3.3 255.255.255.0
interface Loopback111
ip address 145.1.133.3 255.255.255.0
interface FastEthernet0/0
ip address 145.1.36.3 255.255.255.0
interface FastEthernet0/1
ip address 145.1.3.3 255.255.255.0
interface Serial0/0/0
no ip address
interface Serial0/1/0
no ip address
interface Serial0/2/0
no ip address
interface Serial0/2/1
no ip address
interface Serial1/0
no ip address
interface Serial1/1
no ip address
interface Serial1/2
no ip address
frame-relay route 112 interface Serial1/3 121
interface Serial1/2.13 point-to-point
ip address 145.1.13.3 255.255.255.0
frame-relay interface-dlci 131
interface Serial1/3
no ip address
frame-relay route 121 interface Serial1/2 112
interface Serial1/3.23 point-to-point
ip address 145.1.23.3 255.255.255.0
frame-relay interface-dlci 132
passive-interface default
no passive-interface FastEthernet0/0
ip nat inside source list 111 interface Loopback0 overload
match ip address 101
match ip address prefix-list NAT-Route
Rack1R3(config)#

This one shows the subnet masks in dotted-decimal form, alongside each interface name.
You may do whatever you like the best, but this way you can quickly look at your diagrams and see what’s happening. Check to be sure that your routers match your diagram.
Checking the IP address only is nice. Checking the IP and mask is better, because otherwise you may not “discover” a mask error until your routing protocol configurations. At that point, you aren’t thinking about troubleshooting or IP addresses any longer, so it takes more time to figure things out.
Keep it simple!
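As a study aid away from the lab (you won’t have a scripting host in the exam itself), the same diagram check can be sketched in Python. This is only a rough sketch: the function names and the `expected` table are hypothetical, and the parsing assumes output shaped like the “sh int | in Internet|is (up|down)” capture above.

```python
import re

# Matches "FastEthernet0/0 is up, line protocol is up" style header lines.
HEADER = re.compile(r"^(\S+) is (?:administratively )?(?:up|down), line protocol is")
# Matches "Internet address is 145.1.36.3/24" lines.
ADDRESS = re.compile(r"^\s*Internet address is (\d+\.\d+\.\d+\.\d+/\d+)")


def parse_addresses(output):
    """Return {interface: 'a.b.c.d/len'} from filtered 'show interface' output."""
    found = {}
    current = None
    for line in output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        address = ADDRESS.match(line)
        if address and current:
            found[current] = address.group(1)
    return found


def compare(expected, found):
    """List mismatches between the diagram (expected) and the router (found)."""
    problems = []
    for intf, addr in expected.items():
        actual = found.get(intf)
        if actual is None:
            problems.append(f"{intf}: no address found, diagram says {addr}")
        elif actual != addr:
            problems.append(f"{intf}: router has {actual}, diagram says {addr}")
    return problems


sample = """\
FastEthernet0/0 is up, line protocol is up
Internet address is 145.1.36.3/24
Serial1/2 is up, line protocol is up
Serial1/2.13 is up, line protocol is up
Internet address is 145.1.13.3/24
Loopback0 is up, line protocol is up
Internet address is 150.1.3.3/24
"""

# Hypothetical diagram values; the /25 is a deliberate mismatch for illustration.
expected = {
    "FastEthernet0/0": "145.1.36.3/24",
    "Serial1/2.13": "145.1.13.3/25",
    "Loopback0": "150.1.3.3/24",
}

for problem in compare(expected, parse_addresses(sample)):
    print(problem)
```

Comparing address and prefix length as one string is deliberate: a wrong mask gets flagged just like a wrong address, which is exactly the error that otherwise hides until the routing protocol section.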
The next important stage is the L2 diagramming. Your lab may or may not include any sort of physical connection diagram. While your nice L3 diagram may show a link between SW3 and R2, unless they are physically connected to each other, there may be important steps missing in your functionality! The easiest thing to do is go to your switches and use the “show cdp neighbor” command.
A switch’s interfaces are up by default (unlike a router’s). Although you may have discovered any “down” interfaces by your IP address checks, it’s good to see anyway.

Rack1SW1#sh cdp n
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
Rack1SW4         Fas 0/19          160          R S I     WS-C3550- Fas 0/13
Rack1SW3         Fas 0/16          174           S I      WS-C3560- Fas 0/13
emanon-3750-Dev  Gig 0/2           170           S I      WS-C3750- Gig 1/0/2
Rack1R1          Fas 0/1           176          R S I     2811      Fas 0/0
Rack1R3          Fas 0/3           119          R S I     2811      Fas 0/0
Rack1R5          Fas 0/5           129          R S I     2811      Fas 0/0
Rack1SW1#

Rack1SW2#sh cdp n
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
RS.1.1.BB2       Fas 0/24          156          R S I     2811      Fas 0/0
Rack1SW4         Fas 0/19          144          R S I     WS-C3550- Fas 0/16
Rack1SW3         Fas 0/16          124           S I      WS-C3560- Fas 0/16
Rack1R2          Fas 0/2           157          R S I     2811      Fas 0/0
Rack1R4          Fas 0/4           172          R S I     2811      Fas 0/0
Rack1SW2#

Rack1SW3#sh cdp n
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
RS.1.1.BB3       Fas 0/24          130          R S I     2811      Fas 0/0
Rack1SW4         Fas 0/20          177          R S I     WS-C3550- Fas 0/20
Rack1SW4         Fas 0/19          177          R S I     WS-C3550- Fas 0/19
Rack1SW2         Fas 0/16          159          R S I     WS-C3560- Fas 0/16
Rack1SW1         Fas 0/13          174          R S I     WS-C3560- Fas 0/16
Rack1R3          Fas 0/4           163          R S I     2811      Fas 0/1
Rack1R5          Fas 0/6           132          R S I     2811      Fas 0/1
Rack1SW3#

Rack1SW4#sh cdp n
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
Rack1SW2         Fas 0/16          145          R S I     WS-C3560- Fas 0/19
Rack1SW3         Fas 0/20          170           S I      WS-C3560- Fas 0/20
Rack1SW3         Fas 0/19          170           S I      WS-C3560- Fas 0/19
Rack1SW1         Fas 0/13          161          R S I     WS-C3560- Fas 0/19
Rack1R4          Fas 0/4           159          R S I     2811      Fas 0/1
Rack1R6          Fas 0/6           165          R S I     3825      Gig 0/1
Rack1SW4#

Draw yourself a quick diagram and center the drawing around your switches. Use colors as needed for trunk vs. access ports. Use colors for ISL vs. 802.1Q trunks. Do whatever makes sense for you. But this too will assist in troubleshooting connections (or lack thereof) later in the lab.
Make sure the interfaces match what you are asked for in your lab. Whether on the diagram for Layer3, or in the lab task list for Layer2 connections. Are ports missing? Some quick “show run interface (intf)” commands may help out there, or at least enough to make a quick note to check later while configuring the trunking section!
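If you save your CDP captures while practicing, a few lines of Python can turn them into a link list for the L2 drawing. Treat this as a sketch under assumptions: the names `parse_cdp` and `one_sided` are made up for illustration, and the regular expression only handles single-line rows like the ones captured above (long device IDs that wrap onto a second line are not handled).

```python
import re

# One CDP table row: device, local port, holdtime, capability letters,
# platform, remote port. Assumes single-line rows as in the captures above.
ROW = re.compile(
    r"^(?P<device>\S+)\s+"
    r"(?P<local>[A-Za-z]+ [\d/]+)\s+"
    r"(?P<hold>\d+)\s+"
    r"(?P<caps>(?:[A-Za-z] )*[A-Za-z])\s+"
    r"(?P<platform>\S+)\s+"
    r"(?P<remote>[A-Za-z]+ [\d/]+)\s*$"
)


def parse_cdp(local_name, output):
    """Return a set of links as (local_dev, local_port, remote_dev, remote_port)."""
    links = set()
    for line in output.splitlines():
        row = ROW.match(line.strip())
        if row:
            links.add((local_name, row["local"], row["device"], row["remote"]))
    return links


def one_sided(links, collected):
    """Links toward a collected device that the far end does not report back."""
    return sorted(
        link for link in links
        if link[2] in collected
        and (link[2], link[3], link[0], link[1]) not in links
    )


sw1 = """
Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
Rack1SW4         Fas 0/19          160          R S I     WS-C3550- Fas 0/13
Rack1R1          Fas 0/1           176          R S I     2811      Fas 0/0
"""
sw4 = """
Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
Rack1SW1         Fas 0/13          161          R S I     WS-C3560- Fas 0/19
Rack1R6          Fas 0/6           165          R S I     3825      Gig 0/1
"""

links = parse_cdp("Rack1SW1", sw1) | parse_cdp("Rack1SW4", sw4)
for link in sorted(links):
    print(link)
print("one-sided:", one_sided(links, {"Rack1SW1", "Rack1SW4"}))
```

The `one_sided` check only looks at devices whose output you actually collected, so router-facing ports are not flagged just because you skipped the routers.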
Those two discovery items may well handle some of your Troubleshooting tasks, and will certainly save time in later verification or Self-Induced Troubleshooting!
That’s enough to whet the appetite a little bit. More on the notorious and/or obnoxious troubleshooting next time. Same Bat time…. Same Bat channel….

CCIE Troubleshooting: Part 2 (Dude, Where’s My Routing?)

Thanks for tuning in again! We’re back for more of the excitement known as Troubleshooting! Today we’re going to look a little more at some of the more nefarious (my word for the day) things that may come your way. How simple little commands can certainly change the way your lab is going!
In case you haven’t noticed by now, the CCIE lab is a largely psychological event. Technical knowledge is a very good thing, but if you can’t handle the pressure then it doesn’t help much! I still remember my first lab exam. Or more importantly the weeks leading up to my first lab exam, and I couldn’t make simple things work correctly! Stuff I’d been doing for years. And it was all in my head.
So what kinds of things can be on your lab which may have an impact on this stuff? Some are very simple, some are not.
So in the last post, I mentioned a little about process. The process by which we troubleshoot things (or how we even start our lab) may make a tremendous amount of difference in what our outcome, or at least our psychological state may end up being. Remember, you are there for the proctor’s entertainment. And sometimes they are very entertained!
Let’s take an obvious one. What if “no ip routing” was in one of your routers? “What?” you say… “Something THAT obvious, any CCNA could figure out, that’s just plain (insert appropriate word of shock or exasperation here).”

Well, yes and no. A lot depends on WHEN you discover it. So let’s assume the obvious. You check the routing table:

TestRouter(config)#do sh ip ro
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

TestRouter(config)#no ip routing
TestRouter(config)#
TestRouter(config)#
TestRouter(config)#do sh ip ro
Default gateway is not set

Host               Gateway           Last Use    Total Uses  Interface
ICMP redirect cache is empty
TestRouter(config)#

That’s an obvious one to figure out. Ok…. What if you don’t check that, but go to implement a routing protocol???

TestRouter(config)#router rip
IP routing not enabled
TestRouter(config)#

Ok, another obvious one. Mostly because the router TELLS you what’s wrong! Let’s hop to another router and look at some not-so-obvious problems with this command! Again, some of my routers are already fully configured from Mock Lab 4, so there’s a functional network going on that we’re going to mess with!

Rack1R1(config-if)#do sh frame map
Serial0/0/0 (up): ip 145.1.125.2 dlci 105(0x69,0x1890), static,
CISCO, status defined, active
Serial0/0/0 (up): ip 145.1.125.5 dlci 105(0x69,0x1890), dynamic,
broadcast,
CISCO, status defined, active
Serial0/1/0.12 (up): point-to-point dlci, dlci 112(0x70,0x1C00), broadcast
status defined, active
Serial0/1/0.13 (up): point-to-point dlci, dlci 131(0x83,0x2030), broadcast
status defined, active
Rack1R1(config-if)#do sh ip int br | ex un
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            145.1.17.1      YES manual up                    up
Serial0/0/0                145.1.125.1     YES manual up                    up
Serial0/1/0.12             145.1.12.1      YES manual up                    up
Serial0/1/0.13             145.1.13.1      YES manual up                    up
Loopback0                  150.1.1.1       YES manual up                    up
Loopback1                  145.1.111.111   YES manual up                    up
Rack1R1(config-if)#

Rack1R1(config-if)#do ping 145.1.125.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.125.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/59/64 ms
Rack1R1(config-if)#do ping 145.1.125.5

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.125.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/29/32 ms
Rack1R1(config-if)#do ping 145.1.12.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.12.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/56/60 ms
Rack1R1(config-if)#do ping 145.1.13.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.13.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/29/32 ms
Rack1R1(config-if)#

Looks like a great setup, and things are working!
R1
no ip routing
In my case, I still have “debug ip routing” turned on, so there’s LOTS of stuff going on at the moment. In the beginning of your lab, you wouldn’t see a thing.

Rack1R1(config)#do ping 145.1.125.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.125.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Rack1R1(config)#do ping 145.1.125.5

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.125.5, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Rack1R1(config)#do ping 145.1.12.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.12.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Rack1R1(config)#do ping 145.1.13.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.13.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Rack1R1(config)#

Now how’s that for an interesting turn of events???

Rack1R1(config)#do sh frame map
Serial0/0/0 (up): ip 145.1.125.2 dlci 105(0x69,0x1890), static,
CISCO, status defined, active
Serial0/0/0 (up): ip 145.1.125.5 dlci 105(0x69,0x1890), dynamic,
broadcast,
CISCO, status defined, active
Serial0/1/0.12 (up): point-to-point dlci, dlci 112(0x70,0x1C00), broadcast
status defined, active
Serial0/1/0.13 (up): point-to-point dlci, dlci 131(0x83,0x2030), broadcast
status defined, active
Rack1R1(config)#

Nothing has changed in my mapping or other configuration! In other words, my frame-relay configuration is perfectly fine!

Rack1R1(config)#do ping 145.1.17.7

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.17.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/202/1000 ms
Rack1R1(config)#

I can ping other interfaces perfectly fine. Just not frame-relay.
So if you did your IP address checks, and immediately dove into configuring your lab, you’d get to frame-relay and insist that you were going insane!

Rack1R1(config)#do debug ip packet detail
IP packet debugging is on (detailed)
Rack1R1(config)#do ping 145.1.125.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.125.2, timeout is 2 seconds:

*Feb 16 16:24:57.185: IP: tableid=0, s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), routed via RIB
*Feb 16 16:24:57.185: IP: s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), len 100, sending
*Feb 16 16:24:57.185:     ICMP type=8, code=0.
*Feb 16 16:24:59.185: IP: tableid=0, s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), routed via RIB
*Feb 16 16:24:59.185: IP: s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), len 100, sending
*Feb 16 16:24:59.185:     ICMP type=8, code=0
*Feb 16 16:24:59.745: IP: tableid=0, s=150.1.1.1 (local), d=150.1.6.6 (FastEthernet0/0), routed via RIB
*Feb 16 16:24:59.745: IP: s=150.1.1.1 (local), d=150.1.6.6 (FastEthernet0/0), len 145, sending
*Feb 16 16:24:59.745:     TCP src=65067, dst=179, seq=2727314646, ack=3832900916, win=16289 ACK PSH FIN.
*Feb 16 16:25:00.613: IP: s=150.1.4.4 (Serial0/0/0), d=145.1.125.1, len 64, rcvd 1
*Feb 16 16:25:00.613:     TCP src=45959, dst=179, seq=2516853527, ack=0, win=16384 SYN
*Feb 16 16:25:00.613: IP: tableid=0, s=145.1.125.1 (local), d=150.1.4.4 (FastEthernet0/0), routed via RIB
*Feb 16 16:25:00.613: IP: s=145.1.125.1 (local), d=150.1.4.4 (FastEthernet0/0), len 40, sending
*Feb 16 16:25:00.613:     TCP src=179, dst=45959, seq=0, ack=2516853528, win=0 ACK RST
*Feb 16 16:25:01.185: IP: tableid=0, s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), routed via RIB
*Feb 16 16:25:01.185: IP: s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), len 100, sending
*Feb 16 16:25:01.185:     ICMP type=8, code=0.
*Feb 16 16:25:02.613: IP: s=150.1.4.4 (Serial0/0/0), d=145.1.125.1, len 64, rcvd 1
*Feb 16 16:25:02.613:     TCP src=45959, dst=179, seq=2516853527, ack=0, win=16384 SYN
*Feb 16 16:25:02.613: IP: tableid=0, s=145.1.125.1 (local), d=150.1.4.4 (FastEthernet0/0), routed via RIB
*Feb 16 16:25:02.613: IP: s=145.1.125.1 (local), d=150.1.4.4 (FastEthernet0/0), len 40, sending
*Feb 16 16:25:02.613:     TCP src=179, dst=45959, seq=0, ack=2516853528, win=0 ACK RST
*Feb 16 16:25:03.185: IP: tableid=0, s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), routed via RIB
*Feb 16 16:25:03.185: IP: s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), len 100, sending
*Feb 16 16:25:03.185:     ICMP type=8, code=0.
*Feb 16 16:25:05.185: IP: tableid=0, s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), routed via RIB
*Feb 16 16:25:05.185: IP: s=145.1.17.1 (local), d=145.1.125.2 (FastEthernet0/0), len 100, sending
*Feb 16 16:25:05.185:     ICMP type=8, code=0.
Success rate is 0 percent (0/5)
Rack1R1(config)#
Rack1R1(config)#do un all
All possible debugging has been turned off
Rack1R1(config)#

You’ll notice a couple of incoming BGP messages there as well trying to re-establish that connection. Bottom line is that things LOOK like they are being sent, and yet nothing actually goes out. How long would you spend pulling your hair out? Would you ever think to look at “no ip routing” as your culprit? Perhaps if you have been through this before then “yes” but otherwise, it’s not part of what you would expect for a local interface connection! Process, process, process. Otherwise, definitely a time killer!
If you look really closely, you’ll see something more interesting on that. The packets are going out your FastEthernet0/0 interface. Any idea why? Not treating the local interface as valid there, your router is trying to ARP for the address. In this functional, pre-configured network, SW1 (the other end of Fa0/0) has a route to the IP and therefore will reply with Proxy ARP. If other routers did not have a route, you would see “encapsulation failed” messages in the debug.

Rack1R1(config)#ip routing
Rack1R1(config)#
Rack1R1(config)#do ping 145.1.125.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 145.1.125.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/58/60 ms
Rack1R1(config)#
*Feb 16 16:29:24.525: %PIM-5-NBRCHG: neighbor 145.1.125.5 UP on interface Serial0/0/0
*Feb 16 16:29:24.529: %PIM-5-DRCHG: DR change from neighbor 145.1.125.1 to 145.1.125.5 on interface Serial0/0/0
Rack1R1(config)#

As soon as we re-enable it, things start working again! And OSPF, and BGP, and PIM.
What other things can we do? How about this…. Can you assign 140.100.0.1/24 to Fa0/0 for me? Sounds simple…

TestRouter(config)#do sh run int f0/0
Building configuration...

Current configuration : 92 bytes
!
interface FastEthernet0/0
no ip address
no ip route-cache
duplex auto
speed auto
end

TestRouter(config)#int f0/0
TestRouter(config-if)#ip addr 140.100.0.1 255.255.255.0
Bad mask /24 for address 140.100.0.1
TestRouter(config-if)#

What happened there? I swear I’ve typed IP addresses like that for years. Really, my children can probably type that IP address correctly. What’s wrong? How about “no ip subnet-zero” in your configuration? Would you discover it? Perhaps, perhaps not. But it can be very frustrating if you haven’t seen it before! On a side note, if you configure addresses and THEN use “no ip subnet-zero” your existing interfaces will work just fine, but any new ones within the first available subnet (subnet-zero) cannot be used!
Fun, huh?
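If the subnet-zero arithmetic feels fuzzy, here is a small Python sketch of it. It assumes the usual classful defaults (/8, /16, /24 for classes A, B, C); the helper names are hypothetical and this is a way to check your own reasoning, not a description of how IOS implements the test.

```python
import ipaddress


def classful_prefix(address):
    """Default (classful) prefix length for a class A, B, or C address."""
    first_octet = int(str(address).split(".")[0])
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    if first_octet < 224:
        return 24   # class C
    raise ValueError(f"{address} is not a class A, B, or C address")


def in_subnet_zero(cidr):
    """True if the address sits in the first subnet of its classful network."""
    interface = ipaddress.ip_interface(cidr)
    natural = classful_prefix(interface.ip)
    subnet = interface.network
    if subnet.prefixlen <= natural:
        return False  # not subnetted, so there is no subnet zero involved
    classful = subnet.supernet(new_prefix=natural)
    return subnet.network_address == classful.network_address


for cidr in ("140.100.0.1/24", "145.1.36.3/24", "150.1.3.3/24"):
    print(cidr, in_subnet_zero(cidr))
```

140.100.0.1/24 lives in 140.100.0.0/24, the very first /24 carved out of the class B network 140.100.0.0/16, which is why the router objected. The 145.1.x.x and 150.1.x.x addresses used elsewhere in this lab all sit in later subnets.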
Ever see an OSPF interface not work? Sure… but here’s one. What can cause this?
Before:

Rack1R1(config-if)#do sh ip o i b
Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
Lo0          1     0               150.1.1.1/24       1     LOOP  0/0
VL7          1     0               145.1.125.1/24     64    P2P   1/1
VL6          1     0               145.1.13.1/24      64    P2P   1/1
VL5          1     0               145.1.12.1/24      64    P2P   1/1
VL4          1     0               145.1.17.1/24      1     P2P   1/1
Fa0/0        1     17              145.1.17.1/24      1     BDR   1/1
Se0/1/0.13   1     123             145.1.13.1/24      64    P2P   1/1
Se0/1/0.12   1     123             145.1.12.1/24      64    P2P   1/1
Se0/0/0      1     125             145.1.125.1/24     64    P2MP  1/1
Rack1R1(config-if)#

and After:

Rack1R1(config-if)#do sh ip o i b
Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
Lo0          1     0               150.1.1.1/24       1     LOOP  0/0
VL7          1     0               145.1.125.1/24     64    P2P   1/1
VL6          1     0               145.1.13.1/24      64    P2P   1/1
VL5          1     0               145.1.12.1/24      64    P2P   1/1
VL4          1     0               0.0.0.0/0          65535 DOWN  0/0
Fa0/0        1     17              145.1.17.1/24      1     DR    0/0
Se0/1/0.13   1     123             145.1.13.1/24      64    P2P   1/1
Se0/1/0.12   1     123             145.1.12.1/24      64    P2P   1/1
Se0/0/0      1     125             145.1.125.1/24     64    P2MP  1/1
Rack1R1(config-if)#

The virtual-link VL4 is down (could be many things), but I’ve lost a neighbor on Fa0/0. You probably wouldn’t see a change like this in the live running (because that would usually imply something you did!) but let’s pretend I’m setting OSPF up for the first time, and my neighbor on a fast ethernet is not coming up (see above, it does work, and there’s nothing on the other side causing a problem, I promise). So you have local configuration….

Rack1R1(config-if)#do sh run | s ospf
ip ospf network point-to-multipoint
ip ospf authentication-key CISCO12
ip ospf authentication null
router ospf 1
log-adjacency-changes
area 0 authentication message-digest
area 17 virtual-link 150.1.7.7 message-digest-key 1 md5 CISCO
area 123 authentication
area 123 virtual-link 150.1.3.3 message-digest-key 1 md5 CISCO
area 123 virtual-link 150.1.2.2 message-digest-key 1 md5 CISCO
area 125 virtual-link 150.1.5.5 message-digest-key 1 md5 CISCO
network 145.1.12.1 0.0.0.0 area 123
network 145.1.13.1 0.0.0.0 area 123
network 145.1.17.1 0.0.0.0 area 17
network 145.1.125.1 0.0.0.0 area 125
network 150.1.1.1 0.0.0.0 area 0
Rack1R1(config-if)#
Rack1R1(config-if)#do sh ip o i f0/0
FastEthernet0/0 is up, line protocol is up
Internet Address 145.1.17.1/24, Area 17
Process ID 1, Router ID 150.1.1.1, Network Type BROADCAST, Cost: 1
Transmit Delay is 1 sec, State DR, Priority 1
Designated Router (ID) 150.1.1.1, Interface address 145.1.17.1
No backup designated router on this network
Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
oob-resync timeout 40
No Hellos (Passive interface)
Supports Link-local Signaling (LLS)
Index 1/7, flood queue length 0
Next 0x0(0)/0x0(0)
Last flood scan length is 2, maximum is 15
Last flood scan time is 0 msec, maximum is 0 msec
Neighbor Count is 0, Adjacent neighbor count is 0
Suppress hello for 0 neighbor(s)
Rack1R1(config-if)#

Ok, there’s a hint…. Passive interface. But look in the OSPF section. There is nothing about passive-interface up there!

Rack1R1(config-if)#do sh run | in passive
Rack1R1(config-if)#

In fact, there’s nothing on the entire router about passive-interface! There’s a new one! We know that loopbacks are automatically treated as stub hosts unless otherwise specified, but there’s nothing to automatically treat a FastEthernet as passive is there? Particularly not in the “router ospf” section!?!?

Rack1R1(config-if)#do sh run int f0/0
Building configuration...

Current configuration : 135 bytes
!
interface FastEthernet0/0
ip address 145.1.17.1 255.255.255.0
ip pim sparse-mode
duplex auto
speed auto
no routing dynamic
end

Rack1R1(config-if)#

While it may seem like an innocuous command, and one you may have never seen before… the “no routing dynamic” will cause this passive behavior without typing “passive-interface fa0/0” anyplace! And if you aren’t looking for it, we can see all sorts of issues with it.
So there’s a couple more things to be paranoid about for your lab exam. But how do we check for them? Honestly, as part of your initial discovery, I’d do a very simple command!

Rack1R1(config)#do sh run | in no\
no service password-encryption
no ip subnet-zero
no ip domain lookup
no dspfarm
no routing dynamic
no ip address
no frame-relay inverse-arp IP 102
no frame-relay inverse-arp IP 103
no frame-relay inverse-arp IP 104
no frame-relay inverse-arp IP 113
no ip address
no synchronization
no auto-summary
no ip http secure-server
Rack1R1(config)#

There’s a space after the “\” character up there (to tell GREP about the special character to follow). That way you avoid anything with just “no” in there and only get “no ” as a match.
There may be many things shown (as above) that we really don’t care about. But some things like the “subnet-zero” and the “routing dynamic” should definitely leap out at us! A quick scan on each device like this can save HOURS of troubleshooting later.
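For practice racks where you can save the running configs to files, the same scan can be sketched offline in Python. The `WATCHLIST` here is an assumption built only from the commands discussed in this series; add your own favorites.

```python
# Commands this series singles out as troublemakers; extend to taste.
WATCHLIST = ("no ip routing", "no ip subnet-zero", "no routing dynamic")


def negated_commands(config):
    """Return every config line that begins with 'no ' after indentation."""
    return [
        line.strip()
        for line in config.splitlines()
        if line.strip().startswith("no ")
    ]


def suspicious(config):
    """Subset of the negated commands that appear on the watchlist."""
    return [cmd for cmd in negated_commands(config) if cmd in WATCHLIST]


sample = """\
no service password-encryption
no ip subnet-zero
no ip domain lookup
interface FastEthernet0/0
 ip address 145.1.17.1 255.255.255.0
 no routing dynamic
no ip http secure-server
"""

print(suspicious(sample))
```

Everything that starts with “no ” is still available from `negated_commands`, so you can eyeball the full list the way you would on the router and let the watchlist catch the ones that should leap out.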
The fixes are simple!

Rack1R1(config)#ip sub
Rack1R1(config)#int f0/0
Rack1R1(config-if)#rou dyn
Rack1R1(config-if)#
*Feb 16 17:06:49.197: %OSPF-5-ADJCHG: Process 1, Nbr 150.1.7.7 on FastEthernet0/0 from LOADING to FULL, Loading Done
Rack1R1(config-if)#
*Feb 16 17:07:04.225: %OSPF-5-ADJCHG: Process 1, Nbr 150.1.7.7 on OSPF_VL4 from LOADING to FULL, Loading Done
Rack1R1(config-if)#

Within about 10 seconds, things are back to working order again.
Never fear…. There’s more coming…. Will our router be boiled in hot lava? Will our switch get hijacked by the proctor? Have we been punk’d by the lab? Stay tuned next time for the exciting conclusion to CCIE Troubleshooting!

CCIE Troubleshooting: Part 3 (Ummmm…. Houston, We Have a Problem)

And we’re back for an exciting conclusion to the CCIE Troubleshooting series. Just what fate is in store for our heroes? Well, if we’re anything like the Friday the 13th movies, you may need to wait about 20+ years to find out!

You’re probably asking yourself, “With the amount of stuff we’ve gone through already, you’re really telling me there’s more to be concerned about?”
Yup, that’s exactly what I’m talking about. So what next then? How about little unexpected zingers that may just confuse, confound and otherwise astound you?
Have you ever rebooted your router/switch and had it reply to you “Would you like to enter basic management setup?” Of course you have. All the time in practice labs! But what if it happened in the middle of your actual lab exam? You DID save your configuration, right? You didn’t see any error messages pop up did you? Well, how about that?
How about a bit of preventive medicine? When you start the day, open Session Options in SecureCRT (the terminal you’ll be using in the lab) and click on Emulation (under Terminal in the left-hand pane). Find the Scrollback buffer setting and set it to AT LEAST 5000 lines. That gives you the ability to review things you’ve typed or what the router has said, or recreate your steps in case of an emergency.
But back to this dilemma. What do you do? Run in circles, scream and shout? While that may be entertaining, it would scare everyone else in the room and not be overly helpful to your cause! Whatever you do, DO NOT enter the setup mode. Before completely panicking, just do a “show start” and see whether your config was really saved or not!
If it did not save, go check your scrollback and see what error you missed. Then hope you have enough scrollback to reconstruct your configuration. THEN run in circles, scream and shout!

You likely were victimized by the configuration-register. Well, ok, you were victimized by the Proctor’s warped sense of humor, but the weapon of choice was the configuration-register. Deal with it. You should have checked ahead of time! “show version | include Config”

Rack1R1(config)#do sh ver | in Config
Configuration register is 0x2102
Rack1R1(config)#

That would be the sign of a good router.
0x2142 is the sign of a bad router – your startup configuration will be skipped
0x2101 is the sign of a really bad router – you will go into bootstrap or ROMMON mode
Those are the primary ones to think about. There are others that involve the changing of console speed. While particularly entertaining, being that you do not have access to the terminal server configuration or physical access to devices any longer, that’s not something you’ll be worrying about. Back in my day, that was one method of torture potentially bestowed upon us.
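To see why those values behave differently, it helps to pull the register apart bit by bit. The Python sketch below decodes only the two fields that matter here, based on my understanding of the usual Cisco layout: the low four bits are the boot field, and bit 6 (0x0040) tells the router to bypass the startup configuration. What boot field 1 actually loads varies by platform, so verify against the documentation for your hardware; `decode_confreg` is a made-up helper name.

```python
def decode_confreg(value):
    """Decode the two configuration-register fields discussed above."""
    boot_field = value & 0x000F            # low four bits: boot field
    ignore_startup = bool(value & 0x0040)  # bit 6: bypass the startup configuration
    if boot_field == 0x0:
        boot = "ROM monitor"
    elif boot_field == 0x1:
        boot = "boot helper or first image (platform-dependent)"
    else:
        boot = "normal boot"
    return {"boot": boot, "ignore_startup": ignore_startup}


for register in (0x2102, 0x2142, 0x2101):
    print(f"0x{register:04X}", decode_confreg(register))
```

The only difference between 0x2102 and 0x2142 is that single 0x0040 bit, which is why a router with a perfectly saved configuration can still greet you with the setup dialog.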
So what do you do if you end up in ROMMON? Partly that will depend on what routing platform you happen to be using. But let’s go with the standard ISR routers (28xx and 38xx which I believe are the currently listed devices on the blueprint for Lab Hardware).

rommon 1 >
rommon 1 > ?
alias               set and display aliases command
boot                boot up an external process
break               set/show/clear the breakpoint
confreg             configuration register utility
cont                continue executing a downloaded image
context             display the context of a loaded image
cookie              display contents of motherboard cookie PROM in hex
dev                 list the device table
dir                 list files in file system
dis                 disassemble instruction stream
dnld                serial download a program module
frame               print out a selected stack frame
help                monitor builtin command help
history             monitor command history
iomemset            set IO memory percent
meminfo             main memory information
repeat              repeat a monitor command
reset               system reset
rommon-pref         Select ROMMON
set                 display the monitor variables
showmon             display currently selected ROM monitor
stack               produce a stack trace
sync                write monitor environment to NVRAM
sysret              print out info from last system return
tftpdnld            tftp image download
unalias             unset an alias
unset               unset a monitor variable
xmodem              x/ymodem image download
rommon 2 >

You’ll have a very specific set of commands there having to do with the boot process and/or reloading IOS through the serial port (not much fun, and not possible over telnet as configured!).
You can use the “dir flash:” command to see what files are available (or “dev” if you need to know the names of current devices like DISK0: or DISK1: for PCMCIA cards) and then “boot flash:(filename)” if there’s any doubt.
Knowing how to get out of ROMMON is a great skill. Murphy’s Law says whatever can go wrong will go wrong, and generally at the most inopportune moment! Know how to reboot!
So what if you get into a router at the beginning of your lab and you see this:

rommon 1>
rommon 1>?
Exec commands:
  access-enable    Create a temporary Access-List entry
  access-profile   Apply user-profile to interface
  call             Voice call
  clear            Reset functions
  connect          Open a terminal connection
  crypto           Encryption related commands.
  disable          Turn off privileged commands
  disconnect       Disconnect an existing network connection
  enable           Turn on privileged commands
  exit             Exit from the EXEC
  help             Description of the interactive help system
  lat              Open a lat connection
  lock             Lock the terminal
  login            Log in as a particular user
  logout           Exit from the EXEC
  modemui          Start a modem-like user interface
  mrinfo           Request neighbor and version information from a multicast
                   router
  mstat            Show statistics after multiple multicast traceroutes
  mtrace           Trace reverse multicast path from destination to source
  name-connection  Name an existing network connection
  pad              Open a X.29 PAD connection
  ping             Send echo messages
  ppp              Start IETF Point-to-Point Protocol (PPP)
  release          Release a resource
  renew            Renew a resource
  resume           Resume an active network connection
  rlogin           Open an rlogin connection
  set              Set system parameter (not config)
  show             Show running system information
  slip             Start Serial-line IP (SLIP)
  ssh              Open a secure shell client connection
  systat           Display information about terminal lines
  tclquit          Quit Tool Command Language shell
  telnet           Open a telnet connection
  terminal         Set terminal line parameters
  tn3270           Open a tn3270 connection
  traceroute       Trace route to destination
  tunnel           Open a tunnel connection
  udptn            Open an udptn connection
  where            List active connections
  x28              Become an X.28 PAD
  x3               Set X.3 parameters on PAD

rommon 1>

That’s entirely different and you have no capability of setting the boot parameters there. Miraculously though, you do have the “enable” command.
I’ll give you a hint, there’s no enable command in ROMMON mode! You aren’t really in ROMMON, you are just being punk’d by the router.

rommon 1>enable
rommon 1>sh run | in rommon
prompt "rommon 1>"
rommon 1>

No matter what mode you are in, that’s the prompt. Let’s exit back out though, and assume that we’re doing things our “typical” configuration fashion.

rommon 1>
rommon 1>en

*
*
*
*?

ERR

The "help" PAD command signal consists of the following elements:

  where
  is the identifier for the type of
                explanatory information requested

*

What the heck is that??? Ahhhh… More entertainment. That would be the X28 Diagnostic Mode (helps if you are running an X.25 PAD, but since it’s highly unlikely that normal people today even know what that is, chances are you don’t want to run it!). And yet here we are. Punk’d again.
The “exit” command will get you out. Only to be placed back to your fake ROMMON prompt! Try typing “enable” fully. (By the way, the prompt won’t change, so don’t believe everything you see!)

rommon 1>enable
rommon 1>sh run | in en
Current configuration : 3980 bytes
no service password-encryption
boot-end-marker
 enrollment selfsigned
ip http authentication local
Please change these publicly known initial credentials using SDM or the IOS CLI.
alias exec en x28
end
rommon 1>

Ahhhh… Aliases. Aren’t they exciting? If your proctor REALLY doesn’t like you, they’ll alias “en” and “enable” to something equally inane. But that’s a start. So let’s get rid of these things.
Don’t forget that you may need to type “configure terminal” out in full, in case they aliased “conf” to “exit” or something fun like that!

rommon 1>
rommon 1>conf t
Enter configuration commands, one per line.  End with CNTL/Z.

exit
rommon 1>
*Feb 17 04:35:23.451: %SYS-5-CONFIG_I: Configured from console by console
rommon 1>

Ummmm… Did someone eat the configuration mode?

rommon 1>conf t
Enter configuration commands, one per line.  End with CNTL/Z.

interface fa0/0

ip address ?
  A.B.C.D  IP address
  dhcp     IP Address negotiated via DHCP
  pool     IP Address autoconfigured from a local DHCP pool

ip address
% Incomplete command.

^Z
rommon 1>
*Feb 17 04:36:24.755: %SYS-5-CONFIG_I: Configured from console by console
rommon 1>

Commands appear to work, but we can’t see anything. That would be another command!

rommon 1>conf t
Enter configuration commands, one per line.  End with CNTL/Z.

do sh run | in config
Building configuration...
Current configuration : 4005 bytes
no service prompt config

Fix me!
  ^
% Invalid input detected at '^' marker.
service prompt config
TestRouter(config)#

Just put the service back on and we’re good. Interestingly enough, the hostname shows up while in config mode. Once back in exec mode, the fake prompt comes back, because the “prompt” command is still configured.

TestRouter(config)#end
rommon 1>
rommon 1>
rommon 1>
*Feb 17 04:38:52.911: %SYS-5-CONFIG_I: Configured from console by consoleconf t
Enter configuration commands, one per line.  End with CNTL/Z.
TestRouter(config)#
TestRouter(config)#
TestRouter(config)#default prompt
TestRouter(config)#exit
TestRouter#
*Feb 17 04:39:04.775: %SYS-5-CONFIG_I: Configured from console by console
TestRouter#
TestRouter#

Other things that may happen to your routers… Modification of the exec-timeout timers. A 30-second timeout may go undetected for a while. Or if someone is really being amusing, they’ll set “exec-timeout 0 1” on the console port: type one character every second or get kicked out.
This is a place for cut/paste if I’ve ever seen one!
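The two arguments to exec-timeout are minutes and seconds, so “0 1” is one second and “0 0” means the session never times out. As a rough sketch, assuming you have a saved copy of a configuration as text, the helper below converts each line section's exec-timeout into seconds and flags the short ones. The function names, the 60-second threshold, and the sample config are all my own choices for illustration.

```python
import re


def exec_timeouts(config_text):
    """Map each line section (e.g. 'line con 0') to its exec-timeout in seconds."""
    timeouts = {}
    current = None
    for raw in config_text.splitlines():
        if raw.startswith("line "):
            current = raw.strip()      # entering a new line section
            continue
        if not raw.startswith(" "):
            current = None             # any other top-level line ends the section
            continue
        match = re.match(r"\s+exec-timeout\s+(\d+)(?:\s+(\d+))?", raw)
        if match and current:
            minutes = int(match.group(1))
            seconds = int(match.group(2) or 0)
            timeouts[current] = minutes * 60 + seconds
    return timeouts


def risky_timeouts(timeouts, threshold=60):
    """Keep sections whose timeout is enabled (non-zero) but under the threshold."""
    return {name: secs for name, secs in timeouts.items() if 0 < secs < threshold}


# Hypothetical config fragment used only to exercise the functions.
sample = """\
line con 0
 exec-timeout 0 1
line aux 0
line vty 0 4
 exec-timeout 0 30
 login
!
end
"""

print(risky_timeouts(exec_timeouts(sample)))
```

Whatever it flags, have the replacement lines sitting in Notepad so they can be pasted in one shot before the timer kicks you out again.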
There may be any number of other odd things appearing throughout your configurations. With a decent glance these (hopefully) will stand out like a sore thumb. Other than the show commands from the last two days to verify IP addressing and looking for basic “no” things in your config, you should do a “show run” on every device.
When you get into the lab exam, you really have no idea just how much configuration will be in place already. Just like any consulting engagement, you could have a completely blank greenfield deployment. Or you could walk into a semi-dysfunctional existing network to improve/fix/enhance throughout the day. Check out what you have to begin with. Make notes.
Things that are especially important as they may lead to future difficulties when you configure the tasks given to you:
1. Backup-interface configurations — These leave interfaces in a “standby” state which is most definitely not up!
2. Span or Remote-Span configurations — This may involve the copying of information from one port to another. While it’s one way, so OSPF peers won’t show up, RIP advertisements could!
3. “ip classless” command — This may have effects on your routing processes, or at least what is showing up versus expected!
4. kron jobs — I outlined this before regarding time-based redistribution, but anything pre-existing should be noted!
5. EEM (Embedded Event Manager) — Shouldn’t see these anyplace (or rarely!) – See below
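When practicing, the five items above can be turned into a quick scan over a saved “show run”. This is a minimal sketch under my own assumptions: the watchlist patterns are keyed to how these features usually appear in a running config, the function name is hypothetical, and the sample config is invented for the example. It also remembers the parent section so a backup interface is reported along with the interface it lives under.

```python
import re

# Label and pattern for each item in the list above.
WATCHLIST = [
    ("backup interface", re.compile(r"^backup interface\s")),
    ("SPAN/RSPAN", re.compile(r"^monitor session\s")),
    ("ip classless", re.compile(r"^(no )?ip classless$")),
    ("kron", re.compile(r"^kron (occurrence|policy-list)\s")),
    ("EEM", re.compile(r"^event manager applet\s")),
]


def scan_config(config_text):
    """Return (label, line, parent_section) for every watchlist hit."""
    findings = []
    parent = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line == "!":
            continue
        indented = raw.startswith(" ")
        if not indented:
            parent = line              # a new top-level section starts here
        for label, pattern in WATCHLIST:
            if pattern.match(line):
                findings.append((label, line, parent if indented else None))
    return findings


# Hypothetical config fragment used only to exercise the function.
sample = """\
interface Serial0/0
 backup interface Serial0/1
!
no ip classless
monitor session 1 source interface Fa0/1
kron occurrence nightly in 5 recurring
 policy-list redist
event manager applet NeverTrustTheseThings
 event timer watchdog time 300
"""

for label, line, parent in scan_config(sample):
    where = f" (under {parent})" if parent else ""
    print(f"{label}: {line}{where}")
```

A hit is not proof of sabotage, just a line to write down in your notes before you start the tasks.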
Every once in a while, we hear ramblings from people insisting that the proctor got into their racks and changed configuration. Even when they use Notepad to track their commands, or the scrollback buffer, they insist that interfaces were configured one minute and had no configuration the next. There was no reboot, therefore it must have been the proctor.
While they do have a devilish glint in their eyes most of the time, and look like rather unsavory individuals, the Proctor’s job is not to interfere with anyone’s routers or switches. They have enough to do without resorting to that level of torture! The Geneva Convention actually prohibits this type of behavior!
So ask yourself… If the proctor doesn’t get into my equipment… And I KNOW that I’ve configured things and they are working…. What could possibly be the cause of it? How about the last two things I mentioned above? Kron or EEM have the ability to execute commands, configure device changes and/or copy files from TFTP devices into the running config. You should be aware of what’s happening on a network at any point.
If your job is to evaluate and improve, it would be silly to rush off to do a list of tasks without understanding the impact along the way, wouldn’t it? Or what forces were working against you?
A simple check of the running configuration beforehand can show these anomalies to you. Anything that looks strange needs to be investigated! There’s nothing worse than working through things in the middle of the day, then seeing:

Rack1R1(config)#
*Feb 18 16:47:59.973: %OSPF-5-ADJCHG: Process 1, Nbr 150.1.7.7 on FastEthernet0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
*Feb 18 16:47:59.997: %PIM-5-NBRCHG: neighbor 145.1.17.7 DOWN on interface FastEthernet0/0 DR
*Feb 18 16:48:00.009: %SYS-5-CONFIG_I: Configured from console by vty0
Rack1R1(config)#
*Feb 18 16:48:05.473: %OSPF-5-ADJCHG: Process 1, Nbr 150.1.7.7 on OSPF_VL4 from FULL to DOWN, Neighbor Down: Interface down or detached
Rack1R1(config)#
*Feb 18 16:48:30.041: %SYS-5-CONFIG_I: Configured from console by vty0
Rack1R1(config)#
Rack1R1(config)#
Rack1R1(config)#

And man, I’d agree. That evil proctor just jacked my lab!

Rack1R1(config)#do sh run int fa0/0
Building configuration...

Current configuration : 73 bytes
!
interface FastEthernet0/0
 no ip address
 duplex auto
 speed auto
end

Rack1R1(config)#

Well, that’s way not cool. My scrollback even tells me what I’ve done.

Rack1R1(config)#do sh run int f0/0
Building configuration...

Current configuration : 115 bytes
!
interface FastEthernet0/0
 ip address 145.1.17.1 255.255.255.0
 ip pim sparse-mode
 duplex auto
 speed auto
end

Rack1R1(config)#

Well, let’s put it back…

Rack1R1(config)#
Rack1R1(config)#interface FastEthernet0/0
Rack1R1(config-if)#
Rack1R1(config-if)# ip address 145.1.17.1 255.255.255.0
Rack1R1(config-if)#
Rack1R1(config-if)# ip pim sparse-mode
Rack1R1(config-if)#
Rack1R1(config-if)# duplex auto
Rack1R1(config-if)#
Rack1R1(config-if)# speed auto
Rack1R1(config-if)#
Rack1R1(config-if)#
Rack1R1(config-if)#
*Feb 18 16:51:22.753: %PIM-5-NBRCHG: neighbor 145.1.17.7 UP on interface FastEthernet0/0
*Feb 18 16:51:22.757: %PIM-5-DRCHG: DR change from neighbor 0.0.0.0 to 145.1.17.7 on interface FastEthernet0/0
*Feb 18 16:51:27.501: %OSPF-5-ADJCHG: Process 1, Nbr 150.1.7.7 on FastEthernet0/0 from LOADING to FULL, Loading Done
Rack1R1#do sh run int f0/0
*Feb 18 16:51:29.921: %OSPF-5-ADJCHG: Process 1, Nbr 150.1.7.7 on FastEthernet0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
*Feb 18 16:51:29.929: %PIM-5-NBRCHG: neighbor 145.1.17.7 DOWN on interface FastEthernet0/0 DR
*Feb 18 16:51:30.005: %SYS-5-CONFIG_I: Configured from console by vty0
Rack1R1(config)#do sh run int f0/0
Building configuration...

Current configuration : 73 bytes
!
interface FastEthernet0/0
 no ip address
 duplex auto
 speed auto
end

Rack1R1(config)#

Our neighbors come back, things are good again… But then it’s back down right away. Or it could be much later. Either way, the same frustration ensues!

Rack1R1(config)#do sh run | s event
event manager applet NeverTrustTheseThings
 event timer watchdog time 300
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "default interface FastEthernet0/0"
Rack1R1(config)#

Well, that would certainly do it. And it may have been hidden before by a startup “logging console warnings” or something like that.
EEM is a POWERFUL tool. Check out the Network Management section of your Documentation. http://www.cisco.com/en/US/docs/ios/netmgmt/command/reference/nm_06.html#wp1157622
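For practice racks, it can be handy to pull every applet and its CLI actions out of a saved config so you can see at a glance which ones touch the configuration. The sketch below is one way to do that, assuming applets appear in the “event manager applet NAME” form shown above with “action … cli command” lines beneath them. The function names are hypothetical, and the check for config changes is a simple heuristic (any action starting with “conf”).

```python
import re


def eem_applets(config_text):
    """Map each EEM applet name to the list of CLI commands it runs."""
    applets = {}
    current = None
    for raw in config_text.splitlines():
        header = re.match(r"^event manager applet (\S+)", raw)
        if header:
            current = header.group(1)
            applets[current] = []
            continue
        if not raw.startswith(" "):
            current = None             # left the applet section
            continue
        action = re.match(r'^\s+action \S+ cli command "(.*)"\s*$', raw)
        if action and current:
            applets[current].append(action.group(1))
    return applets


def changes_config(commands):
    """Heuristic: an applet that enters configuration mode can alter the config."""
    return any(cmd.startswith("conf") for cmd in commands)


# The applet from the output above, used as sample input.
sample = """\
event manager applet NeverTrustTheseThings
 event timer watchdog time 300
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "default interface FastEthernet0/0"
"""

for name, commands in eem_applets(sample).items():
    flag = "CHANGES CONFIG" if changes_config(commands) else "read-only?"
    print(name, flag, commands)
```

The same idea extends to kron policy lists, which also carry CLI commands, though that parsing is not shown here.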

Rack1R1(config)#no event manager applet NeverTrustTheseThings
Rack1R1(config)#do sh run | s event

Rack1R1(config)#

So during the first 30-45 minutes of your lab exam (after the Core Technology Q&A), you should be:
1. Reading through the whole exam
2. Taking notes to remind yourself of things later
3. Redrawing the diagram quickly so you can write on it
4. Doing a simple verification of IP addresses
5. Identifying major things in the configs beginning with “no” or altering the configuration register
6. Quickly doing “show run” on all devices and scanning through for anything that looks strange or out of place
7. Getting ready to kick butt and take your number home!
We get stuck in ruts when going through practice labs. We have a process generally dictated by what level of preparation that we’ve done. While there are certainly a good number of tools and labs and documents out there to help you study, there is nothing that compares to the “personalized approach”. By that, I mean, make changes yourself. Or have a buddy make some additional tweaks, tasks, changes to labs for you.
Keep in mind that folks on the CCIE team probably have EVERYONE’s study materials. So while we are indeed pretty cool, they’ll go out of their way to find something we didn’t think of. So outsmart them! Process, process, process.
Good luck in your lab prep, and most certainly in the troubleshooting portion! Don’t forget, this addresses nothing about the self-induced troubleshooting. While going through your lab, you should be verifying things every step of the way with show or debug (or whatever) commands. If properly verified, you will have few, if any, surprises during your lab.

Source:
http://www.ciscosim.net

Wednesday, December 23, 2009
