inevitable

Thursday, March 05, 2009

Deep Diving Router Architecture, Part I

When I was young (how old do you think I am now?) I used to look at a router as a “black box” or just a node. I mean, I was not interested in going into the internal packet switching process inside the router itself, and I focused more on the protocols and features that run between nodes. Well, actually “interest” is not the best word to describe it. If you don’t work for a company that makes routers, do you think you can get more detailed information about what is really going on inside the box? Now I’m still young (I guess) but at least I have the chance to deep dive down to the architecture level of router hardware.

And actually such knowledge is not always required in our daily job anyway. Most network engineers, even CCIEs, may just need to assume that the router is a box with multiple interfaces, and that its function is to forward the packet to the next hop based on the routing table built from static routes or dynamic routing protocols. Then we put more focus on the communication between routers that builds the routing table, instead of the packet switching process from one interface to another inside a router. In OSPF, the discussion of LSA packets, the database and the SPF calculation can be very complex and give us lots of headaches, especially if we have to do redistribution with another IGP or BGP and so on. So once we can see the routes in the routing table, and there is no other treatment such as a filter or policy, normally we would happily assume that the packet will be processed and forwarded to the next hop. Then we can focus more on the other features or applications that run on top of the routing, which will probably give us a different kind of headache.

So for most of us mere mortals, it may be enough to say that packet switching within a router means switching the packet from the ingress (input) interface to the egress (output) interface. In CCIE we do need to dig a bit deeper, for example when we have to determine the order in which features are applied in the router. Does NAT come first, or the access control list? How about policy-based routing that overrides the routing table? And so on. But we never really bother to look at which internal part of the router does this or that. Later I can explain why most of us don’t bother, other than the lack of resources available to learn it.

Why is it important to understand the internal packet switching?
For me personally, it is to understand the limitations that the hardware places on protocol and feature implementations. And this is important for any design engineer. I mean, we can build a network design that specifies the number and type of hardware for core routers, aggregation, access etc. Then we recommend the protocols and features to be enabled, and come up with a nice and complete configuration to be pasted into the box. In reality, there is a standard for a protocol but every vendor may implement it differently, depending on their interpretation of the standard or perhaps because they invent their own approach to following the standard. And some features, or the way the protocols are implemented, depend on the hardware architecture. We may end up in a situation where the new network has been up and running, and only after some time do we start noticing a performance or scalability issue due to the limitations of the hardware inside the routers, when we really have heavy traffic in the network or when we want to expand the design.

A very simplified process of packet switching is shown in the picture above. The packet travels on the wire with Layer 3 and Layer 2 header information as per the TCP/IP protocol stack. The interface processor in a router is able to pick it up, inspect and strip the Layer 2 header, and send it to the route processor for further processing. While the route processor does a Layer 3 lookup in the routing table (and forwarding table) to check what it should do with the packet, the packet itself must be stored in a queue or buffer. Once the next hop is determined, the route processor knows which interface it should send the packet to. Then the packet can be moved to an output queue, where it waits to be transmitted back to the wire and gets rewritten with a new Layer 2 header containing the information of the next hop. Then the packet can leave the box. The input and output queues can be virtual, so they can refer to the same physical memory and the packet never moves anywhere. But this makes it possible to apply different treatment when the packet is considered to be in the input queue (before the lookup) and when it is already in the output queue, where the lookup has been done and the destination interface for the packet has been determined.
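To make the "virtual queue" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not how any real router is implemented: the class name `PacketMemory`, the method names and the `choose_egress` callback are all made up for this example. The point it tries to show is that the packet is stored once, and the input and output queues only hold references (buffer ids) to it.

```python
from collections import deque


class PacketMemory:
    """Hypothetical shared packet memory with 'virtual' input/output queues."""

    def __init__(self):
        self.buffers = {}            # buffer id -> packet bytes (stored once)
        self.next_id = 0
        self.input_queue = deque()   # buffer ids still waiting for the lookup
        self.output_queues = {}      # egress interface -> deque of buffer ids

    def receive(self, packet):
        """Store an arriving packet once and put its id on the input queue."""
        buf_id = self.next_id
        self.next_id += 1
        self.buffers[buf_id] = packet
        self.input_queue.append(buf_id)
        return buf_id

    def switch_one(self, choose_egress):
        """Do the lookup for the oldest packet and requeue its id for output.

        choose_egress is an assumed callback standing in for the Layer 3
        lookup; it takes the packet and returns an egress interface name.
        """
        buf_id = self.input_queue.popleft()
        egress = choose_egress(self.buffers[buf_id])
        self.output_queues.setdefault(egress, deque()).append(buf_id)
        return buf_id, egress


memory = PacketMemory()
packet = bytearray(b"some packet bytes")
buf_id = memory.receive(packet)
memory.switch_one(lambda pkt: "eth1")       # pretend the lookup says eth1
print(memory.buffers[buf_id] is packet)     # same object, it never moved
```

Only the small buffer id moves from the input queue to the output queue, which is why features can be applied "on input" or "on output" without copying the packet.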

So the keywords are: Layer 3 and Layer 2 headers, input queue, routing table and forwarding table, lookup, moving the packet between different locations or queues, output queue, Layer 2 rewrite.

Let’s see it once again in more detail. Here is the snapshot from Vijay Bollapragada’s Inside Cisco IOS Software Architecture book, for a very basic switching method called process switching.

Once the interface processor receives the packet from the network media on the input or ingress interface, it has to store it in the buffer or memory (1) and at the same time interrupt the processor (2) to inform it that there is a packet that needs to be processed. The book focuses on software architecture, so it explains how the processor then invokes a process (3), which is called ip_input in Cisco IOS, to start doing the lookup in the routing and forwarding table. This lookup determines which output or egress interface the router needs to send the packet out of, along with the Layer 2 information that needs to be written to the packet before it can be sent out (4). The processor then does the Layer 2 rewrite (5) and hands the packet over to the egress interface processor (6), then off the packet goes back to the network media. Step 7 just informs the main processor that the packet has been sent out, so the memory can be freed and the packet counter on the interface can be incremented.
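The seven steps can be sketched as a toy simulation. This is a rough illustration only and not IOS code: the class `ProcessSwitchedRouter`, its fields, and the exact-match dictionary used for the lookup are assumptions made for this example (a real router does a longest-prefix match). The only name taken from the text is `ip_input`.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    dst_ip: str
    l2_header: str
    payload: str


class ProcessSwitchedRouter:
    """Toy model of the process switching steps (1) to (7)."""

    def __init__(self, routes, adjacency):
        self.routes = routes          # dst_ip -> egress interface (exact match, simplified)
        self.adjacency = adjacency    # egress interface -> next hop Layer 2 header
        self.buffer = deque()         # packet memory
        self.wire = []                # what has been transmitted, for inspection
        self.tx_counters = {}         # egress interface -> packets sent

    def receive(self, packet):
        self.buffer.append(packet)    # (1) store the packet in memory
        self.interrupt()              # (2) interrupt the processor

    def interrupt(self):
        self.ip_input()               # (3) the processor invokes ip_input

    def ip_input(self):
        packet = self.buffer[0]
        egress = self.routes.get(packet.dst_ip)        # (4) lookup
        if egress is not None:
            packet.l2_header = self.adjacency[egress]  # (5) Layer 2 rewrite
            self.wire.append((egress, packet))         # (6) hand over to egress
            self.tx_counters[egress] = self.tx_counters.get(egress, 0) + 1
        self.buffer.popleft()         # (7) free the memory (also on drop)


router = ProcessSwitchedRouter(
    routes={"10.0.0.2": "eth1"},
    adjacency={"eth1": "mac-of-next-hop"},
)
router.receive(Packet("10.0.0.2", "old-l2-header", "data"))
print(router.wire, router.tx_counters)
```

Notice that every step runs on the one central processor, which is exactly the part that becomes interesting when we talk about hardware.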

I have to admit that I won’t be able to explain it as well as Vijay (and the other guys) do, so I suggest reading the book if you are still curious. But my point here is just to emphasize that there are different tasks that need to be done other than the lookup, such as moving the packet from the ingress to the egress interface and rewriting the new Layer 2 information to the packet, which will become important for later discussion.

Again, why do we need to worry about the internal process of packet switching? Hang on there. I know we usually put more focus on the interaction between routers with routing protocols, to ensure each router can build the routing table successfully. Once we have the table, the Layer 3 lookup itself can be done very fast. For each incoming packet we need to compare the destination against the database containing the list of all destinations with their associated egress interfaces. It can be done quickly, especially since a vendor like Cisco has invented a mechanism so that the comparison doesn’t need to go through the entries in the list one by one. Instead, Cisco Express Forwarding (CEF) builds an mtrie data structure from the routing table, as shown in the next picture. Once the entry has been found, it gives a pointer to the adjacency table, which contains the Layer 2 information of the next hop.
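To give a feel for why a trie avoids walking the list one by one, here is a small sketch of a longest-prefix lookup that returns a pointer into an adjacency table. Please treat it as a simplification under stated assumptions: it is a plain binary trie (one bit per level) for IPv4 only, while the CEF mtrie is a multiway structure, and the class names, interface names and MAC addresses are invented for the example.

```python
import ipaddress


class TrieNode:
    __slots__ = ("children", "adj_index")

    def __init__(self):
        self.children = [None, None]   # child for bit 0 and child for bit 1
        self.adj_index = None          # pointer into the adjacency table


class ForwardingTable:
    """Binary trie keyed on destination bits (simplified, IPv4 only)."""

    def __init__(self):
        self.root = TrieNode()
        self.adjacencies = []          # list of (egress interface, next hop MAC)

    def add_adjacency(self, interface, next_hop_mac):
        self.adjacencies.append((interface, next_hop_mac))
        return len(self.adjacencies) - 1

    def add_route(self, prefix, adj_index):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.adj_index = adj_index

    def lookup(self, destination):
        """Walk at most 32 levels and remember the longest matching prefix."""
        bits = int(ipaddress.ip_address(destination))
        node = self.root
        best = node.adj_index          # a default route, if one was installed
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.adj_index is not None:
                best = node.adj_index
        return None if best is None else self.adjacencies[best]


fib = ForwardingTable()
adj_a = fib.add_adjacency("Gi0/1", "00:11:22:33:44:55")
adj_b = fib.add_adjacency("Gi0/2", "66:77:88:99:aa:bb")
fib.add_route("10.0.0.0/8", adj_a)
fib.add_route("10.1.0.0/16", adj_b)
print(fib.lookup("10.1.2.3"))   # the more specific /16 wins
print(fib.lookup("10.9.9.9"))   # falls back to the /8
```

The cost of a lookup depends on the address length, not on how many routes are in the table, which is the property that matters here.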

Enough with the lookup process and how the router determines which interface it should send the packet to. There is a book dedicated to explaining CEF in more detail. I want to focus on the hardware architecture instead of the software or algorithm of the lookup, so I suggest you read this Cisco Express Forwarding book as well as Vijay’s book.

Now, let’s talk about moving the packet from the ingress interface to the egress interface. As discussed previously, the packet can be stored in a central memory while waiting for the lookup process. So the ingress interface processor must store the packet there, and the egress interface processor can copy the packet (with the new Layer 2 information) from the same central location. As you can see, with this idea the bottleneck is the central memory performance, and obviously the memory must be able to serve multiple requests from different interface processors at the same time.

To improve the memory performance, one may want to use local memory on each interface. So the packet is stored in the local memory of the ingress interface, then it can be copied to the shared central memory over the bus, and the local memory of the egress interface can get the packet from there. You may start asking, why doesn’t the ingress interface memory send the packet directly to the egress interface memory? Hold your horses for a while. It is possible, but it requires some sort of intelligence on the ingress interface processor to determine which egress interface memory it should send the packet to. In other words, the ingress interface components may need to do the lookup. I will talk more about this in the next part.

When you open the chassis of an old mid-range router, you may see something similar to the picture below. The main board is the base component that connects all the other components. There is a central route processor, central memory, the network interface cards, a PCI bus connecting the network cards to the route processor, and other components such as flash where we can store the software image, a boot ROM holding the firmware required for the boot process before we can load the router software image, and so on.

Back to our keywords quickly: Layer 3 and Layer 2 headers are inside the packet. The input queue or buffer can be in the ingress network card’s local memory or in central memory. The routing table and forwarding table are built by the route processor, using protocols to communicate with other routers. The Layer 3 lookup (along with the Layer 2 information of the next hop) is done by the route processor, using an algorithm to compare the destination against the routing table and forwarding table. Moving the packet between different locations or queues means the packet in the ingress network card’s local memory must be copied to the central memory over PCI or another bus, then the egress network card’s local memory can get it from there. The output queue is the egress network card’s local memory or central memory. The Layer 2 rewrite, which puts the Layer 2 information into the packet, must be done by the route processor before the packet can be sent out of the router. All the features such as filters or NAT are done by the route processor. Applying a feature on the ingress interface or egress interface can simply be a matter of applying the feature to the state of the packet before or after the lookup has been done.

Looking at the picture above, does it remind you of something? Yes, it looks the same as the components of a normal PC main board! This is one reason why some talented people can build their own router software, load it onto a normal PC, put in multiple network cards, and claim they can compete with or even beat a router built on dedicated hardware by a router vendor.

My take on this: it depends. If you want to compare the free router on a normal PC to some old mid-range router, this might be true. All the tasks inside such a router are done in the central processor and memory, so what it takes is to build good software to do the lookup and packet switching, with optimization to ensure it uses the resources in a proper or better way.

But how about the latest features in next generation networks? Do you think some people will build them for free? The features in a router are getting so complicated that the team needs to decide how to implement them even when there is a standard already defined. And in the second part I will explain how far a vendor has gone to develop a modern or next generation router. Because obviously the challenge is not how to switch the packet from the ingress interface to the egress interface, but how to do so as fast as possible. And it has to be done consistently for different types of packets, for different sizes of packets, in massive amounts to accommodate the demand for huge bandwidth nowadays. Then later on we will start facing more challenges in deploying features that should be done in the hardware, for example applying different treatment to packets based on priority on the egress network card to ensure high priority packets are transmitted first back to the network media or the wire. Rewriting the Layer 2 information to the packet should be done in the hardware too, to ensure maximum performance.

If you have read this far, and you think all the information above is more than enough to help you in your daily job, and you think it’s more important to go back to all the headaches caused by the communication between routers, or the protocols and features that need to run on multiple routers, then you are completely welcome to still see a router as a black box or a node with multiple interfaces where packets go in and out. And there is really no harm if you want to skip the next part and decide not to bother at all with the internal packet switching process inside a router.

End of part one.

Friday, February 20, 2009

Title or Money?

According to a survey on the radio, people are happier with their job designation than with their salary. So it means they would rather have a nice job title even with less salary. Really? For me no, thanks, please give me any title you want as long as the money is more.

But seriously, I don't think all of us agree with that survey. What's the meaning of a job title? Once I refused a job offer with the designation Network Solution Architect for the whole region, because even though the money was good too, the scope of the work was more about contract documentation and not doing the 'real' network solution.

So for me, it's not even the money that matters. What I do is the most important thing, compared to the title or the money. I want a job that makes me do things that I like, otherwise I will do it just to survive. And fortunately we live in a strange world, where people who do things they like normally make more money :)

Just like in Batman: It's not who you are underneath, it's what you do that defines you.

Hmm, I should get a Batmobile instead of a British 4WD.

Tuesday, February 17, 2009

Become CCIE with Simulator FAQ

I received many emails about using a simulator/emulator to practice for the CCIE lab. So I compiled an FAQ for this topic.

Should I use an emulator like dynamips or buy a real lab?
Well, it depends. Dynamips is an emulator that somehow “tricks” the real IOS image so it will boot and run on a standard PC. So far it can run IOS for the 7200, 3600, 3700 and 2600 series routers. So if you need to practice features outside those IOS images, then you can’t do it with dynamips and must go with a real lab.

What exactly does dynamips lack?
Performance, even though it doesn’t matter for a CCIE practice lab; features that must run in hardware, such as certain QoS features; and all the features outside the supported IOS images, for example L2 and switching features from a normal 3550 or 3560 switch. And we need to be aware that if there is any issue, we need to be able to identify whether it comes from a wrong config, an IOS bug, or a bug in dynamips itself. With a real lab, it’s just wrong config and IOS bugs.

Which CCIE track do you think can be done with an emulator only?
For the Service Provider track, you can practice almost 100% of the topics. The focus of the lab is on SP infrastructure, so personally I don’t think you need to spend much time practicing L2 switch features. For Routing & Switching I think dynamips can still be used to cover almost 90%. Although it supports an Ethernet module, it still can’t be used to test real L2 switch features such as VTP and STP. But all L3 features from a 3550/3560 switch can be tested, and will behave just as if we were using a normal router. For the Security track the emulator can be used to test IOS FW, IOS IPS, VPN between routers and security features in routers (NAT, ACL, RTBH etc). But more than half of the features for this track require Firewall, VPN, IDS and Cisco Secure ACS. For the rest of the tracks, I would say the emulator won’t help that much. Check the CCIE lab blueprint and the CCIE lab equipment list to get the idea.

What would I miss from the real lab?
Using a real lab we would be able to test all the features required in the CCIE lab, on real routers with real performance, including hardware-dependent features. We would also have the ability to sell it when we are done and, last but not least, the noise I guess. I used to sleep next to my lab for months, so sometimes I feel that I can still hear the noise inside my head even now.

What would be your suggestion to cover what dynamips lacks?
There are several options. You may invest in and buy a complete real lab. The challenge with a real lab is that we need to replicate the exam lab equipment as closely as possible, which means it can be expensive. But the good thing is, if our lab is still in decent condition after we are done, we may be able to sell it again (to other CCIE candidates) without losing a penny. Another option is to rent an online rack. Its advantage is that we can connect to it as long as we have Internet and we don’t need to invest a big pile of money at the beginning, but obviously the money won’t come back after we are done. The option that you may want to consider is using dynamips to practice and cover as many features as possible (such as for R&S and Security), then renting an online rack a couple of weeks before the exam. For a track like Security, you may want to invest in Firewall and VPN hardware, then connect them to dynamips. To practice IDS and for final preparation before taking the exam, you can use an online rack for several days. List all your options, then weigh the pros and cons of each before you decide.

Do you know people who passed using dynamips only?
Yes, I know many people who have passed the CCIE lab using dynamips or another emulator. In fact, for my third lab, which was the Service Provider track, I practiced using only an emulator similar to dynamips. And no, I won’t tell you what it is, nor will I discuss it in this blog.

Do you think the people who passed using only dynamips or another emulator are not real CCIEs, since they never touched real routers?
No, there is no such thing. Passing the CCIE lab just means you passed a lab exam. What makes a difference later on is your experience and expertise in real life. So someone may pass CCIE using only an emulator and never touch real routers, and he is still a CCIE. Later on he can gain experience and expertise with real routers. That’s what matters at the end of the day.

Will you teach me how to configure dynamips or another emulator?
No. RTFM. Google it.

Will you send me IOS to use in the emulator?
No. It’s actually illegal to run IOS software without a license, though for a practice lab at home I don’t think Cisco would bother chasing you. But I won’t send any IOS.

How do I find info if I have an issue with dynamips?
Again, RTFM and Google. You should also join the forum and become an active member to discuss it. As I mentioned above, if there is any issue when you practice CCIE with an emulator, it may come from a wrong config, an IOS bug or a bug in dynamips. So by becoming an active member of the forum, and if you are willing to use the emulator heavily, you can contribute when you think the issue is in dynamips itself. Help the community maintain and develop this wonderful emulator.

So should I use an emulator or buy a real lab to practice CCIE?
????!@#$%^&* Scroll up and read again from the beginning.

Saturday, January 31, 2009

In This Part Of The World

In this part of the world, people may gain respect not because of what they know, their reputation or what they have done, but probably because of what they own, what they drive, how they look, what they wear, what family name they have, even silly things such as what passport they hold.

But respect is not the reason why I bought my latest toy.

When I got it, what I had in mind was just to have fun. I made a list of 10 things I want to do in life, and one is related to the skill and ability to ride different types of vehicle on all kinds of terrain. Another reason is probably that I also want to tease my buddy who kept telling me I don’t deserve a British car >:)

In case you were wondering, nope, it’s not an Aston Martin. It was indeed in the Quantum of Solace movie too, but it has 4WD, more doors and a bit lower BHP.

In this part of the world, all we need to do is slow our pace a bit, lower our expectations, and everything will be alright.

The name is Bond. James Bond.
No, wait.
The name is Bourne. Jason Bourne.
Nah, never mind.

Stop whining. Start living. And live like you mean it.
Sand dunes, here I come.

Tuesday, January 27, 2009

No Interview Means Good News?

"There’s no bad news or good news. There’s only news"
- Master Oogway, Kung Fu Panda :)

Some people might think the interview in the CCIE lab is difficult. Some people think the opposite. And I'm one of the second group. I can say this out loud because I did the 2-day CCIE lab twice, with an interview after the first day and the troubleshooting section to explain what I did, how I found the problem, and which debug commands I used.

For those who are in doubt about their English, think like this: I believe all CCIE proctors are aware that not everyone is a native speaker. So even if they decide to conduct a real interview in the lab, I guess they don't expect the candidates to have perfect English. What matters is the ability to communicate the keywords. And with the one-to-one interview that I had back in 2001, I could use white board drawings to add more explanation.
More drawing, less talk.

For me, it looks easier than explaining the answer in detail in writing. And again, I have done it that way. My CCIE in the Routing and Switching track is the witness.

If you think writing the answer is easier, well, it depends. I know many engineers out there who don't like to write documents, who don't like to write long emails. And yet they still exist. Just like some guys I know who like to talk and explain on a white board, but don't like writing it down in detail.

Good luck, anyway.

Wednesday, January 14, 2009

Interview @ CCIE Goes Official

Update: I heard it won't be an interview but computer-based questions instead. Oh, crap.

Just got it today. I wrote about a similar idea in my previous post. Good luck to all candidates. And let the real CCIE candidates win! :)

Changes to CCIE Lab and Written Exam Question Format and Scoring

Effective February 1, 2009, Cisco will introduce a new type of question format to CCIE Routing and Switching lab exams. In addition to the live configuration scenarios, candidates will be asked a series of four or five open-ended questions, drawn from a pool of questions based on the material covered on the lab blueprint. No new topics are being added. The exams are not been increased in difficulty and the well-prepared candidate should have no trouble answering the questions. The length of the exam will remain eight hours. Candidates will need to achieve a passing score on both the open-ended questions and the lab portion in order to pass the lab and become certified. Other CCIE tracks will change over the next year, with exact dates announced in advance.

Effective February 17th, 2009, candidates will also see two other changes in CCIE written exams. First, candidates will now be required to answer each question before moving on to the next question; candidates will no longer be allowed to skip a question and come back to it at a later time. Second, there will be an update to the score report. The overall exam score and the exam passing score will now be reported as a scaled score, on a scale from 300-1000. This change will not affect the difficulty of the current set of exams and will assure CCIE written exams will be consistent with Cisco’s other career certification exams.

Monday, January 05, 2009

Summary of My Journey

I'm sleepless because for the past few days I've been staying up late for no reason. Probably because of the many challenges related to my relocation that I have to deal with. Funny, I even thought once that the relocation process could really push me to the limit. It leaves me standing on the edge, ready to free fall anytime. I have reached the bottom of my patience, ready to explode. This is a new experience for me, as I am normally able to deal with the most difficult situations, even with tough customers or painful projects.

Anyway, let's talk about something not related at all just to break the pattern. A few days ago a friend of mine asked me the story of how I made this whole journey. How did I start? How did I manage to reach my current state? And he insisted he couldn't find the answer by reading my previous posts in the blog. That's yet another simple question that takes lots of hours to answer. Actually I presented How I Did It, Summary of My Journey when I gave a free lecture about CCIE at my former university. It was a 4-hour lecture, and I spent most of the time talking about myself, my experience, my story. Well, I guess that's the only thing that I'm really good at.

So here I am, in the middle of my frustration with all these back office issues. I'm fed up with trying to make the relocation process go smoothly. So what I will do now, while listening to Adam Sandler's The Wedding Singer "Somebody kill me please.. put a bullet in my head.. ", is copy and paste some bullet points from my slides here.

This might be another cruel and useless post in this blog. But frankly speaking, I don't give a damn. Relocation sucks, and The Cure rocks!

How I Did It, Summary of My Journey.

- Mechanical Engineer from ITB, admitted in 1994
- Had been dreaming of working overseas since very young
- Not the smartest student, can be considered ordinary with GPA < 3.0 (out of 4.0)
- Graduated at the hardest time to find a job in my home country, 1998 – 1999
- Applied to Schlumberger as Oil Engineer but got rejected because GPA < minimum requirement
- Worked as a mechanical engineer but was unhappy, wanted to do bigger and more interesting stuff, so quit after just a couple of months even without any job in hand
- Learned about Microsoft MCSE on campus, learned about Oracle from a freelancer
- Passed CCNA in 2000 and have been in love with networking ever since
- Got my first IT job from the same company, Schlumberger, as an engineer, just several months after getting rejected when applying as a fresh graduate relying on GPA only
- Worked in shifts in the Network Operation Center (NOC) to maintain the internal network
- Full of envy seeing other colleagues get full training and facilities to become CCIE
- Decided to sleep in the office for 9 months, abusing the copy machine to make copies of training books at night
- Passed CCNP, CCDP and the CCIE qualification test (R&S track) within the first 6 months in the company without any training, and passed before the others who were sent to the training
- Got an offer to move to IBM Global Services with triple the salary, as the only CCIE candidate in IBM Indonesia, promised full support and a lab to practice
- Found out that no lab equipment was available, so sometimes had to “borrow” from customers
- Still lived apart from family, even though the company provided a rented house, thanks to CCIE preparation
- Took the first business class flight of my life to the CCIE lab in Brussels, just to fail by 5 points
- Took another CCIE lab attempt in Tokyo 1 month later and passed, got the magic number #8171
- As the only CCIE in IBM Indonesia, worked as a white collar pre-sales consultant without hands-on work
- Got a lesson in life when I failed to troubleshoot a networking issue as a CCIE, was really embarrassed and realized that "E" doesn't always mean Expert
- Decided to move to the Middle East in early 2002 to get overseas experience and better money
- Worked for a Cisco Gold Partner in Dubai, a local company, serving local customers from governments, banks, universities, service providers etc. And working in a local company to serve local customers in a strange country is the worst you can expect in a career
- Worked as consultant, pre-sales, technical project manager, team leader, senior engineer, designer, and the guy who mounted Cisco devices in the rack
- Learned the hard way how to become a good designer, how to handle situations under pressure, how to handle customers, how to select the technology based on what the customer needs, not what the customer wants, and how to build relationships
- Passed the CCIE Security track, self-funded, spending USD 15k that was supposed to be used for master's degree fees
- Had been trying to join Cisco Systems since 2003 but couldn’t join from a partner due to some dumb rule
- Worked together with Cisco Advanced Services in 2006 to migrate a triple play network for a residential area with 50,000 subscribers, have been in love with this team ever since and was willing to join at any cost
- Quit the job in mid 2006 and started working as an independent consultant/contractor, hoping to get hired by Cisco
- Got offers from many companies with high $$$ but none from Cisco Systems Middle East
- Finally got an offer to join Cisco Advanced Services in Oct 2006 to cover Asia Pacific
- Took a 30% salary drop to join Cisco, or a 50% drop compared to the others' offers
- Living a mobile life mostly in South East Asian countries, involved in several large scale deployments and critical migration projects, e.g. Petronas, VDC, CAT, Telkomsel, Starhub
- Passed CCIE Service Provider fully funded by Cisco, 1st Indonesian Triple CCIE (and still the only one to date)
- Started feeling comfortable, able to survive the projects even without using technical skills
- Applied to the Cisco AS World Wide Service Provider practice team, a team made up only of architects and senior consultants, to cover Europe, Middle East and Africa customers
- Got accepted after the hardest technical interview, interviewed by 5 senior guys in different countries
- Worked on a CRS-1, IPv6, NGN and Metro Ethernet migration project in the Czech Republic and Slovakia
- Looking ahead for another journey in EMEA, will try to trace Jason Bourne’s path
- Writing this post in the middle of the night, trying to escape from all the relocation issues. It seems to work. Wait.... Nope :(