Linux Advanced Routing & Traffic Control HOWTO

Nguyễn Gia Hào

Academic year: 2023

Dedication

Introduction

  • Disclaimer & License
  • Prior knowledge
  • What Linux can do for you
  • Housekeeping notes
  • Access, CVS & submitting updates
  • Mailing list
  • Layout of this document

Rusty Russell's Networking Concepts HOWTO (http://netfilter.samba.org/unreliable-guides/networking-concepts-HOWTO/index.html). Should be located in /usr/doc/HOWTO/NET3-4-HOWTO.txt, but can also be found online (http://www.linuxports.com/howto/networking).

Introduction to iproute2

  • Why iproute2?
  • iproute2 tour
  • Prerequisites
  • Exploring your current configuration
    • ip shows us our links
    • ip shows us our IP addresses
    • ip shows us our routes
  • ARP

However, this is not enough, so we need instructions on how to reach the world. This is how a machine on the foo.com network is able to communicate with another machine that is on the bar.net network.

Rules - routing policy database

Simple source policy routing

It is left as an exercise for the reader to do this in ip-up.

Routing for multiple uplinks/providers

  • Split access
  • Load balancing

In both cases, you'll want to add rules that choose which provider to route out through, based on the machine's IP address on the local network. Instead of choosing one of the two providers as your default route, you now configure the default route to be a multipath route.

GRE and other tunnels

  • A few general remarks about tunnels
  • IP in IP tunneling
  • GRE tunneling
    • IPv4 Tunneling
    • IPv6 Tunneling
  • Userland tunnels

If you compiled the kernel with 'estimators', the kernel can measure for each filter how much traffic it is passing, more or less. When your network starts to get really big, or when you start treating the "Internet" as your network, you need tools that dynamically route your data.

IPv6 tunneling with Cisco and/or 6bone

IPv6 Tunneling

If you run into problems at this point, go look for documentation on compiling a Linux kernel to your own specifications. But if you don't have a Cisco at your disposal, try one of the many IPv6 tunnel brokers available on the Internet.

IPSEC: secure IP over the Internet

  • Intro with Manual Keying
  • Automatic keying
    • Theory
    • Example
    • Automatic keying using X.509 certificates
  • IPSEC tunnels
  • Other IPSEC software
  • IPSEC interoperation with other systems
    • Windows
    • Check Point VPN-1 NG

Outgoing packets are labelled with the SA SPI ('the how') which the kernel used for encryption and authentication, so that the remote can look up the corresponding verification and decryption instructions. I use 240kbit/s as the ceiling speed only because it's as high as I can set it before the latency starts to increase, due to buffers filling up somewhere between us and the remote hosts.

Queueing Disciplines for Bandwidth Management

Queues and Queueing Disciplines explained

If you have a router and want to prevent certain hosts within your network from downloading too quickly, you should do your shaping on the *inside* interface of your router, the one that sends data to your own computers. If you have a 100Mbit NIC and you have a router that has a 256kbit link, you need to make sure you are not sending more data than your router can handle.

Simple, classless Queueing Disciplines

  • pfifo_fast
  • Token Bucket Filter
  • Stochastic Fairness Queueing

tcpdump -v -v shows you the value of the entire TOS field, not just the four bits. The most important parameter of the bucket is its size, that is, the number of tokens it can store.
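The role of the bucket size can be illustrated with a small simulation. This is only a sketch of the token accounting, not the kernel's TBF: the class name `TokenBucket`, the `allow` method and the one-token-per-packet cost are assumptions made for illustration, and a real TBF delays packets rather than simply refusing them.

```python
class TokenBucket:
    """Toy token bucket: 'size' caps the burst, 'rate' refills it."""

    def __init__(self, rate, size):
        self.rate = rate      # tokens added per second
        self.size = size      # maximum number of tokens the bucket can store
        self.tokens = size    # assumption: the bucket starts full
        self.last = 0.0       # time of the previous update, in seconds

    def allow(self, now, cost=1):
        """Return True if a packet costing 'cost' tokens may pass at time 'now'."""
        elapsed = now - self.last
        # Refill for the elapsed time, but never beyond the bucket size.
        self.tokens = min(self.size, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=1.0, size=3)
burst = [bucket.allow(0.0) for _ in range(4)]   # only 'size' packets pass at once
later = [bucket.allow(2.0) for _ in range(3)]   # two seconds refill two tokens
print(burst, later)
```

A larger `size` lets a longer burst through at full speed, while `rate` bounds the long-term average.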

Advice for when to use which queue

Terminology

It is then fed into the Ingress Qdisc which can apply Filters to a packet and decide to drop it. It is therefore a very good place to remove traffic very early, without consuming too much CPU power. The packet can also be forwarded without entering an application, in which case it is destined for output.

Userspace programs can also deliver data, which is then examined and sent to the Egress Classifier.

Classful Queueing Disciplines

  • Flow within classful qdiscs & classes
  • The qdisc family: roots, handles, siblings and parents
  • The PRIO qdisc
  • The famous CBQ qdisc
  • Hierarchical Token Bucket

In this case, a filter attached to the root decided to send the packet directly to 12:2. When a packet is enqueued to the PRIO qdisc, a class is chosen based on the filter commands you gave. You can also attach other qdiscs to the 3 predefined classes, whereas pfifo_fast is limited to simple FIFO qdiscs.

Each time a packet is requested by the hardware layer to be sent to the network, a weighted round robin ("WRR") process starts, starting with the lower-numbered priority classes.

Classifying packets with filters

  • Some simple filtering examples
  • All the filtering commands you will normally need

As explained in the Classifier chapter, you can match on literally anything, using a very complicated syntax. The last command says that anything not matched so far must go to band 10:2, which has the next-highest priority. This assigns traffic to 4.3.2.1 and traffic from 1.2.3.4 to the highest priority queue and the rest to the next highest.
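The decision these filters make can be sketched in a few lines. This is an illustrative model only: the source names band 10:2 as the fallback, while the label "10:1" for the highest priority band and the names `RULES` and `classify` are assumptions.

```python
import ipaddress

# (header field, network to match, band to use); checked in order.
RULES = [
    ("dst", ipaddress.ip_network("4.3.2.1/32"), "10:1"),  # "10:1" is an assumed label
    ("src", ipaddress.ip_network("1.2.3.4/32"), "10:1"),
]
DEFAULT_BAND = "10:2"  # anything not matched by the rules above


def classify(src, dst):
    """Return the band for a packet with the given source and destination."""
    addrs = {"src": ipaddress.ip_address(src), "dst": ipaddress.ip_address(dst)}
    for field, network, band in RULES:
        if addrs[field] in network:
            return band
    return DEFAULT_BAND


print(classify("9.9.9.9", "4.3.2.1"))   # traffic to 4.3.2.1
print(classify("1.2.3.4", "8.8.8.8"))   # traffic from 1.2.3.4
print(classify("5.5.5.5", "6.6.6.6"))   # everything else
```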

If you don't want to understand the full tc filter syntax, just use iptables, and just learn to select on fwmark.

The Intermediate queueing device (IMQ)

  • Sample configuration

Load sharing over multiple interfaces

Caveats

While this isn't a problem for links carrying many different TCP/IP sessions, you won't be able to bundle multiple links and get to ftp a single file much faster unless your receiving or sending OS is Linux, which isn't easily shaken by some simple reordering.

Other possibilities

Netfilter & iproute - marking packets

Advanced filters for (re-)classifying packets

The u32 classifier

  • U32 selector
  • General selectors
  • Specific selectors

The U32 selector contains the definition of the pattern that will be matched against the currently processed packet. The at keyword means that the match must start at the specified offset (in bytes) -- in this case it is the start of the packet. If the nexthdr+ keyword is given, the offset is relative to the start of the upper layer header.

The TOS field starts at the second byte of the packet and is one byte in size, so we could write an equivalent general selector: match u8 0x10 0xff at 1.
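A selector of this kind boils down to a masked comparison of one byte at a fixed offset; here the offset is 1, the second byte of the packet. The sketch below models that idea in Python. The helper names `ipv4_header` and `match_u8` are hypothetical, and the header it builds carries placeholder addresses and a zero checksum.

```python
import struct


def ipv4_header(tos, proto=6):
    """Build a minimal 20-byte IPv4 header (checksum left at 0 for brevity)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                  # version 4, header length 5 words
        tos,                   # TOS: the second byte, offset 1
        20, 0, 0,              # total length, identification, flags/fragment
        64, proto, 0,          # TTL, protocol, checksum
        bytes([10, 0, 0, 1]),  # source address (placeholder)
        bytes([10, 0, 0, 2]),  # destination address (placeholder)
    )


def match_u8(packet, value, mask, at):
    """Compare one byte at offset 'at' with 'value' under 'mask'."""
    return (packet[at] & mask) == (value & mask)


print(match_u8(ipv4_header(tos=0x10), 0x10, 0xFF, at=1))
print(match_u8(ipv4_header(tos=0x08), 0x10, 0xFF, at=1))
```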

The route classifier

This gives us a hint about the internals of the U32 filter -- specific rules are always translated into general ones, and in this form they are stored in kernel memory. This leads to another conclusion -- the tcp and udp selectors are exactly the same, and that's why you can't use a single match tcp dport 53 0xffff selector to match TCP packets sent to the given port -- it will also match UDP packets sent to this port.
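The reason is that TCP and UDP both keep the destination port in the same place, two bytes into the transport header, so a port-only comparison cannot tell them apart. The sketch below demonstrates this on hand-built packets; `build_packet`, `dport_is` and `tcp_dport_is` are hypothetical helpers, and only the first four bytes of the transport header are modelled.

```python
import struct

TCP, UDP = 6, 17  # IP protocol numbers


def build_packet(proto, sport, dport):
    """IPv4 header followed by the source and destination ports only."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 24, 0, 0, 64, proto, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    return ip + struct.pack("!HH", sport, dport)


def nexthdr(packet):
    """Offset of the upper layer header, taken from the IHL field."""
    return (packet[0] & 0x0F) * 4


def dport_is(packet, port, mask=0xFFFF):
    """Port-only match: a 16-bit compare at nexthdr+2."""
    (field,) = struct.unpack_from("!H", packet, nexthdr(packet) + 2)
    return (field & mask) == (port & mask)


def tcp_dport_is(packet, port):
    """Also check the protocol byte (offset 9) so UDP is excluded."""
    return packet[9] == TCP and dport_is(packet, port)


tcp_pkt = build_packet(TCP, 1024, 53)
udp_pkt = build_packet(UDP, 1024, 53)
print(dport_is(tcp_pkt, 53), dport_is(udp_pkt, 53))          # both match
print(tcp_dport_is(tcp_pkt, 53), tcp_dport_is(udp_pkt, 53))  # only TCP matches
```

Adding the protocol check is what restricts the match to TCP.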

Policing filters

  • Ways to police
  • Overlimit actions
  • Examples

The TBF only matches traffic UP TO your configured bandwidth; if more is offered, only the excess is subject to the configured overlimit action. Either the flow remains below the average rate and the filter classifies the traffic to the configured classid, or the rate exceeds it, in which case the specified action is taken, which is 'reclassify' by default. For example, you may have a name server that falls over if more than 5 mbit/s of packets are served, in which case an ingress filter can be used to ensure that no more is ever offered.

The only real known example is mentioned in the "Protecting your host from SYN floods" section.

Hashing filters for very fast massive filtering

The configuration is quite complex, but well worth it once you have this many rules. Then we select the source address, which lives at positions 12, 13, 14 and 15 in the IP header, and indicate that we are only interested in the last part. It is quite complicated, but in practice it works and the efficiency will be amazing.

Note that this example could be improved to an ideal case where each chain contains 1 filter.
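The idea behind hashing filters can be sketched as a table keyed on the last byte of the source address (bytes 12 to 15 of the IP header), so that a lookup only walks one short chain instead of every rule. This is a conceptual model rather than the kernel's data structure; the class `HashedFilters`, its methods and the flow ids are hypothetical.

```python
import ipaddress


def bucket_for(src):
    """Hash key: the last octet of the IPv4 source address (0..255)."""
    return int(ipaddress.ip_address(src)) & 0x000000FF


class HashedFilters:
    def __init__(self):
        self.table = {}  # bucket number -> chain of (address, flowid) rules

    def add(self, src, flowid):
        chain = self.table.setdefault(bucket_for(src), [])
        chain.append((ipaddress.ip_address(src), flowid))

    def lookup(self, src):
        """Walk only the chain selected by the hash key."""
        addr = ipaddress.ip_address(src)
        for rule_addr, flowid in self.table.get(bucket_for(src), []):
            if rule_addr == addr:
                return flowid
        return None


filters = HashedFilters()
filters.add("1.2.0.123", "1:1")
filters.add("1.2.1.123", "1:2")   # same last octet, so same chain
filters.add("1.2.3.124", "1:3")
print(filters.lookup("1.2.1.123"), len(filters.table[123]))
```

Here the chain for bucket 123 holds two rules; hashing on a key that spreads the rules further would bring each chain closer to the ideal of one filter.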

Filtering IPv6 Traffic

  • How come that IPv6 tc filters do not work?
  • Marking IPv6 packets using ip6tables
  • Using the u32 selector to match IPv6 packet

Kernel network parameters

Reverse Path Filtering

Obscure settings

  • Generic ipv4
  • Per device settings
  • Neighbor policy
  • Routing settings
  • Parameters & usage

Clark-Shenker-Zhang algorithm (CSZ)

DSMARK

  • Introduction
  • What is Dsmark related to?
  • Differentiated Services guidelines
  • Working with Dsmark
  • How SCH_DSMARK works
  • TC_INDEX Filter

Ingress qdisc

  • Parameters & usage

Random Early Detection (RED)

Generic Random Early Detection

VC/ATM emulation

Weighted Round Robin (WRR)

Cookbook

Running multiple sites with different SLAs

Protecting your host from SYN floods

Rate limit ICMP to prevent dDoS

You can perform measurements with tcpdump, letting it write to a file for a while and seeing how much ICMP gets through your network.

Prioritizing interactive traffic

The most common use is to set telnet and ftp control connections to "Minimum Delay" and FTP data to "Maximum Throughput". The other direction appears to be done for you, i.e. telnet, ssh and friends all set the TOS field on outgoing packets automatically. If you have an app that doesn't do this, you can always do it with netfilter.
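For comparison, this is roughly what an application does when it sets the TOS field itself. The sketch assumes a Linux host, where Python's `socket.IP_TOS` option is available, and uses the value 0x10 (minimize delay); the function name is hypothetical.

```python
import socket

MINIMIZE_DELAY = 0x10  # TOS value for low-latency, interactive traffic


def open_low_latency_udp_socket():
    """Create a UDP socket whose outgoing packets carry TOS 0x10."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, MINIMIZE_DELAY)
    return sock


sock = open_low_latency_udp_socket()
print(hex(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)))
sock.close()
```

An application that cannot be changed this way is the case where marking the packets with netfilter is the alternative.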

Transparent web-caching using netfilter, iproute2, ipchains and squid

  • Traffic flow diagram after implementation

There are 3 common methods to ensure that "outbound" port 80 traffic is directed to the server running squid, and a fourth is being introduced here. One is to tell your gateway router to send packets that have an outgoing destination port of 80 to the IP address of the squid server. Silom's default gateway must be donmuang or it would create a loop of Internet traffic. (All servers in my network had 10.0.0.1 as the default gateway, which was the previous IP address of the donmuang router, so what I did was change the donmuang IP address to 10.0.0.3 and give naret the IP address 10.0.0.1.)

Set up the squid server on silom and make sure it supports transparent caching/proxying. The default port is usually 3128, so all traffic for port 80 should be redirected locally to port 3128.

Circumventing Path MTU Discovery issues with per route MTU settings

  • Solution

What has happened now is that Path MTU Discovery works less and less well and fails for certain routes, leading to strange TCP/IP sessions that die after a while. When you encounter sites that suffer from this problem, you can disable Path MTU Discovery by setting the MTU manually. The following problem: I set the mtu/mru of my leased line running ppp to 296 because it's only 33k6 and I can't affect the queue on the other side.

Circumventing Path MTU Discovery issues with MSS Clamping (for ADSL, cable, PPPoE & PPtP users)

The Ultimate Traffic Conditioner: Low Latency, Fast Up & Downloads

  • Why it doesn’t work well by default
  • The actual script (CBQ)
  • The actual script (HTB)

What remains to be done is to ensure that the interactive traffic jumps to the front of the upstream queue. To ensure that uploads do not hurt downloads, we also move ACK packets to the front of the queue. This is the lower limit of latency you can achieve - divide your MTU by your upstream speed to calculate.
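The division can be written out as a small calculation. The figures used here, a 1500-byte MTU and a 128 kbit/s upstream, are assumed values for illustration and do not come from the text.

```python
def serialization_delay_ms(mtu_bytes, upstream_bits_per_second):
    """Time needed to push one full-size packet onto the upstream link."""
    return mtu_bytes * 8 / upstream_bits_per_second * 1000


# Assumed example: 1500-byte MTU on a 128 kbit/s upstream.
print(serialization_delay_ms(1500, 128_000), "ms")
```

An interactive packet that arrives just after a full-size packet has started transmitting has to wait about this long, however the queue is ordered.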

You can improve this script by adding 'bounded' to the line starting with 'tc class add.

Rate limiting a single host or netmask

Example of a full nat solution with QoS

  • Let’s begin optimizing that scarce bandwidth
  • Classifying packets
  • Improving our setup
  • Making all of the above start at boot

Using the Border Gateway Protocol for interdomain routing (http://www.cisco.com/univercd/cc/td/doc/cisintwk/ics/icsbgp4.htm). An excellent HOWTO on VLANs can be found here (http://scry.wanfear.com/~greear/vlan/cisco_howto.html). Virtual Router Redundancy Protocol implementation (site1 (http://off.net/~jme/vrrpd/), site2 (http://www.imagestream.com/VRRP.html)).

Slides by Jamal Hadi Salim, one of the authors of Linux traffic control http://defiant.coinet.com/iproute2/ip-cref/.

Building bridges, and pseudo-bridges with Proxy ARP

State of bridging and iptables

You can also see 'ebtables' mentioned which is another project - it allows you to do wild things like MACNAT and 'brouting'.

Bridging and shaping

Pseudo-bridges with Proxy-ARP

  • ARP & Proxy-ARP
  • Implementing it

When we build a Pseudo bridge, we instruct the bridge to respond to these ARP packets, causing the hosts in the network to send their packets to the bridge. So in short, every time a host on one side of the bridge asks for the hardware address of a host on the other, the bridge responds with a packet saying 'hand it to me'. In this way, all data traffic is transferred to the right place and always passes through the bridge.

With Linux 2.4/2.5 (and possibly 2.2) this option has been retired and has been replaced by a flag in the /proc directory called 'proxy_arp'.

Setting up OSPF with Zebra

  • Prerequisites
  • Configuring Zebra
  • Running Zebra

Don't be intimidated by this diagram: zebra does most of the work automatically, so it won't take any work to set up all the routes with zebra. Now we need to edit ospfd.conf if you are still using IPv4, or ospf6d.conf if you are using IPv6. It's very nice to see the routes appear just a few seconds after running zebra and ospfd.

Zebra handles routing automatically: you just add another router to the network, configure zebra, and voila.

Setting up BGP4 with Zebra

  • Network Map (Example)
  • Configuration (Example)
  • Checking Configuration

Other possibilities

Configuring CBQ can be a bit daunting, especially if all you want to do is shape a few computers behind a router. Originally intended purely for routers, which need constant MAC addresses, it also works for other servers. Now disconnect that computer from the network and very soon one of the other computers will assume the 10.0.0.22 address as well as the MAC address.

Just after packet 4, I disconnected my P200 from the network and my 486 took over, which you can tell from the higher latency.

Further reading

Internet QoS: Architectures and Mechanisms for Quality of Service, Zheng Wang, ISBN Hardcover textbook covering topics related to Quality of Service.

Acknowledgements
