iproute2

Commit Graph

Author	SHA1	Message	Date
Stephen Hemminger	5f1df307b4	config: put CFLAGS/LDLIBS in config.mk This renames Config to config.mk and includes more Make input. Now configure generates all the required CFLAGS and LDLIBS for the optional libraries. Also, use pkg-config to test for libelf, rather than using a test program. This makes it consistent with other libraries. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-08-23 10:03:09 -07:00
Daniel Borkmann	8cc360fe48	bpf: unbreak libelf linkage for bpf obj loader Commit `69fed534a5` ("change how Config is used in Makefile's") moved HAVE_MNL specific CFLAGS/LDLIBS for building with libmnl out of the top level Makefile into sub-Makefiles. However, it also removed the HAVE_ELF specific CFLAGS/LDLIBS entirely, which breaks the BPF object loader for tc and ip with "No ELF library support compiled in." despite having libelf detected in configure script. Fix it similarly as in `69fed534a5` for HAVE_ELF. Fixes: `69fed534a5` ("change how Config is used in Makefile's") Reported-by: Jeffrey Panneman <jeffrey.panneman@tno.nl> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-08-10 16:40:02 -07:00
Stephen Hemminger	6ff66acc60	tc, ip: more Makefile updates for LIBMNL Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-08-09 08:38:51 -07:00
Roman Mashak	cba134ae70	tc: fix Makefile to build skbmod Signed-off-by: Roman Mashak <mrv@mojatatu.com>	2017-05-22 13:33:51 -07:00
Amir Vadai	f3e1b2448a	pedit: Introduce ipv6 support Add support for modifying IPv6 headers using pedit. Signed-off-by: Amir Vadai <amir@vadai.me>	2017-05-15 15:05:20 -07:00
Amir Vadai	3cd5149ecd	tc/pedit: p_eth: ETH header editor For example, forward tcp traffic to veth0 and set destination mac address to 11:22:33:44:55:66 : $ tc filter add dev enp0s9 protocol ip parent ffff: \ flower \ ip_proto tcp \ action pedit ex munge \ eth dst set 11:22:33:44:55:66 \ action mirred egress \ redirect dev veth0 Signed-off-by: Amir Vadai <amir@vadai.me>	2017-05-01 09:22:16 -07:00
Jiri Kosina	be67f81297	iproute2: tc: introduce build dependency on libnetlink Rebuilding libnetlink doesn't trigger rebuild of tc, which is wrong (especially so for builds where libnetlink.a gets statically linked into tc). Fix that by introducing an explicit dependency. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2017-02-24 15:11:32 -08:00
Yotam Gigi	0b1abd84fb	tc: Add support for the sample tc action The sample tc action allows sampling packets matching a classifier. It peeks randomly packets, and samples them using the psample netlink channel. The user can specify the psample group, which the packet will be sampled to, the sampling rate and the packet truncation (to save kernel-user traffic). The sampled packets contain informative metadata, for example, the input interface and the original packet length. The action syntax: tc filter add [...] \ action sample rate <RATE> group <GROUP> [trunc <SIZE>] [...] Where: RATE := The sampling rate which is the ratio of packets observed at the data source to the samples generated GROUP := the psample module sampling group SIZE := optional truncation size An example for a common usecase of the sample tc action: to sample ingress traffic from interface eth1, one may use the commands: tc qdisc add dev eth1 handle ffff: ingress tc filter add dev eth1 parent ffff: \ matchall action sample rate 12 group 4 Where the first command adds an ingress qdisc and the second starts sampling randomly with an average of one sampled packet per 12 packets on dev eth1 to psample group 4. Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com>	2017-02-06 14:24:52 -08:00
David Michael	bb18c98198	tc: make tc linking depend on libtc.a There was a race condition where the command to link the tc binary could (rarely) run before the libtc.a archive existed.	2017-01-09 12:06:58 -08:00
Amir Vadai	d57639a475	tc/act_tunnel: Introduce ip tunnel action This action could be used before redirecting packets to a shared tunnel device, or when redirecting packets arriving from such a device. The 'unset' action is optional. It is used to explicitly unset the metadata created by the tunnel device during decap. If not used, the metadata will be released automatically by the kernel. The 'set' operation will set the metadata with the specified values for the encap. For example, the following flower filter will forward all ICMP packets destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before redirecting, metadata for the vxlan tunnel is created using the tunnel_key action and its arguments: $ tc filter add dev net0 protocol ip parent ffff: \ flower \ ip_proto 1 \ dst_ip 11.11.11.2 \ action tunnel_key set \ src_ip 11.11.0.1 \ dst_ip 11.11.0.2 \ id 11 \ action mirred egress redirect dev vxlan0 Signed-off-by: Amir Vadai <amir@vadai.me>	2016-12-02 14:12:09 -08:00
Daniel Borkmann	e42256699c	bpf: make tc's bpf loader generic and move into lib This work moves the bpf loader into the iproute2 library and reworks the tc specific parts into generic code. It's useful as we can then more easily support new program types by just having the same ELF loader backend. Joint work with Thomas Graf. I hacked a rough start of a test suite to make sure nothing breaks [1] and looks all good. [1] https://github.com/borkmann/clsact/blob/master/test_bpf.sh Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Thomas Graf <tgraf@suug.ch>	2016-11-29 12:35:32 -08:00
Daniel Borkmann	4710e46ec3	tc, ipt: don't enforce iproute2 dependency on iptables-devel Since `5cd1adba79` ("Update to current iptables headers") compilation of iproute2 broke for systems without iptables-devel package [1]. Reason is that even though we fall back to build m_ipt.c, the include depends on a xtables-version.h header, which only ships with iptables-devel. Machines not having this package fail compilation with: [...] CC m_ipt.o In file included from ../include/iptables.h:5:0, from m_ipt.c:17: ../include/xtables.h:34:29: fatal error: xtables-version.h: No such file or directory compilation terminated. ../Config:31: recipe for target 'm_ipt.o' failed make[1]: *** [m_ipt.o] Error 1 The configure script only barks that package xtables was not found in the pkg-config search path. The generated Config then only contains f.e. TC_CONFIG_IPSET. In tc's Makefile we thus fall back to adding m_ipt.o to TCMODULES. m_ipt.c then includes the local include/iptables.h header copy, which includes the include/xtables.h copy. Latter then includes xtables-version.h, which only ships with iptables-devel. One way to resolve this is to skip this whole mess when pkg-config has no xtables config available. I've carried something along these lines locally for a while now, but it's just too annoying. :/ Build works fine now also when xtables.pc is not available. [1] http://www.spinics.net/lists/netdev/msg366162.html Fixes: `5cd1adba79` ("Update to current iptables headers") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2016-10-26 10:58:22 -07:00
Yotam Gigi	d5cbf3ff05	tc: Add support for the matchall traffic classifier. The matchall classifier matches every packet and allows the user to apply actions on it. In addition, it supports the skip_sw and skip_hw (as can be found on u32 and flower filter) that direct the kernel to skip the software/hardware processing of the actions. This filter is very useful in usecases where every packet should be matched. For example, packet mirroring (SPAN) can be setup very easily using that filter. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com>	2016-09-01 08:37:01 -07:00
David Ahern	57bdf8b764	Make builds default to quiet mode Similar to the Linux kernel and perf add infrastructure to reduce the amount of output tossed to a user during a build. Full build output can be obtained with 'make V=1' Builds go from: make[1]: Leaving directory `/home/dsa/iproute2.git/lib' make[1]: Entering directory `/home/dsa/iproute2.git/ip' gcc -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -c -o ip.o ip.c gcc -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -c -o ipaddress.o ipaddress.c to: ... AR libutil.a ip CC ip.o CC ipaddress.o ... Signed-off-by: David Ahern <dsa@cumulusnetworks.com>	2016-05-31 12:13:07 -07:00
Jamal Hadi Salim	d3e511223f	tc: introduce IFE action This action allows for a sending side to encapsulate arbitrary metadata which is decapsulated by the receiving end. The sender runs in encoding mode and the receiver in decode mode. Both sender and receiver must specify the same ethertype. At some point we hope to have a registered ethertype and we'll then provide a default so the user doesnt have to specify it. For now we enforce the user specify it. Described in netdev01 paper: "Distributing Linux Traffic Control Classifier-Action Subsystem" Authors: Jamal Hadi Salim and Damascene M. Joachimpillai Also refer to IETF draft-ietf-forces-interfelfb-04.txt Lets show example usage where we encode icmp from a sender towards a receiver with an skbmark of 17; both sender and receiver use ethertype of 0xdead to interop. YYYY: Lets start with Receiver-side policy config: xxx: add an ingress qdisc sudo tc qdisc add dev $ETH ingress xxx: any packets with ethertype 0xdead will be subjected to ife decoding xxx: we then restart the classification so we can match on icmp at prio 3 sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \ u32 match u32 0 0 flowid 1:1 \ action ife decode reclassify xxx: on restarting the classification from above if it was an icmp xxx: packet, then match it here and continue to the next rule at prio 4 xxx: which will match based on skb mark of 17 sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \ u32 match ip protocol 1 0xff flowid 1:1 \ action continue xxx: match on skbmark of 0x11 (decimal 17) and accept sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \ handle 0x11 fw flowid 1:1 \ action ok xxx: Lets show the decoding policy sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead xxx: filter pref 2 u32 filter pref 2 u32 fh 800: ht divisor 1 filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 (rule hit 0 success 0) match 00000000/00000000 at 0 (success 0 ) action order 1: ife decode action 
reclassify type 0x0 allow mark allow prio index 11 ref 1 bind 1 installed 45 sec used 45 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 xxx: Observe that above lists all metadatum it can decode. Typically these submodules will already be compiled into a monolithic kernel or loaded as modules YYYY: Lets show the sender side now .. xxx: Add an egress qdisc on the sender netdev sudo tc qdisc add dev $ETH root handle 1: prio xxx: xxx: Match all icmp packets to 192.168.122.237/24, then xxx: tag the packet with skb mark of decimal 17, then xxx: Encode it with: xxx: ethertype 0xdead xxx: add skb->mark to whitelist of metadatum to send xxx: rewrite target dst MAC address to 02:15:15:15:15:15 xxx: sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 u32 \ match ip dst 192.168.122.237/24 \ match ip protocol 1 0xff \ flowid 1:2 \ action skbedit mark 17 \ action ife encode \ type 0xDEAD \ allow mark \ dst 02:15:15:15:15:15 xxx: Lets show the encoding policy filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2 (rule hit 118 success 0) match c0a87a00/ffffff00 at 16 (success 0 ) match 00010000/00ff0000 at 8 (success 0 ) action order 1: skbedit mark 17 index 11 ref 1 bind 1 installed 3 sec used 3 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 2: ife encode action pipe type 0xDEAD allow mark dst 02:15:15:15:15:15 index 12 ref 1 bind 1 installed 3 sec used 3 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 xxx: Now test by sending ping from sender to destination Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2016-05-16 11:13:26 -07:00
Daniel Borkmann	8f9afdd531	tc, clsact: add clsact frontend Add the tc part for the kernel commit 1f211a1b929c ("net, sched: add clsact qdisc"). Quoting example usage from that commit description: Example, adding qdisc: # tc qdisc add dev foo clsact # tc qdisc show dev foo qdisc mq 0: root qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc clsact ffff: parent ffff:fff1 Adding filters (deleting, etc works analogous by specifying ingress/egress): # tc filter add dev foo ingress bpf da obj bar.o sec ingress # tc filter add dev foo egress bpf da obj bar.o sec egress # tc filter show dev foo ingress filter protocol all pref 49152 bpf filter protocol all pref 49152 bpf handle 0x1 bar.o:[ingress] direct-action # tc filter show dev foo egress filter protocol all pref 49152 bpf filter protocol all pref 49152 bpf handle 0x1 bar.o:[egress] direct-action The ingress parent alias can also be used with ingress qdisc. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2016-01-18 11:41:27 -08:00
Jiri Pirko	30eb304ecd	tc: add support for Flower classifier Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2015-05-21 15:22:49 -07:00
Daniel Borkmann	4bd624467b	tc: built-in eBPF exec proxy This work follows upon commit `6256f8c9e4` ("tc, bpf: finalize eBPF support for cls and act front-end") and takes up the idea proposed by Hannes Frederic Sowa to spawn a shell (or any other command) that holds generated eBPF map file descriptors. File descriptors, based on their id, are being fetched from the same unix domain socket as demonstrated in the bpf_agent, the shell spawned via execvpe(2) and the map fds passed over the environment, and thus are made available to applications in the fashion of std{in,out,err} for read/write access, for example in case of iproute2's examples/bpf/: # env | grep BPF BPF_NUM_MAPS=3 BPF_MAP1=6 <- BPF_MAP_ID_QUEUE (id 1) BPF_MAP0=5 <- BPF_MAP_ID_PROTO (id 0) BPF_MAP2=7 <- BPF_MAP_ID_DROPS (id 2) # ls -la /proc/self/fd [...] lrwx------. 1 root root 64 Apr 14 16:46 0 -> /dev/pts/4 lrwx------. 1 root root 64 Apr 14 16:46 1 -> /dev/pts/4 lrwx------. 1 root root 64 Apr 14 16:46 2 -> /dev/pts/4 [...] lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map The advantage (as opposed to the direct/native usage) is that now the shell is map fd owner and applications can terminate and easily reattach to descriptors w/o any kernel changes. Moreover, multiple applications can easily read/write eBPF maps simultaneously. To further allow users for experimenting with that, next step is to add a small helper that can get along with simple data types, so that also shell scripts can make use of bpf syscall, f.e to read/write into maps. Generally, this allows for prepopulating maps, or any runtime altering which could influence eBPF program behaviour (f.e. different run-time classifications, skb modifications, ...), dumping of statistics, etc.
Reference: http://thread.gmane.org/gmane.linux.network/357471/focus=357860 Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com>	2015-04-27 16:39:23 -07:00
Felix Fietkau	b8d5c9a71b	tc: add support for connmark action Add ability to add the netfilter connmark support. Typical usage: ...lets tag outgoing icmp with mark 0x10.. iptables -tmangle -A PREROUTING -p icmp -j CONNMARK --set-mark 0x10 ..add on ingress of $ETH an extractor for connmark... tc filter add dev $ETH parent ffff: prio 4 protocol ip \ u32 match ip protocol 1 0xff \ flowid 1:1 \ action connmark continue ...if the connmark was 0x11, we police to a ridic rate of 10Kbps tc filter add dev $ETH parent ffff: prio 5 protocol ip \ handle 0x11 fw flowid 1:1 \ action police rate 10kbit burst 10k Other ways to use the connmark is to supply the zone, index and branching choice. Refer to help. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2015-04-13 10:49:45 -07:00
Daniel Borkmann	11c39b5e98	tc: add eBPF support to f_bpf This work adds the tc frontend for kernel commit e2e9b6541dd4 ("cls_bpf: add initial eBPF support for programmable classifiers"). A C-like classifier program (f.e. see e2e9b6541dd4) is being compiled via LLVM's eBPF backend into an ELF file, that is then being passed to tc. tc then loads, if any, eBPF maps and eBPF opcodes (with fixed-up eBPF map file descriptors) out of its dedicated sections, and via bpf(2) into the kernel and then the resulting fd via netlink down to cls_bpf. cls_bpf allows for annotations, currently, I've used the file name for that, so that the user can easily identify his filter when dumping configurations back. Example usage: clang -O2 -emit-llvm -c cls.c -o - | llc -march=bpf -filetype=obj -o cls.o tc filter add dev em1 parent 1: bpf run object-file cls.o classid x:y tc filter show dev em1 [...] filter parent 1: protocol all pref 49152 bpf handle 0x1 flowid x:y cls.o I placed the parser bits derived from Alexei's kernel sample, into tc_bpf.c as my next step is to also add the same support for BPF action, so we can have a fully fledged eBPF classifier and action in tc. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com>	2015-03-24 15:45:23 -07:00
Jiri Pirko	86ab59a666	tc: add support for BPF based actions Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2015-02-05 10:38:13 -08:00
Jiri Pirko	1d129d191a	tc: push bpf common code into separate file Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2015-02-05 10:38:13 -08:00
Vadim Kochan	67e1d73be1	tc: Allow to easy change network namespace Added new '-netns' option to simplify executing following cmd: ip netns exec NETNS tc OPTIONS COMMAND OBJECT to tc -n[etns] NETNS OPTIONS COMMAND OBJECT e.g.: tc -net vnet0 qdisc Signed-off-by: Vadim Kochan <vadim4j@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2014-12-27 10:22:34 -08:00
Jiri Pirko	8b1c0216d8	tc: add support for vlan tc action Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Cong Wang <cwang@twopensource.com>	2014-12-03 09:29:21 -08:00
Terry Lam	ac74bd2a71	support for Heavy Hitter Filter (HHF) qdisc $tc qdisc add dev eth0 hhf help Usage: ... hhf [ limit PACKETS ] [ quantum BYTES] [ hh_limit NUMBER ] [ reset_timeout TIME ] [ admit_bytes BYTES ] [ evict_timeout TIME ] [ non_hh_weight NUMBER ] $tc -s -d qdisc show dev eth0 qdisc hhf 8005: root refcnt 32 limit 1000p quantum 1514 hh_limit 2048 reset_timeout 40.0ms admit_bytes 131072 evict_timeout 1.0s non_hh_weight 2 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 drop_overlimit 0 hh_overlimit 0 tot_hh 0 cur_hh 0 HHF qdisc parameters: - limit: max number of packets in qdisc (default 1000) - quantum: max deficit per RR round (default 1 MTU) - hh_limit: max number of HHs to keep states (default 2048) - reset_timeout: time to reset HHF counters (default 40ms) - admit_bytes: counter thresh to classify as HH (default 128KB) - evict_timeout: threshold to evict idle HHs (default 1s) - non_hh_weight: DRR weight for mice (default 2) Signed-off-by: Terry Lam <vtlam@google.com>	2014-05-09 12:10:47 -07:00
Vijay Subramanian	80dd880dd0	PIE: Proportional Integral controller Enhanced Proportional Integral controller Enhanced (PIE) is a scheduler to address the bufferbloat problem. We present here a lightweight design, PIE(Proportional Integral controller Enhanced) that can effectively control the average queueing latency to a target value. Simulation results, theoretical analysis and Linux testbed results have shown that PIE can ensure low latency and achieve high link utilization under various congestion situations. The design does not require per-packet timestamp, so it incurs very small overhead and is simple enough to implement in both hardware and software. " For more information, please see technical paper about PIE in the IEEE Conference on High Performance Switching and Routing 2013. A copy of the paper can be found at ftp://ftpeng.cisco.com/pie/. Please also refer to the IETF draft submission at http://tools.ietf.org/html/draft-pan-tsvwg-pie-00 All relevant code, documents and test scripts and results can be found at ftp://ftpeng.cisco.com/pie/. For problems with the iproute2/tc or Linux kernel code, please contact Vijay Subramanian (vijaynsu@cisco.com or subramanian.vijay@gmail.com) Mythili Prabhu (mysuryan@cisco.com) Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: Mythili Prabhu <mysuryan@cisco.com> CC: Dave Taht <dave.taht@bufferbloat.net>	2014-01-09 22:50:47 -08:00
Daniel Borkmann	d05df6861f	tc: add cls_bpf frontend This is the iproute2 part of the kernel patch "net: sched: add BPF-based traffic classifier". [Will re-submit later again for iproute2 when window for -next submissions opens.] Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Graf <tgraf@suug.ch>	2013-10-30 16:45:05 -07:00
Jamal Hadi Salim	087f46ee4e	tc: introduce simple action Simple action is already in the kernel for years now as an example. This complements it with user space control. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2013-09-30 21:29:34 -07:00
Eric Dumazet	bc113e46a3	pkt_sched: fq: Fair Queue packet scheduler Support for FQ packet scheduler $ tc qd add dev eth0 root fq help Usage: ... fq [ limit PACKETS ] [ flow_limit PACKETS ] [ quantum BYTES ] [ initial_quantum BYTES ] [ maxrate RATE ] [ buckets NUMBER ] [ [no]pacing ] $ tc -s -d qd qdisc fq 8002: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 256 quantum 3028 initial_quantum 15140 Sent 216532416 bytes 148395 pkt (dropped 0, overlimits 0 requeues 14) backlog 0b 0p requeues 14 511 flows (511 inactive, 0 throttled) 110 gc, 0 highprio, 0 retrans, 1143 throttled, 0 flows_plimit limit : max number of packets on whole Qdisc (default 10000) flow_limit : max number of packets per flow (default 100) quantum : the max deficit per RR round (default is 2 MTU) initial_quantum : initial credit for new flows (default is 10 MTU) maxrate : max per flow rate (default : unlimited) buckets : number of RB trees (default : 1024) in hash table. (consumes 8 bytes per bucket) [no]pacing : disable/enable pacing (default is enable) Usage : tc qdisc add dev $ETH root fq tc qdisc del dev $ETH root 2>/dev/null tc qdisc add dev $ETH root handle 1: mq for i in `seq 1 4` do tc qdisc add dev $ETH parent 1:$i est 1sec 4sec fq done Signed-off-by: Eric Dumazet <edumazet@google.com>	2013-09-20 09:43:40 -07:00
Benjamin Poirier	5ab3a4de5e	Use pkg-config to obtain xtables.h path On openSUSE 12.2 (at least) xtables.h is not installed in the system-wide include dir but in /usr/include/iptables-1.4.16.3/. This results in the following build failure: em_ipset.c:26:21: fatal error: xtables.h: No such file or directory Other includers of xtables.h already call out to pkg-config	2013-02-11 09:19:54 -08:00
Mike Frysinger	e4fc4ada33	allow pkg-config to be customized Rather than hard coding `pkg-config`, use ${PKG_CONFIG} so people can override it to their specific version (like when cross-compiling). This is the same way the upstream pkg-config code works. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2012-11-11 16:21:34 -08:00
Matt Burgess	92905c6e0d	iproute2-3.6.0 assumes presence of iptables Hi, When compiling iproute2-3.6.0 on a host that doesn't have iptables available, I get the following error: gcc -Wall -Wstrict-prototypes -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -DCONFIG_GACT -DCONFIG_GACT_PROB -DYY_NO_INPUT -c -o em_ipset.o em_ipset.c em_ipset.c:26:21: fatal error: xtables.h: No such file or directory Fixed by the following patch, which guards the building of em_ipset.o on the presence of suitable headers. Thanks, Matt.	2012-10-03 08:51:29 -07:00
Rostislav Lisovy	7b5f30e14f	Ematch used to classify CAN frames according to their identifiers This ematch enables effective filtering of CAN frames (AF_CAN) based on CAN identifiers with masking of compared bits. Implementation utilizes bitmap based classification for standard frame format (SFF) which is optimized for minimal overhead. Signed-off-by: Rostislav Lisovy <lisovy@gmail.com>	2012-08-20 13:11:55 -07:00
Florian Westphal	8194411a42	tc: add ipset ematch example usage: tc filter add dev $dev parent $id: basic match not ipset'(foobar src)' .. also updates iproute2/ematch_map, else tc complains: Error: Unable to find ematch "ipset" in /etc/iproute2/ematch_map Please assign a unique ID to the ematch kind the suggested entry is: 8 ipset when trying to use this ematch. (text ematch (5) only exists in kernel, a vlan ematch (6) exists neither in kernel nor userspace, but kernel headers define TCF_EM_VLAN == 6).	2012-08-13 08:33:50 -07:00
Eric Dumazet	c3524efc14	fq_codel: Fair Queue Codel AQM Fair Queue Codel packet scheduler Principles : - Packets are classified (internal classifier or external) on flows. - This is a Stochastic model (as we use a hash, several flows might be hashed on same slot) - Each flow has a CoDel managed queue. - Flows are linked onto two (Round Robin) lists, so that new flows have priority on old ones. - For a given flow, packets are not reordered (CoDel uses a FIFO) - head drops only. - ECN capability is on by default. - Very low memory footprint (64 bytes per flow) tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ] [ target TIME ] [ interval TIME ] [ noecn ] [ quantum BYTES ] Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Kathleen Nichols <nichols@pollere.com> Cc: Van Jacobson <van@pollere.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Changli Gao <xiaosuo@gmail.com>	2012-05-22 14:17:49 -07:00
Eric Dumazet	185d88f99b	tc_codel: Controlled Delay AQM An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson. http://queue.acm.org/detail.cfm?id=2209336 This AQM main input is no longer queue size in bytes or packets, but the delay packets stay in (FIFO) queue. As we don't have infinite memory, we still can drop packets in enqueue() in case of massive load, but mean of CoDel is to drop packets in dequeue(), using a control law based on two simple parameters : target : target sojourn time (default 5ms) interval : width of moving time window (default 100ms) Selected packets are dropped, unless ECN is enabled and packets can get ECN mark instead. Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ] [ interval TIME ] [ ecn ] qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0) rate 202365Kbit 16708pps backlog 113550b 75p requeues 0 count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us maxpacket 1514 ecn_mark 84399 drop_overlimit 0 CoDel must be seen as a base module, and should be used keeping in mind there is still a FIFO queue. So a typical setup will probably need a hierarchy of several qdiscs and packet classifiers to be able to meet whatever constraints a user might have. One possible example would be to use fq_codel, which combines Fair Queueing and CoDel, in replacement of sfq / sfq_red. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>	2012-05-22 14:13:52 -07:00
Christoph J. Thompson	5c434a9e5a	iproute2 - Fix up and simplify variables pointing to install directories Define where the iproute2 config files are located. Get rid of trailing slashes for paths in several files. Signed-off-by: Christoph J. Thompson <cjsthompson@gmail.com>	2012-04-12 09:49:10 -07:00
Yegor Yefremov	8ced4fcd50	iproute2: cleanup dependencies LIBNETLINK will be defined in the main Makefile, so both ../lib/libnetlink.a ../lib/libutil.a will be automatically appended during linking. Otherwise ../lib/libnetlink.a ../lib/libutil.a will appear twice during linking. Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>	2012-02-27 08:27:54 -08:00
Jan Engelhardt	d7aa57d450	iproute2: proper detection of libxtables position and flags Upstream: not sent yet Any tests involving iptables _MUST_ utilize pkg-config to find the proper locations of the installation.	2012-01-03 15:05:25 -08:00
Stephen Hemminger	155ad8023b	ematch: fix warning about unused input() Use existing compile flag to indicate that input() is not used by tc ematch, fixes compiler warning.	2012-01-03 13:55:59 -08:00
Stephen Hemminger	93ba481acb	cleanup ematch yacc files make clean needs to remove all the yacc output files for ematch.	2011-11-02 16:39:36 -07:00
Mike Frysinger	aa48b5931a	tc: fix parallel build file with lex/yacc Building iproute2 in parallel might hit the race failure: emp_ematch.l:2:30: fatal error: emp_ematch.yacc.h: No such file or directory make[1]: *** [emp_ematch.lex.o] Error 1 This is because we currently allow the yacc/lex files to generate and compile in parallel. So add a simple dependency to make sure yacc has finished before we attempt to compile the lex output. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-10-18 15:02:21 -07:00
Stephen Hemminger	c441bd4c1b	Add QFQ scheduler Basic configuration support for QFQ. Still need to add manual page.	2011-07-13 13:46:34 -07:00
John Fastabend	914953046a	iproute2: tc add mqprio qdisc support Add mqprio qdisc support. Output matches the following, qdisc mq 0: dev eth1 root qdisc mq 0: dev eth2 root qdisc mqprio 8001: dev eth3 root tc 8 map 0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1 queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63) And usage is, Usage: ... mclass [num_tc NUMBER] [map P0 P1...] [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0] Signed-off-by: John Fastabend <john.r.fastabend@intel.com>	2011-04-12 14:28:19 -07:00
Juliusz Chroboczek	d7f3299d59	tc : SFB flow scheduler Supports SFB qdisc (included in linux-2.6.39) 1) Setup phase : accept non default parameters 2) dump information qdisc sfb 11: parent 1:11 limit 1 max 25 target 20 increment 0.00050 decrement 0.00005 penalty rate 10 burst 20 (600000ms 60000ms) Sent 47991616 bytes 521648 pkt (dropped 549245, overlimits 549245 requeues 0) rate 7193Kbit 9774pps backlog 0b 0p requeues 0 earlydrop 0 penaltydrop 0 bucketdrop 0 queuedrop 549245 childdrop 0 marked 0 maxqlen 0 maxprob 0.00000 avgprob 0.00000 Signed-off-by: Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>	2011-04-12 14:27:37 -07:00
Stephen Hemminger	a4eca97cff	CHOKe scheduler TC commands for CHOKe qdisc	2011-01-31 09:09:50 -08:00
Gregoire Baron	3822cc986c	tc: add ACT_CSUM action support (csum) Add the iproute2 support for the ACT_CSUM action. Can be used as following, certainly in conjunction with the ACT_PEDIT action (pedit): # In order to DNAT (stateless) IPv4 packet from 192.168.1.100 to # 0x12345678 (18.52.86.120), and update the IPv4 header checksum and # the UDP checksum (the last one, only if the packet is UDP). tc filter add eth0 prio 1 protocol ip parent ffff: \ u32 match ip src 192.168.1.100/32 flowid :1 \ action pedit munge offset 16 u32 set 0x12345678 \ pipe csum ip and udp # In order to alter destination address of IPv6 TCP packets from fc00::1 # and correct the TCP checksum (nothing happened? except maybe for # checksums in the TCP payload ...). tc filter add eth0 prio 1 protocol ipv6 parent ffff: \ u32 match ip6 src fc00::1/128 match ip6 protocol 0x06 0xff flowid :1 \ action pedit munge offset 24 u32 set 0x12345678 \ pipe csum tcp	2010-12-01 11:17:46 -08:00
Mike Frysinger	bf512683e0	tc: revert "echo" in install target The recent commit "iproute2: add option to build m_xt as a tc module" (`ab814d6355`) looks like it wrongly included debug changes in the install target. So drop the `echo` so the tc binary actually gets installed again. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2010-07-23 12:28:25 -07:00
Andreas Henriksson	ab814d6355	iproute2: add option to build m_xt as a tc module (v3) This will build the xt module (action ipt) of tc as a shared object that is linked at runtime by tc if used, rather than built into tc. This is similar to how the atm qdisc support is handled (q_atm.so). Signed-off-by: Andreas Henriksson <andreas@xxxxxxxx>	2010-04-12 11:40:29 -07:00
Andreas Henriksson	12ddfff76c	iproute2: detect iptables modules dir in configure. Try to automatically detect iptables modules directory. Make the configure script look for iptables modules. This also makes it possible to specify it on the command line while building via "make IPT_LIB_DIR=/foo/bar". Signed-off-by: Andreas Henriksson <andreas@fatal.se>	2010-03-29 15:10:20 -07:00
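
Several messages in this log quote raw 32-bit hex values from tc output: the ACT_CSUM commit (3822cc986c) rewrites an address with `pedit ... set 0x12345678`, and the IFE commit (d3e511223f) shows the u32 keys `c0a87a00/ffffff00 at 16` and `00010000/00ff0000 at 8`. The following is a minimal Python sketch for reading such values, not part of iproute2. The helper names (`u32_to_ipv4`, `u32_key_to_network`, `masked_byte`) are hypothetical, and it assumes the values are big-endian words taken from an IPv4 header, where the word at offset 8 carries the protocol number in its second byte and the word at offset 16 is the destination address.

```python
import ipaddress


def u32_to_ipv4(value: int) -> str:
    """Render a 32-bit big-endian word as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


def u32_key_to_network(value: int, mask: int) -> str:
    """Turn a u32 'value/mask' key into CIDR notation.

    Only contiguous prefix masks are handled; anything else raises ValueError.
    """
    prefix_len = bin(mask).count("1")
    expected = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    if mask != expected:
        raise ValueError(f"mask {mask:#010x} is not a contiguous prefix")
    # IPv4Network also raises ValueError if value has bits set outside the mask.
    return str(ipaddress.IPv4Network((value, prefix_len)))


def masked_byte(value: int, mask: int, shift: int) -> int:
    """Extract the field selected by mask from a 32-bit word."""
    return (value & mask) >> shift


if __name__ == "__main__":
    # pedit "set 0x12345678" from the ACT_CSUM commit message
    print(u32_to_ipv4(0x12345678))                       # 18.52.86.120
    # u32 key "match c0a87a00/ffffff00 at 16" from the IFE commit message
    print(u32_key_to_network(0xC0A87A00, 0xFFFFFF00))    # 192.168.122.0/24
    # u32 key "match 00010000/00ff0000 at 8": protocol byte of the IPv4 header
    print(masked_byte(0x00010000, 0x00FF0000, 16))       # 1 (ICMP)
```

The first result agrees with the address the ACT_CSUM message itself gives for 0x12345678, and the second corresponds to the `match ip dst 192.168.122.237/24` filter once the host bits are masked off.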

98 Commits