iproute2

Commit Graph

Author	SHA1	Message	Date
Daniel Borkmann	8f9afdd531	tc, clsact: add clsact frontend Add the tc part for the kernel commit 1f211a1b929c ("net, sched: add clsact qdisc"). Quoting example usage from that commit description: Example, adding qdisc: # tc qdisc add dev foo clsact # tc qdisc show dev foo qdisc mq 0: root qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc clsact ffff: parent ffff:fff1 Adding filters (deleting, etc works analogous by specifying ingress/egress): # tc filter add dev foo ingress bpf da obj bar.o sec ingress # tc filter add dev foo egress bpf da obj bar.o sec egress # tc filter show dev foo ingress filter protocol all pref 49152 bpf filter protocol all pref 49152 bpf handle 0x1 bar.o:[ingress] direct-action # tc filter show dev foo egress filter protocol all pref 49152 bpf filter protocol all pref 49152 bpf handle 0x1 bar.o:[egress] direct-action The ingress parent alias can also be used with ingress qdisc. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2016-01-18 11:41:27 -08:00
Daniel Borkmann	0d45c4b420	tc, ingress: clean up ingress handling a bit Clean it up a bit, we can also get rid of some ugly ifdefs as in our case TC_H_INGRESS is always defined. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2016-01-18 11:41:27 -08:00
Stephen Hemminger	2505780c20	Merge branch 'net-next'	2016-01-18 09:37:45 -08:00
Stephen Hemminger	bc223ab861	Revert "tc: fix compilation with old gcc (< 4.6)" This reverts commit `8f80d450c3`.	2016-01-18 09:37:38 -08:00
Jamal Hadi Salim	488b41d020	tc: flower no need to specify the ethertype since all tc classifiers are required to specify ethertype as part of grammar By not allowing eth_type to be specified we remove contradiction for example when a user specifies: tc filter add ... priority xxx protocol ip flower eth_type ipv6 This patch removes that contradiction Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2016-01-11 08:24:01 -08:00
Julien Floret	8f80d450c3	tc: fix compilation with old gcc (< 4.6) gcc < 4.6 does not handle C11 syntax for the static initialization of anonymous struct/union, hence the following error: tc_bpf.c:260: error: unknown field map_type specified in initializer Signed-off-by: Julien Floret <julien.floret@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2016-01-11 08:23:36 -08:00
Phil Sutter	de7db5d857	tc: m_connmark: Fix help text When specifying a conntrack zone, the 'zone' keyword has to be used before the actual zone index. Signed-off-by: Phil Sutter <phil@nwl.cc>	2016-01-07 10:35:08 -08:00
Stephen Hemminger	e49b51d663	monitor: fix file handle leak In some cases passing file to monitor left file open. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2015-12-30 17:26:38 -08:00
Daniel Borkmann	fd7f9c7fd1	bpf: minor fix in api and bpf_dump_error() usage Fix a whitespace in bpf_dump_error() usage, and also a missing closing bracket in ntohl() macro for eBPF programs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-12-17 17:22:25 -08:00
Daniel Borkmann	91d88eeb10	{f,m}_bpf: allow updates on program arrays Since we have all infrastructure in place now, allow atomic live updates on program arrays. This can be very useful e.g. in case programs that are being tail-called need to be replaced, f.e. when classifier functionality needs to be changed, new protocols added/removed during runtime, etc. Thus, provide a way for in-place code updates, minimal example: Given is an object file cls.o that contains the entry point in section 'classifier', has a globally pinned program array 'jmp' with 2 slots and id of 0, and two tail called programs under section '0/0' (prog array key 0) and '0/1' (prog array key 1), the section encoding for the loader is <id/key>. Adding the filter loads everything into cls_bpf: tc filter add dev foo parent ffff: bpf da obj cls.o Now, the program under section '0/1' needs to be replaced with an updated version that resides in the same section (also full path to tc's subfolder of the mount point can be passed, e.g. /sys/fs/bpf/tc/globals/jmp): tc exec bpf graft m:globals/jmp obj cls.o sec 0/1 In case the program resides under a different section 'foo', it can also be injected into the program array like: tc exec bpf graft m:globals/jmp key 1 obj cls.o sec foo If the new tail called classifier program is already available as a pinned object somewhere (here: /sys/fs/bpf/tc/progs/parser), it can be injected into the prog array like: tc exec bpf graft m:globals/jmp key 1 fd m:progs/parser In the kernel, the program on key 1 is being atomically replaced and the old one's refcount dropped. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	f6793eec46	{f, m}_bpf: allow for user-defined object pinnings The recently introduced object pinning can be further extended in order to allow sharing maps beyond tc namespace. F.e. maps that are being pinned from tracing side, can be accessed through this facility as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	9e607f2e72	{f, m}_bpf: check map attributes when fetching as pinned Make use of the new show_fdinfo() facility and verify that when a pinned map is being fetched that its basic attributes are the same as the map we declared from the ELF file. I.e. when placed into the globalns, collisions could occur. In such a case warn the user and bail out. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	910b543dcc	{f,m}_bpf: make tail calls working Now that we have the possibility of sharing maps, it's time we get the ELF loader fully working with regards to tail calls. Since program array maps are pinned, we can keep them finally alive. I've noticed two bugs that are being fixed in bpf_fill_prog_arrays() with this patch. Example code comes as follow-up. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	32e93fb7f6	{f,m}_bpf: allow for sharing maps This larger work addresses one of the bigger remaining issues on tc's eBPF frontend, that is, to allow for persistent file descriptors. Whenever tc parses the ELF object, extracts and loads maps into the kernel, these file descriptors will be out of reach after the tc instance exits. Meaning, for simple (unnested) programs which contain one or multiple maps, the kernel holds a reference, and they will live on inside the kernel until the program holding them is unloaded, but they will be out of reach for user space, even worse with (also multiple nested) tail calls. For this issue, we introduced the concept of an agent that can receive the set of file descriptors from the tc instance creating them, in order to be able to further inspect/update map data for a specific use case. However, while that is more tied towards specific applications, it still doesn't easily allow for sharing maps across multiple tc instances and would require a daemon to be running in the background. F.e. when a map should be shared by two eBPF programs, one attached to ingress, one to egress, this currently doesn't work with the tc frontend. This work solves exactly that, i.e. if requested, maps can now be _arbitrarily_ shared between object files (PIN_GLOBAL_NS) or within a single object (but various program sections, PIN_OBJECT_NS) without "losing" the file descriptor set. To make that happen, we use eBPF object pinning introduced in kernel commit b2197755b263 ("bpf: add support for persistent maps/progs") for exactly this purpose.
The shipped examples/bpf/bpf_shared.c code from this patch can be easily applied, for instance, as: - classifier-classifier shared: tc filter add dev foo parent 1: bpf obj shared.o sec egress tc filter add dev foo parent ffff: bpf obj shared.o sec ingress - classifier-action shared (here: late binding to a dummy classifier): tc actions add action bpf obj shared.o sec egress pass index 42 tc filter add dev foo parent ffff: bpf obj shared.o sec ingress tc filter add dev foo parent 1: bpf bytecode '1,6 0 0 4294967295,' \ action bpf index 42 The toy example increments a shared counter on egress and dumps its value on ingress (if no sharing (PIN_NONE) would have been chosen, map value is 0, of course, due to the two map instances being created): [...] <idle>-0 [002] ..s. 38264.788234: : map val: 4 <idle>-0 [002] ..s. 38264.788919: : map val: 4 <idle>-0 [002] ..s. 38264.789599: : map val: 5 [...] ... thus if both sections reference the pinned map(s) in question, tc will take care of fetching the appropriate file descriptor. The patch has been tested extensively on both, classifier and action sides. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-11-23 16:10:44 -08:00
Stephen Hemminger	037660b351	qfq: fix parse_opt dead code Fix Coverity warning from dead code.	2015-10-27 15:46:20 +09:00
Stephen Hemminger	86c392f958	Merge branch 'master' into net-next	2015-10-23 15:46:08 -07:00
Stephen Hemminger	753ef5bbd6	tc: remove extra whitespace No blank lines at EOF, or trailing whitespace.	2015-10-23 15:43:28 -07:00
Phil Sutter	40eb737ebb	tc: u32 filter coding style cleanup Add missing spaces around operators to increase readability. Aside from that, make "preference" match a real synonym for "tos" and "dsfield" as its effect was identical to them. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-10-23 15:37:26 -07:00
Phil Sutter	0a83e1eaf7	tc: improve filter help texts a bit This fixes a few syntax errors and changes route filter help text to use classid instead of flowid to be consistent with other filters' help texts. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-10-23 15:37:26 -07:00
Daniel Borkmann	343dc90854	m_bpf: don't require default opcode on ebpf actions After the patch, the most minimal command to load an eBPF action for late binding with auto index selection through tc is: tc actions add action bpf obj prog.o We already set TC_ACT_PIPE in tc as default opcode, so if nothing further has been specified, just use it. Also, allow "ok" next to "pass" for matching cmdline on TC_ACT_OK. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-10-12 09:44:52 -07:00
Daniel Borkmann	faa8a46300	f_bpf: allow for optional classid and add flags When having optional classid, most minimal command can be sth like: tc filter add dev foo parent X: bpf obj prog.o Therefore, adapt the code so that a next argument will not be enforced as the case currently. Also, minor cleanup on the classid, where we should rather have used addattr32(), and add flags for exec configuration, for example (using short notation): tc filter add dev foo parent X: bpf da obj prog.o Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com>	2015-10-12 09:41:05 -07:00
Stephen Hemminger	8fe9839857	fq: fix whitespace	2015-09-25 12:40:00 -07:00
Eric Dumazet	8d5bd8c302	tc: fq: allow setting and retrieving orphan_mask linux-3.19 fq packet scheduler got a new attribute, controlling number of 'flows' holding packets not attached to a socket (forwarding usage) kernel commit is 06eb395fa9856b5a87cf7d80baee2a0ed3cdb9d7 ("pkt_sched: fq: better control of DDOS traffic") This patch adds corresponding code to tc command. tc qd replace dev eth0 root fq orphan_mask 511 Signed-off-by: Eric Dumazet <edumazet@google.com>	2015-09-25 12:37:09 -07:00
Eric Dumazet	32a6fbe563	tc : add timestamps to tc monitor Support -timestamp and -tshort options for tc monitor like ip monitor. # tc -tshort monitor [2015-09-23T16:39:11.260555] qdisc fq 8003: dev eth0 root refcnt 2 limit 10000p flow_limit 100p buckets 1024 quantum 3028 initial_quantum 15140 refill_delay 40.0ms Signed-off-by: Eric Dumazet <edumazet@google.com>	2015-09-25 12:35:46 -07:00
Phil Sutter	565af7b816	tc: fq: allow setting and retrieving flow refill delay Code to parse and export this tuneable via netlink is already present in sched_fq.c of the kernel, so not making it accessible for users would be a waste of resources. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-09-23 16:02:13 -07:00
Phil Sutter	5c32fa1d69	comment: Fix remaining listings of wrong FSF address This patch follows the changes of commit `4d98ab0` ("Fix FSF address in file headers"), fixing file headers added after it. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-09-23 15:58:54 -07:00
Stephen Hemminger	9a6422c243	Merge branch 'master' into net-next	2015-08-13 19:42:41 -07:00
Stephen Hemminger	bcb4a7aa5b	tc: fix return after invarg	2015-08-13 14:20:40 -07:00
Daniel Borkmann	baed90842a	m_bpf: add frontend support for late binding Frontend support for kernel commit a5c90b29e5cc ("act_bpf: properly support late binding of bpf action to a classifier"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-08-10 11:19:11 -07:00
Nicolas Dichtel	611f70b287	tc: fix bpf compilation with old glibc Error was: f_bpf.o: In function `bpf_parse_opt': f_bpf.c:(.text+0x88f): undefined reference to `secure_getenv' m_bpf.o: In function `parse_bpf': m_bpf.c:(.text+0x587): undefined reference to `secure_getenv' collect2: error: ld returned 1 exit status There is no special reason to use the secure version of getenv, thus let's simply use getenv(). CC: Daniel Borkmann <daniel@iogearbox.net> Fixes: `88eea53954` ("tc: {f,m}_bpf: allow to retrieve uds path from env") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Yegor Yefremov <yegorslists@googlemail.com>	2015-07-27 14:35:42 -07:00
Stephen Hemminger	69be46c562	Merge branch 'master' into net-next	2015-06-26 00:04:04 -04:00
Daniel Borkmann	88eea53954	tc: {f,m}_bpf: allow to retrieve uds path from env Allow to retrieve uds path from the environment, facilitates also dealing with export a bit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-06-25 15:13:16 -04:00
Daniel Borkmann	473d7840c3	tc: {f,m}_bpf: add tail call support for parser Kernel commit 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs") added support for tail calls, this patch here adds tc front end parts for the object parser to prepopulate a given eBPF prog array before the root prog is pushed down for classifier creation. The prepopulation works with any number of prog arrays in any dependencies, e.g. prog or normal maps could also be used from progs that are tail-called themself, etc. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-06-25 15:13:16 -04:00
Maciej Żenczykowski	0bbca0422f	iproute2: tc/m_pedit.c - remove dead code The initializers are simply not needed. These if-blocks are outright dead code, because '0 > unsigned' is always false, so only else clause triggers and regardless of which clause triggers it only updates 'ind' which is later unconditionally written to before being used anyway. Otherwise we get errors from clang: m_pedit.c:166:8: error: comparison of 0 > unsigned expression is always false [-Werror,-Wtautological-compare] if (0 > tkey->off) { ~ ^ ~~~~~~~~~ m_pedit.c:209:8: error: comparison of 0 > unsigned expression is always false [-Werror,-Wtautological-compare] if (0 > tkey->off) { ~ ^ ~~~~~~~~~ 2 errors generated. Change-Id: I3c9e9092915088fc56f992e5df736851541a4458	2015-06-25 08:52:06 -04:00
Stephen Hemminger	f975059a51	Merge branch 'master' into net-next	2015-06-25 08:01:51 -04:00
Daniel Borkmann	ad1fe0d8e9	tc: util: fix print_rate for ludicrous speeds The for loop should only probe up to G[i]bit rates, so that we end up with T[i]bit as the last max units[] slot for snprintf(3), and not possibly an invalid pointer in case rate is multiple of kilo. Fixes: `8cecdc2837` ("tc: more user friendly rates") Reported-by: Jose R. Guzman Mosqueda <jose.r.guzman.mosqueda@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-06-24 23:34:20 -04:00
Stephen Hemminger	03371c7d98	Merge branch 'master' into net-next Conflicts: include/linux/tcp.h lib/libnetlink.c	2015-05-28 09:18:01 -07:00
Stephen Hemminger	c079e121a7	libnetlink: add size argument to rtnl_talk There have been several instances where response from kernel has overrun the stack buffer from the caller. Avoid future problems by passing a size argument. Also drop the unused peer and group arguments to rtnl_talk.	2015-05-27 13:00:21 -07:00
David Ward	aacee2695a	tc: gred: Add support for TCA_GRED_LIMIT attribute Allow the qdisc limit to be set, which is particularly useful when the default VQ is not configured with RED parameters. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 15:30:39 -07:00
Nicolas Dichtel	0628cddd9d	libnetlink: introduce rtnl_listen_filter_t There is no functional change with this commit. It only prepares the next one. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>	2015-05-21 15:28:56 -07:00
Eric Dumazet	df1c7d9138	codel: add ce_threshold support to codel & fc_codel codel & fq_codel packet schedulers are now able to have a threshold for CE marking packets, regardless of the drop/nodrop decision taken by CoDel. This is particularly useful for dctcp and variants, that do not use traditional ECN. Note that fq_codel users would have to specify noecn if ce_threshold is used, otherwise results would be not very interesting, as ecn is default on for fq_codel. $ tc -s qdisc show dev eth1 qdisc codel 8002: root refcnt 45 limit 1000p target 5.0ms ce_threshold 1.0ms interval 100.0ms Sent 4908469888317 bytes 3351813967 pkt (dropped 0, overlimits 0 requeues 21624365) rate 37671Mbit 3231836pps backlog 4904740b 250p requeues 21624365 count 0 lastcount 0 ldelay 1.1ms drop_next 0us maxpacket 68130 ecn_mark 0 drop_overlimit 0 ce_mark 410861803 Signed-off-by: Eric Dumazet <edumazet@google.com>	2015-05-21 15:25:05 -07:00
Jiri Pirko	30eb304ecd	tc: add support for Flower classifier Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2015-05-21 15:22:49 -07:00
David Ward	357c45ad3a	tc: gred: Adopt the term VQ in the command syntax and output In the GRED kernel source code, both of the terms "drop parameters" (DP) and "virtual queue" (VQ) are used to refer to the same thing. Each "DP" is better understood as a "set of drop parameters", since it has values for limit, min, max, avpkt, etc. This terminology can result in confusion when creating a GRED qdisc having multiple DPs. Netlink attributes and struct members with the DP name seem to have been left intact for compatibility, while the term VQ was otherwise adopted in the code, which is more intuitive. Use the VQ term in the tc command syntax and output (but maintain compatibility with the old syntax). Rewrite the usage text to be concise and similar to other qdiscs. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	eb6d7d6af1	tc: gred: Handle unsigned values properly in option parsing/printing DPs, def_DP, and DP are unsigned values that are sent and received in TCA_GRED_* netlink attributes; handle them properly when they are parsed or printed. Use MAX_DPs as the initial value for def_DP and DP, and fix the operator used for bounds checking them. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	1693a4d392	tc: gred: Improve parameter/statistics output Make the output more consistent with the RED qdisc, and only show details/statistics if the appropriate flag is set when calling tc. Show the parameters used with "gred setup". Add missing statistics "pdrop" and "other". Fix format specifiers for unsigned values. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	a77905ef6a	tc: gred: Print usage text if no arguments appear after "gred" This is more helpful to the user, since the command takes two forms, and the message that would otherwise appear about missing parameters assumes one of those forms. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	d73e0408e2	tc: gred: Fix whitespace issues in code Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	7bf17a2264	tc: red: Mark "bandwidth" parameter as optional in usage text Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	d93c909a4c	tc: red, gred: Notify when using the default value for "bandwidth" The "bandwidth" parameter is optional, but ensure the user is aware of its default value, to proactively avoid configuration problems. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	6c99695da2	tc: red, gred: Fix format specifier in burst size warning burst is an unsigned value. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
David Ward	9d9a67c756	tc: red, gred: Rename overloaded variable wlog It is used when parsing three different parameters, only one of which is Wlog. Change the name to make the code less confusing. Signed-off-by: David Ward <david.ward@ll.mit.edu>	2015-05-21 14:16:03 -07:00
Daniel Borkmann	ec6f5abcea	tc: minor cleanup on ingress Fix whitespacing and remove the unnecessary condition. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-05-11 09:18:10 -07:00
WANG Cong	285e7768e8	tc: fill in handle before checking argc When deleting a specific basic filter with handle, tc command always ignores the 'handle' option, so tcm_handle is always 0 and kernel deletes all filters in the selected group. This is wrong, we should respect 'handle' in cmdline. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>	2015-05-11 09:13:20 -07:00
Daniel Borkmann	d937a74b6d	tc: {m, f}_ebpf: add option for dumping verifier log Currently, only on error we get a log dump, but I found it useful when working with eBPF to have an option to also dump the log on success. Also spotted a typo in a header comment, which is fixed here as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com>	2015-05-04 08:43:08 -07:00
Daniel Borkmann	4bd624467b	tc: built-in eBPF exec proxy This work follows upon commit `6256f8c9e4` ("tc, bpf: finalize eBPF support for cls and act front-end") and takes up the idea proposed by Hannes Frederic Sowa to spawn a shell (or any other command) that holds generated eBPF map file descriptors. File descriptors, based on their id, are being fetched from the same unix domain socket as demonstrated in the bpf_agent, the shell spawned via execvpe(2) and the map fds passed over the environment, and thus are made available to applications in the fashion of std{in,out,err} for read/write access, for example in case of iproute2's examples/bpf/: # env | grep BPF BPF_NUM_MAPS=3 BPF_MAP1=6 <- BPF_MAP_ID_QUEUE (id 1) BPF_MAP0=5 <- BPF_MAP_ID_PROTO (id 0) BPF_MAP2=7 <- BPF_MAP_ID_DROPS (id 2) # ls -la /proc/self/fd [...] lrwx------. 1 root root 64 Apr 14 16:46 0 -> /dev/pts/4 lrwx------. 1 root root 64 Apr 14 16:46 1 -> /dev/pts/4 lrwx------. 1 root root 64 Apr 14 16:46 2 -> /dev/pts/4 [...] lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map The advantage (as opposed to the direct/native usage) is that now the shell is map fd owner and applications can terminate and easily reattach to descriptors w/o any kernel changes. Moreover, multiple applications can easily read/write eBPF maps simultaneously. To further allow users for experimenting with that, next step is to add a small helper that can get along with simple data types, so that also shell scripts can make use of bpf syscall, f.e to read/write into maps. Generally, this allows for prepopulating maps, or any runtime altering which could influence eBPF program behaviour (f.e. different run-time classifications, skb modifications, ...), dumping of statistics, etc.
Reference: http://thread.gmane.org/gmane.linux.network/357471/focus=357860 Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com>	2015-04-27 16:39:23 -07:00
Nicolas Dichtel	afa5158f02	tc: fix compilation warning on 32bits arch The warning was: m_simple.c: In function ‘parse_simple’: m_simple.c:142:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘size_t’ [-Wformat] Useful to be able to compile with -Werror. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>	2015-04-27 11:41:46 -07:00
Vadim Kochan	46679bbbe8	tc util: Fix possible buffer overflow when print class id Use correct handle buffer length. Signed-off-by: Vadim Kochan <vadim4j@gmail.com>	2015-04-20 10:06:02 -07:00
Felix Fietkau	b8d5c9a71b	tc: add support for connmark action Add ability to add the netfilter connmark support. Typical usage: ...lets tag outgoing icmp with mark 0x10.. iptables -tmangle -A PREROUTING -p icmp -j CONNMARK --set-mark 0x10 ..add on ingress of $ETH an extractor for connmark... tc filter add dev $ETH parent ffff: prio 4 protocol ip \ u32 match ip protocol 1 0xff \ flowid 1:1 \ action connmark continue ...if the connmark was 0x11, we police to a ridic rate of 10Kbps tc filter add dev $ETH parent ffff: prio 5 protocol ip \ handle 0x11 fw flowid 1:1 \ action police rate 10kbit burst 10k Other ways to use the connmark is to supply the zone, index and branching choice. Refer to help. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2015-04-13 10:49:45 -07:00
Daniel Borkmann	6256f8c9e4	tc, bpf: finalize eBPF support for cls and act front-end This work finalizes both eBPF front-ends for the classifier and action part in tc, it allows for custom ELF section selection, a simplified tc command frontend (while keeping compat), reusing of common maps between classifier and actions residing in the same object file, and exporting of all map fds to an eBPF agent for handing off further control in user space. It also adds an extensive example of how eBPF can be used, and a minimal self-contained example agent that dumps map data. The example is well documented and hopefully provides a good starting point into programming cls_bpf and act_bpf. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>	2015-04-10 13:31:19 -07:00
Stephen Hemminger	bd733e4088	Merge branch 'master' into net-next Conflicts: man/man8/ip-route.8.in	2015-04-07 08:56:14 -07:00
Vadim Kochan	8b90a9907e	tc class: Ignore if default class name file does not exist If '-nm' specified that do not fail if there is no default class names file in /etc/iproute2. Changed default class name file cls_names -> tc_cls. Signed-off-by: Vadim Kochan <vadim4j@gmail.com>	2015-04-07 08:31:56 -07:00
Daniel Borkmann	11c39b5e98	tc: add eBPF support to f_bpf This work adds the tc frontend for kernel commit e2e9b6541dd4 ("cls_bpf: add initial eBPF support for programmable classifiers"). A C-like classifier program (f.e. see e2e9b6541dd4) is being compiled via LLVM's eBPF backend into an ELF file, that is then being passed to tc. tc then loads, if any, eBPF maps and eBPF opcodes (with fixed-up eBPF map file descriptors) out of its dedicated sections, and via bpf(2) into the kernel and then the resulting fd via netlink down to cls_bpf. cls_bpf allows for annotations, currently, I've used the file name for that, so that the user can easily identify his filter when dumping configurations back. Example usage: clang -O2 -emit-llvm -c cls.c -o - | llc -march=bpf -filetype=obj -o cls.o tc filter add dev em1 parent 1: bpf run object-file cls.o classid x:y tc filter show dev em1 [...] filter parent 1: protocol all pref 49152 bpf handle 0x1 flowid x:y cls.o I placed the parser bits derived from Alexei's kernel sample, into tc_bpf.c as my next step is to also add the same support for BPF action, so we can have a fully fledged eBPF classifier and action in tc. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com>	2015-03-24 15:45:23 -07:00
Daniel Borkmann	51cf36756c	tc: m_bpf: fix next arg selection after tc opcode Next argument after the tc opcode/verdict is optional, using NEXT_ARG() requires to have another argument after that one otherwise tc will bail out. Therefore, we need to advance to the next argument manually as done elsewhere. Fixes: `86ab59a666` ("tc: add support for BPF based actions") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Pirko <jiri@resnulli.us>	2015-03-24 15:39:53 -07:00
Vadim Kochan	4612d04d6b	tc class: Show class names from file It is possible to use class names from file /etc/iproute2/cls_names which tc will use when showing class info: # tc/tc -nm class show dev lo class htb 1:10 parent 1:1 leaf 10: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b class htb 1:1 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b class htb web#1:20 parent 1:1 leaf 20: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b class htb 1:2 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b class htb 1:30 parent 1:1 leaf 30: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b class htb voip#1:40 parent 1:2 leaf 40: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b class htb 1:50 parent 1:2 leaf 50: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b class htb 1:60 parent 1:2 leaf 60: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b or to specify via file path: # tc/tc -nm -cf /tmp/cls_names class show dev lo Class names file contains simple "maj:min name" structure: 1:20 web 1:40 voip Signed-off-by: Vadim Kochan <vadim4j@gmail.com>	2015-03-15 12:27:40 -07:00
Daniel Borkmann	32caee9fc7	m_bpf: remove unrelevant help lines Left-overs when copying this over from cls_bpf. ;) Lets remove them. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us>	2015-02-27 19:00:51 -08:00
Jiri Pirko	86ab59a666	tc: add support for BPF based actions Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2015-02-05 10:38:13 -08:00
Jiri Pirko	1d129d191a	tc: push bpf common code into separate file Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2015-02-05 10:38:13 -08:00
Jamal Hadi Salim	564663b4ca	actions: Get vlan action to work in pipeline When specified in a graph such as: action vlan ... action foobar the vlan action chewed more than it can swallow Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2015-01-13 17:22:44 -08:00
Vadim Kochan	67e1d73be1	tc: Allow to easy change network namespace Added new '-netns' option to simplify executing following cmd: ip netns exec NETNS tc OPTIONS COMMAND OBJECT to tc -n[etns] NETNS OPTIONS COMMAND OBJECT e.g.: tc -net vnet0 qdisc Signed-off-by: Vadim Kochan <vadim4j@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us>	2014-12-27 10:22:34 -08:00
Vadim Kochan	d954b34a1f	tc class: Show classes as ASCII graph Added new '-g[raph]' option which shows classes in the graph view. Meanwhile only generic stats info output is supported. e.g.: $ tc/tc -g class show dev tap0 +---(1:2) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b | +---(1:40) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b | +---(1:50) htb rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b | | +---(1:51) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b | | | +---(1:60) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b | +---(1:1) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b +---(1:10) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b +---(1:20) htb prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b +---(1:30) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b $ tc/tc -g -s class show dev tap0 +---(1:2) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b | | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | | rate 0bit 0pps backlog 0b 0p requeues 0 | | | +---(1:40) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b | | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | | rate 0bit 0pps backlog 0b 0p requeues 0 | | | +---(1:50) htb rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b | | | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | | | rate 0bit 0pps backlog 0b 0p requeues 0 | | | | | +---(1:51) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b | | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | | rate 0bit 0pps backlog 0b 0p requeues 0 | | | +---(1:60) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | rate 0bit 0pps backlog 0b 0p requeues 0 | +---(1:1) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | rate 0bit 0pps backlog 0b 0p requeues 0 | +---(1:10) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | rate 0bit 0pps backlog 0b 0p requeues 0 | +---(1:20) htb prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) | rate 0bit 0pps backlog 0b 0p requeues 0 | +---(1:30) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 Signed-off-by: Vadim Kochan <vadim4j@gmail.com>	2014-12-27 10:16:51 -08:00
Stephen Hemminger	5c2c10b17e	Merge branch 'net-next'	2014-12-24 12:23:00 -08:00
Stephen Hemminger	3d0b7439df	whitespace cleanup Remove all trailing whitespace and space before tabs.	2014-12-20 15:47:17 -08:00
Stephen Hemminger	c9b8aef6ae	Merge branch 'master' into net-next	2014-12-09 16:33:59 -08:00
Stephen Hemminger	b2e116d6c3	tc: minor spelling fixes	2014-12-03 19:28:34 -08:00
Jiri Pirko	8b1c0216d8	tc: add support for vlan tc action Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Cong Wang <cwang@twopensource.com>	2014-12-03 09:29:21 -08:00
Stephen Hemminger	edd3979272	emp: fix warning on deprecated bison directive emp_ematch.y:12.1-13: warning: deprecated directive, use ‘%name-prefix’ [-Wdeprecated] %name-prefix="ematch_" ^^^^^^^^^^^^^	2014-10-09 08:31:10 -07:00
Jamal Hadi Salim	863ecb04b4	discourage use of direct policer interface Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-10-09 08:26:57 -07:00
Jamal Hadi Salim	287bf3a990	route classifier support for multiple actions route can now use the action syntax Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-10-09 08:26:57 -07:00
Jamal Hadi Salim	08139c2ffb	tcindex classifier support for multiple actions tcindex can now use the action syntax Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-10-09 08:26:56 -07:00
Andy Furniss	a07c6d6135	add missing underscore to man page and example nf_mark ematch The man page and the "fail" example are missing an underscore in the nf_mark ematch. eg. tc filter add dev eth0 parent ffff: basic match 'meta(nfmark gt 24)' classid 2:4 meta: unknown meta id ... >>meta(nfmark gt 24)<< ... ... meta(>>nfmark<< gt 24)... Usage: meta(OBJECT { eq \| lt \| gt } OBJECT) where: OBJECT := { META_ID \| VALUE } META_ID := id [ shift SHIFT ] [ mask MASK ] Example: meta(nfmark gt 24) meta(indev shift 1 eq "ppp") meta(tcindex mask 0xf0 eq 0xf0) For a list of meta identifiers, use meta(list). Illegal "ematch" meta(list) does correctly show nf_mark and the above test works with nf_mark. Signed-off-by: Andy Furniss adf.lists@gmail.com	2014-10-09 08:24:00 -07:00
Jamal Hadi Salim	10f5a375ea	rsvp classifier support for multiple actions Example setup: sudo tc qdisc del dev eth0 root handle 1:0 prio sudo tc qdisc add dev eth0 root handle 1:0 prio sudo tc filter add dev eth0 pref 10 proto ip parent 1:0 \ rsvp session 10.0.0.1 ipproto icmp \ classid 1:1 \ action police rate 1kbit burst 90k pipe \ action ok tc -s filter show dev eth0 parent 1:0 filter protocol ip pref 10 rsvp filter protocol ip pref 10 rsvp fh 0x0001100a flowid 1:1 session 10.0.0.1 ipproto icmp action order 1: police 0x5 rate 1Kbit burst 23440b mtu 2Kb action pipe overhead 0b ref 1 bind 1 Action statistics: Sent 98000 bytes 1000 pkt (dropped 0, overlimits 761 requeues 0) backlog 0b 0p requeues 0 action order 2: gact action pass random type none pass val 0 index 2 ref 1 bind 1 installed 60 sec used 3 sec Action statistics: Sent 74578 bytes 761 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Tested-by: John Fastabend <john.r.fastabend@intel.com>	2014-09-29 08:47:33 -07:00
Jamal Hadi Salim	954de6c72b	actions: BugFix action stats to display with -s Was broken by commit `288abf513f` Lets not be too clever and have a separate call to print flushed actions info. Broken looks like: root@moja-1:~# tc actions add action drop index 4 root@moja-1:~# tc -s actions ls action gact action order 0: gact action drop random type none pass val 0 index 4 ref 1 bind 0 installed 9 sec used 4 sec The fixed version looks like: action order 0: gact action drop random type none pass val 0 index 4 ref 1 bind 0 installed 9 sec used 4 sec Sent 108948 bytes 1297 pkts (dropped 1297, overlimits 0) Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-09-29 08:47:19 -07:00
Jay Vosburgh	3757185b29	tc/netem: loss gemodel options fixes First, the default value for 1-k is documented as being 0, but is currently being set to 1. (100%). This causes all packets to be dropped in the good state if 1-k is not explicitly specified. Fix this by setting the default to 0. Second, the 1-h option is parsed correctly, however, the kernel is expecting "h", not 1-h. Fix this by inverting the "1-h" percentage before sending to and after receiving from the kernel. This does change the behavior, but makes it consistent with the netem documentation and the literature on the Gilbert-Elliot model, which refer to "1-h" and "1-k," not "h" or "k" directly. Last, fix a minor formatting issue for the options reporting. Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>	2014-08-04 10:15:10 -07:00
Yang Yingliang	aeb199d5ce	fq: allow options of fair queue set to ~0U Some options of fair queue cannot be (~0U). It leads to maxrate cannot be reset to unlimited because it cannot be (~0U). Allow the options being ~0U. Tested by the following command: # tc qdisc add dev eth4 root handle 1: fq limit 2000 flow_limit 200 maxrate 100mbit quantum 2000 initial_quantum 1600 # tc -s -d qdisc show qdisc fq 1: dev eth4 root refcnt 2 limit 2000p flow_limit 200p buckets 1024 quantum 2000 initial_quantum 1600 maxrate 100Mbit Sent 1492 bytes 10 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 1 flows (0 inactive, 0 throttled) 0 gc, 0 highprio, 0 throttled # tc qdisc change dev eth4 root handle 1: fq limit 4294967295 flow_limit 4294967295 maxrate 34359738360 quantum 4294967295 initial_quantum 4294967295 # tc -s -d qdisc show qdisc fq 1: dev eth4 root refcnt 2 limit 4294967295p flow_limit 4294967295p buckets 1024 quantum 4294967295 initial_quantum 4294967295 Sent 38372 bytes 216 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 2 flows (1 inactive, 0 throttled) 0 gc, 2 highprio, 7 throttled Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>	2014-06-09 12:42:36 -07:00
Sergey V. Lobanov	3ff10e82c1	Fixed 'tc qdisc show' for tbf when latency<0 When limit<burst latency becomes <0, for example: # tc qdisc add dev eth0 root handle 1: tbf limit 100K burst 256K rate 256kbit # tc qdisc show qdisc tbf 1: dev eth0 root refcnt 2 rate 256Kbit burst 256Kb lat 4290.0s If latency<0 there is no reason to show it. Limit will be printed instead of latency when latency<0: # tc qdisc show qdisc tbf 1: dev eth0 root refcnt 2 rate 256Kbit burst 256Kb limit 100Kb Signed-off-by: Sergey V. Lobanov <sergey@lobanov.in>	2014-05-28 17:08:16 -07:00
Jamal Hadi Salim	288abf513f	actions: correctly report the number of actions flushed This also fixes a long standing bug of not sanely reporting the action chain ordering Sample scenario test on window 1(event window): run "tc monitor" and observe events on window 2: sudo tc actions add action drop index 10 sudo tc actions add action ok index 12 sudo tc actions ls action gact sudo tc actions flush action gact See the event window reporting two entries (doing another listing should show empty generic actions) Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-05-28 16:54:31 -07:00
Jamal Hadi Salim	9282d08d93	actions: keyword flowid or classid terminates action pipeline scenario testcase: TC="sudo ./tc/tc" DEV="dev eth0" $TC qdisc del $DEV ingress $TC qdisc add $DEV ingress $TC filter add $DEV parent ffff: protocol ip u32 match ip src 10.0.0.0/24 action police rate 6Mbit burst 6Mbit drop flowid :1 $TC filter add $DEV parent ffff: protocol ip u32 match ip dst 10.0.0.0/24 action police rate 1Gbit burst 1Gbit pass flowid :1 $TC -s filter ls $DEV parent ffff: protocol ip $TC qdisc del $DEV ingress $TC qdisc add $DEV ingress $TC filter add $DEV parent ffff: protocol ip u32 match ip src 10.0.0.0/24 flowid 1:1 action police rate 6Mbit burst 6Mbit drop $TC filter add $DEV parent ffff: protocol ip u32 match ip dst 10.0.0.0/24 flowid 1:2 action police rate 1Gbit burst 1Gbit pass $TC -s filter ls $DEV parent ffff: protocol ip $TC qdisc del $DEV ingress $TC qdisc add $DEV ingress $TC filter add $DEV parent ffff: protocol ip pref 10 \ u32 match ip protocol 1 0xff \ flowid 1:10 \ action skbedit mark 11 \ action police rate 10kbit burst 10k pipe index 1 \ action skbedit mark 12 \ action police rate 20kbit burst 20k pipe index 2 \ action mirred egress mirror dev dummy0 $TC -s filter ls $DEV parent ffff: protocol ip $TC qdisc del $DEV ingress $TC qdisc add $DEV ingress $TC filter add $DEV parent ffff: protocol ip pref 10 \ u32 match ip protocol 1 0xff \ action skbedit mark 11 \ action police rate 10kbit burst 10k pipe index 1 \ action skbedit mark 12 \ action police rate 20kbit burst 20k pipe index 2 \ action mirred egress mirror dev dummy0 \ flowid 1:10 $TC -s filter ls $DEV parent ffff: protocol ip Reported-by: Seann Herdejurgen <seann@herdejurgen.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-05-28 16:54:28 -07:00
Jamal Hadi Salim	cacba03b10	Remove unnecessary debug statement Reported-by: Seann Herdejurgen <seann@herdejurgen.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2014-05-28 16:54:26 -07:00
Natanael Copa	dd9cc0ee81	iproute2: various header include fixes for compiling with musl libc We need limits.h for LONG_MIN and LONG_MAX, sys/param.h for MIN and sys/select for struct timeval. This fixes the following compile errors with musl libc: f_bpf.c: In function 'bpf_parse_opt': f_bpf.c:181:12: error: 'LONG_MIN' undeclared (first use in this function) if (h == LONG_MIN \|\| h == LONG_MAX) { ^ ... tc_util.o: In function `print_tcstats2_attr': tc_util.c:(.text+0x13fe): undefined reference to `MIN' tc_util.c:(.text+0x1465): undefined reference to `MIN' tc_util.c:(.text+0x14ce): undefined reference to `MIN' tc_util.c:(.text+0x154c): undefined reference to `MIN' tc_util.c:(.text+0x160a): undefined reference to `MIN' tc_util.o:tc_util.c:(.text+0x174e): more undefined references to `MIN' follow ... tc_stab.o: In function `print_size_table': tc_stab.c:(.text+0x40f): undefined reference to `MIN' ... fdb.c:247:30: error: 'ULONG_MAX' undeclared (first use in this function) (vni >> 24) \|\| vni == ULONG_MAX) ^ lnstat.h:28:17: error: field 'last_read' has incomplete type struct timeval last_read; /* last time of read */ ^ Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>	2014-05-28 16:51:39 -07:00
Andreas Greve	6e2e5ec28b	fix print_ipt: segfault if more than one filter with action -j MARK. BUG: tc filter show ... produces a segmentation fault if more than one filter rule with action -j MARK exists. Reason: In print_ipt(...) xtables will be initialized with a pointer to the static struct tcipt_globals at xtables_init_all(). Later on the fields .opts and .options_offset of tcipt_globals are modified. The call of xtables_free_opts(1) at the end of print(...) does not restore the original values of tcipt_globals for the modified fields. It only frees some allocated memory and sets .opts to NULL. This leads to a segmentation fault when print_ipt() is called for the next filter rule with action -j MARK. Fix: Cloning tcipt_globals on the stack as tmp_tcipt_globals and using it instead of tcipt_globals, so tcipt_globals will not be modified. Signed-off-by: Andreas Greve <andreas.greve@a-greve.de>	2014-05-13 13:10:31 -07:00
Terry Lam	ac74bd2a71	support for Heavy Hitter Filter (HHF) qdisc $tc qdisc add dev eth0 hhf help Usage: ... hhf [ limit PACKETS ] [ quantum BYTES] [ hh_limit NUMBER ] [ reset_timeout TIME ] [ admit_bytes BYTES ] [ evict_timeout TIME ] [ non_hh_weight NUMBER ] $tc -s -d qdisc show dev eth0 qdisc hhf 8005: root refcnt 32 limit 1000p quantum 1514 hh_limit 2048 reset_timeout 40.0ms admit_bytes 131072 evict_timeout 1.0s non_hh_weight 2 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 drop_overlimit 0 hh_overlimit 0 tot_hh 0 cur_hh 0 HHF qdisc parameters: - limit: max number of packets in qdisc (default 1000) - quantum: max deficit per RR round (default 1 MTU) - hh_limit: max number of HHs to keep states (default 2048) - reset_timeout: time to reset HHF counters (default 40ms) - admit_bytes: counter thresh to classify as HH (default 128KB) - evict_timeout: threshold to evict idle HHs (default 1s) - non_hh_weight: DRR weight for mice (default 2) Signed-off-by: Terry Lam <vtlam@google.com>	2014-05-09 12:10:47 -07:00
Jay Vosburgh	8f9672af7a	tc/netem: fix loss state display and p14 parsing The display of the entire netem loss state is shown as if it were gemodel state, as the loss state information is assigned to the wrong pointer. Correct this by assigning the loss state to the correct pointer. Additionally, attempting to set netem loss state will result in random values in the p14 state probability because the option value passed to the kernel by tc netem is not parsed or initialized. Fix this by supplying a default value of 0 for p14 and parsing the p14 value if one is supplied. Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>	2014-05-09 12:06:58 -07:00
Hiroaki SHIMODA	4d4da09e00	htb: Move direct_qlen code part to htb_parse_opt(). The direct_qlen command option is used with qdisc operation. It happened to be implemented in htb_parse_class_opt() which is called with class operation. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com>	2014-03-21 14:20:06 -07:00
WANG Cong	1c9af05071	pedit: do not print debugging information by default Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>	2014-02-10 14:43:52 -08:00
Yang Yingliang	dad2f72bef	netem: add 64bit rates support netem support 64bit rates start from linux-3.13. Add 64bit rates support in tc tools. tc qdisc show dev eth0 qdisc netem 1: dev eth4 root refcnt 2 limit 1000 rate 35Gbit Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com>	2014-01-20 12:32:15 -08:00
Yang Yingliang	a01de0a336	tbf: support sending burst/mtu to kernel directly To avoid loss when transforming burst to buffer in userspace, send burst/mtu to kernel directly. Kernel commit 2e04ad424b("sch_tbf: add TBF_BURST/TBF_PBURST attribute") make it can handle burst/mtu. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>	2014-01-20 12:32:14 -08:00
Vijay Subramanian	80dd880dd0	PIE: Proportional Integral controller Enhanced Proportional Integral controller Enhanced (PIE) is a scheduler to address the bufferbloat problem. We present here a lightweight design, PIE(Proportional Integral controller Enhanced) that can effectively control the average queueing latency to a target value. Simulation results, theoretical analysis and Linux testbed results have shown that PIE can ensure low latency and achieve high link utilization under various congestion situations. The design does not require per-packet timestamp, so it incurs very small overhead and is simple enough to implement in both hardware and software. " For more information, please see technical paper about PIE in the IEEE Conference on High Performance Switching and Routing 2013. A copy of the paper can be found at ftp://ftpeng.cisco.com/pie/. Please also refer to the IETF draft submission at http://tools.ietf.org/html/draft-pan-tsvwg-pie-00 All relevant code, documents and test scripts and results can be found at ftp://ftpeng.cisco.com/pie/. For problems with the iproute2/tc or Linux kernel code, please contact Vijay Subramanian (vijaynsu@cisco.com or subramanian.vijay@gmail.com) Mythili Prabhu (mysuryan@cisco.com) Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: Mythili Prabhu <mysuryan@cisco.com> CC: Dave Taht <dave.taht@bufferbloat.net>	2014-01-09 22:50:47 -08:00
Stephen Hemminger	ef056b2190	Merge branch 'master' into net-next-for-3.13	2014-01-09 22:44:17 -08:00
Jamal Hadi Salim	f24a7e7205	dont skip action order attached. cheers, jamal commit 58d78f9f6447df324cdeb99262442c5e3f1f924b Author: Jamal Hadi Salim <jhs@mojatatu.com> Date: Sun Dec 22 10:34:18 2013 -0500 dont skip displaying of action chains or lists by TCA_ACT_MAX_PRIO Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2013-12-28 10:57:34 -08:00
Jamal Hadi Salim	b159a7f1ae	allow batch gets of actions Attached. cheers, jamal commit c5f30cabef14c951596210b96bc9b423b0d39592 Author: Jamal Hadi Salim <hadi@mojatatu.com> Date: Sun Dec 22 10:24:17 2013 -0500 Allow batching of action gets Example: ---- tc actions get \ action gact index 100 \ action gact index 4 ---- Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>	2013-12-28 10:57:34 -08:00

550 Commits