Implement support for classifier/action terse dump using new TCA_DUMP_FLAGS
tlv with only available flag value TCA_DUMP_FLAGS_TERSE. Set the flag when
user requested it with following example CLI (-br for 'brief'):
$ tc -s -br filter show dev ens1f0 ingress
filter protocol ip pref 49151 flower chain 0
filter protocol ip pref 49151 flower chain 0 handle 0x1
not_in_hw
action order 1: gact Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
filter protocol ip pref 49152 flower chain 0
filter protocol ip pref 49152 flower chain 0 handle 0x1
not_in_hw
action order 1: gact Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
In terse mode dump only outputs essential data needed to identify the
filter and action (handle, cookie, etc.) and stats, if requested by the
user. The intention is to significantly improve rule dump rate by omitting
all static data that do not change after rule is created.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The kernel now requires all new nested attributes to set the
NLA_F_NESTED flag. Enable tc {qdisc,class,filter} to parse
attributes that have the NLA_F_NESTED flag set.
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
In oneline mode the line seperator should be \
but several parts of tc aren't doing it right.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The revert of batchsize accidently reverted more than it should
and broke shared block functionality. Fix this by restoring the
original functionality.
To reproduce:
dst_ip 192.0.2.0/24 action drop
Unknown filter "block", hence option "10" is unparsable
Fixes: e991c04d64 ("Revert "tc: Add batchsize feature for filter and actions"")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Many tc modules were printing error messages to stdout.
This is problematic if using JSON or other output formats.
Change all these places to use fprintf(stderr, ...) instead.
Also, remove unnecessary initialization and places
where else is used after error return.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
No function, filter, or print function uses the sockaddr_nl arg,
so just drop it.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
There is a couple of places where we report error in case of no network
device is found. In all of them we output message in the same format to
stderr and either return -1 or 1 to the caller or exit with -1.
Introduce new helper function nodev() that takes name of the network
device caused error and returns -1 to it's caller. Either call exit()
or return to the caller to preserve behaviour before change.
Use -nodev() in traffic control (tc) code to return 1.
Simplify expression for checking for argument being 0/NULL in @if
statement.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Implement the -color option; in this case -co is ambiguous
since it was already used for -conf.
For now this just means putting device name in color.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
So far, qdisc was the only handle that could be used to manipulate
filters. Kernel added support for using block to manipulate it. So add
the support to use block index to manipulate filters. The magic
TCM_IFINDEX_MAGIC_BLOCK indicates the block index is in use.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this support, at most 128
commands can be accumulated before sending to kernel.
Now it only works for the following successive commands:
1. filter add/delete/change/replace
2. actions add/change/replace
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Move resolving device name into an ifindex before calling filter
specific callbacks. This way if filters need the ifindex, they
can read it from the request.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
For places where tc is expecting device name use IFNAMSIZ.
For others where it is a filter name, introduce a new constant.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This is an update for 460c03f3f3 ("iplink: double the buffer size also in
iplink_get()"). After update, we will not need to double the buffer size
every time when VFs number increased.
With call like rtnl_talk(&rth, &req.n, NULL, 0), we can simply remove the
length parameter.
With call like rtnl_talk(&rth, nlh, nlh, sizeof(req), I add a new variable
answer to avoid overwrite data in nlh, because it may has more info after
nlh. also this will avoid nlh buffer not enough issue.
We need to free answer after using.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
The later check for 'k[0] != 0' requires a non-empty filter name,
otherwise NULL pointer dereference in 'q' might happen.
Signed-off-by: Phil Sutter <phil@nwl.cc>
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol ip \
u32 match u32 0 0 flowid 1:1 \
action ok
sudo $TC filter add dev $ETH parent ffff: prio 1 protocol ip \
u32 match ip protocol 1 0xff flowid 1:10 \
action ok
now dump to see all rules..
$TC -s filter ls dev $ETH parent ffff: protocol ip
....
filter pref 1 u32
filter pref 1 u32 fh 801: ht divisor 1
filter pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10 (rule hit 0 success 0)
match 00010000/00ff0000 at 8 (success 0 )
action order 1: gact action drop
random type none pass val 0
index 6 ref 1 bind 1 installed 4 sec used 4 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 (rule hit 336 success 336)
match 00000000/00000000 at 0 (success 336 )
action order 1: gact action pass
random type none pass val 0
index 5 ref 1 bind 1 installed 38 sec used 4 sec
Action statistics:
Sent 24864 bytes 336 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
....
..get filter 801::800
$TC -s filter get dev $ETH parent ffff: protocol ip \
handle 801:0:800 prio 2 u32
....
filter parent ffff: protocol ip pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10 (rule hit 260 success 130)
match 00010000/00ff0000 at 8 (success 130 )
action order 1: gact action drop
random type none pass val 0
index 6 ref 1 bind 1 installed 348 sec used 0 sec
Action statistics:
Sent 11440 bytes 130 pkt (dropped 130, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
....
..get other one
$TC -s filter get dev $ETH parent ffff: protocol ip \
handle 800:0:800 prio 2 u32
....
filter parent ffff: protocol ip pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 (rule hit 514 success 514)
match 00000000/00000000 at 0 (success 514 )
action order 1: gact action pass
random type none pass val 0
index 5 ref 1 bind 1 installed 506 sec used 4 sec
Action statistics:
Sent 35544 bytes 514 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
....
..try something that doesnt exist
$TC -s filter get dev $ETH parent ffff: protocol ip handle 800:0:803 prio 2 u32
.....
RTNETLINK answers: No such file or directory
We have an error talking to the kernel
.....
Note, added NLM_F_ECHO is for backward compatibility. old kernels never
before Eric's patch will not respond without it and newer kernels (after Erics patch)
will ignore it.
In old kernels there is a side effect:
In addition to a response to the GET you will receive an event (if you do tc mon).
But this is still better than what it was before (not working at all).
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
This big patch was compiled by vimgrepping for memset calls and changing
to C99 initializer if applicable. One notable exception is the
initialization of union bpf_attr in tc/tc_bpf.c: changing it would break
for older gcc versions (at least <=3.4.6).
Calls to memset for struct rtattr pointer fields for parse_rtattr*()
were just dropped since they are not needed.
The changes here allowed the compiler to discover some unused variables,
so get rid of them, too.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
There have been several instances where response from kernel
has overrun the stack buffer from the caller. Avoid future problems
by passing a size argument.
Also drop the unused peer and group arguments to rtnl_talk.
This work follows upon commit 6256f8c9e4 ("tc, bpf: finalize eBPF
support for cls and act front-end") and takes up the idea proposed by
Hannes Frederic Sowa to spawn a shell (or any other command) that holds
generated eBPF map file descriptors.
File descriptors, based on their id, are being fetched from the same
unix domain socket as demonstrated in the bpf_agent, the shell spawned
via execvpe(2) and the map fds passed over the environment, and thus
are made available to applications in the fashion of std{in,out,err}
for read/write access, for example in case of iproute2's examples/bpf/:
# env | grep BPF
BPF_NUM_MAPS=3
BPF_MAP1=6 <- BPF_MAP_ID_QUEUE (id 1)
BPF_MAP0=5 <- BPF_MAP_ID_PROTO (id 0)
BPF_MAP2=7 <- BPF_MAP_ID_DROPS (id 2)
# ls -la /proc/self/fd
[...]
lrwx------. 1 root root 64 Apr 14 16:46 0 -> /dev/pts/4
lrwx------. 1 root root 64 Apr 14 16:46 1 -> /dev/pts/4
lrwx------. 1 root root 64 Apr 14 16:46 2 -> /dev/pts/4
[...]
lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map
lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map
lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map
The advantage (as opposed to the direct/native usage) is that now the
shell is map fd owner and applications can terminate and easily reattach
to descriptors w/o any kernel changes. Moreover, multiple applications
can easily read/write eBPF maps simultaneously.
To further allow users for experimenting with that, next step is to add
a small helper that can get along with simple data types, so that also
shell scripts can make use of bpf syscall, f.e to read/write into maps.
Generally, this allows for prepopulating maps, or any runtime altering
which could influence eBPF program behaviour (f.e. different run-time
classifications, skb modifications, ...), dumping of statistics, etc.
Reference: http://thread.gmane.org/gmane.linux.network/357471/focus=357860
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Because we use the high 16 bits of tcm_info to pass prio value to
kernel, thus it's range would be [0, 0xffff], without validation
in tc when user pass a lager(>65535) priority, the actual priority
set in kernel would confuse the user.
So, add a validation to ensure prio in the range.
Both rtnl_talk and rtnl_dump had a callback for handling portions
of netlink message that do not match the correct pid or seq.
But this callback was never used by any part of iproute2 so remove
it.
> # tc filter show dev eth1 | grep 4:29:d1
> filter parent 1: protocol ip pref 5 u32 fh 4:29:d1 order 209 key ht 4
> bkt 29 flowid 1:b7aa
>
> # tc filter del dev eth1 parent 1: pref 5 handle 4:29:d1 u32
> RTNETLINK answers: Invalid argument
> We have an error talking to the kernel
>
> after rollback to package"sys-apps/iproute2-2.6.24.20080108" all
> deleted normal...
The current iproute version uses "protocol all" by default
if its not specified. This is actually only useful for creating
new filters, on deletion an unset protocol is treated as wildcard.
makes protocol accessible ..
cheers,
jamal
[PATCH 2/3] [TC/FILTERS] Expose the filter protocol
Expose the filter protocol so it can be used by underlying
classifiers when they need it.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
commit c504ffd627ac211eebf5ed34ef0fbfd7f1dbb347
Author: Patrick McHardy <kaber@trash.net>
Date: Wed Mar 26 07:38:43 2008 +0100
[IPROUTE]: Fix classifier help
The new check whether the user has specified a protocol makes
"ip filter <type> help" fails with "protocol is required".
This could be fixed by moving it further down, but a more user-friendly
way it to simply use ETH_P_ALL as default if nothing is specified.
Signed-off-by: Patrick McHardy <kaber@trash.net>
In order to support these new flags add current
linux/if.h into the directory with the local copies.
This caused troubles with outdated redefinitions from net/if.h
so I've removed the dependency on it.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
* "tc [class|qdisc|filter] get" doesn't exist, remove it from inline help.
* Add "replace" to "tc [class|filter] get" inline help.
* Fix "tc [class|qdisc|filter] help" output:
~$ tc class help
[snip]
Command "help" is unknown, try "tc class help".
~$
with my best wishes,
--
Hasso Tepper
Elion Enterprises Ltd. [AS3249]
Data Communication Network Administrator
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
User runs "tc monitor" (without quotes) and watches events of
addition, deletion and updates from qdiscs, classes, filters and
actions as they happen.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>