Commit Graph

575 Commits

Author SHA1 Message Date
Phil Sutter c15feb99a4 tc/m_gact: Fix action_a2n() return code check
The function returns zero on success.

Reported-by: Mark Bloch <markb@mellanox.com>
Fixes: 69f5aff63c ("tc: use action_a2n() everywhere")
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-08-08 08:52:47 -07:00
Phil Sutter 9579afb24e tc: Fix for missing estimator initialization
When switching to C99 initializers, I forgot to add this one. This means
that when trying to set an estimator value, tc would complain about a
spurious duplicate estimator parameter. Much worse, the uninitialized
variable content is sent to the kernel regardless of whether an
estimator was given or not.
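
A minimal sketch of why the missing initializer matters, assuming the
code treats a non-zero field as "estimator already given". The struct
and function names below are hypothetical stand-ins, not tc's real ones:

```c
/* Hypothetical stand-in for an estimator struct; illustrative only. */
struct est_sketch {
	signed char interval;
	unsigned char ewma_log;
};

/* With an initializer list, every member not listed is zeroed.
 * Declaring "struct est_sketch est;" without one would leave the
 * members indeterminate. */
static struct est_sketch est_default(void)
{
	struct est_sketch est = { 0 };

	return est;
}

/* Assumed convention: a non-zero ewma_log marks the estimator as set. */
static int est_given(const struct est_sketch *est)
{
	return est->ewma_log != 0;
}
```

With a zeroed struct, est_given() reports "not set" until the user
actually passes an estimator, so neither the duplicate check nor the
"send to kernel" branch fires by accident.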

Fixes: d17b136f7d ("Use C99 style initializers everywhere")
Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-08-06 10:14:06 -07:00
Phil Sutter 7093200611 tc: util: No need for action_n2a() to be reentrant
This allows removing some buffers here and there. While at it, make it
return a const value.
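
A sketch of what a non-reentrant name lookup can look like, assuming a
static fallback buffer for unknown values. branch_n2a() is a
hypothetical name and the numeric values are defined locally to mirror
the kernel's TC_ACT_* constants:

```c
#include <stdio.h>

/* Hypothetical sketch, not the real action_n2a(). Known values map to
 * string literals; unknown values are formatted into a static buffer,
 * so callers need no buffer of their own. The result must be used
 * before the next call. */
static const char *branch_n2a(int action)
{
	static char buf[32];

	switch (action) {
	case -1: return "continue";	/* TC_ACT_UNSPEC */
	case 0:  return "pass";		/* TC_ACT_OK */
	case 1:  return "reclassify";	/* TC_ACT_RECLASSIFY */
	case 2:  return "drop";		/* TC_ACT_SHOT */
	case 3:  return "pipe";		/* TC_ACT_PIPE */
	default:
		snprintf(buf, sizeof(buf), "%d", action);
		return buf;
	}
}
```

The const return type keeps callers from writing into string literals
or the shared buffer.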

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-07-25 08:10:43 -07:00
Phil Sutter 69f5aff63c tc: use action_a2n() everywhere
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-07-25 08:10:43 -07:00
Phil Sutter 53aadc5286 tc: util: bore up action_a2n()
It's a pity this function is used nowhere, so let's polish it for use:

* Loop over branch names, which makes it clear that all the former
  conditionals were exactly identical.
* Support 'pipe' branch name, too.
* Make number parsing optional.
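
The three points above can be sketched as a table-driven lookup. This
is an illustration under assumptions, not the actual action_a2n():
branch_a2n(), its allow_num parameter and the exact name table are
hypothetical, and the values are defined locally to mirror TC_ACT_*:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch. Returns 0 on success, -1 on failure. */
static int branch_a2n(const char *arg, int *result, int allow_num)
{
	static const struct {
		const char *name;
		int value;
	} names[] = {
		{ "continue",   -1 },	/* TC_ACT_UNSPEC */
		{ "pass",        0 },	/* TC_ACT_OK */
		{ "ok",          0 },
		{ "reclassify",  1 },	/* TC_ACT_RECLASSIFY */
		{ "drop",        2 },	/* TC_ACT_SHOT */
		{ "shot",        2 },
		{ "pipe",        3 },	/* TC_ACT_PIPE */
	};
	char *end;
	long n;
	size_t i;

	/* One loop replaces a chain of identical string comparisons. */
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strcmp(arg, names[i].name) == 0) {
			*result = names[i].value;
			return 0;
		}
	}

	/* Number parsing is optional. */
	if (!allow_num)
		return -1;
	n = strtol(arg, &end, 0);
	if (end == arg || *end != '\0')
		return -1;
	*result = (int)n;
	return 0;
}
```

Because zero means success, callers should test for a non-zero return
to detect a parse failure.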

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-07-25 08:10:43 -07:00
Phil Sutter 9ffc80b1e4 tc: Reformat tc_util.h
* Drop 'extern' keyword before function declarations.
* Add parameter names where they were missing, for consistency.
* Drop fancy indenting (e.g. tab between type and name).
* Break long lines to not exceed 80 columns.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-07-25 08:10:43 -07:00
Phil Sutter 247ace6115 tc: ematch: Ignore all-zero mask value when printing filters
The kernel considers the optional mask, which may be added to int
values, only if it is non-zero; tc should therefore print it only in
that case.

Without this, not passing a mask value like so:

| # tc filter add dev d0 parent 8001: \
| 	basic match meta\(vlan eq 1\) \
| 	classid 8001:1

Would lead to tc printing an all-zero mask later:

| # tc filter show dev d0
| filter parent 8001: protocol all pref 49151 basic
| filter parent 8001: protocol all pref 49151 basic handle 0x1 flowid 8001:1
|   meta(vlan mask 0x00000000 eq 1)

This is obviously confusing: an all-zero mask strictly means to
eliminate all bits from the value, but the kernel ignores such a mask
and compares the full value.
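
A minimal sketch of the printing rule, assuming a hypothetical
formatting helper (format_meta_value() is not a real tc function):

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: append "mask 0x..." only when the mask is
 * non-zero, matching how the kernel treats a zero mask as "no mask".
 * Returns the snprintf() result. */
static int format_meta_value(char *buf, size_t len, const char *name,
			     uint32_t mask)
{
	if (mask)
		return snprintf(buf, len, "%s mask 0x%08x", name,
				(unsigned int)mask);
	return snprintf(buf, len, "%s", name);
}
```

With a zero mask the helper yields "vlan" rather than
"vlan mask 0x00000000".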

Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-07-20 12:20:13 -07:00
Phil Sutter 30a8842c49 No need to initialize rtattr fields before parsing
Since parse_rtattr_flags() calls memset already, there is no need for
callers to do so themselves.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
2016-07-20 12:05:24 -07:00
Phil Sutter f89bb0210f Replace malloc && memset by calloc
This only replaces occurrences where the newly allocated memory is
cleared completely afterwards. In other cases calloc would be a
theoretical performance hit, although the code would be cleaner.
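
The pattern being replaced, as a small sketch (function names are
illustrative only):

```c
#include <stdlib.h>
#include <string.h>

/* Before: allocate, then clear the whole allocation by hand. */
static int *alloc_zeroed_old(size_t n)
{
	int *p = malloc(n * sizeof(*p));

	if (p)
		memset(p, 0, n * sizeof(*p));
	return p;
}

/* After: calloc() returns memory that is already zeroed. */
static int *alloc_zeroed_new(size_t n)
{
	return calloc(n, sizeof(int));
}
```

Both variants hand back identical memory contents; the second is just
shorter.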

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
2016-07-20 12:05:24 -07:00
Phil Sutter d17b136f7d Use C99 style initializers everywhere
This big patch was compiled by vimgrepping for memset calls and changing
them to C99 initializers where applicable. One notable exception is the
initialization of union bpf_attr in tc/tc_bpf.c: changing it would break
the build with older gcc versions (at least <=3.4.6).

Calls to memset for struct rtattr pointer fields for parse_rtattr*()
were just dropped since they are not needed.

The changes here allowed the compiler to discover some unused variables,
so get rid of them, too.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
2016-07-20 12:05:24 -07:00
Phil Sutter d892aaf740 tc: m_action: Improve conversion to C99 style initializers
This improves my initial change in the following points:

- Flatten embedded struct's initializers.
- No need to initialize variables to zero as the key feature of C99
  initializers is to do this implicitly.
- By relocating the declaration of struct rtattr *tail, it can be
  initialized at the same time.
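
A sketch of the flattened form, using hypothetical stand-in structs
rather than the real netlink types:

```c
/* Hypothetical stand-ins for a netlink header, a message body and the
 * request that embeds both. */
struct hdr_sketch {
	unsigned int len;
	unsigned short type;
	unsigned short flags;
};

struct msg_sketch {
	unsigned char family;
	int ifindex;
};

struct req_sketch {
	struct hdr_sketch n;
	struct msg_sketch t;
	char buf[64];
};

static struct req_sketch make_req(int ifindex)
{
	/* Nested designators (.n.len) avoid an inner brace level, and
	 * members that are not named (n.type, t.family, buf) are
	 * zeroed implicitly. */
	struct req_sketch req = {
		.n.len = sizeof(struct hdr_sketch) + sizeof(struct msg_sketch),
		.n.flags = 0x1,
		.t.ifindex = ifindex,
	};

	return req;
}
```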

Fixes: a0a73b298a ("tc: m_action: Use C99 style initializers for struct req")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
2016-07-20 12:05:24 -07:00
Daniel Borkmann e77fa41d4c bpf: also check elf for official e_machine value
Use the official BPF ELF e_machine value that was assigned recently [1]
and will be propagated to glibc, libelf et al. LLVM will switch to it
in 3.9 release, therefore we need to prepare tc to check for EM_BPF as
well; older versions still use EM_NONE.

  [1] 36b9c09330
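
A sketch of such a check. The constants are defined locally in case the
system's elf.h predates the assignment; 247 is the assigned EM_BPF
value, and bpf_machine_ok() is a hypothetical name:

```c
#ifndef EM_NONE
#define EM_NONE 0	/* no machine; emitted by older LLVM for BPF */
#endif
#ifndef EM_BPF
#define EM_BPF 247	/* official BPF e_machine value */
#endif

/* Accept object files from both old and new toolchains. */
static int bpf_machine_ok(unsigned int e_machine)
{
	return e_machine == EM_NONE || e_machine == EM_BPF;
}
```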

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2016-07-20 11:54:53 -07:00
Amir Vadai cfcabf18d8 tc: flower: Add skip_{hw|sw} support
On devices that support TC flower offloads, these flags enable a filter to be
added only to HW or only to SW. skip_sw and skip_hw are mutually exclusive
flags. By default, without any flags, the filter is added to both HW and SW,
but no error checks are done in case of failure to add to HW.
With skip_sw, failure to add to HW is treated as an error.
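
The mutual exclusion can be sketched as a simple flag check. The bit
values are defined locally and assumed to mirror TCA_CLS_FLAGS_SKIP_HW
and TCA_CLS_FLAGS_SKIP_SW; cls_flags_valid() is a hypothetical helper:

```c
#define SKIP_HW_SKETCH (1u << 0)	/* assumed: TCA_CLS_FLAGS_SKIP_HW */
#define SKIP_SW_SKETCH (1u << 1)	/* assumed: TCA_CLS_FLAGS_SKIP_SW */

/* Valid combinations: no flag (HW and SW), skip_hw only, skip_sw only.
 * Setting both would leave nowhere to install the filter. */
static int cls_flags_valid(unsigned int flags)
{
	unsigned int both = SKIP_HW_SKETCH | SKIP_SW_SKETCH;

	return (flags & both) != both;
}
```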

Here is a sample script that adds 2 filters, one with skip_sw and the other
with skip_hw flag.

   # add ingress qdisc
   tc qdisc add dev enp0s9 ingress

   # enable hw tc offload.
   ethtool -K enp0s9 hw-tc-offload on

   # add a flower filter with skip_sw flag.
   tc filter add dev enp0s9 protocol ip parent ffff: flower \
	   ip_proto 1 indev enp0s9 skip_sw \
	   action drop

   # add a flower filter with skip_hw flag.
   tc filter add dev enp0s9 protocol ip parent ffff: flower \
	   ip_proto 3 indev enp0s9 skip_hw \
	   action drop

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
2016-07-06 21:24:48 -07:00
Phil Sutter 5f6a467f59 tc: m_action: Drop unused variable nladdr in tc_action_gd()
This has been there since the introduction of tc/m_action.c back in 2004
and was apparently never in use.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-16 09:41:55 -07:00
Phil Sutter a0a73b298a tc: m_action: Use C99 style initializers for struct req
Instead of initializing fields after (or sometimes even before) zeroing
the whole struct via memset(), initialize the whole thing at declaration
time.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-16 09:41:55 -07:00
Alexander Aring 9b32f89693 tc: let m_ipt work with new iptables API headers
Since commit 5cd1adb ("Update to current iptables headers"), the build
of m_ipt.o with the following config fails:

TC_CONFIG_XT:=n
TC_CONFIG_XT_OLD:=n
TC_CONFIG_XT_OLD_H:=n

This patch renames "iptables_target" to "xtables_target", along with some
other things that were renamed, which I noticed while reading the iptables
git log. Functions which are not used in m_ipt.c and not exported by the
header are removed; those still used in m_ipt.c are made static.

Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
2016-06-14 18:03:30 -07:00
Stephen Hemminger 4b83a08c28 m_xt: whitespace cleanup
Make it 99% checkpatch clean.
2016-06-14 14:40:53 -07:00
Phil Sutter 2ef4008585 tc: m_xt: Introduce get_xtables_target_opts()
This pulls common code from parse_ipt() and print_ipt() functions
together.

While here, also fix incorrect use of the global 'optarg' variable
in print_ipt().

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter f6ddd9c5da tc: m_xt: Simplify argc adjusting in parse_ipt()
And while at it, also improve the error message in case too few
parameters have been given.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter 28432f370e tc: m_xt: Get rid of iargc variable in parse_ipt()
After dropping the unused decrement of argc in the function's tail, it
can fully take over what iargc has been used for.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter ab8f52fc4a tc: m_xt: Get rid of rargc in parse_ipt()
No need to copy the passed parameter, it's changed only once right
before function return.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter b0ba018576 tc: m_xt: Drop unused variable fw in parse_ipt()
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter b45f9141c2 tc: m_xt: Get rid of one indentation level in parse_ipt()
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter f1a7c7d830 tc: m_xt: Fix indenting
By exiting early if xtables_find_target() fails, one indenting level can
be dropped. Some of the wrongly indented code then happens to sit at the
right spot by accident which is why this patch is smaller than expected.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter 8eee75a835 tc: m_xt: Fix segfault when adding multiple actions at once
Without this, the following call to tc would segfault:

| tc filter add dev d0 parent ffff: u32 match u32 0 0 \
| 	action xt -j MARK --set-mark 0x1 \
| 	action xt -j MARK --set-mark 0x1

The reason is basically the same as for 6e2e5ec28b ("fix print_ipt:
segfault if more then one filter with action -j MARK.") but in
parse_ipt() instead of print_ipt().

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Phil Sutter 445745221a tc: m_xt: Prevent segfault with standard targets
Iptables standard targets like DROP or REJECT don't implement the print
callback in libxtables. Hence the following command would segfault:

| tc filter add dev d0 parent ffff: u32 match u32 0 0 action xt -j DROP

With this patch standard targets still can't be used (and are not really
useful anyway), but at least it doesn't crash anymore.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-14 14:35:56 -07:00
Stephen Hemminger 8b625177ba pedit: fix whitespace etc
Minor changes from checkpatch
2016-06-14 14:32:27 -07:00
Jamal Hadi Salim d8694a30a4 action pedit: stylistic changes
More modern layout.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-06-14 14:29:20 -07:00
Stephen Hemminger 622812052a tc: f_u32 cleanup indentation and long lines
Several long lines and too long messages here.
2016-06-08 16:45:26 -07:00
Samudrala, Sridhar 5e5b3008d1 tc: f_u32: Add support for skip_hw and skip_sw flags
On devices that support TC U32 offloads, these flags enable a filter to be
added only to HW or only to SW. skip_sw and skip_hw are mutually exclusive
flags. By default, without any flags, the filter is added to both HW and SW,
but no error checks are done in case of failure to add to HW.
With skip_sw, failure to add to HW is treated as an error.

Here is a sample script that adds 2 filters, one with skip_sw and the other
with skip_hw flag.

   # add ingress qdisc
   tc qdisc add dev p4p1 ingress

   # enable hw tc offload.
   ethtool -K p4p1 hw-tc-offload on

   # add u32 filter with skip_sw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:1 u32 ht 800: flowid 800:1 \
      skip_sw \
      match ip src 192.168.1.0/24 \
      action drop

   # add u32 filter with skip_hw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:2 u32 ht 800: flowid 800:2 \
      skip_hw \
      match ip src 192.168.2.0/24 \
      action drop

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
2016-06-08 16:39:30 -07:00
Sabrina Dubroca 9f7401fa49 utils: add get_be{16, 32, 64}, use them where possible
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
2016-06-08 09:30:37 -07:00
Eric Dumazet 4de4b5ca14 fq_codel: add per queue memory limit
This patch adds support for TCA_FQ_CODEL_MEMORY_LIMIT attribute.

..
qdisc fq_codel 8008: root refcnt 257 limit 10240p flows 1024
 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
 Sent 2083566791363 bytes 1376214889 pkt (dropped 4994406, overlimits 0 requeues 21705223)
 rate 9841Mbit 812549pps backlog 3906120b 376p requeues 21705223
  maxpacket 68130 drop_overlimit 4994406 new_flow_count 28855414
  ecn_mark 0 memory_used 4190048 drop_overmemory 4994406
new_flows_len 1 old_flows_len 177

Signed-off-by: Eric Dumazet <edumazet@google.com>
2016-06-08 08:42:00 -07:00
Jamal Hadi Salim ead954cbd4 tc action policer: enable timestamp display
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-31 13:03:13 -07:00
Jamal Hadi Salim 82e6efe2e3 tc filter u32: Coding style fixes
"handle" was being used several times for different things.
Fix the 80 character limit abuse and other little issues while at it.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-31 12:33:48 -07:00
Stephen Hemminger e6263c8583 tc: action result is u32
In the kernel, the action result in netlink messages is u32, not int.
2016-05-31 12:22:45 -07:00
Jamal Hadi Salim 45c6837911 tc action policer: Avoid nonsensical input
The user must at least specify a choice of the token bucket or
ewma policing or late binding index. TB policing requires at minimum
a rate and burst.
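
The rule can be sketched as a small validation function. This is an
illustration of the stated policy only; struct police_sketch and
police_args_ok() are hypothetical and do not reflect the real parser:

```c
/* Hypothetical summary of what the user supplied; 0 means "not given". */
struct police_sketch {
	unsigned int rate;	/* token bucket rate */
	unsigned int burst;	/* token bucket burst */
	unsigned int avrate;	/* ewma policing rate */
	unsigned int index;	/* late binding index */
};

/* Returns 1 if the input makes sense under the rule above, else 0. */
static int police_args_ok(const struct police_sketch *p)
{
	int tb = p->rate || p->burst;

	/* At least one of token bucket, ewma or index is required. */
	if (!tb && !p->avrate && !p->index)
		return 0;
	/* Token bucket policing needs both rate and burst. */
	if (tb && (!p->rate || !p->burst))
		return 0;
	return 1;
}
```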

In addition fix formatting issues (80 chars etc).

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-31 12:16:45 -07:00
David Ahern 57bdf8b764 Make builds default to quiet mode
Similar to the Linux kernel and perf, add infrastructure to reduce the
amount of output tossed to a user during a build. Full build output
can be obtained with 'make V=1'.

Builds go from:

make[1]: Leaving directory `/home/dsa/iproute2.git/lib'
make[1]: Entering directory `/home/dsa/iproute2.git/ip'
gcc -Wall -Wstrict-prototypes  -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE    -c -o ip.o ip.c
gcc -Wall -Wstrict-prototypes  -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE    -c -o ipaddress.o ipaddress.c

to:

...
    AR       libutil.a

ip
    CC       ip.o
    CC       ipaddress.o
...

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2016-05-31 12:13:07 -07:00
Jamal Hadi Salim e70b9f16ea tc simple action: bug fix
Compilation failed with:
m_simple.c: In function ‘parse_simple’:
m_simple.c:154:6: warning: too many arguments for format [-Wformat-extra-args]
      *argv);
      ^
m_simple.c:103:14: warning: unused variable ‘maybe_bind’ [-Wunused-variable]

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-31 12:11:52 -07:00
Jamal Hadi Salim a78a2dba27 tc fix ife late binding
The following late binding didn't work:

sudo tc actions add action ife encode \
type 0xDEAD allow mark dst 02:15:15:15:15:15 index 1

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-23 16:15:31 -07:00
Daniel Borkmann 1a0320727c f_bpf: fix filling of handle when no further arg is provided
We need to fill handle when provided by the user, even if no further
argument is provided. Thus, move the test for arg to the correct location,
so that it works correctly:

  # tc filter show dev foo egress
  filter protocol all pref 1 bpf
  filter protocol all pref 1 bpf handle 0x1 bpf.o:[classifier] direct-action
  filter protocol all pref 1 bpf handle 0x2 bpf.o:[classifier] direct-action
  # tc filter del dev foo egress prio 1 handle 2 bpf
  # tc filter show dev foo egress
  filter protocol all pref 1 bpf
  filter protocol all pref 1 bpf handle 0x1 bpf.o:[classifier] direct-action

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2016-05-23 16:14:18 -07:00
Daniel Borkmann a2de651e64 ingress, clsact: don't add TCA_OPTIONS to nl msg
The ingress and clsact qdiscs ignore TCA_OPTIONS, since they are
parameterless. In tc, we nevertheless add an empty
addattr_l(... TCA_OPTIONS, NULL, 0) to the netlink message. This
has the side effect that when someone tries a 'tc qdisc replace'
and such a qdisc is already present, tc fails with EINVAL.

The reason is that in the kernel, this invokes qdisc_change() when
the requested qdisc is already present. When TCA_OPTIONS are
passed to modify parameters, it looks whether qdisc implements
.change() callback, and if not present (like in both cases here)
it returns with error. Rather than adding an empty stub to the
kernel that ignores TCA_OPTIONS again, just don't add TCA_OPTIONS
to the netlink message in the first place.

Before:

  # tc qdisc replace dev foo clsact    # first try
  # tc qdisc replace dev foo clsact    # second one
  RTNETLINK answers: Invalid argument

After:

  # tc qdisc replace dev foo clsact
  # tc qdisc replace dev foo clsact
  # tc qdisc replace dev foo clsact

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2016-05-16 11:20:50 -07:00
Jamal Hadi Salim fdf1bdd0f1 tc simple action update and breakage
Brings it closer to more serious actions (adding branching
and allowing for late binding).

Unfortunately, this breaks the old syntax of the simple action.
But because simple is a pedagogical example unlikely to be used
in production environments (i.e. its role is to serve as an example
of how to write actions), this is OK.

New syntax for simple has new keyword "sdata". Example usage is:

sudo tc actions add action simple sdata "foobar" index 1
or
tc filter add dev $DEV parent ffff: protocol ip prio 1 u32 \
match ip dst 17.0.0.1/32 flowid 1:10 action simple sdata "foobar"

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-16 11:15:12 -07:00
Jamal Hadi Salim 43726b750a tc: don't ignore ok as an action branch
This is what used to happen before:

tc filter add dev tap1 parent ffff: protocol 0xfefe prio 10 \
     u32 match u32 0 0 flowid 1:16 \
     action ife decode allow mark ok

tc -s filter ls dev tap1 parent ffff:
filter protocol [65278] pref 10 u32
filter protocol [65278] pref 10 u32 fh 800: ht divisor 1
filter protocol [65278] pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:16
  match 00000000/00000000 at 0
        action order 1: ife decode action pipe
         index 2 ref 1 bind 1 installed 4 sec used 4 sec
         type: 0x0
         Metadata: allow mark
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: gact action pass
         random type none pass val 0
         index 1 ref 1 bind 1 installed 4 sec used 4 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

Note the extra action added at the end.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-16 11:13:58 -07:00
Jamal Hadi Salim d3e511223f tc: introduce IFE action
This action allows for a sending side to encapsulate arbitrary metadata
which is decapsulated by the receiving end.
The sender runs in encoding mode and the receiver in decode mode.
Both sender and receiver must specify the same ethertype.
At some point we hope to have a registered ethertype and we'll
then provide a default so the user doesn't have to specify it.
For now we require the user to specify it.

Described in netdev01 paper:
   "Distributing Linux Traffic Control Classifier-Action Subsystem"
    Authors: Jamal Hadi Salim and Damascene M. Joachimpillai

Also refer to IETF draft-ietf-forces-interfelfb-04.txt

Let's show example usage where we encode icmp from a sender towards
a receiver with an skbmark of 17; both sender and receiver use
ethertype of 0xdead to interop.

YYYY: Let's start with Receiver-side policy config:
xxx: add an ingress qdisc
sudo tc qdisc add dev $ETH ingress

xxx: any packets with ethertype 0xdead will be subjected to ife decoding
xxx: we then restart the classification so we can match on icmp at prio 3
sudo tc filter add dev $ETH parent ffff: prio 2 protocol 0xdead \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

xxx: on restarting the classification from above if it was an icmp
xxx: packet, then match it here and continue to the next rule at prio 4
xxx: which will match based on skb mark of 17
sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

xxx: match on skbmark of 0x11 (decimal 17) and accept
sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \
handle 0x11 fw flowid 1:1 \
action ok

xxx: Let's show the decoding policy
sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead
xxx:
filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 0 success 0)
  match 00000000/00000000 at 0 (success 0 )
	action order 1: ife decode action reclassify type 0x0
	 allow mark allow prio
	 index 11 ref 1 bind 1 installed 45 sec used 45 sec
	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0

xxx:
Observe that the above lists all the metadata it can decode. Typically these
submodules will already be compiled into a monolithic kernel or
loaded as modules.

YYYY: Let's show the sender side now.
xxx: Add an egress qdisc on the sender netdev
sudo tc qdisc add dev $ETH root handle 1: prio
xxx:
xxx: Match all icmp packets to 192.168.122.237/24, then
xxx: tag the packet with skb mark of decimal 17, then
xxx: Encode it with:
xxx:    ethertype 0xdead
xxx:    add skb->mark to whitelist of metadata to send
xxx:    rewrite target dst MAC address to 02:15:15:15:15:15
xxx:
sudo tc filter add dev $ETH parent 1: protocol ip prio 10  u32 \
match ip dst 192.168.122.237/24 \
match ip protocol 1 0xff \
flowid 1:2 \
action skbedit mark 17 \
action ife encode \
type 0xDEAD \
allow mark \
dst 02:15:15:15:15:15

xxx: Let's show the encoding policy
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2  (rule hit 118 success 0)
  match c0a87a00/ffffff00 at 16 (success 0 )
  match 00010000/00ff0000 at 8 (success 0 )
	action order 1:  skbedit mark 17
	 index 11 ref 1 bind 1 installed 3 sec used 3 sec
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0

	action order 2: ife encode action pipe type 0xDEAD
	 allow mark dst 02:15:15:15:15:15
	 index 12 ref 1 bind 1 installed 3 sec used 3 sec
	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
	backlog 0b 0p requeues 0
xxx:

Now test by sending ping from sender to destination
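
As a side note, the u32 match lines in the encoding policy above can be
reproduced by hand. The sketch below (helper names are illustrative)
derives c0a87a00/ffffff00 from 192.168.122.237/24 and 00010000/00ff0000
from "ip protocol 1 0xff". The offsets follow the IPv4 header layout:
the destination address starts at byte 16, and the protocol byte is the
second byte of the 32-bit word at offset 8.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
static uint32_t ipv4_sketch(unsigned a, unsigned b, unsigned c, unsigned d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Netmask for a prefix length of 0..32. */
static uint32_t prefix_mask(unsigned int plen)
{
	return plen ? (uint32_t)(0xffffffffu << (32 - plen)) : 0;
}

/* The key value is the address with host bits cleared. */
static uint32_t dst_key(uint32_t addr, unsigned int plen)
{
	return addr & prefix_mask(plen);
}

/* Protocol sits in bits 16..23 of the word at offset 8
 * (TTL, protocol, header checksum). */
static uint32_t proto_key(uint8_t proto)
{
	return (uint32_t)proto << 16;
}
```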

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-05-16 11:13:26 -07:00
Gustavo Zacarias 5c5a0f3df9 iproute2: tc_bpf.c: fix building with musl libc
We need limits.h for PATH_MAX, fixes:

tc_bpf.c: In function ‘bpf_map_selfcheck_pinned’:
tc_bpf.c:222:12: error: ‘PATH_MAX’ undeclared (first use in this function)
  char file[PATH_MAX], buff[4096];

Signed-off-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2016-04-11 22:09:57 +00:00
Daniel Borkmann 4dd3f50af4 tc, bpf: add support for map pre/allocation
Follow-up to kernel commit 6c9059817432 ("bpf: pre-allocate hash map
elements"). Add flags support, so that we can pass in BPF_F_NO_PREALLOC
flag for disallowing preallocation. Update examples accordingly and also
remove the BPF_* map helper macros from them as they were not very useful.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2016-04-11 21:54:47 +00:00
Daniel Borkmann afc1a2000b tc, bpf: further improve error reporting
Make it easier to spot issues when loading the object file fails. This
includes reporting in what pinned object specs differ, better indication
when we've reached instruction limits. Don't retry to load a non relo
program once we failed with bpf(2), and report out of bounds tail call key.

Also, add truncation of huge log outputs by default. Sometimes errors are
quite easy to spot by only looking at the tail of the verifier log, but
logs can get huge, e.g. up to a few MB (due to the verifier checking all
possible program paths). Thus, by default limit output to the last 4096
bytes and indicate that it's truncated. For the full log, the verbose option
can be used.
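
A sketch of the tail truncation, assuming the log is a NUL-terminated
string. log_tail() is a hypothetical helper; the caller would pass 4096
as the limit and print a truncation notice when the flag is set:

```c
#include <string.h>

/* Return a pointer to at most the last 'max' bytes of 'log' and report
 * through 'truncated' whether anything was cut off at the front. */
static const char *log_tail(const char *log, size_t max, int *truncated)
{
	size_t len = strlen(log);

	*truncated = len > max;
	return *truncated ? log + (len - max) : log;
}
```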

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2016-04-11 21:53:58 +00:00
Jiri Pirko 4952b45946 include: add linked list implementation from kernel
Rename hlist.h to list.h while adding it, to align with the kernel.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2016-03-27 10:56:11 -07:00
Stephen Hemminger e9e9365b56 scrub out whitespace issues
Run script that removes trailing whitespace everywhere.
2016-03-27 10:50:14 -07:00
Phil Sutter 7faf1588a7 lib/utils: introduce rt_addr_n2a_rta()
This simple macro eases calling rt_addr_n2a() with data from a struct
rtattr pointer.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-03-27 10:37:35 -07:00