Commit Graph

747 Commits

Author SHA1 Message Date
Stephen Hemminger 2ecb169280 Merge branch 'master' into net-next 2017-05-30 17:40:57 -07:00
Phil Sutter f6fc1055e4 tc: m_xt: Prevent a segfault in libipt
This happens with NAT targets, such as SNAT, DNAT and MASQUERADE. These
are still not usable with this patch, but at least tc doesn't crash
anymore when one tries to use them.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2017-05-30 17:38:19 -07:00
Roman Mashak cba134ae70 tc: fix Makefile to build skbmod
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
2017-05-22 13:33:51 -07:00
Jiri Pirko d19f72f789 tc/actions: introduce support for goto chain action
Allow user to set control action "goto" with filter chain index as
a parameter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-05-22 13:31:51 -07:00
Jiri Pirko e67aba5595 tc: actions: add helpers to parse and print control actions
Each tc action is terminated by a control action. Each action parses and
prints then intividually. Introduce set of helpers and allow to share
this code.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-05-22 13:31:51 -07:00
Jiri Pirko 732f03461b tc_filter: add support for chain index
Allow user to put filter to a specific chain identified by index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-05-22 13:31:51 -07:00
Khem Raj ae717baf15 tc: include stdint.h explicitly for UINT16_MAX
Fixes
| tc_core.c:190:29: error: 'UINT16_MAX' undeclared (first use in this function); did you mean '__INT16_MAX__'?
|    if ((sz >> s->size_log) > UINT16_MAX) {
|                              ^~~~~~~~~~

Signed-off-by: Khem Raj <raj.khem@gmail.com>
2017-05-22 11:41:53 -07:00
Amir Vadai f3e1b2448a pedit: Introduce ipv6 support
Add support for modifying IPv6 headers using pedit.

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-15 15:05:20 -07:00
Amir Vadai a13426fe1a pedit: Check for extended capability in protocol parser
Do not allow using eth and udp header types if non-extended pedit kABI
is being used. Other protocol parsers already have this check.

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-15 15:05:20 -07:00
Amir Vadai cdca191862 pedit: Do not allow using retain for too big fields
Using retain for fields longer than 32 bits is not supported.
Do not allow user to do it.

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-15 15:05:20 -07:00
Amir Vadai 290cdc058d pedit: Fix a typo in warning
'ex' attribute should be placed after 'action pedit' and not after
'munge'.

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-15 15:05:20 -07:00
Or Gerlitz e57285b81a tc: Reflect HW offload status
Currently there is no way of querying whether a filter is
offloaded to HW or not when using "both" policy (where none
of skip_sw or skip_hw flags are set by user-space).

Add two new flags, "in hw" and "not in hw" such that user
space can determine if a filter is actually offloaded to
hw or not. The "in hw" UAPI semantics was chosen so it's
similar to the "skip hw" flag logic.

If none of these two flags are set, this signals running
over older kernel.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
2017-05-05 09:49:25 -07:00
Stephen Hemminger d2b9100a08 Merge branch 'master' into net-next 2017-05-01 09:26:51 -07:00
Stephen Hemminger 1e600da057 pedit: fix whitespace
Add newlines to break long lines.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-05-01 09:25:22 -07:00
Or Gerlitz 3d2a7781ec tc/pedit: p_udp: introduce pedit udp support
For example, forward udp traffic destined to port 999 to veth0 and set
tcp port to 888:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
    flower \
      ip_proto udp \
      dst_port 999 \
    action pedit ex munge \
      udp dport set 888 \
    action mirred egress \
      redirect dev veth0

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai 2c6eb12ab8 tc/pedit: p_tcp: introduce pedit tcp support
For example, forward tcp traffic destined to port 80 to veth0 and set
tcp port to 8080:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
    flower \
      ip_proto tcp \
      dst_port 80 \
    action pedit ex munge \
      tcp dport set 8080 \
    action mirred egress \
      redirect dev veth0

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai 3cd5149ecd tc/pedit: p_eth: ETH header editor
For example, forward tcp traffic to veth0 and set
destination mac address to 11:22:33:44:55:66 :
$ tc filter add dev enp0s9 protocol ip parent ffff: \
    flower \
      ip_proto tcp \
    action pedit ex munge \
      eth dst set 11:22:33:44:55:66 \
    action mirred egress \
      redirect dev veth0

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai fa4652ff3b tc/pedit: Support fields bigger than 32 bits
Make parse_val() accept fields up to 128 bits long, this should be
enough for current use cases and involves a minimal change to code.

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai 8d193d9607 tc/pedit: p_ip: introduce editing ttl header
Enable user to edit IP header ttl field.

For example, to forward any TCP packet and decrease its TTL by one:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
    flower \
      ip_proto tcp \
    action pedit ex munge \
      ip ttl add 0xff pipe \
    action mirred egress \
      redirect dev veth0

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai c05ddaf9e0 tc/pedit: Introduce 'add' operation
This command could be useful to increase/decrease fields value.

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai 7c71a40cbd tc/pedit: Extend pedit to specify offset relative to mac/transport headers
Utilize the extended pedit netlink to set an offset relative to a
specific header type. Old netlink only enabled the user to set
approximated  offset relative to the IPv4 header.

To use this extended functionality need to use the 'ex' keyword after
'pedit' and before any 'munge'.
e.g:
$ tc filter add dev ens9 protocol ip parent ffff: \
    flower \
      ip_proto udp \
      dst_port 80 \
    action pedit ex munge \
      ip dst set 1.1.1.1 \
      pipe \
    action mirred egress redirect dev veth0

Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Amir Vadai 51536ebbe8 tc/pedit: Fix a typo in pedit usage message
Signed-off-by: Amir Vadai <amir@vadai.me>
2017-05-01 09:22:16 -07:00
Stephen Hemminger 590dde3a98 Merge branch 'master' into net-next 2017-04-23 09:14:35 -07:00
Jamal Hadi Salim fd8b3d2c1b actions: Add support for user cookies
Make use of 128b user cookies

Introduce optional 128-bit action cookie.
Like all other cookie schemes in the networking world (eg in protocols
like http or existing kernel fib protocol field, etc) the idea is to
save user state that when retrieved serves as a correlator. The kernel
_should not_ intepret it. The user can store whatever they wish in the
128 bits.

Sample exercise(showing variable length use of cookie)

.. create an accept action with cookie a1b2c3d4
sudo $TC actions add action ok index 1 cookie a1b2c3d4

.. dump all gact actions..
sudo $TC -s actions ls action gact

    action order 0: gact action pass
     random type none pass val 0
     index 1 ref 1 bind 0 installed 5 sec used 5 sec
    Action statistics:
    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
    cookie a1b2c3d4

.. bind the accept action to a filter..
sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 1

... send some traffic..
$ ping 127.0.0.1 -c 3
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.038 ms

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2017-04-23 09:10:02 -07:00
Stephen Hemminger f4878dfae4 Merge branch 'master' into net-next 2017-04-04 14:56:41 -07:00
Roman Mashak 878babffec tc: print skbedit action when dumping actions.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
2017-04-04 14:48:54 -07:00
Jiri Kosina 7c581a124d iproute2: add support for invisible qdisc dumping
Support the new TCA_DUMP_INVISIBLE netlink attribute that allows asking
kernel to perform 'full qdisc dump', as for historical reasons some of the
default qdiscs are being hidden by the kernel.

The command syntax is being extended by voluntary 'invisible' argument to
'tc qdisc show'.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-03-14 16:37:08 -07:00
Stephen Hemminger 60ccfcd7f2 pie: remove always false condition
When built with GCC warnings enabled:
q_pie.c: In function ‘pie_parse_opt’:
q_pie.c:78:38: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
        (alpha > ALPHA_MAX) || (alpha < ALPHA_MIN)) {
                                      ^
q_pie.c:85:35: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
        (beta > BETA_MAX) || (beta < BETA_MIN)) {
                                   ^

This is because MIN is 0 and unsigned number can never be less than 0.
Therefore just remove the _MIN values.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-03-10 08:58:01 -08:00
Stephen Hemminger a59b616200 tc: use rta_getattr_u32
Don't cast RTA_DATA use newish accessors.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-02-24 15:24:34 -08:00
Jiri Kosina be67f81297 iproute2: tc: introduce build dependency on libnetlink
Rebuilding libnetlink doesn't trigger rebuild of tc, which is wrong
(especially so for builds where libnetlink.a gets statically linked into
tc). Fix that by introducing an explicit dependency.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-02-24 15:11:32 -08:00
Stephen Hemminger 9f1370c0e5 netlink route attribute cleanup
Use the new helper functions rta_getattr_u* instead of direct
cast of RTA_DATA().  Where RTA_DATA() is a structure, then remove
the unnecessary cast since RTA_DATA() is void *

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-02-24 08:56:38 -08:00
Daniel Borkmann e37d706b56 {f,m}_bpf: dump tag over insns
We already export TCA_BPF_TAG resp. TCA_ACT_BPF_TAG from kernel commit
f1f7714ea51c ("bpf: rework prog_digest into prog_tag"), thus also dump
it when filter/actions are shown.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-02-23 09:02:19 -08:00
Roi Dayan 164a9ff401 tc: flower: Fix parsing ip address
Fix order of arguments when passed to __flower_parse_ip_addr.

Fixes: ("f888f4e20534 tc: flower: Support matching ARP")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
2017-02-23 09:01:15 -08:00
Stephen Hemminger 732b18af97 Merge branch 'merge-4.10' into next-merge 2017-02-17 15:32:28 -08:00
Simon Horman 6374961a00 tc: flower: support masked ICMP code and type match
Extend ICMP code and type match to support masks.

Also add missing documentation to synopsis in manpage.

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
	indev eth0 ip_proto icmpv6 type 128/240 code 0 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-02-17 15:32:03 -08:00
Simon Horman 9d36e54f36 tc: flower: provide generic masked u8 print helper
Provide generic masked u8 print helper and use it to print arp operations.

Also:
* Make name parameter of arp op print helper const.
* Consistently use __u8 rather than uint8_t, in keeping with the
  pervasive style in the file.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-02-17 15:32:03 -08:00
Simon Horman 180136e540 tc: flower: provide generic masked u8 parser helper
Provide generic masked u8 paser helper and use it to parse arp operations.

Also consistently use __u8 rather than uint8_t, in keeping with the
pervasive style in the file.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-02-17 15:32:03 -08:00
Or Gerlitz afdc1fed24 tc: matchall: Print skip flags when dumping a filter
Print the skip flags when we dump a filter.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
2017-02-17 15:25:24 -08:00
Simon Horman c7ec052bb8 tc: flower: Update documentation to indicate ARP takes IPv4 prefixes
Unlike other PREFIXes documented in the usage for tc flower, which accept
both IPv4 and IPv6 prefixes, arp_sip and arp_tip only accepts IPv4
prefixes.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-02-08 11:39:33 -08:00
Simon Horman 81f6e5a727 tc: flower: use correct type when calling flower_icmp_attr_type
Use enum flower_icmp_field rather than bool as type of third parameter
when calling flower_icmp_attr_type.

Fixes: eb3b5696f1 ("tc: flower: support matching on ICMP type and code")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-02-08 11:37:44 -08:00
Yotam Gigi 0b1abd84fb tc: Add support for the sample tc action
The sample tc action allows sampling packets matching a classifier. It
peeks randomly packets, and samples them using the psample netlink
channel. The user can specify the psample group, which the packet will be
sampled to, the sampling rate and the packet truncation (to save
kernel-user traffic).

The sampled packets contain informative metadata, for example, the input
interface and the original packet length.

The action syntax:
tc filter add [...] \
	action sample rate <RATE> group <GROUP> [trunc <SIZE>]
	[...]

Where:
  RATE := The sampling rate which is the ratio of packets observed at the
	  data source to the samples generated
  GROUP := the psample module sampling group
  SIZE := optional truncation size

An example for a common usecase of the sample tc action: to sample ingress
traffic from interface eth1, one may use the commands:

tc qdisc add dev eth1 handle ffff: ingress

tc filter add dev eth1 parent ffff: \
       matchall action sample rate 12 group 4

Where the first command adds an ingress qdisc and the second starts
sampling randomly with an average of one sampled packet per 12 packets
on dev eth1 to psample group 4.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
2017-02-06 14:24:52 -08:00
Stephen Hemminger fefc93bb28 Merge branch 'master' into net-next 2017-01-29 20:30:05 -08:00
Roman Mashak 31951c47e9 tc: distinguish Add/Replace action operations.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Phil Sutter <phil@nwl.cc>
2017-01-29 20:26:44 -08:00
Benjamin LaHaise 4f7d406f5d f_flower: don't set TCA_FLOWER_KEY_ETH_TYPE for "protocol all"
v2 - update to address changes in 00697ca19a.

When using the tc flower filter, rules marked with "protocol all" do not
actually match all packets.  This is due to a bug in f_flower.c that passes
in ETH_P_ALL in the TCA_FLOWER_KEY_ETH_TYPE attribute when adding a rule.
Fix this by omitting TCA_FLOWER_KEY_ETH_TYPE if the protocol is set to
ETH_P_ALL.

Fixes: 488b41d020 ("tc: flower no need to specify the ethertype")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Benjamin LaHaise <benjamin.lahaise@netronome.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2017-01-29 20:23:58 -08:00
Paul Blakey 08f66c80c0 tc: flower: Refactor matching flags to be more user friendly
Instead of "magic numbers" we can now specify each flag
by name. Prefix of "no"  (e.g nofrag) unsets the flag,
otherwise it wil be set.

Example:
    # add a flower filter that will drop fragmented packets
    tc filter add dev ens4f0 protocol ip parent ffff: \
            flower \
            src_mac e4:1d:2d:fd:8b:01 \
            dst_mac e4:1d:2d:fd:8b:02 \
            indev ens4f0 \
            ip_flags frag \
    action drop

    # add a flower filter that will drop non-fragmented packets
    tc filter add dev ens4f0 protocol ip parent ffff: \
            flower \
            src_mac e4:1d:2d:fd:8b:01 \
            dst_mac e4:1d:2d:fd:8b:02 \
            indev ens4f0 \
            ip_flags nofrag \
    action drop

Fixes: 22a8f01989 ('tc: flower: support matching flags')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-01-20 10:36:45 -08:00
Davide Caratti 6561cb28f2 tc: m_csum: add support for SCTP checksum
'sctp' parameter can now be used as 'csum' target to enable CRC32c
computation on SCTP packets.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2017-01-20 09:32:08 -08:00
Stephen Hemminger 9174b4cf3e Merge branch 'master' into net-next 2017-01-20 09:27:57 -08:00
Roi Dayan 00697ca19a tc: flower: Fix incorrect error msg about eth type
addattr16 may return an error about the nl msg size
but not about incorrect eth type.

Fixes: 488b41d020 ("tc: flower no need to specify the ethertype")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
2017-01-20 09:27:34 -08:00
Roi Dayan c85609b25f tc: flower: Add missing err check when parsing flower options
addattr32 may return an error.

Fixes: cfcabf18d8 ("tc: flower: Add skip_{hw|sw} support")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
2017-01-20 09:27:34 -08:00
Roi Dayan b2141de1ad tc: flower: Fix flower output for src and dst ports
This fix a missing use case after the introduction of enum flower_endpoint.

Fixes: 6910d65661 ("tc: flower: introduce enum flower_endpoint")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
2017-01-17 08:45:22 -08:00
Phil Sutter a05b9557f4 tc: m_xt: Drop needless parentheses from #if checks
Signed-off-by: Phil Sutter <phil@nwl.cc>
2017-01-13 16:33:54 -08:00
Simon Horman f888f4e205 tc: flower: Support matching ARP
Support matching on ARP operation, and hardware and protocol addresses
for Ethernet hardware and IPv4 protocol addresses.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol arp parent ffff: flower indev eth0 \                    arp_op request arp_sip 10.0.0.1 action drop
tc filter add dev eth0 protocol rarp parent ffff: flower indev eth0 \                   arp_op reply arp_tha 52:54:3f:00:00:00/24 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-01-12 17:46:37 -08:00
Stephen Hemminger 51dd3455a3 Merge branch 'master' into net-next 2017-01-12 17:44:44 -08:00
Phil Sutter 97a02cabef tc: m_xt: Fix segfault with iptables-1.6.0
Said iptables version introduced struct xtables_globals field
'compat_rev', a function pointer. Initializing it is mandatory as
libxtables calls it without existence check.

Without this, tc segfaults when using the xt action like so:

| tc filter add dev d0 parent ffff: u32 match u32 0 0 \
|	action xt -j MARK --set-mark 20

Signed-off-by: Phil Sutter <phil@nwl.cc>
2017-01-12 17:32:26 -08:00
Simon Horman a5ae170ed8 tc: flower: Update dest UDP port documentation
Since 41aa17ff46 ("tc/cls_flower: Add dest UDP port to tunnel params")
tc flower supports setting the dest UDP port.

* Use "port_number" to be consistent with other man-page text
* Re-add "enc_dst_port" documentation to manpage which was
  accidently removed by b2a1f740aa ("tc: flower: document that *_ip
  parameters take a PREFIX as an argument.")

Cc: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-01-09 12:09:46 -08:00
Stephen Hemminger 1693e4f257 Merge branch 'master' into net-next 2017-01-09 12:08:34 -08:00
David Michael bb18c98198 tc: make tc linking depend on libtc.a
There was a race condition where the command to link the tc binary
could (rarely) run before the libtc.a archive existed.
2017-01-09 12:06:58 -08:00
Paul Blakey 22a8f01989 tc: flower: support matching flags
Enhance flower to support matching on flags.

The 1st flag allows to match on whether the packet is
an IP fragment.

Example:

	# add a flower filter that will drop fragmented packets
	# (bit 0 of control flags)
	tc filter add dev ens4f0 protocol ip parent ffff: \
		flower \
		src_mac e4:1d:2d:fd:8b:01 \
		dst_mac e4:1d:2d:fd:8b:02 \
		indev ens4f0 \
		matching_flags 0x1/0x1 \
	action drop

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2016-12-29 10:42:08 -08:00
Stephen Hemminger d34adf67b5 Merge branch 'master' into net-next 2016-12-29 10:31:44 -08:00
Baruch Siach d421bb4efe tc: add missing limits.h header
This fixes under musl build issues like:

f_matchall.c: In function ‘matchall_parse_opt’:
f_matchall.c:48:12: error: ‘LONG_MIN’ undeclared (first use in this function)
   if (h == LONG_MIN || h == LONG_MAX) {
            ^
f_matchall.c:48:12: note: each undeclared identifier is reported only once for each function it appears in
f_matchall.c:48:29: error: ‘LONG_MAX’ undeclared (first use in this function)
   if (h == LONG_MIN || h == LONG_MAX) {
                             ^

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
2016-12-29 10:24:35 -08:00
Hadar Hen Zion f6d3126ef9 tc/m_tunnel_key: Add to the usage encapsulation dest UDP port
tunnel key set parameters includes also dest UDP port, add it to the
usage.

Fixes: 449c709c38 ("tc/m_tunnel_key: Add dest UDP port to tunnel key action")
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reported-by: Simon Horman <simon.horman@netronome.com>
2016-12-22 11:02:00 -08:00
Hadar Hen Zion bf73c650ac tc/cls_flower: Add to the usage encapsulation dest UDP port
Encapsulation dest UDP port is part of the classifier matching
parameters, add it to the usage.

Fixes: 41aa17ff46 ("tc/cls_flower: Add dest UDP port to tunnel params")
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reported-by: Simon Horman <simon.horman@netronome.com>
2016-12-22 11:02:00 -08:00
Simon Horman c2078f8dc4 tc: flower: Allow *_mac options to accept a mask
* The argument to src_mac and dst_mac may now take an optional mask
  to limit the scope of matching.
* This address is is documented as a LLADDR in keeping with ip-link(8).
* The formats accepted match those already output when dumping flower
  filters from the kernel.

Example of use of LLADDR with and without a mask:

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:01:00:00:00/ff:ff:00:00:00:01 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:00:00:00:00/23 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:00:00:00:00 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-21 16:07:53 -08:00
Simon Horman b2a1f740aa tc: flower: document that *_ip parameters take a PREFIX as an argument.
* The argument to src_ip, dst_ip, enc_src_ip and enc_dst_ip take an
  optional prefix length which is used to provide a mask to limit the scope
  of matching.
* This is documented as a PREFIX in keeping with ip-route(8).

Example of uses of IPv4 and IPv6 prefixes

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 dst_ip 192.168.1.1 action drop
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 src_ip 10.0.0.0/8 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 src_ip 2001:DB8:1::/48 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 dst_ip 2001:DB8::1 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-21 16:07:41 -08:00
Stephen Hemminger 8578bb731d Revert "tc: flower: Allow *_mac options to accept a mask"
This reverts commit 0390185078.
2016-12-21 16:06:49 -08:00
Stephen Hemminger 10da552800 Revert "tc: flower: document that *_ip parameters take a PREFIX as an argument."
This reverts commit a8a1dccd2a.
2016-12-21 16:06:35 -08:00
Simon Horman 0390185078 tc: flower: Allow *_mac options to accept a mask
* The argument to src_mac and dst_mac may now take an optional mask
  to limit the scope of matching.
* This address is is documented as a LLADDR in keeping with ip-link(8).
* The formats accepted match those already output when dumping flower
  filters from the kernel.

Example of use of LLADDR with and without a mask:

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:01:00:00:00/ff:ff:00:00:00:01 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:00:00:00:00/23 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
	src_mac 52:54:00:00:00:00 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-21 15:56:39 -08:00
Simon Horman a8a1dccd2a tc: flower: document that *_ip parameters take a PREFIX as an argument.
* The argument to src_ip, dst_ip, enc_src_ip and enc_dst_ip take an
  optional prefix length which is used to provide a mask to limit the scope
  of matching.
* This is documented as a PREFIX in keeping with ip-route(8).

Example of uses of IPv4 and IPv6 prefixes

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 dst_ip 192.168.1.1 action drop
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 src_ip 10.0.0.0/8 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 src_ip 2001:DB8:1::/48 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 dst_ip 2001:DB8::1 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-21 15:56:39 -08:00
Roman Mashak 530753184a tc: pass correct conversion specifier to print 'unsigned int' action index.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-12-14 19:00:36 -08:00
Hadar Hen Zion 449c709c38 tc/m_tunnel_key: Add dest UDP port to tunnel key action
Enhance tunnel key action parameters by adding destination UDP port.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2016-12-13 10:15:11 -08:00
Hadar Hen Zion 41aa17ff46 tc/cls_flower: Add dest UDP port to tunnel params
Enhance IP tunnel parameters by adding destination UDP port.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
2016-12-13 10:15:11 -08:00
Simon Horman eb3b5696f1 tc: flower: support matching on ICMP type and code
Support matching on ICMP type and code.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol ip parent ffff: flower \
	indev eth0 ip_proto icmp type 8 code 0 action drop

tc filter add dev eth0 protocol ipv6 parent ffff: flower \
	indev eth0 ip_proto icmpv6 type 128 code 0 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-09 12:46:34 -08:00
Simon Horman 6910d65661 tc: flower: introduce enum flower_endpoint
Introduce enum flower_endpoint and use it instead of a bool
as the type for paramatising source and destination.

This is intended to improve read-ability and provide some type
checking of endpoint parameters.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-09 12:45:59 -08:00
Simon Horman 6bd5b80cdc tc: flower: make use of flower_port_attr_type() safe and silent
Make use of flower_port_attr_type() safe:
* flower_port_attr_type() may return a valid index into tb[] or -1.
  Only access tb[] in the case of the former.
* Do not access null entries in tb[]

Also make usage silent - it is valid for ip_proto to be invalid,
for example if it is not specified as part of the filter.

Fixes: a1fb0d4842 ("tc: flower: Support matching on SCTP ports")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-05 10:13:26 -08:00
Simon Horman 61dff9ac10 tc: flower: correct name of ip_proto parameter to flower_parse_port()
This corrects a typo.

Fixes: a1fb0d4842 ("tc: flower: Support matching on SCTP ports")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-05 10:13:26 -08:00
Simon Horman 6ad7e60c1f tc: flower: document SCTP ip_proto
Add SCTP ip_proto to help text and man page.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-05 10:13:26 -08:00
Amir Vadai d57639a475 tc/act_tunnel: Introduce ip tunnel action
This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from a such a device.

The 'unset' action is optional. It is used to explicitly unset the
metadata created by the tunnel device during decap. If not used, the
metadata will be released automatically by the kernel.
The 'set' operation, will set the metadata with the specified values for
the encap.

For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before
redirecting, a metadata for the vxlan tunnel is created using the
tunnel_key action and it's arguments:

$ tc filter add dev net0 protocol ip parent ffff: \
    flower \
      ip_proto 1 \
      dst_ip 11.11.11.2 \
    action tunnel_key set \
      src_ip 11.11.0.1 \
      dst_ip 11.11.0.2 \
      id 11 \
    action mirred egress redirect dev vxlan0

Signed-off-by: Amir Vadai <amir@vadai.me>
2016-12-02 14:12:09 -08:00
Amir Vadai bb9b63b18e tc/cls_flower: Classify packet in ip tunnels
Introduce classifying by metadata extracted by the tunnel device.
Outer header fields - source/dest ip and tunnel id, are extracted from
the metadata when classifying.

For example, the following will add a filter on the ingress Qdisc of shared
vxlan device named 'vxlan0'. To forward packets with outer src ip
11.11.0.2, dst ip 11.11.0.1 and tunnel id 11. The packets will be
forwarded to tap device 'vnet0':

$ tc filter add dev vxlan0 protocol ip parent ffff: \
    flower \
      enc_src_ip 11.11.0.2 \
      enc_dst_ip 11.11.0.1 \
      enc_key_id 11 \
      dst_ip 11.11.11.1 \
    action mirred egress redirect dev vnet0

Signed-off-by: Amir Vadai <amir@vadai.me>
2016-12-02 14:12:09 -08:00
Amir Vadai aab0f61043 libnetlink: Introduce rta_getattr_be*()
Add the utility functions rta_getattr_be16() and rta_getattr_be32(), and
change existing code to use it.

Signed-off-by: Amir Vadai <amir@vadai.me>
2016-12-02 14:12:09 -08:00
Stephen Hemminger 328374dcfe Merge branch 'master' into net-next 2016-12-01 10:29:12 -08:00
Roman Mashak 98df0c81da tc: distinguish Add/Replace filter operations
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-11-29 13:26:10 -08:00
Daniel Borkmann e42256699c bpf: make tc's bpf loader generic and move into lib
This work moves the bpf loader into the iproute2 library and reworks
the tc specific parts into generic code. It's useful as we can then
more easily support new program types by just having the same ELF
loader backend. Joint work with Thomas Graf. I hacked a rough start
of a test suite to make sure nothing breaks [1] and looks all good.

  [1] https://github.com/borkmann/clsact/blob/master/test_bpf.sh

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
2016-11-29 12:35:32 -08:00
Stephen Hemminger 512caeb273 tc: flower checkpatch cleanups
break long lines and minor whitespace changes.
2016-11-29 11:48:52 -08:00
Simon Horman a1fb0d4842 tc: flower: Support matching on SCTP ports
Support matching on SCTP ports in the same way that matching
on TCP and UDP ports is already supported.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol ip parent ffff: \
        flower indev eth0 ip_proto sctp dst_port 80 \
        action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-11-29 11:44:46 -08:00
Stephen Hemminger b932e6f372 tc: cleanup style of qdisc code
Get rid of lingering mismatches with kernel style.
2016-11-29 11:41:58 -08:00
Roman Mashak d42e1444f2 tc: print raw qdisc handle.
This is v2 patch with fixed code indentation.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-11-29 11:41:58 -08:00
Roman Mashak 4b5451c4cd tc: improved usage help for fw classifier.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-11-29 11:41:58 -08:00
Paul Blakey d9c3995ab7 tc: flower: Fix usage message
Remove left over usage from removal of eth_type argument.

Fixes: 488b41d020 ('tc: flower no need to specify the ethertype')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
2016-11-12 10:19:06 +03:00
Shmulik Ladkani 5eca0a3701 tc: m_mirred: Add support for ingress redirect/mirror
So far, only the 'egress' direction was implemented.

Allow specifying 'ingress' as the direction packet appears on the target
interface.

For example, this takes incoming 802.1q frames on veth0 and redirects
them for input on dummy0:

 # tc filter add dev veth0 parent ffff: pref 1 protocol 802.1q basic \
     action mirred ingress redirect dev dummy0

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
2016-10-26 11:20:47 -07:00
Daniel Borkmann 4710e46ec3 tc, ipt: don't enforce iproute2 dependency on iptables-devel
Since 5cd1adba79 ("Update to current iptables headers") compilation
of iproute2 broke for systems without iptables-devel package [1].
Reason is that even though we fall back to build m_ipt.c, the include
depends on a xtables-version.h header, which only ships with
iptables-devel. Machines not having this package fail compilation with:

    [...]
    CC       m_ipt.o
In file included from ../include/iptables.h:5:0,
                 from m_ipt.c:17:
../include/xtables.h:34:29: fatal error: xtables-version.h: No such file or directory
compilation terminated.
../Config:31: recipe for target 'm_ipt.o' failed
make[1]: *** [m_ipt.o] Error 1

The configure script only barks that package xtables was not found in
the pkg-config search path. The generated Config then only contains f.e.
TC_CONFIG_IPSET. In tc's Makefile we thus fall back to adding m_ipt.o
to TCMODULES. m_ipt.c then includes the local include/iptables.h header
copy, which includes the include/xtables.h copy. Latter then includes
xtables-version.h, which only ships with iptables-devel.

One way to resolve this is to skip this whole mess when pkg-config has
no xtables config available. I've carried something along these lines
locally for a while now, but it's just too annyoing. :/ Build works fine
now also when xtables.pc is not available.

  [1] http://www.spinics.net/lists/netdev/msg366162.html

Fixes: 5cd1adba79 ("Update to current iptables headers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2016-10-26 10:58:22 -07:00
Jakub Kicinski 87e46a5198 tc: cls_bpf: handle skip_sw and skip_hw flags
Add support for controling hardware offload using (now standard)
skip_sw and skip_hw flags in cls_bpf.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
2016-10-17 05:27:59 -07:00
Stephen Hemminger ec2e005fe5 tc_filter: style cleanup
Break long lines and whtespace changes.
2016-10-12 15:21:13 -07:00
Jamal Hadi Salim 120f556d15 tc filters: add support to get individual filters by handle
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol ip \
u32 match u32 0 0 flowid 1:1 \
action ok
sudo $TC filter add dev $ETH parent ffff: prio 1 protocol ip \
u32 match ip protocol 1 0xff flowid 1:10 \
action ok

now dump to see all rules..
$TC -s filter ls dev $ETH parent ffff: protocol ip
 ....
filter pref 1 u32
filter pref 1 u32 fh 801: ht divisor 1
filter pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10  (rule hit 0 success 0)
  match 00010000/00ff0000 at 8 (success 0 )
        action order 1: gact action drop
         random type none pass val 0
         index 6 ref 1 bind 1 installed 4 sec used 4 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 336 success 336)
  match 00000000/00000000 at 0 (success 336 )
        action order 1: gact action pass
         random type none pass val 0
         index 5 ref 1 bind 1 installed 38 sec used 4 sec
        Action statistics:
        Sent 24864 bytes 336 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
 ....

..get filter 801::800
$TC -s filter get dev $ETH parent ffff: protocol ip \
handle 801:0:800 prio 2  u32

 ....
filter parent ffff: protocol ip pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10  (rule hit 260 success 130)
  match 00010000/00ff0000 at 8 (success 130 )
        action order 1: gact action drop
         random type none pass val 0
         index 6 ref 1 bind 1 installed 348 sec used 0 sec
        Action statistics:
        Sent 11440 bytes 130 pkt (dropped 130, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
 ....

..get other one
$TC -s filter get dev $ETH parent ffff: protocol ip \
handle 800:0:800 prio 2  u32

....
filter parent ffff: protocol ip pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 514 success 514)
  match 00000000/00000000 at 0 (success 514 )
        action order 1: gact action pass
         random type none pass val 0
         index 5 ref 1 bind 1 installed 506 sec used 4 sec
        Action statistics:
        Sent 35544 bytes 514 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
....

..try something that doesnt exist
$TC -s filter get dev $ETH parent ffff: protocol ip  handle 800:0:803 prio 2  u32

.....
RTNETLINK answers: No such file or directory
We have an error talking to the kernel
.....

Note, added NLM_F_ECHO is for backward compatibility. old kernels never
before Eric's patch will not respond without it and newer kernels (after Erics patch)
will ignore it.
In old kernels there is a side effect:
In addition to a response to the GET you will receive an event (if you do tc mon).
But this is still better than what it was before (not working at all).

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:14:47 -07:00
Stephen Hemminger 557b705445 tc: skbmod style cleanup
break long lines
2016-10-12 15:12:51 -07:00
Jamal Hadi Salim da65128998 actions: add skbmod action
This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debugability).
Compare this:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
    munge offset -13 u8 set 0x15 \
    munge offset -12 u8 set 0x15 \
    munge offset -11 u8 set 0x15 \
    munge offset -10 u16 set 0x1515 \
    pipe

to:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15

Or worse, try to debug a policy with destination mac, source mac and
etherype. Then make that a hundred rules and you'll get my point.

The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a re-write
so that it doesn't get dropped or confuse an interconnecting (learning) switch
or dropped by a target machine (which looks at the dst mac).

In the future common use cases on pedit can be migrated to this action
(as an example different fields in ip v4/6, transports like tcp/udp/sctp
etc). For this first cut, this allows modifying basic ethernet header.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Craig Dillabaugh 883c6708e4 action gact: list pipe as a valid action
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Jamal Hadi Salim 8da6ff35cd actions ife: Introduce encoding and decoding of tcindex metadata
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Roman Mashak 1b600f4b54 ife: improve help text
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Roman Mashak 57ee4430f9 ife: print prio, mark and hash as unsigned
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Roman Mashak 9a56cca3f3 ife action: allow specifying index in hex
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00