Commit Graph

1115 Commits

Author SHA1 Message Date
Vinicius Costa Gomes ee000bf217 taprio: Add support for setting flags
This allows a new parameter, flags, to be passed to taprio. Currently, it
only supports enabling the txtime-assist mode. But, we plan to add
different modes for taprio (e.g. hardware offloading) and this parameter
will be useful in enabling those modes.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-18 15:46:31 -07:00
Vedang Patel d9114263d0 etf: Add skip_sock_check
ETF Qdisc currently checks for a socket with the SO_TXTIME socket option. If
either is not present, the packet is dropped. In future commits, we
want other Qdiscs to add packets with a launchtime to the ETF Qdisc. Also,
there are some packets (e.g. ICMP packets) which may not have a socket
associated with them. So, add an option to skip this check.

Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-18 15:44:21 -07:00
Paul Blakey 2fffb1c030 tc: flower: Add matching on conntrack info
Matches on conntrack state, zone, mark, and label.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-18 15:41:30 -07:00
Paul Blakey c8a494314c tc: Introduce tc ct action
New tc action to send packets to conntrack module, commit
them, and set a zone, labels, mark, and nat on the connection.

It can also clear the packet's conntrack state by using clear.

Usage:
   ct clear
   ct commit [force] [zone] [mark] [label] [nat]
   ct [nat] [zone]

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-18 15:41:02 -07:00
Paul Blakey 18aa9f5583 tc: add NLA_F_NESTED flag to all actions options nested block
Strict netlink validation now requires this flag on all nested
attributes, add it for action options.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-18 15:38:09 -07:00
Andrea Claudi 6bc13e4a20 tc: util: constrain percentage in 0-100 interval
parse_percent() currently accepts negative percentages and values
above 100%. However, this does not seem to make sense,
as the function is used for probabilities or bandwidth rates.

Moreover, using negative values leads to erroneous results
(using Bernoulli loss model as example):

$ ip link add test type dummy
$ ip link set test up
$ tc qdisc add dev test root netem loss gemodel -10% limit 10
$ tc qdisc show dev test
qdisc netem 800c: root refcnt 2 limit 10 loss gemodel p 90% r 10% 1-h 100% 1-k 0%

Using values above 100% we have instead:

$ ip link add test type dummy
$ ip link set test up
$ tc qdisc add dev test root netem loss gemodel 140% limit 10
$ tc qdisc show dev test
qdisc netem 800f: root refcnt 2 limit 10 loss gemodel p 40% r 60% 1-h 100% 1-k 0%

This commit adds a check to parse_percent() to ensure
percentage values stay between 0.0 and 1.0.
The parse_percent_rate() function, which already employs a similar
check, is adjusted accordingly.

With this check in place, we have:

$ ip link add test type dummy
$ ip link set test up
$ tc qdisc add dev test root netem loss gemodel -10% limit 10
Illegal "loss gemodel p"

Fixes: 927e3cfb52 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-07-15 13:45:59 -07:00
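
The range check described in commit 6bc13e4a20 above can be sketched as follows. This is an illustrative Python model, not the C code in tc; the return convention (None on error) is an assumption made for the example.

```python
def parse_percent(text):
    """Parse a string such as "10%" into a fraction in [0.0, 1.0].

    Returns None for malformed or out-of-range input. The None
    convention is hypothetical and chosen only for this sketch.
    """
    if not text.endswith("%"):
        return None
    try:
        value = float(text[:-1])
    except ValueError:
        return None
    fraction = value / 100.0
    # The check the commit describes: reject anything outside 0%..100%.
    if not 0.0 <= fraction <= 1.0:
        return None
    return fraction
```

With this check, both "-10%" and "140%" are rejected instead of being turned into a misleading probability.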
Stephen Hemminger d5ddb441a5 tc: print all error messages to stderr
Many tc modules were printing error messages to stdout.
This is problematic if using JSON or other output formats.
Change all these places to use fprintf(stderr, ...) instead.

Also, remove unnecessary initializations and else branches
that follow an error return.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-07-11 15:35:07 -07:00
David Ahern 1f250b6c53 Merge branch 'master' into next
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-10 14:41:13 -07:00
John Hurley fb57b0920f tc: add mpls actions
Create a new action type for TC that allows the pushing, popping, and
modifying of MPLS headers.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-10 14:06:32 -07:00
Roman Mashak 82f3df2028 tc: added mask parameter in skbedit action
Add the missing 32-bit mask attribute to iproute2/tc; the kernel side
has long supported it.

v2: print value in hex with print_hex() as suggested by Stephen Hemminger.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-07-09 17:31:16 -07:00
David Ahern 830ac9abe6 Merge branch 'master' into next
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-07-09 14:26:44 -07:00
Andrea Claudi 90f0b587d8 tc: netem: fix r parameter in Bernoulli loss model
As the man page for tc netem states:

    To use the Bernoulli model, the only needed parameter is p while the
    others will be set to the default values r=1-p, 1-h=1 and 1-k=0.

However, the r parameter is erroneously set to 1 instead of 1-p.
Fix this using the same approach as the 4-state loss model.

Fixes: 3c7950af59 ("netem: add support for 4 state and GE loss model")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-07-08 08:17:22 -07:00
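
The defaults quoted from the man page in commit 90f0b587d8 above can be written out as a small sketch. The function name and dictionary keys are illustrative assumptions, not tc identifiers.

```python
def gemodel_defaults(p, r=None, one_minus_h=1.0, one_minus_k=0.0):
    """Fill in loss-model parameters as the quoted man page text states.

    Only p is required for the Bernoulli model; the other values fall
    back to r = 1 - p, 1-h = 1 and 1-k = 0.
    """
    if r is None:
        r = 1.0 - p  # the default this commit restores (was wrongly 1)
    return {"p": p, "r": r, "1-h": one_minus_h, "1-k": one_minus_k}
```

For example, p = 0.25 should yield r = 0.75 rather than r = 1.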
Andrea Claudi 1e5746d5e1 utils: move parse_percent() to tc_util
parse_percent() is used only in tc, so move it there.

This reduces the size of the ip, bridge and genl binaries:

$ bloat-o-meter -t bridge/bridge bridge/bridge.new
add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-109 (-109)
Total: Before=50973, After=50864, chg -0.21%

$ bloat-o-meter -t genl/genl genl/genl.new
add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-109 (-109)
Total: Before=30298, After=30189, chg -0.36%

$ bloat-o-meter ip/ip ip/ip.new
add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-109 (-109)
Total: Before=674164, After=674055, chg -0.02%

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-06-28 16:06:26 -07:00
Jakub Kicinski b3cf1167e7 tc: q_netem: JSON-ify the output
Add JSON output support to q_netem.

The normal output is untouched.

In JSON output always use seconds as the base of time units,
and non-percentage numbers (0.01 instead of 1%). Try to always
report the fields, even if they are zero.
All this should make the output more machine-friendly.

v2: fewer macros

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-06-21 15:51:35 -07:00
Hangbin Liu ca697cee4c ip: add a new parameter -Numeric
Add a new parameter '-Numeric' to show protocol, scope, dsfield, etc.
as numbers instead of converting them to human-readable names.
Do the same on tc and ss.

This patch is based on David Ahern's previous patch.

Suggested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-06-18 08:37:47 -07:00
David Ahern 9a4f0ba478 Merge branch 'master' into next
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-06-10 10:32:07 -07:00
Kevin Darbyshire-Bryant d7f2bccd0f tc: add support for action act_ctinfo
ctinfo is a tc action restoring data stored in conntrack marks to
various fields.  At present it has two independent modes of operation,
restoration of DSCP into IPv4/v6 diffserv and restoration of conntrack
marks into packet skb marks.

It understands a number of parameters specific to this action in
addition to the usual action syntax.  Each operating mode is
independent of the other, so all options are optional; however, not
specifying at least one mode is a bit pointless.

Usage: ... ctinfo [dscp mask [statemask]] [cpmark [mask]] [zone ZONE]
		  [CONTROL] [index <INDEX>]

DSCP mode

dscp enables copying of a DSCP stored in the conntrack mark into the
ipv4/v6 diffserv field.  The mask is a 32bit field and specifies where
in the conntrack mark the DSCP value is located.  It must be 6
contiguous bits long. eg. 0xfc000000 would restore the DSCP from the
upper 6 bits of the conntrack mark.

The DSCP copying may be optionally controlled by a statemask.  The
statemask is a 32bit field, usually with a single bit set and must not
overlap the dscp mask.  The DSCP restore operation will only take place
if the corresponding bit/s in conntrack mark ANDed with the statemask
yield a non zero result.

eg. dscp 0xfc000000 0x01000000 would retrieve the DSCP from the top 6
bits, whilst using bit 25 as a flag to do so.  Bit 26 is unused in this
example.

CPMARK mode

cpmark enables copying of the conntrack mark to the packet skb mark.  In
this mode it is completely equivalent to the existing act_connmark
action.  Additional functionality is provided by the optional mask
parameter, whereby the stored conntrack mark is logically ANDed with the
cpmark mask before being stored into skb mark.  This allows shared usage
of the conntrack mark between applications.

eg. cpmark 0x00ffffff would restore only the lower 24 bits of the
conntrack mark, thus may be useful in the event that the upper 8 bits
are used by the DSCP function.

Usage: ... ctinfo [dscp mask [statemask]] [cpmark [mask]] [zone ZONE]
		  [CONTROL] [index <INDEX>]
where :
	dscp MASK is the bitmask to restore DSCP
	     STATEMASK is the bitmask to determine conditional restoring
	cpmark MASK mask applied to restored packet mark
	ZONE is the conntrack zone
	CONTROL := reclassify | pipe | drop | continue | ok |
		   goto chain <CHAIN_INDEX>

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-06-10 10:24:38 -07:00
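
The mask arithmetic described for act_ctinfo in commit d7f2bccd0f above can be illustrated with a short sketch. It models only the bit operations stated in the message; the function names and the None return for "no restore" are assumptions for illustration, and the real work happens in the kernel action.

```python
def restore_dscp(ct_mark, dscp_mask, state_mask=0):
    """Return the DSCP stored in ct_mark, or None if the statemask blocks it."""
    if dscp_mask == 0:
        raise ValueError("dscp mask must not be zero")
    # Position of the lowest set bit of the mask.
    shift = (dscp_mask & -dscp_mask).bit_length() - 1
    if (dscp_mask >> shift) != 0x3F:
        raise ValueError("dscp mask must be 6 contiguous bits")
    if dscp_mask & state_mask:
        raise ValueError("statemask must not overlap the dscp mask")
    # Conditional restore: only when mark AND statemask is non-zero.
    if state_mask and (ct_mark & state_mask) == 0:
        return None
    return (ct_mark & dscp_mask) >> shift


def restore_mark(ct_mark, cpmark_mask=0xFFFFFFFF):
    """Return the skb mark: the conntrack mark ANDed with the cpmark mask."""
    return ct_mark & cpmark_mask
```

With the masks from the examples (dscp 0xfc000000 0x01000000, cpmark 0x00ffffff), a conntrack mark of 0xb9000123 carries DSCP 46 in its top six bits, has the flag bit set, and restores 0x123 into the skb mark.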
Davide Caratti 0ee4d17954 tc: simple: don't hardcode the control action
The following TDC test case:

 b776 - Replace simple action with invalid goto chain control

checks if the kernel correctly validates the 'goto chain' control action,
when it is specified in 'act_simple' rules. The test systematically fails
because the control action is hardcoded in parse_simple(), i.e. it is not
parsed by command line arguments, so its value is constantly TC_ACT_PIPE.
Because of that, the following command:

 # tc action add action simple sdata "test" drop index 7

installs an 'act_simple' rule that never drops packets, and whose 'index'
is the first IDR available, plus an 'act_gact' rule with 'index' equal to
7, that drops packets.

Use parse_action_control_dflt(), like we did on many other TC actions, to
make the control action configurable also with 'act_simple'. The expected
results of test b776 are summarized below:

 iproute2
   v       kernel->| 5.1-rc2 (and previous)  | 5.1-rc3 (and subsequent)
 ------------------+-------------------------+-------------------------
 5.1.0             | FAIL (bad IDR)          | FAIL (bad IDR)
 5.1.0(patched)    | FAIL (no rule/bad sdata)| PASS

Changes since v1:
 - reword commit message, thanks Stephen Hemminger

Fixes: 087f46ee4e ("tc: introduce simple action")
CC: Andrea Claudi <aclaudi@redhat.com>
CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-06-06 14:43:08 -07:00
Roman Mashak fa49588973 tc: Fix binding of gact action by index.
The following operation fails:
% sudo tc actions add action pipe index 1
% sudo tc filter add dev lo parent ffff: \
       protocol ip pref 10 u32 match ip src 127.0.0.2 \
       flowid 1:10 action gact index 1

Bad action type index
Usage: ... gact <ACTION> [RAND] [INDEX]
Where:  ACTION := reclassify | drop | continue | pass | pipe |
                  goto chain <CHAIN_INDEX> | jump <JUMP_COUNT>
        RAND := random <RANDTYPE> <ACTION> <VAL>
        RANDTYPE := netrand | determ
        VAL : = value not exceeding 10000
        JUMP_COUNT := Absolute jump from start of action list
        INDEX := index value used

However, passing a control action of gact rule during filter binding works:

% sudo tc filter add dev lo parent ffff: \
       protocol ip pref 10 u32 match ip src 127.0.0.2 \
       flowid 1:10 action gact pipe index 1

Binding by reference, i.e. by index, has to consistently work with
any tc action.

Since tc is sensitive to the order of keywords passed on the command line,
we can teach gact to skip parsing arguments as soon as it sees 'gact'
followed by 'index' keyword.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-06-06 14:41:31 -07:00
Lukasz Czapnik 767b6fd620 tc: flower: fix port value truncation
sscanf truncates read port values silently without any error. As the
sscanf man page says:
(...) sscanf() conform to C89 and C99 and POSIX.1-2001. These standards
do not specify the ERANGE error.

Replace sscanf with safer get_be16 that returns error when value is out
of range.

Example:
tc filter add dev eth0 protocol ip parent ffff: prio 1 flower ip_proto
tcp dst_port 70000 hw_tc 1

This would result in a filter for port 4464 without any warning.

Fixes: 8930840e67 ("tc: flower: Classify packets based port ranges")
Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-05-28 12:27:01 -07:00
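
The value 4464 in commit 767b6fd620 above is consistent with keeping only the low 16 bits of 70000. The sketch below contrasts that silent truncation with a range-checked parse; both function names are illustrative and are not the tc helpers.

```python
def truncate_to_u16(value):
    """Model a silent narrowing to 16 bits, as in the faulty behaviour."""
    return value & 0xFFFF


def parse_port_checked(text):
    """Parse a port and fail loudly when it does not fit in 16 bits."""
    port = int(text, 10)
    if not 0 <= port <= 0xFFFF:
        raise ValueError(f"port out of range: {text}")
    return port
```

Here truncate_to_u16(70000) gives 4464, while parse_port_checked("70000") raises an error.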
Paolo Abeni 6eccf7ecdb m_mirred: don't bail if the control action is missing
The mirred act admits an optional control action, defaulting
to TC_ACT_PIPE. The parsing code currently emits an error message
if the control action is not provided on the command line, even
if the command itself completes with no error.

This change shuts down the error message, using the appropriate
parsing helper.

Fixes: e67aba5595 ("tc: actions: add helpers to parse and print control actions")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-05-22 11:51:31 -07:00
Matteo Croce 8589eb4efd treewide: refactor help messages
Every tool in the iproute2 package has one or more functions to show
a help message to the user. Some of these functions print the help
line by line with a series of printf calls, e.g. ip/xfrm_state.c does
60 fprintf calls.
If we group all the calls into a single one and just concatenate strings,
we save a lot of libc calls and thus object size. The size difference
of the compiled binaries calculated with bloat-o-meter is:

        ip/ip:
        add/remove: 0/0 grow/shrink: 5/15 up/down: 103/-4796 (-4693)
        Total: Before=672591, After=667898, chg -0.70%
        ip/rtmon:
        add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-54 (-54)
        Total: Before=48879, After=48825, chg -0.11%
        tc/tc:
        add/remove: 0/2 grow/shrink: 31/10 up/down: 882/-6133 (-5251)
        Total: Before=351912, After=346661, chg -1.49%
        bridge/bridge:
        add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-459 (-459)
        Total: Before=70502, After=70043, chg -0.65%
        misc/lnstat:
        add/remove: 0/1 grow/shrink: 1/0 up/down: 48/-486 (-438)
        Total: Before=9960, After=9522, chg -4.40%
        tipc/tipc:
        add/remove: 0/0 grow/shrink: 1/1 up/down: 18/-62 (-44)
        Total: Before=79182, After=79138, chg -0.06%

While at it, indent some strings which were starting at column 0,
and use tabs where possible, to have a consistent style across helps.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-05-20 14:35:07 -07:00
Vinicius Costa Gomes 92f4b6032e taprio: Add support for cycle_time and cycle_time_extension
This allows a cycle-time and a cycle-time-extension to be specified.

Specifying a cycle-time will truncate that cycle, so when that instant
is reached, the cycle will start from its beginning.

A cycle-time-extension may cause the last entry of a cycle, just
before the start of a new schedule (the base-time of the "admin"
schedule), to be extended by at most "cycle-time-extension"
nanoseconds. The idea of this feature, as described by IEEE
802.1Q, is to avoid gate states that are too narrow.

Example:

tc qdisc change dev IFACE parent root handle 100 taprio \
	      sched-entry S 0x1 1000000 \
	      sched-entry S 0x0 2000000 \
	      sched-entry S 0x1 3000000 \
	      sched-entry S 0x0 4000000 \
	      cycle-time-extension 100000 \
	      cycle-time 9000000 \
	      base-time 12345678900000000

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-05-04 09:22:15 -07:00
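
One reading of "specifying a cycle-time will truncate that cycle" in commit 92f4b6032e above is sketched below. The function is a hypothetical model for illustration; the kernel's exact behaviour is not stated in the message.

```python
def effective_intervals(intervals_ns, cycle_time_ns):
    """Cut a list of entry durations so they never run past cycle_time_ns."""
    result = []
    elapsed = 0
    for interval in intervals_ns:
        if elapsed >= cycle_time_ns:
            break  # the cycle restarts before this entry is reached
        run = min(interval, cycle_time_ns - elapsed)
        result.append(run)
        elapsed += run
    return result
```

In the example above the four entries add up to 10000000 ns, so under this reading a cycle-time of 9000000 ns would shorten the last entry from 4000000 ns to 3000000 ns.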
Vinicius Costa Gomes 602fae856d taprio: Add support for changing schedules
This allows for a new schedule to be specified during runtime, without
removing the current one.

For that, the semantics of the 'tc qdisc change' operation in the
context of taprio are as follows: if "change" is called and there is a
running schedule, a new schedule is created, and the base-time (let's
call it X) of this new schedule is used so that at instant X it becomes
the "current" schedule. So, in short, "change" doesn't change the current
schedule; it creates a new one and sets it up so that it becomes the
current one at some point.

In IEEE 802.1Q terms, it means that we have support for the
"Oper" (current and read-only) and "Admin" (future and mutable)
schedules.

Example of creating the first schedule, then adding a new one:

(1)
tc qdisc add dev IFACE parent root handle 100 taprio \
      	      num_tc 1 \
	      map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
	      queues 1@0 \
	      sched-entry S 0x1 1000000 \
	      sched-entry S 0x0 2000000 \
	      sched-entry S 0x1 3000000 \
	      sched-entry S 0x0 4000000 \
	      base-time 100000000 \
	      clockid CLOCK_TAI

(2)
tc qdisc change dev IFACE parent root handle 100 taprio \
	      base-time 7500000000000 \
	      sched-entry S 0x0 5000000 \
	      sched-entry S 0x1 5000000

It was necessary to fix a bug, so the clockid doesn't need to be
specified when changing the schedule.

Most of the changes are related to make it easier to reuse the same
function for printing the "admin" and "oper" schedules.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-05-04 09:22:15 -07:00
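
The "oper"/"admin" semantics described in commit 602fae856d above can be modelled with a toy class. Everything here (class name, tuple layout, the promotion rule at exactly base-time) is an assumption made to illustrate the description, not the qdisc implementation.

```python
class TaprioSchedules:
    """Toy model of the "oper" (current) and "admin" (future) schedules."""

    def __init__(self, base_time, entries):
        # The schedule installed with "add" is treated as the running one.
        self.oper = (base_time, list(entries))
        self.admin = None

    def change(self, base_time, entries):
        # "change" never touches oper; it only stages a future schedule.
        self.admin = (base_time, list(entries))

    def current(self, now):
        # At or after the admin base-time, admin is promoted to oper.
        if self.admin is not None and now >= self.admin[0]:
            self.oper, self.admin = self.admin, None
        return self.oper
```

Using the base-times from examples (1) and (2), the first schedule stays current until instant 7500000000000, after which the staged one takes over.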
Paolo Abeni c865c52365 tc: add support for plug qdisc
sch_plug can be used to perform functional qdisc unit tests
controlling explicitly the queuing behaviour from user-space.

Userspace support for plug has been lacking since its introduction in
2012. This change introduces basic support, to control the tc status.

v1 -> v2:
 - use the SPDX identifier

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-05-04 09:22:14 -07:00
Stephen Hemminger 38983334f6 tc/ematch: fix deprecated yacc warning
Newer versions of Bison deprecated some directives.

    YACC     emp_ematch.yacc.c
emp_ematch.y:11.1-14: warning: deprecated directive, use ‘%define parse.error verbose’ [-Wdeprecated]
 %error-verbose
 ^~~~~~~~~~~~~~
emp_ematch.y:12.1-22: warning: deprecated directive, use ‘%define api.prefix {ematch_}’ [-Wdeprecated]
 %name-prefix "ematch_"

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-04-24 15:10:22 -07:00
Toke Høiland-Jørgensen d5d27f27d8 q_cake: Add support for setting the fwmark option
This adds support for the newly added fwmark option to CAKE, which allows
overriding the tin selection from the per-packet firewall marks. The fwmark
field is a bitmask that is applied to the fwmark to select the tin.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-04-05 15:01:31 -07:00
Leslie Monis 492ec9558b tc: pie: change maximum integer value of tc_pie_xstats->prob
tc_pie_xstats->prob has a maximum value of (2^64 - 1).

Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-03-29 14:26:00 -07:00
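
For commit 492ec9558b above, a minimal sketch of what the new maximum means, assuming the raw prob counter is read as a fraction of that maximum. The helper name is hypothetical.

```python
MAX_PROB = 2**64 - 1  # maximum stated in the commit message


def prob_to_fraction(raw_prob):
    """Scale a raw prob value into the range 0.0..1.0 (assumed reading)."""
    if not 0 <= raw_prob <= MAX_PROB:
        raise ValueError("prob outside the 64-bit unsigned range")
    return raw_prob / MAX_PROB
```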
Stephen Hemminger 50cf634899 Merge branch 'master' of ../iproute2-next
2019-03-19 10:32:45 -07:00
Kevin 'ldir' Darbyshire-Bryant ef1e02e6ac tc: m_connmark: fix action error messages
action m_connmark returns error messages identifying itself as the
'simple' action instead of 'connmark' action. e.g.

tc filter add dev eth0 protocol all u32 match u32 0 0 flowid 1:1 \
	action connmark index wrong
simple: Illegal "index"
bad action parsing
parse_action: bad value (3:connmark)!
Illegal "action"

In what is most likely a copy/paste error from the simple action example
code, fix connmark error messages to identify themselves as coming from
connmark.

tc filter add dev eth0 protocol all u32 match u32 0 0 flowid 1:1 \
	action connmark index wrong
connmark: Illegal "index"
bad action parsing
parse_action: bad value (3:connmark)!
Illegal "action"

While we're here also fixup the 'Illegal "Zone"' error code to say
'Illegal "zone"' instead of 'Illegal "index"'

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-03-19 09:49:07 -07:00
David Ahern be029b3a58 Merge branch 'iproute2-master' into next
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-03-05 07:55:05 -08:00
Dmytro Linkin 2f103545a5 tc/pedit: Fix wrong pedit ipv6 structure id
A tc pedit action with more than two ip6 munge operations in a row
causes an infinite loop.

Example:

$ tc filter add dev eth0 protocol ipv6 parent ffff: \
flower ip_proto sctp \
    action pedit ex \
        munge ip6 hoplimit set 0x1 \
        munge ip6 src set 2001:0db8:0:f101::1 \
        munge that cause infinite loop

The example command never returns, instead of failing with a parse error
as expected. The pedit ipv6 structure has the wrong id, which leads to the
creation of a linked list with one node in tc/m_pedit.c:get_pedit_kind(),
referring to itself. This node is created if the command has two ip6 munge
operations in a row, and any third ip6 munge will cause an infinite loop.
Changing this id from "ipv6" to "ip6" solves the problem.

Fixes: f3e1b2448a ("pedit: Introduce ipv6 support")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-03-01 11:05:00 -08:00
David Ahern 9f78e995a8 Merge branch 'iproute2-master' into next
Conflicts:
	misc/ss.c

Signed-off-by: David Ahern <dsahern@gmail.com>
2019-02-22 18:50:39 -08:00
Marcos Antonio Moraes 9e46c5c206 tc: use bits not mbits/sec in rate percent
As /sys/class/net/<iface>/speed indicates a value in Mbits/sec, the
conversion is necessary to create the correct limits.

This guarantees the same result for the following commands on a
1000Mbit/sec device:

tc class add ... htb rate 500Mbit
tc class add ... htb rate 50%

Fixes: 927e3cfb52 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Marcos Antonio Moraes <marcos.antonio@digirati.com.br>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-02-08 09:59:45 -08:00
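
The unit conversion behind commit 9e46c5c206 above can be sketched as follows. The function name is an illustrative assumption; the only facts taken from the message are that the link speed is reported in Mbit/sec and that 50% of a 1000Mbit/sec device should equal 500Mbit.

```python
def percent_rate_to_bps(percent, link_speed_mbit):
    """Convert a percentage of the link speed into bits per second."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be within 0..100")
    # Scale Mbit/sec to bit/sec before applying the percentage.
    return round(link_speed_mbit * 1_000_000 * percent / 100)
```

With this conversion, 50% of a 1000Mbit/sec link is 500000000 bit/sec, the same as "rate 500Mbit".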
Stephen Hemminger 817204d0b0 tc: avoid problems with hard coded rate string length
The parse_percent_rate function assumed the buffer was 20 characters.
Better to pass length in case the size ever changes.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-02-06 10:49:47 -08:00
Stephen Hemminger 2d603d55a8 tc: fix memory leak in error path
If value passed to parse_percent was not valid, it would
leak the dynamic allocation from sscanf.

Fixes: 927e3cfb52 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-02-06 10:41:58 -08:00
Davide Caratti e8a3d76919 tc: add 'kind' property to 'csum' action
Unlike other TC actions already supporting JSON printout, 'csum' does not
print the value of TCA_KIND in the 'kind' property: remove 'csum' word
from 'csum' property, and add a separate 'kind' property containing the
action name. The human-readable printout is preserved.

Tested with:
 # ./tdc.py -c csum

Cc: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-02-03 09:10:38 -08:00
Davide Caratti 52d57f6bbd tc: full JSON support for 'bpf' actions
Add full JSON output support in the dump of 'act_bpf'.

Example using eBPF:

 # tc actions flush action bpf
 # tc action add action bpf object bpf/action.o section 'action-ok'
 # tc -j action list action bpf | jq
 [
   {
     "total acts": 1
   },
   {
     "actions": [
       {
         "order": 0,
         "kind": "bpf",
         "bpf_name": "action.o:[action-ok]",
         "prog": {
           "id": 33,
           "tag": "a04f5eef06a7f555",
           "jited": 1
         },
         "control_action": {
           "type": "pipe"
         },
         "index": 1,
         "ref": 1,
         "bind": 0
       }
     ]
   }
 ]

Example using cBPF:

 # tc actions flush action bpf
 # a=$(mktemp)
 # tcpdump -ddd not ether proto 0x888e >$a
 # tc action add action bpf bytecode-file $a index 42
 # rm $a
 # tc -j action list action bpf | jq
 [
   {
     "total acts": 1
   },
   {
     "actions": [
       {
         "order": 0,
         "kind": "bpf",
         "bytecode": {
           "length": 4,
           "insns": [
             {
               "code": 40,
               "jt": 0,
               "jf": 0,
               "k": 12
             },
             {
               "code": 21,
               "jt": 0,
               "jf": 1,
               "k": 34958
             },
             {
               "code": 6,
               "jt": 0,
               "jf": 0,
               "k": 0
             },
             {
               "code": 6,
               "jt": 0,
               "jf": 0,
               "k": 262144
             }
           ]
         },
         "control_action": {
           "type": "pipe"
         },
         "index": 42,
         "ref": 1,
         "bind": 0
       }
     ]
   }
 ]

Tested with:
 # ./tdc.py -c bpf

Cc: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-02-03 09:10:10 -08:00
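
As an illustration of consuming the JSON layout shown in commit 52d57f6bbd above, the sketch below walks the top-level list and pulls a few fields from each action. The sample string is a trimmed copy of the eBPF example; the helper name is hypothetical.

```python
import json

# Trimmed copy of the eBPF example output from the commit message.
SAMPLE = """
[
  {"total acts": 1},
  {"actions": [
    {"order": 0, "kind": "bpf", "bpf_name": "action.o:[action-ok]",
     "control_action": {"type": "pipe"}, "index": 1, "ref": 1, "bind": 0}
  ]}
]
"""


def list_actions(document):
    """Return (kind, index, control action type) for every dumped action."""
    found = []
    for obj in json.loads(document):
        # Objects without an "actions" key (e.g. "total acts") are skipped.
        for action in obj.get("actions", []):
            found.append((action["kind"], action["index"],
                          action["control_action"]["type"]))
    return found
```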
Stephen Hemminger 6f1940da8e tc: replace left side comparison
The kernel (and iproute2) don't use the if (NULL == x) style
and instead prefer if (!x)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-01-28 08:51:03 -08:00
Hans Dedecker 2874714662 f_flower: fix build with musl libc
XATTR_SIZE_MAX requires the usage of linux/limits.h; let's include it

Signed-off-by: Hans Dedecker <dedeckeh@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-01-25 09:20:03 +13:00
David Ahern b45664e064 Merge 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-01-22 08:30:38 -08:00
Adi Nissim dc0332b1e8 tc: m_tunnel_key: Allow key-less tunnels
Change the id parameter of the tunnel_key set action from mandatory to
optional.

Some tunneling protocols (e.g. GRE) specify the id as an optional field.

Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2019-01-22 16:04:07 +13:00
Cong Wang b0ca46a1f8 tc: add hit counter for matchall
Cc: Martin Olsson <martin.olsson+netdev@sentorsecurity.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2019-01-21 08:30:07 -08:00
David Ahern 6065ddfaa7 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-12-19 12:02:17 -08:00
Syrone Wong 6ddb36c3a9 tc: fix xtables incorrect usage of LDFLAGS
The incorrect setting of LDFLAGS causes the error below:

> em_ipt.o: In function `em_ipt_print_epot':
> em_ipt.c:(.text.em_ipt_print_epot+0x2e): undefined reference to
> `xtables_init_all'

em_ipt.c gets involved when TC_CONFIG_XT=y, which requires xtables,
while tc/Makefile doesn't pass flags correctly. It adds '-lxtables'
to LDFLAGS instead of LDLIBS.

Fixes: dd296215 ("tc: add em_ipt ematch for calling xtables matches from tc matching context")

Signed-off-by: Syrone Wong <wong.syrone@gmail.com>
Acked-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-12-13 11:38:43 -08:00
Stephen Hemminger 90c5c969f0 fix print_0xhex on 32 bit
The argument to print_0xhex is converted to unsigned long long,
so the format string given for normal printout has to be some
variant of %llx. Otherwise, bogus values will be printed on
32 bit platforms.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-12-10 14:20:32 -08:00
Amritha Nambiar 8930840e67 tc: flower: Classify packets based port ranges
Added support for filtering based on port ranges.
UAPI changes have been accepted into net-next.

Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower ip_proto tcp dst_port 20-30 skip_hw\
  action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  dst_port 20-30
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 85 sec used 3 sec
        Action statistics:
        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port 100-200\
  skip_hw action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto tcp
  dst_ip 192.168.1.1
  dst_port 100-200
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1 installed 58 sec used 2 sec
        Action statistics:
        Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

v6:
Modified to change json output format as object for sport/dport.

 "dst_port":{
           "start":2000,
           "end":6000
 },
 "src_port":{
           "start":50,
           "end":60
 }

v5:
Simplified some code and used 'sscanf' for parsing. Removed
space in output format.

v4:
Added man updates explaining filtering based on port ranges.
Removed 'range' keyword.

v3:
Modified flower_port_range_attr_type calls.

v2:
Addressed Jiri's comment to sync output format with input

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-12-03 16:02:58 -08:00
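
The port range syntax and the v6 JSON object format from commit 8930840e67 above can be illustrated with a small parser. This is a sketch only; the function name and the requirement that start does not exceed end are assumptions, not a statement of what tc enforces.

```python
import json


def parse_port_range(text):
    """Parse "MIN-MAX" into the {"start": ..., "end": ...} object form."""
    low, sep, high = text.partition("-")
    if not sep:
        raise ValueError("expected a range such as 20-30")
    start, end = int(low), int(high)
    for port in (start, end):
        if not 0 <= port <= 0xFFFF:
            raise ValueError(f"port out of range: {port}")
    if start > end:  # assumed sanity check for this sketch
        raise ValueError("range start is greater than range end")
    return {"start": start, "end": end}


def dst_port_json(text):
    """Render a range the way the v6 note shows it for dst_port."""
    return json.dumps({"dst_port": parse_port_range(text)},
                      separators=(",", ":"))
```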
David Ahern dd7d522a67 Revert "tc: flower: Classify packets based port ranges"
This reverts commit e20e50b0c1.

Inadvertently pushed v3 of this patch.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-12-03 16:01:07 -08:00
David Ahern fb417073a3 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-12-03 15:39:29 -08:00
Eric Dumazet 3adcbf3757 tc: add a missing space between rate estimator and backlog
When a rate estimator is active, "tc -s qd" displays
something like :

rate 12616bit 11ppsbacklog 0b 0p requeues 2

instead of :

rate 12616bit 11pps backlog 0b 0p requeues 2

Fixes: 4fcec7f366 ("tc: jsonify stats2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-12-03 14:34:05 -08:00
Eric Dumazet 55e106c480 tc: fq: support ce_threshold attribute
Kernel commit 48872c11b772 ("net_sched: sch_fq: add dctcp-like marking")
added support for TCA_FQ_CE_THRESHOLD attribute.

This patch adds iproute2 support for it.

It also makes sure fq_print_xstats() can deal with smaller tc_fq_qd_stats
structures given by older kernels.

Usage :

FQATTRS="ce_threshold 4ms"
TXQS=8

for ETH in eth0
do
 tc qd del dev $ETH root 2>/dev/null
 tc qd add dev $ETH root handle 1: mq
 for i in `seq 1 $TXQS`
 do
  tc qd add dev $ETH parent 1:$i fq $FQATTRS
 done
done

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:30:24 -08:00
Jakub Kicinski f7a8749aff tc: gred: allow controlling and dumping per-DP RED flags
The kernel now supports setting the ECN and HARDDROP flags per virtual
queue.  Allow users to tweak the settings, and print them on
dump.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:11:40 -08:00
Jakub Kicinski 2d7c564a1e tc: gred: support controlling RED flags
The kernel GRED qdisc supports ECN marking and the harddrop flag,
but setting and dumping these flags is not possible with iproute2.
Add the support.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:11:36 -08:00
Jakub Kicinski fdaff63c6a tc: gred: use extended stats if available
Use the extended attributes with extra and better stats, when
possible.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:11:19 -08:00
Jakub Kicinski c3e1cd28c1 tc: gred: separate out stats printing
Printing GRED statistics is long and deserves a function of its own.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:11:09 -08:00
Jakub Kicinski 6475e6a580 tc: gred: jsonify GRED output
Make GRED dump JSON-compatible.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:11:04 -08:00
Jakub Kicinski 33021752cd tc: move RED flag printing to helper
A number of qdiscs use the same set of flags to control the shared RED
implementation.  Add a helper for printing those flags.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:10:58 -08:00
Jakub Kicinski c8f201e3d2 tc: gred: remove unclear comment
The comment about providing a proper message seems similar to
the comment in the kernel which says:

    /* hack -- fix at some point with proper message
       This is how we indicate to tc that there is no VQ
       at this DP */

it's unclear what that message would be, and whether it's needed.
Remove the confusing comment.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:08:16 -08:00
David Ahern 0868c8ab07 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-24 07:06:11 -08:00
Amritha Nambiar e20e50b0c1 tc: flower: Classify packets based port ranges
Added support for filtering based on port ranges.
UAPI changes have been accepted into net-next.

Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
  action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  dst_port range 20-30
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 85 sec used 3 sec
        Action statistics:
        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
  skip_hw action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto tcp
  dst_ip 192.168.1.1
  dst_port range 100-200
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1 installed 58 sec used 2 sec
        Action statistics:
        Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

v3:
Modified flower_port_range_attr_type calls.

v2:
Addressed Jiri's comment to sync output format with input

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-11-20 14:34:56 -08:00
Stephen Hemminger 946a135c58 tc/pedit: use structure initialization
The pedit callback structure table should be initialized using
structure initialization to avoid problems when the structure changes.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger 9e96e71594 tc/action: make variables static
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger 42d9eed451 tc/meta: make meta_table static and const
The mapping table is only used by em_meta.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger 9455bec52a tc/util: make local functions static
The tc util parse/print library has functions that are only used locally;
make them static (and remove some dead code).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger 33043dfc9c tc/ematch: make local functions static
The print handling is only used in tc/m_ematch.c

Remove unused function to print_ematch_tree.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger 7527b221d6 tc/pedit: make functions static
The parse and pack functions are only used by the pedit routines.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger a38fadf401 tc/police: make print_police static
The print_police function is only used by m_police.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Stephen Hemminger 7e569d92a9 tc/class: make filter variables static
Only used in this file.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-19 11:42:44 -08:00
Jakub Kicinski 9c5f4251d6 tc: f_u32: allow skip_hw and skip_sw flags to be last
u32 uses NEXT_ARG() incorrectly when parsing skip_hw and skip_sw
flags.  NEXT_ARG() ensures there is another argument on the command
line, and is used in handling <keyword> <value> syntax to move past
<keyword> and ensure there is a <value> to read.

Commit 5e5b3008d1 ("tc: f_u32: Add support for skip_hw and skip_sw
flags") seems to have copy pasted the handling from the previous
command - "police", which needs an extra parameter and is kind of
special due to the use of parse_police() helper.

The combination of NEXT_ARG() and continue worked fine as long as
skip_sw/skip_hw wasn't last, e.g.:

$ tc filter add dev dummy0 ingress prio 101 protocol ipv6 \
    u32 match ip6 priority 0xa0 0xe0 skip_hw action pass

But would fail if it was last:

$ tc filter add dev dummy0 ingress prio 101 protocol ipv6 \
    u32 match ip6 priority 0xa0 0xe0 flowid :1 skip_hw
Command line is not complete. Try option "help"

Remove the NEXT_ARG()s and the continues, and let the argc--; argv++;
at the end of the loop do its job.
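
The failure mode described above can be modelled with a small sketch. This
is a simplified Python stand-in for the C argument loop, not the f_u32
code itself; the function name parse_u32_flags and the exception class are
hypothetical.

```python
class IncompleteCommand(Exception):
    """Stands in for tc's 'Command line is not complete' error."""


def parse_u32_flags(argv, buggy):
    """Collect skip_hw/skip_sw from argv; a simplified model of the loop."""
    flags = []
    i = 0
    while i < len(argv):
        word = argv[i]
        if word in ("skip_hw", "skip_sw"):
            flags.append(word)
            if buggy:
                # Models NEXT_ARG(): step forward and insist on another word.
                i += 1
                if i >= len(argv):
                    raise IncompleteCommand(word)
                continue
        elif word == "flowid":
            # <keyword> <value>: requiring a following word is correct here.
            i += 1
            if i >= len(argv):
                raise IncompleteCommand(word)
        # Models the argc--; argv++ at the end of the loop.
        i += 1
    return flags


print(parse_u32_flags(["skip_hw", "action", "pass"], buggy=True))
print(parse_u32_flags(["flowid", ":1", "skip_hw"], buggy=False))
```

With buggy=True the second argument list raises IncompleteCommand, because
the flag is the last word and nothing follows it.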

Fixes: 5e5b3008d1 ("tc: f_u32: Add support for skip_hw and skip_sw flags")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-09 08:12:29 -08:00
Luca Boccassi 1a03ac6b05 Pass CPPFLAGS to the compiler
When building Debian packages pre-processor flags are passed via
CPPFLAGS, as the convention indicates. Specifically, the hardening
-D_FORTIFY_SOURCE=2 flag is used.
Pass CPPFLAGS to all calls of QUIET_CC together with CFLAGS.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-09 08:07:18 -08:00
Luca Boccassi 6d2fd4a53f Include bsd/string.h only in include/utils.h
This is simpler and cleaner, and avoids having to include the header
from every file where the functions are used. The prototypes of the
internal implementation are in this header, so utils.h will have to be
included anyway for those.

Fixes: 508f3c231e ("Use libbsd for strlcpy if available")

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-05 08:38:32 -08:00
Luca Boccassi 508f3c231e Use libbsd for strlcpy if available
If libc does not provide strlcpy, check for libbsd with pkg-config to
avoid relying on the inline version.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-11-01 12:47:03 -07:00
David Ahern 6e221408e6 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-23 10:55:09 -07:00
Phil Sutter 737b8258b3 tc: htb: Print default value in hex
Value of 'default' is assumed to be hexadecimal when parsing, so
consequently it should be printed in hex as well. This is a regression
introduced when adding JSON output.

As requested, also change JSON output to print the value as hex string.
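
A short sketch of why the decimal output was a regression: a value printed
in decimal and fed back to a parser that assumes hexadecimal changes on
each round trip. The function names and the exact "0x" output format are
assumptions for illustration, not the q_htb code.

```python
def parse_default(text):
    # The parser reads the 'default' value as hexadecimal.
    return int(text, 16)


def print_default_decimal(value):
    # Regressed behaviour: the value is shown in decimal.
    return "%d" % value


def print_default_hex(value):
    # Fixed behaviour: the value is shown in hex (format assumed here).
    return "0x%x" % value


value = parse_default("10")                        # 0x10 == 16
print(parse_default(print_default_decimal(value)))  # 22, not 16
print(parse_default(print_default_hex(value)))      # 16
```

Printing in hex keeps the displayed value usable as input again.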

Fixes: f354fa6aa5 ("tc: jsonify htb qdisc")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-10-23 10:07:10 -07:00
Phil Sutter 6358bbc381 tc: Remove pointless assignments in batch()
All these assignments are later overwritten without reading in between,
so just drop them.

Fixes: 485d0c6001 ("tc: Add batchsize feature for filter and actions")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-10-22 10:05:43 -07:00
David Ahern cd554f2c2f Tree wide: Drop sockaddr_nl arg
No function, filter, or print function uses the sockaddr_nl arg,
so just drop it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2018-10-22 09:43:48 -07:00
Stephen Hemminger f5a398bf17 tc: spelling fixes
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-10-18 13:22:51 -07:00
David Ahern 0d30c1f8d4 Merge branch 'master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-13 19:31:37 -07:00
Jakub Kicinski 650a10e032 tc: jsonify output of q_fifo
Print limits correctly in JSON context.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-10-08 09:22:22 -07:00
Vinicius Costa Gomes 0dd1644935 tc: Add support for configuring the taprio scheduler
This traffic scheduler allows traffic class states (transmission
allowed/not allowed, in the simplest case) to be scheduled, according
to a pre-generated time sequence. This is the basis of the IEEE
802.1Qbv specification.

Example configuration:

tc qdisc replace dev enp3s0 parent root handle 100 taprio \
          num_tc 3 \
	  map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
	  queues 1@0 1@1 2@2 \
	  base-time 1528743495910289987 \
	  sched-entry S 01 300000 \
	  sched-entry S 02 300000 \
	  sched-entry S 04 300000 \
	  clockid CLOCK_TAI

The configuration format is similar to mqprio. The main difference is
the presence of a schedule, built by multiple "sched-entry"
definitions; each entry has the following format:

     sched-entry <CMD> <GATE MASK> <INTERVAL>

The only supported <CMD> is "S", which means "SetGateStates",
following the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK>
is a bitmask where each bit is associated with a traffic class, so
bit 0 (the least significant bit) being "on" means that traffic class
0 is "active" for that schedule entry. <INTERVAL> is a time duration
in nanoseconds that specifies for how long that state defined by <CMD>
and <GATE MASK> should be held before moving to the next entry.
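
To make the bitmask concrete, here is a minimal sketch that lists which
traffic classes a gate mask opens. It assumes the mask is written in
hexadecimal, as in the example configuration; the function name
active_traffic_classes is hypothetical.

```python
def active_traffic_classes(gate_mask_hex, num_tc):
    """Return the traffic classes whose bit is set in a hex gate mask."""
    mask = int(gate_mask_hex, 16)
    return [tc for tc in range(num_tc) if mask & (1 << tc)]


# The three entries from the example configuration above (num_tc 3).
for gate_mask in ("01", "02", "04"):
    print(gate_mask, active_traffic_classes(gate_mask, 3))
```

In the example schedule each entry opens exactly one traffic class for
300000 ns; a mask such as 05 would open classes 0 and 2 together.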

This schedule is circular, that is, after the last entry is executed
it starts from the first one, indefinitely.

The other parameters can be defined as follows:

 - base-time: specifies the instant when the schedule starts; if
   'base-time' is a time in the past, the schedule will start at

 	      base-time + (N * cycle-time)

   where N is the smallest integer so the resulting time is greater
   than "now", and "cycle-time" is the sum of all the intervals of the
   entries in the schedule;

 - clockid: specifies the reference clock to be used;
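
The base-time rule can be checked with a small worked sketch. It follows
the formula in the text only; how the kernel handles boundary cases is not
covered by this message, and the function name schedule_start is
hypothetical. All times are in nanoseconds.

```python
def schedule_start(base_time, intervals, now):
    """First instant base_time + N * cycle_time that lies after now."""
    cycle_time = sum(intervals)
    if base_time > now:
        # A base-time in the future is used as is.
        return base_time
    # Smallest integer N with base_time + N * cycle_time > now.
    n = (now - base_time) // cycle_time + 1
    return base_time + n * cycle_time


# Three 300000 ns entries give a 900000 ns cycle.
intervals = [300000, 300000, 300000]
print(schedule_start(1000, intervals, now=1801005))
```

With base_time 1000 and now 1801005 the result is 2701000, that is N = 3:
the candidate for N = 2 (1801000) is already in the past.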

The parameters should be similar to what the IEEE 802.1Q family of
specifications defines.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-07 10:32:08 -07:00
Vlad Buslov f6b498f957 tc: flower: expose hardware offload count
The flower classifier was recently updated to expose the count of devices
that a filter is offloaded to. Add support to print this counter as 'in_hw_count'.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
2018-10-07 10:14:09 -07:00
Eelco Chaudron 5ac138324e tc_util: Add support for showing TCA_STATS_BASIC_HW statistics
Add support for showing hardware-specific counters to ease
troubleshooting of hardware offload.

$ tc -s filter show dev enp3s0np0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 2.0.0.0
  src_ip 1.0.0.0
  ip_flags nofrag
  in_hw
        action order 1: mirred (Egress Redirect to device eth1) stolen
        index 1 ref 1 bind 1 installed 0 sec used 0 sec
        Action statistics:
        Sent 534884742 bytes 8915697 pkt (dropped 0, overlimits 0 requeues 0)
        Sent software 187542 bytes 4077 pkt
        Sent hardware 534697200 bytes 8911620 pkt
        backlog 0b 0p requeues 0
        cookie 89173e6a44447001becfd486bda17e29

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 14:45:33 -07:00
Pieter Jansen van Vuuren 56155d4df8 tc: f_flower: add geneve option match support to flower
Allow matching on options in Geneve tunnel headers.

The options can be described in the form
CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is
represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.

e.g.
 # ip link add name geneve0 type geneve dstport 0 external
 # tc qdisc add dev geneve0 ingress
 # tc filter add dev geneve0 protocol ip parent ffff: \
     flower \
       enc_src_ip 10.0.99.192 \
       enc_dst_ip 10.0.99.193 \
       enc_key_id 11 \
       geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \
       ip_proto udp \
       action mirred egress redirect dev eth1
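
The option syntax can be illustrated with a short parsing sketch. This is
not the f_flower parser; parse_geneve_opt is a hypothetical helper, and
treating the mask part as optional is an assumption of this example.

```python
def parse_geneve_opt(text):
    """Split CLASS:TYPE:DATA[/CLASS_MASK:TYPE_MASK:DATA_MASK] into fields."""

    def parse_triplet(part):
        class_hex, type_hex, data_hex = part.split(":")
        opt_class = int(class_hex, 16)   # 16-bit hexadecimal value
        opt_type = int(type_hex, 16)     # 8-bit hexadecimal value
        if opt_class > 0xFFFF or opt_type > 0xFF:
            raise ValueError("field too wide in %r" % part)
        # DATA is a variable-length hexadecimal string.
        return opt_class, opt_type, bytes.fromhex(data_hex)

    key_text, _, mask_text = text.partition("/")
    key = parse_triplet(key_text)
    mask = parse_triplet(mask_text) if mask_text else None
    return key, mask


key, mask = parse_geneve_opt(
    "0102:80:1122334421314151/ffff:ff:ffffffffffffffff")
print(hex(key[0]), hex(key[1]), key[2].hex())
```

For the geneve_opts value used above, this yields class 0x102, type 0x80
and eight bytes of data, with an all-ones mask on every field.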

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 14:39:55 -07:00
David Ahern 34212c73b7 Merge branch 'iproute2-master' into iproute2-next
Conflicts:
	ip/iproute_lwtunnel.c

In addition to merge conflict between bd59e5b151 and 94a8722f2f,
updated the code added by the latter commit based on the change of the
former (ie., added ret = to the new rta_addattr_l).

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-20 17:53:27 -07:00
Toke Høiland-Jørgensen 2153e01f36 q_cake: Also print nonat, nowash and no-ack-filter keywords
Similar to the previous patch for no-split-gso, the negative keywords for
'nat', 'wash' and 'ack-filter' were not printed either. Add those as well.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-14 11:32:46 -07:00
Toke Høiland-Jørgensen b914fe5f1c q_cake: Add printing of no-split-gso option
When the GSO splitting was turned into dual split-gso/no-split-gso options,
the printing of the latter was left out. Add that, so output is consistent
with the options passed.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-12 12:59:38 -07:00
Stephen Hemminger b85076cd74 lib: introduce print_nl
A common pattern in iproute commands is to print a line separator
in non-JSON mode. Make that a simple function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-11 08:29:33 -07:00
Caleb Raitto 40c2916fda tc/mqprio: Print extra info on invalid args.
Print the name of the argument that wasn't understood.

Signed-off-by: Caleb Raitto <caraitto@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 12:14:00 -07:00
Stephen Hemminger ad618b7984 tc/fifo: remove unnecessary prototype
The prototype for prio_print_opt is already in tc_util.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:50:22 -07:00
Yousuk Seung 588dd51e2c q_netem: slotting with non-uniform distribution
Extend slotting with support for non-uniform distributions. This is
similar to netem's non-uniform distribution delay feature.

Syntax:
   slot distribution DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \
      [bytes MAX_BYTES]

The syntax and use of the distribution table is the same as in the
non-uniform distribution delay feature. A file DISTRIBUTION must be
present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by
NETEM_DIST_SCALE. A random value x is selected from the table and it
takes DELAY + ( x * JITTER ) as delay. Correlation between values is not
supported.
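
A minimal sketch of the delay computation described above, using floating
point for readability. The value 8192 for NETEM_DIST_SCALE and the function
name slot_delay are assumptions of this example, and the kernel's own
integer arithmetic is not reproduced here.

```python
import random

NETEM_DIST_SCALE = 8192  # assumed scale factor of the table entries


def slot_delay(table, delay_us, jitter_us, rng=random):
    """Pick a table entry and return DELAY + x * JITTER in microseconds."""
    # Table entries are stored scaled, so undo the scaling to get x.
    x = rng.choice(table) / NETEM_DIST_SCALE
    return delay_us + x * jitter_us


# A toy three-entry table: one deviation below, the mean, one above.
toy_table = [-NETEM_DIST_SCALE, 0, NETEM_DIST_SCALE]
print(slot_delay(toy_table, delay_us=800, jitter_us=100))
```

With the toy table, each call returns 700.0, 800.0 or 900.0 microseconds;
a real distribution file holds many more entries.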

Examples:
  Normal distribution delay with mean = 800us and stdev = 100us.
  > tc qdisc add dev eth0 root netem slot distribution normal \
    800us 100us

  Optionally set the max slot size in bytes and/or packets.
  > tc qdisc add dev eth0 root netem slot distribution normal \
    800us 100us bytes 64k packets 42

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:08:19 -07:00
Dave Taht b6268fbd58 q_netem: support delivering packets in delayed time slots
Slotting is a crude approximation of the behaviors of shared media such
as cable, wifi, and LTE, which gather up a bunch of packets within a
varying delay window and deliver them, relative to that, nearly all at
once.

It works within the existing loss, duplication, jitter and delay
parameters of netem. Some amount of inherent latency must be specified,
regardless.

The new "slot" parameter specifies a minimum and maximum delay between
transmission attempts.

The "bytes" and "packets" parameters can be used to limit the amount of
information transferred per slot.

Examples of use:

tc qdisc add dev eth0 root netem delay 200us \
        slot 800us 10ms bytes 64k packets 42

A more correct example, using stacked netem instances and a packet limit
to emulate a tail drop wifi queue with slots and variable packet
delivery, with a 200Mbit isochronous underlying rate, and 20ms path
delay:

tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \
         limit 10000
tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \
         slot 800us 10ms bytes 64k packets 42 limit 512

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:07:46 -07:00
Dave Taht abf70ef494 tc: support conversions to or from 64 bit nanosecond-based time
Using a 32 bit field to represent time in nanoseconds results in a
maximum value of about 4.3 seconds, which is well below many observed
delays in WiFi and LTE, and barely in the ballpark for a trip past the
Earth's moon, Luna.

Using 64 bit time fields in nanoseconds allows us to simulate
network diameters of several hundred light-years. However, only
conversions to and from ns, us, ms, and seconds are provided.

The iproute2 64 bit api uses signed values for time. Being able to
represent positive or negative time allows us to calculate +/- deltas
between, for example, the CLOCK_TAI and CLOCK_REALTIME clocks.
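
The ranges quoted above follow from simple arithmetic, sketched here. The
year length of 365.25 days is a convention chosen for this example.

```python
NSEC_PER_SEC = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, chosen for this sketch


def max_seconds_u32():
    """Largest time an unsigned 32-bit nanosecond field can hold."""
    return (2**32 - 1) / NSEC_PER_SEC


def max_years_s64():
    """Largest time a signed 64-bit nanosecond field can hold, in years."""
    return (2**63 - 1) / NSEC_PER_SEC / SECONDS_PER_YEAR


print("32-bit limit: %.2f seconds" % max_seconds_u32())
print("64-bit signed limit: %.0f years" % max_years_s64())
```

This gives about 4.29 seconds for 32 bits and roughly 292 years for signed
64 bits, which matches the "several hundred light-years" figure when read
as one-way light travel time.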

Time related utility functions in tc_util.c are moved to lib/utils.c.

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:04:38 -07:00
Florent Fourcot 2bfe28710e tc/htb: remove unused variable
Since introduction of htb module, this variable has never been used.

Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-30 08:00:45 -07:00
Mahesh Bandewar 5d5586b058 iproute: make clang happy
These are primarily fixes for "string is not string literal" warnings
/ errors (with -Werror -Wformat-nonliteral). This should be a no-op
change. I had to replace a couple of print helper functions with the
code they call, as it was becoming harder to eliminate these warnings;
however, these helpers were used in only a couple of places, so there is
no major change as such.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-30 07:58:09 -07:00
Stephen Hemminger a8e9f4ae14 tc: drop extern from function prototypes
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-20 16:01:31 -07:00
Phil Sutter ff1ab8edf8 Make colored output configurable
Allow for -color={never,auto,always} to have colored output disabled,
enabled only if stdout is a terminal or enabled regardless of stdout
state.
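
The three modes can be summarised in a small decision sketch. This is an
illustration of the behaviour described above, not the iproute2 code; the
function name color_enabled is hypothetical.

```python
def color_enabled(mode, stdout_is_tty):
    """Decide whether to colorize output for a -color=MODE setting."""
    if mode == "never":
        return False
    if mode == "always":
        return True
    if mode == "auto":
        # Colorize only when stdout is a terminal.
        return stdout_is_tty
    raise ValueError("unknown color mode %r" % mode)


for mode in ("never", "auto", "always"):
    print(mode, color_enabled(mode, stdout_is_tty=False))
```

When output is piped (stdout is not a terminal), only "always" keeps
colors on.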

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-20 08:54:06 -07:00
Phil Sutter 4d82962ccc Merge common code for conditionally colored output
Instead of calling enable_color() conditionally with identical check in
three places, introduce check_enable_color() which does it in one place.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-15 09:55:27 -07:00
Phil Sutter 0d0e0e0bef tc: Fix typo in check for colored output
The check used binary instead of boolean AND, which means colored output
was enabled only if the number of specified '-color' flags was odd.
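
The odd/even effect can be reproduced in a few lines. The sketch assumes
the check combined a count of '-color' flags with a negated boolean; the
variable names and the JSON condition are assumptions, not a quote of the
tc source.

```python
def color_enabled_buggy(color_count, json_output):
    # Bitwise AND against 0 or 1 keeps only the lowest bit of the count.
    return bool(color_count & int(not json_output))


def color_enabled_fixed(color_count, json_output):
    # Boolean AND: any non-zero count enables color unless JSON is active.
    return bool(color_count and not json_output)


for count in range(1, 5):
    print(count,
          color_enabled_buggy(count, json_output=False),
          color_enabled_fixed(count, json_output=False))
```

The buggy variant alternates True, False, True, False as the count grows
from 1 to 4, while the fixed variant stays True.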

Fixes: 2d165c0811 ("tc: implement color output")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-15 09:54:32 -07:00
Nishanth Devarajan 141b55f854 Add SKB Priority qdisc support in tc(8)
sch_skbprio is a qdisc that prioritizes packets according to their skb->priority
field. Under congestion, it drops already-enqueued lower priority packets to
make space available for higher priority packets. Skbprio was conceived as a
solution for denial-of-service defenses that need to route packets with
different priorities as a means to overcome DoS attacks.

Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
Reviewed-by: Michel Machado <michel@digirati.com.br>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-14 07:06:43 -07:00
David Ahern c044be6b34 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-13 07:47:21 -07:00
Toke Høiland-Jørgensen 23a67b008a sch_cake: Make gso-splitting configurable
This patch makes sch_cake's gso/gro splitting configurable
from userspace.

To disable breaking apart superpackets in sch_cake:

tc qdisc replace dev whatever root cake no-split-gso

to enable:

tc qdisc replace dev whatever root cake split-gso

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-13 07:41:44 -07:00
Keara Leibovitz e8bd395508 tc: fix bugs for tcp_flags and ip_attr hex output
Fix hex output for both the ip_attr and tcp_flags print functions.

Sample usage:

$ $TC qdisc add dev lo ingress
$ $TC filter add dev lo parent ffff: prio 3 proto ip flower ip_tos 0x8/32
$ $TC filter add dev lo parent ffff: prio 5 proto ip flower ip_proto tcp \
	tcp_flags 0x909/f00

$ $TC filter show dev lo parent ffff:

filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
  eth_type ipv4
  ip_tos 0x8/32
  not_in_hw
filter protocol ip pref 5 flower chain 0
filter protocol ip pref 5 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  tcp_flags 0x909/f00
  not_in_hw

$ $TC -j filter show dev lo parent ffff:

[{
    "protocol":"ip",
    "pref":3,
    "kind":"flower",
    "chain":0
},{
    "protocol":"ip",
    "pref":3,
    "kind":"flower",
    "chain":0,
    "options": {
	"handle":1,
	"keys": {
	    "eth_type":"ipv4",
	    "ip_tos":"0x8/32"
	},
	"not_in_hw":true
    }
},{
    "protocol":"ip",
    "pref":5,
    "kind":"flower",
    "chain":0
},{
    "protocol":"ip",
    "pref":5,
    "kind":"flower",
    "chain":0,
    "options": {
	"handle":1,
	"keys": {
	    "eth_type":"ipv4",
	    "ip_proto":"tcp",
	    "tcp_flags":"0x909/f00"
	},
	"not_in_hw":true
    }
}]

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-12 14:04:00 -07:00
Stephen Hemminger d66fdfda71 tc: flush after each command in batch mode
After each command flush output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-08 09:23:48 -07:00
David Ahern a0bc57e1ef Merge branch 'iproute2-master' into iproute2-next
Conflicts:
	include/uapi/linux/bpf.h

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-25 10:08:04 -07:00
Jiri Pirko afcd06991d tc: introduce support for chain templates
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-25 10:00:28 -07:00
Or Gerlitz 761ec9e29f tc/flower: Add match on encapsulating tos/ttl
Add matching on tos/ttl of the IP tunnel headers.

For example, here is a decap rule that matches on the tunnel tos:

tc filter add dev vxlan_sys_4789 protocol ip parent ffff: prio 10 flower \
   enc_src_ip 192.168.10.2 enc_dst_ip 192.168.10.1 enc_key_id 100 enc_dst_port 4789 enc_tos 0x30 \
   src_mac e4:11:22:33:44:70 dst_mac e4:11:22:33:44:50  \
   action tunnel_key unset \
   action mirred egress redirect dev eth0_0

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-20 08:59:11 -07:00
Or Gerlitz 9f89b0cc0e tc/act_tunnel_key: Enable setup of tos and ttl
Allow setting tos and ttl for the tunnel.

For example, here is an encap rule that sets the tos of the tunnel:

tc filter add dev eth0_0 protocol ip parent ffff: prio 10 flower \
   src_mac e4:11:22:33:44:50 dst_mac e4:11:22:33:44:70 \
   action tunnel_key set src_ip 192.168.10.1 dst_ip 192.168.10.2 id 100 dst_port 4789 tos 0x30 \
   action mirred egress redirect dev vxlan_sys_4789

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-20 08:58:31 -07:00
Toke Høiland-Jørgensen 77c9fbd06e q_cake: Rename autorate_ingress parameter to use dash as word separator
This is consistent with the other multi-word parameters. Also change the
JSON output to be consistent with the way it is formatted for the other
options.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-20 08:46:42 -07:00
Jesus Sanchez-Palencia b625e36108 tc: Do not use addattr_nest_compat on mqprio and netem
Here we are partially reverting commit c14f9d92ee
"treewide: Use addattr_nest()/addattr_nest_end() to handle nested
attributes" .

As discussed in [1], changing from the 'manually' coded version that
used addattr_l() to addattr_nest_compat() wasn't functionally
equivalent, because the messages now have extra fields appended to them.

This introduced a regression since the implementation of parse_attr()
from both mqprio and netem can't handle this new message format.

Without this fix, mqprio returns an error. netem won't return an error
but its internal configuration ends up wrong.

As an example, this can be reproduced by the following commands when
this patch is not applied:

 1) mqprio
$ tc qdisc replace dev enp3s0 parent root handle 100 mqprio \
	num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
	queues 1@0 1@1 2@2 hw 0

RTNETLINK answers: Numerical result out of range

 2) netem
$ tc qdisc add dev enp3s0 root netem rate 5kbit 20 100 5 \
	distribution normal latency 1 1

$ tc -s qdisc

(...)
qdisc netem 8001: dev enp3s0 root refcnt 9 limit 1000 delay 0us  0us
 Sent 402 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
(...)

With this patch applied, the tc -s qdisc command above for netem instead
reads:

(...)
qdisc netem 8002: dev enp3s0 root refcnt 9 limit 1000 delay 0us  0us \
	rate 5Kbit packetoverhead 20 cellsize 100 celloverhead 5
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
(...)

[1] https://patchwork.ozlabs.org/patch/867860/#1893405

Fixes: c14f9d92ee ("treewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-07-19 15:50:07 -07:00
Toke Høiland-Jørgensen 714444c0cb Add support for CAKE qdisc
sch_cake is intended to squeeze the most bandwidth and latency out of even
the slowest ISP links and routers, while presenting an API simple enough
that even an ISP can configure it.

Example of use on a cable ISP uplink:

tc qdisc add dev eth0 cake bandwidth 20Mbit nat docsis ack-filter

To shape a cable download link (ifb and tc-mirred setup elided)

tc qdisc add dev ifb0 cake bandwidth 200mbit nat docsis ingress wash besteffort

Cake is filled with:

* A hybrid Codel/Blue AQM algorithm, "Cobalt", tied to an FQ_Codel
  derived Flow Queuing system, which autoconfigures based on the bandwidth.
* A novel "triple-isolate" mode (the default) which balances per-host
  and per-flow FQ even through NAT.
* A deficit-based shaper that can also be used in an unlimited mode.
* 8 way set associative hashing to reduce flow collisions to a minimum.
* A reasonable interpretation of various diffserv latency/loss tradeoffs.
* Support for zeroing diffserv markings for entering and exiting traffic.
* Support for interacting well with Docsis 3.0 shaper framing.
* Support for DSL framing types and shapers.
* Support for ack filtering.
* Extensive statistics for measuring loss, ECN markings, and latency variation.

Various versions, still baking, have been available as an out-of-tree build for
kernel versions going back to 3.10, as the embedded router world has been
running a few years behind mainline Linux. A stable version has been
generally available on lede-17.01 and later.

sch_cake replaces a combination of iptables, tc filter, htb and fq_codel
in the sqm-scripts, with sane defaults and vastly simpler configuration.

Cake's principal author is Jonathan Morton, with contributions from
Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebastian Moeller,
Ryan Mounce, Tony Ambardar, Dean Scarff, Nils Andreas Svee, Dave Täht,
and Loganaden Velvindron.

Testing from Pete Heist, Georgios Amanakis, and the many other members of
the cake@lists.bufferbloat.net mailing list.

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-19 09:23:46 -07:00
Qiaobin Fu 697dce7b3a net:sched: add action inheritdsfield to skbedit
The new action inheritdsfield copies the field DS of
IPv4 and IPv6 packets into skb->priority. This enables
later classification of packets based on the DS field.
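
As a rough illustration of the mapping described above (a sketch, not the
kernel code): the DS field is assumed here to be the upper six bits of the
IPv4 TOS byte or IPv6 Traffic Class byte, with the lower two bits carrying
ECN. The function name is hypothetical.

```python
def ds_field_to_priority(tos_or_tclass: int) -> int:
    """Return the 6-bit DS value from a TOS / Traffic Class byte.

    Assumption: the resulting priority is the DS value with the two ECN
    bits shifted out. The helper name is made up for this example.
    """
    if not 0 <= tos_or_tclass <= 0xFF:
        raise ValueError("expected a single byte")
    return tos_or_tclass >> 2
```

With this sketch, a TOS byte of 0xB8 and one of 0xBB (same DS bits,
different ECN bits) yield the same priority value.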

v4:
* Make tc use netlink helper functions

v3:
* Make flag represented in JSON output as a null value

v2:
* Align the output syntax with the input syntax

* Fix the style issues

Original idea by Jamal Hadi Salim <jhs@mojatatu.com>

Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu>
Reviewed-by: Michel Machado <michel@digirati.com.br>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-19 09:17:56 -07:00
Jianbo Liu 1f0a5dfd38 tc: flower: Add support for QinQ
To support matching on both outer and inner vlan headers,
we add new cvlan_id/cvlan_prio/cvlan_ethtype for inner vlan header.

Example:
# tc filter add dev eth0 protocol 802.1ad parent ffff: \
    flower vlan_id 1000 vlan_ethtype 802.1q \
        cvlan_id 100 cvlan_ethtype ipv4 \
    action vlan pop \
    action vlan pop \
    action mirred egress redirect dev eth1

# tc filter show dev eth0 ingress
filter protocol 802.1ad pref 1 flower chain 0
filter protocol 802.1ad pref 1 flower chain 0 handle 0x1
  vlan_id 1000
  vlan_ethtype 802.1Q
  cvlan_id 100
  cvlan_ethtype ip
  eth_type ipv4
  in_hw
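
To make the outer/inner header layout concrete, here is a small sketch
that parses a double-tagged frame by hand. It assumes an 802.1ad outer
tag (TPID 0x88A8) followed by an 802.1Q inner tag (TPID 0x8100); the
function names and the returned dictionary keys are illustrative only and
merely reuse the flower keyword names from the example.

```python
import struct

TPID_8021AD = 0x88A8  # outer (service) tag
TPID_8021Q = 0x8100   # inner (customer) tag


def parse_qinq(frame: bytes) -> dict:
    """Parse a double-tagged Ethernet frame into outer/inner VLAN fields."""
    # Skip 6 bytes dst MAC + 6 bytes src MAC, then read two 4-byte VLAN
    # tags (TPID + TCI each) and the encapsulated EtherType.
    outer_tpid, outer_tci, inner_tpid, inner_tci, eth_type = struct.unpack_from(
        "!HHHHH", frame, 12
    )
    if outer_tpid != TPID_8021AD or inner_tpid != TPID_8021Q:
        raise ValueError("not an 802.1ad frame carrying an 802.1Q tag")
    return {
        "vlan_id": outer_tci & 0x0FFF,   # low 12 bits of the TCI
        "vlan_prio": outer_tci >> 13,    # top 3 bits of the TCI
        "cvlan_id": inner_tci & 0x0FFF,
        "cvlan_prio": inner_tci >> 13,
        "eth_type": eth_type,
    }


def build_example_frame() -> bytes:
    """Build a header-only frame matching the filter example (no payload)."""
    macs = bytes(12)  # zeroed dst/src MACs, enough for the sketch
    tags = struct.pack("!HHHHH", TPID_8021AD, 1000, TPID_8021Q, 100, 0x0800)
    return macs + tags
```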

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-15 13:03:50 -07:00
Vinicius Costa Gomes 7da5ef2200 tc: Add support for the ETF Qdisc
The "Earliest TxTime First" (ETF) queueing discipline allows precise
control of the transmission time of packets by providing a sorted
time-based scheduling of packets.

The syntax is:

tc qdisc add dev DEV parent NODE etf delta <DELTA>
                     clockid <CLOCKID> [offload] [deadline_mode]
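
The idea of "sorted time-based scheduling" can be sketched with a
priority queue keyed on transmission time. This is a toy model, not the
qdisc implementation: the class name is hypothetical, and the way delta
is applied (a packet becomes eligible delta nanoseconds before its
txtime) is an assumption of this sketch.

```python
import heapq
import itertools


class EarliestTxTimeFirst:
    """Toy model: always hand out the packet with the smallest txtime."""

    def __init__(self, delta_ns: int):
        self.delta_ns = delta_ns
        self._heap = []
        # Tie-breaker so packets with equal txtime keep arrival order.
        self._seq = itertools.count()

    def enqueue(self, txtime_ns: int, packet) -> None:
        heapq.heappush(self._heap, (txtime_ns, next(self._seq), packet))

    def dequeue_ready(self, now_ns: int) -> list:
        """Pop every packet whose (txtime - delta) has been reached."""
        ready = []
        while self._heap and self._heap[0][0] - self.delta_ns <= now_ns:
            ready.append(heapq.heappop(self._heap)[2])
        return ready
```

Packets enqueued out of order still come out sorted by txtime, which is
the property the description above refers to.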

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-11 17:50:10 -07:00
Stephen Hemminger b49759c0e7 tc: don't double print rate
Conversion to print stats in JSON forgot to remove existing
fprintf.

Fixes: 4fcec7f366 ("tc: jsonify stats2")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-07-09 09:53:45 -07:00
fumihiko kakuma d529ea2ff4 tc: Fix the bug not to display prio and quantum options of htb
A command line like 'tc -d class show dev dev-name' does not
display the values of the prio and quantum options when the htb qdisc
is used. This patch fixes the bug.

Signed-off-by: Fumihiko Kakuma <kakuma@valinux.co.jp>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-07-07 09:57:45 -07:00
Roi Dayan 425dcc2741 tc: Fix output of ip attributes
Example output is of tos and ttl.
Before this fix, the format used %x, which caused output of the pointer
instead of the intended string created in the out variable.

Fixes: e28b88a464 ("tc: jsonify flower filter")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-07-07 09:57:45 -07:00
Simon Horman 6217917a38 tc: m_tunnel_key: Add tunnel option support to act_tunnel_key
Allow setting tunnel options using the act_tunnel_key action.

Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.

 # ip link add name geneve0 type geneve dstport 0 external
 # tc qdisc add dev eth0 ingress
 # tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
        ip_proto udp \
        action tunnel_key \
            set src_ip 10.0.99.192 \
            dst_ip 10.0.99.193 \
            dst_port 6081 \
            id 11 \
            geneve_opts 0102:80:00800022,0102:80:00800022 \
    action mirred egress redirect dev geneve0
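
The class:type:data notation with a comma delimiter can be unpacked as in
the sketch below. Treating all three fields as hexadecimal is an
assumption based on how the example above looks; the function name is
hypothetical and this is not the parser used by tc.

```python
def parse_geneve_opts(spec: str) -> list:
    """Split 'class:type:data[,class:type:data...]' into dicts.

    Assumption: class, type and data are all hexadecimal strings.
    """
    options = []
    for item in spec.split(","):
        # Raises ValueError if an item does not have exactly three fields.
        opt_class, opt_type, data = item.split(":")
        options.append({
            "class": int(opt_class, 16),
            "type": int(opt_type, 16),
            "data": bytes.fromhex(data),
        })
    return options
```

Calling it on the geneve_opts value from the example returns two
identical option entries.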

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-06 09:10:05 -07:00
Keara Leibovitz 4757a54799 tc: jsonify nat action
Add json output support for nat action

Example output:

~$ $TC actions add action nat egress 10.10.10.1 20.20.20.2 index 2
~$ $TC actions add action nat ingress 100.100.100.1/32 200.200.200.2 \
	continue index 99
~$ $TC -j actions ls action nat

[{
	"total acts": 2
}, {
	"actions": [{
		"order": 0,
		"type": "nat",
		"direction": "egress",
		"old_addr": "10.10.10.1/32",
		"new_addr": "20.20.20.2",
		"control_action": {
			"type": "pass"
		},
		"index": 2,
		"ref": 1,
		"bind": 0
	}, {
		"order": 1,
		"type": "nat",
		"direction": "ingress",
		"old_addr": "100.100.100.1/32",
		"new_addr": "200.200.200.2",
		"control_action": {
			"type": "continue"
		},
		"index": 99,
		"ref": 1,
		"bind": 0
	}]
}]
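
A consumer of this output might read it as follows. The sketch embeds a
compacted copy of the JSON shown above; the function name and the shape
of the returned mapping are choices made for this example only.

```python
import json

# Compacted copy of the example output above.
TC_OUTPUT = """
[{"total acts": 2},
 {"actions": [
   {"order": 0, "type": "nat", "direction": "egress",
    "old_addr": "10.10.10.1/32", "new_addr": "20.20.20.2",
    "control_action": {"type": "pass"}, "index": 2, "ref": 1, "bind": 0},
   {"order": 1, "type": "nat", "direction": "ingress",
    "old_addr": "100.100.100.1/32", "new_addr": "200.200.200.2",
    "control_action": {"type": "continue"}, "index": 99, "ref": 1, "bind": 0}
 ]}]
"""


def nat_rules(raw: str) -> dict:
    """Map action index -> (direction, old_addr, new_addr, control action)."""
    rules = {}
    for section in json.loads(raw):
        # Only the section that carries an "actions" array contributes.
        for act in section.get("actions", []):
            rules[act["index"]] = (
                act["direction"],
                act["old_addr"],
                act["new_addr"],
                act["control_action"]["type"],
            )
    return rules
```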

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-06-20 10:20:34 -07:00
Vlad Buslov b133392468 tc: fix batch force option
When sending an accumulated compound command results in an error, check the 'force'
option before exiting. Move return code check after putting batch bufs and
freeing iovs to prevent memory leak. Break from loop, instead of returning
error code to allow cleanup at the end of batch function. Don't reset ret
code on each iteration.

Fixes: 485d0c6001 ("tc: Add batchsize feature for filter and actions")
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-06-20 09:32:36 -07:00
Keara Leibovitz 831b5d40d9 tc: add json support in csum action
Add json output support for checksum action.

Example output:

~$ $TC actions add action csum udp continue index 7
~$ $TC actions add action csum icmp iph igmp pipe index 200 cookie 112233
~$ $TC -j actions ls action csum

[{
    "total acts":2
}, {
    "actions": [{
        "order":0,
        "csum":"udp",
        "control_action": {
            "type":"continue"
        },
        "index":7,
        "ref":1,
        "bind":0
    }, {
        "order":1,
        "csum":"iph, icmp, igmp",
        "control_action": {
            "type":"pipe"
        },
        "index":200,
        "ref":1,
        "bind":0,
        "cookie":"112233"
    }]
}]

v2:
    Don't initialize char buf[64];
    Add output example

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-06-05 15:30:30 -07:00
Roman Mashak 53d34eb66c tc: add missing space symbol in ife output
In order to make TDC tests match the output patterns, the missing space
character must be added in the mode output string.

Fixes: 8744c5d338 ("tc: jsonify ife action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-05-18 09:10:48 -07:00
Marcelo Ricardo Leitner ac6a4c2299 tc: flower: add support for verbose logging
Currently there is no way to log offloading errors if the rule is not
explicitly marked as skip_sw, making it hard for other applications such
as Open vSwitch to log why a given rule could not be offloaded.

This patch adds support for signaling the kernel that more verbose
logging is wanted, which now will include such messages.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-05-18 09:06:04 -07:00
David Ahern 7732148d1d Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-05-05 11:07:47 -07:00
Toke Høiland-Jørgensen bf717756b5 ingress: Don't break JSON output
The dash printed by the ingress qdisc breaks JSON output, so only print it
in regular output mode.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-25 11:08:39 -07:00
David Ahern 0d93d1e736 Merge branch 'master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-23 19:42:21 -07:00
Roman Mashak 0aaf62fcb6 tc: return on invalid smac or dmac in ife action
Return on invalid smac/dmac and use invarg consistently for invalid
arguments report.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
2018-04-20 10:35:21 -07:00
Stephen Hemminger 0b01f088ee flower: use 16 bit format where possible
Should use print_hu not print_uint for 16 bit value.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-20 10:35:00 -07:00
Roman Mashak 8744c5d338 tc: jsonify ife action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-15 17:23:17 -07:00
Roman Mashak 7b17701717 tc: jsonify skbedit action
v2:
   Fixed string format in print_string()

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-15 17:09:16 -07:00
Roman Mashak 8feb516bfc tc: jsonify tunnel_key action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-08 10:52:33 -07:00
Roman Mashak 1d3c91a7c4 tc: jsonify connmark action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-08 10:52:32 -07:00
Yuval Mintz 0927bf83e7 tc: Correct json output for actions
Commit 9fd3f0b255 ("tc: enable json output for actions") added JSON
support for tc-actions at the expense of breaking other use cases that
reach tc_print_action(), as the latter don't expect the 'actions' array
to be a new object.

Consider the following, taken during a run of the tc_chain.sh selftest,
and see the latter command output is broken:

$ ./tc/tc -j -p actions list action gact | grep -C 3 actions
[ {
        "total acts": 1
    },{
        "actions": [ {
                "order": 0,

$ ./tc/tc -p -j -s filter show dev enp3s0np2 ingress | grep -C 3 actions
            },
            "skip_hw": true,
            "not_in_hw": true,{
                "actions": [ {
                        "order": 1,
                        "kind": "gact",
                        "control_action": {

Relocate the open/close of the JSON object to declare the object only
for the case that needs it.
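
The breakage is easy to reproduce with any strict JSON parser. The
strings below are hand-reduced from the filter output shown above (they
are not verbatim tc output): an object opened inside another object
without a key is rejected, while the same array attached directly under
an "actions" key parses.

```python
import json

# Reduced from the broken filter output: an anonymous object appears
# where a "key": value pair is expected.
BROKEN = '{"skip_hw": true, "not_in_hw": true, {"actions": [{"order": 1}]}}'
# Same data with the array declared directly as a member.
FIXED = '{"skip_hw": true, "not_in_hw": true, "actions": [{"order": 1}]}'


def is_valid_json(text: str) -> bool:
    """Return True when the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```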

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Tested-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 16:41:36 -07:00
David Ahern 2c62a64d60 Merge branch 'iproute2-master' into iproute2-next
Conflicts:
	bridge/mdb.c
	misc/ss.c
	tc/tc.c

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-02 10:47:34 -07:00
Roman Mashak 7ada016aeb tc: jsonify sample action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-01 08:44:31 -07:00
Roman Mashak c2f60f5c8e tc: support oneline mode in action generic printer functions
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-04-01 08:37:32 -07:00
Roman Mashak 9fd3f0b255 tc: enable json output for actions
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-30 08:55:17 -07:00
Roman Mashak 6e8634eb13 tc: add oneline mode
Add initial support for oneline mode in tc; actions, filters and qdiscs
will be gradually updated in the follow-up patches.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-30 08:18:58 -07:00
Stephen Hemminger d5732e3470 ematch: fix possible snprintf overflow
Fixes gcc 8 warning about possible snprintf overflow

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-29 08:32:43 -07:00
Stephen Hemminger b8a6088e13 tc_class: fix snprintf warning
Size buffer big enough to avoid any possible overflow.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-29 08:32:43 -07:00
Stephen Hemminger 95744efac4 pedit: fix strncpy warning
Newer versions of Gcc warn about string truncation.
Fix by using strlcpy.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-29 08:30:28 -07:00
Roman Mashak d64a22f393 tc: print index, refcnt & bindcnt for nat action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
2018-03-27 17:00:32 -07:00
Stephen Hemminger fec62c0ec7 tc: help and whitespace cleanup
Break long lines, and cleanup usage message.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-27 15:33:13 -07:00
David Ahern 54eae5f76d Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-27 12:33:02 -07:00
Roman Mashak 990b1d90d7 tc: print actual action for connmark action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
2018-03-27 09:03:15 -07:00
Roi Dayan 17504be81d tc: Fix compilation error with old iptables
The compat_rev field does not exist in old versions of iptables,
e.g. iptables 1.4.

Fixes: dd29621578 ("tc: add em_ipt ematch for calling xtables matches from tc matching context")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-27 06:38:52 -07:00
Roman Mashak bf7d148803 tc: use get_u32() in psample action to match types
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Yotam Gigi <yotam.gi@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-16 13:38:50 -07:00
Roman Mashak e9fa16583a tc: print actual action for sample action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-16 13:38:38 -07:00
Toke Høiland-Jørgensen 997f2dc193 tc: Add JSON output of fq_codel stats
Enable proper JSON output support for fq_codel in `tc -s qdisc` output.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-13 18:05:40 -07:00
Toke Høiland-Jørgensen d7d044ff53 tc: Add missing documentation for codel and fq_codel parameters
Add missing documentation of the memory_limit fq_codel parameter and the
ce_threshold codel and fq_codel parameters.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-13 18:05:35 -07:00
Pieter Jansen van Vuuren fb4e6abfca tc: f_flower: Add support for matching first frag packets
Add matching support for distinguishing between first and later fragmented
packets.

 # tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
	ip_flags firstfrag \
        ip_proto udp \
    action mirred egress redirect dev eth1

 # tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
	ip_flags nofirstfrag \
        ip_proto udp \
    action mirred egress redirect dev eth1
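
How a first fragment differs from a later one can be sketched from the
IPv4 header alone. The sketch assumes the usual layout of the 16-bit
flags/fragment-offset field (MF flag 0x2000, 13-bit offset); the function
name is hypothetical and the returned labels simply reuse the keyword
names for readability.

```python
IP_MF = 0x2000       # "more fragments" flag
IP_OFFMASK = 0x1FFF  # fragment offset, in 8-byte units


def classify_fragment(frag_off_field: int) -> str:
    """Classify a packet from its IPv4 flags/fragment-offset field."""
    offset = frag_off_field & IP_OFFMASK
    more_fragments = bool(frag_off_field & IP_MF)
    if offset == 0 and not more_fragments:
        return "nofrag"          # not fragmented at all
    # A fragment at offset 0 is the first one; anything else is later.
    return "firstfrag" if offset == 0 else "nofirstfrag"
```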

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-13 18:03:21 -07:00
David Ahern e9625d6aea Merge branch 'iproute2-master' into iproute2-next
Conflicts:
	bridge/mdb.c

Updated bridge/bridge.c per removal of check_if_color_enabled by commit
1ca4341d2c ("color: disable color when json output is requested")

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-03-13 17:48:10 -07:00
Serhey Popovych fe99adbca4 utils: Introduce and use nodev() helper routine
There are a couple of places where we report an error when no network
device is found. In all of them we output a message in the same format to
stderr and either return -1 or 1 to the caller or exit with -1.

Introduce a new helper function nodev() that takes the name of the network
device that caused the error and returns -1 to its caller. Either call exit()
or return to the caller to preserve the behaviour before the change.

Use -nodev() in traffic control (tc) code to return 1.

Simplify expression for checking for argument being 0/NULL in @if
statement.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2018-03-11 17:58:36 -07:00
Davide Caratti 75ef7b18d2 tc: fix parsing of the control action
If the user didn't specify any control action, don't pop the command line
arguments: otherwise, parsing of the next argument (typically the 'index'
keyword) results in an error, causing the following 'tc-testing' failures:

 Test a6d6: Add skbedit action with index
 Test 38f3: Delete skbedit action
 Test a568: Add action with ife type
 Test b983: Add action without ife type
 Test 7d50: Add skbmod action to set destination mac
 Test 9b29: Add skbmod action to set source mac
 Test e93a: Delete an skbmod action

Also, add missing parse for 'ok' control action to m_police, to fix the
following 'tc-testing' failure:

 Test 8dd5: Add police action with control ok

tested with:
 # ./tdc.py

test results:
 all tests ok using kernel 4.16-rc2, except 9aa8 "Get a single skbmod
 action from a list" (which is failing also before this commit)
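
The first fix above boils down to "only consume an argument when it
really is a control action". A minimal sketch of that rule follows; the
keyword set is a partial, illustrative list, and the function name and
the default value are hypothetical rather than taken from tc.

```python
# Illustrative subset of control action keywords.
CONTROL_ACTIONS = {"pass", "ok", "pipe", "drop", "continue", "reclassify"}


def parse_control(args: list, pos: int, default: str = "pipe"):
    """Return (control_action, next_pos).

    The position only advances when args[pos] is a control action, so a
    following keyword such as 'index' is left for the caller to parse.
    """
    if pos < len(args) and args[pos] in CONTROL_ACTIONS:
        return args[pos], pos + 1
    return default, pos
```

With no control action given, `parse_control(["index", "7"], 0)` leaves
the position at 0, so 'index' is still the next argument to be parsed.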

Fixes: 3572e01a09 ("tc: util: Don't call NEXT_ARG_FWD() in __parse_action_control()")
Cc: Michal Privoznik <mprivozn@redhat.com>
Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-03-04 09:01:38 -08:00
Eyal Birger dd29621578 tc: add em_ipt ematch for calling xtables matches from tc matching context
This commit adds a new tc ematch for using netfilter xtables matches.

This allows early classification as well as mirroring/redirecting traffic
based on logic implemented in netfilter extensions.

The currently supported use case is classification based on the incoming IPSec
state used during decapsulation using the 'policy' iptables extension
(xt_policy).

The matcher uses libxtables for parsing the input parameters.

Example use for matching an IPSec state with reqid 1:

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: \
    basic match 'ipt(-m policy --dir in --pol ipsec --reqid 1)' \
    action drop

This is the user-space counterpart of kernel commit ccc007e4a746
("net: sched: add em_ipt ematch for calling xtables matches")

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-02-27 09:43:16 -08:00
Eyal Birger 526862038e tc: ematch: add parse_eopt_argv() method for providing ematches with argv parameters
ematch uses YACC to parse ematch arguments and places them in struct bstr
linked lists.

It is useful to be able to receive parameters as argc,argv in order to use
getopt (and alike) argument parsers.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-02-27 09:43:06 -08:00
Adam Vyskovsky 2fb854d07c tc: fix an off-by-one error while printing tc actions
The tc_print_action() function did not print all tc actions
when e.g. TCA_ACT_MAX_PRIO actions were defined for a single
tc filter.
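
The class of bug can be illustrated with a 1-based table walked by a loop
whose upper bound is one too small. This is a generic sketch, not the
actual tc_print_action() code: the table layout, the helper names, and
the value used for TCA_ACT_MAX_PRIO are assumptions of the example.

```python
TCA_ACT_MAX_PRIO = 32  # assumed value for this sketch


def make_full_table() -> list:
    """Build a 1-based table: slot 0 unused, slots 1..MAX filled."""
    return [None] + [f"action-{i}" for i in range(1, TCA_ACT_MAX_PRIO + 1)]


def collect_actions(table: list, inclusive: bool) -> list:
    """Walk the table; the exclusive bound never reaches the last slot."""
    upper = TCA_ACT_MAX_PRIO + 1 if inclusive else TCA_ACT_MAX_PRIO
    return [table[i] for i in range(upper) if table[i] is not None]
```

With the exclusive bound the last action is silently skipped; with the
inclusive bound every defined action is visited.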

Signed-off-by: Adam Vyskovsky <adamvyskovsky@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-02-23 08:18:29 -08:00
Stephen Hemminger 2d165c0811 tc: implement color output
Implement the -color option; in this case -co is ambiguous
since it was already used for -conf.
For now this just means putting device name in color.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-02-21 09:12:28 -08:00
Serhey Popovych 5433656705 ip: Use single variable to represent -pretty
After commit a233caa0aa ("json: make pretty printing optional") I get
following build failure:

    LINK     rtmon
    ../lib/libutil.a(json_print.o): In function `new_json_obj':
    json_print.c:(.text+0x35): undefined reference to `show_pretty'
    collect2: error: ld returned 1 exit status
    make[1]: *** [rtmon] Error 1
    make: *** [all] Error 2

It is caused by missing show_pretty variable in rtmon.

On the other hand, in tc/tc.c there are two distinct variables and a single
matches() call that handles the -pretty option, thus setting show_pretty
will never happen. Note that since commit 44dcfe8201 ("Change
formatting of u32 back to default") show_pretty is used in tc/f_u32.c,
so this is the first place where -pretty was introduced.

Furthermore, other utilities like misc/ifstat.c and misc/nstat.c define
a pretty variable, however only for their own purposes. They both support
JSON output and thus depend on show_pretty in new_json_obj().

Given the above, use a common variable to represent the -pretty option: define
it in utils.c and declare it in utils.h, which is commonly used. Replace
show_pretty with pretty.

Fixes: a233caa0aa ("json: make pretty printing optional")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-02-16 08:13:36 -08:00
Stephen Hemminger a233caa0aa json: make pretty printing optional
Since JSON is intended for programmatic consumption, it makes
sense for the default output format to be as concise as possible.

For programmer and other uses, it is helpful to keep the pretty
whitespace format; therefore enable it with -p flag.
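
The same trade-off exists in most JSON libraries. As an analogy only
(this is Python's json module, not the iproute2 JSON writer), a compact
default with opt-in pretty printing could look like this; the function
name and the `pretty` parameter are made up for the example.

```python
import json


def render(obj, pretty: bool = False) -> str:
    """Serialize compactly by default; indent only when asked to."""
    if pretty:
        return json.dumps(obj, indent=4)
    # No whitespace after separators keeps the output as short as possible.
    return json.dumps(obj, separators=(",", ":"))
```

Both forms carry identical data; only the whitespace differs.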

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-02-10 08:15:08 -08:00
Serhey Popovych c14f9d92ee treewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes
We have helper routines to support nested attribute addition into
netlink buffer: use them instead of open coding.

Use addattr_nest_compat()/addattr_nest_compat_end() where appropriate.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-02-02 15:01:09 -08:00
David Ahern 1e24e773f1 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-29 08:24:57 -08:00
Jakub Kicinski 44c7655186 tc: fix second printing of requeues
Non-JSON tc qdisc output used to print the "requeues" statistic
twice.  Commit 4fcec7f366 ("tc: jsonify stats2") tried to preserve
this behaviour for both standard output and JSON, but used the wrong
statistic (q.qlen).  Also duplicating keys in JSON is not allowed,
so the second occurrence should be completely skipped with JSON.
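
Why a repeated key is a problem for consumers can be seen with a parser
hook that refuses duplicates. This is a sketch using Python's json
module; the helper names are hypothetical. Without such a hook, Python
keeps only the last value for a repeated key, so one of the two
"requeues" values would be lost silently.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises on a repeated key within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def strict_loads(text: str):
    """Parse JSON, failing loudly when an object repeats a key."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)
```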

Fixes: 4fcec7f366 ("tc: jsonify stats2")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-27 16:06:54 -08:00
Jakub Kicinski c061b75895 tc: prio: JSON-ify prio output
Make JSON output work with prio Qdiscs.  This will also make
other qdiscs which reuse the print_qopt work, like mqprio or
pfifo_fast.

Note that there is a double space between "priomap" and first
prio number.  Keep this original behaviour.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-26 13:00:18 -08:00
Jakub Kicinski 097415d510 tc: red: JSON-ify RED output
Make JSON output work with RED Qdiscs.  Float/double printing
helpers have to be added/uncommented to print the probability.
Since TC stats in general are not split out to a separate object
the xstats printed by this patch are not separated either.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-26 12:59:55 -08:00
David Ahern 6517b5c0ac Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-24 09:59:03 -08:00
Wolfgang Bumiller 7ac29190db tc/lexer: let quotes actually start strings
The lexer will go with the longest match, so previously
the starting double quotes of a string would be swallowed by
the [^ \t\r\n()]+ pattern leaving the user no way to
actually use strings with escape sequences.
Fix this by not allowing this case to start with double
quotes.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2018-01-24 08:49:10 -08:00
Jiri Pirko 063463efd7 tc: implement ingress/egress block index attributes for qdiscs
During qdisc creation it is possible to specify a shared block for both
ingress and egress. Pass these values to the kernel according to the command
line options.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-21 10:42:57 -08:00
Jiri Pirko 0c7cef9669 tc: introduce support for block-handle for filter operations
So far, qdisc was the only handle that could be used to manipulate
filters. Kernel added support for using block to manipulate it. So add
the support to use block index to manipulate filters. The magic
TCM_IFINDEX_MAGIC_BLOCK indicates the block index is in use.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-21 10:42:53 -08:00
Jiri Pirko d0bcedd549 tc: introduce tc_qdisc_block_exists helper
This helper uses a qdisc dump to list all qdiscs and find whether the block
index in question is used by any of them. That means the block with the
specified index exists.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-21 10:42:35 -08:00
David Ahern 8c75f69411 Merge branch 'master' into net-next
Conflicts:
	ip/link_gre.c
	ip/link_gre6.c

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-21 09:37:39 -08:00
Jakub Kicinski e0850bdedc tc: red: allow setting th_min and th_max to the same value
Setting th_min and th_max to the same value may be useful for DCTCP
deployments.  The original DCTCP paper describes it as a simplest way
of achieving simple ECN threshold marking.  Indeed, there doesn't seem
to be any simpler qdisc in Linux which would allow such a setup today.
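
Why equal thresholds give simple threshold marking can be seen from a
textbook-style RED probability curve. This is a simplified sketch (no
averaging, no gentle mode), not the kernel's RED code; the function and
parameter names are assumptions.

```python
def red_mark_probability(avg_queue: float, th_min: float, th_max: float,
                         max_p: float) -> float:
    """Simplified RED marking probability for a given average queue size."""
    if th_min > th_max:
        raise ValueError("th_min must not exceed th_max")
    if avg_queue < th_min:
        return 0.0
    if avg_queue >= th_max:
        return 1.0
    # Only reachable when th_min < th_max, so the division is safe.
    return max_p * (avg_queue - th_min) / (th_max - th_min)
```

When th_min equals th_max the linear ramp disappears, leaving a step:
nothing is marked below the threshold and everything at or above it.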

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-19 12:35:23 -08:00
Phil Sutter 6f7df6b2a1 tc: Optimize gact action lookup
When adding a filter with a gact action such as 'drop', tc first tries
to open a shared object with equivalent name (m_drop.so in this case)
before trying gact. Avoid this by matching the action name against those
handled by gact prior to calling get_action_kind().

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
2018-01-17 10:27:47 -08:00
Chris Mi 485d0c6001 tc: Add batchsize feature for filter and actions
Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this support, at most 128
commands can be accumulated before sending to kernel.

Now it only works for the following successive commands:
1. filter add/delete/change/replace
2. actions add/change/replace
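
The accumulation rule can be sketched as a grouping step over the lines
of a batch file. This is a simplification under stated assumptions, not
the tc implementation: the function names are hypothetical, commands are
matched on their first two words only, and filter and actions commands
are allowed to share a batch here, which the real tool may not do.

```python
MAX_BATCH = 128  # "at most 128 commands", per the description above


def is_batchable(command: str) -> bool:
    """True for the command kinds listed above."""
    words = command.split()
    if len(words) < 2:
        return False
    if words[0] == "filter":
        return words[1] in {"add", "delete", "change", "replace"}
    if words[0] == "actions":
        return words[1] in {"add", "change", "replace"}
    return False


def group_commands(commands):
    """Yield lists of commands; successive batchable ones are grouped."""
    batch = []
    for command in commands:
        if is_batchable(command):
            batch.append(command)
            if len(batch) == MAX_BATCH:
                yield batch
                batch = []
        else:
            # A non-batchable command ends the current run and goes alone.
            if batch:
                yield batch
                batch = []
            yield [command]
    if batch:
        yield batch
```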

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-14 09:03:35 -08:00
Stephen Hemminger 7d63671030 tc: remove no longer relevant README
This document described how kernel and tc used to handle
timing. In last two years, kernel has switched over to using
ktime. Nothing to see here, move along.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:21:22 -08:00
Jamal Hadi Salim 24a5a48e27 tc: Fix filter protocol output
Fixes: 249284ff5a ("tc: jsonify filter core")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
2018-01-09 08:09:10 -08:00
Yuval Mintz b97c6fa71d qdisc: print offload indication
Use the newly added TCA_HW_OFFLOAD indication from kernel
to print a consistent 'offloaded' message to user when listing qdiscs.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-27 13:55:16 -08:00
Chris Mi 83cf5bc73b tc: fix command "tc actions del" hang issue
If command is RTM_DELACTION, a non-NULL pointer is passed to rtnl_talk().
Then flag NLM_F_ACK is not set on n->nlmsg_flags and netlink_ack() will
not be called. The tc command will wait for the reply forever.

Fixes: 86bf43c7c2 ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-14 21:17:04 -08:00
Jiri Pirko 1876ab0779 tc: fix json array closing
Fixes: 2704bd6255 ("tc: jsonify actions core")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-12-13 18:16:27 -08:00
Michal Privoznik 3572e01a09 tc: util: Don't call NEXT_ARG_FWD() in __parse_action_control()
Not all callers want parse_action_control*() to advance the
arguments. For instance act_parse_police() does the argument
advancing itself.

Fixes: e67aba5595 ("tc: actions: add helpers to parse and print control actions")
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-12-08 10:29:01 -08:00
Stephen Hemminger c6a656f4f9 m_mirred: style cleanups
Fix whitespace and long lines.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-11-26 12:42:17 -08:00
Stephen Hemminger 5c235ac27e m_gact: whitespace cleanup
Fix whitespace errors reported by checkpatch

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-11-26 12:38:21 -08:00
Stephen Hemminger ed4856919f m_action: style cleanup
Break long lines, and use bool where possible.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-11-26 12:36:15 -08:00
Stephen Hemminger eb4bccf12b m_vlan: style cleanups
Break long lines and make duplicated code into function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-11-26 12:28:55 -08:00
Jiri Pirko b021ee40f6 tc: jsonify vlan action
Add json output to vlan action.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 502c4adf19 tc: jsonify mirred action
Add json output to mirred action.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 66fedb6df0 tc: jsonify gact action
Add json output to gact action.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 2704bd6255 tc: jsonify actions core
Add json output to actions core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 619ca351e3 tc: jsonify matchall filter
Add json output to matchall filter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko e28b88a464 tc: jsonify flower filter
Add json output to flower filter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 249284ff5a tc: jsonify filter core
Add json output to filter core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko f354fa6aa5 tc: jsonify htb qdisc
Add json output to htb qdisc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 378ac491f5 tc: jsonify fq_codel qdisc
Add json output to fq_codel qdisc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 4fcec7f366 tc: jsonify stats2
Add json output to stats2.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko c91d262f41 tc: jsonify qdisc core
Add json output to qdisc core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:20:51 -08:00
Jiri Pirko 81051c60c2 tc: remove action cookie len from printout
Make the output the same as the input and avoid printing the unnecessary len.

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes: fd8b3d2c1b ("actions: Add support for user cookies")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:18:38 -08:00
Jiri Pirko abff45b802 tc: move action cookie print out of the stats if
Cookie print was made dependent on show_stats for no good reason. Fix
this by pushing cookie print out of the stats if.

Fixes: fd8b3d2c1b ("actions: Add support for user cookies")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-11-26 12:18:38 -08:00
Jakub Kicinski eb91c55731 f_bpf: communicate ifindex for eBPF offload
Split parsing and loading of the eBPF program and, if skip_sw is set,
load the program for the ifindex to which the qdisc is attached.

Note that the ifindex will be ignored for programs which are already
loaded (e.g. when using pinned programs), but in that case we just
trust the user knows what he's doing.  Hopefully we will get extack
soon in the driver to help debugging this case.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-26 11:57:57 -08:00
Jakub Kicinski 01ea76b1cf tc_filter: resolve device name before parsing filter
Move resolving device name into an ifindex before calling filter
specific callbacks.  This way if filters need the ifindex, they
can read it from the request.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-26 11:57:57 -08:00
Jakub Kicinski 67c857df80 {f, m}_bpf: don't allow specifying multiple bpf programs
Both the BPF filter and the BPF action allow users to specify "run"
multiple times, and only the last one will be considered by
the kernel.  Explicitly refuse such command lines.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-26 11:57:57 -08:00
Jakub Kicinski 399db8392b bpf: rename bpf_parse_common() to bpf_parse_and_load_common()
bpf_parse_common() parses and loads the program.  Rename it
accordingly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-11-26 11:57:57 -08:00