Match all MPLS fields using smallest and highest possible values.
Test the two ways of specifying MPLS header matching:
* with the basic mpls_{label,tc,bos,ttl} keywords (match only on the
first LSE),
* with the more generic "lse" keyword (allows matching at different
depth of the MPLS label stack).
This test file allows to find problems like the one fixed by
Linux commit 7fdd375e3830 ("net: sched: Fix dump of MPLS_OPT_LSE_LABEL
attribute in cls_flower").
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch allows the user to set and retrieve the
IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
command line argument
This parameter controls the requested size of the queue for
broadcast and multicast packages in the macvlan driver.
If not specified, the driver default (1000) will be used.
Note: The request is per macvlan but the actually used queue
length per port is the maximum of any request to any macvlan
connected to the same port.
For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
is also retrieved and displayed in order to aid in the understanding
of the setting. However, it can of course not be directly set.
Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
Signed-off-by: David Ahern <dsahern@gmail.com>
add_addr_accepted value is not printed if add_addr_signal value is 0.
Fix this properly looking for add_addr_accepted value, instead.
Fixes: 9c3be2c0ee ("ss: mptcp: add msk diag interface support")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
keys_ex is dinamically allocated with calloc on line 770, but
is not freed in case of error at line 823.
Fixes: 081d6c310d ("tc: pedit: Support JSON dumping")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
nlg_ntf is dinamically allocated in mnlg_socket_open(), and is freed on
the out: return path. However, some error paths do not free it,
resulting in memory leak.
This commit fix this using mnlg_socket_close(), and reporting the
correct error number when required.
Fixes: 9b13cddfe2 ("devlink: implement flash status monitoring")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Commit 924c43778a ("man: tc-ct.8: Add manual page for ct tc action")
add man page for tc-ct, but it brings with it a bogus block of text
in the benning of tc-flower man page.
This commit simply removes it.
Fixes: 924c43778a ("man: tc-ct.8: Add manual page for ct tc action")
Reported-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Machata says:
====================
Add support to the dcb tool for the following three DCB objects:
- PFC, for "Priority-based Flow Control", allows configuration of priority
lossiness, and related toggles.
- DCBNL buffer interfaces are an extension to the 802.1q DCB interfaces and
allow configuration of port headroom buffers.
- DCBNL maxrate interfaces are an extension to the 802.1q DCB interfaces
and allow configuration of rate with which traffic in a given traffic
class is sent.
Patches #1-#4 fix small issues in the current DCB code and man pages.
Patch #5 adds new helpers to the DCB dispatcher.
Patches #6 and #7 add support for command line arguments -s and -i. These
enable, respectively, display of statistical counters, and ISO/IEC mode of
rate units.
Patches #8-#10 add the subtools themselves and their man pages.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
DCBNL maxrate interfaces are an extension to the 802.1q DCB interfaces and
allow configuration of rate with which traffic in a given traffic class is
sent.
Add a dcb subtool to allow showing and tweaking of this per-TC maximum
rate. For example:
# dcb maxrate show dev eni1np1
tc-maxrate 0:25Gbit 1:25Gbit 2:25Gbit 3:25Gbit 4:25Gbit 5:25Gbit 6:100Gbit 7:25Gbit
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
DCBNL buffer interfaces are an extension to the 802.1q DCB interfaces and
allow configuration of port headroom buffers.
Add a dcb subtool to allow showing and tweaking of buffer priority mapping
and buffer sizes. For example:
# dcb buf show dev eni1np1
prio-buffer 0:0 1:0 2:0 3:3 4:0 5:0 6:6 7:0
buffer-size 0:10000 1:0 2:0 3:70000 4:0 5:0 6:10000 7:0
total-size 221072
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow switching "dcb" into the ISO/IEC mode of units by passing -i.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow selective display of statistical counters by passing -s.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The DCB buffer object has a settable array of 32-bit quantities, and the
maxrate object of 64-bit ones. Adjust dcb_parse_mapping() and related
helpers to support 64-bit values in mappings, and add appropriate helpers.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
None, one, or many parameters can be given on the command line, but
the current synopsis allows only none or one. Fix it.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
"dcb ets show dev X help" currently shows full "ets" help instead of just
help for the show command. Fix it.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
getopt_long() currently includes "c" and "n" in the short option string.
These probably slipped in as a cut'n'paste, and are not actually accepted.
Remove them.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add pr_out_dev() helper function and use it both by cmd_dev_show_cb()
and by cmd_mon_show_cb().
Dev stats will be added on the next patch to dev context, so
cmd_mon_show_cb() should print the whole dev context and not just dev
handle.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add reload action and reload limit to devlink reload command to enable
the user to select the reload action required and constrains limits on
these actions that he may want to ensure.
The following reload actions are supported:
driver_reinit: driver entities re-initialization, applying
devlink-param and devlink-resource values.
fw_activate: firmware activate.
The uAPI is backward compatible, if the reload action option is omitted
from the reload command, the driver reinit action will be used.
Note that when required to do firmware activation some drivers may need
to reload the driver. On the other hand some drivers may need to reset
the firmware to reinitialize the driver entities. Therefore, the devlink
reload command returns the actions which were actually performed.
By default reload actions are not limited and driver implementation may
include reset or downtime as needed to perform the actions. However, if
reload limit is selected, the driver should perform only if it can do it
while keeping the limit constraints.
Reload limit added:
no_reset: No reset allowed, no down time allowed, no link flap and no
configuration is lost.
Command examples:
$devlink dev reload pci/0000:82:00.0 action driver_reinit
reload_actions_performed:
driver_reinit
$devlink dev reload pci/0000:82:00.0 action fw_activate
reload_actions_performed:
driver_reinit fw_activate
devlink dev reload pci/0000:82:00.1 action driver_reinit -jp
{
"reload": {
"reload_actions_performed": [ "driver_reinit" ]
}
}
devlink dev reload pci/0000:82:00.0 action fw_activate -jp
{
"reload": {
"reload_actions_performed": [ "driver_reinit","fw_activate" ]
}
}
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Petr Machata says:
==================
The DCB tool will have commands that deal with buffer sizes and traffic
rates. TC is another tool that has a number of such commands, and functions
to support them: get_size(), get_rate/64(), s/print_size() and
s/print_rate(). In this patchset, these functions are moved from TC to lib/
for possible reuse and modernized.
s/print_rate() has a hidden parameter of a global variable use_iec, which
made the conversion non-trivial. The parameter was made explicit,
print_rate() converted to a mostly json_print-like function, and
sprint_rate() retired in favor of the new print_rate. Patches #1 and #2
deal with this.
The intention was to treat s/print_size() similarly, but unfortunately two
use cases of sprint_size() cannot be converted to a json_print-like
print_size(), and the function sprint_size() had to remain as a discouraged
backdoor to print_size(). This is done in patch #3.
Patch #4 then improves the code of sprint_size() a little bit.
Patch #5 fixes a buglet in formatting small rates in IEC mode.
Patches #6 and #7 handle a routine movement of, respectively,
get_rate/64() and get_size() from tc to lib.
This patchset does not actually add any new uses of these functions. A
follow-up patchset will add subtools for management of DCB buffer and DCB
maxrate objects that will make use of them.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
The function get_size() serves for parsing of sizes using a handly notation
that supports units and their prefixes, such as 10Kbit. This will be useful
for the DCB buffer size parsing. Move the function from TC to the general
library, so that it can be reused.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The functions get_rate() and get_rate64() are useful for parsing rate-like
values. The DCB tool will find these useful in the maxrate subtool.
Move them over to lib so that they can be easily reused.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
ISO/IEC units are distinguished from the decadic ones by using a prefixes
like "Ki", "Mi" instead of "K" and "M". The current code inserts the letter
"i" after the decadic unit when in IEC mode. However it does so even when
the prefix is an empty string, formatting 1Kbit in IEC mode as "1000ibit".
Fix by omitting the letter if there is no prefix.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Ideally this and the rate printing would both be converted to a common
helper, but unfortunately the two format differently and this would break
tests and scripts out there. So just make the code look less like a wad of
hay.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
When displaying sizes of various sorts, tc commonly uses the function
sprint_size() to format the size into a buffer as a human-readable string.
This string is then displayed either using print_string(), or in some code
even fprintf(). As a result, a typical sequence of code when formatting a
size is something like the following:
SPRINT_BUF(b);
print_uint(PRINT_JSON, "foo", NULL, foo);
print_string(PRINT_FP, NULL, "foo %s ", sprint_size(foo, b));
For a concept as broadly useful as size, it would be better to have a
dedicated function in json_print.
To that end, move sprint_size() from tc_util to json_print. Add helpers
print_size() and print_color_size() that wrap arount sprint_size() and
provide the JSON dispatch as appropriate.
Since print_size() should be the preferred interface, convert vast majority
of uses of sprint_size() to print_size(). Two notable exceptions are:
- q_tbf, which does not show the size as such, but uses the string
"$human_readable_size/$cell_size" even in JSON. There is simply no way to
have print_size() emit the same text, because print_size() in JSON mode
should of course just use the raw number, without human-readable frills.
- q_cake, which relies on the existence of sprint_size() in its macro-based
formatting helpers. There might be ways to convert this particular case,
but given q_tbf simply cannot be converted, leave it as is.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The functions print_rate() and sprint_rate() are useful for formatting
rate-like values. The DCB tool would find these useful in the maxrate
subtool. However, the current interface to these functions uses a global
variable use_iec as a flag indicating whether 1024- or 1000-based powers
should be used when formatting the rate value. For general use, a global
variable is not a great way of passing arguments to a function. Besides, it
is unlike most other printing functions in that it deals in buffers and
ignores JSON.
Therefore make the interface to print_rate() explicit by converting use_iec
to an ordinary parameter. Since the interface changes anyway, convert it to
follow the pattern of other json_print functions (except for the
now-explicit use_iec parameter). Move to json_print.c.
Add a wrapper to tc, so that all the call sites do not need to repeat the
use_iec global variable argument, and convert all call sites.
In q_cake.c, the conversion is not straightforward due to usage of a macro
that is shared across numerous data types. Simply hand-roll the
corresponding code, which seems better than making an extra helper for one
call site.
Drop sprint_rate() now that everybody just uses print_rate().
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The tools "ip" and "tc" use a flag "use_iec", which indicates whether, when
formatting rate values, the prefixes "K", "M", etc. should refer to powers
of 1024, or powers of 1000. The flag is currently kept as a global variable
in "ip" and "tc", but is nonetheless declared in util.h.
Instead, move the declaration to tool-specific headers ip/ip_common.h and
tc/tc_common.h.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
We introduce the "vrftable" attribute for supporting the SRv6 End.DT4 and
End.DT6 behaviors in iproute2.
The "vrftable" attribute indicates the routing table associated with
the VRF device used by SRv6 End.DT4/DT6 for routing IPv4/IPv6 packets.
The SRv6 End.DT4/DT6 is used to implement IPv4/IPv6 L3 VPNs based on Segment
Routing over IPv6 networks in multi-tenants environments.
It decapsulates the received packets and it performs the IPv4/IPv6 routing
lookup in the routing table of the tenant.
The SRv6 End.DT4/DT6 leverages a VRF device in order to force the routing
lookup into the associated routing table using the "vrftable" attribute.
Some examples:
$ ip -6 route add 2001:db8::1 encap seg6local action End.DT4 vrftable 100 dev eth0
$ ip -6 route add 2001:db8::2 encap seg6local action End.DT6 vrftable 200 dev eth0
Standard Output:
$ ip -6 route show 2001:db8::1
2001:db8::1 encap seg6local action End.DT4 vrftable 100 dev eth0 metric 1024 pref medium
JSON Output:
$ ip -6 -j -p route show 2001:db8::2
[ {
"dst": "2001:db8::2",
"encap": "seg6local",
"action": "End.DT6",
"vrftable": 200,
"dev": "eth0",
"metric": 1024,
"flags": [ ],
"pref": "medium"
} ]
v2:
- no changes made: resubmit after pulling out this patch from the kernel
patchset.
v1:
- mixing this patch with the kernel patchset confused patckwork.
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David Ahern <dsahern@gmail.com>
Update kernel headers to commit:
afae3cc2da10 ("net: atheros: simplify the return expression of atl2_phy_setup_autoneg_adv()")
Signed-off-by: David Ahern <dsahern@gmail.com>
New lib/mnl_utils.c fails to compile if libmnl is not installed:
mnl_utils.c:9:10: fatal error: libmnl/libmnl.h: No such file or directory
9 | #include <libmnl/libmnl.h>
Make it dependent on HAVE_MNL.
Fixes: 72858c7b77 ("lib: Extract from devlink/mnlg a helper, mnlu_socket_open()")
Signed-off-by: David Ahern <dsahern@gmail.com>
If multiple ip processes are ran at the same time to set up
separate network namespaces, and it is the first time so /run/netns
has to be set up first, and they end up doing it at the same time,
the processes might enter a recursive loop creating thousands of
mount points, which might crash the system depending on resources
available.
Try to take a flock on /run/netns before doing the mount() dance, to
ensure this cannot happen. But do not try too hard, and if it fails
continue after printing a warning, to avoid introducing regressions.
First reported on Debian: https://bugs.debian.org/949235
To reproduce (WARNING: run in a VM to avoid system lockups):
for i in {0..9}
do
strace -e trace=mount -e inject=mount:delay_exit=1000000 ip \
netns add "testnetns$i" 2>&1 | tee "$i.log" &
done
wait
The strace is to ensure the problem always reproduces, to add an
artificial synchronization point after the first mount().
Reported-by: Etienne Dechamps <etienne@edechamps.fr>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Implement support for action terse dump using new TCA_ACT_FLAG_TERSE_DUMP
value of TCA_ROOT_FLAGS tlv. Set the flag when user requested it with
following example CLI (-br for 'brief'):
$ tc -s -br actions ls action tunnel_key
total acts 2
action order 0: tunnel_key index 1
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
action order 1: tunnel_key index 2
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
In terse mode dump only outputs essential data needed to identify the
action (kind, index) and stats, if requested by the user.
Signed-off-by: Vlad Buslov <vlad@buslov.dev>
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Use TCA_ACT_FLAG_LARGE_DUMP_ON alias according to new preferred naming for
action flags.
Signed-off-by: Vlad Buslov <vlad@buslov.dev>
Signed-off-by: David Ahern <dsahern@gmail.com>
Do not hardcode /usr/lib/ip as a path and allow libraries path
configuration in run-time.
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
With gcc-10 it complains about array subscript error.
f_u32.c: In function ‘u32_parse_opt’:
f_u32.c:1113:24: warning: array subscript 0 is outside the bounds of an interior zero-length array ‘struct tc_u32_key[0]’ [-Wzero-length-bounds]
1113 | hash = sel2.sel.keys[0].val & sel2.sel.keys[0].mask;
| ~~~~~~~~~~~~~^~~
In file included from tc_util.h:11,
from f_u32.c:26:
../include/uapi/linux/pkt_cls.h:253:20: note: while referencing ‘keys’
253 | struct tc_u32_key keys[0];
|
This is because the keys are actually allocated in the second element
of the parent structure.
Simplest way to address the warning is to assign directly to the keys
in the containing structure.
This has always been in iproute2 (pre-git) so no Fixes.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The code here was doing strncpy() in a way that causes gcc 10
warning about possible string overflow. Just use strlcpy() which
will null terminate and bound the string as expected.
This has existed since start of git era so no Fixes tag.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Gcc-10 complains about referencing a zero size array.
This occurs because the array of keys is actually in the following
structure which is part of the overall selector.
The original code was safe, but better to just use the key
array directly.
Fixes: 2d9a8dc439 ("tc: p_ip6: Support pedit of IPv6 dsfield")
Cc: petrm@mellanox.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Gcc-10 complains about possible string length overflow.
This can't happen Ethernet address format is always limited to
18 characters or less. Just resize the temp buffer.
Fixes: 70dfb0b883 ("iplink: bridge: export bridge_id and designated_root")
Cc: nikolay@cumulusnetworks.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
GCC-10 complains about uninitialized variable.
devlink.c: In function ‘cmd_dev’:
devlink.c:2803:12: warning: ‘val_u32’ may be used uninitialized in this function [-Wmaybe-uninitialized]
2803 | val_u16 = val_u32;
| ~~~~~~~~^~~~~~~~~
devlink.c:2747:11: note: ‘val_u32’ was declared here
2747 | uint32_t val_u32;
| ^~~~~~~
This is a false positive because it can't figure out the control flow
when the parse returns error.
Fixes: 2557dca2b0 ("devlink: Add string to uint{8,16,32} conversion for generic parameters")
Cc: shalomt@mellanox.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Extend the 'bridge mdb' command for the following syntax:
bridge mdb add dev br0 port swp0 grp 01:02:03:04:05:06 permanent
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David Ahern <dsahern@gmail.com>