Commit Graph

5680 Commits

Author SHA1 Message Date
Stephen Hemminger 79026c1262 rdma: update uapi headers
Update the RDMA uapi headers from 5.16.0-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-18 10:00:19 -08:00
Stephen Hemminger fa58de9b0c vdpa: align uapi headers
Update vdpa headers based on 5.16.0-rc1 and remove redundant
copy.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-18 09:56:57 -08:00
jiangheng be31c26484 lnstat: fix buffer overflow in header output
Running lnstat will cause core dump from reading past end of array.

Segmentation fault (core dumped)

The maximum value of th.num_lines is HDR_LINES (10), so h must not be equal to th.num_lines; otherwise the access to array th.hdr may be out of bounds.

Signed-off-by: jiangheng <jiangheng12@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-17 13:41:10 -08:00
Maxim Petrov 0e94972590 tc/m_vlan: fix print_vlan() conditional on TCA_VLAN_ACT_PUSH_ETH
Fix the misplaced bracket in the if clause, which made the condition evaluate incorrectly.

Fixes: d61167dd88 ("m_vlan: add pop_eth and push_eth actions")
Signed-off-by: Maxim Petrov <mmrmaximuzz@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-17 11:13:12 -08:00
Davide Caratti 9bd5ab0f09 mptcp: fix JSON output when dumping endpoints by id
iproute ignores '-j' command line argument when dumping endpoints by id:

 [dcaratti@dcaratti iproute2]$ ./ip/ip -j mptcp endpoint show
 [{"address":"1.2.3.4","id":42,"signal":true,"backup":true}]
 [dcaratti@dcaratti iproute2]$ ./ip/ip -j mptcp endpoint show id 42
 1.2.3.4 id 42 signal backup

Fix mptcp_addr_show() to use the proper JSON helpers.

Fixes: 7e0767cd86 ("add support for mptcp netlink interface")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-11 10:07:26 -08:00
Anssi Hannula a787d9ae10 man: tc-u32: Fix page to match new firstfrag behavior
Commit 690b11f4a6 ("tc: u32: Fix firstfrag filter.") applied in 2012
changed the "ip firstfrag" selector to not match non-fragmented packets
anymore.

However, the documentation added in f15a23966f ("tc: add a man page
for u32 filter") in 2015 includes an example that relies on the previous
behavior (non-fragmented packet counted as first fragment).

Due to this, the example does not work correctly and does not actually
classify regular SSH packets.

Modify the example to use a raw u16 selector on the fragment offset to
make it work, and also make the firstfrag description more clear about
the current behavior.

Fixes: f15a23966f ("tc: add a man page for u32 filter")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-09 10:46:17 -08:00
Luca Boccassi af96c7b5dd Fix some typos detected by Lintian in manpages
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-09 10:45:44 -08:00
Stephen Hemminger 35c81b18c4 uapi: update vdpa.h
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-09 10:40:40 -08:00
David Ahern 50b668bdbf Merge branch 'main' into next
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:45:31 -06:00
David Ahern 9c56d693f6 Merge branch 'can-tdc-plus-cleanups' into next
Vincent Mailhol says:

====================

The main purpose is to add commandline support for Transmitter Delay
Compensation (TDC) in iproute. Other issues found during the
development of this feature also get addressed.

This patch series contains five patches which respectively:

  1. Correct the bittiming ranges in the print_usage function and add
  the units to give more clarity: some parameters are in milliseconds,
  some in nanoseconds, some in time quanta, and the new TDC
  parameters introduced in this series are in clock periods.

  2. Do some code refactoring on function print_ctrlmode().

  3. Factorize the many print_*(PRINT_JSON, ...) and fprintf
  occurrences into a single print_*(PRINT_ANY, ...) call and fix the
  signedness while doing that.

  4. Report the value of the bitrate prescalers (brp and dbrp).

  5. Add command line support for the TDC in iproute; this goes together
  with the series below in the kernel:
  https://lore.kernel.org/linux-can/20210814091750.73931-1-mailhol.vincent@wanadoo.fr/T/#t

** Changelog **

From RFC v5 to v6:
  * Dropped the RFC tag because the related patch series on the kernel
    side were pulled into net-next.
  * Remove the changes in include/uapi/linux/can/netlink.h because
    these should be pulled separately.
  * Add another patch (the second of this series) to do some cleanup
    on function print_ctrlmode().
  * Minor fixes in the patch comments (grammar, rephrasing).

From RFC v4 to RFC v5:
  * Add the unit (bps, tq, ns or ms) in print_usage()
  * Rewrote void can_print_timing_min_max() to better factorize the
    code.
  * Rewrote the commit message of the two last patches (those related
    to TDC) to either add clarification or fix inaccuracies.

From v3 to RFC v4:
  * Reflect the changes made on the kernel side.

From RFC v2 to v3:
  * Dropped the RFC tag. Now that the kernel patch has reached the testing
    branch, I am finally ready.
  * Regression fix: configuring a link with only nominal bittiming
    returned -EOPNOTSUPP
  * Added two more patches to the series:
      - iplink_can: fix configuration ranges in print_usage()
      - iplink_can: print brp and dbrp bittiming variables
  * Other small fixes on formatting.

From RFC v1 to RFC v2:
  * Add an additional patch to the series to fix the issues reported
    by Stephen Hemminger
    Ref: https://lore.kernel.org/linux-can/20210506112007.1666738-1-mailhol.vincent@wanadoo.fr/T/#t

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:44:56 -06:00
Vincent Mailhol 0c263d7c36 iplink_can: add new CAN FD bittiming parameters: Transmitter Delay Compensation (TDC)
At high bit rates, the propagation delay from the TX pin to the RX pin
of the transceiver causes measurement errors: the sample point on the
RX pin might occur on the previous bit.

This issue is addressed in ISO 11898-1 section 11.3.3 "Transmitter
delay compensation" (TDC).

This patch brings command line support to nine TDC parameters which
were recently added to the kernel's CAN netlink interface in order to
implement TDC:
  - IFLA_CAN_TDC_TDCV_MIN: Transmitter Delay Compensation Value
    minimum value
  - IFLA_CAN_TDC_TDCV_MAX: Transmitter Delay Compensation Value
    maximum value
  - IFLA_CAN_TDC_TDCO_MIN: Transmitter Delay Compensation Offset
    minimum value
  - IFLA_CAN_TDC_TDCO_MAX: Transmitter Delay Compensation Offset
    maximum value
  - IFLA_CAN_TDC_TDCF_MIN: Transmitter Delay Compensation Filter
    window minimum value
  - IFLA_CAN_TDC_TDCF_MAX: Transmitter Delay Compensation Filter
    window maximum value
  - IFLA_CAN_TDC_TDCV: Transmitter Delay Compensation Value
  - IFLA_CAN_TDC_TDCO: Transmitter Delay Compensation Offset
  - IFLA_CAN_TDC_TDCF: Transmitter Delay Compensation Filter window

All those new parameters are nested together into the attribute
IFLA_CAN_TDC.

The TDC parameters extend the FD parameters. As such, the TDC
parameters must be specified together with the "fd on" flag.

When the "fd on" flag is provided, a tdc-mode parameter specifies
how TDC operates. Valid options for tdc-mode are:

  * auto: the transmitter dynamically measures TDCV for each of the
    transmitted frames. As such, TDCV cannot be manually provided. In
    this mode, the user must specify TDCO and may also specify TDCF if
    supported.

  * manual: use a static TDCV provided by the user. In this mode, the
    user must specify both TDCV and TDCO and may also specify TDCF if
    supported.

  * off: TDC is explicitly disabled.

  * tdc-mode parameter omitted (default mode): the kernel decides
    whether TDC should be enabled or not and if so, it calculates the
    TDC values. TDC parameters are an expert option and the average
    user is not expected to provide those, thus the presence of this
    "default mode".
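
The rules above can be read as a small validation table. The following Python sketch is only an illustration of those rules, not the iproute2 C code; the function name check_tdc_args and its error strings are hypothetical. TDCF is left unchecked because it is optional whenever the controller supports it.

```python
from typing import Optional


def check_tdc_args(mode: Optional[str],
                   tdcv: Optional[int] = None,
                   tdco: Optional[int] = None) -> None:
    """Raise ValueError when the arguments contradict the tdc-mode rules."""
    if mode == "auto":
        # TDCV is measured by the transmitter for each frame.
        if tdcv is not None:
            raise ValueError("auto: TDCV cannot be provided manually")
        if tdco is None:
            raise ValueError("auto: TDCO is required")
    elif mode == "manual":
        # A static TDCV is supplied by the user, together with TDCO.
        if tdcv is None or tdco is None:
            raise ValueError("manual: both TDCV and TDCO are required")
    elif mode in ("off", None):
        # off: TDC explicitly disabled; None: tdc-mode omitted, kernel decides.
        pass
    else:
        raise ValueError(f"unknown tdc-mode: {mode}")


check_tdc_args("auto", tdco=7)            # accepted
check_tdc_args("manual", tdcv=3, tdco=7)  # accepted
```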

If the fd flag is omitted, all the FD values (including TDC values)
remain unchanged.

If "fd off" flag is specified, all FD values (including TDC values)
are zeroed.

TDCV is always reported in manual mode. In auto mode, TDCV is reported
only if the value is available. In particular, TDCV might not be
available if the controller has no feature to report it or if the
value is not yet available (i.e. no data sent yet and measurement did
not occur).

TDCF is reported only if tdcf_max is not zero (i.e. if supported by
the controller).

For reference, here are a few samples of what the output looks like:

| $ ip link set can0 type can bitrate 1000000 dbitrate 8000000 fd on tdco 7 tdcf 8 tdc-mode auto

| $ ip --details link show can0
| 1:  can0: <NOARP,ECHO> mtu 72 qdisc noop state DOWN mode DEFAULT group default qlen 10
|     link/can  promiscuity 0 minmtu 0 maxmtu 0
|     can <FD,TDC-AUTO> state STOPPED (berr-counter tx 0 rx 0) restart-ms 0
| 	  bitrate 1000000 sample-point 0.750
| 	  tq 12 prop-seg 29 phase-seg1 30 phase-seg2 20 sjw 1 brp 1
| 	  ES582.1/ES584.1: tseg1 2..256 tseg2 2..128 sjw 1..128 brp 1..512 brp_inc 1
| 	  dbitrate 8000000 dsample-point 0.700
| 	  dtq 12 dprop-seg 3 dphase-seg1 3 dphase-seg2 3 dsjw 1 dbrp 1
| 	  tdco 7 tdcf 8
| 	  ES582.1/ES584.1: dtseg1 2..32 dtseg2 1..16 dsjw 1..8 dbrp 1..32 dbrp_inc 1
| 	  tdco 0..127 tdcf 0..127
| 	  clock 80000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

| $ ip --details --json --pretty link show can0
| [ {
|         "ifindex": 1,
|         "ifname": "can0",
|         "flags": [ "NOARP","ECHO" ],
|         "mtu": 72,
|         "qdisc": "noop",
|         "operstate": "DOWN",
|         "linkmode": "DEFAULT",
|         "group": "default",
|         "txqlen": 10,
|         "link_type": "can",
|         "promiscuity": 0,
|         "min_mtu": 0,
|         "max_mtu": 0,
|         "linkinfo": {
|             "info_kind": "can",
|             "info_data": {
|                 "ctrlmode": [ "FD","TDC-AUTO" ],
|                 "state": "STOPPED",
|                 "berr_counter": {
|                     "tx": 0,
|                     "rx": 0
|                 },
|                 "restart_ms": 0,
|                 "bittiming": {
|                     "bitrate": 1000000,
|                     "sample_point": "0.750",
|                     "tq": 12,
|                     "prop_seg": 29,
|                     "phase_seg1": 30,
|                     "phase_seg2": 20,
|                     "sjw": 1,
|                     "brp": 1
|                 },
|                 "bittiming_const": {
|                     "name": "ES582.1/ES584.1",
|                     "tseg1": {
|                         "min": 2,
|                         "max": 256
|                     },
|                     "tseg2": {
|                         "min": 2,
|                         "max": 128
|                     },
|                     "sjw": {
|                         "min": 1,
|                         "max": 128
|                     },
|                     "brp": {
|                         "min": 1,
|                         "max": 512
|                     },
|                     "brp_inc": 1
|                 },
|                 "data_bittiming": {
|                     "bitrate": 8000000,
|                     "sample_point": "0.700",
|                     "tq": 12,
|                     "prop_seg": 3,
|                     "phase_seg1": 3,
|                     "phase_seg2": 3,
|                     "sjw": 1,
|                     "brp": 1,
|                     "tdc": {
|                         "tdco": 7,
|                         "tdcf": 8
|                     }
|                 },
|                 "data_bittiming_const": {
|                     "name": "ES582.1/ES584.1",
|                     "tseg1": {
|                         "min": 2,
|                         "max": 32
|                     },
|                     "tseg2": {
|                         "min": 1,
|                         "max": 16
|                     },
|                     "sjw": {
|                         "min": 1,
|                         "max": 8
|                     },
|                     "brp": {
|                         "min": 1,
|                         "max": 32
|                     },
|                     "brp_inc": 1,
|                     "tdc": {
|                         "tdco": {
|                             "min": 0,
|                             "max": 127
|                         },
|                         "tdcf": {
|                             "min": 0,
|                             "max": 127
|                         }
|                     }
|                 },
|                 "clock": 80000000
|             }
|         },
|         "num_tx_queues": 1,
|         "num_rx_queues": 1,
|         "gso_max_size": 65536,
|         "gso_max_segs": 65535
|     } ]

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:43:10 -06:00
Vincent Mailhol 0f7bb8d842 iplink_can: print brp and dbrp bittiming variables
Report the value of the bit-rate prescaler (brp) for both the nominal
and the data bittiming.

Currently, only the constant brp values (brp_{min,max,inc}) are being
reported. Also, brp is the only member of struct can_bittiming not
being reported.

Notably, brp could be calculated by hand from the other bittiming
parameters with the formula below:

        brp = clock * tq / 1000000000

with clock in hertz and tq in nanoseconds (hence the factor of one
billion to convert tq back to seconds).

But because the above formula is not trivial to remember and is
subject to rounding errors, it makes sense to directly output
{d,}brp.
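
As a worked illustration of the rounding problem, the Python sketch below plugs in the values from the sample output earlier in this log (clock 80000000, tq 12, reported brp 1). With brp 1 and an 80 MHz clock the time quantum works out to 12.5 ns, so the integer 12 presumably already carries a truncation error. The helper name brp_exact is hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000


def brp_exact(clock_hz: int, tq_ns: int) -> float:
    """brp = clock * tq / 1e9, with clock in Hz and tq in ns."""
    return clock_hz * tq_ns / NSEC_PER_SEC


clock_hz, tq_ns = 80_000_000, 12

print(brp_exact(clock_hz, tq_ns))         # 0.96
print(clock_hz * tq_ns // NSEC_PER_SEC)   # 0 (truncating integer division)
print(round(brp_exact(clock_hz, tq_ns)))  # 1 (matches the reported brp)
```

Depending on whether the division truncates or rounds, the hand calculation yields 0 or 1, which is why printing brp directly is less error-prone.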

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:42:54 -06:00
Vincent Mailhol 67f3c7a5cc iplink_can: use PRINT_ANY to factorize code and fix signedness
The current implementation relies heavily on "if (is_json_context())"
switches to decide the context, then calls print_*(PRINT_JSON, ...)
in JSON context and fprintf(...) otherwise.

Furthermore, the current implementation uses either print_int() or the
conversion specifier %d to print unsigned integers.

This patch factorizes each pair of print_*(PRINT_JSON, ...) and
fprintf() into a single print_*(PRINT_ANY, ...) call. While doing this
replacement, it uses the proper unsigned function print_uint() as well as
the conversion specifier %u when the parameter is an unsigned integer.
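
To see why the signedness matters, the Python sketch below reinterprets a 32-bit unsigned value as a signed one, which is what a %d conversion typically does on platforms with a 32-bit int. It is an illustration only; the helper name as_signed32 and the sample value are assumptions, not taken from the patch.

```python
import struct


def as_signed32(value: int) -> int:
    """Reinterpret the bits of a 32-bit unsigned integer as a signed one."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


value = 4_000_000_000  # fits in an unsigned 32-bit integer

print("%d-style:", as_signed32(value))  # %d-style: -294967296
print("%u-style:", value)               # %u-style: 4000000000
```

Small values print identically either way, so the difference only shows up once the top bit is set.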

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:42:50 -06:00
Vincent Mailhol fd5e958c49 iplink_can: code refactoring of print_ctrlmode()
This patch only does cleanup and does not introduce any functional
changes.

We do some code refactoring of print_ctrlmode() in preparation for the
upcoming patch:

  - remove the first argument of print_ctrlmode(). It is a pointer to
    FILE and is never used.

  - add a new function argument: enum output_type t in order to
    specify the output type (i.e. PRINT_{FP,JSON,ANY}).

  - add a new function argument: const char *key in order to specify
    the name of the json array (e.g. "ctrlmode").

  - replace the _PF() macro with the print_flag() function to increase
    readability.

  - directly return if none of the flags are set (previously, this
    check was done before calling the function).

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:42:44 -06:00
Vincent Mailhol 8316df6e6d iplink_can: fix configuration ranges in print_usage() and add unit
The configuration ranges in print_usage() are taken from "Table 8 -
Time segments' minimum configuration ranges" in section 11.3.1.2
"Configuration of the bit time parameters" of ISO 11898-1.

The standard clearly specifies that "implementations may allow time
segments that exceed the minimum required configuration ranges
specified in Table 8".

Because no maximum ranges are given in the standard, all given ranges
{ a..b } are simply replaced with { NUMBER }.

The actual ranges are specific to each device and can be confirmed
by running:

$ ip --details link show can0
1: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 minmtu 0 maxmtu 0
    can state STOPPED restart-ms 0
	  ES582.1/ES584.1: tseg1 2..256 tseg2 2..128 sjw 1..128 brp 1..512 brp-inc 1
	  ES582.1/ES584.1: dtseg1 2..32 dtseg2 1..16 dsjw 1..8 dbrp 1..32 dbrp-inc 1
	  clock 80000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

Finally, the units (bps, tq, ns or ms) are given. The rationale for adding
the units is that the TDC parameters (that will be introduced in the
upcoming patches) are measured in a different unit than the other
bittiming parameters: clock period (a.k.a. minimum time quantum)
instead of time quantum. Adding the units disambiguates things.

For reference, before the change:
$ ip link set can0 type can help
Usage: ip link set DEVICE type can
	[ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
	[ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
 	  phase-seg2 PHASE-SEG2 [ sjw SJW ] ]

	[ dbitrate BITRATE [ dsample-point SAMPLE-POINT] ] |
	[ dtq TQ dprop-seg PROP_SEG dphase-seg1 PHASE-SEG1
 	  dphase-seg2 PHASE-SEG2 [ dsjw SJW ] ]

	[ loopback { on | off } ]
	[ listen-only { on | off } ]
	[ triple-sampling { on | off } ]
	[ one-shot { on | off } ]
	[ berr-reporting { on | off } ]
	[ fd { on | off } ]
	[ fd-non-iso { on | off } ]
	[ presume-ack { on | off } ]

	[ restart-ms TIME-MS ]
	[ restart ]

	[ termination { 0..65535 } ]

	Where: BITRATE	:= { 1..1000000 }
		  SAMPLE-POINT	:= { 0.000..0.999 }
		  TQ		:= { NUMBER }
		  PROP-SEG	:= { 1..8 }
		  PHASE-SEG1	:= { 1..8 }
		  PHASE-SEG2	:= { 1..8 }
		  SJW		:= { 1..4 }
		  RESTART-MS	:= { 0 | NUMBER }

...and after it:
$ ip link set can0 type can help
Usage: ip link set DEVICE type can
	[ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
	[ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
 	  phase-seg2 PHASE-SEG2 [ sjw SJW ] ]

	[ dbitrate BITRATE [ dsample-point SAMPLE-POINT] ] |
	[ dtq TQ dprop-seg PROP_SEG dphase-seg1 PHASE-SEG1
 	  dphase-seg2 PHASE-SEG2 [ dsjw SJW ] ]

	[ loopback { on | off } ]
	[ listen-only { on | off } ]
	[ triple-sampling { on | off } ]
	[ one-shot { on | off } ]
	[ berr-reporting { on | off } ]
	[ fd { on | off } ]
	[ fd-non-iso { on | off } ]
	[ presume-ack { on | off } ]
	[ cc-len8-dlc { on | off } ]

	[ restart-ms TIME-MS ]
	[ restart ]

	[ termination { 0..65535 } ]

	Where: BITRATE	:= { NUMBER in bps }
		  SAMPLE-POINT	:= { 0.000..0.999 }
		  TQ		:= { NUMBER in ns }
		  PROP-SEG	:= { NUMBER in tq }
		  PHASE-SEG1	:= { NUMBER in tq }
		  PHASE-SEG2	:= { NUMBER in tq }
		  SJW		:= { NUMBER in tq }
		  RESTART-MS	:= { 0 | NUMBER in ms }

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-04 09:42:23 -06:00
Taehee Yoo 6e15d27aae ip: add AMT support
Add basic support for Automatic Multicast Tunneling (AMT) network devices.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
2021-11-03 13:24:13 -06:00
David Ahern 9cae1de564 Import amt.h
Import amt.h uapi from last kernel sync point

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-03 13:23:38 -06:00
David Ahern 258e350ca9 Update kernel headers
Update kernel headers to commit:
    cc0356d6a02e ("Merge tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-11-03 13:22:15 -06:00
Moshe Shemesh 047e9ae516 devlink: Fix cmd_dev_param_set() to check configuration mode
Fix a bug: when the param set user command includes a configuration
mode that is not supported, the tool may fail to report an error if
the requested value is 0. In that case cmd_dev_param_set_cb() does not
find the requested configuration mode and returns ctx->value as
initialized (equal to 0). cmd_dev_param_set() may then find that the
requested value equals the current value and return success.

Fix the bug by adding a flag, cmode_found, which is set only if
cmd_dev_param_set_cb() finds the requested configuration mode.
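
The logic of the fix can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the devlink C code: the function param_set, the dictionary of per-mode values and the returned strings are all hypothetical, and only the role of the cmode_found flag follows the description above.

```python
def param_set(values_by_cmode: dict, cmode: str, new_value: int) -> str:
    """Model of the param set flow with the cmode_found guard."""
    current = 0          # stands in for ctx->value as initialized
    cmode_found = False

    # Stand-in for the callback that looks up the requested mode.
    if cmode in values_by_cmode:
        current = values_by_cmode[cmode]
        cmode_found = True

    # Without this guard, an unsupported mode with new_value == 0 would
    # compare equal to the initial 0 and be reported as a success.
    if not cmode_found:
        return "error: configuration mode not supported"
    if current == new_value:
        return "unchanged"
    return "set"


print(param_set({"runtime": 1}, "permanent", 0))
# error: configuration mode not supported
```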

Fixes: 13925ae9eb ("devlink: Add param command support")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-11-02 08:34:33 -07:00
Stephen Hemminger 7a8b7573a4 v5.15.0 2021-11-01 16:41:02 -07:00
Neta Ostrovsky ad3a118f88 rdma: Fix SRQ resource tracking information json
Fix the JSON output for the QPs that are associated with the SRQ:
each QPN range is now a separate element of the JSON array.

Sample output before the fix:
$ rdma res show srq lqpn 126-141 -j -p
[ {
        "ifindex":0,
	"ifname":"ibp8s0f0",
	"srqn":4,
	"type":"BASIC",
	"lqpn":["126-128,130-140"],
	"pdn":9,
	"pid":3581,
	"comm":"ibv_srq_pingpon"
    },{
	"ifindex":0,
	"ifname":"ibp8s0f0",
	"srqn":5,
	"type":"BASIC",
	"lqpn":["141"],
	"pdn":10,
	"pid":3584,
	"comm":"ibv_srq_pingpon"
    } ]

Sample output after the fix:
$ rdma res show srq lqpn 126-141 -j -p
[ {
        "ifindex":0,
	"ifname":"ibp8s0f0",
	"srqn":4,
	"type":"BASIC",
	"lqpn":["126-128","130-140"],
	"pdn":9,
	"pid":3581,
	"comm":"ibv_srq_pingpon"
    },{
	"ifindex":0,
	"ifname":"ibp8s0f0",
	"srqn":5,
	"type":"BASIC",
	"lqpn":["141"],
	"pdn":10,
	"pid":3584,
	"comm":"ibv_srq_pingpon"
    } ]
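
The difference between the two outputs comes down to emitting one array element per range instead of a single comma-joined string. A minimal Python sketch of that idea follows; it does not reflect how the rdma tool builds its output, and the helper name lqpn_json is hypothetical.

```python
import json


def lqpn_json(ranges: str) -> str:
    """Emit each comma-separated QPN range as its own array element."""
    return json.dumps({"lqpn": ranges.split(",")}, separators=(",", ":"))


print(lqpn_json("126-128,130-140"))  # {"lqpn":["126-128","130-140"]}
print(lqpn_json("141"))              # {"lqpn":["141"]}
```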

Fixes: 9b272e138d ("rdma: Add SRQ resource tracking information")
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-29 15:04:45 -07:00
Antoine Tenart 7a235a101b man: devlink-port: fix pfnum for devlink port add
When configuring a devlink PCI port, the pfnumber can be specified
using 'pfnum' and not 'pcipf' as stated in the man page. Fix this.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-29 15:03:44 -07:00
David Ahern e2947f6fd8 Merge branch 'managed-neighbor' into next
Daniel Borkmann says:

====================

iproute2 patches to add support for managed neighbor entries as per recent
net-next commits:

  2ed08b5ead3c ("Merge branch 'Managed-Neighbor-Entries'")
  c47fedba94bc ("Merge branch 'minor-managed-neighbor-follow-ups'")

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-28 09:00:26 -06:00
Daniel Borkmann 9e009e78e7 ip, neigh: Add NTF_EXT_MANAGED support
Currently, ip neigh does not support the NTF_EXT_MANAGED flag. Add cmdline
support.

Usage example:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 managed extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a managed extern_learn REACHABLE
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-28 08:59:03 -06:00
Daniel Borkmann 040e52526c ip, neigh: Add missing NTF_USE support
Currently, ip neigh does not support the NTF_USE flag. Similar to other flags
such as extern_learn, add cmdline support. The flag dump support is explicitly
missing here, since the kernel does not propagate the flag back to user space.

Usage example:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-28 08:58:55 -06:00
Daniel Borkmann c76a3849ec ip, neigh: Fix up spacing in netlink dump
Fix up spacing to consistently add a single ' ' after an attribute has
been printed. Currently, it is a mix of before and after, which
can lead to double spaces being printed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-28 08:58:50 -06:00
Nicolas Dichtel 76b30805f9 xfrm: enable to manage default policies
Two new commands to manage default policies:
 - ip xfrm policy setdefault
 - ip xfrm policy getdefault

And the corresponding part in 'ip xfrm monitor'.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-28 08:58:28 -06:00
David Ahern 2be7d99960 Merge branch 'rdma-optional-stats' into next
Mark Zhang says:

====================

This is the supplementary part of kernel series [1]. It provides an
extension to the rdma statistics tool that allows optional counters
to be set or listed dynamically, using netlink.

Thanks

[1] https://www.spinics.net/lists/linux-rdma/msg106283.html

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-16 12:52:02 -06:00
Stephen Hemminger 229eaba507 uapi: pickup fix for xfrm ABI breakage
See kernel
Commit 844f7eaaed9 ("include/uapi/linux/xfrm.h: Fix XFRM_MSG_MAPPING ABI breakage")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-15 17:40:30 -07:00
Nicolas Dichtel 95cd2a6204 iplink: enable to specify index when changing netns
When an interface is moved to another netns, it's possible to specify a
new ifindex. Let's add this support.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eeb85a14ee34
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 18:05:09 -06:00
David Ahern a936a73fc2 Merge branch 'config-libdir' into next
Andrea Claudi says:

====================

This series adds support for the libdir parameter in iproute2 configure
script. The idea is to make use of the fact that packaging systems may
assume that 'configure' comes from autotools allowing a syntax similar
to the autotools one, and using it to tell iproute2 where the distro
expects to find its lib files.

Patches 1-2 fix a parsing issue on current configure options, that may
trigger an endless loop when no value is provided with some options;

Patch 3 fixes a parsing issue bailing out when more than one value is
provided for a single option;

Patch 4 simplifies options parsing, moving semantic checks out of the
while loop processing options;

Patch 5 introduces support for the --opt=value style on current options,
for uniformity;

Patch 6 adds the --prefix option, that may be used by some packaging
systems when calling the configure script;

Patch 7 finally adds the --libdir option, and also drops the static
LIBDIR var from the Makefile.

Changelog:
----------
v4 -> v5
  - bail out when multiple values are provided with a single option
  - simplify option parsing and reduce code duplication, as suggested
    by Phil Sutter
  - remove a nasty eval on libdir option processing

v3 -> v4
  - fix parsing issue on '--include_dir' and '--libbpf_dir'
  - split '--opt value' and '--opt=value' use cases, avoid code
    duplication moving semantic checks on value to dedicated functions

v2 -> v3
  - fix parsing error on prefix and libdir options.

v1 -> v2
  - consolidate '--opt value' and '--opt=value' use cases, as suggested
    by David Ahern.
  - added patch 2 to manage the --prefix option, used by the Debian
    packaging system, as reported by Luca Boccassi, and use it when
    setting lib directory.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:59:33 -06:00
Andrea Claudi cee0cf84bd configure: add the --libdir option
This commit allows users/packagers to choose a lib directory to store
iproute2 lib files.

At the moment iproute2 ships lib files in /usr/lib and offers no way to
modify this setting. However, according to the FHS, distros may choose
"one or more variants of the /lib directory on systems which support
more than one binary format" (e.g. /usr/lib64 on Fedora).

As Luca states in commit a3272b9372 ("configure: restore backward
compatibility"), packaging systems may assume that 'configure' is from
autotools, and try to pass it some parameters.

By allowing the '--libdir=/path/to/libdir' syntax, we can use this to our
advantage, and let the lib directory be chosen by the distro
packaging system.

Note that LIBDIR uses "\${prefix}/lib" as default value because autoconf
allows this to be expanded to the --prefix value at configure runtime.
"\${prefix}" is replaced with the PREFIX value in check_lib_dir().

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:57:20 -06:00
Andrea Claudi 0ee1950b5c configure: add the --prefix option
This commit adds the '--prefix' option to the iproute2 configure script.

This mimics the '--prefix' option that autotools configure provides, and
will be used later to allow users or packagers to set the lib directory.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:57:17 -06:00
Andrea Claudi 4b8bca5f9e configure: support --param=value style
This commit makes it possible to specify values for configure params
using the common autotools configure syntax '--param=value'.
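
For illustration, here is how accepting both '--param value' and '--param=value' might look. The sketch is written in Python rather than the shell used by the real configure script, and the function parse_options and the example paths are hypothetical.

```python
def parse_options(argv):
    """Return {option: value}; value is None when none was given."""
    opts = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if not arg.startswith("--"):
            raise ValueError(f"unexpected argument: {arg}")
        if "=" in arg:
            # '--param=value': name and value share one argument.
            name, value = arg[2:].split("=", 1)
            i += 1
        elif i + 1 < len(argv) and not argv[i + 1].startswith("--"):
            # '--param value': the value is the next argument.
            name, value = arg[2:], argv[i + 1]
            i += 2
        else:
            # Option given without a value.
            name, value = arg[2:], None
            i += 1
        opts[name] = value
    return opts


print(parse_options(["--include_dir=/usr/include", "--libbpf_dir", "/opt/libbpf"]))
# {'include_dir': '/usr/include', 'libbpf_dir': '/opt/libbpf'}
```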

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:57:05 -06:00
Andrea Claudi 99245d1741 configure: simplify options parsing
This commit simplifies options parsing moving all the code not related to
parsing out of the case statement.

- The conditional shift after the assignments is moved right after the
  case, reducing code duplication.
- The semantic check on the LIBBPF_FORCE value is moved after the loop
  like we already did for INCLUDE and LIBBPF_DIR.
- Finally, the loop condition is changed to check remaining arguments, thus
  making it possible to get rid of the null string case break.

As a bonus, now the help message states that on or off should follow
--libbpf_force

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:57:02 -06:00
Andrea Claudi c330d09794 configure: fix parsing issue with more than one value per option
With commit a9c3d70d90 ("configure: add options ability") users are no
longer able to provide wrong command lines like:

$ ./configure --include_dir foo bar

The script simply bails out when the user provides more than one value for a
single option. However, in doing so, it breaks backward compatibility with
some packaging systems, which expect unknown options to be ignored.

Commit a3272b9372 ("configure: restore backward compatibility") fixes this
issue, but makes it possible again for users to provide wrong command lines
such as the one above.

Fix the issue by simply ignoring autoconf-like options such as
'--opt=value'.

Fixes: a3272b9372 ("configure: restore backward compatibility")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:56:57 -06:00
Andrea Claudi 48c379bc2a configure: fix parsing issue on libbpf_dir option
configure is stuck in an endless loop if '--libbpf_dir' option is used
without a value:

$ ./configure --libbpf_dir
./configure: line 515: shift: 2: shift count out of range
./configure: line 515: shift: 2: shift count out of range
[...]

Fix it by splitting 'shift 2' into two consecutive shifts, and making the
second one conditional on the number of remaining arguments.

A check is also added after the while loop to verify that the libbpf dir
exists; also, as LIBBPF_DIR does not have a default value, configure bails
out if the user does not specify a value after --libbpf_dir, thus avoiding
an erroneous configuration.
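
The approach described above can be sketched as follows. This is an
illustration under assumptions, not the code from configure: the function
name parse_libbpf_dir, the 'seen' flag and the error messages are
hypothetical. As the error output above suggests, 'shift 2' with a single
remaining argument fails without consuming anything, so the loop never
makes progress; shifting once, then a second time only when an argument is
left, avoids that.

```shell
#!/bin/sh
# Hypothetical sketch: parse_libbpf_dir is not part of iproute2's configure.
parse_libbpf_dir()
{
    LIBBPF_DIR=""
    seen=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
        --libbpf_dir)
            seen=1
            shift                       # first shift: drop the option name
            LIBBPF_DIR="${1:-}"
            if [ "$#" -gt 0 ]; then
                shift                   # second shift: only if a value exists
            fi
            ;;
        *)
            shift
            ;;
        esac
    done

    # Checks performed after the loop.
    if [ "$seen" -eq 0 ]; then
        return 0                        # option not used: nothing to verify
    fi
    if [ -z "$LIBBPF_DIR" ]; then
        echo "missing value for --libbpf_dir" >&2
        return 1
    fi
    if [ ! -d "$LIBBPF_DIR" ]; then
        echo "$LIBBPF_DIR: no such directory" >&2
        return 1
    fi
    echo "$LIBBPF_DIR"
}
```

Calling 'parse_libbpf_dir --libbpf_dir' with no value now terminates with an
error instead of looping.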

Fixes: 7ae2585b86 ("configure: convert LIBBPF environment variables to command-line options")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:56:53 -06:00
Andrea Claudi 1d819dcc74 configure: fix parsing issue on include_dir option
configure is stuck in an endless loop if '--include_dir' option is used
without a value:

$ ./configure --include_dir
./configure: line 506: shift: 2: shift count out of range
./configure: line 506: shift: 2: shift count out of range
[...]

Fix it by splitting 'shift 2' into two consecutive shifts, and making the
second one conditional on the number of remaining arguments.

A check is also added after the while loop to verify that the include dir
exists; this avoids producing an erroneous configuration.

Fixes: a9c3d70d90 ("configure: add options ability")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:56:48 -06:00
Neta Ostrovsky 19ba785f16 rdma: Add optional-counters set/unset support
This patch extends the rdma statistics tool to allow optional counters
to be set/unset dynamically, using new netlink commands.
Note that the optional counter statistic implementation is
driver-specific and may impact performance.

Examples:
To enable a set of optional counters on link rocep8s0f0/1:
    $ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts
To disable all optional counters on link rocep8s0f0/1:
    $ sudo rdma statistic unset link rocep8s0f0/1 optional-counters

Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:52:57 -06:00
Neta Ostrovsky 7d5cb70e94 rdma: Add stat "mode" support
This patch introduces the "mode" command, which presents the enabled or
supported (when the "supported" argument is available) optional
counters.

An optional counter is a vendor-specific counter that may be
dynamically enabled/disabled. This enhancement of hwcounters allows
exposing counters that are, for example, mutually exclusive and cannot
be enabled at the same time, counters that might degrade performance,
optional debug counters, etc.

Examples:
To present currently enabled optional counters on link rocep8s0f0/1:
    $ rdma statistic mode link rocep8s0f0/1
    link rocep8s0f0/1 optional-counters cc_rx_ce_pkts

To present supported optional counters on link rocep8s0f0/1:
    $ rdma statistic mode supported link rocep8s0f0/1
    link rocep8s0f0/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts

Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:52:53 -06:00
Neta Ostrovsky d480cb71f5 rdma: Update uapi headers
Update rdma_netlink.h file up to kernel commit 7301d0a9834c
("RDMA/nldev: Add support to get status of all counters")

Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:52:47 -06:00
David Ahern e4ca6a4965 Update kernel headers
Update kernel headers to commit:
    295711fa8fec ("Merge branch 'dpaa2-irq-coalescing'")

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:49:19 -06:00
Stephen Hemminger a31e7b7967 mptcp: cleanup include section.
David reported that ipmptcp hard-breaks the build when updating the
relevant kernel headers.

We should be more careful in the header section, explicitly
including all the required dependencies and respecting the usual order
between system and local headers.

Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-15 17:48:36 -06:00
Paul Chaignon a500c5ac87 lib/bpf: fix map-in-map creation without prepopulation
When creating map-in-maps, the outer map can be prepopulated using the
inner_idx field of inner maps. That field defines the index of the inner
map in the outer map. It is ignored if set to -1.

Commit 6d61a2b557 ("lib: add libbpf support") however started using
that field to identify inner maps. While iterating over all maps looking
for inner maps, maps with inner_idx set to -1 are erroneously skipped.
As a result, trying to create a map-in-map with prepopulation disabled
fails because the inner_id of the outer map is not correctly set.

This bug can be observed with strace -ebpf (notice the zero inner_map_fd
for the outer map creation):

    bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=130996, max_entries=1, map_flags=0, inner_map_fd=0, map_name="maglev_inner", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0}, 128) = 32
    bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH_OF_MAPS, key_size=2, value_size=4, max_entries=65536, map_flags=BPF_F_NO_PREALLOC, inner_map_fd=0, map_name="maglev_outer", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0}, 128) = -1 EINVAL (Invalid argument)

Fixes: 6d61a2b557 ("lib: add libbpf support")
Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-14 14:37:51 -07:00
Antoine Tenart 7c032cac10 man: devlink-port: remove extra .br
.br macros were added between options of the same command. That is not
needed and makes the output span 3 lines for no particular reason.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-11 19:27:12 -07:00
Antoine Tenart 04ee8e6f06 man: devlink-port: fix style
Values should be .I, square brackets should be used for optional values,
curly brackets for lists. Follow this in the devlink-port man page.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-11 19:27:12 -07:00
Antoine Tenart 14802d84d3 man: devlink-port: fix the devlink port add synopsis
When configuring a devlink PCI SF port, the sfnumber can be specified
using 'sfnum' and not 'pcisf' as stated in the man page. Fix this.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-11 19:27:12 -07:00
David Ahern 8cd517a805 Merge branch 'main' into next
Conflicts:
	ip/ipneigh.c

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-09 17:47:47 -06:00
David Ahern 763fd793fe Merge branch 'ioam-encap-modes' into next
Justin Iurman says:

====================

Following the series applied to net-next (see [1]), here are the corresponding
changes to iproute2.

In the current implementation, IOAM can only be inserted directly (i.e., only
inside packets generated locally) by default, to be compliant with RFC8200.

This patch adds support for in-transit packets and provides the ip6ip6
encapsulation of IOAM (RFC8200 compliant). Therefore, three ioam6 encap modes
are defined:

 - inline: directly inserts IOAM inside packets (by default).

 - encap:  ip6ip6 encapsulation of IOAM inside packets.

 - auto:   either inline mode for packets generated locally or encap mode for
           in-transit packets.

With the current iproute2 implementation, it is configured this way:

$ ip -6 r [...] encap ioam6 trace prealloc [...]

The old syntax does not change (for backwards compatibility) and implicitly uses
the inline mode. With the new syntax, an encap mode can be specified:

(inline mode)
$ ip -6 r [...] encap ioam6 mode inline trace prealloc [...]

(encap mode)
$ ip -6 r [...] encap ioam6 mode encap tundst fc00::2 trace prealloc [...]

(auto mode)
$ ip -6 r [...] encap ioam6 mode auto tundst fc00::2 trace prealloc [...]

A tunnel destination address must be configured when using the encap mode or the
auto mode.
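
The rule above can be sketched as a small shell helper that builds the
'encap ioam6' fragment of the command line. The helper name
ioam6_encap_args and its error messages are hypothetical and exist only for
this illustration; the real validation is done by ip itself.

```shell
#!/bin/sh
# Hypothetical helper: prints the "encap ioam6 mode ..." fragment, or fails
# when a tunnel destination is required but missing.
ioam6_encap_args()
{
    mode="${1:-}"
    tundst="${2:-}"
    case "$mode" in
    inline)
        echo "encap ioam6 mode inline"
        ;;
    encap|auto)
        if [ -z "$tundst" ]; then
            echo "mode $mode requires a tunnel destination" >&2
            return 1
        fi
        echo "encap ioam6 mode $mode tundst $tundst"
        ;;
    *)
        echo "unknown mode: $mode" >&2
        return 1
        ;;
    esac
}
```

For example, 'ioam6_encap_args encap fc00::2' prints the fragment used in
the encap mode command above, while 'ioam6_encap_args auto' fails.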

  [1] https://lore.kernel.org/netdev/163335001045.30570.12527451523558030753.git-patchwork-notify@kernel.org/T/#m3b428d4142ee3a414ec803466c211dfdec6e0c09

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-09 17:37:12 -06:00
Justin Iurman 41020eb0fd Update documentation
This patch updates the IOAM documentation (ip-route man page) to reflect the
three encap modes that were introduced.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
2021-10-09 17:35:54 -06:00