Recently we added SKBMOD_F_ECN option support to the kernel; support it in
the tc-skbmod(8) front end, and update its man page accordingly.
The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
headers are used to represent different ECN states [1]:
0b00: "Non ECN-Capable Transport", Non-ECT
0b10: "ECN Capable Transport", ECT(0)
0b01: "ECN Capable Transport", ECT(1)
0b11: "Congestion Encountered", CE
This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
which is useful for ECN-based rate limiting. For example:
$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod \
ecn
The updated tc-skbmod SYNOPSIS looks like the following:
tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...
Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
command. Trying to use more than one of them at a time is considered
undefined behavior; pipe multiple tc-skbmod commands together instead.
"set" and "swap" only affect Ethernet packets, while "ecn" only affects
IP packets.
Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Remove misinformation
about the swap action".
[1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch provides man8 documentation for IOAM inside ip, ip-ioam and ip-route.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
Make use of the already available brief flag and print the basic details of
the IPv4 or IPv6 neighbour cache in a tabular format for better readability
when the brief output is expected.
$ ip -br neigh
172.16.12.100 bridge0 b0:fc:36:2f:07:43
172.16.12.174 bridge0 8c:16:45:2f:bc:1c
172.16.12.250 bridge0 04:d9:f5:c1:0c:74
fe80::267b:9f70:745e:d54d bridge0 b0:fc:36:2f:07:43
fd16:a115:6a62:0:8744:efa1:9933:2c4c bridge0 8c:16:45:2f:bc:1c
fe80::6d9:f5ff:fec1:c74 bridge0 04:d9:f5:c1:0c:74
And add "ip neigh show" to the list of ip sub commands mentioned in the man
page that support the brief output in tabular format.
Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Fixes: 3a1ca9a5b ("bridge: update man page for new color and json changes")
Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Port function state can have either of the two values - active or
inactive. Update the documentation and help command for these two
values to tell user about it.
With the introduction of state, hw_addr and state are optional.
Hence mark them as optional in man page that also aligns with the help
command output.
Fixes: bdfb9f1bd6 ("devlink: Support set of port function state")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Currently man 8 tc-skbmod says that "...the swap action will occur after
any smac/dmac substitutions are executed, if they are present."
This is false. In fact, trying to "set" and "swap" in a single skbmod
command causes the "set" part to be completely ignored. As an example:
$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
matchall action skbmod \
set dmac AA:AA:AA:AA:AA:AA smac BB:BB:BB:BB:BB:BB \
swap mac
The above command simply does a "swap", without setting DMAC or SMAC to
AA's or BB's. The root cause of this is in the kernel, see
net/sched/act_skbmod.c:tcf_skbmod_init():
parm = nla_data(tb[TCA_SKBMOD_PARMS]);
index = parm->index;
if (parm->flags & SKBMOD_F_SWAPMAC)
lflags = SKBMOD_F_SWAPMAC;
^^^^^^^^^^^^^^^^^^^^^^^^^^
Doing a "=" instead of "|=" clears all other "set" flags when doing a
"swap". Discourage using "set" and "swap" in the same command by
documenting it as undefined behavior, and update the "SYNOPSIS" section
as well as tc -help text accordingly.
If one really needs to e.g. "set" DMAC to all AA's then "swap" DMAC and
SMAC, one should do two separate commands and "pipe" them together.
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Implement a decrement operation for ttl and hoplimit.
Since this is just syntactic sugar, it goes that:
tc filter add ... action pedit ex munge ip ttl dec ...
tc filter add ... action pedit ex munge ip6 hoplimit dec ...
is just a more readable version of this:
tc filter add ... action pedit ex munge ip ttl add 0xff ...
tc filter add ... action pedit ex munge ip6 hoplimit add 0xff ...
This feature was suggested by some pseudo tc examples in Mellanox's
documentation[1], but wasn't present in neither their mlnx-iproute2
nor iproute2.
Tested with skip_sw on Mellanox ConnectX-6 Dx.
[1] https://docs.mellanox.com/pages/viewpage.action?pageId=47033989
v3:
- Use dedicated flags argument in parse_cmd() (David Ahern)
- Minor rewording of the man page
v2:
- Fix whitespace issue (Stephen Hemminger)
- Add to usage info in explain()
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The ip link property add/delete requires a device; but the
device argument was not show on the man page.
It is correct in the usage message.
Fixes: 3aa0e51be6 ("ip: add support for alternative name addition/deletion/list")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
We introduce the new "End.DT46" action for supporting the SRv6 End.DT46
Behavior in iproute2.
The SRv6 End.DT46 Behavior, defined in RFC 8986 [1] section 4.8, can be
used to implement L3 VPNs based on Segment Routing over IPv6 networks in
multi-tenants environments and it is capable of handling both IPv4 and
IPv6 tenant traffic at the same time.
The SRv6 End.DT46 Behavior decapsulates the received packets and it
performs the IPv4 or IPv6 routing lookup in the routing table of the
tenant.
As for the End.DT4 and for the End.DT6 in VRF mode, the SRv6 End.DT46
Behavior leverages a VRF device in order to force the routing lookup into
the associated routing table using the "vrftable" attribute.
To make the End.DT46 work properly, it must be guaranteed that the
routing table used for routing lookup operations is bound to one and
only one VRF during the tunnel creation. Such constraint has to be
enforced by enabling the VRF strict_mode sysctl parameter, i.e.:
$ sysctl -wq net.vrf.strict_mode=1
Note that the same approach is used for the End.DT4 Behavior and for the
End.DT6 Behavior in VRF mode.
An SRv6 End.DT46 Behavior instance can be created as follows:
$ ip -6 route add 2001:db8::1 encap seg6local action End.DT46 vrftable 100 dev vrf100
Standard Output:
$ ip -6 route show 2001:db8::1
2001:db8::1 encap seg6local action End.DT46 vrftable 100 dev vrf100 metric 1024 pref medium
JSON Output:
$ ip -6 -j -p route show 2001:db8::1
[ {
"dst": "2001:db8::1",
"encap": "seg6local",
"action": "End.DT46",
"vrftable": 100,
"dev": "vrf100",
"metric": 1024,
"flags": [ ],
"pref": "medium"
} ]
This patch updates the route.8 man page and the ip route help with the
information related to End.DT46.
Considering that the same information was missing for the SRv6 End.DT4 and
the End.DT6 Behaviors, we have also added it.
[1] https://www.rfc-editor.org/rfc/rfc8986.html#name-enddt46-decapsulation-and-s
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Signed-off-by: David Ahern <dsahern@kernel.org>
Implement user commands to manage devlink port func rate objects.
List all rate commands:
$ devlink port func rate help
or just
$ devlink port func rate
To list all OR particular rate object:
$ devlink port func rate show
pci/0000:03:00.0/some_group: type node
pci/0000:03:00.0/0: type leaf
pci/0000:03:00.0/1: type leaf
$ devlink prot func rate show pci/0000:03:00.0/1
pci/0000:03:00.0/0: type leaf
$ devlink prot func rate show pci/0000:03:00.0/some_group
pci/0000:03:00.0/some_group: type node
Rate object of type "leaf" created by it's driver where name is the name
of corresponding devlink port. Rate object of type "node" represents
rate group created by the user using commands:
$ devlink port func rate add pci/0000:03:00.0/some_group
or with defining tx rate limits
$ devlink port func rate add pci/0000:03:00.0/some_group \
tx_shara 10kbit tx_max 100mbit
NOTE: node name cannot be a decimal value because it conflicts with
devlink port indexes.
To delete node object:
$ devlink port func rate del pci/0000:03:00.0/some_group
Set rate limits of existing rate object:
$ devlink prot func rate set pci/0000:03:00.0/0 \
tx_share 5MBps tx_max 25GBps
$ devlink prot func rate set pci/0000:03:00.0/some_group \
tx_share 0
Both SET and ADD commands accept any units of rates defined in IEC
60027-2 standard.
NOTE: rate value 0 means that rate is unlimited. Such value is also
ommited in show command output.
NOTE: In SHOW command output rate values will be printed with suffixes
as well, but in JSON output they are always units of Bps.
Set or unset parent of existing rate object:
$ devlink prot func rate set pci/0000:03:00.0/0 parent some_group
$ devlink port func rate set pci/0000:03:00.0/0 noparent
NOTE: Setting parent to empty ("") name due to kernel logic means unset
parent and shouldn't be used to avoid unexpected parent unsets.
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
A user optionally provides the external controller number when user
wants to create devlink port for the external controller.
An example on eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev
$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
function:
hw_addr 00:00:00:00:00:00
$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 external true splittable false
function:
hw_addr 00:00:00:00:00:00 state inactive opstate detached
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
'-b' option allows to request BPF filter opcodes, however
currently the kernel returns only classic BPF filter, so
reflect this in man page.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add support for matching on ct_state flag related.
The related state indicates a packet is associated with an existing
connection.
Example:
$ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
ct_state -est-rel+trk \
action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
ct_state +rel+trk \
action mirred egress redirect dev ens1f0_1
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
We introduce the "count" optional attribute for supporting counters in SRv6
Behaviors as defined in [1], section 6. For each SRv6 Behavior instance,
counters defined in [1] are:
- the total number of packets that have been correctly processed;
- the total amount of traffic in bytes of all packets that have been
correctly processed;
In addition, we introduce a new counter that counts the number of packets
that have NOT been properly processed (i.e. errors) by an SRv6 Behavior
instance.
Each SRv6 Behavior instance can be configured, at the time of its creation,
to make use of counters specifing the "count" attribute as follows:
$ ip -6 route add 2001:db8::1 encap seg6local action End count dev eth0
per-behavior counters can be shown by adding "-s" to the iproute2 command
line, i.e.:
$ ip -s -6 route show 2001:db8::1
2001:db8::1 encap seg6local action End packets 0 bytes 0 errors 0 dev eth0
[1] https://www.rfc-editor.org/rfc/rfc8986.html#name-counters
v2:
- add help and route.8 man page updates
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Signed-off-by: David Ahern <dsahern@kernel.org>
Linux kernel commit b8392808eb3fc28e ("sch_cake: add RFC 8622 LE PHB
support to CAKE diffserv handling") added packets with LE diffserv to
the Bulk priority tin. Update the documentation to reflect this change.
Signed-off-by: Tyson Moore <tyson@tyson.me>
Signed-off-by: David Ahern <dsahern@kernel.org>
The default behavior for source MACVLAN is to duplicate packets to
appropriate type source devices, and then do the normal destination MACVLAN
flow. This patch adds an option to skip destination MACVLAN processing if
any matching source MACVLAN device has the option set.
This allows setting up a "catch all" device for source MACVLAN: create one
or more devices with type source nodst, and one device with e.g. type vepa,
and incoming traffic will be received on exactly one device.
Signed-off-by: Jethro Beekman <kernel@jbeekman.nl>
Signed-off-by: David Ahern <dsahern@kernel.org>
Sample output:
$ rdma res show srq
dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib]
dev ibp8s0f0 srqn 4 type BASIC lqpn 125-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 5 type BASIC lqpn 141-156 pdn 10 pid 3584 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 6 type BASIC lqpn 157-172 pdn 11 pid 3590 comm ibv_srq_pingpon
dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib]
dev ibp8s0f1 srqn 1 type BASIC lqpn 329-344 pdn 4 pid 3586 comm ibv_srq_pingpon
$ rdma res show srq lqpn 126-141
dev ibp8s0f0 srqn 4 type BASIC lqpn 126-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon
dev ibp8s0f0 srqn 5 type BASIC lqpn 141 pdn 10 pid 3584 comm ibv_srq_pingpon
$ rdma res show srq lqpn 127
dev ibp8s0f0 srqn 4 type BASIC lqpn 127 pdn 9 pid 3581 comm ibv_srq_pingpon
Reviewed-by: Ido Kalir <idok@nvidia.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Sample output:
$ rdma res show ctx
dev ibp8s0f0 ctxn 0 pid 980 comm ibv_rc_pingpong
dev ibp8s0f0 ctxn 1 pid 981 comm ibv_rc_pingpong
dev ibp8s0f0 ctxn 2 pid 992 comm ibv_rc_pingpong
dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong
dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong
$ rdma res show ctx dev ibp8s0f1
dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong
dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong
Reviewed-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Ido Kalir <idok@nvidia.com>
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add support for vlan activity monitoring, we display vlan notifications on
vlan add/del/options change. The man page and help are also updated
accordingly.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Use the new bridge vlan rtm dump helper to dump all of the available
vlan information when -details (-d) is used with vlan show. It is also
capable of dumping vlan stats if -statistics (-s) is added.
Currently this is the only interface capable of dumping per-vlan
options. The vlan dump format is compatible with current vlan show, it
uses the same helpers to dump vlan information. The new addition is one
line which will contain the per-vlan options (similar to ip -d link show
for ports). Currently only the vlan STP state is printed.
The call uses compressed vlan format by default.
Example:
$ bridge -s -d vlan show
port vlan-id
virbr1 1 PVID Egress Untagged
state forwarding
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add a new per-vlan option set command. It allows to manipulate vlan
options, those can be bridge-wide or per-port depending on what device
is specified. The first option that can be set is the vlan STP state,
it is identical to the bridge port STP state. The man page is also
updated accordingly.
Example:
$ bridge vlan set vid 10 dev br0 state learning
or a range:
$ bridge vlan set vid 10-20 dev swp1 state blocking
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This adds iproute2 support for mptcp event monitoring, e.g. creation,
establishment, address announcements from the peer, subflow establishment
and so on.
While the kernel-generated events are primarily aimed at mptcpd (e.g. for
subflow management), this is also useful for debugging.
This adds print support for the existing events.
Sample output of 'ip mptcp monitor':
[ CREATED] token=83f3a692 remid=0 locid=0 saddr4=10.0.1.2 daddr4=10.0.1.1 sport=58710 dport=10011
[ ESTABLISHED] token=83f3a692 remid=0 locid=0 saddr4=10.0.1.2 daddr4=10.0.1.1 sport=58710 dport=10011
[SF_ESTABLISHED] token=83f3a692 remid=0 locid=1 saddr4=10.0.2.2 daddr4=10.0.1.1 sport=40195 dport=10011 backup=0
[ CLOSED] token=83f3a692
Signed-off-by: Florian Westphal <fw@strlen.de>
Allow a policer action to enforce a rate-limit based on packets-per-second,
configurable using a packet-per-second rate and burst parameters.
e.g.
# $TC actions add action police pkts_rate 1000 pkts_burst 200 index 1
# $TC actions ls action police
total acts 1
action order 0: police 0x1 rate 0bit burst 0b mtu 4096Mb pkts_rate 1000 pkts_burst 200
ref 1 bind 0
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch adds support for setting and displaying the Traffic Flow
Confidentiality attribute for an XFRM state, which allows padding ESP
packets to a specified length.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add ability to dump multiple nexthop buckets and get a specific one.
Example:
# ip nexthop add id 10 group 1/2 type resilient buckets 8
# ip nexthop
id 1 via 192.0.2.2 dev dummy10 scope link
id 2 via 192.0.2.19 dev dummy20 scope link
id 10 group 1/2 type resilient buckets 8 idle_timer 120 unbalanced_timer 0 unbalanced_time 0
# ip nexthop bucket
id 10 index 0 idle_time 28.1 nhid 2
id 10 index 1 idle_time 28.1 nhid 2
id 10 index 2 idle_time 28.1 nhid 2
id 10 index 3 idle_time 28.1 nhid 2
id 10 index 4 idle_time 28.1 nhid 1
id 10 index 5 idle_time 28.1 nhid 1
id 10 index 6 idle_time 28.1 nhid 1
id 10 index 7 idle_time 28.1 nhid 1
# ip nexthop bucket show nhid 1
id 10 index 4 idle_time 53.59 nhid 1
id 10 index 5 idle_time 53.59 nhid 1
id 10 index 6 idle_time 53.59 nhid 1
id 10 index 7 idle_time 53.59 nhid 1
# ip nexthop bucket get id 10 index 5
id 10 index 5 idle_time 81 nhid 1
# ip -j -p nexthop bucket get id 10 index 5
[ {
"id": 10,
"bucket": {
"index": 5,
"idle_time": 104.89,
"nhid": 1
},
"flags": [ ]
} ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add ability to configure resilient nexthop groups and show their current
configuration. Example:
# ip nexthop add id 10 group 1/2 type resilient buckets 8
# ip nexthop show id 10
id 10 group 1/2 type resilient buckets 8 idle_timer 120 unbalanced_timer 0
# ip -j -p nexthop show id 10
[ {
"id": 10,
"group": [ {
"id": 1
},{
"id": 2
} ],
"type": "resilient",
"resilient_args": {
"buckets": 8,
"idle_timer": 120,
"unbalanced_timer": 0
},
"flags": [ ]
} ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Next patches are going to add a 'resilient' nexthop group type, so allow
users to specify the type using the 'type' argument. Currently, only
'mpath' type is supported.
These two commands are equivalent:
# ip nexthop add id 10 group 1/2/3
# ip nexthop add id 10 group 1/2/3 type mpath
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The feature is supported by the kernel since 5.11-net-next,
let's allow user-space to use it.
Just parse and dump an additional, per endpoint, u16 attribute
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Talking to varios people, it became apparent that there is a certain
ambiguity in the description of these flags. They refer to egress
flooding, which should perhaps be stated more clearly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The "usually hardware" and "usually software" distinctions make no
sense, try to clarify what these do based on the actual kernel behavior.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The bridge program does:
fdb_modify:
/* Assume self */
if (!(req.ndm.ndm_flags&(NTF_SELF|NTF_MASTER)))
req.ndm.ndm_flags |= NTF_SELF;
which is clearly against the documented behavior. The only thing we can
do, sadly, is update the documentation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Explaining the "local" flag by saying that it is "a local permanent fdb
entry" is not very helpful, be more specific.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The bridge does this:
fdb_modify:
/* Assume permanent */
if (!(req.ndm.ndm_state&(NUD_PERMANENT|NUD_REACHABLE)))
req.ndm.ndm_state |= NUD_PERMANENT;
So let's make the user aware of the fact that if they don't want local
entries, they need to specify some other flag like "static".
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The bridge program parses "local" and "permanent" in just the same way,
so it makes sense to tell that to users:
fdb_modify:
} else if (matches(*argv, "local") == 0 ||
matches(*argv, "permanent") == 0) {
req.ndm.ndm_state |= NUD_PERMANENT;
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add implementation for the port parameters
getting/setting.
Add bash completion for port param.
Add man description for port param.
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David Ahern <dsahern@kernel.org>
vdpa tool is created to create, delete and query vdpa devices.
examples:
Show vdpa management device that supports creating, deleting vdpa devices.
$ vdpa mgmtdev show
vdpasim:
supported_classes net
$ vdpa mgmtdev show -jp
{
"show": {
"vdpasim": {
"supported_classes": [ "net" ]
}
}
}
Create a vdpa device of type networking named as "foo2" from
the management device vdpasim_net:
$ vdpa dev add mgmtdev vdpasim_net name foo2
Show the newly created vdpa device by its name:
$ vdpa dev show foo2
foo2: type network mgmtdev vdpasim_net vendor_id 0 max_vqs 2 max_vq_size 256
$ vdpa dev show foo2 -jp
{
"dev": {
"foo2": {
"type": "network",
"mgmtdev": "vdpasim_net",
"vendor_id": 0,
"max_vqs": 2,
"max_vq_size": 256
}
}
}
Delete the vdpa device after its use:
$ vdpa dev del foo2
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
In creating documentation for expressions I ran into an interesting case
where if you use two different familie types in the expression, such as
in `ss 'sport inet:ssh or src unix:/run/*'`, then you would only get the
results for one address family (in this case unix sockets).
The reason is that in parse_hostcond if the family is specified we
remove any previously added families from filter->families, and
preserve the "states" if any states are set. I tried changing this to
not reset the families, but ran into some issues with Invalid Argument
errors in inet_show_netlink, I think related to the state.
I can dig into that more if supporting this is useful, but I'm not sure
if these types of expressions would actually be useful in practice. Or
perhaps an error should be given if an expression contains conditions
with multiple families (besides inet and inet6)?
Anyway, for now, this patch just notes the limitation in the man page.
Signed-off-by: Thayne McCombs <astrothayne@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This adds some documentation of the syntax for the FILTER argument to
the ss command to the ss (8) man page.
Signed-off-by: Thayne McCombs <astrothayne@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since this feature's introduction in commit 9c66d1564676 ("taprio: Add
support for hardware offloading") from kernel v5.4, it never got
documented in the man pages. Due to this reason, we see customer reports
of seemingly contradictory information: the community manpages claim
there is no support for full offload, nonetheless many silicon vendors
have already implemented it.
This patch documents the full offload feature (enabled by specifying
"flags 2" to the taprio qdisc) and gives one more example that tries to
illustrate some of the finer points related to the usage.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
* Fix PROTO description in help message (mpls isn't a valid argument).
* Remove SRCPORTMIN description from help message since it doesn't
appear in the syntax string.
* Use same keywords in help message and in man page.
* Use the "ethertype" option name (.B ethertype) rather than the
option value (.I ETHERTYPE) in the man page description of
[no]multiproto.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Support set operation of the devlink port function state.
Example of a PCI SF port function which supports the state:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
$ devlink port show
pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
pci/0000:08:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 splittable false
function:
hw_addr 00:00:00:00:00:00 state inactive opstate detached
$ devlink port show pci/0000:06:00.0/32768
pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 splittable false
function:
hw_addr 00:00:00:00:00:00 state inactive opstate detached
$ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 state active
$ devlink port show pci/0000:06:00.0/32768 -jp
{
"port": {
"pci/0000:06:00.0/32768": {
"type": "eth",
"netdev": "ens2f0npf0sf88",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 88,
"splittable": false,
"function": {
"hw_addr": "00:00:00:00:88:88",
"state": "active",
"opstate": "attached"
}
}
}
}
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Enable user to add and delete the devlink port.
Examples for adding and deleting one SF port:
Examples of add, show and delete commands:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
$ devlink port show
pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
Add devlink port of flavour 'pcipf' for PF number 0 SF number 88:
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
pci/0000:06:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 splittable false
function:
hw_addr 00:00:00:00:00:00 state inactive opstate detached
Delete newly added devlink port
$ devlink port del pci/0000:06:00.0/32768
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The Linux DCBX object is a 1-byte bitfield of flags that configure whether
the DCBX protocol is implemented in the device or in the host, and which
version of the protocol should be used. Add a tool to access the per-port
Linux DCBX object.
For example:
# dcb dcbx set dev eni1np1 host ieee
# dcb dcbx show dev eni1np1
host ieee
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>