iproute2

Commit Graph

Author	SHA1	Message	Date
Nikolay Aleksandrov	a8d7212a4f	bridge: vlan: add global mcast_mld_version option Add control and dump support for the global mcast_mld_version option which controls the MLD version on the vlan (default 1). Syntax: $ bridge vlan global set dev bridge vid 1 mcast_mld_version 2 Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:25:17 -06:00
Nikolay Aleksandrov	29fada0f41	bridge: vlan: add global mcast_igmp_version option Add control and dump support for the global mcast_igmp_version option which controls the IGMP version on the vlan (default 2). Syntax: $ bridge vlan global set dev bridge vid 1 mcast_igmp_version 3 Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:24:09 -06:00
Nikolay Aleksandrov	1f608d590c	bridge: vlan: add global mcast_snooping option Add control and dump support for the global mcast_snooping option which controls if multicast snooping is enabled or disabled for a single vlan. Syntax: $ bridge vlan global set dev bridge vid 1 mcast_snooping 1 Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:23:26 -06:00
Nikolay Aleksandrov	dee5eb05e5	bridge: vlan: add support to set global vlan options Add support to change global vlan options via a new vlan global set subcommand similar to the current vlan set subcommand. The man page and help are updated accordingly. The command works only with bridge devices. It doesn't support any options yet. Syntax: $ bridge vlan global set vid VID dev DEV Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:21:13 -06:00
Nikolay Aleksandrov	ecf6d8b4a1	bridge: vlan: add support for vlan filtering when dumping options In order to allow vlan filtering when dumping options we need to move all print operations into the option dumping functions and add the filtering after we've parsed the nested attributes so we can extract the start and end vlan ids. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:21:09 -06:00
Nikolay Aleksandrov	720f8613bd	bridge: vlan: add support to show global vlan options Add support for new bridge vlan command grouping called global which operates on global options. The first command it supports is "show". To do that we update print_vlan_rtm to recognize the global vlan options attribute and parse it properly. Man page and help are also updated with the new command. Syntax is: $ bridge vlan global show [ vid VID ] [ dev DEV ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:21:04 -06:00
Nikolay Aleksandrov	d3a961a9b1	bridge: vlan: skip unknown attributes when printing options Skip unknown attributes when printing vlan options in print_vlan_rtm. Make sure print_vlan_opts doesn't accept attributes it doesn't understand. Currently we print only one type, later global vlan options support will be added. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:21:00 -06:00
Nikolay Aleksandrov	312e22fe79	bridge: vlan: factor out vlan option printing Factor out the code which prints current per-vlan options from print_vlan_rtm without any changes, later we'll filter based on the vlan attribute and add support for global vlan option printing. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:20:53 -06:00
Nikolay Aleksandrov	d2eecb9d1d	ip: bridge: add support for mcast_vlan_snooping Add support for mcast_vlan_snooping option which controls per-vlan multicast snooping, also update the man page. Syntax: $ ip link set dev bridge type bridge mcast_vlan_snooping 0/1 Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-31 21:20:03 -06:00
Stephen Hemminger	169f36a0c9	v5.14.0	2021-08-31 11:57:59 -07:00
Jakub Kicinski	85b0e73c77	ss: fix fallback to procfs for raw sockets Jonas reports that ss -awp does not display any RAW sockets on a Knoppix 4.4 kernel. sockdiag_send() diverts to tcpdiag_send() to try the older netlink interface. tcpdiag_send() works for TCP and DCCP but not other protocols. Instead of rejecting unsupported protocols (and missing RAW and SCTP) match on supported ones. Link: https://lore.kernel.org/netdev/20210815231738.7b42bad4@mmluhan/ Reported-and-tested-by: Jonas Bechtel <post@jbechtel.de> Fixes: `41fe6c34de` ("ss: Add inet raw sockets information gathering via netlink diag interface") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-18 15:03:46 -07:00
Stephen Hemminger	1afde09498	uapi: update neighbour.h Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-18 14:09:34 -07:00
Gokul Sivakumar	10ecd12690	man: bridge: fix the typo to change "-c[lor]" into "-c[olor]" in man page Fixes: `3a1ca9a5b` ("bridge: update man page for new color and json changes") Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-18 14:04:53 -07:00
Gokul Sivakumar	057d3c6d37	bridge: fdb: don't colorize the "dev" & "dst" keywords in "bridge -c fdb" To be consistent with the colorized output of "ip" command and to increase readability, stop highlighting the "dev" & "dst" keywords in the colorized output of "bridge -c fdb" cmd. Example: in the following "bridge -c fdb" entry, only "00:00:00:00:00:00", "vxlan100" and "2001:db8:2::1" fields should be highlighted in color. 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:2::1 self permanent Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-18 14:04:53 -07:00
Gokul Sivakumar	82149efee9	bridge: reorder cmd line arg parsing to let "-c" detected as "color" option As per the man/man8/bridge.8 page, the shorthand cmd line arg "-c" can be used to colorize the bridge cmd output. But while parsing the args in while loop, matches() detects "-c" as "-compressedvlans" instead of "-color", so fix this by doing the check for "-color" option first before checking for "-compressedvlans". Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-18 14:04:53 -07:00
Hangbin Liu	3a09567f7d	ip/bond: add arp_validate filter support Add arp_validate filter support based on kernel commit 896149ff1b2c ("bonding: extend arp_validate to be able to receive unvalidated arp-only traffic") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-18 14:02:44 -07:00
Parav Pandit	355c49ffa5	devlink: Show port state values in man page and in the help command Port function state can have either of the two values - active or inactive. Update the documentation and help command for these two values to tell user about it. With the introduction of state, hw_addr and state are optional. Hence mark them as optional in man page that also aligns with the help command output. Fixes: `bdfb9f1bd6` ("devlink: Support set of port function state") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-11 15:02:30 -07:00
Hangbin Liu	ebaa603b30	ip/bond: add lacp active support lacp_active specifies whether to send LACPDU frames periodically. If set on, the LACPDU frames are sent along with the configured lacp_rate setting. If set off, the LACPDU frames acts as "speak when spoken to". v2: use strcmp instead of match for new options. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>	2021-08-11 12:26:20 -06:00
David Ahern	8d6134b204	Update kernel headers Update kernel headers to commit: 88be32634905 ("Merge branch 'dsa-tagger-helpers'") Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-11 12:23:33 -06:00
Ilya Dmitrichenko	51d8fc708c	ip/tunnel: always print all known attributes Presently, if a Geneve or VXLAN interface was created with 'external', it's not possible for a user to determine e.g. the value of 'dstport' after creation. This change fixes that by avoiding early returns. This change partly reverts commit `00ff4b8e31` ("ip/tunnel: Be consistent when printing tunnel collect metadata"). Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-11 12:17:52 -06:00
Justin Iurman	71ba9c18e0	ipioam6: use print_nl instead of print_null This patch addresses Stephen's comment: """ > + print_null(PRINT_ANY, "", "\n", NULL); Use print_nl() since it handles the case of oneline output. Plus in JSON the newline is meaningless. """ It also removes two useless print_null's. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-11 12:16:09 -06:00
Phil Sutter	9b7ea92b9e	tc: u32: Fix key folding in sample option In between Linux kernel 2.4 and 2.6, key folding for hash tables changed in kernel space. When iproute2 dropped support for the older algorithm, the wrong code was removed and kernel 2.4 folding method remained in place. To get things functional for recent kernels again, restoring the old code alone was not sufficient - additional byteorder fixes were needed. While being at it, make use of ffs() and thereby align the code with how kernel determines the shift width. Fixes: `267480f553` ("Backout the 2.4 utsname hash patch.") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 20:02:43 -07:00
Andrea Claudi	d1eacf12b5	lib: bpf_glue: remove useless assignment The value of s used inside the cycle is the result of strstr(), so this assignment is useless. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 20:01:54 -07:00
Andrea Claudi	50a4127022	lib: bpf_legacy: fix potential NULL-pointer dereference If bpf_map_fetch_name() returns NULL, strlen() hits a NULL-pointer dereference on outer_map_name. Fix this checking outer_map_name value, and returning false when NULL, as already done for inner_map_name before. Fixes: `6d61a2b557` ("lib: add libbpf support") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 19:55:12 -07:00
Jacob Keller	954a0077c8	devlink: fix infinite loop on flash update for drivers without status When processing device flash update, cmd_dev_flash function waits until the flash process has completed. This requires the following two conditions to both be true: a) we've received an exit status from the child process b) we've received the DEVLINK_CMD_FLASH_UPDATE_END or we haven't received any status notifications from the driver. The original devlink flash status monitoring code in `9b13cddfe2` ("devlink: implement flash status monitoring") was written assuming that a driver will either send no status updates, or it will send at least one DEVLINK_CMD_FLASH_UPDATE_STATUS before DEVLINK_CMD_FLASH_UPDATE_END. Newer versions of the kernel since commit 52cc5f3a166a ("devlink: move flash end and begin to core devlink") in v5.10 moved handling of the DEVLINK_CMD_FLASH_UPDATE_END into the core stack, and will send this regardless of whether or not the driver sends any of its own status notifications. The handling of DEVLINK_CMD_FLASH_UPDATE_END in cmd_dev_flash_status_cb has an additional condition that it must not be the first message. Otherwise, it falls back to treating it like a DEVLINK_CMD_FLASH_UPDATE_STATUS. This is wrong because it can lead to an infinite loop if a driver does not send any status updates. In this case, the kernel will send DEVLINK_CMD_FLASH_UPDATE_END without any DEVLINK_CMD_FLASH_UPDATE_STATUS. The devlink application will see that ctx->not_first is false, and will treat this like any other status message. Thus, ctx->not_first will be set to 1. The loop condition to exit flash update will thus never be true, since we will wait forever, because ctx->not_first is true, and ctx->received_end is false. This leads to the application appearing to process the flash update, but it will never exit. Fix this by simply always treating DEVLINK_CMD_FLASH_UPDATE_END the same regardless of whether its the first message or not. 
This is obviously the correct thing to do: once we've received the DEVLINK_CMD_FLASH_UPDATE_END the flash update must be finished. For new kernels this is always true, because we send this message in the core stack after the driver flash update routine finishes. For older kernels, some drivers may not have sent any DEVLINK_CMD_FLASH_UPDATE_STATUS or DEVLINK_CMD_FLASH_UPDATE_END. This is handled by the while loop conditional that exits if we get a return value from the child process without having received any status notifications. An argument could be made that we should exit immediately when we get either the DEVLINK_CMD_FLASH_UPDATE_END or an exit code from the child process. However, at a minimum it makes no sense to ever process DEVLINK_CMD_FLASH_UPDATE_END as if it were a DEVLINK_CMD_FLASH_UPDATE_STATUS. This is easy to test as it is triggered by the selftests for the netdevsim driver, which has a test case for both with and without status notifications. Fixes: `9b13cddfe2` ("devlink: implement flash status monitoring") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 19:54:39 -07:00
Feng Zhou	be99929d60	lib/bpf: Fix btf_load error lead to enable debug log Use tc with no verbose, when bpf_btf_attach fail, the conditions: "if (fd < 0 && (errno == ENOSPC || !ctx->log_size))" will make ctx->log_size != 0. And then, bpf_prog_attach, ctx->log_size != 0. so enable debug log. The verifier log sometimes is so chatty on larger programs. bpf_prog_attach is failed. "Log buffer too small to dump verifier log 16777215 bytes (9 tries)!" BTF load failure does not affect prog load. prog still work. So when BTF/PROG load fail, enlarge log_size and re-fail with having verbose. Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 19:53:54 -07:00
Peilin Ye	e78411948d	tc/skbmod: Introduce SKBMOD_F_ECN option Recently we added SKBMOD_F_ECN option support to the kernel; support it in the tc-skbmod(8) front end, and update its man page accordingly. The 2 least significant bits of the Traffic Class field in IPv4 and IPv6 headers are used to represent different ECN states [1]: 0b00: "Non ECN-Capable Transport", Non-ECT 0b10: "ECN Capable Transport", ECT(0) 0b01: "ECN Capable Transport", ECT(1) 0b11: "Congestion Encountered", CE This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE, which is useful for ECN-based rate limiting. For example: $ tc filter add dev eth0 parent 1: protocol ip prio 10 \ u32 match ip protocol 1 0xff flowid 1:2 \ action skbmod \ ecn The updated tc-skbmod SYNOPSIS looks like the following: tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ... Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod command. Trying to use more than one of them at a time is considered undefined behavior; pipe multiple tc-skbmod commands together instead. "set" and "swap" only affect Ethernet packets, while "ecn" only affects IP packets. Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option support", as well as iproute2 patch "tc/skbmod: Remove misinformation about the swap action". [1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-08 11:56:55 -06:00
David Ahern	09d8ce3db1	Merge branch 'main' into next Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-04 09:24:12 -06:00
David Ahern	e8763fc9ab	Merge branch 'ipv6-oam' into next Justin Iurman says: ==================== The IOAM patchset was merged recently (see net-next commits [1,2,3,4,5,6]). Therefore, this patchset provides support for IOAM inside iproute2, as well as manpage documentation. Here is a summary of added features inside iproute2. (1) configure IOAM namespaces and schemas: $ ip ioam Usage: ip ioam { COMMAND | help } ip ioam namespace show ip ioam namespace add ID [ data DATA32 ] [ wide DATA64 ] ip ioam namespace del ID ip ioam schema show ip ioam schema add ID DATA ip ioam schema del ID ip ioam namespace set ID schema { ID | none } (2) provide a new encap type to insert the IOAM pre-allocated trace: $ ip -6 ro ad fc00::1/128 encap ioam6 trace prealloc type 0x800000 ns 1 size 12 dev eth0 [1] db67f219fc9365a0c456666ed7c134d43ab0be8a [2] 9ee11f0fff205b4b3df9750bff5e94f97c71b6a0 [3] 8c6f6fa6772696be0c047a711858084b38763728 [4] 3edede08ff37c6a9370510508d5eeb54890baf47 [5] de8e80a54c96d2b75377e0e5319a64d32c88c690 [6] 968691c777af78d2daa2ee87cfaeeae825255a58 ==================== Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 11:34:09 -06:00
Justin Iurman	78832863ef	IOAM man8 This patch provides man8 documentation for IOAM inside ip, ip-ioam and ip-route. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 11:33:35 -06:00
Justin Iurman	32f4969d44	New IOAM6 encap type for routes This patch provides a new encap type for routes to insert an IOAM pre-allocated trace: $ ip -6 ro ad fc00::1/128 encap ioam6 trace prealloc type 0x800000 ns 1 size 12 dev eth0 where: - "trace" and "prealloc" may appear as useless but just anticipate for future implementations of other ioam option types. - "type" is a bitfield (=u32) defining the IOAM pre-allocated trace type (see the corresponding uapi). - "ns" is an IOAM namespace ID attached to the pre-allocated trace. - "size" is the trace pre-allocated size in bytes; must be a 4-octet multiple; limited size (see IOAM6_TRACE_DATA_SIZE_MAX). Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 11:33:31 -06:00
Justin Iurman	2909812583	Add, show, link, remove IOAM namespaces and schemas This patch provides support for adding, listing and removing IOAM namespaces and schemas with iproute2. When adding an IOAM namespace, both "data" (=u32) and "wide" (=u64) are optional. Therefore, you can either have none, one of them, or both at the same time. When adding an IOAM schema, there is no restriction on "DATA" except its size (see IOAM6_MAX_SCHEMA_DATA_LEN). By default, an IOAM namespace has no active IOAM schema (meaning an IOAM namespace is not linked to an IOAM schema), and an IOAM schema is not considered as "active" (meaning an IOAM schema is not linked to an IOAM namespace). It is possible to link an IOAM namespace with an IOAM schema, thanks to the last command below (meaning the IOAM schema will be considered as "active" for the specific IOAM namespace). $ ip ioam Usage: ip ioam { COMMAND | help } ip ioam namespace show ip ioam namespace add ID [ data DATA32 ] [ wide DATA64 ] ip ioam namespace del ID ip ioam schema show ip ioam schema add ID DATA ip ioam schema del ID ip ioam namespace set ID schema { ID | none } Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 11:33:05 -06:00
David Ahern	e53f4cd504	Import ioam6 uapi headers Import ioam6 uapi headers from kernel headers at last sync commit. Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 11:32:26 -06:00
David Ahern	236696e52c	Update kernel headers Update kernel headers to commit: 1187c8c4642d ("net: phy: mscc: make some arrays static const, makes object smaller") Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 10:25:09 -06:00
Gokul Sivakumar	cf866f0a5a	ipneigh: add support to print brief output of neigh cache in tabular format Make use of the already available brief flag and print the basic details of the IPv4 or IPv6 neighbour cache in a tabular format for better readability when the brief output is expected. $ ip -br neigh 172.16.12.100 bridge0 b0:fc:36:2f:07:43 172.16.12.174 bridge0 8c:16:45:2f:bc:1c 172.16.12.250 bridge0 04:d9:f5:c1:0c:74 fe80::267b:9f70:745e:d54d bridge0 b0:fc:36:2f:07:43 fd16:a115:6a62:0:8744:efa1:9933:2c4c bridge0 8c:16:45:2f:bc:1c fe80::6d9:f5ff:fec1:c74 bridge0 04:d9:f5:c1:0c:74 And add "ip neigh show" to the list of ip sub commands mentioned in the man page that support the brief output in tabular format. Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-08-02 10:14:50 -06:00
Peilin Ye	c06d313d86	tc/skbmod: Remove misinformation about the swap action Currently man 8 tc-skbmod says that "...the swap action will occur after any smac/dmac substitutions are executed, if they are present." This is false. In fact, trying to "set" and "swap" in a single skbmod command causes the "set" part to be completely ignored. As an example: $ tc filter add dev eth0 parent 1: protocol ip prio 10 \ matchall action skbmod \ set dmac AA:AA:AA:AA:AA:AA smac BB:BB:BB:BB:BB:BB \ swap mac The above command simply does a "swap", without setting DMAC or SMAC to AA's or BB's. The root cause of this is in the kernel, see net/sched/act_skbmod.c:tcf_skbmod_init(): parm = nla_data(tb[TCA_SKBMOD_PARMS]); index = parm->index; if (parm->flags & SKBMOD_F_SWAPMAC) lflags = SKBMOD_F_SWAPMAC; ^^^^^^^^^^^^^^^^^^^^^^^^^^ Doing a "=" instead of "|=" clears all other "set" flags when doing a "swap". Discourage using "set" and "swap" in the same command by documenting it as undefined behavior, and update the "SYNOPSIS" section as well as tc -help text accordingly. If one really needs to e.g. "set" DMAC to all AA's then "swap" DMAC and SMAC, one should do two separate commands and "pipe" them together. Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-22 15:14:29 -07:00
Roi Dayan	71d36000dc	police: Fix normal output back to what it was With the json support fix the normal output was changed. set it back to what it was. Print overhead with print_size(). Print newline before ref. Fixes: `0d5cf51e0d` ("police: Add support for json output") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-17 11:14:30 -07:00
Lahav Schlesinger	f760bff328	ipmonitor: Fix recvmsg with ancillary data A successful call to recvmsg() causes msg.msg_controllen to contain the length of the received ancillary data. However, the current code in the 'ip' utility doesn't reset this value after each recvmsg(). This means that if a call to recvmsg() doesn't have ancillary data, then 'msg.msg_controllen' will be set to 0, causing future recvmsg() which do contain ancillary data to get MSG_CTRUNC set in msg.msg_flags. This fixes 'ip monitor' running with the all-nsid option - With this option the kernel passes the nsid as ancillary data. If while 'ip monitor' is running an event on the current netns is received, then no ancillary data will be sent, causing 'msg.msg_controllen' to be set to 0, which causes 'ip monitor' to indefinitely print "[nsid current]" instead of the real nsid. Fixes: `449b824ad1` ("ipmonitor: allows to monitor in several netns") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-17 11:13:36 -07:00
Stephen Hemminger	7a7e9ed98f	uapi: headers update Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-17 11:12:47 -07:00
Christian Schürmann	1f2c908d53	man8/ip-tunnel.8: fix typo, 'encaplim' is not a valid option Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-15 09:31:51 -07:00
Alexander Mikhalitsyn	115e987035	libnetlink: check error handler is present before a call Fix nullptr dereference of errhndlr from rtnl_dump_filter_arg struct in rtnl_dump_done and rtnl_dump_error functions. Fixes: `459ce6e3d7` ("ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Roi Dayan <roid@nvidia.com> Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com> Reported-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-11 10:33:44 -07:00
Stephen Hemminger	0015ada629	libnetlink: cosmetic changes Don't initialize arguments that are NULL, and format initialization in a more logical way. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-07 07:39:07 -07:00
Alexander Mikhalitsyn	459ce6e3d7	ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped We started to use in-kernel filtering feature which allows to get only needed tables (see iproute_dump_filter()). From the kernel side it's implemented in net/ipv4/fib_frontend.c (inet_dump_fib), net/ipv6/ip6_fib.c (inet6_dump_fib). The problem here is that behaviour of "ip route save" was changed after `c7e6371bc` ("ip route: Add protocol, table id and device to dump request"). If filters are used, then kernel returns ENOENT error if requested table is absent, but in newly created net namespace even RT_TABLE_MAIN table doesn't exist. It is really allocated, for instance, after issuing "ip l set lo up". Reproducer is fairly simple: $ unshare -n ip route save > dump Error: ipv4: FIB table does not exist. Dump terminated Expected result here is to get empty dump file (as it was before this change). v2: reworked, so, now it takes into account NLMSGERR_ATTR_MSG (see nl_dump_ext_ack_done() function). We want to suppress error messages in stderr about absent FIB table from kernel too. v3: reworked to make code clearer. Introduced rtnl_suppressed_errors(), rtnl_suppress_error() helpers. User may suppress up to 3 errors (may be easily extended by changing SUPPRESS_ERRORS_INIT macro). v4: reworked, rtnl_dump_filter_errhndlr() was introduced. Thanks to Stephen Hemminger for comments and suggestions v5: space fixes, commit message reformat, empty initializers Fixes: `c7e6371bc` ("ip route: Add protocol, table id and device to dump request") Cc: David Ahern <dsahern@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-07 07:32:56 -07:00
Stephen Hemminger	8f85d085fe	uapi: update kernel headers from 5.14-rc1 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-06 17:07:24 -07:00
Martynas Pumputis	83d4d61bc9	libbpf: fix attach of prog with multiple sections When BPF programs which consists of multiple executable sections via iproute2+libbpf (configured with LIBBPF_FORCE=on), we noticed that a wrong section can be attached to a device. E.g.: # tc qdisc replace dev lxc_health clsact # tc filter replace dev lxc_health ingress prio 1 \ handle 1 bpf da obj bpf_lxc.o sec from-container # tc filter show dev lxc_health ingress filter protocol all pref 1 bpf chain 0 filter protocol all pref 1 bpf chain 0 handle 0x1 bpf_lxc.o:[__send_drop_notify] <-- WRONG SECTION direct-action not_in_hw id 38 tag 7d891814eda6809e jited After taking a closer look into load_bpf_object() in lib/bpf_libbpf.c, we noticed that the filter used in the program iterator does not check whether a program section name matches a requested section name (cfg->section). This can lead to a wrong prog FD being used to attach the program. Fixes: `6d61a2b557` ("lib: add libbpf support") Signed-off-by: Martynas Pumputis <m@lambda.lt> Acked-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-06 16:59:39 -07:00
David Ahern	02c06ffc13	Merge branch 'main' into next Signed-off-by: David Ahern <dsahern@kernel.org>	2021-07-01 14:29:42 +00:00
Stephen Hemminger	fc3511962d	lib: remove blank line at eof Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-06-29 13:20:44 -07:00
Stephen Hemminger	0e7ea3e8fe	v5.13.0	2021-06-29 11:24:17 -07:00
Ben Hutchings	33cf9306c8	devlink: Fix printf() type mismatches on 32-bit architectures devlink currently uses "%lu" to format values of type uint64_t, but on 32-bit architectures uint64_t is defined as unsigned long long and this does not work correctly. Fix this by using the standard macro PRIu64 instead. Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-06-29 11:10:14 -07:00
Ben Hutchings	4ac0383a59	utils: Fix BIT() to support up to 64 bits on all architectures devlink and vdpa use BIT() together with 64-bit flag fields. devlink is already using bit numbers greater than 31 and so does not work correctly on 32-bit architectures. Fix this by making BIT() use uint64_t instead of unsigned long. Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-06-29 11:10:14 -07:00
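
The last two entries above describe the same portability problem: `unsigned long` and `"%lu"` are only 32 bits wide on 32-bit architectures, so 64-bit flag fields need `uint64_t` and `PRIu64`. The following is a minimal sketch of that idea, not the code from either commit; the names `BIT64` and `format_u64` are hypothetical and chosen only for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical 64-bit-safe bit macro. A definition based on
 * "1UL << (nr)" yields only 32 bits where unsigned long is 32 bits
 * wide, so bit numbers above 31 cannot be represented there.
 * UINT64_C(1) forces a 64-bit constant on every architecture.
 */
#define BIT64(nr) (UINT64_C(1) << (nr))

/*
 * Hypothetical helper: format a uint64_t portably.
 * "%lu" is wrong where uint64_t is unsigned long long;
 * PRIu64 expands to the correct conversion for the platform.
 * Returns the number of characters snprintf() would write.
 */
static int format_u64(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "%" PRIu64, value);
}
```

Calling `format_u64(buf, sizeof(buf), BIT64(40))` fills `buf` with the decimal value of bit 40 on both 32-bit and 64-bit builds.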

5660 Commits
