iproute2

Commit Graph

Author	SHA1	Message	Date
Ralf Baechle	3a92669b3a	AX.25: Add ax25_ntop implementation. AX.25 addresses are based on amateur radio callsigns followed by an SSID, like XXXXXX-SS, where the callsign is up to 6 characters, each either a letter or a digit, and the SSID is a decimal number in the range 0..15. Amateur radio callsigns are assigned by a country's relevant authorities and are 3..6 characters long, though a few countries have assigned longer callsigns. AX.25 is not able to handle such longer callsigns. Being based on HDLC, AX.25 encodes addresses by shifting them one bit left, thus zeroing bit 0, the HDLC extension bit, for all but the last octet of a packet's address field. For our purposes here we are not considering the HDLC extension bit, that is, it will always be zero. Linux's internal representation of AX.25 addresses is very similar to the on-air or on-the-wire format. The callsign is padded to 6 octets by adding spaces, followed by the SSID octet, then all 7 octets are left-shifted by one bit. This, for example, turns "LINUX-1", where the callsign is LINUX and the SSID is 1, into 98:92:9c:aa:b0:40:02. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-09-23 20:02:30 -06:00
Gokul Sivakumar	ebbb701714	lib: bpf_legacy: add prog name, load time, uid and btf id in prog info dump The BPF program name is included when dumping the BPF program info and the kernel only stores the first (BPF_PROG_NAME_LEN - 1) bytes for the program name. $ sudo ip link show dev docker0 4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc noqueue state UP mode DEFAULT group default link/ether 02:42:4c:df:a4:54 brd ff:ff:ff:ff:ff:ff prog/xdp id 789 name xdp_drop_func tag 57cd311f2e27366b jited The BPF program load time (ns since boottime), UID of the user who loaded the program and the BTF ID are also included when dumping the BPF program information when the user expects a detailed ip link info output. $ sudo ip -details link show dev docker0 4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc noqueue state UP mode DEFAULT group default link/ether 02:42:4c:df:a4:54 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.2:42:4c:df:a4:54 designated_root 8000.2:42:4c:df:a4:54 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer 0.00 tcn_timer 0.00 topology_change_timer 0.00 gc_timer 265.36 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3124 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 prog/xdp id 789 name xdp_drop_func tag 57cd311f2e27366b jited load_time 2676682607316255 created_by_uid 0 btf_id 708 Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-09-21 09:16:32 -06:00
Stephen Hemminger	7a70524270	ip: remove leftovers from IPX and DECnet Iproute2 has not supported DECnet or IPX since version 5.0. There was some leftover support in the ip options flags and parsing; remove it. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-09-01 14:03:53 -07:00
Andrea Claudi	d1eacf12b5	lib: bpf_glue: remove useless assignment The value of s used inside the cycle is the result of strstr(), so this assignment is useless. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 20:01:54 -07:00
Andrea Claudi	50a4127022	lib: bpf_legacy: fix potential NULL-pointer dereference If bpf_map_fetch_name() returns NULL, strlen() hits a NULL-pointer dereference on outer_map_name. Fix this checking outer_map_name value, and returning false when NULL, as already done for inner_map_name before. Fixes: `6d61a2b557` ("lib: add libbpf support") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 19:55:12 -07:00
Feng Zhou	be99929d60	lib/bpf: Fix btf_load error lead to enable debug log When tc is used without verbose and bpf_btf_attach fails, the condition "if (fd < 0 && (errno == ENOSPC || !ctx->log_size))" makes ctx->log_size != 0. Then, in bpf_prog_attach, ctx->log_size != 0, so the debug log is enabled. The verifier log is sometimes very chatty on larger programs, and bpf_prog_attach fails with "Log buffer too small to dump verifier log 16777215 bytes (9 tries)!" A BTF load failure does not affect the prog load; the prog still works. So when the BTF/PROG load fails, enlarge log_size and re-fail with verbose. Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-08-10 19:53:54 -07:00
Lahav Schlesinger	f760bff328	ipmonitor: Fix recvmsg with ancillary data A successful call to recvmsg() causes msg.msg_controllen to contain the length of the received ancillary data. However, the current code in the 'ip' utility doesn't reset this value after each recvmsg(). This means that if a call to recvmsg() doesn't have ancillary data, then 'msg.msg_controllen' will be set to 0, causing future recvmsg() calls that do contain ancillary data to get MSG_CTRUNC set in msg.msg_flags. This fixes 'ip monitor' running with the all-nsid option. With this option the kernel passes the nsid as ancillary data. If an event on the current netns is received while 'ip monitor' is running, then no ancillary data will be sent, causing 'msg.msg_controllen' to be set to 0, which causes 'ip monitor' to indefinitely print "[nsid current]" instead of the real nsid. Fixes: `449b824ad1` ("ipmonitor: allows to monitor in several netns") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-17 11:13:36 -07:00
Alexander Mikhalitsyn	115e987035	libnetlink: check error handler is present before a call Fix nullptr dereference of errhndlr from rtnl_dump_filter_arg struct in rtnl_dump_done and rtnl_dump_error functions. Fixes: `459ce6e3d7` ("ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Roi Dayan <roid@nvidia.com> Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com> Reported-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-11 10:33:44 -07:00
Stephen Hemminger	0015ada629	libnetlink: cosmetic changes Don't initialize arguments that are NULL, and format initialization in a more logical way. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-07 07:39:07 -07:00
Alexander Mikhalitsyn	459ce6e3d7	ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped We started to use in-kernel filtering feature which allows to get only needed tables (see iproute_dump_filter()). From the kernel side it's implemented in net/ipv4/fib_frontend.c (inet_dump_fib), net/ipv6/ip6_fib.c (inet6_dump_fib). The problem here is that behaviour of "ip route save" was changed after `c7e6371bc` ("ip route: Add protocol, table id and device to dump request"). If filters are used, then kernel returns ENOENT error if requested table is absent, but in newly created net namespace even RT_TABLE_MAIN table doesn't exist. It is really allocated, for instance, after issuing "ip l set lo up". Reproducer is fairly simple: $ unshare -n ip route save > dump Error: ipv4: FIB table does not exist. Dump terminated Expected result here is to get empty dump file (as it was before this change). v2: reworked, so, now it takes into account NLMSGERR_ATTR_MSG (see nl_dump_ext_ack_done() function). We want to suppress error messages in stderr about absent FIB table from kernel too. v3: reworked to make code clearer. Introduced rtnl_suppressed_errors(), rtnl_suppress_error() helpers. User may suppress up to 3 errors (may be easily extended by changing SUPPRESS_ERRORS_INIT macro). v4: reworked, rtnl_dump_filter_errhndlr() was introduced. Thanks to Stephen Hemminger for comments and suggestions v5: space fixes, commit message reformat, empty initializers Fixes: `c7e6371bc` ("ip route: Add protocol, table id and device to dump request") Cc: David Ahern <dsahern@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-07 07:32:56 -07:00
Martynas Pumputis	83d4d61bc9	libbpf: fix attach of prog with multiple sections When loading BPF programs that consist of multiple executable sections via iproute2+libbpf (configured with LIBBPF_FORCE=on), we noticed that a wrong section can be attached to a device. E.g.: # tc qdisc replace dev lxc_health clsact # tc filter replace dev lxc_health ingress prio 1 \ handle 1 bpf da obj bpf_lxc.o sec from-container # tc filter show dev lxc_health ingress filter protocol all pref 1 bpf chain 0 filter protocol all pref 1 bpf chain 0 handle 0x1 bpf_lxc.o:[__send_drop_notify] <-- WRONG SECTION direct-action not_in_hw id 38 tag 7d891814eda6809e jited After taking a closer look into load_bpf_object() in lib/bpf_libbpf.c, we noticed that the filter used in the program iterator does not check whether a program section name matches a requested section name (cfg->section). This can lead to a wrong prog FD being used to attach the program. Fixes: `6d61a2b557` ("lib: add libbpf support") Signed-off-by: Martynas Pumputis <m@lambda.lt> Acked-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-07-06 16:59:39 -07:00
David Ahern	02c06ffc13	Merge branch 'main' into next Signed-off-by: David Ahern <dsahern@kernel.org>	2021-07-01 14:29:42 +00:00
Stephen Hemminger	fc3511962d	lib: remove blank line at eof Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-06-29 13:20:44 -07:00
Guillaume Nault	f8879e85f0	utils: bump max args number to 512 for batch files Large tc filters can have many arguments. For example the following filter matches the first 7 MPLS LSEs, pops all of them, then updates the Ethernet header and redirects the resulting packet to eth1. filter add dev eth0 ingress handle 44 priority 100 \ protocol mpls_uc flower mpls \ lse depth 1 label 1040076 tc 4 bos 0 ttl 175 \ lse depth 2 label 89648 tc 2 bos 0 ttl 9 \ lse depth 3 label 63417 tc 5 bos 0 ttl 185 \ lse depth 4 label 593135 tc 5 bos 0 ttl 67 \ lse depth 5 label 857021 tc 0 bos 0 ttl 181 \ lse depth 6 label 239239 tc 1 bos 0 ttl 254 \ lse depth 7 label 30 tc 7 bos 1 ttl 237 \ action mpls pop protocol mpls_uc pipe \ action mpls pop protocol mpls_uc pipe \ action mpls pop protocol mpls_uc pipe \ action mpls pop protocol mpls_uc pipe \ action mpls pop protocol mpls_uc pipe \ action mpls pop protocol mpls_uc pipe \ action mpls pop protocol ipv6 pipe \ action vlan pop_eth pipe \ action vlan push_eth \ dst_mac 00:00:5e:00:53:7e \ src_mac 00:00:5e:00:53:03 pipe \ action mirred egress redirect dev eth1 This filter has 149 arguments, so it can't be used with tc -batch, which is limited to 100. Let's bump the limit to 512. That should leave a lot of room for big batch commands. v2: -Define the limit in utils.h (Stephen Hemminger) -Bump the limit even higher (256 -> 512) (Stephen Hemminger) Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-06-18 02:57:05 +00:00
Florian Westphal	d3740fdc26	libgenl: make genl_add_mcast_grp set errno on error genl_add_mcast_grp doesn't set errno in all cases. On kernels that support mptcp but lack event support (all kernels <= 5.11) MPTCP_PM_EV_GRP_NAME won't be found and ip will exit with "can't subscribe to mptcp events: Success" Set errno to a meaningful value (ENOENT) when the group name isn't found and also cover other spots where it returns nonzero with errno unset. Fixes: `ff619e4fd3` ("mptcp: add support for event monitoring") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-05-17 11:59:37 -07:00
Heiko Thiery	c5b72cc56b	lib/fs: fix issue when {name,open}_to_handle_at() is not implemented Commit `d5e6ee0dac` introduced the use of the functions name_to_handle_at() and open_by_handle_at(). But these functions are not available e.g. in uclibc-ng < 1.0.35. For backward compatibility, check for their availability in the configure script and, if they are absent, do a direct syscall. Fixes: `d5e6ee0dac` ("ss: introduce cgroup2 cache and helper functions") Cc: Dmitry Yakunin <zeil@yandex-team.ru> Cc: Petr Vorel <petr.vorel@gmail.com> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com> Reviewed-by: Petr Vorel <petr.vorel@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-05-17 02:31:29 +00:00
Andrea Claudi	3296d4fe77	lib: bpf_legacy: avoid to pass invalid argument to close() In function bpf_obj_open, if bpf_fetch_prog_arg() returns an error, we end up in the out: path with a negative value for fd, and pass it to close(). Avoid this by checking for fd to be positive. Fixes: `32e93fb7f6` ("{f,m}_bpf: allow for sharing maps") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-05-06 14:43:54 +00:00
Stephen Hemminger	2363bc99f9	Merge git://git.kernel.org/pub/scm/network/iproute2/iproute2-next Required manual fix of devlink/devlink.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-04-27 19:39:39 -07:00
Andrea Claudi	e1ad689545	lib: bpf_legacy: fix missing socket close when connect() fails In functions bpf_{send,recv}_map_fds(), when connect fails after a socket is successfully opened, we return with error missing a close on the socket. Fix this closing the socket if opened and using a single return point for both the functions. Fixes: `6256f8c9e4` ("tc, bpf: finalize eBPF support for cls and act front-end") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-04-26 21:05:19 -07:00
Andrea Claudi	92af24c907	lib: bpf_legacy: treat 0 as a valid file descriptor As stated in its man page, open() returns a non-negative integer as a file descriptor. Hence, when checking for its return value to be ok, we should include 0 as a valid value. This fixes a covscan warning about a missing close() in this function. Fixes: `ecb05c0f99` ("bpf: improve error reporting around tail calls") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-04-26 21:05:19 -07:00
Andrea Claudi	81bfd01a4c	lib: move get_task_name() from rdma The function get_task_name() is used to get the name of a process from its pid, and its implementation is similar to ip/iptuntap.c:pid_name(). Move it to lib/fs.c to use a single implementation and make it easily reusable. Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-04-22 05:22:16 +00:00
Nikolay Aleksandrov	34c14bea22	libnetlink: add bridge vlan dump request helper Add rtnl bridge vlan dump request helper which will be used to retrieve bridge vlan information and options. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-04-22 05:13:29 +00:00
Florian Westphal	ff619e4fd3	mptcp: add support for event monitoring This adds iproute2 support for mptcp event monitoring, e.g. creation, establishment, address announcements from the peer, subflow establishment and so on. While the kernel-generated events are primarily aimed at mptcpd (e.g. for subflow management), this is also useful for debugging. This adds print support for the existing events. Sample output of 'ip mptcp monitor': [ CREATED] token=83f3a692 remid=0 locid=0 saddr4=10.0.1.2 daddr4=10.0.1.1 sport=58710 dport=10011 [ ESTABLISHED] token=83f3a692 remid=0 locid=0 saddr4=10.0.1.2 daddr4=10.0.1.1 sport=58710 dport=10011 [SF_ESTABLISHED] token=83f3a692 remid=0 locid=1 saddr4=10.0.2.2 daddr4=10.0.1.1 sport=40195 dport=10011 backup=0 [ CLOSED] token=83f3a692 Signed-off-by: Florian Westphal <fw@strlen.de>	2021-04-22 05:10:25 +00:00
David Ahern	76bfc185f2	Merge branch 'main' into next Signed-off-by: David Ahern <dsahern@kernel.org>	2021-03-21 17:16:01 +00:00
Ido Schimmel	2be6d18b30	nexthop: Add support for nexthop buckets Add ability to dump multiple nexthop buckets and get a specific one. Example: # ip nexthop add id 10 group 1/2 type resilient buckets 8 # ip nexthop id 1 via 192.0.2.2 dev dummy10 scope link id 2 via 192.0.2.19 dev dummy20 scope link id 10 group 1/2 type resilient buckets 8 idle_timer 120 unbalanced_timer 0 unbalanced_time 0 # ip nexthop bucket id 10 index 0 idle_time 28.1 nhid 2 id 10 index 1 idle_time 28.1 nhid 2 id 10 index 2 idle_time 28.1 nhid 2 id 10 index 3 idle_time 28.1 nhid 2 id 10 index 4 idle_time 28.1 nhid 1 id 10 index 5 idle_time 28.1 nhid 1 id 10 index 6 idle_time 28.1 nhid 1 id 10 index 7 idle_time 28.1 nhid 1 # ip nexthop bucket show nhid 1 id 10 index 4 idle_time 53.59 nhid 1 id 10 index 5 idle_time 53.59 nhid 1 id 10 index 6 idle_time 53.59 nhid 1 id 10 index 7 idle_time 53.59 nhid 1 # ip nexthop bucket get id 10 index 5 id 10 index 5 idle_time 81 nhid 1 # ip -j -p nexthop bucket get id 10 index 5 [ { "id": 10, "bucket": { "index": 5, "idle_time": 104.89, "nhid": 1 }, "flags": [ ] } ] Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-03-19 15:01:25 +00:00
Petr Machata	e757f741e9	json_print: Add print_tv() Add a helper to dump a timeval. Print by first converting to double and then dispatching to print_color_float(). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-03-19 15:00:08 +00:00
Tony Ambardar	06bee37c1c	lib/bpf: add missing limits.h includes Several functions in bpf_glue.c and bpf_libbpf.c rely on PATH_MAX, which is normally included from <limits.h> in other iproute2 source files. It fixes errors seen using gcc 10.2.0, binutils 2.35.1 and musl 1.1.24: bpf_glue.c: In function 'get_libbpf_version': bpf_glue.c:46:11: error: 'PATH_MAX' undeclared (first use in this function); did you mean 'AF_MAX'? 46 | char buf[PATH_MAX], *s; | ^~~~~~~~ | AF_MAX Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-03-16 22:53:53 -07:00
Parav Pandit	e3a4067e52	utils: Introduce helper routines for generic socket recv Introduce a helper for generic socket receive and a helper to build a command with a custom family and version. Use the API in a subsequent devlink patch. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-03-03 04:00:04 +00:00
Stephen Hemminger	52c5f3f043	Merge git://git.kernel.org/pub/scm/network/iproute2/iproute2-next	2021-02-23 23:03:42 -08:00
Andrea Claudi	b2d44b9a95	lib/fs: Fix single return points for get_cgroup2_* Functions get_cgroup2_id() and get_cgroup2_path() may call close() with a negative argument. Avoid that making the calls conditional on the file descriptors. get_cgroup2_path() may also return NULL leaking a file descriptor. Ensure this does not happen using a single return point. Fixes: `d5e6ee0dac` ("ss: introduce cgroup2 cache and helper functions") Fixes: `8f1cd119b3` ("lib: fix checking of returned file handle size for cgroup") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-02-22 18:20:44 -08:00
Andrea Claudi	1de363b180	lib/fs: avoid double call to mkdir on make_path() make_path() function calls mkdir two times in a row. The first one it stores mkdir return code, and then it calls it again to check for errno. This seems unnecessary, as we can use the return code from the first call and check for errno if not 0. Fixes: `ac3415f5c1` ("lib/fs: Fix and simplify make_path()") Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-02-22 18:20:44 -08:00
Andrea Claudi	d4fcdbbec9	lib/bpf: Fix and simplify bpf_mnt_check_target() As stated in commit `ac3415f5c1` ("lib/fs: Fix and simplify make_path()"), calling stat() before mkdir() is racy, because the entry might change in between. As the call to stat() seems to only check for target existence, we can simply call mkdir() unconditionally and catch all errors but EEXIST. Fixes: `95ae9a4870` ("bpf: fix mnt path when from env") Signed-off-by: Andrea Claudi <aclaudi@redhat.com>	2021-02-22 18:19:01 -08:00
Andrea Claudi	1e25de9a92	lib/namespace: fix ip -all netns return code When ip -all netns {del,exec} is called and no netns is present, ip exits with status 0. However, this does not happen if no netns has been created since boot time: in that case the NETNS_RUN_DIR is not present and netns_foreach() exits with code 1. $ ls /var/run/netns ls: cannot access '/var/run/netns': No such file or directory $ ip -all netns exec ip link show $ echo $? 1 $ ip -all netns del $ echo $? 1 $ ip netns add test $ ip netns del test $ ip -all netns del $ echo $? 0 $ ls -a /var/run/netns . .. This leaves us in the unpleasant situation where the same command, when no netns is present, does the same thing (in this case, nothing), but exits with two different statuses. Fix this by treating ENOENT differently from other errors, similarly to what we already do in ipnetns.c netns_identify_pid(). Fixes: `e998e118dd` ("lib: Exec func on each netns") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-02-22 18:17:56 -08:00
Parav Pandit	6c76994982	utils: Add helper to map string to unsigned int A subsequent patch needs to map a string to an unsigned int. Hence, add an API to map a string to an unsigned int. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-02-11 09:09:10 -07:00
Parav Pandit	b822275ad8	utils: Add generic socket helpers Subsequent patch needs to (a) query and use socket family (b) send/receive messages using this family Hence add helper routines to open, close, query family and to perform send receive operations. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-02-11 09:09:07 -07:00
Parav Pandit	bd3709c3a7	utils: Add helper routines for indent handling A subsequent patch needs to use 2-char indentation for nested objects. Hence introduce generic helpers to allocate, deallocate, increment, decrement and print an indent block. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-02-11 09:08:13 -07:00
Parav Pandit	249465d3bf	devlink: Support get port function state Print port function state and operational state whenever reported by kernel. Example of a PCI SF port function which supports the state: $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 pci/0000:08:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached $ devlink port show pci/0000:06:00.0/32768 pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 $ devlink port show pci/0000:06:00.0/32768 -jp { "port": { "pci/0000:06:00.0/32768": { "type": "eth", "netdev": "ens2f0npf0sf88", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 88, "splittable": false, "function": { "hw_addr": "00:00:00:00:88:88", "state": "inactive", "opstate": "detached" } } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-02-02 02:06:41 +00:00
Parav Pandit	a9642c5fa6	devlink: Introduce and use string to number mapper Instead of using static mapping in code, introduce a helper routine to map a value to string. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-02-02 02:01:53 +00:00
Luca Boccassi	8498ca92d7	vrf: fix ip vrf exec with libbpf The size of bpf_insn is passed to bpf_load_program instead of the number of elements as it expects, so ip vrf exec fails with: $ sudo ip link add vrf-blue type vrf table 10 $ sudo ip link set dev vrf-blue up $ sudo ip/ip vrf exec vrf-blue ls Failed to load BPF prog: 'Invalid argument' last insn is not an exit or jmp processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 Kernel compiled with CGROUP_BPF enabled? https://bugs.debian.org/980046 Reported-by: Emmanuel DECAEN <Emmanuel.Decaen@xsalto.com> Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-01-18 12:32:17 -08:00
Roi Dayan	1a22ad2721	build: Fix link errors on some systems Since moving get_rate() and get_size() from tc to lib, on some systems we fail to link because of missing math lib. Move the functions that require math lib to their own c file and add -lm to dcb that now use those functions. ../lib/libutil.a(utils.o): In function `get_rate': utils.c:(.text+0x10dc): undefined reference to `floor' ../lib/libutil.a(utils.o): In function `get_size': utils.c:(.text+0x1394): undefined reference to `floor' ../lib/libutil.a(json_print.o): In function `sprint_size': json_print.c:(.text+0x14c0): undefined reference to `rint' json_print.c:(.text+0x14f4): undefined reference to `rint' json_print.c:(.text+0x157c): undefined reference to `rint' Fixes: `f3be0e6366` ("lib: Move get_rate(), get_rate64() from tc here") Fixes: `44396bdfcc` ("lib: Move get_size() from tc here") Fixes: `adbe5de966` ("lib: Move sprint_size() from tc here, add print_size()") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-01-18 12:28:47 -08:00
Petr Machata	c13216f7a6	lib: Generalize parse_mapping() The function parse_mapping() assumes the key is a number, with a single configurable exception, which is using "all" to mean "all possible keys". If a caller wishes to use symbolic names instead of numbers, they cannot reuse this function. To facilitate reuse in these situations, convert parse_mapping() into a helper, parse_mapping_gen(), which instead of an allow-all boolean takes a generic key-parsing callback. Rewrite parse_mapping() in terms of this newly-added helper and add a pair of key parsers, one for just numbers, another for numbers and the keyword "all". Publish the latter as well. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-01-18 04:09:29 +00:00
Petr Machata	bf244ee677	lib: rt_names: Add rtnl_dsfield_get_name() For formatting DSCP (not full dsfield), it would be handy to be able to just get the name from the name table, and not get any of the remaining cruft related to formatting. Add a new entry point to just fetch the name table string uninterpreted. Use it from rtnl_dsfield_n2a(). Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@kernel.org>	2021-01-18 04:09:29 +00:00
Petr Machata	44396bdfcc	lib: Move get_size() from tc here The function get_size() serves for parsing of sizes using a handy notation that supports units and their prefixes, such as 10Kbit. This will be useful for the DCB buffer size parsing. Move the function from TC to the general library, so that it can be reused. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:50 +00:00
Petr Machata	f3be0e6366	lib: Move get_rate(), get_rate64() from tc here The functions get_rate() and get_rate64() are useful for parsing rate-like values. The DCB tool will find these useful in the maxrate subtool. Move them over to lib so that they can be easily reused. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:44 +00:00
Petr Machata	aaeda2a768	lib: print_color_rate(): Fix formatting small rates in IEC mode ISO/IEC units are distinguished from the decadic ones by using prefixes like "Ki", "Mi" instead of "K" and "M". The current code inserts the letter "i" after the decadic unit when in IEC mode. However it does so even when the prefix is an empty string, formatting 1Kbit in IEC mode as "1000ibit". Fix by omitting the letter if there is no prefix. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:41 +00:00
Petr Machata	a0a4b6618c	lib: sprint_size(): Uncrustify the code a bit Ideally this and the rate printing would both be converted to a common helper, but unfortunately the two format differently and this would break tests and scripts out there. So just make the code look less like a wad of hay. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:36 +00:00
Petr Machata	adbe5de966	lib: Move sprint_size() from tc here, add print_size() When displaying sizes of various sorts, tc commonly uses the function sprint_size() to format the size into a buffer as a human-readable string. This string is then displayed either using print_string(), or in some code even fprintf(). As a result, a typical sequence of code when formatting a size is something like the following: SPRINT_BUF(b); print_uint(PRINT_JSON, "foo", NULL, foo); print_string(PRINT_FP, NULL, "foo %s ", sprint_size(foo, b)); For a concept as broadly useful as size, it would be better to have a dedicated function in json_print. To that end, move sprint_size() from tc_util to json_print. Add helpers print_size() and print_color_size() that wrap around sprint_size() and provide the JSON dispatch as appropriate. Since print_size() should be the preferred interface, convert the vast majority of uses of sprint_size() to print_size(). Two notable exceptions are: - q_tbf, which does not show the size as such, but uses the string "$human_readable_size/$cell_size" even in JSON. There is simply no way to have print_size() emit the same text, because print_size() in JSON mode should of course just use the raw number, without human-readable frills. - q_cake, which relies on the existence of sprint_size() in its macro-based formatting helpers. There might be ways to convert this particular case, but given q_tbf simply cannot be converted, leave it as is. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:25 +00:00
Petr Machata	60265cc226	lib: Move print_rate() from tc here; modernize The functions print_rate() and sprint_rate() are useful for formatting rate-like values. The DCB tool would find these useful in the maxrate subtool. However, the current interface to these functions uses a global variable use_iec as a flag indicating whether 1024- or 1000-based powers should be used when formatting the rate value. For general use, a global variable is not a great way of passing arguments to a function. Besides, it is unlike most other printing functions in that it deals in buffers and ignores JSON. Therefore make the interface to print_rate() explicit by converting use_iec to an ordinary parameter. Since the interface changes anyway, convert it to follow the pattern of other json_print functions (except for the now-explicit use_iec parameter). Move to json_print.c. Add a wrapper to tc, so that all the call sites do not need to repeat the use_iec global variable argument, and convert all call sites. In q_cake.c, the conversion is not straightforward due to usage of a macro that is shared across numerous data types. Simply hand-roll the corresponding code, which seems better than making an extra helper for one call site. Drop sprint_rate() now that everybody just uses print_rate(). Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:15 +00:00
David Ahern	b3c4a55064	Only compile mnl_utils when HAVE_MNL is defined New lib/mnl_utils.c fails to compile if libmnl is not installed: mnl_utils.c:9:10: fatal error: libmnl/libmnl.h: No such file or directory 9 | #include <libmnl/libmnl.h> Make it dependent on HAVE_MNL. Fixes: `72858c7b77` ("lib: Extract from devlink/mnlg a helper, mnlu_socket_open()") Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-04 16:19:05 +00:00
Hangbin Liu	6d61a2b557	lib: add libbpf support This patch converts iproute2 to use libbpf for loading and attaching BPF programs when it is available, which was started by Toke's implementation[1]. With libbpf iproute2 could correctly process BTF information and support the new-style BTF-defined maps, while keeping compatibility with the old internal map definition syntax. The old iproute2 bpf code is kept and will be used if no suitable libbpf is available. When using libbpf, wrapper code in bpf_legacy.c ensures that iproute2 will still understand the old map definition format, including populating map-in-map and tail call maps before load. In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the legacy bytes. When handling the legacy maps, for map-in-maps, we create them manually and re-use the fd as they are associated with id/inner_id. For pin maps, we only set the pin path and let libbpf load handle it. For tail calls, we find it first and update the element after prog load. Other maps/progs will be loaded by libbpf directly. [1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/ Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:14:05 -07:00
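
The AX.25 address layout described in commit 3a92669b3a (callsign padded to 6 octets with spaces, then the SSID octet, every octet shifted left by one bit) can be illustrated with a small sketch. This is not the ax25_ntop code from that commit: the names ax25_encode() and ax25_decode() and the two AX25_* constants are hypothetical, chosen only for this example, and the HDLC extension bit is left at zero as the commit message assumes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define AX25_CALL_LEN 6	/* callsign octets, space padded (assumed name) */
#define AX25_ADDR_LEN 7	/* callsign octets plus one SSID octet (assumed name) */

static_assert(AX25_ADDR_LEN == AX25_CALL_LEN + 1, "address is callsign + SSID");

/* Hypothetical helper: pack a callsign and SSID into 7 left-shifted octets.
 * Returns 0 on success, -1 if the callsign is empty or too long or the
 * SSID is outside 0..15. */
static int ax25_encode(const char *call, unsigned int ssid,
		       uint8_t out[AX25_ADDR_LEN])
{
	size_t len = strlen(call);
	size_t i;

	if (len == 0 || len > AX25_CALL_LEN || ssid > 15)
		return -1;

	for (i = 0; i < AX25_CALL_LEN; i++) {
		/* Pad short callsigns with spaces, then shift one bit left. */
		unsigned char c = i < len ? (unsigned char)call[i] : ' ';

		out[i] = (uint8_t)(c << 1);
	}
	out[AX25_CALL_LEN] = (uint8_t)(ssid << 1);
	return 0;
}

/* Hypothetical helper: format 7 shifted octets as "CALL-SSID".
 * Returns 0 on success, -1 if buf is too small. */
static int ax25_decode(const uint8_t in[AX25_ADDR_LEN], char *buf,
		       size_t buflen)
{
	char call[AX25_CALL_LEN + 1];
	size_t n = 0;
	size_t i;
	int written;

	for (i = 0; i < AX25_CALL_LEN; i++) {
		char c = (char)(in[i] >> 1);

		if (c == ' ')	/* padding starts here */
			break;
		call[n++] = c;
	}
	call[n] = '\0';

	written = snprintf(buf, buflen, "%s-%u", call,
			   (unsigned int)(in[AX25_CALL_LEN] >> 1) & 0x0f);
	return (written < 0 || (size_t)written >= buflen) ? -1 : 0;
}
```

Under these assumptions, calling ax25_encode("LINUX", 1, addr) fills addr with 98:92:9c:aa:b0:40:02, the value quoted in the commit message, and ax25_decode() on those octets gives back "LINUX-1".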

529 Commits