iproute2

Commit Graph

Author	SHA1	Message	Date
Stephen Hemminger	260dc56ae3	lib: fix spelling errors Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-08-12 18:21:10 -07:00
Kurt Kanzenbach	c875433b14	utils: Fix get_s64() function get_s64() uses internally strtoll() to parse the value out of a given string. strtoll() returns a long long. However, the intermediate variable is long only which might be 32 bit on some systems. So, fix it. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-07-29 08:44:20 -07:00
Ivan Delalande	ed54f76484	json: fix backslash escape typo in jsonw_puts Fixes: `fcc16c22` ("provide common json output formatter") Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-07-19 10:48:38 -07:00
Matteo Croce	1f420318bd	utils: don't match empty strings as prefixes iproute has a utility function which checks if a string is a prefix of another one, to allow use of abbreviated commands, e.g. 'addr' or 'a' instead of 'address'. This routine unfortunately considers an empty string as a prefix of any pattern, leading to undefined behaviour when an empty argument is passed to ip: # ip '' 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever # tc '' qdisc noqueue 0: dev lo root refcnt 2 # ip address add 192.0.2.0/24 '' 198.51.100.1 dev dummy0 # ip addr show dev dummy0 6: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 02:9d:5e:e9:3f:c0 brd ff:ff:ff:ff:ff:ff inet 192.0.2.0/24 brd 198.51.100.1 scope global dummy0 valid_lft forever preferred_lft forever Rewrite matches() so it takes care of an empty input, and doesn't scan the input strings three times: the current implementation does 2 strlen and a memcpy to accomplish the same task. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-07-15 13:48:48 -07:00
John Hurley	11d7087a4e	lib: add mpls_uc and mpls_mc as link layer protocol names Update the llproto_names array to allow users to reference the mpls protocol ids with the names 'mpls_uc' for unicast MPLS and 'mpls_mc' for multicast. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-07-10 14:06:28 -07:00
Andrea Claudi	1e5746d5e1	utils: move parse_percent() to tc_util As parse_percent() is used only in tc. This reduces ip, bridge and genl binaries size: $ bloat-o-meter -t bridge/bridge bridge/bridge.new add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-109 (-109) Total: Before=50973, After=50864, chg -0.21% $ bloat-o-meter -t genl/genl genl/genl.new add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-109 (-109) Total: Before=30298, After=30189, chg -0.36% $ bloat-o-meter ip/ip ip/ip.new add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-109 (-109) Total: Before=674164, After=674055, chg -0.02% Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-28 16:06:26 -07:00
David Ahern	f7eef91897	Merge branch 'master' into next Conflicts: include/uapi/linux/snmp.h Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-21 15:59:24 -07:00
Matteo Croce	b2e2922373	netns: make netns_{save,restore} static The netns_{save,restore} functions are only used in ipnetns.c now, since the restore is not needed anymore after the netns exec command. Move them in ipnetns.c, and make them static. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-20 14:30:41 -07:00
Matteo Croce	903818fbf9	netns: switch netns in the child when executing commands 'ip netns exec' changes the current netns just before executing a child process, and restores it after forking. This is needed if we're running in batch or do_all mode. Some cleanups must be done both in the parent and in the child: the parent must restore the previous netns, while the child must reset any VRF association. Unfortunately, if do_all is set, the VRF are not reset in the child, and the spawned processes are started with the wrong VRF context. This can be triggered with this script: # ip -b - <<-'EOF' link add type vrf table 100 link set vrf0 up link add type dummy link set dummy0 vrf vrf0 up netns add ns1 EOF # ip -all -b - <<-'EOF' vrf exec vrf0 true netns exec setsid -f sleep 1h EOF # ip vrf pids vrf0 314 sleep # ps 314 PID TTY STAT TIME COMMAND 314 ? Ss 0:00 sleep 1h Refactor cmd_exec() and pass to it a function pointer which is called in the child before the final exec. In the netns exec case the function just resets the VRF and switches netns. Doing it in the child is less error prone and safer, because the parent environment is always kept unaltered. After this refactor some utility functions became unused, so remove them. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-20 14:30:41 -07:00
Hangbin Liu	ca697cee4c	ip: add a new parameter -Numeric Add a new parameter '-Numeric' to show the number of protocol, scope, dsfield, etc directly instead of converting it to human readable name. Do the same on tc and ss. This patch is based on David Ahern's previous patch. Suggested-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-18 08:37:47 -07:00
David Ahern	e92d221022	Merge branch 'master' into next Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-14 07:29:40 -07:00
Moshe Shemesh	c934da8aaa	devlink: mnlg: Catch returned error value of dumpit commands Devlink commands which implement the dumpit callback may return an error. The netlink function netlink_dump() sends the errno value as the payload of the message, while answering user space with NLMSG_DONE. To enable receiving the errno value for dumpit commands we have to check for it in the message. If it is a negative value then the dump returned an error so we should set errno accordingly and check for ext_ack in case it was set. Fixes: `049c58539f` ("devlink: mnlg: Add support for extended ack") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-12 08:43:14 -07:00
David Ahern	74829ca7dd	libnetlink: Add helper to create nexthop dump request Add rtnl_nexthopdump_req to initiate a dump request of nexthop objects. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:53 -07:00
David Ahern	9860becfe3	libnetlink: Add helper to add a group via setsockopt groups > 31 have to be joined using the setsockopt. Since the nexthop group is 32, add a helper to allow 'ip monitor' to listen for nexthop messages. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:48 -07:00
David Ahern	2360b8cb21	libnetlink: Set NLA_F_NESTED in rta_nest Kernel now requires NLA_F_NESTED to be set on new nested attributes. Set NLA_F_NESTED in rta_nest. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:39 -07:00
Matteo Croce	80a931d41c	ip: reset netns after each command in batch mode When creating a new netns or executing a program into an existing one, the unshare() or setns() calls will change the current netns. In batch mode, this can run commands on the wrong interfaces, as the ifindex value is meaningful only in the current netns. For example, this command fails because veth-c doesn't exist in the init netns: # ip -b - <<-'EOF' netns add client link add name veth-c type veth peer veth-s netns client addr add 192.168.2.1/24 dev veth-c EOF Cannot find device "veth-c" Command failed -:7 But if there are two devices with the same name in the init and new netns, ip will build a wrong ll_map with indexes belonging to the new netns, and will execute actions in the init netns using this wrong mapping. This script will flush all eth0 addresses and bring it down, as it has the same ifindex as veth0 in the new netns: # ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff inet 192.168.122.76/24 brd 192.168.122.255 scope global dynamic eth0 valid_lft 3598sec preferred_lft 3598sec # ip -b - <<-'EOF' netns add client link add name veth0 type veth peer name veth1 link add name veth-ns type veth peer name veth0 netns client link set veth0 down address flush veth0 EOF # ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff 3: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether c2:db:d0:34:13:4a brd ff:ff:ff:ff:ff:ff 4: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether ca:9d:6b:5f:5f:8f brd ff:ff:ff:ff:ff:ff 5: veth-ns@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 32:ef:22:df:51:0a brd ff:ff:ff:ff:ff:ff link-netns client The same issue can be triggered by the netns exec subcommand with a slightly different script: # ip netns add client # ip -b - <<-'EOF' netns exec client true link add name veth0 type veth peer name veth1 link add name veth-ns type veth peer name veth0 netns client link set veth0 down address flush veth0 EOF Fix this by adding two netns_{save,restore} functions, which are used to get a file descriptor for the init netns, and restore it after each batch command. netns_save() is called before the unshare() or setns(), while netns_restore() is called after each command. Fixes: `0dc34c7713` ("iproute2: Add processless network namespace support") Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-10 10:42:14 -07:00
Nicolas Dichtel	757837230a	lib: suppress error msg when filling the cache Before the patch: $ ip netns add foo $ ip link add name veth1 address 2a:a5:5c:b9:52:89 type veth peer name veth2 address 2a:a5:5c:b9:53:90 netns foo RTNETLINK answers: No such device RTNETLINK answers: No such device But the command was successful. This may break scripts. Let's remove those error messages. Fixes: `55870dfe7f` ("Improve batch and dump times by caching link lookups") Reported-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-05-28 12:23:52 -07:00
Ralf Baechle	8391023680	ip: display netrom link type For a NETROM "ip link show dev nr0" will show 4: nr0: <NOARP,UP,LOWER_UP> mtu 236 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/generic 88:98:6a:a4:84:40:0a brd 00:00:00:00:00:00:00 But rather link/netrom is expected to be displayed. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-04-11 15:25:50 -07:00
David Ahern	55870dfe7f	Improve batch and dump times by caching link lookups ip route uses ll_name_to_index and ll_index_to_name to convert between device names and indices. At the moment both use the ioctl based glibc functions if_nametoindex and if_indextoname and do not cache the result. When using a batch file or dumping a large number of routes this means the same device lookups can be done repeatedly, adding unnecessary overhead (socket + ioctl + close for each device lookup). Add a new function, ll_link_get, to send a netlink based RTM_GETLINK. If successful, cache the result in idx_head and name_head so future lookups can re-use the entry. Update ll_name_to_index and ll_index_to_name to use ll_link_get and only fall back to the glibc functions if it fails. With this change the time to install 720,022 routes with 2 ecmp nexthops where the nexthop device is given is reduced from 31.4 seconds to 19.2 seconds. A dump of those routes drops from 13.3 to 2.8 seconds. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-02-22 18:51:20 -08:00
David Ahern	25c6339b22	ll_map: Add function to remove link cache entry by index Add ll_drop_by_index to remove an entry from the link cache. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-02-22 18:51:15 -08:00
David Ahern	9f78e995a8	Merge branch 'iproute2-master' into next Conflicts: misc/ss.c Signed-off-by: David Ahern <dsahern@gmail.com>	2019-02-22 18:50:39 -08:00
Eric Dumazet	bb5ae621d0	lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg() In the past, we tried to increase the buffer size up to 32 KB in order to reduce number of syscalls per dump. Commit `2d34851cd3` ("lib/libnetlink: re malloc buff if size is not enough") brought the size back to 4KB because the kernel can not know the application is ready to receive bigger requests. See kernel commits 9063e21fb026 ("netlink: autosize skb lengthes") and d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()") for more details. Fixes: `2d34851cd3` ("lib/libnetlink: re malloc buff if size is not enough") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Cc: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-02-13 13:51:44 -08:00
Davide Caratti	ca81444303	use print_{,h}hu instead of print_uint when format specifier is %{,h}hu in this way, a useless cast to unsigned int is avoided in bpf_print_ops() and print_tunnel(). Tested with: # ./tdc.py -c bpf Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Cc: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-02-10 19:00:59 -08:00
Yonghong Song	3da6d055d9	bpf: add btf func and func_proto kind support The issue is discovered for bpf selftest test_skb_cgroup.sh. Currently we have, $ ./test_skb_cgroup_id.sh Wait for testing link-local IP to become available ... OK Object has unknown BTF type: 13! [PASS] In the above the BTF type 13 refers to BTF kind BTF_KIND_FUNC_PROTO. This patch added support of BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC during type parsing. With this patch, I got $ ./test_skb_cgroup_id.sh Wait for testing link-local IP to become available ... OK [PASS] Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-02-05 15:29:20 -08:00
Ido Schimmel	264be1d887	bridge: fdb: Fix FDB dump with strict checking disabled While iproute2 correctly uses ifinfomsg struct as the ancillary header when requesting an FDB dump on old kernels, it sets the message type to RTM_GETLINK. This results in wrong reply being returned. Fix this by using RTM_GETNEIGH instead. Before: $ bridge fdb show brport dummy0 Not RTM_NEWNEIGH: 00000158 00000010 00000002 After: $ bridge fdb show brport dummy0 2a:0b:41:1c:92:d3 vlan 1 master br0 permanent 2a:0b:41:1c:92:d3 master br0 permanent 33:33:00:00:00:01 self permanent 01:00:5e:00:00:01 self permanent Fixes: `05880354c2` ("bridge: fdb: Fix filtering with strict checking disabled") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: LiLiang <liali@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-02-05 15:27:28 -08:00
Chris Mi	17ed56fdf3	libnetlink: linkdump_req: AF_PACKET family also expects ext_filter_mask Without this fix, the VF info can't be shown using the command "ip link". 146: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 24:8a:07:ad:78:52 brd ff:ff:ff:ff:ff:ff vf 0 MAC 02:25:d0:12:01:01, spoof checking off, link-state auto, trust off, query_rss off vf 1 MAC 02:25:d0:12:01:02, spoof checking off, link-state auto, trust off, query_rss off Fixes: `d97b16b2c9` ("libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask") Signed-off-by: Chris Mi <chrism@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-02-05 15:25:43 -08:00
Davide Caratti	52d57f6bbd	tc: full JSON support for 'bpf' actions Add full JSON output support in the dump of 'act_bpf'. Example using eBPF: # tc actions flush action bpf # tc action add action bpf object bpf/action.o section 'action-ok' # tc -j action list action bpf | jq [ { "total acts": 1 }, { "actions": [ { "order": 0, "kind": "bpf", "bpf_name": "action.o:[action-ok]", "prog": { "id": 33, "tag": "a04f5eef06a7f555", "jited": 1 }, "control_action": { "type": "pipe" }, "index": 1, "ref": 1, "bind": 0 } ] } ] Example using cBPF: # tc actions flush action bpf # a=$(mktemp) # tcpdump -ddd not ether proto 0x888e >$a # tc action add action bpf bytecode-file $a index 42 # rm $a # tc -j action list action bpf | jq [ { "total acts": 1 }, { "actions": [ { "order": 0, "kind": "bpf", "bytecode": { "length": 4, "insns": [ { "code": 40, "jt": 0, "jf": 0, "k": 12 }, { "code": 21, "jt": 0, "jf": 1, "k": 34958 }, { "code": 6, "jt": 0, "jf": 0, "k": 0 }, { "code": 6, "jt": 0, "jf": 0, "k": 262144 } ] }, "control_action": { "type": "pipe" }, "index": 42, "ref": 1, "bind": 0 } ] } ] Tested with: # ./tdc.py -c bpf Cc: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-02-03 09:10:10 -08:00
David Ahern	97b44d571d	libnetlink: linkdump_req is done for AF_BRIDGE as well The bridge command 'vlan show' calls rtnl_linkdump_req_filter for family AF_BRIDGE. Update rtnl_linkdump_req_filter to send the filter for that family as well. Fixes: `d97b16b2c9` ("libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask") Reported-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com> Tested-by: Ido Schimmel <idosch@mellanox.com>	2019-01-07 08:36:58 -08:00
David Ahern	285033bfeb	libnetlink: Add RTNL_HANDLE_F_STRICT_CHK flag Add RTNL_HANDLE_F_STRICT_CHK flag and set it in rth flags to let commands know if the kernel supports strict checking. Extracted from a patch from Ido to fix filtering with strict checking enabled. Cc: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-01-04 12:17:17 -08:00
David Ahern	f255ab1225	libnetlink: Add filter function to rtnl_neighdump_req Add filter function to rtnl_neighdump_req and a buffer to the request for the filter functions to append attributes. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-01-04 12:17:11 -08:00
David Ahern	aea41afcfd	ip bridge: Set NETLINK_GET_STRICT_CHK on socket iproute2 has been updated for the new strict policy in the kernel. Add a helper to call setsockopt to enable the feature. Add a call to ip.c and bridge.c The setsockopt fails on older kernels and the error can be safely ignored - any new fields or attributes are ignored by the older kernel. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-27 15:36:29 -08:00
David Ahern	8847097850	ip address: Set device index in dump request Add a filter function to rtnl_addrdump_req to set device index in the address dump request if the user is filtering addresses by device. In addition, add a new ipaddr_link_get to do a single RTM_GETLINK request instead of a device dump yet still store the data in the linfo list. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-27 15:35:49 -08:00
David Ahern	43fd93ae46	ip route: Remove rtnl_rtcache_request Add a filter option to rtnl_routedump_req and use it to set rtm_flags removing the need for rtnl_rtcache_request for dump requests. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-27 15:33:34 -08:00
David Ahern	d97b16b2c9	libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask Only AF_UNSPEC handled by rtnl_dump_ifinfo expects an ext_filter_mask on a dump request. Update the linkdump request functions to only set and send ext_filter_mask for AF_UNSPEC. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-27 15:33:05 -08:00
David Ahern	92e03242c4	libnetlink: Use NLMSG_LENGTH to set nlmsg_len Change nlmsg_len from sizeof(req) to use NLMSG_LENGTH on the header. 2 of the inner headers are not 4-byte aligned, so add a 0-length buf after the header with the __aligned(NLMSG_ALIGNTO) to ensure the size of the request is large enough. Use NLMSG_ALIGN in NLMSG_LENGTH to set nlmsg_len. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-27 15:32:57 -08:00
David Ahern	2750252d7e	libnetlink: dump extack string in done message Print any extack message that has been appended to a NLMSG_DONE message. To avoid duplication, move the existing print code to a new helper. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-27 15:32:31 -08:00
David Ahern	6065ddfaa7	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-19 12:02:17 -08:00
Stephen Hemminger	738aebe52b	drop support for DECnet DECnet belongs in the history museum of dead protocols along with Appletalk and IPX. Linux support has outlived its natural life and the time has come to remove it from iproute2. Dead code is a source of bugs and exploits. If anyone actually has DECnet running on some old distribution they can just keep to the old version of iproute2. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-13 12:50:01 -08:00
Stephen Hemminger	3a1f602ade	remove redundant long int Using unsigned long is sufficient; there is no need to be more verbose and use unsigned long int. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-12-13 11:36:59 -08:00
Stephen Hemminger	33fde2b600	lib/bpf: fix build warning if no elf Function was not used unless HAVE_ELF, causing: bpf.c:105:13: warning: ‘bpf_map_offload_neutral’ defined but not used [-Wunused-function] Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-12-10 13:50:17 -08:00
David Ahern	fbe7da2306	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-12-07 13:02:08 -08:00
Petr Machata	0951cbcddf	libnetlink: Process further iovs on no error When no error is reported in the first iov, do not prematurely return, but process further iovs. This fixes batch processing. Fixes: `c60389e4f9` ("libnetlink: fix leak and using unused memory on error") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-12-04 14:28:31 -08:00
Stephen Hemminger	ce5071eda6	drop support for IPX IPX has been deprecated then removed from upstream kernels. Drop support from ip route as well. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-11-24 07:27:56 -08:00
Jakub Kicinski	b640e85d2d	json: add %hhu helpers Add helpers for printing char-size values. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-11-24 07:09:53 -08:00
Quentin Monnet	1a7d3ad8a5	bpf: initialise map symbol before retrieving and comparing its type In order to compare BPF map symbol type correctly in regard to the latest LLVM, commit `7a04dd84a7` ("bpf: check map symbol type properly with newer llvm compiler") compares map symbol type to both NOTYPE and OBJECT. To do so, it first retrieves the type from "sym.st_info" and stores it into a temporary variable. However, the type is collected from the symbol "sym" before this latter symbol is actually updated. gelf_getsym() is called after that and updates "sym", and when comparison with OBJECT or NOTYPE happens it is done on the type of the symbol collected in the previous passage of the loop (or on an uninitialised symbol on the first passage). This may eventually break map collection from the ELF file. Fix this by assigning the type to the temporary variable only after the call to gelf_getsym(). Fixes: `7a04dd84a7` ("bpf: check map symbol type properly with newer llvm compiler") Reported-by: Ron Philip <ron.philip@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-21 09:36:30 -08:00
Stephen Hemminger	babc56b68c	tc: drop unused name_to_id function Not used in current code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-19 11:42:44 -08:00
Stephen Hemminger	1d2fac4145	libnetlink: unused and local functions cleanup rtnl_talk_extack and parse_rtattr_index not used in current code. rtnl_dump_filter_l is only used in this file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-19 11:42:44 -08:00
Stephen Hemminger	cc5b7e37ac	lib/ll_map: make local function static ll_idx_a2n is only used in ll_map. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-19 11:42:44 -08:00
Stephen Hemminger	f7bf88dfd5	lib/color: make local functions static color_enable etc, only used here. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-19 11:42:44 -08:00
Stephen Hemminger	b8795a3208	lib/utils: make local functions static Some of the print/parsing is only used internally. Drop unused get_s8/get_s16. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-19 11:42:44 -08:00
Stephen Hemminger	07b20a6197	lib/ll_addr: whitespace and indent cleanup Run old ll_addr through kernel Lindent. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-19 11:42:44 -08:00
Luca Boccassi	6d2fd4a53f	Include bsd/string.h only in include/utils.h This is simpler and cleaner, and avoids having to include the header from every file where the functions are used. The prototypes of the internal implementation are in this header, so utils.h will have to be included anyway for those. Fixes: `508f3c231e` ("Use libbsd for strlcpy if available") Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-05 08:38:32 -08:00
Luca Boccassi	508f3c231e	Use libbsd for strlcpy if available If libc does not provide strlcpy check for libbsd with pkg-config to avoid relying on inline version. Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-11-01 12:47:03 -07:00
Yonghong Song	7a04dd84a7	bpf: check map symbol type properly with newer llvm compiler With llvm 7.0 or earlier, the map symbol type is STT_NOTYPE. -bash-4.4$ cat t.c __attribute__((section("maps"))) int g; -bash-4.4$ clang -target bpf -O2 -c t.c -bash-4.4$ readelf -s t.o Symbol table '.symtab' contains 2 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 g The following llvm commit enables BPF target to generate proper symbol type and size. commit bf6ec206615b9718869d48b4e5400d0c6e3638dd Author: Yonghong Song <yhs@fb.com> Date: Wed Sep 19 16:04:13 2018 +0000 [bpf] Symbol sizes and types in object file Clang-compiled object files currently don't include the symbol sizes and types. Some tools however need that information. For example, ctfconvert uses that information to generate FreeBSD's CTF representation from ELF files. With this patch, symbol sizes and types are included in object files. Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Reported-by: Yutaro Hayakawa <yhayakawa3720@gmail.com> Hence, for llvm 8.0.0 (currently trunk), symbol type will be not NOTYPE, but OBJECT. -bash-4.4$ clang -target bpf -O2 -c t.c -bash-4.4$ readelf -s t.o Symbol table '.symtab' contains 3 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS t.c 2: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 g This patch makes sure bpf library accepts both NOTYPE and OBJECT types of global map symbols. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-10-31 08:27:07 -07:00
David Ahern	6e221408e6	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-23 10:55:09 -07:00
David Ahern	cd554f2c2f	Tree wide: Drop sockaddr_nl arg No function, filter, or print function uses the sockaddr_nl arg, so just drop it. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2018-10-22 09:43:48 -07:00
David Ahern	9d16a1de1f	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-22 09:43:33 -07:00
Stephen Hemminger	95debca728	util: spelling fix	2018-10-18 13:23:38 -07:00
Lorenzo Bianconi	c7a3b22961	utils: fix get_rtnl_link_stats_rta stats parsing iproute2 walks through the list of available tunnels using netlink protocol in order to get device info instead of reading them from proc filesystem. However the kernel reports device statistics using IFLA_INET6_STATS/IFLA_INET6_ICMP6STATS attributes nested in the IFLA_PROTINFO one, but iproute2 expects this info in IFLA_STATS64/IFLA_STATS attributes. The issue can be triggered with the following reproducer: $ ip link add ip6d0 type ip6tnl mode ip6ip6 local 1111::1 remote 2222::1 $ ip -6 -d -s tunnel show ip6d0 ip6d0: ipv6/ipv6 remote 2222::1 local 1111::1 encaplimit 4 hoplimit 64 tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000) Dump terminated Fix the issue by introducing IFLA_INET6_STATS attribute parsing. Fixes: `3e95393871` ("iptunnel/ip6tunnel: Use netlink to walk through tunnels list") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>	2018-10-15 09:40:15 -07:00
Sabrina Dubroca	45ec4771d4	json: make 0xhex handle u64 Stephen converted macsec's sci to use 0xhex, but 0xhex handles unsigned int's, not 64 bits ints. Thus, the output of the "ip macsec show" command is mangled, with half of the SCI replaced with 0s: # ip macsec show 11: macsec0: [...] cipher suite: GCM-AES-128, using ICV length 16 TXSC: 0000000001560001 on SA 0 # ip -d link show macsec0 11: macsec0@ens3: [...] link/ether 52:54:00:12:01:56 brd ff:ff:ff:ff:ff:ff promiscuity 0 macsec sci 5254001201560001 [...] where TXSC and sci should match. Fixes: `c0b904de62` ("macsec: support JSON") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-10-15 09:32:18 -07:00
David Ahern	0d30c1f8d4	Merge branch 'master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-13 19:31:37 -07:00
Stephen Hemminger	bfb3bf189f	libnetlink: use local variable Now that err->error is in local variable, use it consistently. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-10-09 09:46:11 -07:00
Vlad Buslov	8c50b728b2	libnetlink: fix use-after-free of message buf In __rtnl_talk_iov() main loop, err is a pointer to memory in dynamically allocated 'buf' that is used to store netlink messages. If netlink message is an error message, buf is deallocated before returning with error code. However, on return err->error code is checked one more time to generate return value, after memory which err points to has already been freed. Save error code in temporary variable and use the variable to generate return value. Fixes: `c60389e4f9` ("libnetlink: fix leak and using unused memory on error") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-10-09 09:41:03 -07:00
Vinicius Costa Gomes	a066bac8a2	utils: Implement get_s64() Add this helper to read signed 64-bit integers from a string. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-07 10:30:28 -07:00
David Ahern	56eeeda978	libnetlink: Rename rtnl_wilddump_stats_req_filter to rtnl_statsdump_req_filter rtnl_wilddump_stats_req_filter only takes RTM_GETSTATS as the type argument so rename to rtnl_statsdump_req_filter for consistency with other request functions and hardcode the type argument. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:39:36 -07:00
David Ahern	31ae2912f7	libnetlink: Rename rtnl_wilddump_* to rtnl_linkdump_* Rename rtnl_wilddump_req_filter to rtnl_linkdump_req_filter, rtnl_wilddump_request to rtnl_linkdump_req and rtnl_wilddump_req_filter_fn to rtnl_linkdump_req_filter_fn. In all cases drop the type argument which at this point is only RTM_GETLINK and hardcode in the functions. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:39:08 -07:00
David Ahern	efb0b383d9	libnetlink: Convert GETNSID dumps to use rtnl_nsiddump_req Add rtnl_nsiddump_req for namespace id dumps using the proper rtgenmsg as the header. Convert existing RTM_GETNSID dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:39:04 -07:00
David Ahern	ff41db8a75	libnetlink: Convert GETNEIGHTBL dumps to use rtnl_neightbldump_req Add rtnl_neightbldump_req for neighbor table dumps using the proper ndtmsg as the header. Convert existing RTM_GETNEIGHTBL dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:39:02 -07:00
David Ahern	9e0ab19c4d	libnetlink: Convert GETNEIGH dumps to use rtnl_neighdump_req Add rtnl_neighdump_req for neighbor dumps using the proper ndmsg as the header. Convert existing rtnl_wilddump_request for RTM_GETNEIGH to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:59 -07:00
David Ahern	b05d9a3d58	libnetlink: Convert GETRULE dumps to use rtnl_ruledump_req Add rtnl_ruledump_req for fib rule dumps using the proper fib_rule_hdr as the header. Convert existing RTM_GETRULE dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:56 -07:00
David Ahern	ddee16bc96	libnetlink: Convert GETNETCONF dumps to use rtnl_netconfdump_req Add rtnl_netconfdump_req for netconf dumps using the proper netconfmsg as the header. Convert existing RTM_GETNETCONF dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:34 -07:00
David Ahern	9dbe6df411	libnetlink: Convert GETMDB dumps to use rtnl_mdbdump_req Add rtnl_mdbdump_req for mdb dumps using the proper br_port_msg as the header. Convert existing RTM_GETMDB dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:31 -07:00
David Ahern	393600231a	libnetlink: Convert GETADDRLABEL dumps to use rtnl_addrlbldump_req Add rtnl_addrlbldump_req for address label dumps using the proper ifaddrlblmsg as the header. Convert existing RTM_GETADDRLABEL dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:29 -07:00
David Ahern	bfb27dfaac	libnetlink: Convert GETROUTE dumps to use rtnl_routedump_req Add rtnl_routedump_req for route dumps using the proper rtmsg as the header. Convert existing RTM_GETROUTE dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:27 -07:00
David Ahern	46917d0895	libnetlink: Convert GETADDR dumps to use rtnl_addrdump_req Add rtnl_addrdump_req for address dumps using the proper ifaddrmsg as the header. Convert existing RTM_GETADDR dumps to use it. Signed-off-by: David Ahern <dsahern@gmail.com>	2018-10-02 18:38:21 -07:00
David Ahern	7b2e200679	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-09-28 09:52:41 -07:00
Stephen Hemminger	b45e300024	libnetlink: don't return error on success Change to error handling broke normal code. Fixes: `c60389e4f9` ("libnetlink: fix leak and using unused memory on error") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-09-25 10:08:48 +02:00
David Ahern	34212c73b7	Merge branch 'iproute2-master' into iproute2-next Conflicts: ip/iproute_lwtunnel.c In addition to merge conflict between `bd59e5b151` and `94a8722f2f`, updated the code added by the latter commit based on the change of the former (ie., added ret = to the new rta_addattr_l). Signed-off-by: David Ahern <dsahern@gmail.com>	2018-09-20 17:53:27 -07:00
Stephen Hemminger	c60389e4f9	libnetlink: fix leak and using unused memory on error If an error happens in multi-segment message (tc only) then report the error and stop processing further responses. This also fixes referring to the buffer after free. The sequence check is not necessary here because the response message has already been validated to be in the window of the sequence number of the iov. Reported-by: Mahesh Bandewar <mahesh@bandewar.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Mahesh Bandewar <maheshb@google.com>	2018-09-17 08:58:21 -07:00
Stephen Hemminger	b85076cd74	lib: introduce print_nl Common pattern in iproute commands is to print a line separator in non-json mode. Make that a simple function. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-09-11 08:29:33 -07:00
Dave Taht	abf70ef494	tc: support conversions to or from 64 bit nanosecond-based time Using a 32 bit field to represent time in nanoseconds results in a maximum value of about 4.3 seconds, which is well below many observed delays in WiFi and LTE, and barely in the ballpark for a trip past the Earth's moon, Luna. Using 64 bit time fields in nanoseconds allows us to simulate network diameters of several hundred light-years. However, only conversions to and from ns, us, ms, and seconds are provided. The iproute2 64 bit api uses signed values for time. Being able to represent positive or negative time allows us to calculate +/- deltas between, for example, the CLOCK_TAI and CLOCK_REALTIME clocks. Time related utility functions in tc_util.c are moved to lib/utils.c. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-08-30 11:04:38 -07:00
Mahesh Bandewar	5d5586b058	iproute: make clang happy These are primarily fixes for "string is not string literal" warnings / errors (with -Werror -Wformat-nonliteral). This should be a no-op change. I had to replace couple of print helper functions with the code they call as it was becoming harder to eliminate these warnings, however these helpers were used only at couple of places, so no major change as such. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-08-30 07:58:09 -07:00
Phil Sutter	515a766cd2	lib: Make check_enable_color() return boolean As suggested, turn return code into true/false although it's not checked anywhere yet. Fixes: `4d82962ccc` ("Merge common code for conditionally colored output") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-08-20 08:55:16 -07:00
Phil Sutter	ff1ab8edf8	Make colored output configurable Allow for -color={never,auto,always} to have colored output disabled, enabled only if stdout is a terminal or enabled regardless of stdout state. Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-08-20 08:54:06 -07:00
Phil Sutter	4d82962ccc	Merge common code for conditionally colored output Instead of calling enable_color() conditionally with identical check in three places, introduce check_enable_color() which does it in one place. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-08-15 09:55:27 -07:00
David Ahern	c044be6b34	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-08-13 07:47:21 -07:00
Lubomir Rintel	3655f788d3	lib/namespace: avoid double-mounting a /sys This partly reverts `8f0807023d`, bringing back the umount(/sys) attempt. In a LXC container we're unable to umount the sysfs instance, nor mount a read-write one. We still are able to create a new read-only instance. Nevertheless, it still makes sense to attempt the umount() even though the sysfs is mounted read-only. Otherwise we may end up attempting to mount a sysfs with the same flags as is already mounted, resulting in an EBUSY error (meaning "Already mounted"). Perhaps this is not a very likely scenario in real world, but we hit it in NetworkManager test suite and makes netns_switch() somewhat more robust. It also fixes the case, when /sys wasn't mounted at all. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-07-27 13:40:12 -07:00
David Ahern	a0bc57e1ef	Merge branch 'iproute2-master' into iproute2-next Conflicts: include/uapi/linux/bpf.h Signed-off-by: David Ahern <dsahern@gmail.com>	2018-07-25 10:08:04 -07:00
Mathieu Xhonneux	04cb3c0d43	ip: add support for seg6local End.BPF action This patch adds support for the End.BPF action of the seg6local lightweight tunnel. Functions from the BPF lightweight tunnel are re-used in this patch. Example: $ ip -6 route add fc00::18 encap seg6local action End.BPF endpoint obj my_bpf.o sec my_func dev eth0 $ ip -6 route show fc00::18 fc00::18 encap seg6local action End.BPF endpoint my_bpf.o:[my_func] dev eth0 metric 1024 pref medium v2: - re-use of print_encap_bpf_prog instead of fprintf - introduction of "endpoint" keyword for more consistency with others parameters Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-07-18 15:56:18 -07:00
Daniel Borkmann	f823f36012	bpf: implement btf handling and map annotation Implement loading of .BTF section from object file and build up internal table for retrieving key/value id related to maps in the BPF program. Latter is done by setting up struct btf_type table. One of the issues is that there's a disconnect between the data types used in the map and struct bpf_elf_map, meaning the underlying types are unknown from the map description. One way to overcome this is to add an annotation such that the loader will recognize the relation to both. BPF_ANNOTATE_KV_PAIR(map_foo, struct key, struct val); has been added to the API that programs can use. The loader will then pick the corresponding key/value type ids and attach it to the maps for creation. This can later on be dumped via bpftool for introspection. Example with test_xdp_noinline.o from kernel selftests: [...] struct ctl_value { union { __u64 value; __u32 ifindex; __u8 mac[6]; }; }; struct bpf_map_def __attribute__ ((section("maps"), used)) ctl_array = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(__u32), .value_size = sizeof(struct ctl_value), .max_entries = 16, .map_flags = 0, }; BPF_ANNOTATE_KV_PAIR(ctl_array, __u32, struct ctl_value); [...] Above could also further be wrapped in a macro. Compiling through LLVM and converting to BTF: # llc --version LLVM (http://llvm.org/): LLVM version 7.0.0svn Optimized build. Default target: x86_64-unknown-linux-gnu Host CPU: skylake Registered Targets: bpf - BPF (host endian) bpfeb - BPF (big endian) bpfel - BPF (little endian) [...] # clang [...] -O2 -target bpf -g -emit-llvm -c test_xdp_noinline.c -o - | llc -march=bpf -mcpu=probe -mattr=dwarfris -filetype=obj -o test_xdp_noinline.o # pahole -J test_xdp_noinline.o Checking pahole dump of BPF object file: # file test_xdp_noinline.o test_xdp_noinline.o: ELF 64-bit LSB relocatable, unknown arch 0xf7 version 1 (SYSV), with debug_info, not stripped # pahole test_xdp_noinline.o [...] 
struct ctl_value { union { __u64 value; /* 0 8 */ __u32 ifindex; /* 0 4 */ __u8 mac[0]; /* 0 0 */ }; /* 0 8 */ /* size: 8, cachelines: 1, members: 1 */ /* last cacheline: 8 bytes */ }; Now loading into kernel and dumping the map via bpftool: # ip -force link set dev lo xdp obj test_xdp_noinline.o sec xdp-test # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric/id:227 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever [...] # bpftool prog show id 227 227: xdp tag a85e060c275c5616 gpl loaded_at 2018-07-17T14:41:29+0000 uid 0 xlated 8152B not jited memlock 12288B map_ids 381,385,386,382,384,383 # bpftool map dump id 386 [{ "key": 0, "value": { "": { "value": 0, "ifindex": 0, "mac": [] } } },{ "key": 1, "value": { "": { "value": 0, "ifindex": 0, "mac": [] } } },{ [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-07-17 19:38:44 -07:00
Daniel Borkmann	b5cb33aec6	bpf: implement bpf to bpf calls support Implement missing bpf to bpf calls support. The loader will recognize .text section and handle relocation entries that are emitted by LLVM. First step is processing of map related relocation entries for .text section, and in a second step loader will copy .text section into program section and adjust call instruction offset accordingly. Example with test_xdp_noinline.o from kernel selftests: 1) Every function as __attribute__ ((always_inline)), rest left unchanged: # ip -force link set dev lo xdp obj test_xdp_noinline.o sec xdp-test # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric/id:233 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever [...] # bpftool prog dump xlated id 233 [...] 1669: (2d) if r3 > r2 goto pc+4 1670: (79) r2 = *(u64 *)(r10 -136) 1671: (61) r2 = *(u32 *)(r2 +0) 1672: (63) *(u32 *)(r1 +0) = r2 1673: (b7) r0 = 1 1674: (95) exit <-- 1674 insns total 2) Every function as __attribute__ ((noinline)), rest left unchanged: # ip -force link set dev lo xdp obj test_xdp_noinline.o sec xdp-test # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric/id:236 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever [...] # bpftool prog dump xlated id 236 [...] 
1000: (bf) r1 = r6 1001: (b7) r2 = 24 1002: (85) call pc+3 <-- pc-relative call insns 1003: (1f) r7 -= r0 1004: (bf) r0 = r7 1005: (95) exit 1006: (bf) r0 = r1 1007: (bf) r1 = r2 1008: (67) r1 <<= 32 1009: (77) r1 >>= 32 1010: (bf) r3 = r0 1011: (6f) r3 <<= r1 1012: (87) r2 = -r2 1013: (57) r2 &= 31 1014: (67) r0 <<= 32 1015: (77) r0 >>= 32 1016: (7f) r0 >>= r2 1017: (4f) r0 |= r3 1018: (95) exit <-- 1018 insns total Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-07-17 19:38:43 -07:00
Daniel Borkmann	6e5094dbb7	bpf: remove strict dependency on af_alg Do not bail out when AF_ALG is not supported by the kernel and only do so when a map is requested in object ns where we're calculating the hash. Otherwise, the loader can operate just fine, therefore lets not fail early when it's not needed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-07-17 19:38:40 -07:00
Daniel Borkmann	282a1fe1f8	bpf: move bpf_elf_map fixup notification under verbose No need to spam the user with this if it can be fixed gracefully anyway. Therefore, move it under verbose option. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-07-17 19:38:38 -07:00
Donald Sharp	a313455c6c	iproute2: Add support for a few routing protocols Add support for: BGP ISIS OSPF RIP EIGRP Routing protocols to iproute2. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-06-11 11:18:30 -07:00
David Ahern	45c0dd7286	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-06-01 08:17:23 -07:00
Stephen Hemminger	405e0c4ffe	tc: allow 0% for percent options Allowing 0% is sometimes useful for example in netem loss and drop or perhaps dropping all traffic in a HTB bin. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199745 Reported-by: stuartmarsden@gmail.com Fixes: `927e3cfb52` ("tc: B.W limits can now be specified in %.") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-05-17 16:20:50 -07:00
David Ahern	961d0991bc	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-05-16 14:10:27 -07:00
Luca Boccassi	9b13cc98f5	ip: do not drop capabilities if net_admin=i is set Users have reported a regression due to ip now dropping capabilities unconditionally. zerotier-one VPN and VirtualBox use ambient capabilities in their binary and then fork out to ip to set routes and links, and this does not work anymore. As a workaround, do not drop caps if CAP_NET_ADMIN (the most common capability used by ip) is set with the INHERITABLE flag. Users that want ip vrf exec to work do not need to set INHERITABLE, which will then only set when the calling program had privileges to give itself the ambient capability. Fixes: `ba2fc55b99` ("Drop capabilities if not running ip exec vrf with libcap") Signed-off-by: Luca Boccassi <bluca@debian.org>	2018-05-14 21:07:34 -07:00
Jakub Kicinski	0c0394ff83	bpf: don't offload perf array maps Perf arrays are handled specially by the kernel, don't request offload even when used by an offloaded program. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-05-05 11:08:00 -07:00
Toke Høiland-Jørgensen	4db2ff0db4	json_print: Fix hidden 64-bit type promotion print_uint() will silently promote its variable type to uint64_t, but there is nothing that ensures that the format string specifier passed along with it fits (and the function name suggest to pass "%u"). Fix this by changing print_uint() to use a native 'unsigned int' type, and introduce a separate print_u64() function for printing 64-bit values. All call sites that were actually printing 64-bit values using print_uint() are converted to use print_u64() instead. Since print_int() was already using native int types, just add a print_s64() to match, but don't convert any call sites. For symmetry, also add a print_luint() method (with no users). Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-04-25 11:08:55 -07:00
Stephen Hemminger	260a92afe6	bpf: fix warnings on gcc-8 about string truncation In theory, the path for BPF could exceed the 4K PATH_MAX. In practice, not really possible. But shut up gcc. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-04-20 10:38:00 -07:00
David Ahern	d42c7891d2	utils: Do not reset family for default, any, all addresses Thomas reported a change in behavior with respect to autodetecting address families. Specifically, 'ip ro add default via fe80::1' syntax was failing to treat fe80::1 as an IPv6 address as it did in prior releases. The root cause appears to be a change in family when the default keyword is parsed. 'default', 'any' and 'all' are relevant outside of AF_INET. Leave the family arg as is for these when setting addr. Fixes: `93fa12418d` ("utils: Always specify family and ->bytelen in get_prefix_1()") Reported-by: Thomas Deutschmann <whissi@gentoo.org> Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Serhey Popovych <serhe.popovych@gmail.com>	2018-04-16 17:00:48 -07:00
David Ahern	2c62a64d60	Merge branch 'iproute2-master' into iproute2-next Conflicts: bridge/mdb.c misc/ss.c tc/tc.c Signed-off-by: David Ahern <dsahern@gmail.com>	2018-04-02 10:47:34 -07:00
Steve Wise	8958a15c04	rdma: Add MR resource tracking information Sample output: Without CAP_NET_ADMIN: $ rdma resource show mr mrlen 65536 dev mlx4_0 mrlen 65536 pid 0 comm [nvme_rdma] dev cxgb4_0 mrlen 65536 pid 0 comm [nvme_rdma] With CAP_NET_ADMIN: # rdma resource show mr mrlen 65536 dev mlx4_0 rkey 0x12702 lkey 0x12702 iova 0x85724a000 mrlen 65536 pid 0 comm [nvme_rdma] dev cxgb4_0 rkey 0x68fe4e9 lkey 0x68fe4e9 iova 0x835b91000 mrlen 65536 pid 0 comm [nvme_rdma] Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-04-01 08:18:56 -07:00
Alexander Zubkov	c121807250	arrange prefix parsing code after redundant patches A problem was reported with parsing of prefixes all/any/default. Commit `7696f1097f` fixes the problem, but there were also other patches applied: `00b31a6b2e`, which were intended to fix the same problem. And they became redundant now. This patch reverts changes introduced by those redundant patches. Signed-off-by: Alexander Zubkov <green@msu.ru>	2018-03-29 08:42:04 -07:00
Stephen Hemminger	89e3c36b06	namespace: limit the length of namespace name to avoid snprintf overflow This fixes problem reported by gcc-8 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-03-29 08:40:26 -07:00
Stephen Hemminger	08a93b32f5	bpf: avoid compiler warnings about strncpy Use strlcpy to avoid cases where sizeof(buf) == strlen(buf) Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2018-03-29 08:32:48 -07:00
David Ahern	54eae5f76d	Merge branch 'iproute2-master' into iproute2-next Signed-off-by: David Ahern <dsahern@gmail.com>	2018-03-27 12:33:02 -07:00
Luca Boccassi	ba2fc55b99	Drop capabilities if not running ip exec vrf with libcap ip vrf exec requires root or CAP_NET_ADMIN, CAP_SYS_ADMIN and CAP_DAC_OVERRIDE. It is not possible to run unprivileged commands like ping as non-root or non-cap-enabled due to this requirement. To allow users and administrators to safely add the required capabilities to the binary, drop all capabilities on start if not invoked with "vrf exec". Update the manpage with the requirements. Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-03-27 11:48:23 -07:00
Alexander Zubkov	7696f1097f	treat "default" and "all"/"any" addresses differently Debian maintainer found that basic command: # ip route flush all No longer worked as expected which breaks user scripts and expectations. It no longer flushed all IPv4 routes. Recently behavior of "default" prefix parameter was corrected. But at the same time behavior of "all"/"any" was altered too, because they were the same branch of the code. As those parameters mean different things, they need to be treated differently in code too. This patch reflects the difference. Also after mentioned change, address parsing code was changed more and address family was set explicitly even for "all"/"any" addresses. And that broke matching conditions further. This patch fixes that too and returns AF_UNSPEC to "all"/"any" address. Now "default" is treated as top-level prefix (for example 0.0.0.0/0 in IPv4) and "all"/"any" always matches anything in exact, root and match modes. Reported-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Alexander Zubkov <green@msu.ru>	2018-03-27 08:58:26 -07:00
David Ahern	e9625d6aea	Merge branch 'iproute2-master' into iproute2-next Conflicts: bridge/mdb.c Updated bridge/bridge.c per removal of check_if_color_enabled by commit `1ca4341d2c` ("color: disable color when json output is requested") Signed-off-by: David Ahern <dsahern@gmail.com>	2018-03-13 17:48:10 -07:00
Stephen Hemminger	96303c25ee	Revert "iproute: "list/flush/save default" selected all of the routes" This reverts commit `9135c4d603`. Debian maintainer found that basic command: # ip route flush all No longer worked as expected which breaks user scripts and expectations. It no longer flushed all IPv4 routes. Reported-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-03-12 14:02:36 -07:00
Serhey Popovych	fe99adbca4	utils: Introduce and use nodev() helper routine There is a couple of places where we report error in case of no network device is found. In all of them we output message in the same format to stderr and either return -1 or 1 to the caller or exit with -1. Introduce new helper function nodev() that takes name of the network device caused error and returns -1 to its caller. Either call exit() or return to the caller to preserve behaviour before change. Use -nodev() in traffic control (tc) code to return 1. Simplify expression for checking for argument being 0/NULL in @if statement. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>	2018-03-11 17:58:36 -07:00
Stephen Hemminger	d9d8c8393e	json_writer: add SPDX Identifier (GPL-2/BSD-2) I wrote this code so put SPDX License on it and intentionally allow use in BSD code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-03-06 14:39:19 -08:00
David Ahern	3dec72672f	libnetlink: __rtnl_talk_iov should only loop max iovlen times William reported ip hanging and bisected to a recent commit for batching allowing more than 1 command to be sent per message. The loop over recvmsg should never cycle more than iovlen times -- 1 response for each command in the message. Fixes: `72a2ff3916` ("lib/libnetlink: Add a new function rtnl_talk_iov") Signed-off-by: David Ahern <dsahern@gmail.com>	2018-03-02 13:30:34 -08:00
Joe Stringer	a0405444f7	bpf: Print section name when hitting non ld64 issue It's useful to be able to tell which section is being processed in the ELF when this error is triggered, so print that detail. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-03-02 13:28:53 -08:00
Donald Sharp	728eb8d00b	ip: Properly display AF_BRIDGE address information for neighbor events The vxlan driver when a neighbor add/delete event occurs sends NDA_DST filled with a union: union vxlan_addr { struct sockaddr_in sin; struct sockaddr_in6 sin6; struct sockaddr sa; }; This eventually calls rt_addr_n2a_r which had no handler for the AF_BRIDGE family and "???" was being printed. Add code to properly display this data when requested. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-02-23 11:27:09 -08:00
Arkadi Sharshevsky	049c58539f	devlink: mnlg: Add support for extended ack Add support for extended ack. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-02-23 08:36:05 -08:00
Vincent Bernat	1ca4341d2c	color: disable color when json output is requested Instead of declaring -color and -json exclusive, ignore -color when -json is provided. The rationale is to allow to put -color in an alias for ip while still being able to use -json. -color is merely a presentation suggestion and we can assume there is nothing to color in the JSON output. Signed-off-by: Vincent Bernat <vincent@bernat.im> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-02-23 08:18:33 -08:00
Lubomir Rintel	8f0807023d	lib/namespace: don't try to mount rw /sys over a ro one It will fail with EPERM on Linux 4.15. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2018-02-23 08:18:06 -08:00
Stephen Hemminger	4328b687b4	ip: always print interface name in color Even in brief mode the interface name should be printed in color if desired. This makes output consistent across regular and brief mode. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-21 08:42:04 -08:00
Serhey Popovych	f5b50a18ae	utils: Introduce and use print_name_and_link() to print name@link There are at least three places implementing the same thing: two in ipaddress.c print_linkinfo() & print_linkinfo_brief() and one in bridge/link.c. They diverge from each other very little: bridge/link.c does not support JSON output at the moment and print_linkinfo_brief() does not handle IFLA_LINK_NETNS case. Introduce and use print_name_and_link() routine to handle name@link output in all possible variations; respect IFLA_LINK_NETNS attribute to handle case when link is in different namespace; use ll_idx_n2a() for interface name instead of "<nil>" to share logic with other code (e.g. ll_name_to_index() and ll_index_to_name()) supporting such template. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-16 08:14:22 -08:00
Serhey Popovych	fcac966526	utils: Introduce and use get_ifname_rta() Be consistent in handling of IFLA_IFNAME attribute in all places: if there is no attribute report bug to stderr and use ll_idx_n2a() as last measure to get name in "if%u" format instead of "<nil>". Use check_ifname() to validate network device name: this catches both unexpected return from kernel and ll_idx_n2a(). Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-16 08:14:20 -08:00
Serhey Popovych	0cec58dac4	lib: Correct object file dependencies Neither internal libnetlink nor libgenl depends on ll_map.o: prepare for upcoming changes that brings much more cleaner dependency between utils.o and ll_map.o. However ll_map.o depends on libnetlink.o functions so we need to provide libnetlink.a after libutil.a in LIBNETLINK at global Makefile. Tested using make clean && make -j4. No problems so far. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-16 08:14:18 -08:00
Serhey Popovych	fe269b6e7c	utils: Reimplement ll_idx_n2a() and introduce ll_idx_a2n() Now all users of ll_idx_n2a() replaced with ll_index_to_name() we can move its functionality to ll_index_to_name() and implement index to name conversion using snprintf() and "if%u". Use %u specifier in "if%..." template consistently: network device indexes are always greater than zero. Also introduce ll_idx_n2a() counterpart: ll_idx_a2n() that is used to translate name of the "if%u" form to index using sscanf(). Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-16 08:14:13 -08:00
Serhey Popovych	5433656705	ip: Use single variable to represent -pretty After commit `a233caa0aa` ("json: make pretty printing optional") I get following build failure: LINK rtmon ../lib/libutil.a(json_print.o): In function `new_json_obj': json_print.c:(.text+0x35): undefined reference to `show_pretty' collect2: error: ld returned 1 exit status make[1]: *** [rtmon] Error 1 make: *** [all] Error 2 It is caused by missing show_pretty variable in rtmon. On the other hand tc/tc.c there are two distinct variables and single matches() call that handles -pretty option thus setting show_pretty will never happen. Note that since commit `44dcfe8201` ("Change formatting of u32 back to default") show_pretty is used in tc/f_u32.c so this is first place where -pretty introduced. Furthermore other utilities like misc/ifstat.c and misc/nstat.c define pretty variable, however only for their own purposes. They both support JSON output and thus depend show_pretty in new_json_obj(). Assuming above use common variable to represent -pretty option, define it in utils.c and declare in utils.h that is commonly used. Replace show_pretty with pretty. Fixes: `a233caa0aa` ("json: make pretty printing optional") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-16 08:13:36 -08:00
Stephen Hemminger	6cbd9465bc	json: fix newline at end of array The json print library was toggling pretty print at the end of an array to workaround a bug in underlying json_writer. Instead, just fix json_writer to pretty print array correctly. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-10 08:18:49 -08:00
Stephen Hemminger	a233caa0aa	json: make pretty printing optional Since JSON is intended for programmatic consumption, it makes sense for the default output format to be concise as possible. For programmer and other uses, it is helpful to keep the pretty whitespace format; therefore enable it with -p flag. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-10 08:15:08 -08:00
Serhey Popovych	9a7bd5442b	ip: Introduce get_rtnl_link_stats_rta() to get link statistics Assume all statistics in ip(8) represented either by IFLA_STATS64 or IFLA_STATS are 64 bit. It is clear that we can store __u32 counters of @struct rtnl_link_stats in __u64 counters in @struct rtnl_link_stats64. New get_rtnl_link_stats_rta() follows __print_link_stats() behaviour on handling of stats attribute: copy no more than size of data structure and no less than attribute length zeroing rest. Drop print_link_stats32() as its functionality can be handled by 64bit variant. Move code from __print_link_stats() to print_link_stats64() and finally rename print_link_stats64() to __print_link_stats(). More users of introduced function will come in future. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-02-07 16:15:28 -08:00
Jakub Kicinski	097415d510	tc: red: JSON-ify RED output Make JSON output work with RED Qdiscs. Float/double printing helpers have to be added/uncommented to print the probability. Since TC stats in general are not split out to a separate object the xstats printed by this patch are not separated either. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-26 12:59:55 -08:00
Serhey Popovych	27c523e209	utils: Introduce get_addr_rta() and inet_addr_match_rta() First is used to get address from netlink attribute to inet_prefix data structure. Use memcpy() with constant value to let compiler optimize by replacing a call by inlining load/store instructions. Second is used to match address in given netlink attribute with one given as reference. It matches successfully if no attribute is given (@rta is NULL), reference address family is AF_UNSPEC or its length isn't given; fails if get_addr_rta() can't get attribute or its family does not match reference; calls inet_addr_match() to get final verdict. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-25 09:31:16 -08:00
Serhey Popovych	6caad8f505	ip: Get rid of inet_get_addr() Both geneve and vxlan modules are converted to use get_addr() we can replace inet_get_addr() in less problematic places and finally get rid of inet_get_addr(). Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-21 09:38:26 -08:00
Serhey Popovych	7bf5e876d0	utils: Fast inet address classification after get_addr() It looks very useful to receive additional information from get_addr_1() and get_addr() about address to simplify caller and get rid of code duplications. For now following information can be returned: 1) address is unspecified (zero) 2) address is multicast 3) address is internet: family is either AF_INET or AF_INET6. More information can be added in the future. Introduce inline helpers to make code using this new address classification interface more self explaining: bool is_addrtype_inet(inet_prefix addr) true if @addr is inet address bool is_addrtype_inet_unspec(inet_prefix addr) true if @addr is unspecified inet address bool is_addrtype_inet_multi(inet_prefix addr) true if @addr is multicast inet address bool is_addrtype_inet_not_unspec(inet_prefix addr) true if @addr is not unspecified inet address false if @addr is not inet or unspecified inet bool is_addrtype_inet_not_multi(inet_prefix *addr) true if @addr is not multicast inet address false if @addr is not inet or multicast inet Last two are useful for case when we need inet address that is not unspecified or multicast. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-21 09:38:21 -08:00
Serhey Popovych	93fa12418d	utils: Always specify family and ->bytelen in get_prefix_1() Handle default/all/any special case in get_addr_1() to setup ->family and ->bytelen correctly. Make get_addr_1() return ->bitlen == -2 instead of -1 to distinguish default/all/any special case from the rest: it is safe because all callers check ->bitlen < 0, not explicit value -1. Reduce indentation by one level and get rid of goto/label to make code more readable. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-21 09:38:19 -08:00
Serhey Popovych	f2522007d8	utils: Always specify family for address in get_addr_1() Set ->family correctly when string representing address is "default", "all" or "any": get_addr_1() might be called with AF_UNSPEC (e.g. get_addr() -> get_addr_1()). Extend support for zero address to all address families, not only AF_INET and AF_INET6 when one explicitly given as @family: use af_byte_len() to correctly set address length. Still assume AF_INET when @family is AF_UNSPEC. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-21 09:38:17 -08:00
Jakub Kicinski	5691e6bc58	bpf: support map offload When program is loaded with a specified ifindex, use that ifindex also when creating maps. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-19 12:35:41 -08:00
Chris Mi	72a2ff3916	lib/libnetlink: Add a new function rtnl_talk_iov rtnl_talk can only send a single message to kernel. Add a new function rtnl_talk_iov that can send multiple messages to kernel. rtnl_talk_iov takes struct iovec * and iovlen as arguments. Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2018-01-14 09:03:33 -08:00
Serhey Popovych	1ed8a5ca87	utils: ll_addr: Handle ARPHRD_IP6GRE in ll_addr_n2a() ll_addr_n2a() correctly prints tunnel endpoints for gre, ipip, sit and ip6tnl, but not for ip6gre. Fix this by adding ARPHRD_IP6GRE to IPv6 tunnel endpoint address conversion. Before: ------- $ ip link show ... 18: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default link/tunnel6 :: brd :: 19: ip6gre0: <NOARP> mtu 1456 qdisc noop state DOWN mode DEFAULT group default link/gre6 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 brd \ 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 After: ------ $ ip link show ... 18: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default link/tunnel6 :: brd :: 19: ip6gre0: <NOARP> mtu 1456 qdisc noop state DOWN mode DEFAULT group default link/gre6 :: brd :: Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>	2017-12-26 09:07:42 -08:00
Alexander Zubkov	9135c4d603	iproute: "list/flush/save default" selected all of the routes When running "ip route list default" and not specifying address family, one will get all of the routes instead of just default only. The same is for "exact default" and "match default". It behaves in such a way because default route with unspecified family has the same all-zeroes value like no prefix specified at all. Thus following code blindly ignores the fact, that prefix was actually specified. This patch adds the flag PREFIXLEN_SPECIFIED to the default route too. And then checks its value when filtering routes. Signed-off-by: Alexander Zubkov <green@msu.ru> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-12-19 08:23:09 -08:00
Stephen Hemminger	bd9cea5d8c	utils: fix makeargs stack overflow The makeargs() function did not handle end of string correctly and would reference past end of string. Found by fuzzing with ASAN. Reported-by: Bug Basher <iamliketohack@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-12-18 11:19:48 -08:00
Jakub Kicinski	65fdae3d18	bpf: allow loading programs for a specific ifindex For BPF offload we need to specify the ifindex when program is loaded now. Extend the bpf common code to accommodate that. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Jakub Kicinski	4a847fcb51	bpf: expose bpf_parse_common() and bpf_load_common() Expose bpf_parse_common() and bpf_load_common() functions for those users who may want to modify the parameters to load after parsing is done. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Jakub Kicinski	399db8392b	bpf: rename bpf_parse_common() to bpf_parse_and_load_common() bpf_parse_common() parses and loads the program. Rename it accordingly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Jakub Kicinski	3f0b9e620c	bpf: split parse from program loading Parsing command line is currently done together with potentially loading a new eBPF program. This makes it more difficult to provide additional parameters for loading (which may come after the eBPF program info on the command line). Split the two (only internally for now). Verbose parameter has to be saved in struct bpf_cfg_in to be carried between the stages. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Jakub Kicinski	51be754690	bpf: allocate opcode table in struct bpf_cfg_in struct bpf_cfg_in already carries a pointer to sock_filter ops. It's currently set to a local variable in bpf_parse_opt_tbl(), shared between parsing and loading stages. Move the array entirely to struct bpf_cfg_in, this will allow us to split parsing and loading. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Jakub Kicinski	f20ff2f195	bpf: keep parsed program mode in struct bpf_cfg_in bpf_parse() will parse command line arguments to find out the program mode. This mode will later be needed at loading time. Instead of keeping it locally add it to struct bpf_cfg_in, this will allow splitting parsing and loading stages. enum bpf_mode has to be moved to the header file, because C doesn't allow forward declaration of enums. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Jakub Kicinski	658cfebc27	bpf: pass program type in struct bpf_cfg_in Program type is needed both for parsing and loading of the program. Parsing may also induce the type based on signatures from __bpf_prog_meta. Instead of passing the type around keep it in struct bpf_cfg_in. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-26 11:57:57 -08:00
Stephen Hemminger	6054c1ebf7	SPDX license identifiers For all files in iproute2 which do not have an obvious license identification, mark them with SPDX GPL-2 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-11-24 12:21:35 -08:00
Nishanth Devarajan	927e3cfb52	tc: B.W limits can now be specified in %. This patch adapts the tc command line interface to allow bandwidth limits to be specified as a percentage of the interface's capacity. Adding this functionality requires passing the specified device string to each class/qdisc which changes the prototype for a couple of functions: the .parse_qopt and .parse_copt interfaces. The device string is a required parameter for tc-qdisc and tc-class, and when not specified, the kernel returns ENODEV. In this patch, if the user tries to specify a bandwidth percentage without naming the device, we return an error from userspace. Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>	2017-11-24 11:22:13 -08:00
Jakub Kicinski	f6a54d72a5	bpf: initialize the verifier log If program loading fails before verifier prints its first message, the verifier log will not be initialized. Always set the first character of the log buffer to zero to make sure we don't dump non-printable characters to the terminal. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-23 20:47:38 -08:00
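
Commit fe269b6e7c above describes converting an interface index to a name of the form "if%u" with snprintf(), and the reverse with sscanf(). The sketch below illustrates that round trip only; it is not the iproute2 code. The helper names idx_to_ifname() and ifname_to_idx() are hypothetical, and rejecting trailing characters via an extra %c conversion is an assumption of this sketch, not something the commit message states.

```c
#include <stdio.h>

/*
 * Hypothetical helpers illustrating the "if%u" convention from the
 * commit message; these are NOT the actual iproute2 functions.
 */

/* Format an interface index as "if<index>".
 * Returns buf on success, or NULL if the result does not fit in len bytes. */
static const char *idx_to_ifname(unsigned int idx, char *buf, size_t len)
{
	int n = snprintf(buf, len, "if%u", idx);

	if (n < 0 || (size_t)n >= len)
		return NULL;
	return buf;
}

/* Parse a name of the form "if<index>".
 * Returns the index, or 0 when the name does not match; 0 can serve as
 * the failure value because device indexes are greater than zero. */
static unsigned int ifname_to_idx(const char *name)
{
	unsigned int idx;
	char extra;

	/* The trailing %c only matches if something follows the number,
	 * so exactly one successful conversion means a clean "if<index>". */
	if (sscanf(name, "if%u%c", &idx, &extra) != 1)
		return 0;
	return idx;
}
```

With these helpers, idx_to_ifname(42, buf, sizeof(buf)) fills buf with "if42", and ifname_to_idx("if42") yields 42, while a name such as "eth0" yields 0. Note that %u in sscanf() skips leading whitespace and accepts a sign, so a stricter parser would need additional checks.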

539 Commits