Since cgroups are not namespace aware, the directory heirarchy used by
ip vrf should account for network namespaces. In this case, change the
path from CGRP/BASE/vrf/NAME to CGRP/BASE/NETNS/vrf/NAME where CGRP is
the cgroup2 mount path, BASE in any base heirarchy inherited before VRF
is applied and NAME is the VRF name.
The intent is as follows: a user logs into the box into some namespace
with a name known to iproute2. Some other policy may have put the
process into a BASE heirarchy. From there the user executes a task in
a VRF and in doing so the task heirarchy becomes CGRP/BASE/NETNS/vrf/NAME.
The namespace level is omitted for the default namespace.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Move guts of netns_identify into a standalone function that returns
the netns name in a given buffer.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add support for VRF in a pre-existing hierarchy. For example, if the
current process is running in CGRP/foo/bar, the 'ip vrf exec NAME CMD'
should run CMD in the cgroup CGRP/foo/bar/vrf/NAME.
When listing process ids in a VRF, search for the directory vrf/NAME
regardless of base path (foo/bar/vrf/NAME and vrf/NAME) are still
running against the same vrf NAME.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
This patch implements support for the IFLA_BRPORT_FLUSH attribute
in iproute2 so it can flush bridge slave's fdb dynamic entries.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_MLD_VERSION
attribute in iproute2 so it can change the mcast mld version.
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This patch implements support for the IFLA_BR_MCAST_IGMP_VERSION
attribute in iproute2 so it can change the mcast igmp version.
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This patch implements support for the IFLA_BR_MCAST_STATS_ENABLED
attribute in iproute2 so it can enable/disable mcast stats accounting.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_VLAN_STATS_ENABLED
attribute in iproute2 so it can enable/disable vlan stats accounting.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_FDB_FLUSH attribute
in iproute2 so it can flush bridge fdb dynamic entries.
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This patch adds a new field that is printed in the end of the line which
denotes the real entry state. Before this patch an entry's IIF could
disappear and it would look like an unresolved one (iif = unresolved):
(3.0.16.1, 225.11.16.1) Iif: unresolved
with no way to really distinguish it from an unresolved entry.
After the patch if the dumped entry has RTNH_F_UNRESOLVED set we get:
(3.0.16.1, 225.11.16.1) Iif: unresolved State: unresolved
for unresolved entries and:
(0.0.0.0, 225.11.11.11) Iif: eth4 Oifs: eth3 State: resolved
for resolved entries after the OIF list. Note that "State:" has ':' in
it so it cannot be mistaken for an interface name.
And for the example above, we'd get:
(0.0.0.0, 225.11.11.11) Iif: unresolved State: resolved
Also when dumping all routes via ip route show table all,
it will show up as:
multicast 225.11.16.1/32 from 3.0.16.1/32 table default proto 17 unresolved
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
To specify multiple nexthops in a route the user is expected to use the
"nexthop" keyword which ip route uses to create the RTA_MULTIPATH.
However, ip route always accepts multiple 'via' keywords where only the
last one is used in the route leading to confusion. For example, ip
accepts this syntax:
$ ip ro add vrf red 1.1.1.0/24 via 10.100.1.18 via 10.100.2.18
but the route entered inserted by the kernel is just the last gateway:
1.1.1.0/24 via 10.100.2.18 dev eth2
which is not the full request from the user. Detect the presense of
multiple 'via' and give the user a hint to add nexthop:
$ ip ro add vrf red 1.1.1.0/24 via 10.100.1.18 via 10.100.2.18
Error: argument "via" is wrong: use nexthop syntax to specify multiple via
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fix "Policy buffer overflow" when trying to use deleteall with many
policies installed.
Signed-off-by: Alexander Heinlein <alexander.heinlein@secunet.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Entries with long vhost names in /proc/net/igmp have no whitespace
between name and colon, so sscanf() adds it to vhost and
'ip maddr show iface' doesn't include inet result.
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Show ipv6 tunnel keys on presence of GRE_KEY flag for tunnel types
other than GRE. Aligns ipv6 behaviour with ipv4.
Signed-off-by: dforster@brocade.com
Next up a non-root user gets various bpf related error messages:
$ ip vrf exec mgmt bash
Failed to load BPF prog: 'Operation not permitted'
Kernel compiled with CGROUP_BPF enabled?
Catch the EPERM error and do not show the kernel config option.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
A vrf is local to a namespace. Drop any VRF association before trying
to exec a command in the new namespace.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Path in vrf_switch for "default" VRF is supposed to be MNT/vrf not
MNT/default. Also, default_vrf flag is redundant with ifindex. Remove
the flag in favor of ifindex != 0.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Split ipvrf_identify into arg processing and a function that does the
actual cgroup file parsing. The latter function is used in a follow
on patch.
In the process, convert the reading of the cgroups file to use fopen
and fgets just in case the file ever grows beyond 4k. Move printing
of any error message and the vrf name to the caller of the new
vrf_identify.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Move the hint about CGROUP_BPF enabled to prog_load failure since
it fails before the attach. Update the existing error message to
print to stderr.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
'ip vrf' follows the user semnatics established by 'ip netns'.
The 'ip vrf' subcommand supports 3 usages:
1. Run a command against a given vrf:
ip vrf exec NAME CMD
Uses the recently committed cgroup/sock BPF option. vrf directory
is added to cgroup2 mount. Individual vrfs are created under it. BPF
filter attached to vrf/NAME cgroup2 to set sk_bound_dev_if to the VRF
device index. From there the current process (ip's pid) is addded to
the cgroups.proc file and the given command is exected. In doing so
all AF_INET/AF_INET6 (ipv4/ipv6) sockets are automatically bound to
the VRF domain.
The association is inherited parent to child allowing the command to
be a shell from which other commands are run relative to the VRF.
2. Show the VRF a process is bound to:
ip vrf id
This command essentially looks at /proc/pid/cgroup for a "::/vrf/"
entry with the VRF name following.
3. Show process ids bound to a VRF
ip vrf pids NAME
This command dumps the file MNT/vrf/NAME/cgroup.procs since that file
shows the process ids in the particular vrf cgroup.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
iplink_vrf has 2 functions used to validate a user given device name is
a VRF device and to return the table id. If the user string is not a
device name ip commands with a vrf keyword show a confusing error
message: "RTNETLINK answers: No such device".
Add a variant of rtnl_talk that does not display the "RTNETLINK answers"
message and update iplink_vrf to use it.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Adds support to configure BPF programs as nexthop actions via the LWT
framework.
Example:
ip route add 192.168.253.2/32 \
encap bpf out obj lwt_len_hist_kern.o section len_hist \
dev veth0
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Now that we made the BPF loader generic as a library, reuse it
for loading XDP programs as well. This basically adds a minimal
start of a facility for iproute2 to load XDP programs. There
currently only exists the xdp1_user.c sample code in the kernel
tree that sets up netlink directly and an iovisor/bcc front-end.
Since we have all the necessary infrastructure in place already
from tc side, we can just reuse its loader back-end and thus
facilitate migration and usability among the two for people
familiar with tc/bpf already. Sharing maps, performing tail calls,
etc works the same way as with tc. Naturally, once kernel
configuration API evolves, we will extend new features for XDP
here as well, resp. extend dumping of related netlink attributes.
Minimal example:
clang -target bpf -O2 -Wall -c prog.c -o prog.o
ip [-force] link set dev em1 xdp obj prog.o # attaching
ip [-d] link # dumping
ip link set dev em1 xdp off # detaching
For the dump, intention is that in the first line for each ip
link entry, we'll see "xdp" to indicate that this device has an
XDP program attached. Once we dump some more useful information
via netlink (digest, etc), idea is that 'ip -d link' will then
display additional relevant program information below the "link/
ether [...]" output line for such devices, for example.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
In case of an older kernel that doesn't set L2TP_ATTR_UDP_ZERO_CSUM6_{RX,TX}
the old hard-coded value is being preserved, since the attribute flag will be
missing.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
L2TP_ATTR_UDP_CSUM is read by the kernel as a NLA_FLAG value,
but is validated as a NLA_U8, so we will write it as an u8,
but the value isn't actually being read by the kernel.
It is written by the kernel as a NLA_U8, so we will read as
such.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
L2TP_ATTR_RECV_SEQ and L2TP_ATTR_SEND_SEQ are declared as NLA_U8
attributes in the kernel, so let's threat them accordingly.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
udp6_csum_{tx,rx}, tunnel and session are the only ones
currently used.
recv_seq, send_seq, lns_mode and data_seq are partially
implemented in a useless way.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
Adjusting iproute2 utility to support new macvlan link type mode called
"source".
Example of commands that can be applied:
ip link add link eth0 name macvlan0 type macvlan mode source
ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr del 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr flush
ip -details link show dev macvlan0
Based on previous work of Stefan Gula <steweg@gmail.com>
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Cc: steweg@gmail.com
v5:
- rebase and fix checkpatch
v4:
- add MACADDR_SET support
- skip FLAG_UNICAST / FLAG_UNICAST_ALL as this is not upstream
- fix man page
- Support adding, deleting and showing IP rules with UID ranges.
- Support querying per-UID routes via "ip route get uid <UID>".
UID range routing was added to net-next in 4fb7450683 ("Merge
branch 'uid-routing'")
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Commit 7b8179c780 ("iproute2: Add new command to ip link to
enable/disable VF spoof check") tried to add support for
IFLA_VF_SPOOFCHK in a backwards-compatible manner, but aparently overdid
it: parse_rtattr_nested() handles missing attributes perfectly fine in
that it will leave the relevant field unassigned so calling code can
just compare against NULL. There is no need to layback from the previous
(IFLA_VF_TX_RATE) attribute to the next to check if IFLA_VF_SPOOFCHK is
present or not. To the contrary, it establishes a potentially incorrect
assumption of these two attributes directly following each other which
may not be the case (although up to now, kernel aligns them this way).
This patch cleans up the code to adhere to the common way of checking
for attribute existence. It has been tested to return correct results
regardless of whether the kernel exports IFLA_VF_SPOOFCHK or not.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Greg Rose <grose@lightfleet.com>
Recently a new per-port flag was added which controls the flooding of
unknown multicast, this patch adds support for controlling it via iproute2.
It also updates the man pages with information about the new flag.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Adjusting iproute2 utility to support new macvlan link type mode called
"source".
Example of commands that can be applied:
ip link add link eth0 name macvlan0 type macvlan mode source
ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr del 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr flush
ip -details link show dev macvlan0
Based on previous work of Stefan Gula <steweg@gmail.com>
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Cc: steweg@gmail.com
iprule_flush() and iprule_list_or_save() both call function
rtnl_wilddump_request() and rtnl_dump_filter(). So merge them
together just like other files do.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Introduce a new API that exposes a list of vlans per VF (IFLA_VF_VLAN_LIST),
giving the ability for user-space application to specify it for the VF as
an option to support 802.1ad (VST QinQ).
We introduce struct vf_vlan_info, which extends struct vf_vlan and adds
an optional VF VLAN proto parameter.
Default VLAN-protocol is 802.1Q.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older kernel versions.
Suitable ip link tool command examples:
- Set vf vlan protocol 802.1ad (S-TAG)
ip link set eth0 vf 1 vlan 100 proto 802.1ad
- Set vf vlan S-TAG and vlan C-TAG (VST QinQ)
ip link set eth0 vf 1 vlan 100 proto 802.1ad vlan 30 proto 802.1Q
- Set vf to VST (802.1Q) mode
ip link set eth0 vf 1 vlan 100 proto 802.1Q
- Or by omitting the new parameter (backward compatible)
ip link set eth0 vf 1 vlan 100
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Add support to dump the mroute cache entry age if the show_stats (-s)
switch is provided.
Example:
$ ip -s mroute
(0.0.0.0, 239.10.10.10) Iif: eth0 Oifs: eth0
0 packets, 0 bytes, Age 245.44
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
The calling of netns_map_init() before command parsing introduced
a performance issue with large number of namespaces.
As commands such as add, del and exec do not need to iterate through
/var/run/netns it would be good not no build the cache before executing
these commands.
Example:
unpatched:
time seq 1 1000 | xargs -n 1 ip netns add
real 0m16.832s
user 0m1.350s
sys 0m15.029s
patched:
time seq 1 1000 | xargs -n 1 ip netns add
real 0m3.859s
user 0m0.132s
sys 0m3.205s
Signed-off-by: Anton Aksola <aakso@iki.fi>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
The original bond/bridge/vrf and slaves use same id, which make people
confused. Use bond/bridge/vrf_slave as id name will make code more clear.
Acked-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
In ip monitor, netns_map_init will check getnsid is supported or not.
But when /proc/self/ns/net does not exist, we just print out error
messages and exit. So user cannot use ip monitor anymore when
CONFIG_NET_NS is disabled:
# ip monitor
open("/proc/self/ns/net"): No such file or directory
If open "/proc/self/ns/net" failed, set have_rtnl_getnsid to false.
Fixes: d652ccbf81 ("netns: allow to dump and monitor nsid")
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
ftell() may return -1 in error case, which is not handled and
therefore pass a negative offset to fseek(). The return code of
fseek() is also not checked.
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
The new mode 'l3s' can be set like -
ip link add link <master> dev <IPvlan-slave> type ipvlan mode l3s
e.g. ip link add link eth0 dev ipvl0 type ipvlan mode l3s
Also did some trivial code restructuring.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
the maximum possible ICV length in a MACsec frame is 16 octects, not 32:
fix get_icvlen() accordingly, so that a proper error message is displayed
in case input 'icvlen' is greater than 16.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
use get_be64() in place of get_u64() when parsing input 'sci' parameter,
so that 'sci' can be entered using network byte order regardless the
endianness of target system; use ntohll() when printing out 'sci'. While
at it, improve documentation of 'sci' in ip-link.8.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
remove hardcoded base 10 parsing of 'port' parameter, update man page
and fix usage() functions as well. Fix misleading line in man page that
theoretically allowed specifying 'port' keyword right after 'sci' keyword.
Provide documentation of 'address' parameter in man pages and in usage()
functions as well.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Show which processes are using which tun/tap devices, e.g.:
$ ip -d tuntap
tun0: tun
Attached to processes: vpnc(9531)
vnet0: tap vnet_hdr
Attached to processes: qemu-system-x86(10442)
virbr0-nic: tap UNKNOWN_FLAGS:800
Attached to processes:
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
If we have multicast routes and do ip route show table all we'll get the
following output:
...
multicast ???/32 from ???/32 table default proto static iif eth0
The "???" are because the rtm_family is set to RTNL_FAMILY_IPMR instead
(or RTNL_FAMILY_IP6MR for ipv6). Add a simple workaround that returns the
real family based on the rtm_type (always RTN_MULTICAST for ipmr routes)
and the rtm_family. Similar workaround is already used in ipmroute, and
we can use this helper there as well.
After the patch the output is:
multicast 239.10.10.10/32 from 0.0.0.0/32 table default proto static iif eth0
Also fix a minor whitespace error and switch to tabs.
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
The code is a bit messy, as it starts with space after text and at some
point switches to space before text. But either way, printing space
before *and* after text almost certainly leads to printing more
whitespace than necessary.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Prior to this patch, If one route entry's RTA_PREFSRC and RTA_GATEWAY
both were NULL, it was supposed to be restored ONLY as a local address.
But as it didn't check tb[RTA_PREFSRC] when restoring local networks,
rtattr_cmp would return a success if it was NULL, this route entry would
be restored again as a local network.
This patch is to add tb[RTA_PREFSRC] check when restoring local networks.
Fixes: 74af8dd962 ("ip route: restore route entries in correct order")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Phil Sutter <phil@nwl.cc>
Currently, the `ip ila` command tries to initialize a genl context
even when we just want to see the help for the command, which doesn't
require to talk to the kernel at all.
Delay genl initialization, which can fail if the module isn't loaded,
until the point where we will actually need it.
Fixes: ec71cae0bb ("ila: Support for configuring ila to use netfilter hook")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Currently, the `ip fou` command tries to initialize a genl context even
when we just want to see the help for the command, which doesn't require
to talk to the kernel at all.
Delay genl initialization, which can fail if the module isn't loaded,
until the point where we will actually need it.
Fixes: 6928747b6e ("ip fou: Support to configure foo-over-udp RX")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Currently, the `ip macsec` command tries to initialize a genl context
even when we just want to see the help for the command, which doesn't
require to talk to the kernel at all.
Delay genl initialization, which can fail if the module isn't loaded,
until the point where we will actually need it.
Fixes: b26fc590ce ("ip: add MACsec support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
All users of genl have the same code to open a genl socket and resolve
the family for their specific protocol. Introduce a helper to initialize
the handle, and use it in all the genl code.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
since kernel driver has valid default values for 'cipher' and 'icvlen',
there is no need for requiring users to specify both of them when a new
link is added. Also, prompt an error message and exit with appropriate
exit status in case of unsupported cipher suite.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
fix output of "ip address help" and "ip link help". Update TYPE list in man
pages ip-address.8 and ip-link.8 as well.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Since parse_rtattr_flags() calls memset already, there is no need for
callers to do so themselves.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
This big patch was compiled by vimgrepping for memset calls and changing
to C99 initializer if applicable. One notable exception is the
initialization of union bpf_attr in tc/tc_bpf.c: changing it would break
for older gcc versions (at least <=3.4.6).
Calls to memset for struct rtattr pointer fields for parse_rtattr*()
were just dropped since they are not needed.
The changes here allowed the compiler to discover some unused variables,
so get rid of them, too.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Sometimes we cannot restore route entries, because in kernel
[1] fib_check_nh()
[2] fib_valid_prefsrc()
cause some routes to depend on existence of others while adding.
For example, we saved all the routes, and flushed all tables
[a] default via 192.168.122.1 dev eth0
[b] 192.168.122.0/24 dev eth0 src 192.168.122.21
[c] broadcast 127.0.0.0 dev lo table local src 127.0.0.1
[d] local 127.0.0.0/8 dev lo table local src 127.0.0.1
[e] local 127.0.0.1 dev lo table local src 127.0.0.1
[f] broadcast 127.255.255.255 dev lo table local src 127.0.0.1
[g] broadcast 192.168.122.0 dev eth0 table local src 192.168.122.21
[h] local 192.168.122.21 dev eth0 table local src 192.168.122.21
[i] broadcast 192.168.122.255 dev eth0 table local src 192.168.122.21
Now start to restore them:
If we want to add [a], we have to add [b] first, as [1] and
'via 192.168.122.1' in [a].
If we want to add [b], we have to add [h] first, as [2] and
'src 192.168.122.21' in [b].
So the correct order to restore should be like:
[e][h] -> [b][c][d][f][g][i] -> [a]
This patch fixes it by traversing the file 3 times, it only restores
part of them in each run according to the following conditions, to
make sure every entry can be restored successfully.
1. !gw && (!fib_prefsrc || fib_prefsrc == cfg->fc_dst)
2. !gw && (fib_prefsrc != cfg->fc_dst)
3. gw
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Add two NLA's that allow configuration of Infiniband node or port GUIDs
by referencing the IPoIB net device set over the physical function. The
format to be used is as follows:
ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78
Signed-off-by: Eli Cohen <eli@mellanox.com>
Add vrf keyword to 'ip route' commands. Allows:
1. Users can list routes by VRF name:
$ ip route show vrf NAME
VRF tables have all routes including local and broadcast routes.
The VRF keyword filters LOCAL and BROADCAST routes; to see all
routes the table option can be used. Or to see local routes only
for a VRF:
$ ip route show vrf NAME type local
2. Add or delete a route for a VRF:
$ ip route {add|delete} vrf NAME <route spec>
3. Do a route lookup for a VRF:
$ ip route get vrf NAME ADDRESS
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add ipvrf_get_table to lookup table id for device name. Returns 0
on any error or if name is not a VRF device.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add vrf keyword to 'ip neigh' commands. Allows listing neighbor
entries for all links associated with a given VRF.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add vrf keyword to 'ip link' and 'ip addr' commands (common list code).
Allows:
1. Adding a link to a VRF
$ ip link set NAME vrf NAME
Removing a link from a VRF still uses 'ip link set NAME nomaster'
2. Showing links associated with a VRF:
$ ip link show vrf NAME
3. List addresses associated with links in a VRF
$ ip -br addr show vrf red
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Since the function won't ever change the data 'kind' is pointing at, it
can sanely be made const.
Fixes: e0513807f6 ("ip-address: Support filtering by slave type, too")
Suggested-by: Stephen Hemminger <shemming@brocade.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Currently a timeout is multiplied by HZ in user-space and
then it multiplied by HZ in kernel-space.
$ ./ip/ip r add 2002::0/64 dev veth1 expires 10
$ ./ip/ip -6 r
2002::/64 dev veth1 metric 1024 linkdown expires 996sec pref medium
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Stephen Hemminger <shemming@brocade.com>
Fixes: 68eede2505 ("route: allow routes to be configured with expire values")
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
This patch allows to query all interfaces enslaved to a bridge or bond
using the following syntax:
| ip addr show type bridge_slave
Filtering has to be done in userspace since the kernel does not support
filtering on IFLA_INFO_SLAVE_KIND.
Functionality introduced in this patch is not fully complete since it
does not allow to match on type and slave type at the same time, but it
doesn't prevent implementing a dedicated slave_type match, either.
Signed-off-by: Phil Sutter <phil@nwl.cc>
On Tue, Jun 21, 2016 at 06:18 PM CEST, Phil Sutter <phil@nwl.cc> wrote:
> By combining the attribute extraction and check for existence, the
> additional indentation level in the 'else' clause can be avoided.
>
> In addition to that, common actions for 'daddr' are combined since the
> function returns if neither of the branches are taken.
>
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
> ip/tcp_metrics.c | 45 ++++++++++++++++++---------------------------
> 1 file changed, 18 insertions(+), 27 deletions(-)
>
> diff --git a/ip/tcp_metrics.c b/ip/tcp_metrics.c
> index f82604f458ada..899830c127bcb 100644
> --- a/ip/tcp_metrics.c
> +++ b/ip/tcp_metrics.c
> @@ -112,47 +112,38 @@ static int process_msg(const struct sockaddr_nl *who, struct nlmsghdr *n,
> parse_rtattr(attrs, TCP_METRICS_ATTR_MAX, (void *) ghdr + GENL_HDRLEN,
> len);
>
> - a = attrs[TCP_METRICS_ATTR_ADDR_IPV4];
> - if (a) {
> + if ((a = attrs[TCP_METRICS_ATTR_ADDR_IPV4])) {
Copy the pointer inside the branch?
Same gain on indentation while keeping checkpatch happy.
I only compile-tested the patch below.
Thanks,
Jakub
I forgot to change the variable in the conditional, too.
Fixes: 8fe58d5894 ("iplink: Check address length via netlink")
Signed-off-by: Phil Sutter <phil@nwl.cc>
This is a feature which was lost during the conversion to netlink
interface: If the device exists and a user tries to change the link
layer address, query the kernel for the old address first and reject the
new one if sizes differ.
This patch adds the same check when setting VF address by assuming same
length as PF device.
Note that at least for VFs the check can't be done in kernel space since
struct ifla_vf_mac lacks a length field and due to netlink padding the
exact size can't be communicated to the kernel.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Kernel commit 96c63fa7393d ("net: Add l3mdev rule") added support for
the FRA_L3MDEV attribute. The attribute enables use of l3mdev rules
which mean 'get table id from l3 master device'. This patch adds
support to iproute2 to show, add and delete rules with this attribute.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Not sure why this was limited to ip-link before. It is semantically
equal to the 'master' keyword, which is not restricted at all.
The man page and help text adjustments include the 'master' keyword as
well since that is also supported but wasn't documented before.
Cc: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Extend ip-link to create MACsec devices
ip link add link <master> <macsec> type macsec [options]
Add `ip macsec` command to configure receive-side secure channels and
secure associations within a macsec netdevice.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
Similar to the Linux kernel and perf add infrastructure to reduce the
amount of output tossed to a user during a build. Full build output
can be obtained with 'make V=1'
Builds go from:
make[1]: Leaving directory `/home/dsa/iproute2.git/lib'
make[1]: Entering directory `/home/dsa/iproute2.git/ip'
gcc -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -c -o ip.o ip.c
gcc -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -c -o ipaddress.o ipaddress.c
to:
...
AR libutil.a
ip
CC ip.o
CC ipaddress.o
...
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
A new HSR version was added in 4.7 that can be enabled
via iproute2. Per default the old version is selected,
however, with "ip link add [..] type hsr [..] version 1"
the newer version can be enabled.
Signed-off-by: Peter Heise <peter.heise@airbus.com>
Kernel gained support for filtering link dumps with commit dc599f76c22b
("net: Add support for filtering link dump by master device and kind").
Add support to ip link command. If a user passes master device or
kind to ip link command they are added to the link dump request message.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Since we can only configure unicast, we probably want to be able to
display unicast, rather than multicast.
Fixes: 906ac5437a ("geneve: add support for IPv6 link partners")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Display only attributes that are relevant when a GRE interface is in
'external' mode instead of the default values (which are ignored by the
kernel even if passed back).
Fixes: 926b39e1fe ("gre: add support for collect metadata flag")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
For GRE interfaces in 'external' mode, the kernel ignores all manual
settings like remote IP address or TTL. However, for some of those
attributes, kernel checks their value and does not allow them to be zero
(even though they're ignored later).
Currently, 'ip link' always includes all attributes in the netlink message.
This leads to problem with creating interfaces in 'external' mode. For
example, this command does not work:
ip link add gre1 type gretap external
and needs a bogus remote IP address to be specified, as the kernel enforces
remote IP address to be either not present, or not null.
Ignore the parameters that do not make sense in 'external' mode.
Unfortunately, we cannot error out, as there may be existing deployments
that workarounded the bug by specifying bogus values.
Fixes: 926b39e1fe ("gre: add support for collect metadata flag")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Use the same rtnl_dump_request_n call as the show. The rtnl_wilddump_request
assumes the type uses an ifinfomsg which is not the case for the neighbor
table.
Signed-off-by: Jeff Harris <jefftharris@gmail.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
It doesn't make sense to use external control plane and fill internal FDB at
the same time. It's even an illegal combination for VXLAN-GPE.
Just switch off learning when 'external' is specified.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Follow-up for kernel commit 8eb3b99554b8 ("geneve: support setting
IPv6 flow label") to allow setting the label for the device config.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Follow-up for kernel commit e7f70af111f0 ("vxlan: support setting
IPv6 flow label") to allow setting the label for the device config.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Enable support for configuring outer UDP checksums on Geneve tunnels:
ip link add type geneve id 10 remote 10.0.0.2 udpcsum
Signed-off-by: Jesse Gross <jesse@kernel.org>
On recent kernels, UDP checksum computation has become more efficient and
the default behavior was changed, however, the ip command overrides this
by always specifying a particular behavior.
If the user does not specify that UDP checksums should either be computed
or not then we don't need to send an explicit netlink message - the kernel
can just use its default behavior.
Signed-off-by: Jesse Gross <jesse@kernel.org>
There is only a single user who needs it to be reentrant (not really,
but it's safer like this), add rt_addr_n2a_r() for it to use.
Signed-off-by: Phil Sutter <phil@nwl.cc>
There are only three users which require it to be reentrant, the rest is
fine without. Instead, provide a reentrant format_host_r() for users
which need it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This adds two helper functions which map a given data field to a color,
so color_fprintf() statements don't have to be duplicated with only a
different color value depending on that data field's value. In order for
this to work in a generic way, COLOR_CLEAR has been added to serve as a
fallback default of uncolored output.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Add support for ignore route attribute, and refine the code to use
rta_getattr_* function to get attribute value.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
To not make the output overly confusing, list them in a definition of
the STATE placeholder which is already used in the show/flush syntax but
wasn't explained before.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This is a bit pedantic, but brackets ([]) show optional values and since
TYPE must not become empty, they're not suited to surround the type
keyword choices. Use curly braces instead.
Also add some missing whitespace to the parameter list above.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Neither 'list' nor 'flush' actions accept parameters, and with given
prefix the action keyword is not optional anymore.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Although the ip command accepts both "neighbor" and "neighbour" as
subcommand, I assume it's sufficient to list it in help text as just
"neigh" like ip.8 does.
Signed-off-by: Phil Sutter <phil@nwl.cc>
The help text was misleading: One could think it is possible to list
rules by selector, which would be nice but isn't. This change also
clarifies that 'ip rule' defaults to 'list' if no further arguments are
given.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Add IFLA_VF_TRUST message to trust the VF.
PF can accept some privileged operation from the trusted VF.
For example, ixgbe PF doesn't allow to enable VF promiscuous mode until
the VF is trusted because it may hurt performance.
To trust VF.
# ip link set dev eth0 vf 1 trust on
To untrust VF.
# ip link set dev eth0 vf 1 trust off
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Add support to be able to view and change IFLA_BRPORT_MULTICAST_ROUTER port
attribute.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Export all the read-only values that get returned about a bridge port
such as the timers, the ids, designated_port and cost,
topology_change_ack and config_pending. For the bridge ids the
br_dump_bridge_id function is exported from iplink_bridge.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
netns_map_add() does a malloc of (sizeof (struct nsid_cache) +
strlen(name)) and then proceed with strcpy() of name into the
zero-length member at the end of the nsid_cache structure. The
nul-terminator is written outside of the allocated memory and may
overwrite the allocator's internal structure.
This can trigger a segmentation fault on i386 uclibc with names of size 8:
after the corruption occurs, the call to closedir() on netns_map_init()
crashes while freeing the DIR structure.
Here is the relevant valgrind output:
==1251== Memcheck, a memory error detector
==1251== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1251== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright
info
==1251== Command: ./ip netns
==1251==
==1251== Invalid write of size 1
==1251== at 0x4011975: strcpy (in
/usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==1251== by 0x8058B00: netns_map_add (ipnetns.c:181)
==1251== by 0x8058E2A: netns_map_init (ipnetns.c:226)
==1251== by 0x8058E79: do_netns (ipnetns.c:776)
==1251== by 0x804D9FF: do_cmd (ip.c:110)
==1251== by 0x804D814: main (ip.c:300)
Support for the new rx_nohandler statistic.
This code is designed to handle the case where the kernel reported statistic
structure is smaller than the larger structure in later releases (and vice versa).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds support to add mpls multipath
routes.
example:
ip -f mpls route add 100 \
nexthop as 200 via inet 10.1.1.2 dev swp1 \
nexthop as 700 via inet 10.1.1.6 dev swp2
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
It seems that I've made a mistake when I exported these, instead of a
space in the end I've put a newline character which is wrong and breaks
the single line output.
Fixes: 7d6bc3b87a ("bonding: export 3ad actor and partner port state")
Reported-by: Sam Tannous <stannous@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_NF_CALL_(IP|IP6|ARP)TABLES
attributes in iproute2 so it can change their values.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_STARTUP_QUERY_INTVL
attribute in iproute2 so it can change the startup query interval.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_QUERY_RESPONSE_INTVL
attribute in iproute2 so it can change the query response interval.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_QUERY_INTVL attribute
in iproute2 so it can change the query interval.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_QUERIER_INTVL
attribute in iproute2 so it can change the querier interval.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_MEMBERSHIP_INTVL
attribute in iproute2 so it can change the membership interval.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_LAST_MEMBER_INTVL
attribute in iproute2 so it can change the last member interval.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_STARTUP_QUERY_CNT
attribute in iproute2 so it can change the startup query count.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_LAST_MEMBER_CNT
attribute in iproute2 so it can change the last member count value.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_HASH_MAX attribute
in iproute2 so it can change the maximum hashed entries.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_HASH_ELASTICTITY
attribute in iproute2 so it can change the hash elasticity value.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_QUERIER attribute
in iproute2 so it can toggle the mcast querier value.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_QUERY_USE_IFADDR
attribute in iproute2 so it can toggle the multicast_query_use_ifaddr val.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_SNOOPING attribute
in iproute2 so it can change the multicast snooping value.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_ROUTER attribute
in iproute2 so it can change the multicast router value.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_VLAN_DEFAULT_PVID
attribute in iproute2 so it can change the default pvid.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_GROUP_ADDR attribute
in iproute2 so it can change the group address.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_GROUP_FWD_MASK attribute
in iproute2 so it can change the group forwarding mask.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Netlink already provides hello_timer, tcn_timer, topology_change_timer
and gc_timer, so let's make them visible.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This change add the ability to create lwt/flow based/externally
controlled geneve device and to select the udp destination port used
by a full geneve tunnel.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
'ip monitor all' is broken on older kernels.
This patch fixes 'ip monitor all' to match
'all' and not 'all-nsid'.
It moves parsing arg 'all-nsid' to after parsing
'all'.
Before:
$ip monitor all
NETLINK_LISTEN_ALL_NSID: Protocol not available
After:
$ip monitor all
[NEIGH]Deleted 10.0.0.1 dev eth1 lladdr c4:54:44:4f:b2:dd STALE
Fixes: 449b824ad1 ("ipmonitor: allows to monitor in several netns")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
the warning was:
iproute.c:301:12: warning: 'val' may be used uninitialized in this
function [-Wmaybe-uninitialized]
features &= ~RTAX_FEATURE_ECN;
^
iproute.c:575:10: note: 'val' was declared here
__u32 val;
^
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Options 'group' and 'remote' cannot take 'any' as value but 'local' can.
Signed-off-by: Thomas Faivre <thomas.faivre@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This patch replaces exits with returns in iplink
command. Helps to continue on errors when
invoked with ip -force -batch.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
"random" is a new IPv6 addrgenmode, enabling "stable_secret" type
addresses with an auto-generated secret.
$ ip link set eth0 addrgenmode random
$ ip -d link show dev eth0
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
link/ether 00:21:86:a3:25:7d brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode random
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
It is possible to switch to another addrgenmode after setting a
valid secret. Allow switching back without reconfiguring the
secret for completeness.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
I repeatedly failed to get this right, so now I have to clean up my mess
afterwards.
Fixes: 7d6aadcd0a ("ip{,6}tunnel: have a shared stats parser/printer")
Signed-off-by: Phil Sutter <phil@nwl.cc>
This has a slight side-effect of not aborting when /proc/net/dev is
malformed, but OTOH stats are not parsed for uninteresting interfaces.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Currently ip6 encap support for lwtunnel is missing.
This patch implement it, mostly duplicating the ipv4 parts.
Also be sure to insert a space after the encap type, when
showing lwtunnel, to avoid the tunnel type and the following
argument being merged into a single word.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch add support for IFLA_VXLAN_COLLECT_METADATA via the
'external' keyword to the vxlan link.
Also enforce mutual exclusion between 'vni' and 'external'.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently parse_encap_ip() does not update correctly argv/argc;
if multiple lwtunnel arguments are provided, the parsing fails after
the first one, i.e.
ip route add 172.16.101.0/24 dev vxlan1 encap ip id 42 dst 192.168.255.1
fails with:
Error: either "to" is duplicate, or "dst" is a garbage.
This commit addresses the issue, stepping to next argument at each iteration
of the parsing loop.
Fixes: 1e5293056a ("lwtunnel: Add encapsulation support to ip route")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit 0f7543322c ("route: ignore RTAX_HOPLIMIT of value -1")
accidentally reordered fprintf statements. This patch restores the
original ordering.
Fixes: 0f7543322c ("route: ignore RTAX_HOPLIMIT of value -1")
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch:
- Adds a utility function for parsing a 64 bit address
- Adds a utility function for converting a 64 bit address to ASCII
- Adds and ILA encap type in lwt tunnels
Signed-off-by: Tom Herbert <tom@herbertland.com>
Currently, the table id for VRF devices requires an integer. Convert
it to use rtnl_rttable_a2n which handles table names from the iproute2
directory.
This also fixes a bug in the original commit where table name are not
properly handled.
Fixes: 15faa0a30b ("add support for VRF device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Older kernels use -1 internally as indicator to use the sysctl default,
but they still export the setting. Newer kernels use 0 to indicate that
(which is why the conversion from -1 to 0 was done here), but they also
stopped exporting the value. Since the meaning of -1 is clear, treat it
equally like default on newer kernels (which is to not print anything).
Signed-off-by: Phil Sutter <phil@nwl.cc>
On 24.11.2015 02:26, Stephen Hemminger wrote:
> On Thu, 12 Nov 2015 21:10:08 +0000
> Konstantin Shemyak <konstantin@shemyak.com> wrote:
>
>> When creating an IP tunnel over IPv6, the address family must be passed in
>> the option, e.g.
>>
>> ip -6 tunnel add mode ip6gre local 1::1 remote 2::2
>>
>> This makes it impossible to create both IPv4 and IPv6 tunnels in one batch.
>>
>> In fact the address family option is redundant here, as each tunnel mode is
>> relevant for only one address family.
>> The patch determines whether the applicable address family is AF_INET6
>> instead of the default AF_INET and makes the "-6" option unnecessary for
>> "ip tunnel add".
>>
>> Signed-off-by: Konstantin Shemyak <konstantin@shemyak.com>
>> ---
>> ip/iptunnel.c | 26 ++++++++++++++++++++++++++
>> testsuite/tests/ip/tunnel/add_tunnel.t | 14 ++++++++++++++
>> 2 files changed, 40 insertions(+)
>> create mode 100755 testsuite/tests/ip/tunnel/add_tunnel.t
>>
>> diff --git a/ip/iptunnel.c b/ip/iptunnel.c
>> index 78fa988..7826a37 100644
>> --- a/ip/iptunnel.c
>> +++ b/ip/iptunnel.c
>> @@ -629,8 +629,34 @@ static int do_6rd(int argc, char **argv)
>> return tnl_6rd_ioctl(cmd, medium, &ip6rd);
>> }
>>
>> +static int tunnel_mode_is_ipv6(char *tunnel_mode) {
>> + char *ipv6_modes[] = {
>> + "ipv6/ipv6", "ip6ip6",
>> + "vti6",
>> + "ip/ipv6", "ipv4/ipv6", "ipip6", "ip4ip6",
>> + "ip6gre", "gre/ipv6",
>> + "any/ipv6", "any"
>> + };
>> + int i;
>> +
>> + for (i = 0; i < sizeof(ipv6_modes) / sizeof(char *); i++) {
>> + if (strcmp(ipv6_modes[i], tunnel_mode) == 0)
>> + return 1;
>> + }
>> + return 0;
>> +}
>> +
>
> The ipv6_modes table should be static const.
Thank you for the note! attached the corrected patch.
> Also is it possible to use strstr for ipv6 and ip6 or even strchr(tunnel_mode, '6')
> to simplify this?
There is IPv6 tunnel mode 'any', and IPv4 tunnel mode 'ipv6/ip' (aka
'sit'). It looks to me that attempts to find some substring match
would not make the code much shorter, but definitely less readable.
Konstantin Shemyak.
>From 42d27db0055c3a114fe6eb86d680bef9ec098ad4 Mon Sep 17 00:00:00 2001
From: Konstantin Shemyak <konstantin@shemyak.com>
Date: Thu, 12 Nov 2015 20:52:02 +0200
Subject: [PATCH] Tunnel address family is determined from the tunnel mode
When the tunnel mode already tells the IP address family, "ip tunnel"
command determines it and does not require option "-4"/"-6" to be passed.
This makes possible creating both IPv4 and IPv6 tunnels in one batch.
Signed-off-by: Konstantin Shemyak <konstantin@shemyak.com>
This patch adds support to remote checksum checksum offload
to VXLAN. This patch adds remcsumtx and remcsumrx to ip vxlan
configuration to enable remote checksum offload for transmit
and receive on the VXLAN tunnel.
https://tools.ietf.org/html/draft-herbert-vxlan-rco-00
Example:
ip link add name vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 \
udpcsum remcsumtx remcsumrx
Testing:
Ran single netperf over mlnx4 to illustrate the effest:
- Without RCO (UDP csum set to zero)
4335.99 Mbps
- With RCO enabled
7661.81 Mbps
Signed-off-by: Tom Herbert <tom@herbertland.com>
Technically, the range of possible hoplimit values are defined by IPv4
and IPv6 header formats. Both define the field to be eight bits in size,
which leads to a value range of [0;255]. Setting a packet's hoplimit
field to 0 though makes not much sense, as the next hop would
immediately drop the packet. Therefore Linux uses 0 as a special value
indicating to use the system's default hoplimit (configurable via
sysctl). In iproute, setting the hoplimit of a route to 0 is equivalent
to omitting the hoplimit parameter alltogether, so it is actually not
necessary to allow that value to be specified, but keep it anyway for
backwards compatibility.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Linux version 3.1 introduced a consistency check for netlink dumps in
commit 670dc28 ("netlink: advertise incomplete dumps"). This bites
iproute2 when flushing more addresses than can fit into a single
RTM_GETADDR response. To silence the spurious error message "Dump was
interrupted and may be inconsistent.", advise rtnl_dump_filter_l() to
not care about NLM_F_DUMP_INTR.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Since it's no longer relevant whether an IP address is primary or
secondary when flushing, ipaddr_flush() can be simplified a bit.
Signed-off-by: Phil Sutter <phil@nwl.cc>
I found recently that, if I disabled address promotion in the kernel, that
ip addr flush dev <dev>
would fail with an EADDRNOTAVAIL errno (though the flush operation would in fact
flush all addresses from an interface properly)
Whats happening is that, if I add a primary and multiple secondary addresses to
an interface, the flush operation first ennumerates them all with a GETADDR |
DUMP operation, then sends a delete request for each address. But the kernel,
having promotion disabled, deletes all secondary addresses when the primary is
removed. That means, that several delete requests may still be pending in the
netlink request for addresses that have been removed on our behalf, resulting in
EADDRNOTAVAIL return codes.
It seems the simplest thing to do is to understand that EADDRUNAVAIL isn't a
fatal outcome on a flush operation, as it just indicates that an address which
you want to remove is already removed, so it can safely be ignored.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
- Drop 'extern' keyword from all function prototypes.
- Make line breaking of print_* functions consistent.
- Make print_ntable() and ipntable_reset_filter() static and remove
their declaration.
- Drop declaration of non-existent ipaddr_list() and iproute_monitor().
Signed-off-by: Phil Sutter <phil@nwl.cc>
Since p->name is only IFNAMSIZ bytes, do not copy more than IFNAMSIZ - 1
bytes into it so there remains at least a single null byte in the end.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Instead of parsing an unsigned integer and checking boundaries, simply
parse u8. This and the added ttl alias 'hlim' provide consistency with
ip6tunnel.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This makes output consistent with iptunnel, also supporting reverse DNS
lookup for remote address if requested.
Signed-off-by: Phil Sutter <phil@nwl.cc>
In iptunnel, declare loop variables inside the loop as done in
ip6tunnel.
Fix and simplify goto logic in ip6tunnel:
- Failure to read over header lines would have left fp opened.
- By returning directly upon fopen() failure, fp can be closed
unconditionally in the end.
Use the same goto logic in iptunnel, as well.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Instead of duplicating the same code six times (key, ikey and okey in
iptunnel and ip6tunnel), have a common parsing routine. This has the
added benefit of having the same verbose error message in ip6tunnel as
well as iptunnel.
I'm not sure if parsing an IPv4 address as key makes sense for
ip6tunnel, but the code was there before so this patch at least doesn't
make it worse.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Put whitespace in the beginning of optional parts, not as suffix
anywhere. Also drop double whitespaces in between words.
Signed-off-by: Phil Sutter <phil@nwl.cc>
If get_rt_realms() fails, try to get a possible raw u32 realms
value for the u32 RTA_FLOW/FRA_FLOW attribute, as it might be
useful to directly configure the hex value itself. And only if
that fails, then bail out.
The source realm is provided in the upper u16 (mask: 0xffff0000)
and the destination realm through the lower u16 part (mask:
0x0000ffff). This can be useful for tc's bpf realm matcher, but
also a full hex/mask param can be provided already for matching
through iptables' --realm cmdline option, for example.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch adds save and restore commands to "ip rule"
similar the same is made in commit f4ff11e3e2 for "ip route".
The feature is useful in checkpoint/restore for container
migration, also it may be helpful in some normal situations.
Signed-off-by: Kirill Tkhai <ktkhai@odin.com>
This has been inconsistent since the beginning of Git and seems to be
merely a documentation leftover, therefore just remove it from help
output and man page.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch adds support to parse and print lwtunnel
encapsulation attributes attached to routes for MPLS
and IP tunnels.
example:
Add ipv4 route with mpls encap attributes:
Examples:
MPLS:
$ ip route add 40.1.2.0/30 encap mpls 200 via inet 40.1.1.1 dev eth3
$ ip route show
40.1.2.0/30 encap mpls 200 via 40.1.1.1 dev eth3
Add ipv4 multipath route with mpls encap attributes:
$ ip route add 10.1.1.0/30 nexthop encap mpls 200 via 10.1.1.1 dev eth0 \
nexthop encap mpls 700 via 40.1.1.2 dev eth3
$ ip route show
10.1.1.0/30
nexthop encap mpls 200 via 10.1.1.1 dev eth0 weight 1
nexthop encap mpls 700 via 40.1.1.2 dev eth3 weight 1
IP:
$ ip route add 10.1.1.1/24 encap ip id 200 dst 20.1.1.1 dev vxlan0
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Benc <jbenc@redhat.com>
It helps to grep for one string "Deleted" when monitoring all events.
Fixes: 6ea3ebafe0 ("iproute2: inform user when a neighbor is removed")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This flag is allowed for devices in passthru mode to prevent forcing the
underlying interface into promiscuous mode.
Signed-off-by: Phil Sutter <phil@nwl.cc>
After eliminating the minor differences in both files which existed
solely because features/fixes were applied to only one of them and not
the other, the remaining differences were in function naming and error
messages. The latter is addressed by using the 'id' field of struct
link_util.
Fold both files into one in order to share common code and eliminate the
chance of having fixes/enhancements applied to only one of them.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Add ifindex to dump request when filtering by device. If the kernel
supports it adding the index to the request limits the amount of data
the kernel pushes to userpsace.
The feature exists in userspace already, so no need to warn the user
if kernel side support does not exist. Using the kernel side filter
makes the request more efficient.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add support for filtering neighbor dumps by master device. Kernel side
support provided by commit 21fdd092acc7. Since the feature is not
available in older kernels the user is given a warning message if the
kernel does not support the request.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Device names that match "help" or a prefix thereof should be allowed anywhere
a device name can be used. Note that a suitable keyword ("dev" or "name", the
latter for "ip tunnel") has to be used in these cases to resolve ambiguities.
Signed-off-by: Christoph Schulz <develop@kristov.de>
Reported-by: Leonhard Preis <leonhard@pre.is>
Reported-by: Wilhelm Wijkander <lists@0x5e.se>
The brief format does not honer the master and type filters:
$ ip link show master vrf-mgmt
7: dummy0: <BROADCAST,NOARP,SLAVE> mtu 1500 qdisc noop master vrf-mgmt state DOWN mode DEFAULT group default qlen 1000
link/ether 66:39:cc:2b:e9:bd brd ff:ff:ff:ff:ff:ff
$ ip -br link show master vrf-mgmt
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
eth0 UP 08:00:27🇩🇪14:c8 <BROADCAST,MULTICAST,UP,LOWER_UP>
eth1 UP 08:00:27:87:02:f1 <BROADCAST,MULTICAST,UP,LOWER_UP>
eth2 UP 08:00:27:61:1e:fd <BROADCAST,MULTICAST,UP,LOWER_UP>
vrf-blue UNKNOWN a6:3f:09:34:7e:74 <NOARP,MASTER,UP,LOWER_UP>
vrf-red DOWN fe:a2:2d:e1:bc:ac <NOARP,MASTER>
dummy0 DOWN 66:39:cc:2b:e9:bd <BROADCAST,NOARP,SLAVE>
dummy1 DOWN 4a:4f:13:91:64:b1 <BROADCAST,NOARP,SLAVE>
dummy2 DOWN b2:4f:b6💿bd:a6 <BROADCAST,NOARP>
dummy3 DOWN 1e:06:3d:40:b8:c2 <BROADCAST,NOARP,SLAVE>
vrf-mgmt DOWN ce:b2:74:41:21:df <NOARP,MASTER>
With this patch the expected output is shown:
$ ip -br link show master vrf-mgmt
dummy0 DOWN 66:39:cc:2b:e9:bd <BROADCAST,NOARP,SLAVE>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Currently 'ip route get' does not show the table the lookup result comes
from and prior to kernel commit c36ba6603a11 the response from the kernel
was hardcoded to the main table. From the discussion this appears to be
a leftover from the route cache where the cached entry lost the table id
and so the result was hardcoded to main table.
c36ba6603a11 added the RTM_F_LOOKUP_TABLE flag to maintain that behavior
but to allow new tools to ask for the actual table id for the lookup.
This patch adds that flag to ip route get request and if the result is
not the main table shows the table id.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Currently when we specify AF_INET6 when it is disabled, we will get
all routes.
For example, we can boot kernel with ipv6.disable=1 and try to get ipv6
routes:
$ ip -6 route show
default via 192.168.122.1 dev eth0 proto static metric 100
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.141 metric 100
Here are ipv4 routes and this is unexpected behaviour.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Commit 0532555 ('Support "ip link add help" for rtnl_link API') added a
check for specified help parameter. Though due to the place where it has
been added to, it is not possible anymore to force a given parameter to
be interpreted as interface name by prefixing it with 'dev '. Fix this
by forcing whatever follows 'dev' to be presumed as interface name.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch adds support for bridge vlan_protocol.
Example:
$ ip link set br0 type bridge vlan_protocol 802.1ad
$ ip -d link show br0
4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state
UP mode DEFAULT group default qlen 1000
link/ether 44:37:e6🆎cd:ef brd ff:ff:ff:ff:ff:ff promiscuity 0
bridge forward_delay 0 hello_time 200 max_age 2000 ageing_time 30000
stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1ad
addrgenmode eui64
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
This adds support for slightly less output than is normally provided by
'ip link show' and 'ip addr show'. This is a bit better when you have a
host with lots of interfaces. Sample output:
$ ip -br link show
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
p2p1 UP 08:00:27:ee:0b:3b <BROADCAST,MULTICAST,UP,LOWER_UP>
p7p1 UP 08:00:27:9d:62:9f <BROADCAST,MULTICAST,UP,LOWER_UP>
p8p1 DOWN 08:00:27:dc:d8:ca <NO-CARRIER,BROADCAST,MULTICAST,UP>
p9p1 UP 08:00:27:76:d9:75 <BROADCAST,MULTICAST,UP,LOWER_UP>
p7p1.100@p7p1 UP 08:00:27:9d:62:9f <BROADCAST,MULTICAST,UP,LOWER_UP>
$ ip -br -4 addr show
lo UNKNOWN 127.0.0.1/8
p2p1 UP 192.168.56.2/24
p7p1 UP 70.0.0.1/24
p8p1 DOWN 80.0.0.1/24
p9p1 UP 10.0.5.15/24
p7p1.100@p7p1 UP 200.0.0.1/24
$ ip -br -6 addr show
lo UNKNOWN ::1/128
p2p1 UP fe80::a00:27ff:feee:b3b/64
p7p1 UP 7000::1/8 fe80::a00:27ff:fe9d:629f/64
p8p1 DOWN 8000::1/8
p9p1 UP fe80::a00:27ff:fe76:d975/64
p7p1.100@p7p1 UP fe80::a00:27ff:fe9d:629f/64
$ ip -br addr show p7p1
p7p1 UP 70.0.0.1/24 7000::1/8 fe80::a00:27ff:fe9d:629f/64
v2: Now with color support!
v3: Better field width estimation (except netdev names to keep output at a
decent width) and whitespace fixup.
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Allow user to create a vrf device and specify its table binding.
Based on the iplink_vlan implementation.
Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
* Improve manual page synopsis and built-it help
* Use full subcommand names (e.g. 'address' and 'maddress')
* Specify when IPv4, IPv6 or both are affected
* Add lifetimes, home and nodad
* Remove any remaining excess spaces
Commit 43d29f7 substantially improves generated ip-address.8 instead of
ip-address.8.in and commit e419f2d removes the generated one losing the
improvements entirely. This commit recovers the lost changes, adapts
them to the current manual page and adds more man page and help
improvements.
Original commit by: Kenyon Ralph <kenyon@kenyonralph.com>
This patch implements support for the IFLA_BR_VLAN_FILTERING attribute
in iproute2 so it can enable/disable vlan_filtering.
Example:
$ ip link set br0 type bridge vlan_filtering 1
$ ip -d link show br0
6: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state
UP mode DEFAULT group default
link/ether 08:00:27:ea:07:38 brd ff:ff:ff:ff:ff:ff promiscuity 0
bridge forward_delay 1500 hello_time 200 max_age 2000 vlan_filtering 1
addrgenmode eui64
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
When showing bridge attributes, show also ageing_time, stp_state and
priority if available.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Add support to be able to set and show the value of tlb_dynamic_lb
(IFLA_BOND_TLB_DYNAMIC_LB).
Example:
$ ip -d link show dev bond0 type bond
7: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default
link/ether ce:2f:e1:6e:d7:e0 brd ff:ff:ff:ff:ff:ff promiscuity 0
bond mode balance-tlb miimon 100 updelay 0 downdelay 0 use_carrier 1
arp_interval 0 arp_validate none arp_all_targets any primary_reselect
always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1
num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1
packets_per_slave 1 lacp_rate slow ad_select stable tlb_dynamic_lb 1
addrgenmode eui64
$ ip -d l set dev bond0 type bond tlb_dynamic_lb 0
$ ip -d link show dev bond0 type bond
7: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default
link/ether ce:2f:e1:6e:d7:e0 brd ff:ff:ff:ff:ff:ff promiscuity 0
bond mode balance-tlb miimon 100 updelay 0 downdelay 0 use_carrier 1
arp_interval 0 arp_validate none arp_all_targets any primary_reselect
always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1
num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1
packets_per_slave 1 lacp_rate slow ad_select stable tlb_dynamic_lb 0
addrgenmode eui64
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reset the 'preferred_family' global variable
to its initially set value before each batch
file command is processed.
Signed-off-by: Antti Paila <antti.paila@gmail.com>
This patch adds support to set and display protodown on a switch port. The
switch driver can handle this error state by doing a phys down on the port.
One example user space application setting this flag is a multi-chassis
LAG application to handle split-brain situation on peer-link failure.
Example:
root@net-next:~# ip link set eth1 protodown on
root@net-next:~/iproute2# ip link show eth1
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 52:54:00:12:35:01 brd ff:ff:ff:ff:ff:ff protodown on
root@net-next:~/iproute2# ip link set eth1 protodown off
root@net-next:~/iproute2# ip link show eth1
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 52:54:00:12:35:01 brd ff:ff:ff:ff:ff:ff
root@net-next:~/iproute2#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Prefer using the POSIX constant PATH_MAX instead of the legacy BSD
derived MAXPATHLEN. The necessary includes for MAXPATHLEN and PATH_MAX
are <sys/param.h> and <limits.h>, respectively.
Signed-off-by: Felix Janda <felix.janda@posteo.de>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
Make sure that return value of each socket() call is properly checked
and do not continue processing if the call failed.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
We forgot to include this patch somehow. So do it now.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
This patch replaces exits with returns in
ip route get command handling. This allows batching
of ip route get commands.
$cat route_get_batch.txt
route get 10.0.14.2
route get 12.0.14.2
route get 10.0.14.4
$ip -batch route_get_batch.txt
local 10.0.14.2 dev lo src 10.0.14.2
cache <local>
12.0.14.2 via 192.168.0.2 dev eth0 src 192.168.0.15
cache
10.0.14.4 dev dummy0 src 10.0.14.2
cache
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
This patch adds support to retrieve the new bond slave attributes:
IFLA_BOND_SLAVE_AD_ACTOR_OPER_PORT_STATE
IFLA_BOND_SLAVE_AD_PARTNER_OPER_PORT_STATE
which are read-only.
(Removed if_link.h changes already updated in net-next)
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Enable reading and displaying SRIOV VFs traffic statistics through
the host PF netdevice using the nested IFLA_VF_STATS attribute.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
This patch fixes incorrect -EINVAL errors due to invalid
scope and type during mpls route deletes.
$ip -f mpls route add 100 as 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route show
100 as to 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route del 100 as 200 via inet 10.1.1.2 dev swp1
RTNETLINK answers: Invalid argument
$ip -f mpls route del 100
RTNETLINK answers: Invalid argument
After patch:
$ip -f mpls route show
100 as to 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route del 100 as 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route show
Always set type to RTN_UNICAST for mpls route add/deletes.
Also to keep things consistent with kernel set scope to
RT_SCOPE_UNIVERSE for both mpls and ipv6 routes. Both mpls and ipv6 route
deletes ignore scope.
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
The command "ip mroute show" is not showing routes when "to" and/or "from"
filter is applied.
root@mazhar:~# ip mroute show
(10.202.30.101, 235.1.2.3) Iif: eth0 Oifs: eth1
But When I applied filter, it does not show anything.
root@mazhar:~# ip mroute show 235.1.2.3 from 10.202.30.101
root@mazhar:~#
Signed-off-by: Mazhar Rana <ranamazharp@gmail.com>
If a tunnel is created with a local address, you can't change it to any.
# ip tunnel add tunl1 mode ipip remote 10.16.42.37 local 10.16.42.214 ttl 64
# ip tunnel show tunl1
tunl1: ip/ip remote 10.16.42.37 local 10.16.42.214 ttl 64
# ip tunnel change tunl1 local any
# echo $?
0
# ip tunnel show tunl1
tunl1: ip/ip remote 10.16.42.37 local 10.16.42.214 ttl 64
It happens that parse_args zeroes ip_tunnel_parm, and when creating the
tunnel, it is OK to leave it as is if the address is any. However, when
changing the tunnel, the current parameters will be read from
ip_tunnel_parm, and local and remote address won't be zeroes anymore, so
it needs to be explicitly set to any.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
There have been several instances where response from kernel
has overrun the stack buffer from the caller. Avoid future problems
by passing a size argument.
Also drop the unused peer and group arguments to rtnl_talk.
With this patch, it's now possible to listen in all netns that have an nsid
assigned into the netns where is socket is opened.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
With this patch, it's now possible to listen in all netns that have an nsid
assigned into the netns where the socket is opened.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This adds support for setting and displaying the following bonding
options:
* ad_user_port_key
* ad_actor_sys_prio
* ad_actor_system
Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com>
If ip rule command fails talking to kernel, exit code should be 2.
The sub-command is called by cmd loop and the exit code is negative
of return value from the command callback.
If kernel complains about ip route request, exit status should be
2 not 1.
This fixes regression introduced by:
commit 42ecedd4ba
Author: Roopa Prabhu <roopa@cumulusnetworks.com>
Date: Tue Mar 17 19:26:32 2015 -0700
fix ip -force -batch to continue on errors
Add a new option to toggle the ability of querying the RSS configuration of a specific VF.
VF RSS information like RSS hash key may be considered sensitive on some devices where
this information is shared between VF and PF and thus its querying may be prohibited by default.
This new option allows a system administrator with privileges to modify a PF state
to control if the above VF querying is allowed or not.
For example:
To enable RSS querying of VF[0] of ethX:
>> ip link set dev ethX vf 0 query_rss on
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Show deleting by group in 'ip link help' output:
...
ip link delete { DEVICE | dev DEVICE | group DEVGROUP } type TYPE [ ARGS ]
...
Also show separately DEVICE option in { } list.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
It is hard to quickly find what you are looking for in the output of the
ip command. Color helps.
This patch adds a '-c' flag to highlight these with individual colors:
- interface name
- ip address
- mac address
- up/down state
Signed-off-by: Mathias Nyman <m.nyman@iki.fi>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
This flag is only for the netlink protocol (multi-part messages), no reason
to reject messages without it.
Note that this flag was removed by the following kernel patches (v3.14)
65886f439ab0 ipmr: fix mfc notification flags
f518338b1603 ip6mr: fix mfc notification flags
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
XFRM netlink family is independent from the route netlink family. It's wrong
to call rtnl_wilddump_request(), because it will add a 'struct ifinfomsg' into
the header and the kernel will complain (at least for xfrm state):
netlink: 24 bytes leftover after parsing attributes in process `ip'.
Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Two commands are added:
- ip netns list-id
- ip monitor nsid
A cache is also added to remember the association between the iproute2 netns
name (from /var/run/netns/) and the nsid.
To avoid interfering with the rth socket, a new rtnl socket (rtnsh) is used to
get nsid (we may send rtnl request during listing on rth).
Example:
$ ip netns list-id
nsid 0 (iproute2 netns name: foo)
$ ip monitor nsid
Deleted nsid 0 (iproute2 netns name: foo)
nsid 16 (iproute2 netns name: bar)
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
When creating an IPsec SA that sets 'proto any' (IPPROTO_IP) and
specifies 'sport' and 'dport' at the same time in selector, the
following error is issued:
"sport" and "dport" are invalid with proto=ip
However using IPPROTO_IP with ports is completely legal and necessary
when one wants to share the SA on both TCP and UDP. One of the
applications requiring sharing SAs is 3GPP IMS AKA authentication.
See also:
* https://bugzilla.redhat.com/show_bug.cgi?id=497355
Reported-by: Jiří Klimeš <jklimes@redhat.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
The kernel now has the capability to offload FDB and FIB entries to hardware.
It is important to let users know if table entries are also offloaded to
hardware. Currently offloaded FDB entries are indicated by the existence of
the flag 'external' on the entry as of the following commit:
commit 28467b7f3f
Author: Scott Feldman <sfeldma@gmail.com>
Date: Thu Dec 4 09:57:15 2014 +0100
bridge/fdb: add flag/indication for FDB entry synced from offload device
When the patch to add support for indicating that FIB entries were also
offloaded as posted to netdev by Scott Feldman it became clear that 'external'
would not be an ideal name for routes. There could definitely be confusion
about what this might mean since many routes are to external networks -- a
collision/confusion that did not happen with FDB.
Scott Feldman asked me to check with others and build concensus around a name.
After speaking with several people about this I am proposing we refer to both
FDB and FIB entries that are currently backed by hardware (based on the work
done in rocker) with the flag 'offload' appended to the end ofthe entry.
Some people liked the string 'external,' others liked 'hardware,' but the point
is to communicate that these routes are available to something that will will
offload the forwarding normally done by the kernel. Since the term 'offload'
is used so frequently it seems appropriate to use the same language in
ip/bridge output.
The term 'offload' also seems to resonate with many of the people who have
responded on Scott's original thread or to those who I reached out to directly
and did respond to my query, so it seems we have reached consensus that it
should be the term used going forward.
v2: rebased against net-next branch
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: John W. Linville <linville@tuxdriver.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
The goal of this patch is to test during the runtime if the command RTM_GETNSID
is supported by the kernel.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
add a new command to configure the SPD hash table:
ip xfrm policy set [ hthresh4 LBITS RBITS ] [ hthresh6 LBITS RBITS ]
and code to display the SPD hash configuration:
ip -s -s xfrm policy count
hthresh4: defines minimum local and remote IPv4 prefix lengths of
selectors to hash a policy. If prefix lengths are greater or equal
to the thresholds, then the policy is hashed, otherwise it falls back
in the policy_inexact chained list.
hthresh6: defines minimum local and remote IPv6 prefix lengths of
selectors to hash a policy, otherwise it falls back
in the policy_inexact chained list.
Example:
% ip -s -s xfrm policy count
SPD IN 0 OUT 0 FWD 0 (Sock: IN 0 OUT 0 FWD 0)
SPD buckets: count 7 Max 1048576
SPD IPv4 thresholds: local 32 remote 32
SPD IPv6 thresholds: local 128 remote 128
% ip xfrm pol set hthresh4 24 16 hthresh6 64 56
% ip -s -s xfrm policy count
SPD IN 0 OUT 0 FWD 0 (Sock: IN 0 OUT 0 FWD 0)
SPD buckets: count 7 Max 1048576
SPD IPv4 thresholds: local 24 remote 16
SPD IPv6 thresholds: local 64 remote 56
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
This allows querying and setting the route preference. It's usually set from
the IPv6 Neighbor Discovery Router Advertisement messages.
Introduced in "ipv6: expose RFC4191 route preference via rtnetlink", enqueued
for Linux 4.1.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
- Pull in the uapi mpls.h
- Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
- Define AF_MPLS in utils.h if it is not defined from elsewhere
as is done with AF_DECnet
The address syntax for multiple mpls labels is a complete invention.
When I looked there seemed to be no wide spread convention for talking
about an mpls label stack in text for. Sometimes people did:
"{ Label1, Label2, Label3 }", sometimes people would do:
"[ label3, label2, label1 ]", and most of the time label
stacks were not explicitly shown at all.
The syntax I wound up using, so it would not have spaces and so it
would visually distinct from other kinds of addresses is.
label1/label2/label3 Where label1 is the label at the top of the label
stack and label3 is the label at the bottom on the label stack.
When there is a single label this matches what seems to be convention
with other tools. Just print out the numeric value of the mpls label.
The netlink protocol for labels uses the on the wire format for a
label stack. The ttl and traffic class are expected to be 0. Using
the on the wire format is common and what happens with other address
types. BGP when passing label stacks also uses this technique with the
exception that the ttl byte is not included making each label in a BGP
label stack 3 bytes instead of 4.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This attribute is like RTA_DST except it specifies the destination
address to place on a packet when it leaves the host. For ip based
protocols this is destination NAT and not a common part of forwarding.
For protocols like MPLS label swapping is something that typically
happens on every hop.
There is likely to be a RTA_NEWSRC at some point so RTA_NEWDST
is printed as "as to" and can be specified either as "as to"
or just "as"
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add support for the RTA_VIA attribute that specifies an address family
as well as an address for the next hop gateway.
To make it easy to pass this reorder inet_prefix so that it's tail
is a proper RTA_VIA attribute.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add the functions family_name and read_family to convert an address
family to a string and to convernt a string to an address family.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
For some address families (like AF_PACKET) it is helpful to have the
length when prenting the address.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Joining multicast group on ethernet level via "ip maddr" command would
not work if we have an Ethernet switch that does igmp snooping since
the switch would not replicate multicast packets on ports that did not
have IGMP reports for the multicast addresses.
Linux vxlan interfaces created via "ip link add vxlan" have the group option
that enables then to do the required join.
By extending ip address command with option "autojoin" we can get similar
functionality for openvswitch vxlan interfaces as well as other tunneling
mechanisms that need to receive multicast traffic.
example:
ip address add 224.1.1.10/24 dev eth5 autojoin
ip address del 224.1.1.10/24 dev eth5
On ip route print dump, label externally offloaded routes with "external".
Offloaded routes are flagged with RTNH_F_EXTERNAL, a recent additon to
net-next. For example:
$ ip route
default via 192.168.0.2 dev eth0
11.0.0.0/30 dev swp1 proto kernel scope link src 11.0.0.2 external
11.0.0.4/30 via 11.0.0.1 dev swp1 proto zebra metric 20 external
11.0.0.8/30 dev swp2 proto kernel scope link src 11.0.0.10 external
11.0.0.12/30 via 11.0.0.9 dev swp2 proto zebra metric 20 external
12.0.0.2 proto zebra metric 30 external
nexthop via 11.0.0.1 dev swp1 weight 1
nexthop via 11.0.0.9 dev swp2 weight 1
12.0.0.3 via 11.0.0.1 dev swp1 proto zebra metric 20 external
12.0.0.4 via 11.0.0.9 dev swp2 proto zebra metric 20 external
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.15
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Don't insert newline in -o (oneline) mode; print mark as hex.
Oneline mode is supposed to force all output to be on oneline and
machine-parsable, but this isn't the case for "ip xfrm" as shown:
% ip -o xfrm monitor
...
src 0.0.0.0/0 dst 0.0.0.0/0 \ dir out priority 2051 ptype main \ mark -1879048191/0xffffffff
tmpl src 203.0.130.10 dst 198.51.130.30\ proto esp reqid 16384 mode tunnel\
...
as that's 2 lines, not one. Also, the "mark" is shown in signed
decimal, but the mask is in hex. This is confusing: let's use
hex for both.
Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>
This patch replaces exits with returns in several
iproute2 commands. This fixes `ip -batch -force`
to not exit but continue on errors.
$cat c.txt
route del 1.2.3.0/24 dev eth0
route del 1.2.4.0/24 dev eth0
route del 1.2.5.0/24 dev eth0
route add 1.2.3.0/24 dev eth0
$ip -force -batch c.txt
RTNETLINK answers: No such process
Command failed c.txt:2
RTNETLINK answers: No such process
Command failed c.txt:3
Reported-by: Sven-Haegar Koch <haegar@sdinet.de>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Where used in the ip tool, the 'show' option always has the synonyms
'list' and 'lst', except for ip-token and ip-addrlabel, which are missing
'lst'. Add this as a synonym for these commands.
Signed-off-by: Mark Einon <mark.einon@gmail.com>
Observed on the Linux 3.18:
# ip netns
RTNETLINK answers: Operation not supported
net0
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Fixes: d182ee1307 ("ipnetns: allow to get and set netns ids")
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
This new attribute is now advertised by the kernel for x-netns interfaces.
It's also possible to set it when an interface is created (and thus creating a
x-netns interface with one single message).
Example:
$ ip netns add foo
$ ip netns add bar
$ ip -n foo netns set bar 15
$ ip -n foo link add ipip1 link-netnsid 15 type ipip remote 10.16.0.121 local 10.16.0.249
$ ip -n foo link ls ipip1
3: ipip1@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 15
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
The kernel now provides ids for peer netns. This patch implements a new command
'set' to assign an id.
When netns are listed, if an id is assigned, it is now displayed.
Example:
$ ip netns add foo
$ ip netns set foo 1
$ ip netns
foo (id: 1)
init_net
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This patch adds support to remote checksum checksum offload
confinguration for IPIP, SIT, and GRE tunnels. This patch
adds a [no]encap-remcsum to ip link command which applicable
when configured tunnels that use GUE.
http://tools.ietf.org/html/draft-herbert-remotecsumoffload-00
Example:
ip link add name tun1 type gre remote 192.168.1.1 local 192.168.1.2 \
ttl 225 encap fou encap-sport auto encap-dport 7777 encap-csum \
encap-remcsum
This would create an GRE tunnel in GUE encapsulation where the source
port is automatically selected (based on hash of inner packet),
checksums in the encapsulating UDP header are enabled (needed.for
remote checksum offload), and remote checksum ffload is configured to
be used on the tunnel (affects TX side).
Signed-off-by: Tom Herbert <therbert@google.com>
This patch makes CAN_CTRLMODE_FD_NON_ISO netlink feature configurable.
During the CAN FD standardization process within the ISO it turned out that
the failure detection capability has to be improved.
The CAN in Automation organization (CiA) defined the already implemented CAN
FD controllers as 'non-ISO' and the upcoming improved CAN FD controllers as
'ISO' compliant. See at http://www.can-cia.com/index.php?id=1937
Starting with the - currently non-ISO - driver for M_CAN v3.0.1 introduced in
Linux 3.18 this bit needs to be propagated to userspace. In future drivers this
bit will become configurable depending on the CAN FD controllers capabilities.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
"ip addr show up" would exclude the interface (link), but include the
addresses of down interfaces (which looked like they where indented
under a different interface). This fixes the filtering.
For a full example see the original bug report at:
http://bugs.debian.org/776040
Reported-by: Paul Slootman <paul@debian.org>
CC: 776040@bugs.debian.org
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
This change allows to exec some cmd on each
named netns (except default) by specifying '-all' option:
# ip -all netns exec ip link
Each command executes synchronously.
Exit status is not considered, so there might be a case
that some CMD can fail on some netns but success on the other.
EXAMPLES:
1) Show link info on all netns:
$ ip -all netns exec ip link
netns: test_net
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 500
link/ether 1a:19:6f:25:eb:85 brd ff:ff:ff:ff:ff:ff
netns: home0
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 500
link/ether ea:1a:59:40:d3:29 brd ff:ff:ff:ff:ff:ff
netns: lan0
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 500
link/ether ce:49:d5:46:81:ea brd ff:ff:ff:ff:ff:ff
2) Set UP tap0 device for the all netns:
$ ip -all netns exec ip link set dev tap0 up
netns: test_net
netns: home0
netns: lan0
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
This patch adds configuration and dumping of congestion control metric
for ip route, for example:
ip route add <dst> dev foo congctl [lock] dctcp
Reference: http://thread.gmane.org/gmane.linux.network/344733
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
The issue was caused that ifla_vf_rate does not exist on
older kernels and should be checked if it exists as nested attr.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Reported-by: William Dauchy <william@gandi.net>
Tested-by: William Dauchy <william@gandi.net>
Added new '-netns' option to simplify executing following cmd:
ip netns exec NETNS ip OPTIONS COMMAND OBJECT
to
ip -n[etns] NETNS OPTIONS COMMAND OBJECT
e.g.:
ip -net vnet0 link add br0 type bridge
ip -n vnet0 link
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
New netns_switch func moved to the lib/namespace.c from ip/ipnetns.c
so it can be used from the other tools for fast switching
network namespace.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Added new option 'type' to 'ip link show'
command which allows to filter devices by type:
ip link show type bridge
ip link show type vlan
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Sometimes it's needed to have "ip address show" list only addresses
with certain flags not being set, e.g. in network scripts.
As an example one might want to exclude addresses in "tentative"
or "deprecated" state.
Support listing addresses with flags tentative, deprecated, dadfailed
not being set by prefixing the respective flag with a minus.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Added another timestamp format to look like more logging info:
[2014-12-22T22:36:50.489 ] 2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default
link/ether 3c:97:0e:a3:86:2e brd ff:ff:ff:ff:ff:ff
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
This patch makes CAN_CTRLMODE_PRESUME_ACK netlink feature configurable.
When enabled, the feature sets CAN controller in mode in which
acknowledgement absence is ignored.
Signed-off-by: Nikita Edward Baruzdin <nebaruzdin@gmail.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
The issue was observed when IPv6 router broadcasted NDUSEROPT
messages which are not handled by monitor and caused printing
'Timestamps' w/o message because such kind of rtnl messages is not
handled by monitor.
As 'ip monitor' by default subscribes to the all mcast rtnl groups except
RTGRP_TC then all messages of these rtnl groups which are not handled by
monitor may cause such issues.
Fixed by subscribing by default to rtnl mcast groups which are
supported by 'ip monitor'.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
This option was used only for 'ip link', but it can be useful to have it for
'ip address'. Thus it is possible to display link details and addresses with one
command.
Example:
$ ip -d a ls dev gre1
9: gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1468 qdisc noqueue state UNKNOWN group default
link/gre 10.16.0.249 peer 10.16.0.121 promiscuity 0
gre remote 10.16.0.121 local 10.16.0.249 ttl inherit ikey 0.0.0.10 okey 0.0.0.10 icsum ocsum
inet 192.168.0.249 peer 192.168.0.121/32 scope global gre1
valid_lft forever preferred_lft forever
inet6 fe80::5efe:a10:f9/64 scope link
valid_lft forever preferred_lft forever
Suggested-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This option was used only for 'ip link', but it can be useful to have it for
'ip address'. Thus it is possible to display link details and addresses with one
command.
Example:
$ ip -d a ls dev gre1
9: gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1468 qdisc noqueue state UNKNOWN group default
link/gre 10.16.0.249 peer 10.16.0.121 promiscuity 0
gre remote 10.16.0.121 local 10.16.0.249 ttl inherit ikey 0.0.0.10 okey 0.0.0.10 icsum ocsum
inet 192.168.0.249 peer 192.168.0.121/32 scope global gre1
valid_lft forever preferred_lft forever
inet6 fe80::5efe:a10:f9/64 scope link
valid_lft forever preferred_lft forever
Suggested-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This permits to selectively enable explicit congestion notification via
the routing table.
If this ecn feature is not set, the kernel will use the tcp_ecn sysctl
to decide wheter to use ECN when establising a TCP connection.
At the time of this writing, the kernel supports ecn and allfrags, but
allfrags is of dubious value and not implemented here.
Example:
ip route change 192.168.2.0/24 dev eth0 features ecn
Signed-off-by: Florian Westphal <fw@strlen.de>
Adding basic support to create virtual devices using 'ip'
utility. Following is the syntax -
ip link add link <master> <virtual> type ipvlan mode [ l2 | l3 ]
e.g. ip link add link eth0 ipvl0 type ipvlan mode l3
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Laurent Chavey <chavey@google.com>
Cc: Tim Hockin <thockin@google.com>
Cc: Brandon Philips <brandon.philips@coreos.com>
Cc: Pavel Emelianov <xemul@parallels.com>
As 'ip' util will share the same netns from the caller
process then we can just look at /proc/self/.. to show
the netns of the current process by:
ip netns id
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Add udpcsum option to enable transmitting UDP checksums when doing
VXLAN/IPv4. Add udp6zerocsumtx, and udp6zerocsumrx options to enable
sending zero checksums and receiving zero checksums in VXLAN/IPv6.
Signed-off-by: Tom Herbert <therbert@google.com>
This patch adds support to configure foo-over-udp (FOU) and Generic
UDP Encapsulation for GRE tunnels. This configuration allows selection
of FOU or GUE for the tunnel, specification of the source and
destination ports for UDP tunnel, and enabling TX checksum. This
configuration only affects the transmit side of a tunnel.
Example:
ip link add name tun1 type gre remote 192.168.1.1 local 192.168.1.2 \
ttl 225 encap fou encap-sport auto encap-dport 7777 encap-csum
This would create an GRE tunnel in GUE encapsulation where the source
port is automatically selected (based on hash of inner packet) and
checksums in the encapsulating UDP header are enabled.
Signed-off-by: Tom Herbert <therbert@google.com>
This patch adds support to configure foo-over-udp (FOU) and Generic
UDP Encapsulation for IPIP and sit tunnels. This configuration allows
selection of FOU or GUE for the tunnel, specification of the source and
destination ports for UDP tunnel, and enabling TX checksum. This
configuration only affects the transmit side of a tunnel.
Example:
ip link add name tun1 type ipip remote 192.168.1.1 local 192.168.1.2 \
ttl 225 encap gue encap-sport auto encap-dport 9999 encap-csum
This would create an IPIP tunnel in GUE encapsulation where the source
port is automatically selected (based on hash of inner packet) and
checksums in the encapsulating UDP header are enabled.
Signed-off-by: Tom Herbert <therbert@google.com>
Added 'ip fou...' commands to enable/disable UDP ports for doing
foo-over-udp and Generic UDP Encapsulation variant. Arguments are port
number to bind to and IP protocol to map to port (for direct FOU).
Examples:
ip fou add port 7777 gue
ip fou add port 8888 ipproto 4
The first command creates a GUE port, the second creates a direct FOU
port for IPIP (receive payload is a assumed to be an IPv4 packet).
Signed-off-by: Tom Herbert <therbert@google.com>
- any ipv6 tunnel mode (proto == 0) could not be set
due to incomplete set of cases in do_add, do_del.
- vti6 logic was inverted: it was using "ip6_vti0" basedev
UNLESS mode is set to vti6.
We don't need a switch by p.proto in do_add()/do_del(): it
already exists in parse_args(). So if parse_args() call
was successful, no need to check tunnel mode again.
Signed-off-by: Alexey Andriyanov <alan@al-an.info>
This patch allows to configure ESN and anti-replay window.
Signed-off-by: dingzhi <zhi.ding@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Allow to print particular link type usage by:
ip link help [TYPE]
Currently to print usage for some link type it is needed
to use the following way:
ip link { add | del | set } type TYPE help
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
'ip -oneline tunnel show' was not "oneline" for GRE tunnels with iseq:
# ip tun add gre_test remote 1.1.1.1 local 2.2.2.2 mode gre iseq oseq
# ip -oneline tun show gre_test | wc -l
2
The problem existed because of a typo: '\n' was printed when it shouldn't be.
Fixed.
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Make ip address show accept the -s option similarly to ip link. This creates
an one command replacement for "ifconfig -a" useful for people who still
stay with ifconfig because of this feature.
Print the stats as the last thing for the interface. This requires some code
shuffling.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Since commit 3c682146ae, iplink requires assigning negative
ifindex (-1) to the kernel when creating interface without
specifying index.
v2: checking whether index is -1, suggested by Cong Wang.
Cc: Cong Wang <cwang@twopensource.com>
Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp>
Acked-by: Cong Wang <cwang@twopensource.com>
In case if unknown message was handled then it will be displayed as:
Unknown message: type=0x00000044(68) flags=0x00000000(0) len=0x0000004c(76)
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
This checking was performed only when adding interface but
it is needed also when deleting, otherwise the error will be:
ioctl(TUNSETIFF): Invalid argument
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Parenthesis are required else maxaddr value is a bool and thus output is always
1 when the option is set.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This patch adds the necessary changes to allow altering a slave device's
options via ip link set <device> type <master type>_slave specific-option.
It also adds support to set the bonding slaves' queue_id.
Example:
ip link set eth0 type bond_slave queue_id 10
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>