Commit Graph

707 Commits

Author SHA1 Message Date
Paolo Abeni f0df40810f lwtunnel: fix argument parsing
Currently parse_encap_ip() does not correctly update argv/argc;
if multiple lwtunnel arguments are provided, the parsing fails after
the first one, e.g.

 ip route add 172.16.101.0/24 dev vxlan1 encap ip id 42 dst 192.168.255.1

fails with:

 Error: either "to" is duplicate, or "dst" is a garbage.

This commit addresses the issue by stepping to the next argument at each
iteration of the parsing loop.

Fixes: 1e5293056a ("lwtunnel: Add encapsulation support to ip route")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2015-12-17 17:16:02 -08:00
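
The fix described above amounts to advancing argv/argc once per consumed argument. The following is a minimal sketch of such a keyword/value parsing loop, not the actual parse_encap_ip() code; struct encap_opts and parse_encap_opts() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result structure; not the real iproute2 layout. */
struct encap_opts {
	unsigned long id;
	const char *dst;
};

/*
 * Parse "id <n> dst <addr> ..." pairs. Returns 0 on success, -1 on an
 * unknown keyword or a keyword without a value.
 */
static int parse_encap_opts(int argc, char **argv, struct encap_opts *out)
{
	assert(out != NULL);

	while (argc > 0) {
		const char *key = argv[0];

		if (argc < 2)
			return -1;	/* keyword without a value */

		if (strcmp(key, "id") == 0)
			out->id = strtoul(argv[1], NULL, 0);
		else if (strcmp(key, "dst") == 0)
			out->dst = argv[1];
		else
			return -1;	/* unknown keyword */

		/* Step past the keyword and its value before the next iteration. */
		argc -= 2;
		argv += 2;
	}
	return 0;
}
```

Without the two stepping statements at the end of the loop body, the second keyword would never be reached correctly, which is the class of failure the commit message reports.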
Phil Sutter ed6b8652f7 route: Fix printing of locked entries
Commit 0f7543322c ("route: ignore RTAX_HOPLIMIT of value -1")
accidentally reordered fprintf statements. This patch restores the
original ordering.

Fixes: 0f7543322c ("route: ignore RTAX_HOPLIMIT of value -1")
Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-12-17 17:07:07 -08:00
Konstantin Khlebnikov e834eb8eba ip neigh: device is optional for proxy entries
Though dumping such entries crashes present kernels.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
2015-12-17 17:07:07 -08:00
Tom Herbert 5866bddd9a ila: Add support for ILA lwtunnels
This patch:
 - Adds a utility function for parsing a 64 bit address
 - Adds a utility function for converting a 64 bit address to ASCII
 - Adds an ILA encap type in lwt tunnels

Signed-off-by: Tom Herbert <tom@herbertland.com>
2015-12-17 17:07:07 -08:00
Stephen Hemminger 654ae881de ip: fix format string when reading statistics
The tunnel code was doing sscanf(buf, "%ld", &x) where x was unsigned
long.
2015-12-10 08:52:10 -08:00
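
As an illustration of the mismatch described above, here is a small sketch that reads an unsigned long with the matching conversion. read_counter() is a hypothetical helper, not a function from the tunnel code.

```c
#include <assert.h>
#include <stdio.h>

/* Read one unsigned counter from a text buffer; returns 0 on success. */
static int read_counter(const char *buf, unsigned long *value)
{
	assert(buf != NULL && value != NULL);

	/* %lu matches unsigned long; %ld would expect a pointer to (signed) long. */
	return sscanf(buf, "%lu", value) == 1 ? 0 : -1;
}
```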
David Ahern 8a23f82045 vrf: Add support for table names
Currently, the table id for VRF devices requires an integer. Convert
it to use rtnl_rttable_a2n which handles table names from the iproute2
directory.

This also fixes a bug in the original commit where table names are not
properly handled.

Fixes: 15faa0a30b ("add support for VRF device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2015-12-10 08:45:30 -08:00
Phil Sutter 0f7543322c route: ignore RTAX_HOPLIMIT of value -1
Older kernels use -1 internally as indicator to use the sysctl default,
but they still export the setting. Newer kernels use 0 to indicate that
(which is why the conversion from -1 to 0 was done here), but they also
stopped exporting the value. Since the meaning of -1 is clear, treat it
the same as the default on newer kernels (which is to not print anything).

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-12-10 08:45:11 -08:00
Stephen Hemminger a96a5d94c6 iptunnel: cleanup code
Make iptunnel pass checkpatch (mostly).
2015-11-29 12:05:39 -08:00
Konstantin Shemyak cc9c1dfaee ip_tunnel: determine tunnel address family from the tunnel type
On 24.11.2015 02:26, Stephen Hemminger wrote:
> On Thu, 12 Nov 2015 21:10:08 +0000
> Konstantin Shemyak <konstantin@shemyak.com> wrote:
>
>> When creating an IP tunnel over IPv6, the address family must be passed in
>> the option, e.g.
>>
>> ip -6 tunnel add mode ip6gre local 1::1 remote 2::2
>>
>> This makes it impossible to create both IPv4 and IPv6 tunnels in one batch.
>>
>> In fact the address family option is redundant here, as each tunnel mode is
>> relevant for only one address family.
>> The patch determines whether the applicable address family is AF_INET6
>> instead of the default AF_INET and makes the "-6" option unnecessary for
>> "ip tunnel add".
>>
>> Signed-off-by: Konstantin Shemyak <konstantin@shemyak.com>
>> ---
>>   ip/iptunnel.c                          | 26 ++++++++++++++++++++++++++
>>   testsuite/tests/ip/tunnel/add_tunnel.t | 14 ++++++++++++++
>>   2 files changed, 40 insertions(+)
>>   create mode 100755 testsuite/tests/ip/tunnel/add_tunnel.t
>>
>> diff --git a/ip/iptunnel.c b/ip/iptunnel.c
>> index 78fa988..7826a37 100644
>> --- a/ip/iptunnel.c
>> +++ b/ip/iptunnel.c
>> @@ -629,8 +629,34 @@ static int do_6rd(int argc, char **argv)
>>          return tnl_6rd_ioctl(cmd, medium, &ip6rd);
>>   }
>>
>> +static int tunnel_mode_is_ipv6(char *tunnel_mode) {
>> +       char *ipv6_modes[] = {
>> +               "ipv6/ipv6", "ip6ip6",
>> +               "vti6",
>> +               "ip/ipv6", "ipv4/ipv6", "ipip6", "ip4ip6",
>> +               "ip6gre", "gre/ipv6",
>> +               "any/ipv6", "any"
>> +       };
>> +       int i;
>> +
>> +       for (i = 0; i < sizeof(ipv6_modes) / sizeof(char *); i++) {
>> +               if (strcmp(ipv6_modes[i], tunnel_mode) == 0)
>> +                       return 1;
>> +       }
>> +       return 0;
>> +}
>> +
>
> The ipv6_modes table should be static const.

Thank you for the note! attached the corrected patch.

> Also is it possible to use strstr for ipv6 and ip6 or even strchr(tunnel_mode, '6')
> to simplify this?

There is IPv6 tunnel mode 'any', and IPv4 tunnel mode 'ipv6/ip' (aka
'sit'). It looks to me that attempts to find some substring match
would not make the code much shorter, but definitely less readable.

Konstantin Shemyak.

From 42d27db0055c3a114fe6eb86d680bef9ec098ad4 Mon Sep 17 00:00:00 2001
From: Konstantin Shemyak <konstantin@shemyak.com>
Date: Thu, 12 Nov 2015 20:52:02 +0200
Subject: [PATCH] Tunnel address family is determined from the tunnel mode

When the tunnel mode already tells the IP address family, "ip tunnel"
command determines it and does not require option "-4"/"-6" to be passed.

This makes it possible to create both IPv4 and IPv6 tunnels in one batch.

Signed-off-by: Konstantin Shemyak <konstantin@shemyak.com>
2015-11-29 11:57:21 -08:00
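
The corrected patch mentioned in the thread is not reproduced in this log. As a sketch only, a variant of the quoted function with the table declared static const, as the review asked, might look like this; the mode strings are copied from the quoted diff, everything else is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the tunnel mode implies AF_INET6, 0 otherwise. */
static int tunnel_mode_is_ipv6(const char *tunnel_mode)
{
	/* Exact-match table; substring tests would misclassify "any" and "ipv6/ip". */
	static const char * const ipv6_modes[] = {
		"ipv6/ipv6", "ip6ip6",
		"vti6",
		"ip/ipv6", "ipv4/ipv6", "ipip6", "ip4ip6",
		"ip6gre", "gre/ipv6",
		"any/ipv6", "any"
	};
	size_t i;

	assert(tunnel_mode != NULL);

	for (i = 0; i < sizeof(ipv6_modes) / sizeof(ipv6_modes[0]); i++) {
		if (strcmp(ipv6_modes[i], tunnel_mode) == 0)
			return 1;
	}
	return 0;
}
```

The exact-match table keeps the two awkward cases from the discussion correct: "any" is an IPv6 mode although it contains no '6', and "ipv6/ip" is an IPv4 mode although it contains "ipv6".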
Tom Herbert 35f59d862f vxlan: Add support for remote checksum offload
This patch adds support for remote checksum offload
to VXLAN. It adds remcsumtx and remcsumrx to the ip vxlan
configuration to enable remote checksum offload for transmit
and receive on the VXLAN tunnel.

https://tools.ietf.org/html/draft-herbert-vxlan-rco-00

Example:

ip link add name vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 \
    udpcsum remcsumtx remcsumrx

Testing:

Ran a single netperf over mlnx4 to illustrate the effect:

- Without RCO (UDP csum set to zero)
  4335.99 Mbps
- With RCO enabled
  7661.81 Mbps

Signed-off-by: Tom Herbert <tom@herbertland.com>
2015-11-29 11:53:02 -08:00
Phil Sutter ea6cbab792 iproute: restrict hoplimit values to be in range [0; 255]
Technically, the range of possible hoplimit values is defined by the IPv4
and IPv6 header formats. Both define the field to be eight bits in size,
which leads to a value range of [0;255]. Setting a packet's hoplimit
field to 0 makes little sense, though, as the next hop would
immediately drop the packet. Therefore Linux uses 0 as a special value
indicating that the system's default hoplimit (configurable via
sysctl) should be used. In iproute, setting the hoplimit of a route to 0
is equivalent to omitting the hoplimit parameter altogether, so it is not
actually necessary to allow that value to be specified, but it is kept
anyway for backwards compatibility.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-29 11:47:29 -08:00
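
A range check of the kind described above could be sketched as follows. parse_hoplimit() is a hypothetical helper written for this illustration; it is not the code added by the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a hoplimit in [0, 255]. The value 0 is accepted and means
 * "use the system default". Returns 0 on success, -1 on invalid input.
 */
static int parse_hoplimit(const char *arg, unsigned int *hoplimit)
{
	char *end;
	unsigned long val;

	assert(hoplimit != NULL);

	if (arg == NULL || *arg == '\0' || *arg == '-')
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 0);
	/* Reject overflow, trailing garbage, and anything above eight bits. */
	if (errno != 0 || *end != '\0' || val > 255)
		return -1;

	*hoplimit = (unsigned int)val;
	return 0;
}
```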
Phil Sutter d81f54d599 iptoken: simplify iptoken_list a bit
Since it uses only a single filter, rtnl_dump_filter() can be used.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-29 11:47:29 -08:00
Phil Sutter 906dfe4887 ipaddress: drop unnecessary check in ipaddr_list_flush_or_save()
Right after ipaddr_reset_filter(), filter.family is always AF_UNSPEC.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-29 11:47:29 -08:00
Phil Sutter d25ec03e1d ipaddress: fix ipaddr_flush for Linux >= 3.1
Linux version 3.1 introduced a consistency check for netlink dumps in
commit 670dc28 ("netlink: advertise incomplete dumps"). This bites
iproute2 when flushing more addresses than can fit into a single
RTM_GETADDR response. To silence the spurious error message "Dump was
interrupted and may be inconsistent.", advise rtnl_dump_filter_l() to
not care about NLM_F_DUMP_INTR.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-29 11:47:29 -08:00
Phil Sutter c6995c4802 ipaddress: simplify ipaddr_flush()
Since it's no longer relevant whether an IP address is primary or
secondary when flushing, ipaddr_flush() can be simplified a bit.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-29 11:47:29 -08:00
John W. Linville 906ac5437a geneve: add support for IPv6 link partners
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2015-11-23 16:23:11 -08:00
Neil Horman e149d4e843 iproute2: Ignore EADDRNOTAVAIL errors during address flush operation
I found recently that, if I disabled address promotion in the kernel, that
ip addr flush dev <dev>

would fail with an EADDRNOTAVAIL errno (though the flush operation would in fact
flush all addresses from an interface properly)

What's happening is that, if I add a primary and multiple secondary addresses to
an interface, the flush operation first enumerates them all with a GETADDR |
DUMP operation, then sends a delete request for each address.  But the kernel,
having promotion disabled, deletes all secondary addresses when the primary is
removed.  That means that several delete requests may still be pending in the
netlink request for addresses that have been removed on our behalf, resulting in
EADDRNOTAVAIL return codes.

It seems the simplest thing to do is to understand that EADDRNOTAVAIL isn't a
fatal outcome on a flush operation, as it just indicates that an address which
you want to remove is already removed, so it can safely be ignored.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
2015-11-23 15:59:08 -08:00
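
The decision described above can be sketched as a small predicate. flush_error_is_fatal() is a hypothetical name; how the real code hooks into the netlink error handling is not shown in this log.

```c
#include <assert.h>
#include <errno.h>

/*
 * Decide whether an error reported for one delete request should abort
 * the flush. err is a positive errno value, or 0 for success.
 */
static int flush_error_is_fatal(int err)
{
	assert(err >= 0);

	if (err == 0)
		return 0;
	/* The address is already gone, e.g. removed together with its primary. */
	if (err == EADDRNOTAVAIL)
		return 0;
	return 1;
}
```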
Phil Sutter f7b49a3fc7 ip_common.h header cleanup
- Drop 'extern' keyword from all function prototypes.
- Make line breaking of print_* functions consistent.
- Make print_ntable() and ipntable_reset_filter() static and remove
  their declaration.
- Drop declaration of non-existent ipaddr_list() and iproute_monitor().

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:44:03 -08:00
Phil Sutter 04ce8d3eda ip{,6}tunnel: put spaces around non-unary operators
Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter f53ecee818 iptunnel: sanitize copying tunnel name
Since p->name is only IFNAMSIZ bytes, do not copy more than IFNAMSIZ - 1
bytes into it so there remains at least a single null byte in the end.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter c957821b18 iptunnel: share common code when determining the default interface name
Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter 0dd4d2b37f iptunnel: simplify parsing TTL, allow 'hlim' as identifier
Instead of parsing an unsigned integer and checking boundaries, simply
parse u8. This and the added ttl alias 'hlim' provide consistency with
ip6tunnel.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter 2520598a1a iptunnel: share common code when setting tunnel mode
Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter 7894ce7722 ip6tunnel: fix coding style: no newline between brace and else
Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter 9af72f819e ip6tunnel: print local/remote addresses like iptunnel does
This makes output consistent with iptunnel, also supporting reverse DNS
lookup for remote address if requested.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter c4527d7ba3 ip{,6}tunnel: align do_tunnels_list() a bit
In iptunnel, declare loop variables inside the loop as done in
ip6tunnel.

Fix and simplify goto logic in ip6tunnel:
- Failure to read over header lines would have left fp opened.
- By returning directly upon fopen() failure, fp can be closed
  unconditionally in the end.

Use the same goto logic in iptunnel, as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
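
The goto logic described above can be illustrated with a generic sketch: return directly when fopen() fails, and route every later exit through one label that closes the file. count_entries() is a hypothetical function, and the assumption of two header lines is made only for this example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count the data lines in a text file that starts with two header lines.
 * Returns the count, or -1 on error. Assumes lines fit into the buffer.
 */
static int count_entries(const char *path)
{
	char buf[512];
	int count = -1;
	FILE *fp;

	assert(path != NULL);

	fp = fopen(path, "r");
	if (fp == NULL)
		return -1;	/* nothing to clean up yet */

	/* Failing to read over the header lines must still close fp. */
	if (!fgets(buf, sizeof(buf), fp) || !fgets(buf, sizeof(buf), fp))
		goto end;

	count = 0;
	while (fgets(buf, sizeof(buf), fp))
		count++;
end:
	fclose(fp);	/* unconditional: fp is always open here */
	return count;
}
```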
Phil Sutter 4b3cb96281 iptunnel: use ll_name_to_index() for physical interface lookup
Although the cache is only initialized in do_show(), this way it is at
least consistent with ip6tunnel.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter 6ddb1e8c90 ip{, 6}tunnel: unify behaviour if physical device is not found
Make ip6tunnel print an error message as well. While there, get rid of
unnecessary line breaking.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Phil Sutter a7ed1520ee ip/tunnel: introduce tnl_parse_key()
Instead of duplicating the same code six times (key, ikey and okey in
iptunnel and ip6tunnel), have a common parsing routine. This has the
added benefit of having the same verbose error message in ip6tunnel as
well as iptunnel.

I'm not sure if parsing an IPv4 address as key makes sense for
ip6tunnel, but the code was there before so this patch at least doesn't
make it worse.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
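
To illustrate the idea of one shared key parser that accepts either a number or a dotted quad, here is a sketch under stated assumptions. parse_key() is a hypothetical stand-in, not the tnl_parse_key() implementation, whose signature and error messages are not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Parse a tunnel key given either as a dotted quad or as a number.
 * The result is stored in network byte order. Returns 0 or -1.
 */
static int parse_key(const char *arg, uint32_t *key)
{
	char *end;
	unsigned long val;

	assert(arg != NULL && key != NULL);

	if (strchr(arg, '.')) {
		struct in_addr addr;

		if (inet_pton(AF_INET, arg, &addr) != 1)
			return -1;
		*key = addr.s_addr;	/* already in network byte order */
		return 0;
	}

	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno != 0 || end == arg || *end != '\0' || val > 0xffffffffUL)
		return -1;

	*key = htonl((uint32_t)val);
	return 0;
}
```

Callers for key, ikey and okey could then share this one routine and report errors in the same way.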
Phil Sutter 8de592d05c ip{, 6}tunnel: get rid of extraneous whitespace when printing
Put whitespace in the beginning of optional parts, not as suffix
anywhere. Also drop double whitespaces in between words.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-11-23 15:26:37 -08:00
Stephen Hemminger 86c392f958 Merge branch 'master' into net-next 2015-10-23 15:46:08 -07:00
Stephen Hemminger f7520a1998 ip: remove extra newlines at end-of-file
Shouldn't have extra blank lines.
2015-10-23 15:41:58 -07:00
Stephen Hemminger 651dccbee7 Merge branch 'master' into net-next 2015-10-22 23:42:37 -07:00
Daniel Borkmann d583e88ebc ip, realms: also allow to pass in raw realms value
If get_rt_realms() fails, try to get a possible raw u32 realms
value for the u32 RTA_FLOW/FRA_FLOW attribute, as it might be
useful to directly configure the hex value itself. And only if
that fails, then bail out.

The source realm is provided in the upper u16 (mask: 0xffff0000)
and the destination realm through the lower u16 part (mask:
0x0000ffff). This can be useful for tc's bpf realm matcher, but
also a full hex/mask param can be provided already for matching
through iptables' --realm cmdline option, for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2015-10-22 23:40:51 -07:00
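
The layout of the raw realms value described above can be sketched with a few helpers. The function names are hypothetical; only the bit layout (source realm in the upper 16 bits, destination realm in the lower 16 bits) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Combine source and destination realm into one raw u32 value. */
static uint32_t realms_pack(uint16_t src, uint16_t dst)
{
	return ((uint32_t)src << 16) | dst;
}

/* Source realm: upper 16 bits (mask 0xffff0000). */
static uint16_t realms_src(uint32_t realms)
{
	return (uint16_t)((realms & 0xffff0000u) >> 16);
}

/* Destination realm: lower 16 bits (mask 0x0000ffff). */
static uint16_t realms_dst(uint32_t realms)
{
	return (uint16_t)(realms & 0x0000ffffu);
}
```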
Kirill Tkhai 2f4e171f7d Add ip rule save/restore
This patch adds save and restore commands to "ip rule",
similar to what was done in commit f4ff11e3e2 for "ip route".

The feature is useful in checkpoint/restore for container
migration, also it may be helpful in some normal situations.

Signed-off-by: Kirill Tkhai <ktkhai@odin.com>
2015-10-22 23:35:57 -07:00
Stephen Hemminger b89c359c15 Merge branch 'master' into net-next 2015-10-18 21:58:29 -07:00
Roopa Prabhu 8b21cef129 ip route get: change exit to return to support batch commands
replace exit with return -2 on rtnl_talk failure

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
2015-10-18 21:57:46 -07:00
Phil Sutter ccaf6eb5cc ip-rule: neither prohibit nor reject or unreachable flags exist
This has been inconsistent since the beginning of Git and seems to be
merely a documentation leftover, therefore just remove it from help
output and man page.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-10-18 21:57:01 -07:00
Roopa Prabhu 1e5293056a lwtunnel: Add encapsulation support to ip route
This patch adds support to parse and print lwtunnel
encapsulation attributes attached to routes for MPLS
and IP tunnels.

Examples:

  MPLS:
  Add ipv4 route with mpls encap attributes:
  $ ip route add 40.1.2.0/30 encap mpls 200 via inet 40.1.1.1 dev eth3
  $ ip route show
  40.1.2.0/30  encap mpls 200 via 40.1.1.1 dev eth3

  Add ipv4 multipath route with mpls encap attributes:
  $ ip route add 10.1.1.0/30 nexthop encap mpls 200 via 10.1.1.1 dev eth0 \
		    nexthop encap mpls 700 via  40.1.1.2 dev eth3
  $ ip route show
  10.1.1.0/30
    nexthop encap mpls 200  via 10.1.1.1  dev eth0 weight 1
    nexthop encap mpls 700  via 40.1.1.2  dev eth3 weight 1

  IP:
  $ ip route add 10.1.1.1/24 encap ip id 200 dst 20.1.1.1 dev vxlan0

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Benc <jbenc@redhat.com>
2015-10-16 16:13:22 -07:00
Stephen Hemminger c6646c1ea5 Merge branch 'master' into net-next 2015-10-16 16:03:32 -07:00
Phil Sutter 6f07f3dc41 ip-address: fix oneline mode for interfaces with VF
Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-10-16 16:02:38 -07:00
Roopa Prabhu 39ca4879a0 ip monitor neigh: Change 'delete' to 'Deleted' to be consistent with ip route
It helps to grep for one string "Deleted" when monitoring all events.

Fixes: 6ea3ebafe0 ("iproute2: inform user when a neighbor is removed")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-10-16 16:01:34 -07:00
Stephen Hemminger d2ccb70a91 Merge branch 'master' into net-next 2015-10-12 09:50:46 -07:00
Phil Sutter 3cf8ba5960 ip: macvlan: support MACVLAN_FLAG_NOPROMISC flag
This flag is allowed for devices in passthru mode to prevent forcing the
underlying interface into promiscuous mode.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-10-12 09:46:55 -07:00
Phil Sutter 541f1b3e1d ip: link: consolidate macvlan and macvtap
After eliminating the minor differences in both files which existed
solely because features/fixes were applied to only one of them and not
the other, the remaining differences were in function naming and error
messages. The latter is addressed by using the 'id' field of struct
link_util.

Fold both files into one in order to share common code and eliminate the
chance of having fixes/enhancements applied to only one of them.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-10-12 09:46:55 -07:00
David Ahern b8c753245b ip neigh: Add ifindex to request when filtering dumps by device
Add ifindex to dump request when filtering by device. If the kernel
supports it adding the index to the request limits the amount of data
the kernel pushes to userspace.

The feature exists in userspace already, so no need to warn the user
if kernel side support does not exist. Using the kernel side filter
makes the request more efficient.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2015-10-12 09:43:28 -07:00
David Ahern 0d238ca2b8 ip neigh: Add support for filtering dumps by master device
Add support for filtering neighbor dumps by master device. Kernel side
support provided by commit 21fdd092acc7. Since the feature is not
available in older kernels the user is given a warning message if the
kernel does not support the request.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2015-10-12 09:39:37 -07:00
Stephen Hemminger cf5b002f20 Merge branch 'master' into net-next 2015-10-12 09:32:14 -07:00
Christoph Schulz 8aacb9bbbd ip: allow using a device "help" (or a prefix thereof)
Device names that match "help" or a prefix thereof should be allowed anywhere
a device name can be used. Note that a suitable keyword ("dev" or "name", the
latter for "ip tunnel") has to be used in these cases to resolve ambiguities.

Signed-off-by: Christoph Schulz <develop@kristov.de>
Reported-by: Leonhard Preis <leonhard@pre.is>
Reported-by: Wilhelm Wijkander <lists@0x5e.se>
2015-10-07 10:35:17 +01:00
David Ahern 84d30afd8a ip: Add type and master filters to brief output
The brief format does not honor the master and type filters:

$ ip link show master vrf-mgmt
7: dummy0: <BROADCAST,NOARP,SLAVE> mtu 1500 qdisc noop master vrf-mgmt state DOWN mode DEFAULT group default qlen 1000
    link/ether 66:39:cc:2b:e9:bd brd ff:ff:ff:ff:ff:ff

$ ip -br link show master vrf-mgmt
lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
eth0             UP             08:00:27:de:14:c8 <BROADCAST,MULTICAST,UP,LOWER_UP>
eth1             UP             08:00:27:87:02:f1 <BROADCAST,MULTICAST,UP,LOWER_UP>
eth2             UP             08:00:27:61:1e:fd <BROADCAST,MULTICAST,UP,LOWER_UP>
vrf-blue         UNKNOWN        a6:3f:09:34:7e:74 <NOARP,MASTER,UP,LOWER_UP>
vrf-red          DOWN           fe:a2:2d:e1:bc:ac <NOARP,MASTER>
dummy0           DOWN           66:39:cc:2b:e9:bd <BROADCAST,NOARP,SLAVE>
dummy1           DOWN           4a:4f:13:91:64:b1 <BROADCAST,NOARP,SLAVE>
dummy2           DOWN           b2:4f:b6:cd:bd:a6 <BROADCAST,NOARP>
dummy3           DOWN           1e:06:3d:40:b8:c2 <BROADCAST,NOARP,SLAVE>
vrf-mgmt         DOWN           ce:b2:74:41:21:df <NOARP,MASTER>

With this patch the expected output is shown:

$ ip -br link show master vrf-mgmt
dummy0           DOWN           66:39:cc:2b:e9:bd <BROADCAST,NOARP,SLAVE>

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2015-09-23 16:27:52 -07:00