There are a few places to improve:
1) return -1 when entry is filtered instead of zero, which means
accept entry: ipaddress_list_flush_or_save() is the only user of this
2) use ll_idx_n2a() as last resort to translate index to name for
"should never happen" cases when cache shouldn't be considered
3) replace open coded access to IFLA_IFNAME attribute data by
RTA_DATA() with rta_getattr_str()
4) simplify ifname printing since name is never NULL, thanks to (2).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
There is no reentrancy or deferred use of the result in any of the cases
where ll_idx_n2a() is being used: it is safe to use ll_index_to_name(),
which internally calls ll_idx_n2a() with a static buffer to hold the result.
While there, print the master network device name using the correct color.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
There are at least two places in ip/ipaddress.c where we match IFA_LABEL
against filter.label if that is given.
Get rid of the "common" if () statement for inet_addr_match_rta() and
ifa_label_match_rta(): it is not common because the first will check for
filter.pfx.family != AF_UNSPEC inside and the second for filter.label
being non-NULL.
This allows us to further simplify the code and prepare for
ll_idx_n2a() replacement with ll_index_to_name() without an 80 columns
checkpatch notice.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
After commit a233caa0aa ("json: make pretty printing optional") I get
following build failure:
LINK rtmon
../lib/libutil.a(json_print.o): In function `new_json_obj':
json_print.c:(.text+0x35): undefined reference to `show_pretty'
collect2: error: ld returned 1 exit status
make[1]: *** [rtmon] Error 1
make: *** [all] Error 2
It is caused by the missing show_pretty variable in rtmon.
On the other hand, in tc/tc.c there are two distinct variables and a
single matches() call that handles the -pretty option, thus setting
show_pretty will never happen. Note that since commit 44dcfe8201 ("Change
formatting of u32 back to default") show_pretty is used in tc/f_u32.c,
so this is the first place where -pretty was introduced.
Furthermore, other utilities like misc/ifstat.c and misc/nstat.c define
a pretty variable, however only for their own purposes. They both support
JSON output and thus depend on show_pretty in new_json_obj().
Given the above, use a common variable to represent the -pretty option:
define it in utils.c and declare it in utils.h, which is commonly used.
Replace show_pretty with pretty.
Fixes: a233caa0aa ("json: make pretty printing optional")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
We are going to merge link_iptnl.c and link_ip6tnl.c and this is final
step to make their diffs clear and show what needs to be changed during
merge.
Note that it is safe to omit endpoint address(es) from netlink create
request as kernel is aware of such case and will use zero for that
endpoint(s).
Make sure we initialize ip6rdprefix and ip6rdrelayprefix bitlen in
link_iptnl.c only when configuring an existing tunnel: if the kernel does
not submit prefixlen in the corresponding attributes, the preceding
get_addr_rta() will set bitlen to -1, which is an incorrect value.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
We are going to merge link_gre.c and link_gre6.c and this is final step
to make their diffs clear and show what needs to be changed during merge.
Note that it is safe to omit endpoint address(es) from netlink create
request as kernel is aware of such case and will use zero for that
endpoint(s).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
We are going to merge link_vti.c and link_vti6.c and this is final step
to make their diffs clear and show what needs to be changed during merge.
Note that it is safe to omit endpoint address(es) from netlink create
request as kernel is aware of such case and will use zero for that
endpoint(s).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Initializing @inet_prefix using C initializers or memset() seems
inefficient and unnecessary: only small part of ->data[] field will be
used to store address corresponding to ->family.
Instead initialize ->flags with zero and assume no other fields accessed
before checking corresponding bits in ->flags. For example special
helpers (e.g. is_addrtype_*()) can be used to ensure that @inet_prefix
contains valid ip or ipv6 address.
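A rough sketch of the idea, not the actual inet_prefix code: the struct
layout, the flag value and all sk_* names below are made up for
illustration. Only ->flags is reset; the data array is left untouched
and is trusted only after the flag bit has been checked.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real inet_prefix layout and flag values
 * are not given in this message. */
enum { SK_ADDRTYPE_INET = 1 << 0 };

struct sk_prefix {
	uint16_t flags;
	uint16_t bytelen;
	uint8_t family;
	uint8_t data[256];
};

/* Cheap reset: clear only the flags, do not touch data[]. */
static void sk_prefix_reset(struct sk_prefix *p)
{
	p->flags = 0;
}

/* Helper in the spirit of is_addrtype_*(): data[] is valid only if set. */
static int sk_is_addrtype_inet(const struct sk_prefix *p)
{
	return (p->flags & SK_ADDRTYPE_INET) != 0;
}

/* Store an IPv4 address and mark the prefix as holding one. */
static void sk_prefix_set_v4(struct sk_prefix *p, const uint8_t addr[4])
{
	memcpy(p->data, addr, 4);
	p->bytelen = 4;
	p->family = 2; /* AF_INET on Linux; hard-coded for the sketch */
	p->flags |= SK_ADDRTYPE_INET;
}
```

Callers are expected to test sk_is_addrtype_inet() before reading
data[], which is what makes the full memset() unnecessary.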
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add JSON and color output formatting to ip route command.
Similar to existing address and link output.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add description for -json and -pretty options.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Since JSON is intended for programmatic consumption, it makes
sense for the default output format to be as concise as possible.
For programmers and other uses, it is helpful to keep the pretty
whitespace format; therefore enable it with the -p flag.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
In gre/gre6 for non-JSON output 0x%x format is used: use print_0xhex()
to get the same value for JSON.
Get rid of custom _print_hex() in bridge slave code: print_0xhex() can
be used perfectly.
Break long print_uint() with long argument list to fit into 80 columns.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
A few minor changes to reduce diffs between ip and ipv6 tunnel code:
1) reduce indentation by one level when adding attributes in gre and
gre6; reorder addattr*() calls to simplify diff
2) reorder local variables definition; change their type (e.g. for
IFLA_LINK) to match ones returned by rta_getattr_*()
3) move "mode" parameter parsing in link_iptnl.c to the similar
position as in link_ip6tnl.c
4) handle "tc" as shortcut for "tclass"/"tos" in link_iptnl.c
5) add whitespace where required
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reduce diff lines between iptnl and ip6tnl help printing code.
Use @struct link_util ->id field to print correct link help: all callers
now pass this data structure to iptunnel_print_help().
Get rid of custom print_usage() and usage() functions and use
iptunnel_print_help() directly, return from function on "... type
<help|garbage>" instead of exit(2).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reduce diff lines between gre and gre6 help printing code.
Use @struct link_util ->id field to print correct link help: all callers
now pass this data structure to gre_print_help().
Get rid of custom print_usage() and usage() functions and use
gre_print_help() directly, return from function on "... type
<help|garbage>" instead of exit(2).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reduce diff lines between vti and vti6 help printing code.
Use @struct link_util ->id field to print correct link help: all callers
now pass this data structure to vti_print_help().
Get rid of custom print_usage() and usage() functions and use
vti_print_help() directly, return from function on "... type
<help|garbage>" instead of exit(2).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
If the kernel receives a negative nsid it will automatically assign
the next available nsid. In this case alloc_netid() will set min and
max to 0 for idr_alloc(). And when max == 0 idr_alloc() will interpret
this as the maximum range, i.e. specific to nsids it will try to find
an id in the range [0,INT_MAX). This is intentionally supported in the
kernel for nsids.
Commit acbe9118ce ("ip netns: use strtol() instead of atoi()")
regressed ip netns in that respect although previously the use-case
was either accidentally supported or opaquely supported such that it
triggered the original commit. From what I can gather it went as
follows before: atoi() was called with a string indicating a negative
value which caused it to return -1 which was passed to the
kernel. Let's make it less opaque by introducing the keyword "auto":
ip netns set <netns-name> auto
will cause nsid to be set to -1 and the kernel will select an available
nsid.
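A minimal sketch of how such an argument could be parsed, assuming
explicit negative numbers stay rejected as they are with strtol();
sketch_parse_nsid() is a hypothetical helper and not the code from
this patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: "auto" maps to -1 so that the kernel picks a
 * free nsid, anything else must be a non-negative decimal number.
 * Returns 0 on success and -1 on invalid input. */
static int sketch_parse_nsid(const char *arg, int *nsid)
{
	char *end;
	long val;

	if (strcmp(arg, "auto") == 0) {
		*nsid = -1;
		return 0;
	}

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno || end == arg || *end != '\0' || val < 0 || val > INT_MAX)
		return -1;

	*nsid = (int)val;
	return 0;
}
```

With this shape the only way to request -1 is the explicit keyword,
which is the "less opaque" behaviour described above.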
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Minor refactoring to move flush into separate function to improve
readability and reduce depth of nesting.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Fix checkpatch complaints about assignment in conditions.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add whitespace around operators for consistency.
Use tabs for indentation.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
It seems a bad idea to depend on sysfs being mounted and reflecting the
current network namespace. The same applies to procfs.
Instead netlink should be used to talk to the kernel and get the list of
specific network devices along with their parameters.
Support kernel netlink message filtering by passing IFLA_INFO_KIND
in the RTM_GETLINK request: if the kernel does not support filtering by
kind we will check it in the reply anyway. Check for ifi->ifi_type to be
either ARPHRD_NONE or ARPHRD_ETHER to speed things up a bit without
kernel level filtering.
Unfortunately the tun driver does not implement dumping its configuration
via netlink and we still need to use read_prop(), which depends on sysfs,
to get additional tun device information.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Both tunnels use the legacy /proc/net/dev interface to get the tunnel
device and its statistics. This may cause problems for cases when procfs
is either not mounted or not unshare(2)d for the given network namespace.
Use netlink to walk through the list of tunnel devices, which is network
namespace aware and provides additional information such as statistics
in the dump message.
Since both address family specific variants of do_tunnels_list() are
nearly the same, except for tunnel parameters structure initialization,
matching and printing, we can introduce a common one in tunnel.c.
To implement address family specific parts introduce new data structure
@struct tnl_print_nlmsg_info that contains all necessary information as
well as pointers to ->init(), ->match() and ->print() callbacks.
Annotate data structures with const where appropriate.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Use switch () instead of if () to compare tunnel type to fit into 80
columns and make code more readable. Print "\n" using fputc().
In iptunnel.c abstract tunnel parameters matching code into
ip_tunnel_parm_match() helper to conform with ip6tunnel.c. Use
memset() to initialize @p1.
In ip6tunnel.c there is no need to call ll_name_to_index() with name
twice: just use the previously found index. Do not initialize @p1: this
is done in ip6_tnl_parm_init().
This is to show real differences between ip and ipv6 do_tunnels_list()
implementations and prepare for upcoming unification of them.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This is first step to move tunnel code to use rtnl dump interface
instead of /proc/net/dev read.
Make tnl_print_stats() to accept @struct rtnl_link_stats64 parameter,
introduce tnl_get_stats() that will parse line from /proc/net/dev into
@struct rtnl_link_stats64.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Assume all statistics in ip(8) represented either by IFLA_STATS64 or
IFLA_STATS are 64 bit. It is clear that we can store __u32 counters of
@struct rtnl_link_stats in __u64 counters in @struct rtnl_link_stats64.
New get_rtnl_link_stats_rta() follows __print_link_stats() behaviour on
handling of stats attribute: copy no more than the size of the data
structure and no more than the attribute length, zeroing the rest.
Drop print_link_stats32() as its functionality can be handled by the
64bit variant. Move code from __print_link_stats() to
print_link_stats64() and finally rename print_link_stats64() to
__print_link_stats().
More users of introduced function will come in future.
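The copy rule and the widening step can be sketched roughly as follows.
The two-counter structures and the sk_* helpers are hypothetical
stand-ins for the rtnl stats structures, not the real
get_rtnl_link_stats_rta().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-counter layouts used only for this sketch. */
struct sk_stats32 { uint32_t rx_packets, tx_packets; };
struct sk_stats64 { uint64_t rx_packets, tx_packets; };

/* Copy min(payload_len, dst_size) bytes and zero whatever is left of
 * the destination. Returns the number of bytes copied. */
static size_t sk_copy_attr(void *dst, size_t dst_size,
			   const void *payload, size_t payload_len)
{
	size_t n = payload_len < dst_size ? payload_len : dst_size;

	memcpy(dst, payload, n);
	memset((char *)dst + n, 0, dst_size - n);
	return n;
}

/* Widen 32 bit counters: every uint32_t value fits into uint64_t. */
static void sk_widen(struct sk_stats64 *out, const struct sk_stats32 *in)
{
	out->rx_packets = in->rx_packets;
	out->tx_packets = in->tx_packets;
}
```

A short attribute (for example from an older kernel) then leaves the
trailing counters at zero instead of exposing stale memory.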
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
To show real differences between these two variants adjust whitespace
indentation and use print_uint() instead of print_int() as all members
in both @struct rtnl_link_stats and @struct rtnl_link_stats64 are
unsigned.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
For JSON and colorization, make common code a function.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Make printing of multipath attributes a function to improve
readability.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Since these fields are printed in both route and multipath case;
avoid duplicating code.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Make a separate function to improve readability and enable
easier JSON conversion.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Make common function for decoding cacheinfo.
This code may print more info than old version in some cases.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Refactor to reduce size of print_route and improve
readability.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Both next hop and route need to decode flags.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
We have helper routines to support nested attribute addition into
netlink buffer: use them instead of open coding.
Use addattr_nest_compat()/addattr_nest_compat_end() where appropriate.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
1) Rename @hdr parameter to @n to be coherent with rest of the parsing
code.
2) Use NLMSG_DATA() to get pointer to the data after nlmsghdr instead
of calculating it directly in ip/tunnel code.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Having iplink_parse() and @struct iplink_req in include/utils.h does not
reflect its IP nature: move to ip/ip_common.h.
Move contents of ip/iplink_xdp.h and ip/iproute_lwtunnel.h to
ip/ip_common.h since they are small (i.e. only two function prototypes):
ip/iplink_bridge.c and ip/iplink_vrf.c prototypes already there.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This reverts commit 63891c7013.
It seems print_linkinfo_brief() never accepts a filter different from
the default one, and David Ahern suggests reverting it instead of making
a new change that actually does the revert.
Conflicts:
ip/ipaddress.c
ip/iplink.c
These are caused by JSON support addition after the commit we are reverting.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The JSON object name for statistics in ip link show is "stats644".
Looks like a typo, commit d0e720111a ("ip: ipaddress.c: add support
for json output") contains an example with the expected "stats64" name.
The fact that no one has noticed until now is probably an indication
that no one is using this object. Hopefully it's not too late to fix
this, although IIUC this has already been in 4.13 and 4.14 releases :S
Fixes: d0e720111a ("ip: ipaddress.c: add support for json output")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Introduce and use tnl_print_endpoint() helper to print the tunnel
endpoint address.
Note that for AF_INET and AF_INET6 inet_ntop(3) is used that may return
NULL in case of failure and while unlikely format_host_rta() might
return NULL too. Handle this case when passing local/remote to
print_string().
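The NULL handling can be sketched as below. sk_endpoint_str() is a
hypothetical helper and the "unknown" fallback text is an assumption of
this sketch; only inet_ntop(3) is a real library call here.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: textual form of an endpoint address that is
 * never NULL, so the result can be handed to a "%s" style printer. */
static const char *sk_endpoint_str(int family, const void *addr,
				   char *buf, size_t len)
{
	const char *s = inet_ntop(family, addr, buf, (socklen_t)len);

	/* inet_ntop() fails for unsupported families or short buffers. */
	return s ? s : "unknown";
}
```

The caller passes the returned pointer on without any further checks.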
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there, remove & from inet_prefix.data since it is an array.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
There are couple of minor improvements:
1) Check erspan_ver == 2 in gre6. It still could
be 1 if erspan_idx is 0.
2) Add tunnel encapsulation attributes only when
collect metadata not in effect in gre.
3) Trivial: address checkpatch issues.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Print only "external" if collect meta data attribute
is given: rest of parameters are irrelevant. This is
to follow gre6.
For both JSON and non-JSON output use "external" for
all tunnels including vxlan and geneve.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While benefit from using ll_name_to_index() with populated
cache can potentially be exploited only in few places
(e.g. bridge fdb/mdb/vlan show routines) there is another
advantage of ll_name_to_index() over plain if_nametoindex():
in case of if_nametoindex() failure ll_name_to_index()
will attempt to get index from common name in form "if%d"
that may be returned from ll_index_to_name().
This makes output from ip(8) coherent with its input.
Note that most of the code already switched from plain
if_nametoindex() to the cached variant, ll_name_to_index().
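The round trip through the "if%d" form can be sketched as follows.
Both sk_* helpers are hypothetical and only model the fallback path;
the real functions also consult the cache and the kernel, and the real
parser may be less strict than this one.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 'known' is the name found by a cache or kernel lookup, or NULL when
 * both failed; in that case fall back to the "if%u" form. */
static const char *sk_index_to_name(unsigned int idx, const char *known,
				    char *buf, size_t len)
{
	if (known)
		return known;
	snprintf(buf, len, "if%u", idx);
	return buf;
}

/* Parse a fallback name back into an index; 0 means "not that form". */
static unsigned int sk_fallback_name_to_index(const char *name)
{
	unsigned long val;
	char *end;

	if (strncmp(name, "if", 2) != 0 || !isdigit((unsigned char)name[2]))
		return 0;

	errno = 0;
	val = strtoul(name + 2, &end, 10);
	if (errno || *end != '\0' || val > UINT_MAX)
		return 0;
	return (unsigned int)val;
}
```

A name printed for an unknown device can therefore be fed back as
input and resolves to the same index.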
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In preparation for the link_vti.c and link_vti6.c merge:
1) Make @fwmark of __u32 type instead of unsigned int
in vti to match with rest tunneling code.
2) Report when unable to translate @link network device
name to index instead of silently exiting in vti6.
3) Remove newline separating local/remote attributes
from the ikey/okey in vti6 to match vti module.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Handle "inherit" case properly for gre6 and ip6tnl.
Use get_u8() in gre to parse ttl/hoplimit.
Be consistent about "hlim" alias to ttl/hoplimit
support.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Few minor changes after merge of 'master' into 'net-next' branch:
1) Follow 80 line length for printing erspan_index parameter
as we did in master with commit 2a8d0f6e9c ("gre/tunnel:
Print erspan_index using print_uint()").
2) Remove remnants of encapsulation option printing: now it
is done using tnl_print_encap() helper in commit bad76e6b1f
("ip/tunnel: Abstract tunnel encapsulation options printing").
Fixes: 8c75f69411 ("Merge branch 'master' into net-next")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Now that both geneve and vxlan modules are converted to
use get_addr(), we can replace inet_get_addr()
in less problematic places and finally get
rid of inet_get_addr().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Now that we have additional information about address
class from get_addr(), we can use it in place of
inet_get_addr().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Now that we have additional information about address
class from get_addr(), we can use it in place of
inet_get_addr().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
We return constant string from tnl_strproto(), no need
to copy it to temporary buffer and then return such
buffer as const: return constant string instead.
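A sketch of the resulting shape: string literals have static storage,
so they can be returned directly. The protocol to name mapping below is
assumed for illustration and sk_strproto() is not the actual
tnl_strproto().

```c
#include <netinet/in.h>

/* Return a constant string for a tunnel protocol number. No buffer is
 * needed: the literals outlive the call. */
static const char *sk_strproto(unsigned char proto)
{
	switch (proto) {
	case IPPROTO_IPIP:
		return "ip";
	case IPPROTO_GRE:
		return "gre";
	case IPPROTO_IPV6:
		return "ipv6";
	case 0:
		return "any";
	default:
		return "unknown";
	}
}
```

Since nothing is copied, the function is also safe to call several
times within one printf() argument list.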
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Both of these two changes are missing for link_vti6.c:
commit 8b47135474 ("ip: link: Unify link type help functions a bit")
commit 561e650eff ("ip link: Shortify printing the usage of link type")
Replay them on link_vti6.c to bring link type help functions
in line with other tunneling code.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
For vti6 tunnel we print [io]key in dotted-quad notation
(ipv4 address) while in vti we do that in hex format.
For vti tunnel we print [io]key only if value is not
zero while for vti6 we miss such check.
Unify vti and vti6 tunnel [io]key output.
While here enlarge s2 buffer to the same size as in rest
of tunnel support code (64 bytes) and check return from
inet_ntop().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
One is missing in JSON output because fprintf()
is used instead of print_uint().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Get rid of code duplication and consolidate encapsulation
options printing in a single function - tnl_print_encap().
Introduce and use tnl_encap_str() to format the encapsulation
option string according to a template and given values to avoid
code duplication and simplify it.
Use print_string() instead of fputs() and fprintf() to
print encapsulation for !is_json_context().
Print "unknown" parameter for "encap" type in PRINT_FP
context using "%s " format specifier and benefit from
compile time string merge.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
No need for custom SPRINT_BUF() and snprintf() 0x%x
value to this buffer: we can use print_0xhex() instead
of print_string().
In link_iptnl.c use s2 instead of s1 buffer and remove
s1.
While there adjust fwmark option print order in iptnl
and ip6tnl to get it match each other.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
For ip tunnels tos can be 0 when not configured, 1 when
inherited from encapsulated packet and rest specifying
diffserv (rfc2474) or tos (rfc1349) bits. It is stored
in packet tos/diffserv field and returned in tos
netlink attribute to userspace.
Simplify and unify tos printing by using print_0xhex()
and print_string() instead of fprintf() to output values.
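The three cases can be sketched as a small formatter. sk_tos_str() is a
hypothetical helper; skipping the field entirely when tos is 0 is an
assumption of this sketch.

```c
#include <stdio.h>

/* Returns NULL when tos is 0 (not configured, nothing to print),
 * "inherit" for 1, and the value as 0x%x otherwise. */
static const char *sk_tos_str(unsigned char tos, char *buf, size_t len)
{
	if (tos == 0)
		return NULL;
	if (tos == 1)
		return "inherit";

	snprintf(buf, len, "0x%x", (unsigned int)tos);
	return buf;
}
```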
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Both ttl/hoplimit is from 1 to 255. Zero has special meaning:
use encapsulated packet value. In ip-link(8) -d output this
looks like "ttl/hoplimit inherit". In JSON we have "int" type
for ttl and therefore values from 0 (inherit) to 255.
To do the best in handling ttl/hoplimit we need to accept
both cases: missing attribute in netlink dump and zero value
for "inherit"ed case. Last one is broken since JSON output
introduction for gre/iptnl versions and was never true for
gre6/ip6tnl.
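Both "inherit" cases can be sketched like this. sk_ttl_str() is a
hypothetical helper; a NULL pointer stands for a ttl attribute that is
missing from the netlink dump.

```c
#include <stdio.h>

/* ttl_attr is NULL when the attribute is absent. Absent and zero both
 * mean "use the value of the encapsulated packet". */
static const char *sk_ttl_str(const unsigned char *ttl_attr,
			      char *buf, size_t len)
{
	if (!ttl_attr || *ttl_attr == 0)
		return "inherit";

	snprintf(buf, len, "%u", (unsigned int)*ttl_attr);
	return buf;
}
```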
For all tunnels, except ip6tnl change JSON type from "int" to
"uint" to reflect true nature of the ttl.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
There are two reasons for switching to the cached variant:
1) ll_index_to_name() may return result from cache,
eliminating expensive ioctl() to the kernel.
Note that most of the code already switched from plain
if_indextoname() to the cached variant, ll_index_to_name(),
in print path because in most cases the cache is populated.
2) It always returns a name, falling back to the form "if%d"
when the entry is not in cache and ioctl() fails. This drops
"link_index" from JSON output.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
We need NEXT_ARG() to get *argv pointing to the "alias"
parameter value. Otherwise we get and check the length of
the "alias" keyword string itself.
Fixes: f88becf35e ("iplink: Process "alias" parameter correctly")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When using the new minimum rate API and providing only one parameter
(minimum rate/maximum rate), we query the VF min and max rate regardless
of kernel support.
This resulted in segmentation fault in ipaddr_loop_each_vf, which tries
to access NULL pointer.
This patch identifies such cases by testing the VF table for NULL
pointer in IFLA_VF_RATE, and aborts the operation.
Aborting on the first VF is valid since if the kernel does not support
the new API for the first VF, it will not support it for the other VFs
as well.
Fixes: f89a2a05ff ("Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool")
Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
According to the documentation (man ip-link), the minimum TXRATE should
be always <= Maximum TXRATE, but commit f89a2a05ff ("Add support to
configure SR-IOV VF minimum and maximum Tx rate through ip tool") didn't
enforce it.
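A minimal sketch of such a check. sk_check_vf_rate() is a hypothetical
helper, and treating a maximum of 0 as "no limit" is an assumption of
this sketch rather than something stated above.

```c
#include <stdio.h>

/* Returns 0 when the pair is acceptable, -1 when min exceeds max.
 * A max_tx_rate of 0 is assumed to mean "unlimited". */
static int sk_check_vf_rate(unsigned int min_tx_rate,
			    unsigned int max_tx_rate)
{
	if (max_tx_rate && min_tx_rate > max_tx_rate) {
		fprintf(stderr,
			"Invalid min_tx_rate %u - must be <= max_tx_rate %u\n",
			min_tx_rate, max_tx_rate);
		return -1;
	}
	return 0;
}
```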
Fixes: f89a2a05ff ("Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool")
Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
These files are already mostly written in POSIX shell, so convert their
shebangs to /bin/sh and tweak the few bashisms in here.
URL: https://crbug.com/756559
Reported-by: Pat Erley <perley@chromium.org>
Signed-off-by: Mike Frysinger <vapier@chromium.org>
To follow gre6 output print hoplimit before encapsulation
limit in link_ip6tnl.c.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
To follow ip6tnl output print flowlabel after tclass
in link_gre6.c.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use %u format specifier to print it in link_gre6.c and
make code more readable.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use @s2 buffer to store string representation of
flowlabel and get rid of extra SPRINT_BUF(): no
need to preserve @s2 contents for later.
Use print_string(PRINT_ANY, ...) with prepared by
snprintf() string for both PRINT_JSON and PRINT_FP
cases.
Omit flowlabel from output if no flowinfo attribute
is given and IP6_TNL_F_USE_ORIG_FLOWLABEL isn't set.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use @s2 buffer to store string representation of
tclass and get rid of extra SPRINT_BUF(): no
need to preserve @s2 contents for later.
Use print_string(PRINT_ANY, ...) with prepared by
snprintf() string for both PRINT_JSON and PRINT_FP
cases.
While there use __u32 for flowinfo in link_gre6.c
and check for IFLA_GRE_FLOWINFO attribute presence.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
It is implementation internal and main purpose
of printing it seems debugging.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In link_gre6.c it seems to be a copy-paste error: tclass is 8 bits,
not 20 as flowlabel.
In link_iptnl.c rename "flowinfo_tclass" to "tclass" as it is the
correct name: flowinfo is an implementation internal name
used to label the u32 attribute that combines tclass and
flowlabel.
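The field widths can be illustrated on a host byte order value, using
the layout of the first 32 bits of the IPv6 header (4 bits version,
8 bits traffic class, 20 bits flow label). The SK_* masks and helpers
are names made up for this sketch; the tunnel code itself works on
network byte order values.

```c
#include <stdint.h>

#define SK_TCLASS_MASK    0x0ff00000u /* 8 bits */
#define SK_FLOWLABEL_MASK 0x000fffffu /* 20 bits */

/* Extract the 8 bit traffic class from a host order flowinfo word. */
static uint8_t sk_tclass(uint32_t flowinfo)
{
	return (uint8_t)((flowinfo & SK_TCLASS_MASK) >> 20);
}

/* Extract the 20 bit flow label from a host order flowinfo word. */
static uint32_t sk_flowlabel(uint32_t flowinfo)
{
	return flowinfo & SK_FLOWLABEL_MASK;
}
```

Printing tclass with a 20 bit mask would leak flow label bits into it,
which is the kind of mix-up described above.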
Fixes: 1facc1c61c ("ip: link_ip6tnl.c: add json output support")
Fixes: 2e706e12d9 ("Merge branch 'master' into net-next")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
It seems missing pair of open_json_object()/close_json_object()
in iptnl implementation.
Note that we open "encap" JSON object in ip6tnl.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Tunnel mode does not appear in parameters print for iptnl
supported tunnels like ipip and sit, while printed for
ip6tnl.
Print tunnel mode as "proto" field name for JSON and
without any name when printing to cli to follow ip6tnl
behaviour.
For non JSON output we have:
$ ip -d link show dev sit1
Before:
-------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
sit remote any local X.X.X.X ...
~~~
After:
------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
sit any remote any local X.X.X.X ...
^^^
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Handling of the "mode" parameter is nearly the same for sit and ipip,
except that sit also has the "ip6ip" mode: check it only when
configuring sit.
Note that there is no need for strcmp(lu->id, "ipip"): if it is
not sit it is "ipip", because we have only these two link utils
defined in the module.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
netdevsim is a new software device for testing kernel APIs
without any hardware attached. Allow users to create such
devices.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
It is already given for original device we configure this
peer for.
Results from following command before/after change applied
are shown below:
$ ip link add dev veth1a type veth peer name veth1b \
type veth peer name veth1c
Before:
-------
<no output, no netdevs created>
After:
------
Error: duplicate "type": "veth" is the second value.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
We always send flowinfo to the kernel. If flowlabel/tclass
was first set to a non-inherit value and then reset to
inherit, we do not clear the flowlabel/tclass part in flowinfo,
send it to the kernel and can get it back from the kernel.
Even if we check for IP6_TNL_F_USE_ORIG_TCLASS and
IP6_TNL_F_USE_ORIG_FLOWLABEL when printing options,
sending an invalid flowlabel/tclass to the kernel seems
a bad idea.
Note that ip6tnl always cleans the corresponding flowinfo
parts on inherit.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
We must clear bit, not set all but given bit.
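The difference between the two forms, as a sketch. SK_FLAG_REMCSUM is a
hypothetical flag value and both helpers exist only for illustration.

```c
#include <stdint.h>

#define SK_FLAG_REMCSUM (1u << 2) /* hypothetical flag bit */

/* Correct: AND with the inverted mask clears only the given bit. */
static uint16_t sk_clear_flag(uint16_t flags, uint16_t bit)
{
	return flags & (uint16_t)~bit;
}

/* Buggy form, kept for comparison: OR with the inverted mask sets
 * every bit except the given one and leaves that one untouched. */
static uint16_t sk_buggy_clear(uint16_t flags, uint16_t bit)
{
	return flags | (uint16_t)~bit;
}
```

Starting from flags 0x0007, the correct form yields 0x0003 while the
buggy one yields 0xffff.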
Fixes: 858dbb208e ("ip link: Add support for remote checksum offload to IP tunnels")
Fixes: 73516e128a ("ip6tnl: Support for fou encapsulation")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
The patch adds erspan usage description, so 'ip link help erspan'
and 'ip link help ip6erspan' shows the options.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Since rtnl_talk() never returns with answer buffer allocated
on error we do not need to release it manually. After this
initializing answer with NULL before rtnl_talk() is useless.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
The patch adds support for configuring the erspan v2, for both
ipv4 and ipv6 erspan implementation. Three additional fields
are added: 'erspan_ver' for distinguishing v1 or v2, 'erspan_dir'
for specifying direction of the mirrored traffic, and 'erspan_hwid'
for users to set ERSPAN engine ID within a system.
As for manpage, the ERSPAN descriptions used to be under GRE, IPIP,
SIT Type paragraph. Since IP6GRE/IP6GRETAP also supports ERSPAN,
the patch removes the old one, creates a separate ERSPAN paragraph,
and adds an example.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
When running "ip route list default" and not specifying address family,
one will get all of the routes instead of just default only. The same
is for "exact default" and "match default".
It behaves in such a way because default route with unspecified family
has the same all-zeroes value like no prefix specified at all. Thus
following code blindly ignores the fact, that prefix was actually
specified.
This patch adds the flag PREFIXLEN_SPECIFIED to the default route too.
And then checks its value when filtering routes.
Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Metric is one of the "unique key" fields of the route in Linux. But
still one can not use its value in filter while running ip list.
Because of this, writing checks in scripts, for example, is inconvenient.
Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
All tunnels already support parsing/adding zero
endpoints and vti6 isn't an exception.
This check was added as part of commit 2a80154fde
(vti6: fix local/remote any addr handling) and looks
too restrictive as purpose of change is to avoid
endpoint configuration from uninitialized data.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use specialized helper to initialize endpoint addresses with
zeros instead of open coding this. This unifies initialization
style with other ipv6 tunnel variants (i.e. gre6 and vti6).
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
It is added with
commit a7ed1520ee ("ip/tunnel: introduce tnl_parse_key()")
to avoid code duplication in ip6?tunnel.c.
Reuse it for gre/gre6 and vti/vti6 tunnel rtnl
configuration interface with the same purpose
it is used in tunnel ioctl interface in ip6?tunnel.c.
While there change type of key variables from
unsigned integer to __be32 to reflect nature of the
value they store and place error message in
tnl_parse_key() on a single line to make single
call to fprintf().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since commit 625df645b7 (Check user supplied interface name lengths)
iplink_parse() validates network device name using check_ifname()
helpers.
Remove redundant "name" length checks from iplink_parse() callers.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Do not stop parameter processing after the "alias" parameter: it might
not be the last one. This seems to have been copy-pasted from the "type"
parameter code.
Check that its length does not exceed IFALIASZ - 1. Better to warn
than to get an RTNL error.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Correctly check for valid network device index supplied on
command line: indexes are always greater than zero. Check
for duplicate "index" argument.
Initialize @index to 0 to simplify handling it in iplink_modify().
Other callers (link_veth.c, iplink_vxcan.c) already did so.
No need to initialize ifi_index with 0 since it is already
initialized at the @struct req initialization time and not
modified in iplink_parse().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Manual page ip-link(8) states that both local and remote accept
IPADDR not PREFIX. Use get_addr() instead of get_prefix() to
parse local/remote endpoint address correctly.
Force corresponding address family instead of using preferred_family
to catch weird cases as shown below.
Before this patch it is possible to create tunnel with commands:
ip li add dev ip6gre2 type ip6gre local fe80::1/64 remote fe80::2/64
ip -4 li add dev ip6gre2 type ip6gre local 10.0.0.1/24 remote 10.0.0.2/24
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
It is fully legal to submit zero (INADDR_ANY/IN6ADDR_ANY_INIT)
value for local and/or remote endpoints for all tunnel drivers:
no need additionally check this in userspace.
Note that all tunnel specific code already can pass zero address
to the kernel.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
The patch adds 'external' option to support collect metadata
gre6 tunnel. The 'external' keyword is already used to set the
device into collect metadata mode such as vxlan, geneve, ipip,
etc. This patch extends support for ipv6 gre and gretap.
Example of L3 and L2 gre device:
bash:~# ip link add dev ip6gre123 type ip6gre external
bash:~# ip link add dev ip6gretap123 type ip6gretap external
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Until kernel exports these, add GSO_MAX values into iplink
rather than assuming they are UINT_MAX + 1
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Validate the upper limit for gso_max_size, valid range is [0-65,536]
inclusive. Fix minor whitespace in iplink man page.
Signed-off-by: Solio Sarabia <solio.sarabia@intel.com>
Add missing tag 'vxcan' inside the help text which was missing in commit
efe459c76d ('ip: link add vxcan support').
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Recently `external` support was added to the tunnel drivers, but there is no way
to introspect this from userspace. This adds support for that.
Now `ip -details link` shows it:
```
7: tunl60@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group
default qlen 1
link/tunnel6 :: brd :: promiscuity 0
ip6tnl external any remote :: local :: encaplimit 0 hoplimit 0 tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000) addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
```
Signed-off-by: Phil Dibowitz <phil@ipom.com>
This allows sending GSO maximum values when configuring a device.
The values are advisory. Most devices will ignore them but for some
pseudo devices such as veth pairs they can be set.
Example:
# ip link add dev vm1 type veth peer name vm2 gso_max_size 32768
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Specifying the IFLA_VXLAN_LINK attribute on a vxlan link modify is
optional in the kernel, so make the id argument optional for "ip link
set ..." to avoid a user needing to specify it when changing another
attribute.
Signed-off-by: Robert Shearman <rs823p@att.com>
Specifying "... ttl inherit" currently does nothing on a GRE link
modify since the previous ttl value is retrieved up front. Fix this by
explicitly setting ttl to 0 when "inherit" is specified for the
option, since 0 represents the semantics of inherit.
Signed-off-by: Robert Shearman <rs823p@att.com>
Looks like a typo: get_u8() returns 0 on success and -1 on error, so the
error checking here was ineffective.
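A minimal sketch of checking a helper that follows this return convention. ex_get_u8() is a stand-in written for this example, not the iproute2 implementation, and ex_parse_encaplimit() is a hypothetical caller:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in with the return convention described above:
 * 0 on success, -1 on error. Not the iproute2 implementation. */
static int ex_get_u8(uint8_t *val, const char *arg, int base)
{
	unsigned long res;
	char *end;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	res = strtoul(arg, &end, base);
	if (errno || *end != '\0' || res > UINT8_MAX)
		return -1;

	*val = (uint8_t)res;
	return 0;
}

/* Hypothetical caller: any nonzero return from the helper is an error,
 * so the result is tested against zero rather than some other value. */
static int ex_parse_encaplimit(const char *arg, uint8_t *limit)
{
	if (ex_get_u8(limit, arg, 0) != 0)
		return -1;
	return 0;
}
```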
Fixes: a11b7b71a6 ("link_gre6: really support encaplimit option")
Signed-off-by: Phil Sutter <phil@nwl.cc>
When xdpoffload option is used, communicate the ifindex down
to the kernel to trigger device-specific load.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
bpf_parse_common() parses and loads the program. Rename it
accordingly.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Program type is needed both for parsing and loading of
the program. Parsing may also induce the type based on
signatures from __bpf_prog_meta. Instead of passing
the type around keep it in struct bpf_cfg_in.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
For all files in iproute2 which do not have an obvious license
identification, mark them with SPDX GPL-2
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adapts the tc command line interface to allow bandwidth limits
to be specified as a percentage of the interface's capacity.
Adding this functionality requires passing the specified device string to
each class/qdisc which changes the prototype for a couple of functions: the
.parse_qopt and .parse_copt interfaces. The device string is a required
parameter for tc-qdisc and tc-class, and when not specified, the kernel
returns ENODEV. In this patch, if the user tries to specify a bandwidth
percentage without naming the device, we return an error from userspace.
Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
Expose identifier type and hook types in ILA configuration
and reporting. This adds support in both ip ila and ILA LWT.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Configuration support in both ip ila and ip LWT for checksum
neutral-map-auto. This is a mode of ILA where checksum
neutral mapping is assumed for packets (there is no C-bit
in the identifier to indicate checksum neutral).
Signed-off-by: Tom Herbert <tom@quantonium.net>
Add checksum neutral to ip ila configuration. This controls whether
the C-bit is interpreted as checksum neutral bit.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Sample output:
$ sudo ./ip/ip fou add port 111 ipproto 11
$ sudo ./ip/ip fou add port 222 ipproto 22 -6
$ ./ip/ip fou show
port 222 ipproto 22 -6
port 111 ipproto 11
Signed-off-by: Greg Greenway <ggreenway@apple.com>
As was reported [1], the iproute2 fails to compile on old systems,
in Cong's case, it was Fedora 19, in our case it was RedHat 7.2, which
failed with the following errors during compilation:
ipxfrm.c: In function ‘xfrm_selector_print’:
ipxfrm.c:479:7: error: ‘IPPROTO_MH’ undeclared (first use in this
function)
case IPPROTO_MH:
^
ipxfrm.c:479:7: note: each undeclared identifier is reported only once
for each function it appears in
ipxfrm.c: In function ‘xfrm_selector_upspec_parse’:
ipxfrm.c:1345:8: error: ‘IPPROTO_MH’ undeclared (first use in this
function)
case IPPROTO_MH:
^
make[1]: *** [ipxfrm.o] Error 1
The reason is the order of header files. The IPPROTO_MH field is
set in kernel's UAPI header file (in6.h), but only in case
__UAPI_DEF_IPPROTO_V6 is set before. That define comes from other kernel's
header file (libc-compat.h) and is set in case there are no previous
libc relevant declarations.
In ip code, the include of <netdb.h> causes indirect inclusion of
<netinet/in.h>, which sets __UAPI_DEF_IPPROTO_V6 to zero and prevents
the IPPROTO_MH declaration.
This patch takes the simplest possible approach to fix the compilation
error by checking if IPPROTO_MH was defined before and in case it
wasn't, it defines it to be the same as in the kernel.
[1] https://www.spinics.net/lists/netdev/msg463980.html
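The fallback described above can be sketched like this. 135 is the IANA-assigned protocol number for the IPv6 Mobility Header; ex_is_mobility_header() is a hypothetical helper added only to show the constant in use:

```c
#include <netinet/in.h>

/* Fallback for header combinations that leave IPPROTO_MH undeclared.
 * 135 is the IANA protocol number for the Mobility Header. */
#ifndef IPPROTO_MH
#define IPPROTO_MH 135
#endif

/* Hypothetical helper: compiles whether or not libc provides the name. */
static int ex_is_mobility_header(int proto)
{
	return proto == IPPROTO_MH;
}
```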
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Riad Abo Raed <riada@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Any iproute utility that uses any function from lib/utils.c needs
to declare its own resolve_hosts variable instance although it does
not need/use hostname resolving functionality (currently only the 'ip'
and 'ss' commands use this).
The patch declares single common instance of resolve_hosts directly
in utils.c so the existing ones can be removed (the same approach
that is used for timestamp_short).
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Using 'ip deleteall' with policies that have marks fails unless you
explicitly specify the mark values. This is very uncomfortable when
bulk-deleting policies and states. With this patch all relevant states
and policies are wiped by 'ip deleteall' regardless of their mark
values.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Socket policies are added to a socket using setsockopt(2). They cannot be
deleted by iproute2. The attempt to delete them causes an error
(EINVAL).
To avoid this unnecessary error message all socket policies are skipped
in xfrm_policy_keep.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Listing policies on systems with a lot of socket policies can be
confusing due to the number of returned policies. Even if socket policies
are not of interest, they cannot be filtered. This patch adds an option
to filter all socket policies from the output.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
IPvlan supported bridge-only functionality prior to commits
a190d04db937 ('ipvlan: introduce 'private' attribute for all
existing modes.') and fe89aa6b250c ('ipvlan: implement VEPA mode').
These two commits make it possible to configure the VEPA and private modes.
This patch adds those options in ip command.
e.g.
bash:~# ip link add link eth0 name ipvl0 type ipvlan mode l2 private
-or-
bash:~# ip link add link eth0 name ipvl0 type ipvlan mode l2 vepa
Also the output will reflect the mode and the mode-flag accordingly.
e.g.
bash:~# ip -details link show ipvl0
4: ipvl0@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc ...
link/ether 00:1a:11:44:a5:3e brd ff:ff:ff:ff:ff:ff promiscuity 0
ipvlan mode l2 private addrgenmode eui64 numtxqueues 1 ...
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
This patch adds fastopen_no_cookie option to enable/disable TCP fastopen
without a cookie on a per-route basis.
Support in Linux was added with 71c02379c762 (tcp: Configure TFO without
cookie per socket and/or per route).
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Use strtol-based API to parse and validate integer input; atoi() does
not detect errors and may yield undefined behaviour if result can't be
represented.
v2: use get_unsigned() since network namespace is really an unsigned value.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
IP6_TNL_F_ALLOW_LOCAL_REMOTE allows tunnel traffic on ip6tnl devices
where the remote endpoint is a local host address.
Specifying "[no]allow-localremote" controls the
IP6_TNL_F_ALLOW_LOCAL_REMOTE flag on ip6tnl interfaces.
This is the user-space counterpart for kernel
commit 908d140a87a7 ("ip6_tunnel: Allow rcv/xmit even if remote address is a local address")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
This config maps to IFLA_BRPORT_VLAN_TUNNEL bridge port netlink
flag attribute. This flag enables vlan to tunnel mapping on a bridge
port. It is off by default.
set vlan_tunnel attribute on bridge port vxlan0:
$ip link set dev vxlan0 type bridge_slave vlan_tunnel on
$ip link set dev vxlan0 type bridge_slave vlan_tunnel off
or via bridge command
$bridge link set dev vxlan0 vlan_tunnel on
$bridge link set dev vxlan0 vlan_tunnel off
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
This is an update for 460c03f3f3 ("iplink: double the buffer size also in
iplink_get()"). After update, we will not need to double the buffer size
every time when VFs number increased.
With call like rtnl_talk(&rth, &req.n, NULL, 0), we can simply remove the
length parameter.
With call like rtnl_talk(&rth, nlh, nlh, sizeof(req)), I add a new variable
answer to avoid overwriting data in nlh, because it may have more info after
nlh. This also avoids the problem of the nlh buffer being too small.
We need to free answer after using.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Add neigh_suppress to the type help and document it in ip-link's man page.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Commit 530903dd90 ("ip: fix igmp parsing when iface is long") uses
variable len to keep trailing colon from interface name comparison. This
variable is local to loop body but we set it in one pass and use it in
following one(s) so that we are actually using (pseudo)random length for
comparison. This became apparent since commit b48a1161f5 ("ipmaddr: Avoid
accessing uninitialized data") always initializes len to zero so that the
name comparison is always true. As a result, "ip maddr show dev eth0" shows
IPv4 multicast addresses for all interfaces.
Instead of keeping the length, let's simply replace the trailing colon with
a null byte. The bonus is that we get correct interface name in ma.name.
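The replacement of the trailing colon can be sketched as below. ex_strip_colon() is a hypothetical helper name; the actual parsing loop in ipmaddr.c is not reproduced here:

```c
#include <string.h>

/* Strip one trailing ':' from name in place, e.g. "eth0:" -> "eth0".
 * After this the name can be compared with a plain strcmp(). */
static void ex_strip_colon(char *name)
{
	size_t len = strlen(name);

	if (len > 0 && name[len - 1] == ':')
		name[len - 1] = '\0';
}
```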
Fixes: 530903dd90 ("ip: fix igmp parsing when iface is long")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Petr Vorel <pvorel@suse.cz>
This patch adds the iproute2 support for getting and setting the
per-port group_fwd_mask. It also tries to resolve the value into a more
human friendly format by printing the known protocols instead of only
the raw value.
The man page is also updated with the new option.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
neigh suppression can be used to suppress arp and nd flood
to bridge ports. It maps to the recently added
kernel support for bridge port flag IFLA_BRPORT_NEIGH_SUPPRESS.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Since kernel net-next commit c7c0bbeae950 ("net: ipmr: Add MFC offload
indication") the kernel indicates on an MFC entry whether it was offloaded
using the RTNH_F_OFFLOAD flag. Update the "ip mroute show" command to
indicate when a route is offloaded, similarly to the "ip route show"
command.
Example output:
$ ip mroute
(0.0.0.0, 239.255.0.1) Iif: sw1p7 Oifs: t_br0 State: resolved offload
(192.168.1.1, 239.255.0.1) Iif: sw1p7 Oifs: sw1p4 State: resolved offload
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
The original problem was that something like:
| strncpy(ifr.ifr_name, *argv, IFNAMSIZ);
might leave ifr.ifr_name unterminated if length of *argv exceeds
IFNAMSIZ. In order to fix this, I thought about replacing all those
cases with (equivalent) calls to snprintf() or even introducing
strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting
the latter from being added to glibc, truncating a string without
notifying the user is not to be considered good practice. So let's
exercise what he suggested and reject empty, overlong or otherwise
invalid interface names right from the start - this way calls to
strncpy() like shown above become safe and the user has a chance to
reconsider what he was trying to do.
Note that this doesn't add calls to check_ifname() to all places where
user supplied interface name is parsed. In many cases, the interface
must exist already and is therefore looked up using ll_name_to_index(),
so if_nametoindex() will perform the necessary checks already.
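A rough sketch of such an up-front check. ex_check_ifname() is a hypothetical helper, and the exact set of rejected characters is an assumption made for illustration:

```c
#include <ctype.h>
#include <net/if.h>   /* IFNAMSIZ */
#include <string.h>

#ifndef IFNAMSIZ
#define IFNAMSIZ 16   /* fallback if the libc header does not expose it */
#endif

/* Returns 0 if name is acceptable, -1 otherwise.
 * Hypothetical helper; the character rules are an assumption. */
static int ex_check_ifname(const char *name)
{
	if (!name || *name == '\0')
		return -1;              /* empty */

	if (strlen(name) >= IFNAMSIZ)
		return -1;              /* no room for the terminating NUL */

	for (; *name; name++) {
		if (*name == '/' || isspace((unsigned char)*name))
			return -1;      /* otherwise invalid */
	}
	return 0;
}
```

Once a name has passed a check like this, strncpy(dst, name, IFNAMSIZ) always copies the terminating NUL as well.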
Signed-off-by: Phil Sutter <phil@nwl.cc>
In both files' parse_args() functions as well as in iptunnel's do_prl()
and do_6rd() functions, a user-supplied 'dev' parameter is uselessly
copied into a temporary buffer before passing it to ll_name_to_index()
or copying into a struct ifreq. Avoid this by just caching the argv
pointer value until the later lookup/strcpy.
Signed-off-by: Phil Sutter <phil@nwl.cc>
When SA is added manually using "ip xfrm state add", xfrm_state_modify()
uses alg_key_len field of struct xfrm_algo for the length of key passed to
kernel in the netlink message. However alg_key_len is bit length of the key
while we need byte length here. This is usually harmless as kernel ignores
the excess data but when the bit length of the key exceeds 512
(XFRM_ALGO_KEY_BUF_SIZE), it can result in buffer overflow.
We can simply divide by 8 here as the only place setting alg_key_len is in
xfrm_algo_parse() where it is always set to a multiple of 8 (and there are
already multiple places using "algo->alg_key_len / 8").
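The bits-versus-bytes distinction can be illustrated as follows. EX_ALGO_HDR_LEN and ex_algo_attr_len() are hypothetical names and the header size is an arbitrary example value, not the real size of struct xfrm_algo:

```c
#include <stddef.h>

/* Arbitrary example size for the fixed part of the attribute. */
#define EX_ALGO_HDR_LEN 68u

/* Attribute payload length: fixed header plus the key in bytes.
 * key_len_bits is a bit count, so it is divided by 8 here. */
static size_t ex_algo_attr_len(unsigned int key_len_bits)
{
	return EX_ALGO_HDR_LEN + key_len_bits / 8;
}
```

For a 256-bit key this yields the header plus 32 bytes; using the bit count directly would claim 256 bytes of key data.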
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
This fixes a corner-case for routes with a certain metric locked to
zero:
| ip route add 192.168.7.0/24 dev eth0 window 0
| ip route add 192.168.7.0/24 dev eth0 window lock 0
Since the kernel doesn't dump the attribute if it is zero, both routes
added above would appear as if they were equal although they are not.
Fix this by taking mxlock value for the given metric into account before
skipping it if it is not present.
Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
As Stephen Hemminger mentioned on the last submission the new_json_obj
function is always called with fp == stdout, so right now, there's no
need of this extra argument.
The background for the rework is the following:
The ip monitor didn't call `new_json_obj` (even in a non-JSON context),
so the static FILE* _fp variable wasn't initialized, thus raising a
SIGSEGV in ipaddress.c. This patch should fix this issue for good, new
paths won't have to call `new_json_obj`.
How to reproduce:
$ ip -t mon label link
(gdb) bt
.#0 _IO_vfprintf_internal (s=s@entry=0x0, format=format@entry=0x45460d "%d: ", ap=ap@entry=0x7fffffff7f18) at vfprintf.c:1278
.#1 0x0000000000451310 in color_fprintf (fp=0x0, attr=<optimized out>, fmt=0x45460d "%d: ") at color.c:108
.#2 0x000000000044a856 in print_color_int (t=t@entry=PRINT_ANY, color=color@entry=4294967295, key=key@entry=0x4545fc "ifindex",
fmt=fmt@entry=0x45460d "%d: ", value=<optimized out>) at ip_print.c:132
.#3 0x000000000040ccd2 in print_int (value=<optimized out>, fmt=0x45460d "%d: ", key=0x4545fc "ifindex", t=PRINT_ANY) at ip_common.h:189
.#4 print_linkinfo (who=<optimized out>, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipaddress.c:1107
.#5 0x0000000000422e13 in accept_msg (who=0x7fffffff8320, ctrl=0x7fffffff8310, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipmonitor.c:89
.#6 0x000000000044c58f in rtnl_listen (rtnl=0x672160 <rth>, handler=handler@entry=0x422c70 <accept_msg>, jarg=0x7ffff77a82a0 <_IO_2_1_stdout_>)
at libnetlink.c:761
.#7 0x00000000004233db in do_ipmonitor (argc=<optimized out>, argv=0x7fffffffe5a0) at ipmonitor.c:310
.#8 0x0000000000408f74 in do_cmd (argv0=0x7fffffffe7f5 "mon", argc=3, argv=0x7fffffffe588) at ip.c:116
.#9 0x0000000000408a94 in main (argc=4, argv=0x7fffffffe580) at ip.c:311
Fixes: 6377572f ("ip: ip_print: add new API to print JSON or regular format output")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
Move the json printer which is based on json writer into the
iproute2 library, so it can be used by library code and tools
other than ip. Should probably have been done from the beginning
like that given json writer is in the library already anyway.
No functional changes.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Julien Fortin <julien@cumulusnetworks.com>
Obviously, 'addr showdump' feature wasn't adjusted to json output
support. As a consequence, calls to print_string() in print_addrinfo()
tried to dereference a NULL FILE pointer.
Fixes: d0e720111a ("ip: ipaddress.c: add support for json output")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Consolidate dump of prog info to use bpf_dump_prog_info() when possible.
Moving forward, we want to have a consistent output for BPF progs when
being dumped. E.g. in cls/act case we used to dump tag as a separate
netlink attribute before we had BPF_OBJ_GET_INFO_BY_FD bpf(2) command.
Move dumping tag into bpf_dump_prog_info() as well, and only dump the
netlink attribute for older kernels. Also, reuse bpf_dump_prog_info()
for XDP case, so we can dump tag and whether program was jited, which
we currently don't show.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Commit 72b365e8e0 ("libnetlink: Double the dump buffer size") increased
the buffer size for "ip link show" command to 32 KB to handle NICs with
large number of VFs. With "dev" filter, a different code path is taken and
iplink_get() still uses only 16 KB buffer.
The size of 32768 is not very future-proof as NICs supporting 120-128 VFs
are already in use so that single RTM_NEWLINK message in the dump can
exceed 30000 bytes. But it's what rtnl_talk() and rtnl_dump_filter_l() use
so let's be consistent. Once this proves insufficient, all three sizes
should be increased.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
If message length exceeds maxlen argument of rtnl_talk(), it is truncated
to maxlen but unlike in the case of truncation to the length of local
buffer in rtnl_talk(), the caller doesn't get any indication of a problem.
In particular, iplink_get() passes the truncated message on and parsing it
results in various warnings and sometimes even a segfault (observed with
"ip link show dev ..." for a NIC with 125 VFs).
Handle message truncation in iplink_get() the same way as truncation in
rtnl_talk() would be handled: return an error.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
This patch converts spots where manual buffer termination was missing to
strlcpy() since that does what is needed.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Print the value analogous to flowlabel. While being at it, also break
the overlong lines to not exceed 80 characters boundary.
Signed-off-by: Phil Sutter <phil@nwl.cc>
When trying to change tclass or flowlabel of a GREv6 tunnel which has
the respective value set already, the code accidentally bitwise OR'ed
the old and the new value, leading to unexpected results. Fix this by
clearing the relevant bits of flowinfo variable prior to assigning the
new value.
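A sketch of the clear-then-assign pattern. The masks are written in host byte order for readability, which is an assumption of this example; real code would have to account for network byte order. All names are hypothetical:

```c
#include <stdint.h>

/* IPv6 flowinfo layout: 4 bits version, 8 bits traffic class,
 * 20 bits flow label. Host-byte-order masks, for illustration. */
#define EX_FLOWINFO_TCLASS    0x0FF00000u
#define EX_FLOWINFO_FLOWLABEL 0x000FFFFFu

static uint32_t ex_set_tclass(uint32_t flowinfo, uint8_t tclass)
{
	flowinfo &= ~EX_FLOWINFO_TCLASS;        /* drop the old value first */
	flowinfo |= ((uint32_t)tclass << 20) & EX_FLOWINFO_TCLASS;
	return flowinfo;
}

static uint32_t ex_set_flowlabel(uint32_t flowinfo, uint32_t label)
{
	flowinfo &= ~EX_FLOWINFO_FLOWLABEL;     /* drop the old value first */
	flowinfo |= label & EX_FLOWINFO_FLOWLABEL;
	return flowinfo;
}
```

Without the masking step, setting tclass 0x01 on top of an existing 0xFF would leave 0xFF in place.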
Fixes: af89576d7a ("iproute2: GRE over IPv6 tunnel support.")
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch adds support for the L2ENCAP seg6 mode, enabling to encapsulate
L2 frames within SRv6 packets.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
The original issue was that filter.name might end up unterminated if
user provided string was too long. But in fact it is not necessary to
copy the commandline parameter at all: just make filter.name point to it
instead.
Signed-off-by: Phil Sutter <phil@nwl.cc>
The patch adds ERSPAN type II tunnel support. The implementation is
based on the draft at
https://tools.ietf.org/html/draft-foschiano-erspan-01.
One of the purposes is for Linux box to be able to receive ERSPAN
monitoring traffic sent from the Cisco switch, by creating a ERSPAN
tunnel device. In addition, the patch also adds ERSPAN TX, so traffic
can also be encapsulated into ERSPAN and sent out.
The implementation reuses the key as ERSPAN session ID, and
field 'erspan' as ERSPAN Index fields:
./ip link add dev ers11 type erspan seq key 100 erspan 123 \
local 172.16.1.200 remote 172.16.1.100
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Meenakshi Vohra <mvohra@vmware.com>
This renames Config to config.mk and includes more Make input.
Now configure generates all the required CFLAGS and LDLIBS for
the optional libraries.
Also, use pkg-config to test for libelf, rather than using a test
program. This makes it consistent with other libraries.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
can_state_names array contains at most CAN_STATE_MAX fields, so allowing
an index to it to be equal to that number is wrong. While here, also
make sure the array is indeed that big so nothing bad happens if
CAN_STATE_MAX ever increases.
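Both points can be sketched with a hypothetical enum and name table; the names below are stand-ins, not the CAN state definitions:

```c
#include <stddef.h>

enum ex_state { EX_STATE_ACTIVE, EX_STATE_BUSOFF, EX_STATE_STOPPED, EX_STATE_MAX };

/* Sized explicitly: if EX_STATE_MAX grows, the new slots are NULL
 * instead of lying beyond the end of the array. */
static const char *const ex_state_names[EX_STATE_MAX] = {
	[EX_STATE_ACTIVE]  = "ACTIVE",
	[EX_STATE_BUSOFF]  = "BUS-OFF",
	[EX_STATE_STOPPED] = "STOPPED",
};

/* Valid indexes are 0 .. EX_STATE_MAX - 1, so the bound check must
 * reject EX_STATE_MAX itself. */
static const char *ex_state_name(unsigned int state)
{
	if (state >= EX_STATE_MAX || !ex_state_names[state])
		return "UNKNOWN";
	return ex_state_names[state];
}
```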
Signed-off-by: Phil Sutter <phil@nwl.cc>
Covscan complained about dead code but after reading it, I assume the
author's intention was to prefix the interface list with 'Oifs: '.
Initializing first to 1 and setting it to 0 after above prefix was
printed should fix it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This variable is initialized at declaration and nowhere else does any
assignment to it happen, so just drop the check.
Signed-off-by: Phil Sutter <phil@nwl.cc>
ila_csum_name2mode() returning -1 on error but being declared as
returning __u8 doesn't make much sense. Change the code to correctly
detect this issue. Checking for __u8 overruns shouldn't be necessary
though since ila_csum_name2mode() return values are well-defined.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This prevents word-splitting and therefore leads to more accurate error
message in case 'grep -c' prints something other than a number.
Signed-off-by: Phil Sutter <phil@nwl.cc>
To avoid code duplication and have a lighter impact on most of the files,
these functions were made to handle both stdout (FP context) or JSON
output. Using this api, the changes are easier to read and the code
stays as compact as possible.
Include json_writer.h in ip_common.h to make the lib/json_writer.c
functions available to the new "ip_print" api.
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
This patch adds support for the seg6local lightweight tunnel
("ip route add ... encap seg6local ...").
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Commit 69fed534a5 ("change how Config is used in Makefile's") moved
HAVE_MNL specific CFLAGS/LDLIBS for building with libmnl out of the
top level Makefile into sub-Makefiles. However, it also removed the
HAVE_ELF specific CFLAGS/LDLIBS entirely, which breaks the BPF object
loader for tc and ip with "No ELF library support compiled in." despite
having libelf detected in configure script. Fix it similarly as in
69fed534a5 for HAVE_ELF.
Fixes: 69fed534a5 ("change how Config is used in Makefile's")
Reported-by: Jeffrey Panneman <jeffrey.panneman@tno.nl>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The ikey and okey value are normal u32 values. The input accepts
them in dotted, hex or decimal form. For output, hex seems like
the best form since they are not really addresses.
Suggested-by: Christian Langrock <christian.langrock@secunet.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
According to the IPv4 behavior of 'ip' it should be possible
to omit the arguments for local and remote address.
Without this patch omitting these parameters would lead to
uninitialized memory being interpreted as IPv6 addresses.
Reported-by: Christian Langrock <christian.langrock@secunet.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When ip netns {add|delete} is first run, it bind-mounts /var/run/netns
on top of itself, then marks it as shared. However, if there are already
bind-mounts in the directory from other tools, these would not be
propagated. Fix this by recursively bind-mounting.
Signed-off-by: Casey Callendrello <casey.callendrello@coreos.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Since kernel commit 475abbf1ef67 ("ipv4: fib: Set offload indication
according to nexthop flags") offload indication is reported on a
per-nexthop basis.
Adjust iproute2 to display it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
For most of the address flags, use a table of values rather
than open coding every value. This allows for easier inevitable
expansion of flags.
This also fixes the missing stable-privacy flag.
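A table-driven approach might look roughly like this. The flag values, names and the ex_format_flags() helper are illustrative assumptions, not the table from ipaddress.c:

```c
#include <stddef.h>
#include <stdio.h>

/* Example flag bits; values chosen for illustration. */
#define EX_F_SECONDARY  0x01u
#define EX_F_NODAD      0x02u
#define EX_F_DEPRECATED 0x20u

static const struct {
	unsigned int value;
	const char *name;
} ex_flag_names[] = {
	{ EX_F_SECONDARY,  "secondary" },
	{ EX_F_NODAD,      "nodad" },
	{ EX_F_DEPRECATED, "deprecated" },
};

/* Write the names of all known flags set in flags into buf, each
 * followed by a space. Returns the bits that were not written
 * (unknown flags, or names that did not fit). */
static unsigned int ex_format_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i, used = 0;

	if (len == 0)
		return flags;
	buf[0] = '\0';

	for (i = 0; i < sizeof(ex_flag_names) / sizeof(ex_flag_names[0]); i++) {
		int n;

		if (!(flags & ex_flag_names[i].value))
			continue;

		n = snprintf(buf + used, len - used, "%s ", ex_flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';       /* drop the partial name */
			break;
		}
		used += (size_t)n;
		flags &= ~ex_flag_names[i].value;
	}
	return flags;
}
```

Supporting a new flag then means adding one row to the table instead of another open-coded branch.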
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip netns accepts invalid input as namespace name like an empty string or a
string longer than the maximum file name length.
Check that the netns name is not empty and less than or equal to NAME_MAX.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Ability to change geneve device attributes was added to kernel through
commit 5b861f6baa3a ("geneve: add rtnl changelink support"), however one
cannot do the same through ip-link(8) command. Changing the allowed
geneve device attributes using 'ip link set <geneve_name> type geneve id
<geneve_id> <allowed_attributes>' currently fails with 'operation not
supported' error. This patch adds support for it.
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
This patch replaces exits with returns in ip route
commands.
This allows processing to continue when invoked with ip -batch.
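Why the return matters can be shown with a toy batch loop. Everything
here is hypothetical (handler, loop and the force parameter are stand-ins
for illustration, not the ip sources): a handler that called exit() would
end the whole batch, one that returns lets the loop decide.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical command handler: reports failure to the caller instead
 * of terminating the process. */
int ex_do_route_cmd(const char *line)
{
	if (strncmp(line, "route ", 6) != 0) {
		fprintf(stderr, "Command failed: %s\n", line);
		return -1;
	}
	return 0;
}

/* Run every line of a batch. With force set, keep going after a
 * failure. Returns the number of failed lines. */
int ex_run_batch(const char *const lines[], int n, int force)
{
	int failed = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (ex_do_route_cmd(lines[i]) != 0) {
			failed++;
			if (!force)
				break;
		}
	}
	return failed;
}
```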
Signed-off-by: Élie Bouttier <elie@bouttier.eu>
In the presence of firewalls which improperly block ICMP Unreachable
(including Fragmentation Required) messages, Path MTU Discovery is
prevented from working.
The workaround is to handle IPv4 payloads opaquely, ignoring the DF
bit.
Kernel commit 22a59be8b7693eb2d0897a9638f5991f2f8e4ddd ("net: ipv4:
Add ability to have GRE ignore DF bit in IPv4 payloads") is
complemented by this user-space changeset which exposes control of
this setting.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>
ip netns keeps track of created namespaces with bind mounts named
/var/run/netns/<namespace>. No input sanitization is done, allowing creation and
deletion of files relative to /var/run/netns or, if the path is nonexistent or
invalid, creation of "untracked" namespaces (invisible to the tool).
This commit denies creation or deletion of namespaces with names containing
"/" or matching exactly "." or "..".
Signed-off-by: Matteo Croce <mcroce@redhat.com>
This patch extends route get to support mpls specific
route attributes like RTA_NEWDST.
Input:
RTA_DST - input label
RTA_NEWDST - labels in packet for multipath selection
By default the getroute handler returns matched
nexthop label, via and oif
With fibmatch keyword (RTM_F_FIB_MATCH flag), full matched
route is returned.
example:
$ip -f mpls route show
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
201
nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12
$ip -f mpls route get 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get 101
101 as to 102/103 via inet 172.16.2.2 dev virt1-2
$ip -f mpls route get as to 302/303 101
101 as to 302/303 via inet 172.16.12.2 dev virt1-12
$ip -f mpls route get fibmatch 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get fibmatch 101
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Let XDP link set command request that the program be offloaded.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Allow user to select XDP DRV_MODE flag by using xdpdrv keyword
instead of xdp or xdpgeneric.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Add interpretation of XDP_ATTACHED_HW mode on dump.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
This patch adds support to the newly added IFLA_XDP_PROG_ID.
./ip link show dev eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdpgeneric/id:2 qdisc [...]
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
If a header that includes linux/in6.h is included before
iproute's utils.h, then iproute2 fails to compile on older
glibc versions.
Fixes: e8493916a8 ("iproute: add support for SR-IPv6 lwtunnel encapsulation")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
After upstream commit 5071034e4af7 ('neigh: Really delete an arp/neigh entry
on "ip neigh delete" or "arp -d"'), we can now delete a single FAILED neighbour
entry. But `ip neigh flush` still skips the FAILED entry.
Move the filter after first round flush so we can flush FAILED entry on fixed
kernel and also do not keep retrying on old kernel.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
When the user specifies `table all` or `table 0` to
the `ip mroute show` command we dump the entirety of
the known mroute tables. Without some sort of
divisor to tell us what table we are looking at
the command is useless.
Add `Table: <vrf name>` to the output of 'ip mroute show table 0'.
Follow the convention established by 'ip route show table 0'
for when to display it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This option is documented in gre6 help, but was not supported.
Fixes: af89576d7a ("iproute2: GRE over IPv6 tunnel support.")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
When modifying a route we set the RTA_OIF attribute only if a device was
specified with "dev" or "oif" keyword. But for some unknown reason we
earlier alternatively check also for the presence of "nexthop" keyword,
even though it has no effect. So remove the pointless check.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Add IFLA_EVENT output so that event types can be viewed with
'monitor' command. This gives a little more information for why
a given message was received.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Uses newly introduced RTM_GETROUTE flag RTM_F_FIB_MATCH
to return a matching fib route. Introduces 'fibmatch'
keyword to ip route get.
ipv4:
----
$ip route show
default via 192.168.0.2 dev eth0
10.0.14.0/24
nexthop via 172.16.0.3 dev dummy0 weight 1
nexthop via 172.16.1.3 dev dummy1 weight 1
$ip route get 10.0.14.2
10.0.14.2 via 172.16.1.3 dev dummy1 src 172.16.1.1
cache
$ip route get fibmatch 10.0.14.2
10.0.14.0/24
nexthop via 172.16.0.3 dev dummy0 weight 1
nexthop via 172.16.1.3 dev dummy1 weight 1
ipv6:
----
$ip -6 route show
2001:db9:100::/120 metric 1024
nexthop via 2001:db8:2::2 dev dummy0 weight 1
nexthop via 2001:db8:12::2 dev dummy1 weight 1
$ip -6 route get 2001:db9:100::1
2001:db9:100::1 from :: via 2001:db8:12::2 dev dummy1 \
src 2001:db8:12::1 metric 1024 pref medium
$ip -6 route get fibmatch 2001:db9:100::1
2001:db9:100::/120 metric 1024
nexthop via 2001:db8:12::2 dev dummy1 weight 1
nexthop via 2001:db8:2::2 dev dummy0 weight 1
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Add to usage message a description of how to configure Infiniband node
and port GUIDs. Also modify the man page to emphasize the GUIDs are
configured for Infiniband VFs.
Fixes: d91fb3f4c7 ("Add support for configuring Infiniband GUIDs")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Since commit a8f820a380a2a06 ('can: add Virtual CAN Tunnel driver (vxcan)')
for Linux 4.12 a virtual CAN tunnel driver analogous to veth is available in
Linux.
This patch adds the ability to create vxcan device pairs.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Change print_linkinfo_brief to take the filter as an input arg.
If the arg is NULL, use the global filter in ipaddress.c.
Signed-off-by: David Ahern <dsahern@gmail.com>
ipaddr_list_flush_or_save generates a list of nlmsg's for links and
optionally for addresses. Move the code into ip_linkaddr_list and
export it along with the supporting infrastructure.
API to use this function is:
struct nlmsg_chain linfo = { NULL, NULL};
struct nlmsg_chain ainfo = { NULL, NULL};
ip_linkaddr_list(family, filter_req, &linfo, &ainfo);
... error checking and code looping over linfo/ainfo ...
free_nlmsg_chain(&linfo);
free_nlmsg_chain(&ainfo);
Signed-off-by: David Ahern <dsahern@gmail.com>
Follow-up to d67b9cd28c1d ("xdp: refine xdp api with regards to
generic xdp") in order to update the XDP dumping part.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Including libc headers first helps as a workaround to redefinition of struct
ethhdr with a suitably patched musl libc that suppresses the kernel
if_ether.h.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Ability to change vxlan device attributes was added to kernel through
commit 8bcdc4f3a20b ("vxlan: add changelink support"), however one
cannot do the same through ip(8) command. Changing the allowed vxlan
device attributes using 'ip link set dev <vxlan_name> type vxlan
<allowed_attributes>' currently fails with 'operation not supported'
error. This failure is due to the incorrect rtnetlink message
construction for the 'ip link set' operation.
The vxlan_parse_opt() callback function is called for parsing options
for both 'ip link add' and 'ip link set'. For the 'add' case, we pass
down default values for those attributes that were not provided as CLI
options. However, for the 'set' case we should be only passing down the
explicitly provided attributes and not any other (default) attributes.
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
syntax:
ip xfrm state .... offload dev <if-name> dir <in or out>
Example to add inbound offload:
ip xfrm state .... offload dev mlx0 dir in
Example to add outbound offload:
ip xfrm state .... offload dev mlx0 dir out
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Follow-up to commit c7272ca720 ("bpf: add initial support for
attaching xdp progs") to also support generic XDP. This adds an
indicator for loaded generic XDP programs when programs are loaded
as shown in c7272ca720, but the driver still lacks native XDP
support.
# ip link
[...]
3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc [...]
link/ether 0c:c4:7a:03:f9:25 brd ff:ff:ff:ff:ff:ff
[...]
In case the driver does support native XDP, but the user wants
to load the program as generic XDP (e.g. for testing purposes),
then this can be done with the same semantics as in c7272ca720,
but with 'xdpgeneric' instead of 'xdp' command for loading:
# ip -force link set dev eno1 xdpgeneric obj xdp.o
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David S. Miller <davem@davemloft.net>
As noticed by one of the few users of routel script, it ends up in an
infinite loop when they pull out the cable from the NIC used for some
route. This is caused by its parser expecting the line of "ip route show"
output consists of "key value" pairs (except for the initial target range),
together with an old trap of Bourne style shells that "shift 2" does
nothing if there is only one argument left. Some keywords, e.g. "linkdown",
are not followed by a value.
Improve the parser to
(1) only set variables for keywords we care about
(2) recognize (currently) known keywords without value
This is still far from perfect (and certainly not future proof) but to
fully fix the script, one would probably have to rewrite the logic
completely (and I'm not sure it's worth the effort).
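The parsing rule can be illustrated in C (the script itself is shell;
this is only a sketch of the logic, with a hypothetical function name and
an illustrative list of valueless keywords). The important part is that a
word without a value always advances the position, so the loop cannot
spin the way "shift 2" did with one argument left:

```c
#include <stddef.h>
#include <string.h>

/* Keywords that are printed without a value (illustrative list). */
static int ex_is_flag_word(const char *w)
{
	static const char *const flags[] = { "linkdown", "onlink", "dead" };
	size_t i;

	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++)
		if (strcmp(w, flags[i]) == 0)
			return 1;
	return 0;
}

/* w[0] is the route target, the rest are "key value" pairs or flags.
 * Only the keys we care about are stored. Returns -1 if a key is left
 * without a value instead of looping forever. */
int ex_parse_route_words(int n, const char *const w[],
			 const char **dev, const char **via)
{
	int i = 1;

	*dev = NULL;
	*via = NULL;
	while (i < n) {
		if (ex_is_flag_word(w[i])) {
			i++;			/* flag: consume one word */
			continue;
		}
		if (i + 1 >= n)
			return -1;		/* dangling key */
		if (strcmp(w[i], "dev") == 0)
			*dev = w[i + 1];
		else if (strcmp(w[i], "via") == 0)
			*via = w[i + 1];
		i += 2;				/* key and value */
	}
	return 0;
}
```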
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
This attribute allows the administrator to adjust the packet marking
attribute of tunnels that support policy based routing.
Signed-off-by: Craig Gallek <kraig@google.com>
This patch adds commands to support the tunnel source properties
("ip sr tunsrc") and the HMAC key -> secret, algorithm binding
("ip sr hmac").
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
'ip vrf pids' is used to list processes bound to a vrf, but it only
shows the pid leaving a lot of work for the user. Add the command
name to the output. With this patch you get the more user friendly:
$ ip vrf pids mgmt
1121 ntpd
1418 gdm-session-wor
1488 gnome-session
1491 dbus-launch
1492 dbus-daemon
1565 sshd
...
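One way to obtain the name shown next to each pid is to read
/proc/<pid>/comm. This is a sketch under that assumption, not necessarily
how ip does it, and ex_pid_comm() is a hypothetical helper. The kernel
limits this name to 15 characters, which would match the truncated
"gdm-session-wor" above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: copy the command name of pid into buf.
 * Returns 0 on success, -1 if the process is gone or unreadable. */
int ex_pid_comm(pid_t pid, char *buf, size_t len)
{
	char path[64];
	FILE *fp;

	snprintf(path, sizeof(path), "/proc/%ld/comm", (long)pid);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (!fgets(buf, (int)len, fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);
	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}
```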
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Currently, specifying a device to ip netconf dumps only values
for IPv4. Change this to dump data for all families unless a specific
family is given.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Currently, 'ip netconf' only shows ipv4 and ipv6 netconf settings. If IPv6
is not enabled, the dump ends with
RTNETLINK answers: Operation not supported
when IPv6 request is attempted. Further, if the mpls_router module is also
loaded a separate request is needed to get MPLS settings.
To make this better going forward, use the new PF_UNSPEC dump all option
if the kernel supports it. If the kernel does not, it sets NLMSG_ERROR and
returns EOPNOTSUPP which is trapped and we fall back to the existing output
to maintain compatibility with existing kernels.
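The fallback pattern, reduced to a sketch. The callback type, the stub
standing in for an older kernel and the function names are all
hypothetical; only the control flow mirrors the description above:

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical dump callback: returns 0 or a negative errno value. */
typedef int (*ex_dump_fn)(int family);

/* Try one request for all families first; if the kernel rejects it
 * with EOPNOTSUPP, fall back to the per-family requests. */
int ex_dump_netconf(ex_dump_fn dump)
{
	int rc = dump(AF_UNSPEC);

	if (rc != -EOPNOTSUPP)
		return rc;
	rc = dump(AF_INET);
	if (rc)
		return rc;
	return dump(AF_INET6);
}

/* Stub that behaves like a kernel without the dump-all option. */
int ex_old_kernel_calls;

int ex_old_kernel_dump(int family)
{
	ex_old_kernel_calls++;
	return family == AF_UNSPEC ? -EOPNOTSUPP : 0;
}
```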
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add support for setting and displaying the ttl attribute
for MPLS IP lightweight tunnels.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Add support for setting and displaying the ttl-propagation attribute
initially used by MPLS to control propagation of MPLS TTL to IPv4/IPv6
TTL/hop-limit on popping final label on a per-route basis.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
These are basically stubs: The types which lacked their own help text
simply don't accept any options (yet). Still it might be a bit confusing
to users if they are presented with the generic 'ip link' help text
instead of something saying there are no type specific options.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Take help function in iplink_bridge.c as an example and make other link
types' help functions similar:
* Use a single fprintf() call (if possible).
* Don't state a full command line, just "... type OPTIONS".
* Put every option in its own line, align options by column.
* List mandatory options first.
link_veth.c is intentionally left untouched because its 'peer' option
eats all kinds of generic link options and the help text points this out
without duplicating all the options there again.
Signed-off-by: Phil Sutter <phil@nwl.cc>
When neither group nor remote is specified (or if they are specified with
the any address), nothing is sent to the kernel. In this case, the
kernel defaults to IPv4. This makes it impossible to use IPv6 with
unspecified unicast remote ("bridge fdb add" will return
EAFNOTSUPPORT).
If the user specifies a preferred address family (eg, "ip -6 link add"),
then send either IFLA_VXLAN_GROUP or IFLA_VXLAN_GROUP6 to enforce the
use of the appropriate family.
Signed-off-by: Vincent Bernat <vincent@bernat.im>
MPLS multipath routes are missing a space between 'nexthop' and 'via':
$ ip -net ns1 -f mpls ro ls
100
nexthopvia inet 172.16.2.2 dev virt12
nexthopvia inet 172.16.3.2 dev br0
Add it.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add support for new afstats subcommand. This uses the new
IFLA_STATS_AF_SPEC attribute of RTM_GETSTATS messages to show
per-device, AF-specific stats. At the moment the kernel only supports
MPLS AF stats, so that is all that's implemented here.
The print_num function is exposed from ipaddress.c to be used for
printing the new stats so that the human-readable option, if set, can
be respected.
Example of use:
$ ./ip/ip -f mpls link afstats dev eth1
3: eth1
mpls:
RX: bytes packets errors dropped noroute
9016 98 0 0 0
TX: bytes packets errors dropped
7232 113 0 0
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Use the new helper functions rta_getattr_u* instead of direct
cast of RTA_DATA(). Where RTA_DATA() is a structure, then remove
the unnecessary cast since RTA_DATA() is void *
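For illustration, a helper in the spirit of rta_getattr_u32() next to a
function that builds a matching attribute. Both names are hypothetical
stand-ins; memcpy is used here instead of a pointer cast so the sketch
makes no alignment assumptions:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/* Read a u32 payload without an open-coded cast at every call site. */
uint32_t ex_rta_get_u32(const struct rtattr *rta)
{
	uint32_t v;

	memcpy(&v, RTA_DATA(rta), sizeof(v));
	return v;
}

/* Build one attribute with a u32 payload in caller-provided storage
 * (at least RTA_LENGTH(4) bytes, suitably aligned). */
struct rtattr *ex_rta_put_u32(void *buf, unsigned short type, uint32_t value)
{
	struct rtattr *rta = buf;

	rta->rta_type = type;
	rta->rta_len = RTA_LENGTH(sizeof(value));
	memcpy(RTA_DATA(rta), &value, sizeof(value));
	return rta;
}
```

A call site then reads ex_rta_get_u32(tb[X]) rather than
*(__u32 *)RTA_DATA(tb[X]).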
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add support for MPLS netconf to ip monitor and ip netconf commands.
Changes to header files not included as those are typically pulled
in by a header sync with the kernel.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
This patch adds support to the bridge_slave link type for displaying
xstats by reusing the previously added bridge xstats callbacks.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch adds support for a new xstats link subcommand which uses the
specified link type's new parse/print_ifla_xstats callbacks to display
extended statistics.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since cgroups are not namespace aware, the directory hierarchy used by
ip vrf should account for network namespaces. In this case, change the
path from CGRP/BASE/vrf/NAME to CGRP/BASE/NETNS/vrf/NAME where CGRP is
the cgroup2 mount path, BASE is any base hierarchy inherited before VRF
is applied and NAME is the VRF name.
The intent is as follows: a user logs into the box into some namespace
with a name known to iproute2. Some other policy may have put the
process into a BASE hierarchy. From there the user executes a task in
a VRF and in doing so the task hierarchy becomes CGRP/BASE/NETNS/vrf/NAME.
The namespace level is omitted for the default namespace.
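The path rule can be written down as a small sketch. The function name
and the example mount point used in the note below are assumptions; base
is either empty or starts with "/", and an empty netns stands for the
default namespace:

```c
#include <stdio.h>

/* Hypothetical helper: build CGRP/BASE/NETNS/vrf/NAME, leaving out the
 * NETNS level for the default namespace. Returns -1 on truncation. */
int ex_vrf_cgroup_path(char *buf, size_t len, const char *mnt,
		       const char *base, const char *netns, const char *vrf)
{
	int n;

	if (netns && *netns)
		n = snprintf(buf, len, "%s%s/%s/vrf/%s",
			     mnt, base, netns, vrf);
	else
		n = snprintf(buf, len, "%s%s/vrf/%s", mnt, base, vrf);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With mnt "/sys/fs/cgroup", base "/foo", netns "ns1" and vrf "red" this
yields /sys/fs/cgroup/foo/ns1/vrf/red.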
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Move guts of netns_identify into a standalone function that returns
the netns name in a given buffer.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add support for VRF in a pre-existing hierarchy. For example, if the
current process is running in CGRP/foo/bar, the 'ip vrf exec NAME CMD'
should run CMD in the cgroup CGRP/foo/bar/vrf/NAME.
When listing process ids in a VRF, search for the directory vrf/NAME
regardless of base path: tasks under foo/bar/vrf/NAME and under vrf/NAME
are still running against the same vrf NAME.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
This patch implements support for the IFLA_BRPORT_FLUSH attribute
in iproute2 so it can flush bridge slave's fdb dynamic entries.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_MCAST_MLD_VERSION
attribute in iproute2 so it can change the mcast mld version.
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This patch implements support for the IFLA_BR_MCAST_IGMP_VERSION
attribute in iproute2 so it can change the mcast igmp version.
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This patch implements support for the IFLA_BR_MCAST_STATS_ENABLED
attribute in iproute2 so it can enable/disable mcast stats accounting.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_VLAN_STATS_ENABLED
attribute in iproute2 so it can enable/disable vlan stats accounting.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch implements support for the IFLA_BR_FDB_FLUSH attribute
in iproute2 so it can flush bridge fdb dynamic entries.
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This patch adds a new field that is printed in the end of the line which
denotes the real entry state. Before this patch an entry's IIF could
disappear and it would look like an unresolved one (iif = unresolved):
(3.0.16.1, 225.11.16.1) Iif: unresolved
with no way to really distinguish it from an unresolved entry.
After the patch if the dumped entry has RTNH_F_UNRESOLVED set we get:
(3.0.16.1, 225.11.16.1) Iif: unresolved State: unresolved
for unresolved entries and:
(0.0.0.0, 225.11.11.11) Iif: eth4 Oifs: eth3 State: resolved
for resolved entries after the OIF list. Note that "State:" has ':' in
it so it cannot be mistaken for an interface name.
And for the example above, we'd get:
(0.0.0.0, 225.11.11.11) Iif: unresolved State: resolved
Also when dumping all routes via ip route show table all,
it will show up as:
multicast 225.11.16.1/32 from 3.0.16.1/32 table default proto 17 unresolved
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
To specify multiple nexthops in a route the user is expected to use the
"nexthop" keyword which ip route uses to create the RTA_MULTIPATH.
However, ip route always accepts multiple 'via' keywords where only the
last one is used in the route leading to confusion. For example, ip
accepts this syntax:
$ ip ro add vrf red 1.1.1.0/24 via 10.100.1.18 via 10.100.2.18
but the route inserted by the kernel is just the last gateway:
1.1.1.0/24 via 10.100.2.18 dev eth2
which is not the full request from the user. Detect the presence of
multiple 'via' and give the user a hint to add nexthop:
$ ip ro add vrf red 1.1.1.0/24 via 10.100.1.18 via 10.100.2.18
Error: argument "via" is wrong: use nexthop syntax to specify multiple via
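The detection amounts to remembering that a 'via' was already seen. A
sketch with a hypothetical, heavily simplified argument scanner (the real
parser handles many more keywords):

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical scanner: store the gateway given with "via" and refuse
 * a second "via". Returns 0 on success, -1 on error. */
int ex_parse_via(int argc, const char *const argv[], const char **gw)
{
	int seen = 0;
	int i;

	*gw = NULL;
	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], "via") != 0)
			continue;
		if (seen) {
			fprintf(stderr,
				"Error: use nexthop syntax to specify multiple via\n");
			return -1;
		}
		if (i + 1 >= argc)
			return -1;	/* "via" without an address */
		seen = 1;
		*gw = argv[++i];
	}
	return 0;
}
```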
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fix "Policy buffer overflow" when trying to use deleteall with many
policies installed.
Signed-off-by: Alexander Heinlein <alexander.heinlein@secunet.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Entries with long vhost names in /proc/net/igmp have no whitespace
between name and colon, so sscanf() adds it to vhost and
'ip maddr show iface' doesn't include inet result.
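A sketch of the parsing problem and one possible fix. The sample line
layout in the note below is an assumption about /proc/net/igmp, and
ex_parse_igmp_dev() is a hypothetical name: "%s" stops at whitespace, so
a name printed directly against the colon comes back as "name:" and has
to be trimmed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parser for the "index name :" start of a device line.
 * Works whether or not whitespace separates the name and the colon. */
int ex_parse_igmp_dev(const char *line, int *index, char *name, size_t len)
{
	char tmp[64];
	size_t n;

	if (sscanf(line, "%d%63s", index, tmp) != 2)
		return -1;
	n = strlen(tmp);
	if (n > 0 && tmp[n - 1] == ':')
		tmp[--n] = '\0';	/* colon was glued to a long name */
	if (n == 0 || n >= len)
		return -1;
	memcpy(name, tmp, n + 1);
	return 0;
}
```

Both "2\teth0      :     1      V3" and "3\tverylongname1:     1      V3"
then yield the bare interface name.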
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Show ipv6 tunnel keys on presence of GRE_KEY flag for tunnel types
other than GRE. Aligns ipv6 behaviour with ipv4.
Signed-off-by: dforster@brocade.com
Next up a non-root user gets various bpf related error messages:
$ ip vrf exec mgmt bash
Failed to load BPF prog: 'Operation not permitted'
Kernel compiled with CGROUP_BPF enabled?
Catch the EPERM error and do not show the kernel config option.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
A vrf is local to a namespace. Drop any VRF association before trying
to exec a command in the new namespace.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Path in vrf_switch for "default" VRF is supposed to be MNT/vrf not
MNT/default. Also, default_vrf flag is redundant with ifindex. Remove
the flag in favor of ifindex != 0.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Split ipvrf_identify into arg processing and a function that does the
actual cgroup file parsing. The latter function is used in a follow
on patch.
In the process, convert the reading of the cgroups file to use fopen
and fgets just in case the file ever grows beyond 4k. Move printing
of any error message and the vrf name to the caller of the new
vrf_identify.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Move the hint about CGROUP_BPF enabled to prog_load failure since
it fails before the attach. Update the existing error message to
print to stderr.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
'ip vrf' follows the user semantics established by 'ip netns'.
The 'ip vrf' subcommand supports 3 usages:
1. Run a command against a given vrf:
ip vrf exec NAME CMD
Uses the recently committed cgroup/sock BPF option. vrf directory
is added to cgroup2 mount. Individual vrfs are created under it. BPF
filter attached to vrf/NAME cgroup2 to set sk_bound_dev_if to the VRF
device index. From there the current process (ip's pid) is added to
the cgroup.procs file and the given command is executed. In doing so
all AF_INET/AF_INET6 (ipv4/ipv6) sockets are automatically bound to
the VRF domain.
The association is inherited parent to child allowing the command to
be a shell from which other commands are run relative to the VRF.
2. Show the VRF a process is bound to:
ip vrf id
This command essentially looks at /proc/pid/cgroup for a "::/vrf/"
entry with the VRF name following.
3. Show process ids bound to a VRF
ip vrf pids NAME
This command dumps the file MNT/vrf/NAME/cgroup.procs since that file
shows the process ids in the particular vrf cgroup.
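The lookup done by 'ip vrf id' can be sketched as a scan of one line of
/proc/pid/cgroup for the "::/vrf/" marker mentioned above. The function
name is hypothetical and the sample lines in the note are assumptions
about the file's layout:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the VRF name that follows "::/vrf/" in a
 * cgroup line. Returns 0 if found, -1 if the line has no VRF entry. */
int ex_vrf_from_cgroup_line(const char *line, char *name, size_t len)
{
	const char *p = strstr(line, "::/vrf/");
	size_t n;

	if (!p)
		return -1;
	p += strlen("::/vrf/");
	n = strcspn(p, "/\n");		/* name ends at '/' or end of line */
	if (n == 0 || n >= len)
		return -1;
	memcpy(name, p, n);
	name[n] = '\0';
	return 0;
}
```

A line such as "0::/vrf/mgmt" gives "mgmt", while "0::/user.slice" is
reported as not bound to a VRF.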
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
iplink_vrf has 2 functions used to validate a user given device name is
a VRF device and to return the table id. If the user string is not a
device name ip commands with a vrf keyword show a confusing error
message: "RTNETLINK answers: No such device".
Add a variant of rtnl_talk that does not display the "RTNETLINK answers"
message and update iplink_vrf to use it.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Adds support to configure BPF programs as nexthop actions via the LWT
framework.
Example:
ip route add 192.168.253.2/32 \
encap bpf out obj lwt_len_hist_kern.o section len_hist \
dev veth0
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Now that we made the BPF loader generic as a library, reuse it
for loading XDP programs as well. This basically adds a minimal
start of a facility for iproute2 to load XDP programs. There
currently only exists the xdp1_user.c sample code in the kernel
tree that sets up netlink directly and an iovisor/bcc front-end.
Since we have all the necessary infrastructure in place already
from tc side, we can just reuse its loader back-end and thus
facilitate migration and usability among the two for people
familiar with tc/bpf already. Sharing maps, performing tail calls,
etc works the same way as with tc. Naturally, once kernel
configuration API evolves, we will extend new features for XDP
here as well, resp. extend dumping of related netlink attributes.
Minimal example:
clang -target bpf -O2 -Wall -c prog.c -o prog.o
ip [-force] link set dev em1 xdp obj prog.o # attaching
ip [-d] link # dumping
ip link set dev em1 xdp off # detaching
For the dump, intention is that in the first line for each ip
link entry, we'll see "xdp" to indicate that this device has an
XDP program attached. Once we dump some more useful information
via netlink (digest, etc), idea is that 'ip -d link' will then
display additional relevant program information below the "link/
ether [...]" output line for such devices, for example.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
In case of an older kernel that doesn't set L2TP_ATTR_UDP_ZERO_CSUM6_{RX,TX}
the old hard-coded value is being preserved, since the attribute flag will be
missing.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
L2TP_ATTR_UDP_CSUM is read by the kernel as a NLA_FLAG value,
but is validated as a NLA_U8, so we will write it as an u8,
but the value isn't actually being read by the kernel.
It is written by the kernel as a NLA_U8, so we will read as
such.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
L2TP_ATTR_RECV_SEQ and L2TP_ATTR_SEND_SEQ are declared as NLA_U8
attributes in the kernel, so let's treat them accordingly.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
udp6_csum_{tx,rx}, tunnel and session are the only ones
currently used.
recv_seq, send_seq, lns_mode and data_seq are partially
implemented in a useless way.
Signed-off-by: Asbjørn Sloth Tønnesen <asbjorn@asbjorn.st>
Adjusting iproute2 utility to support new macvlan link type mode called
"source".
Example of commands that can be applied:
ip link add link eth0 name macvlan0 type macvlan mode source
ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr del 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr flush
ip -details link show dev macvlan0
Based on previous work of Stefan Gula <steweg@gmail.com>
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Cc: steweg@gmail.com
v5:
- rebase and fix checkpatch
v4:
- add MACADDR_SET support
- skip FLAG_UNICAST / FLAG_UNICAST_ALL as this is not upstream
- fix man page
- Support adding, deleting and showing IP rules with UID ranges.
- Support querying per-UID routes via "ip route get uid <UID>".
UID range routing was added to net-next in 4fb7450683 ("Merge
branch 'uid-routing'")
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Commit 7b8179c780 ("iproute2: Add new command to ip link to
enable/disable VF spoof check") tried to add support for
IFLA_VF_SPOOFCHK in a backwards-compatible manner, but apparently overdid
it: parse_rtattr_nested() handles missing attributes perfectly fine in
that it will leave the relevant field unassigned so calling code can
just compare against NULL. There is no need to layback from the previous
(IFLA_VF_TX_RATE) attribute to the next to check if IFLA_VF_SPOOFCHK is
present or not. To the contrary, it establishes a potentially incorrect
assumption of these two attributes directly following each other which
may not be the case (although up to now, kernel aligns them this way).
This patch cleans up the code to adhere to the common way of checking
for attribute existence. It has been tested to return correct results
regardless of whether the kernel exports IFLA_VF_SPOOFCHK or not.
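The "common way" boils down to indexing attributes by type and comparing
the slot against NULL. Below is a simplified sketch modeled on the idea
behind parse_rtattr(); the ex_ functions are hypothetical, the u32
payload is only a placeholder, and the attribute stream passed in
corresponds to the payload of the nested attribute:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Index attributes by type. Types that never appear stay NULL. */
void ex_parse_rtattr(struct rtattr *tb[], int max,
		     struct rtattr *rta, int len)
{
	memset(tb, 0, sizeof(tb[0]) * (size_t)(max + 1));
	while (RTA_OK(rta, len)) {
		if (rta->rta_type <= max && !tb[rta->rta_type])
			tb[rta->rta_type] = rta;
		rta = RTA_NEXT(rta, len);
	}
}

/* Demo helper: append one attribute with a placeholder u32 payload. */
void ex_append_u32(void *buf, int *len, unsigned short type, uint32_t value)
{
	struct rtattr *rta = (struct rtattr *)((char *)buf + *len);

	rta->rta_type = type;
	rta->rta_len = RTA_LENGTH(sizeof(value));
	memcpy(RTA_DATA(rta), &value, sizeof(value));
	*len += RTA_ALIGN(rta->rta_len);
}

/* 1 if the kernel exported IFLA_VF_SPOOFCHK, 0 otherwise. No
 * assumption is made about which attribute precedes it. */
int ex_has_spoofchk(struct rtattr *vf_attrs, int len)
{
	struct rtattr *tb[IFLA_VF_MAX + 1];

	ex_parse_rtattr(tb, IFLA_VF_MAX, vf_attrs, len);
	return tb[IFLA_VF_SPOOFCHK] != NULL;
}
```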
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Greg Rose <grose@lightfleet.com>