Commit Graph

4074 Commits

Author SHA1 Message Date
Phil Sutter 6f7df6b2a1 tc: Optimize gact action lookup
When adding a filter with a gact action such as 'drop', tc first tries
to open a shared object with equivalent name (m_drop.so in this case)
before trying gact. Avoid this by matching the action name against those
handled by gact prior to calling get_action_kind().

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
2018-01-17 10:27:47 -08:00
David Ahern e209d4c83d Merge branch 'tc-batch' into net-next
Chris Mi says:

====================
Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this patchset, at most 128
commands can be accumulated before sending to kernel.

We introduced a new function in patch 1 to support for sending
multiple messages. In patch 2, we add this support for filter
add/delete/change/replace and actions add/change/replace commands.

But please note that kernel still processes the requests one by one.
To process the requests in parallel in kernel is another effort.
The time we're saving in this patchset is the user mode and kernel mode
context switch. So this patchset works on top of the current kernel.

Using the following script in kernel, we can generate 1,000,000 rules.
	tools/testing/selftests/tc-testing/tdc_batch.py

Without this patchset, 'tc -b $file' exection time is:

real    0m15.555s
user    0m7.211s
sys     0m8.284s

With this patchset, 'tc -b $file' exection time is:

real    0m12.360s
user    0m6.082s
sys     0m6.213s

The insertion rate is improved more than 10%.
====================

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-15 08:27:20 -08:00
Chris Mi 485d0c6001 tc: Add batchsize feature for filter and actions
Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this support, at most 128
commands can be accumulated before sending to kernel.

Now it only works for the following successive commands:
1. filter add/delete/change/replace
2. actions add/change/replace

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-14 09:03:35 -08:00
Chris Mi 72a2ff3916 lib/libnetlink: Add a new function rtnl_talk_iov
rtnl_talk can only send a single message to kernel. Add a new function
rtnl_talk_iov that can send multiple messages to kernel.
rtnl_talk_iov takes struct iovec * and iovlen as arguments.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-14 09:03:33 -08:00
Christian Ehrhardt 9bed02a5d5 tests: make sure rand_dev suffix has 6 chars
The change to limit the read size from /dev/urandom is a tradeoff.
Reading too much can trigger an issue, but so it could if the
suggested 250 random chars would not contain enough [:alpha:] chars.
If they contain:
 a) >=6 all is ok
 b) [1-5] the devname would be shorter than expected (non fatal).
 c) 0 things would break

In loops of hundreds of thousands it always was (a) for my, but since
if occuring in a test it might be hard to track what happened avoid
this issue by retrying on the length condition.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:29:51 -08:00
Christian Ehrhardt 4afbeaeeaf tests: read limited amount from /dev/urandom
In some test environments like e.g. Ubuntu & Debian autopkgtest it
can happen that while generating random device names the pipes
between tr and head are considered dead while processing.
That prints (non fatal) issues like:
  Running ip/link/new_link.t [iproute2-this/4.13.0-17-generic]: tr:
write error: Broken pipe
  tr: write error
  PASS

This only happens if reading an infinite amount of chars with the
read from urandom, so reading a defined amount fixes the issue.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:29:51 -08:00
Mike Frysinger a8b970d7d2 ifcfg/rtpr: convert to POSIX shell
These files are already mostly written in POSIX shell, so convert their
shebangs to /bin/sh and tweak the few bashisms in here.

URL: https://crbug.com/756559
Reported-by: Pat Erley <perley@chromium.org>
Signed-off-by: Mike Frysinger <vapier@chromium.org>
2018-01-10 08:26:09 -08:00
Mike Frysinger 54f5991acd mark shell scripts +x
This makes it easier to execute locally for testing.

Signed-off-by: Mike Frysinger <vapier@chromium.org>
2018-01-10 08:23:49 -08:00
Stephen Hemminger 7d63671030 tc: remove no longer relevant README
This document described how kernel and tc used to handle
timing. In last two years, kernel has switched over to using
ktime. Nothing to see here, move along.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:21:22 -08:00
Serhey Popovych cc899123cc ip6tnl/tunnel: Output hoplimit before encapsulation limit
To follow gre6 output print hoplimit before encapsulation
limit in link_ip6tnl.c.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:12 -08:00
Serhey Popovych 763cf4956d gre6/tunnel: Output flowlabel after tclass
To follow ip6tnl output print flowlabel after tclass
in link_gre6.c.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:12 -08:00
Serhey Popovych e3945d92b0 ip6/tunnel: Unify encap_limit printing
Use %u format specifier to print it in link_gre6.c and
make code more readable.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:12 -08:00
Serhey Popovych a0fd0c3a30 ip6/tunnel: Unify flowlabel printing
Use @s2 buffer to store string representation of
flowlabel and get rid of extra SPRINT_BUF(): no
need to preserve @s2 contents for later.

Use print_string(PRINT_ANY, ...) with prepared by
snprintf() string for both PRINT_JSON and PRINT_FP
cases.

Omit flowlabel from output if no flowinfo attribute
is given and IP6_TNL_F_USE_ORIG_FLOWLABEL isn't set.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:12 -08:00
Serhey Popovych 090524f899 ip6/tunnel: Unify tclass printing
Use @s2 buffer to store string representation of
tclass and get rid of extra SPRINT_BUF(): no
need to preserve @s2 contents for later.

Use print_string(PRINT_ANY, ...) with prepared by
snprintf() string for both PRINT_JSON and PRINT_FP
cases.

While there use __u32 for flowinfo in link_gre6.c
and check for IFLA_GRE_FLOWINFO attribute presense.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:12 -08:00
Serhey Popovych 4dc6665b6b ip6tnl/tunnel: Do not print obscure flowinfo
It is implementation internal and main purpose
of printing it seems debugging.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:12 -08:00
Serhey Popovych b76b24006c ip6/tunnel: Fix tclass output
In link_gre6.c it seems copy paste error: tclass is 8 bits,
not 20 as flowlabel.

In link_iptnl.c rename "flowinfo_tclass" to "tclass" as it
correct name since flowinfo is implementation internal name
used to label combined within u32 attribute tclass and
flowlabel.

Fixes: 1facc1c61c ("ip: link_ip6tnl.c: add json output support")
Fixes: 2e706e12d9 ("Merge branch 'master' into net-next")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-10 08:06:11 -08:00
Jamal Hadi Salim 24a5a48e27 tc: Fix filter protocol output
Fixes: 249284ff5a ("tc: jsonify filter core")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
2018-01-09 08:09:10 -08:00
Stephen Hemminger a366508913 include: update ethernet headers
Incorporate upstream changes to fix compliation with MUSL.
See commit 6926e041a892
 ("uapi/if_ether.h: prevent redefinition of struct ethhdr")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-09 08:08:03 -08:00
Antonio Quartulli ebbb219c92 ss: fix NULL pointer access when parsing unix sockets with oldformat
When parsing and printing the unix sockets in unix_show(),
if the oldformat is detected, the peer_name member of the sockstat
object is left uninitialized (NULL).
For this reason, if a filter has been specified on the command line,
a strcmp() will crash when trying to access it.

Avoid crash by checking that peer_name is not NULL before
passing it to strcmp().

Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-09 08:02:46 -08:00
Antonio Quartulli 192be8fccb ss: fix crash when skipping disabled header field
When the first header field is disabled (i.e. when passing the -t
option), field_flush() is invoked with the `buffer` global variable
still zero'd.
However, in field_flush() we try to access buffer.cur->len
during variables initialization, thus leading to a SIGSEGV.

It's interesting to note that this bug appears only when the code
is compiled with -O0, because the compiler is smart
enough to immediately jump to the return statement if optimizations
are enabled and skip the faulty instruction.

Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-09 08:02:46 -08:00
Filip Moc 33f6dd23a5 ip fou: pass family attribute as u8
This fixes fou on big-endian systems.

Signed-off-by: Filip Moc <dev@moc6.cz>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-09 07:58:37 -08:00
David Ahern ac6561417a Restore --no-print-directory option for silent builds
Commit 69fed534a5 ("change how Config is used in Makefile's") removed
Config from Makefile. Config had the checks to set VERBOSE based on user
request and VERBOSE is used to add the --no-print-directory argument.
Since Config is gone, add the relevant setup for VERBOSE to Makefile
to restore quieter builds by default.

Fixes: 69fed534a5 ("change how Config is used in Makefile's")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-09 07:58:00 -08:00
David Ahern 6a21ca8a4a Merge branch 'master' into net-next
Conflicts:
	man/man8/ip-link.8.in

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-01-08 10:10:45 -08:00
Serhey Popovych 9deb754283 link_iptnl: Open "encap" JSON object
It seems missing pair of open_json_object()/close_json_object()
in iptnl implementation.

Note that we open "encap" JSON object in ip6tnl.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2018-01-05 16:35:47 -08:00
Serhey Popovych d9aefbc0b8 link_iptnl: Print tunnel mode
Tunnel mode does not appear in parameters print for iptnl
supported tunnels like ipip and sit, while printed for
ip6tnl.

Print tunnel mode as "proto" field name for JSON and
without any name when printing to cli to follow ip6tnl
behaviour.

For non JSON output we have:

   $ ip -d link show dev sit1

Before:
-------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
    link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
    sit remote any local X.X.X.X ...
        ~~~

After:
------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
    link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
    sit any remote any local X.X.X.X ...
        ^^^

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2018-01-05 16:35:47 -08:00
Serhey Popovych 68a7f5ed47 link_iptnl: Kill code duplication
Both sit and ipip "mode" parameter handling nearly the same.
Except for sit we have "ip6ip" mode: check it only when
configuring sit.

Note that there is no need strcmp(lu->id, "ipip"): if it is
not sit it is "ipip" because we have only these two link util
defined in module.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2018-01-05 16:35:47 -08:00
Matthias Schiffer cfd6ccbfd0 devlink, rdma, tipc: properly define TARGETS without HAVE_MNL
Leaving a variable with a generic name such as TARGETS undefined would lead
to Make picking up its value from the environment. Avoid this by always
defining TARGETS in the Makefiles.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
2018-01-05 16:32:17 -08:00
Jakub Kicinski 7d424c7193 ip: link: add support for netdevsim device type
netdevsim is a new software device for testing kernel APIs
without any hardware attached.  Allow users to create such
devices.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
2018-01-02 20:46:19 -08:00
Luca Boccassi c7d6cbaf85 man: fix small formatting errors
Lintian detected the following formatting errors:

 man/man8/devlink-sb.8.gz 230: warning: macro `b' not defined
 man/man8/ip-link.8.gz 1243: warning: macro `in-8' not defined
  (possibly missing space after `in')
 man/man8/tc-u32.8.gz `R' is a string (producing the registered sign),
  not a macro.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-02 11:29:39 -08:00
Luca Boccassi 36c1d2383a man: routel/routef: don't mention filesystem paths
The filesytem paths to these scripts might be different on various
distros, so don't mention it in the manpages. It is not really useful
information anyway.

Originally submitted as Debian bug:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561424

Reported-by: jidanni@jidanni.org
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-30 09:43:47 -08:00
Luca Boccassi fe2ab15d2c man: ip-address: document 15-char limit for LABEL
Trying to set a label longer than 15 characters returns an error:
 RTNETLINK answers: Numerical result out of range

Document the limit in the manpage.

Originally reported as a Debian bug:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661886

Reported-by: Gabor Kiss <kissg@ssg.ki.iif.hu>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-30 09:43:47 -08:00
Luca Boccassi be78fade55 man: add more keywords to ip.8 short description
A Debian user suggested adding more network-related keywords to the
ip manpage, so that manpage-scraping and indexing software like
apropos can do a better job of categorizing the programs.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=877983

Suggested-by: Lynoure Braakman <lynoure@gmail.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-30 09:43:47 -08:00
Luca Boccassi cd25876440 man: drop references to Debian-specific paths
Documentation should be distribution-agnostic - any specific quirks
should be handled by downstream maintainers, if necessary.
Remove mentions of Debian paths and package names.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-30 09:43:47 -08:00
Serhey Popovych b760a8823a ip/tunnel: Document "external" parameter
Add it to ip-link(8) "type gre" output help message
as well as to ip-link(8) page.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-28 09:40:02 -08:00
Serhey Popovych 8c0b19d178 vxcan,veth: Forbid "type" for peer device
It is already given for original device we configure this
peer for.

Results from following command before/after change applied
are shown below:

  $ ip link add dev veth1a type veth peer name veth1b \
                           type veth peer name veth1c

Before:
-------

<no output, no netdevs created>

After:
------

Error: duplicate "type": "veth" is the second value.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-28 09:35:27 -08:00
Yuval Mintz b97c6fa71d qdisc: print offload indication
Use the newly added TCA_HW_OFFLOAD indication from kernel
to print a consistent 'offloaded' message to user when listing qdiscs.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-27 13:55:16 -08:00
Serhey Popovych afdf9277eb gre6/tunnel: Do not submit garbage in flowinfo
We always send flowinfo to the kernel. If flowlabel/tclass
was set first to non-inherit value and then reset to
inherit we do not clear flowlabel/tclass part in flowinfo,
send it to kernel and can get from the kernel back.

Even if we check for IP6_TNL_F_USE_ORIG_TCLASS and
IP6_TNL_F_USE_ORIG_FLOWLABEL when printing options
sending invalid flowlabel/tclass to the kernel seems
bad idea.

Note that ip6tnl always clean corresponding flowinfo
parts on inherit.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-27 13:45:37 -08:00
Serhey Popovych 147ade01b0 gre,ip6tnl/tunnel: Fix noencap- support
We must clear bit, not set all but given bit.

Fixes: 858dbb208e ("ip link: Add support for remote checksum offload to IP tunnels")
Fixes: 73516e128a ("ip6tnl: Support for fou encapsulation"
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-27 13:41:42 -08:00
Leon Romanovsky 874c734c1c rdma: Move link execution logic to common code
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky 5fc17280b1 rdma: Rename rd_free_devmap to be rd_free
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky 7109f4b212 rdma: Rename free function to be rd_cleanup
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky b1e6bc437f rdma: Get rid of dev_map_free call
The dev_map_free() is called once only and it is short,
so it is better to integrate it into the caller's site.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky b55644412f rdma: Print supplied device name in case of wrong name
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky e3dee3c81f rdma: Check that port index exists before operate on link layer
Link layer operates on port layer, hence it should check
it existence before execution commands.

Fixes: da990ab40a ("rdma: Add link object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky 4e2eb9fdf9 rdma: Fix misspelled SYS_IMAGE_GUID
SYS_IMAGE_GUIG is actually SYS_IMAGE_GUID.

Fixes: da990ab40a ("rdma: Add link object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky 8b92a2c930 rdma: Move per-device handler function to generic code
Most of the proposed objects are working in the scope "dev"
and will implement the same logic. Move the code to utils.c,
so other objects will be able to reuse the code.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky 99da90326e rdma: Protect dev_map_lookup from wrong input
Despite the fact that all callers to dev_map_lookup are ensuring that
there is always device name prior to call to that function, it is better
and safer to check that in the dev_map_lookup itself.

Fixes: 40df8263a0 ("rdma: Add dev object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky 0fc8c30b4e rdma: Reduce scope of _dev_map_lookup call
There is no external users of _dev_map_lookup function,
so let's limit its scope to be local.

Fixes: 40df8263a0 ("rdma: Add dev object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
William Tu 1e4be52bf2 erspan: add erspan usage description
The patch adds erspan usage description, so 'ip link help erspan'
and 'ip link help ip6erspan' shows the options.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Serhey Popovych 08ede25fda ip/tunnel: No need to free answer after rtnl_talk() on error
Since rtnl_talk() never returns with answer buffer allocated
on error we do not need to release it manually. After this
initializing answer with NULL before rtnl_talk() is useless.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-26 09:07:43 -08:00