Commit aba9c23a6e ("ss: enclose IPv6 address in brackets") unified
display of wildcard sockets in IPv4 and IPv6 to print the unspecified
address as '*'. Users then complained that they can't distinguish
between address families anymore, so change this again to what Stephen
Hemminger suggested:
| *:80 << both IPV6 and IPV4
| [::]:80 << IPV6_ONLY
| 0.0.0.0:80 << IPV4_ONLY
Note that on older kernels which don't support INET_DIAG_SKV6ONLY
attribute, pure IPv6 sockets will still show as '*'.
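The display rule above can be sketched as a small helper. This is only an
illustration, not the code in ss: the function name and the 'v6only'
parameter (standing in for what INET_DIAG_SKV6ONLY reports) are
hypothetical.

```c
#include <sys/socket.h>

/* Hypothetical helper: choose the string shown for an unspecified
 * (wildcard) local address. 'v6only' stands in for the value the
 * kernel reports via INET_DIAG_SKV6ONLY; on kernels without that
 * attribute it would stay 0. */
static const char *wildcard_addr_str(int family, int v6only)
{
	if (family == AF_INET)
		return "0.0.0.0";	/* IPv4 only */
	if (family == AF_INET6 && v6only)
		return "[::]";		/* IPv6 only */
	return "*";			/* both families, or v6only unknown */
}
```

With v6only left at 0, an IPv6 wildcard socket falls through to '*', which
matches the note about older kernels.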
Cc: Humberto Alves <hjalves@live.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
These keys are reported by kernel 4.14 and later under the
INET_DIAG_MD5SIG attribute, when INET_DIAG_INFO is requested (ss -i)
and we have CAP_NET_ADMIN. The additional output looks like:
md5keys:fe80::/64=signing_key,10.1.2.0/24=foobar,::1/128=Test
Signed-off-by: Ivan Delalande <colona@arista.com>
The AF_VSOCK address family is a host<->guest communications channel
supported by VMware, KVM, and Hyper-V. Initial VMware support was
released in Linux 3.9 in 2013 and transports for other hypervisors were
added later.
AF_VSOCK addresses are <u32 cid, u32 port> tuples. The 32-bit cid
integer is comparable to an IP address. AF_VSOCK ports work like
TCP/UDP ports.
Both SOCK_STREAM and SOCK_DGRAM socket types are available.
This patch adds AF_VSOCK support to ss(8) so that sockets can be
observed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Linux has more than 32 address families defined in <bits/socket.h>. Use
a 64-bit type so all of them can be represented in the filter->families
bitmask.
It's easy to introduce bugs when using (1 << AF_FAMILY) because the
value is 32-bit. This can produce incorrect results from bitmask
operations so introduce the FAMILY_MASK() macro to eliminate these bugs.
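A minimal sketch of the idea, assuming the macro simply widens the shifted
constant to 64 bits (the exact definition in ss.c may differ, and
family_in_mask() is a hypothetical helper added for illustration):

```c
#include <stdint.h>

/* Shift a 64-bit one so that family numbers >= 32 stay representable. */
#define FAMILY_MASK(family) ((uint64_t)1 << (family))

/* Hypothetical helper: test whether a family bit is set in the mask. */
static int family_in_mask(uint64_t families, int family)
{
	return (families & FAMILY_MASK(family)) != 0;
}
```

With a plain (1 << family), any family number of 32 or above shifts past
the width of a 32-bit int, which is undefined behaviour in C.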
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The original problem was that something like:
| strncpy(ifr.ifr_name, *argv, IFNAMSIZ);
might leave ifr.ifr_name unterminated if length of *argv exceeds
IFNAMSIZ. In order to fix this, I thought about replacing all those
cases with (equivalent) calls to snprintf() or even introducing
strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting
the latter from being added to glibc, truncating a string without
notifying the user is not to be considered good practice. So let's
exercise what he suggested and reject empty, overlong or otherwise
invalid interface names right from the start - this way calls to
strncpy() like shown above become safe and the user has a chance to
reconsider what he was trying to do.
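A rough sketch of such a check follows. It is not the check_ifname()
implementation itself: the helper name, the EXAMPLE_IFNAMSIZ constant
(standing in for IFNAMSIZ) and the choice to reject '/' and whitespace as
"otherwise invalid" are assumptions made for illustration.

```c
#include <ctype.h>
#include <string.h>

#define EXAMPLE_IFNAMSIZ 16	/* stand-in for IFNAMSIZ */

/* Hypothetical validity check: returns 1 if 'name' is acceptable. */
static int example_ifname_ok(const char *name)
{
	if (name == NULL || *name == '\0')
		return 0;			/* empty */
	if (strlen(name) >= EXAMPLE_IFNAMSIZ)
		return 0;			/* no room for the terminator */
	for (; *name != '\0'; name++)
		if (*name == '/' || isspace((unsigned char)*name))
			return 0;		/* assumed invalid characters */
	return 1;
}
```

Once a name has passed such a check, strncpy() into a buffer of
EXAMPLE_IFNAMSIZ bytes always copies the terminating null byte as well.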
Note that this doesn't add calls to check_ifname() to all places where
user supplied interface name is parsed. In many cases, the interface
must exist already and is therefore looked up using ll_name_to_index(),
so if_nametoindex() will perform the necessary checks already.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Can't use strlcpy() here since lnstat is not linked against libutil.
While at it, fix coding style in that chunk as well.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Commit 9f66764e30 ("libnetlink: Add test for error code returned from
netlink reply") changed rtnl_dump_filter_l() to return an error in case
NLMSG_DONE would contain one, even if it was ENOENT.
This in turn breaks ss when it tries to dump DCCP sockets on a system
without support for it: The function tcp_show(), which is shared between
TCP and DCCP, will start parsing /proc since inet_show_netlink() returns
an error - yet it parses /proc/net/tcp which doesn't make sense for DCCP
sockets at all.
On my system, a call to 'ss' without further arguments prints the list
of connected TCP sockets twice.
Fix this by introducing a dedicated function dccp_show() which does not
have a fallback to /proc, just like sctp_show(). And since tcp_show()
is no longer "multi-purpose", drop its socktype parameter.
Fixes: 9f66764e30 ("libnetlink: Add test for error code returned from netlink reply")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Both 'timer' and 'timeout' variables of struct tcpstat are either
scanned as unsigned values from /proc/net/tcp{,6} or copied from
'idiag_timer' and 'idiag_expires' fields of struct inet_diag_msg, which
themselves are unsigned. Therefore they may be unsigned as well, which
eliminates the need to check for negative values.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Relying upon callers and using unsafe strcpy() is probably not the best
idea. Aside from that, using snprintf() allows formatting the string for
lf->path in one go.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Prevent passing NULL FILE pointer to fgets() later.
Fix both tools in a single patch since the code changes are basically
identical.
Signed-off-by: Phil Sutter <phil@nwl.cc>
There's some misleading information in --help and ss(8) manpage about
TCP-STATE named 'listen'.
ss doesn't know such a state, but it knows 'listening' state.
$ ss -tua state listen
ss: wrong state name: listen
$ ss -tua state listening
[...]
Addresses: https://bugs.debian.org/872990
Reported-by: Pavel Lyulchenko <p.lyulchenko@gmail.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
This renames Config to config.mk and includes more Make input.
Now configure generates all the required CFLAGS and LDLIBS for
the optional libraries.
Also, use pkg-config to test for libelf, rather than using a test
program. This makes it consistent with other libraries.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This has the additional benefit of initializing st.ino to zero which is
used later in is_sctp_assoc() function.
Signed-off-by: Phil Sutter <phil@nwl.cc>
The passed 'addr' parameter is dereferenced by caller before and in
parse_hostcond() multiple times before this check, so assume it is
always true.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Looks like this was forgotten when converting to common json output
formatter.
Fixes: fcc16c2287 ("provide common json output formatter")
Signed-off-by: Phil Sutter <phil@nwl.cc>
The recent LIBMNL changes were made more difficult to debug because
of how Config is handled in a clean make. The Config file is generated
by top level make, but since it is not recursive, the values generated
would not be visible on a clean make.
The change is to not include Config in top level make, and move
all the conditionals down into sub makefiles. Not ideal, but better
than going full autoconf route. Or forcing separate configure
step.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The code was always building without libmnl support, so it was
doing nothing.
Fixes: b6432e68ac ("iproute: Add support for extended ack to rtnl_talk")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Based on patch by Lehner Florian <dev@der-flo.net>
Adds support for RFC2732 IPv6 address format with brackets.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use the new helper functions rta_getattr_u* instead of direct
cast of RTA_DATA(). Where RTA_DATA() is a structure, then remove
the unnecessary cast since RTA_DATA() is void *
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
tcpi_rcv_mss and tcpi_advmss tcp info fields were not yet reported
by ss.
While adding GRO support to packetdrill, I found this was useful.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Add support for extended statistics of SW only type, for counting only the
packets that went via the cpu. (useful for systems with forward
offloading). It reads it from filter type IFLA_STATS_LINK_OFFLOAD_XSTATS
and sub type IFLA_OFFLOAD_XSTATS_CPU_HIT.
It is under the name 'cpu_hits'
(or any abbreviation of it, such as 'cpu' or simply 'c')
For example:
ifstat -x c
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Extended stats are part of the RTM_GETSTATS method. This patch adds them
to ifstat.
While extended stats can come in many forms, we support only the
rtnl_link_stats64 struct for them (which is the 64 bits version of struct
rtnl_link_stats).
We support stats in the main nesting level, or one lower.
The extension can be called by its name or any abbreviation of it. If
more than one matches, the first one will be picked.
To get the extended stats the flag -x <stats type> is used.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reorder the includes in misc/ifstat.c to match convention.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Initialise for loops outside of for loops. GCC flags this as being
out of spec unless C99 or C11 mode is used.
With this change the entire tree appears to compile cleanly with -Wall.
$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
...
$ make
...
ss.c: In function ‘unix_show_sock’:
ss.c:3128:4: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
...
Signed-off-by: Simon Horman <simon.horman@netronome.com>
A struct with only a single field does not make much sense. Besides
that, it was used by print_summary() only.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This function is used only at a single place anymore, so replace the
call to it by its content, which makes that specific part of
unix_show() consistent with e.g. tcp_show().
Signed-off-by: Phil Sutter <phil@nwl.cc>
Although this complicates the dedicated procfs-based code path in
unix_show() a bit, it's the only sane way to get rid of unix_show_sock()
output diverging from other socket types in that it prints all socket
details in a new line.
As a side effect, it allows eliminating all procfs specific code in
the same function.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This consolidates identical code in three places. While the function
name is not quite perfect as there is different proc_ctx printing code
in netlink_show_one() as well, I sadly didn't find a more suitable one.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Unix sockets used that field already to hold info about the socket type.
By replicating this approach in all other socket types, we can get rid
of protocol parameter in inet_stats_print() and have sock_state_print()
figure things out by itself.
Signed-off-by: Phil Sutter <phil@nwl.cc>
When dumping UNIX sockets and show_details is active but not show_mem
(ss -xne), the socket details are printed without being prefixed by tab.
Fix this by printing the tab character when either one of '-e' or '-m'
has been specified.
Signed-off-by: Phil Sutter <phil@nwl.cc>
When dumping UDP sockets and show_tcpinfo (-i) is active but not
show_mem (-m), print_tcpinfo() does not output anything leading to an
empty line being printed after every socket. Fix this by skipping the
call to print_tcpinfo() and the previous newline printing in that case.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Dump some new fields added to tcp_info in v4.10: tcpi_busy_time,
tcpi_rwnd_limited, tcpi_sndbuf_limited.
Example output for a flow busy for 110ms but never measurably limited by
receive window or send buffer:
busy:110ms
Example output for a flow usually limited by receive window:
busy:111ms rwnd_limited:101ms(91.0%)
Example output for a flow sometimes limited by send buffer:
busy:50ms sndbuf_limited:10ms(20.0%)
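The percentages in these examples are the limited time relative to the
busy time. A small sketch of that arithmetic, using millisecond inputs and
a hypothetical helper name (the actual ss code may work in different
units):

```c
#include <stdio.h>

/* Hypothetical formatter: writes e.g. "rwnd_limited:101ms(91.0%)".
 * Returns the snprintf() result, or -1 if busy_ms is zero. */
static int format_limited(char *buf, size_t len, const char *name,
			  unsigned long limited_ms, unsigned long busy_ms)
{
	if (busy_ms == 0)
		return -1;	/* no busy time, percentage undefined */
	return snprintf(buf, len, "%s:%lums(%.1f%%)", name, limited_ms,
			100.0 * limited_ms / busy_ms);
}
```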
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Dump the new delivery_rate and delivery_rate_app_limited fields that
were added to tcp_info in Linux v4.9.
Example output:
pacing_rate 65.7Mbps delivery_rate 62.9Mbps
And for the application-limited case this looks like:
pacing_rate 1031.1Mbps delivery_rate 87.4Mbps app_limited
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
unix, tcp, udp[lite], packet, netlink sockets already support diag
interface for their collection and killing. Implement support
for raw sockets.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
This makes use of the sctp_diag interface recently added to the kernel.
Joint work with Xin Long who provided the PoC implementation which I
merely polished up a bit.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Abstract unix domain socket names may embed null characters;
these should be translated to '@' when printed by ss, the
same way the null prefix is currently being translated.
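A minimal sketch of the translation, assuming the name length is known
from the address length and the buffer has room for a terminator after it
(the function name is hypothetical):

```c
#include <stddef.h>

/* Replace every null byte within the first 'len' bytes by '@'.
 * 'name' must hold 'len' name bytes followed by a terminator. */
static void abstract_name_to_printable(char *name, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (name[i] == '\0')
			name[i] = '@';
}
```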
Signed-off-by: Isaac Boukris <iboukris@gmail.com>
tcp->snd_cwnd is a u32, but ss treats it like a signed int. This may
result in negative bandwidth calculations.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
This allows the user to dump sockets with a given mark (via
"fwmark = 0x1234/0x1234" or "fwmark = 12345", etc.), and to
display the socket marks of dumped sockets.
The relevant kernel commits are: d545caca827b ("net: inet: diag:
expose the socket mark to privileged processes.") and
a52e95abf772 ("net: diag: allow socket bytecode filters to
match socket marks")
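As an illustration of the value/mask syntax, assuming it means a masked
comparison and that a missing mask is treated as all ones (the helper
below is hypothetical, not the filter code itself):

```c
#include <stdint.h>

/* Hypothetical match: compare the socket mark under 'mask'. */
static int mark_matches(uint32_t sk_mark, uint32_t value, uint32_t mask)
{
	return (sk_mark & mask) == value;
}
```

Under that assumption, "fwmark = 0x1234/0x1234" matches any mark with all
bits of 0x1234 set, while "fwmark = 12345" requires an exact match.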
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Dump useful TCP BBR state information from a struct tcp_bbr_info that
was grabbed using the inet_diag API.
We tolerate info that is shorter or longer than expected, in case the
kernel is older or newer than the ss binary. We simply print the
minimum of what is expected from the kernel and what is provided from
the kernel. We use the same trick as that used for struct tcp_info:
when the info from the kernel is shorter than we hoped, we pad the end
with zeroes, and don't print fields if they are zero.
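The size handling can be sketched as follows. The struct and function
names are made up for the example and do not mirror struct tcp_bbr_info:

```c
#include <string.h>

/* Hypothetical info struct as known to the userspace binary. */
struct example_info {
	unsigned int first;
	unsigned int second;
};

/* Copy min(expected, provided) bytes; anything the kernel did not
 * provide stays zero and can be skipped when printing. */
static void copy_info(struct example_info *dst, const void *src,
		      size_t src_len)
{
	size_t len = src_len < sizeof(*dst) ? src_len : sizeof(*dst);

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, len);
}
```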
The BBR output looks like:
bbr:(bw:1.2Mbps,mrtt:18.965,pacing_gain:2.88672,cwnd_gain:2.88672)
The motivation here is to be consistent with DCTCP, which looks like:
dctcp(ce_state:23,alpha:23,ab_ecn:23,ab_tot:23)
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
The SCTP module is not loaded by default. But this should be OK since we
will not load the table if fdopen() fails; also, opening the proc file
won't load the SCTP kernel module.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
This only replaces occurrences where the newly allocated memory is
cleared completely afterwards, as in other cases it is a theoretical
performance hit although code would be cleaner this way.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
This big patch was compiled by vimgrepping for memset calls and changing
to C99 initializer if applicable. One notable exception is the
initialization of union bpf_attr in tc/tc_bpf.c: changing it would break
for older gcc versions (at least <=3.4.6).
Calls to memset for struct rtattr pointer fields for parse_rtattr*()
were just dropped since they are not needed.
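For illustration, the two forms side by side on a made-up struct (none of
these names come from the tree):

```c
#include <string.h>

struct example_req {
	int family;
	int index;
	char name[16];
};

/* Old style: declare, clear with memset(), then assign. */
static struct example_req make_req_memset(int family)
{
	struct example_req req;

	memset(&req, 0, sizeof(req));
	req.family = family;
	return req;
}

/* New style: C99 designated initializer; members that are not
 * named are zero-initialized. */
static struct example_req make_req_init(int family)
{
	struct example_req req = { .family = family };

	return req;
}
```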
The changes here allowed the compiler to discover some unused variables,
so get rid of them, too.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Top level can be any json type and can be created using
jsonw_start_object/jsonw_end_object etc.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add option to suppress header line. When used the following line
is not shown:
"State Recv-Q Send-Q Local Address:Port Peer Address:Port"
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Support was recently added for device filters. The intent was to allow
the device to be specified by name or index, and using the if%u format
(dev == if5) or the simpler and more intuitive index alone (dev == 5).
The latter case is broken since the index is not saved to the filter
after the strtoul conversion. Further, the tmp variable used for the
conversion shadows another variable used in the function. Fix both.
With this change all 3 variants work as expected:
$ ss -t 'dev == 62'
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 224 10.0.1.3%mgmt:ssh 192.168.0.50:58442
$ ss -t 'dev == mgmt'
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 224 10.0.1.3%mgmt:ssh 192.168.0.50:58442
$ ss -t 'dev == if62'
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 36 10.0.1.3%mgmt:ssh 192.168.0.50:58442
Fixes: 2d29321256 ("ss: Add support to filter on device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
This patch was generated by the following semantic patch (a trimmed down
version of what is shipped with Linux sources):
@@
type T;
T[] E;
@@
(
- (sizeof(E)/sizeof(*E))
+ ARRAY_SIZE(E)
|
- (sizeof(E)/sizeof(E[...]))
+ ARRAY_SIZE(E)
|
- (sizeof(E)/sizeof(T))
+ ARRAY_SIZE(E)
)
The only manual adjustment was to include utils.h in misc/nstat.c to make
the macro known there.
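For reference, a common definition of such a macro and a sample use. The
definition in utils.h may differ in detail, and the table below is
hypothetical:

```c
#include <stddef.h>

/* Element count of a true array (not of a pointer). */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table, only here to show the macro in use. */
static const char *const example_names[] = { "one", "two", "three" };

static size_t example_name_count(void)
{
	return ARRAY_SIZE(example_names);
}
```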
Signed-off-by: Phil Sutter <phil@nwl.cc>
Add support for device names in the filter. Example:
root@kenny:~# ss -t 'sport == :22 && dev == red'
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 10.100.1.2%red:ssh 10.100.1.254:47814
ESTAB 0 0 2100:1::2%red:ssh 2100:1::64:49406
Since the kernel does not support iface in the filter, specifying a
device name means all filtering is done in userspace.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Allow ssfilter_bytecompile to return 0 for filter ops the kernel
does not support. If such an op is in the filter string then all
filtering is done in userspace.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Extract parsing of sockstat and filter from inet_show_sock.
While moving run_ssfilter into callers of inet_show_sock enable
userspace filtering before the kill.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Similar to the Linux kernel and perf add infrastructure to reduce the
amount of output tossed to a user during a build. Full build output
can be obtained with 'make V=1'
Builds go from:
make[1]: Leaving directory `/home/dsa/iproute2.git/lib'
make[1]: Entering directory `/home/dsa/iproute2.git/ip'
gcc -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -c -o ip.o ip.c
gcc -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -c -o ipaddress.o ipaddress.c
to:
...
AR libutil.a
ip
CC ip.o
CC ipaddress.o
...
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
SK_MEMINFO_DROPS was added in linux-4.7 for TCP, UDP and SCTP.
skmem will display the socket drop count using the 'd' prefix, as in:
$ ss -tm src :22 | more
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 52 10.246.7.151:ssh 172.20.10.101:50759
skmem:(r0,rb8388608,t0,tb8388608,f1792,w2304,o0,bl0,d0)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Kernel sets info->tcpi_min_rtt to ~0U when no RTT sample was ever
taken for the session, thus min_rtt is unknown.
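A sketch of the resulting check, with a hypothetical helper name:

```c
#include <stdint.h>

/* The kernel reports ~0U (all bits set) when no RTT sample exists,
 * so only other values are worth printing. */
static int min_rtt_is_known(uint32_t tcpi_min_rtt)
{
	return tcpi_min_rtt != UINT32_MAX;
}
```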
Signed-off-by: Eric Dumazet <edumazet@google.com>
Passing a filter expression and selecting an address family using the
'-f' flag would overwrite the state filter by accident. Therefore
calling e.g. "ss -nl -f inet '(sport = :22)'" would not only print
listening sockets (as requested by '-l' flag) but connected ones, as
well.
Fix this by reusing the formerly ineffective call to filter_states_set()
to restore the state filter as it was before the call to
filter_af_set().
Signed-off-by: Phil Sutter <phil@nwl.cc>
There are only three users which require it to be reentrant, the rest is
fine without. Instead, provide a reentrant format_host_r() for users
which need it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Since the relevant code (and its bugs) is identical in both files, fix
them in one go. This patch fixes multiple issues:
* Using 'int' for the 'tdiff' variable does not suffice on 64bit
systems, the assigned initial time difference makes it wrap and
contain a negative value afterwards. Instead use the more appropriate
'time_t' type.
* As far as I understood the code, poll() is supposed to time out just
at the right time to trigger update_db() in the configured interval.
Therefore its timeout must be set to the desired interval *minus* the
time that has already passed since then.
* With the last change to the algorithm in place, it does not make sense
to call update_db() before returning data to the connected client.
Actually, it never does, otherwise we could skip the periodic updates
in the first place.
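The timeout calculation described in the second item can be sketched like
this. The helper and its parameters are hypothetical and use whole
seconds for simplicity, which is not necessarily what the tools do:

```c
#include <time.h>

/* Hypothetical helper: milliseconds poll() should wait so that the
 * next update happens 'interval' seconds after the previous one. */
static int poll_timeout_ms(time_t interval, time_t last_update, time_t now)
{
	time_t tdiff = now - last_update;	/* time_t, not int */

	if (tdiff >= interval)
		return 0;			/* update is already due */
	return (int)((interval - tdiff) * 1000);
}
```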
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch adds a -K / --kill option to ss that attempts to
forcibly close matching sockets using SOCK_DESTROY.
Because ss typically prints sockets instead of acting on them,
and because the kernel only supports forcibly closing some types
of sockets, the output of -K is as follows:
- If closing the socket succeeds, the socket is printed.
- If the kernel does not support forcibly closing this type of
socket (e.g., if it's a UDP socket, or a TIME_WAIT socket),
the socket is silently skipped.
- If an error occurs (e.g., permission denied), the error is
reported and ss exits.
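The three cases can be pictured as a classification of the error code
returned for the destroy request. This is a sketch only: the enum, the
function name and the use of EOPNOTSUPP for "not supported" are
assumptions, not taken from the patch.

```c
#include <errno.h>

enum kill_action {
	KILL_PRINT_SOCKET,	/* closing succeeded */
	KILL_SKIP_SILENTLY,	/* socket type cannot be closed */
	KILL_REPORT_AND_EXIT,	/* any other error */
};

/* Hypothetical mapping from an errno-style code to the behaviour
 * listed above; 0 means success. */
static enum kill_action classify_kill_result(int err)
{
	if (err == 0)
		return KILL_PRINT_SOCKET;
	if (err == EOPNOTSUPP)	/* assumed "not supported" code */
		return KILL_SKIP_SILENTLY;
	return KILL_REPORT_AND_EXIT;
}
```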
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
fgets() will read at most size-1 bytes into the buffer and add a
terminating null-char at the end. Therefore it is not necessary to pass
a reduced buffer size when calling it.
This change was generated using the following semantic patch:
@@
identifier buf, fp;
@@
- fgets(buf, sizeof(buf) - 1, fp)
+ fgets(buf, sizeof(buf), fp)
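A small demonstration of that behaviour, using a deliberately tiny buffer
and a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Read the start of the next line into an 8 byte buffer and return
 * how many characters fgets() stored (0 on EOF or error). */
static size_t first_chunk_length(FILE *fp)
{
	char buf[8];

	/* Passing the full size is safe: fgets() stores at most
	 * sizeof(buf) - 1 characters plus the terminating null. */
	if (fgets(buf, sizeof(buf), fp) == NULL)
		return 0;
	return strlen(buf);
}
```

For an input line longer than the buffer, the function returns 7, i.e.
sizeof(buf) - 1.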
Signed-off-by: Phil Sutter <phil@nwl.cc>
Although not fundamentally necessary to check return codes in these
spots, preventing the warnings will put new ones into focus.
Signed-off-by: Phil Sutter <phil@nwl.cc>
No need to keep static port boundaries global, they are not used
directly. Keeping them local also allows safely reducing their names to
the minimum. Assign hardcoded fallback values also if fscanf() fails.
Get rid of unnecessary braces around return parameter.
Instead of more or less duplicating is_ephemeral() in run_ssfilter(),
simply call the function instead.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Exit early or continue on error instead of putting conditional into
conditional to make reading the code a bit easier.
Also, the call to memcpy() can be skipped by initialising prog with the
desired prefix.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Instead of calling rewind() and fgets() before every call to
scan_lines(), move them into scan_lines() itself.
This should also fix compat mode, as before the second call to
scan_lines() the first line was skipped unconditionally.
Signed-off-by: Phil Sutter <phil@nwl.cc>
The algorithm depends on the loop counter ('i') to increment by one in
each iteration. Though if running endlessly (count==0), the counter was
not incremented at all.
Also change formatting of the header printing conditional a bit so it's
hopefully easier to read.
Fixes: e7e2913 ("lnstat: run indefinitely by default")
Signed-off-by: Phil Sutter <phil@nwl.cc>