iproute2/examples
Daniel Borkmann 32e93fb7f6 {f,m}_bpf: allow for sharing maps
This larger work addresses one of the bigger remaining issues on
tc's eBPF frontend, that is, to allow for persistent file descriptors.
Whenever tc parses the ELF object, extracts and loads maps into the
kernel, these file descriptors will be out of reach after the tc
instance exits.

Meaning, for simple (unnested) programs that contain one or
more maps, the kernel holds a reference, and the maps will live
on inside the kernel until the program holding them is unloaded,
but they will be out of reach for user space; it gets even worse
with (possibly multiply nested) tail calls.

For this issue, we introduced the concept of an agent that can
receive the set of file descriptors from the tc instance creating
them, in order to be able to further inspect/update map data for
a specific use case. However, while that is more tied towards
specific applications, it still doesn't easily allow for sharing
maps across multiple tc instances and would require a daemon to
be running in the background. For example, when a map should be
shared by two eBPF programs, one attached to ingress and one to
egress, this currently doesn't work with the tc frontend.

This work solves exactly that, i.e. if requested, maps can now be
_arbitrarily_ shared between object files (PIN_GLOBAL_NS) or within
a single object (but various program sections, PIN_OBJECT_NS) without
"losing" the file descriptor set. To make that happen, we use eBPF
object pinning introduced in kernel commit b2197755b263 ("bpf: add
support for persistent maps/progs") for exactly this purpose.

The shipped examples/bpf/bpf_shared.c code from this patch can be
easily applied, for instance, as:

 - classifier-classifier shared:

  tc filter add dev foo parent 1: bpf obj shared.o sec egress
  tc filter add dev foo parent ffff: bpf obj shared.o sec ingress

 - classifier-action shared (here: late binding to a dummy classifier):

  tc actions add action bpf obj shared.o sec egress pass index 42
  tc filter add dev foo parent ffff: bpf obj shared.o sec ingress
  tc filter add dev foo parent 1: bpf bytecode '1,6 0 0 4294967295,' \
     action bpf index 42

The toy example increments a shared counter on egress and dumps its
value on ingress (had no sharing (PIN_NONE) been chosen, the map
value would be 0, of course, since two separate map instances would
have been created):

  [...]
          <idle>-0     [002] ..s. 38264.788234: : map val: 4
          <idle>-0     [002] ..s. 38264.788919: : map val: 4
          <idle>-0     [002] ..s. 38264.789599: : map val: 5
  [...]

... thus if both sections reference the pinned map(s) in question,
tc will take care of fetching the appropriate file descriptor.
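
As a quick way to see the effect of pinning, the path tc uses for a
PIN_GLOBAL_NS map can be derived by hand. A minimal sketch, assuming the
default bpf fs mount point /sys/fs/bpf and a hypothetical map name
"map_sh"; tc places globally shared maps under a tc/globals/ subdirectory:

```shell
# construct the pin path for a PIN_GLOBAL_NS map
# (mount point and map name are assumptions of this sketch)
bpffs=/sys/fs/bpf
map=map_sh
pin="${bpffs}/tc/globals/${map}"
echo "$pin"           # prints /sys/fs/bpf/tc/globals/map_sh
```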

The patch has been tested extensively on both the classifier and
action sides.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2015-11-23 16:10:44 -08:00
bpf {f,m}_bpf: allow for sharing maps 2015-11-23 16:10:44 -08:00
diffserv (Logical change 1.3) 2004-04-15 20:56:59 +00:00
README.cbq Grab some more CBQ examples from Fedora Core 2005-10-12 22:46:23 +00:00
SYN-DoS.rate.limit (Logical change 1.3) 2004-04-15 20:56:59 +00:00
cbq.init-v0.7.3 cbq: fix find syntax in example 2015-04-20 09:57:14 -07:00
cbqinit.eth1 (Logical change 1.3) 2004-04-15 20:56:59 +00:00
dhcp-client-script dhcp-client-script: don't use /tmp 2012-02-15 10:05:45 -08:00
gaiconf gaiconf: /etc/gai.conf configuration helper. 2010-03-29 13:59:28 -07:00

README.cbq

# CHANGES
# -------
# v0.3a2- fixed bug in "if" operator. Thanks kad@dgtu.donetsk.ua.
# v0.3a-  added TIME parameter. Example:
#         TIME=00:00-19:00;64Kbit/6Kbit
#         So, between 00:00 and 19:00 RATE will be 64Kbit.
#         Just start "cbq.init timecheck" periodically from cron (every 10
#         minutes for example).
#         !!! Anyway you MUST start "cbq.init start" for CBQ initialize.
# v0.2 -  Some cosmetic changes. Now it is more compatible with
#         old bash versions. Thanks to Stanislav V. Voronyi
#         <stas@cnti.uanet.kharkov.ua>.
# v0.1 -  First public release
# 
# README
# ------
# 
# First of all - this is just a SIMPLE EXAMPLE of CBQ power.
# Don't ask me "why" and "how" :)
# 
# This is an example of using CBQ (Class Based Queueing) and policy-based
# filter for building smart ethernet shapers. All CBQ parameters are
# correct only for ETHERNET (eth0,1,2..) linux interfaces. It works for
# ARCNET too (just set bandwidth parameter to 2Mbit). It was tested
# on 2.1.125-2.1.129 linux kernels (KSI linux, Nostromo version) and 
# ip-route utility by A.Kuznetsov (iproute2-ss981101 version). 
# You can download ip-route from ftp://ftp.inr.ac.ru/ip-routing or
# get iproute2*.rpm (compiled with glibc) from ftp.ksi-linux.com.
# 
# 
# HOW IT WORKS
# 
# Each shaper must be described by config file in $CBQ_PATH
# (/etc/sysconfig/cbq/) directory - one config file for each CBQ shaper.
# 
# Some words about the config file name:
# Each shaper has its personal ID - a two-byte HEX number. Actually,
# the ID is the CBQ class.
# So, the filename looks like:
# 
# cbq-1280.My_first_shaper
# ^^^ ^^^  ^^^^^^^^^^^^^
#  |  |            |______ Shaper name - any word
#  |  |___________________ ID (0000-FFFF); let the ID reflect the shaper's rate
#  |______________________ Filename must begin with "cbq-"
# 
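# As a small illustration (not part of the original README), the ID can
# be pulled out of such a filename with plain shell parameter expansion;
# this is only a sketch, cbq.init's actual parsing may differ:

```shell
# split "cbq-<ID>.<name>" into its ID part
f="cbq-1280.My_first_shaper"
id="${f#cbq-}"      # strip the mandatory "cbq-" prefix -> 1280.My_first_shaper
id="${id%%.*}"      # drop everything from the first dot on
echo "$id"          # prints 1280
```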
# 
# Config file describes shaper parameters and source[destination] 
# address[port].
# For example let's prepare /etc/sysconfig/cbq/cbq-1280.My_first_shaper:
# 
# ----------8<---------------------
# DEVICE=eth0,10Mbit,1Mbit
# RATE=128Kbit
# WEIGHT=10Kbit
# PRIO=5
# RULE=192.168.1.0/24
# ----------8<---------------------
# 
# This is minimal configuration, where:
# DEVICE:  eth0   - the device where we control our traffic
#          10Mbit - REAL ethernet card bandwidth
#          1Mbit  - "weight" of the :1 class (parent of all shapers on eth0);
#                   as a rule of thumb, weight=bandwidth/10.
#          100Mbit adapter's example: DEVICE=eth0,100Mbit,10Mbit
#          *** If you want to build more than one shaper per device, it's
#              enough to describe bandwidth and weight once - cbq.init
#              is smart :) You can put just 'DEVICE=eth0' into the cbq-*
#              config files for eth0.
# 
# RATE:    Shaper's speed - Kbit,Mbit or bps (bytes per second)
# 
# WEIGHT:  "weight" of the shaper (CBQ class). As with DEVICE - approx. RATE/10
# 
# PRIO:    shaper's priority, from 1 to 8, where 1 is the highest.
#          I always use "5" for all my shapers.
# 
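# The RATE/10 rule of thumb is easy to compute. A sketch (values in
# Kbit; the variable names are made up for illustration):

```shell
rate_kbit=128
weight_kbit=$(( rate_kbit / 10 ))    # rule of thumb: weight = rate/10
echo "WEIGHT=${weight_kbit}Kbit"     # prints WEIGHT=12Kbit
```

# (The example below rounds this to 10Kbit; the value only needs to be
# approximate.)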
# RULE:    [source addr][:source port],[dest addr][:dest port]
#          Some examples:
# RULE=10.1.1.0/24:80         - all traffic for network 10.1.1.0 to port 80
#                               will be shaped.
# RULE=10.2.2.5               - shaper works only for IP address 10.2.2.5   
# RULE=:25,10.2.2.128/25:5000 - all traffic from any address and port 25 to
#                               address 10.2.2.128 - 10.2.2.255 and port 5000
#                               will be shaped.
# RULE=10.5.5.5:80,           - shaper active only for traffic from port 80 of
#                               address 10.5.5.5
# Multiple RULE fields per one config file are allowed. For example:
# RULE=10.1.1.2:80
# RULE=10.1.1.2:25
# RULE=10.1.1.2:110
# 
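# A RULE value can likewise be split into its address and port parts
# with shell parameter expansion. Again just a sketch, not necessarily
# how cbq.init parses it:

```shell
rule="10.2.2.128/25:5000"
addr="${rule%:*}"     # everything before the last ":" -> 10.2.2.128/25
port="${rule##*:}"    # everything after the last ":"  -> 5000
echo "$addr $port"
```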
# *** ATTENTION!!!
# All shapers work only on outgoing traffic!
# So, if you want to build a bidirectional shaper you must set it up on
# both ethernet cards. For example, let's build a shaper for our linux box:
# 
#                     ---------             192.168.1.1
# BACKBONE -----eth0-|  linux  |-eth1------*[our client]
#                     ---------
# 
# Let all traffic from the backbone to the client be shaped at 28Kbit and
# traffic from the client to the backbone at 128Kbit. We need two config files:
# 
# ---8<-----/etc/sysconfig/cbq/cbq-28.client-out----
# DEVICE=eth1,10Mbit,1Mbit
# RATE=28Kbit
# WEIGHT=2Kbit
# PRIO=5
# RULE=192.168.1.1
# ---8<---------------------------------------------
# 
# ---8<-----/etc/sysconfig/cbq/cbq-128.client-in----
# DEVICE=eth0,10Mbit,1Mbit
# RATE=128Kbit
# WEIGHT=10Kbit
# PRIO=5
# RULE=192.168.1.1,
# ---8<---------------------------------------------
#                 ^pay attention to "," - this is source address!
# 
# Enjoy.