Commit 4905924d authored by Alessandro Rubini

doc: documented radiusvlan

Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
parent d53a95c0
\input texinfo @c -*-texinfo-*-
%
% radiusvlan.in - main file for the documentation
%
%%%%
%------------------------------------------------------------------------------
%
% NOTE FOR THE UNAWARE USER
% =========================
%
% This file is a texinfo source. It isn't the binary file of some strange
% editor of mine. If you want ASCII, you should "make radiusvlan.txt".
%
%------------------------------------------------------------------------------
%
% This is not a conventional info file...
% I use three extra features:
% - The '%' as a comment marker, if at beginning of line ("\%" -> "%")
% - leading blanks are allowed (this is something I cannot live without)
% - braces are automatically escaped when they appear in example blocks
%
@comment %**start of header
@documentlanguage en
@documentencoding ISO-8859-1
@setfilename radiusvlan.info
@settitle radiusvlan
@iftex
@afourpaper
@end iftex
@paragraphindent none
@comment %**end of header
@setchapternewpage off
@set update-month October 2020
@c the release name below is substituted at build time
@set release __RELEASE_GIT_ID__
@finalout
@titlepage
@title Radius Vlan
@subtitle Description of the mechanism used in WR switch
@subtitle @value{update-month} (@value{release})
@author A. Rubini
@end titlepage
@headings single
@c ##########################################################################
@iftex
@contents
@end iftex
@c ##########################################################################
@c in texinfo we are mandated to have a Top node
@node Top
@top Introduction
This document describes a new feature of the White Rabbit Switch,
to be used in GSI. When a device is detected on a port which is
configured as ``access'', a Radius server is queried for authorization.
The feature is called @i{radiusvlan} or @i{rvlan} when a shorter name
is to be preferred. It relies on @i{radclient}, which was added
to @i{buildroot} in a new @i{freeradius-utils} package. The package
installs @i{radtest}, @i{radclient} and a minimal dictionary (the full
dictionary is more than 1MB worth of data).
@c ##########################################################################
@node Related Kconfig Items
@chapter Related Kconfig Items
Like most features in White Rabbit Switch, @i{radiusvlan} is configured
through @i{Kconfig}. The @i{dot-config} file is used both at build time
and at run time (where it lives in @i{/wr/etc}).
This is the list of configuration items related to @i{radiusvlan}. None
of them affects the firmware build; they are only used at run time.
@table @code
@item RVLAN_DAEMON
The boolean option selects whether the tool is to be run or not.
If disabled, the tool will not run and the related @i{monit} rule
won't be activated. No further config option has any effect if
this flag is false.
@item RVLAN_PMASK
A port mask. If any bit in the mask is 0, the associated port will
not be monitored by the tool. Port @i{wri1} is associated to bit 0,
and so on until @i{wri18} associated to bit 17. Bits 18-31 are
ignored. The default value is all-1.
@item RVLAN_AUTH_VLAN
A temporary @sc{vid} to be used during port authorization.
Defaults to 4094.
@item RVLAN_NOAUTH_VLAN
The @sc{vid} to be used for ports that are not authorized.
Defaults to 4094.
@item RVLAN_OBEY_DOTCONFIG
A boolean option. If set, @i{radiusvlan} will obey the @sc{vid}
value set forth in @i{dot-config} rather than what the Radius
server returned. Thus, the Radius server's reply is only used
to decide whether the port is authorized (if not, @t{NOAUTH_VLAN} is applied).
@item RVLAN_RADIUS_SERVERS
A string listing the IP addresses of a set of Radius servers.
Currently, only the first address (or the only one) is used.
@end table
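As a worked example of the @t{RVLAN_PMASK} rule, the following shell
sketch tests the bit associated with a port. The function name
@t{port_monitored} is hypothetical and not part of the tool; it only
restates the bit numbering given above (@i{wri1} is bit 0, @i{wri18}
is bit 17).
@example
# Hypothetical helper, not part of radiusvlan: report whether port
# wriN is enabled in a port mask (bit N-1). Ports outside 1..18 are
# never monitored, because bits 18-31 are ignored.
port_monitored() {
    mask=$1
    port=$2
    if [ "$port" -lt 1 ] || [ "$port" -gt 18 ]; then
        echo no
    elif [ $(( (mask >> (port - 1)) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

port_monitored 0xffffffff 17   # prints "yes"
port_monitored 0x5 2           # prints "no"
@end example
With a mask of @t{0x5} only bits 0 and 2 are set, so only @i{wri1}
and @i{wri3} would be monitored.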
@c ##########################################################################
@node Service Activation and Monitoring
@chapter Service Activation and Monitoring
The tool is a standalone program that ignores its own command line;
all configuration information comes from dot-config, as described
above.
The service is executed at boot from @i{/etc/init.d/radiusvlan}, which
has the same structure as all other similar scripts.
In @i{/etc/rcS}, the symbolic link must be after @i{vlan} configuration.
The service, like most other @i{wrs} services, is monitored by @i{monit},
with the same parameters as all other services. Working @i{monit} setup
can be confirmed by running
@example
while true; do killall radiusvlan; sleep 3; done
@end example
which will cause @i{monit} to trigger a reboot.
Both invocation and monitoring depend on @i{dot-config}: if @t{RVLAN_DAEMON}
is false, neither of them is activated.
@c ##########################################################################
@node Internal Design
@chapter Internal Design
The tool enumerates all @i{wri*} interfaces in @i{/sys/class/net}.
It opens a @i{netlink} socket to get notification of any change in
interface up/down status, and then checks whether each of them
is up or down (thus, no change can get undetected).
Only ports configured as @t{VLAN_PORTxx_MODE_ACCESS} are monitored,
and only if the corresponding bit in @t{RVLAN_PMASK} is set.
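The enumeration step can be approximated from the shell. This is only
an illustration: the tool itself is a standalone program that relies on
@i{netlink}, while the sketch below reads the standard @i{operstate}
attribute that Linux exposes for each interface. The function name and
its optional directory argument are assumptions made for the example.
@example
# Hypothetical sketch: list wri* interfaces and their link state.
# The directory argument defaults to /sys/class/net and exists only
# to make the function easy to try on a test tree.
list_wri_ports() {
    root=${1:-/sys/class/net}
    for dir in "$root"/wri*; do
        [ -r "$dir/operstate" ] || continue
        echo "Check ${dir##*/}: $(cat "$dir/operstate")"
    done
}
@end example
The sketch does not filter on access mode or on @t{RVLAN_PMASK}; it
only shows where the list of candidate ports comes from.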
For each monitored port, the tool runs a state machine, where the initial state
is @i{DOWN} or @i{JUSTUP}. Whenever a state change is reported by
@i{netlink}, the port is moved to either @i{JUSTUP} or @i{GODOWN}.
This is the list of states. No state is blocking, so operation on one
port does not stop operation on other ports (the engine is based
on @i{select()}).
@table @code
@item RVLAN_DOWN
The port is quietly down.
@item RVLAN_JUSTUP
The port was just reported as ``up'' and we must start
authentication. The first step is identifying the MAC address
of the peer. Thus, the tool starts sniffing the port, by opening
a raw socket listening to this port.
@item RVLAN_SNIFF
Get a frame from the port. If the frame is sent from the switch
itself, it is ignored. Any address configured in the switch is
considered as @i{self} (i.e., also the @i{eth0} mac address, used
by @i{ppsi} as sender address, is ignored). When a foreign
frame is received, the tool runs @i{radclient}, feeding data
to its @i{stdin} and collecting its @i{stdout}. The @i{pid}
of the child process is retained for later cleanup.
@item RVLAN_AUTH
@i{radclient} returned some data. This state collects it until EOF.
When the reply is complete, the tool looks for ``@t{Framed-User}''
and ``@t{Tunnel-Private-Group-Id}''. If both exist, authentication
succeeded. The @i{chosen_vlan} is either the one returned by the
Radius Server or the one set forth in dot-config, according to
the @t{OBEY_DOTCONFIG} parameter.
@item RVLAN_CONFIG
This state calls ``@t{wrs_vlans --port <port> --pvid <pvid>}'',
where @i{pvid} is @i{noauth_vlan} if authorization failed.
We then move to @t{CONFIGURED} state. The external @i{wrs_vlans}
tool is lazily executed with @i{system(3)}, and thus this state
is blocking. However, @i{wrs_vlans} completes in no time, so this
is acceptable in my opinion.
@item RVLAN_CONFIGURED
The port is quietly running, no action is performed.
@item RVLAN_GODOWN
This state is entered whenever @i{netlink} reports that the
interface went down. It closes any open file descriptor and
kills the child process, if it exists. Thus, no
garbage remains in the system even if the fiber is unplugged
and re-plugged quickly several times.
@item RVLAN_WAIT
Wait for the child process to terminate (after we killed it).
A transient state that leads to @t{RVLAN_DOWN}.
@end table
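The transitions described in the table can be summarized with a small
lookup function. This is a sketch of the documented flow, not the tool's
code: the event names (@t{link_up}, @t{link_down}, @t{done},
@t{read_error}) are invented for illustration, and any other pair
leaves the state unchanged.
@example
# Hypothetical summary of the documented transitions.
#   $1  current state (lower case, without the RVLAN_ prefix)
#   $2  event name (invented for this sketch)
rvlan_next_state() {
    state=$1
    event=$2
    # Link changes reported by netlink override whatever is in progress.
    case "$event" in
        link_up)   echo justup; return ;;
        link_down) echo godown; return ;;
    esac
    case "$state:$event" in
        justup:done)     echo sniff ;;
        sniff:done)      echo auth ;;
        auth:done)       echo config ;;
        auth:read_error) echo godown ;;
        config:done)     echo configured ;;
        godown:done)     echo wait ;;
        wait:done)       echo down ;;
        *)               echo "$state" ;;
    esac
}
@end example
For instance, @t{rvlan_next_state sniff done} prints @t{auth}, while
@t{rvlan_next_state configured done} prints @t{configured} because a
quiet port performs no action.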
@c ##########################################################################
@node Robustness
@chapter Robustness
The tool is designed to be robust. All possible errors are reported back to
the caller (and to @i{stderr}) and no blocking operation is performed.
The only exception is the call to @i{wrs_vlans}, which is blocking.
Any error in the state machine leaves the port in the same state,
so @i{wrs_vlans} is re-run if it fails, and so on. Failure in
reading replies from @i{radclient} turns the FSM to @i{GODOWN}, so
the procedure is started again -- because the port is up.
If the Radius server is not reachable, @i{radclient} will time out.
The startup script uses @t{CONFIG_WRS_LOG_OTHER} as a destination for
its own output, and it is verified to work with my local @i{rsyslog}
server.
@c ##########################################################################
@node Diagnostic Tools
@chapter Diagnostic Tools
@c ==========================================================================
@node Checking the Current Status
@section Checking the Current Status
You can always see the current configuration by running @t{rvlan-status}.
This example is taken in a running switch, where port @i{wri1} is in
trunk mode (and thus not monitored), and only port @i{wri17} is connected to
a slave, which was authorized:
@smallexample
nwt0075m66# /wr/bin/rvlan-status
wri2 (70b3d591e346 <-> ): state down, vlan 0, pid 0, fd -1
wri3 (70b3d591e347 <-> ): state down, vlan 0, pid 0, fd -1
wri4 (70b3d591e348 <-> ): state down, vlan 0, pid 0, fd -1
wri5 (70b3d591e349 <-> ): state down, vlan 0, pid 0, fd -1
wri6 (70b3d591e34a <-> ): state down, vlan 0, pid 0, fd -1
wri7 (70b3d591e34b <-> ): state down, vlan 0, pid 0, fd -1
wri8 (70b3d591e34c <-> ): state down, vlan 0, pid 0, fd -1
wri9 (70b3d591e34d <-> ): state down, vlan 0, pid 0, fd -1
wri10 (70b3d591e34e <-> ): state down, vlan 0, pid 0, fd -1
wri11 (70b3d591e34f <-> ): state down, vlan 0, pid 0, fd -1
wri12 (70b3d591e350 <-> ): state down, vlan 0, pid 0, fd -1
wri13 (70b3d591e351 <-> ): state down, vlan 0, pid 0, fd -1
wri14 (70b3d591e352 <-> ): state down, vlan 0, pid 0, fd -1
wri15 (70b3d591e353 <-> ): state down, vlan 0, pid 0, fd -1
wri16 (70b3d591e354 <-> ): state down, vlan 0, pid 0, fd -1
wri17 (70b3d591e355 <-> 00267b0003d4): state configured, vlan 31, pid 0, fd -1
wri18 (70b3d591e356 <-> ): state down, vlan 0, pid 0, fd -1
@end smallexample
This works by sending @i{SIGUSR1} to the running @i{radiusvlan}, which
creates @i{/tmp/rvlan-status} with the above information. If
@i{radiusvlan} is not running, @i{rvlan-status} will report
``@t{radiusvlan: no process found}''.
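Because the status report is plain text, it is easy to post-process.
The sketch below extracts port name, peer MAC address and VLAN for the
configured ports; the function name is hypothetical and the pattern
assumes the exact line layout shown in the example above.
@example
# Hypothetical filter: read rvlan-status output on stdin and print
# "port peer-mac vlan" for every port in state "configured".
rvlan_configured_ports() {
    sed -n 's/^\(wri[0-9]*\) (.* <-> \([0-9a-f]*\)): state configured, vlan \([0-9]*\),.*$/\1 \2 \3/p'
}
@end example
It could be used as @t{/wr/bin/rvlan-status | rvlan_configured_ports},
which for the report above would print only the @i{wri17} line.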
@c ==========================================================================
@node Forcing Re-Authorization
@section Forcing Re-Authorization
By sending @i{SIGUSR2} to a running @i{radiusvlan}, all state machines
are turned to @t{JUSTUP} so all authorization is retried. Please note
that this is pretty raw, and should only be run in a quiet system
where all interfaces are @i{configured} or @i{down} (the cleanup of
state @t{GODOWN} is not performed).
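Given this caveat, it may be prudent to check that the system is quiet
before sending the signal. The following is a minimal sketch, assuming
the status format shown in the previous section; the function name is
hypothetical.
@example
# Hypothetical check: read rvlan-status output on stdin and succeed
# only if every reported port is in state "configured" or "down".
rvlan_is_quiet() {
    awk '/state / && !/state (configured|down),/ { busy = 1 }
         END { exit busy }'
}
@end example
A possible use is
@t{rvlan-status | rvlan_is_quiet && killall -USR2 radiusvlan},
so the signal is only sent when no port is in a transient state.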
@c ==========================================================================
@node Looking at Authorization Strings
@section Looking at Authorization Strings
Communication with @i{radclient} happens using @i{stdin} and @i{stdout}.
Currently @i{radiusvlan} saves both files in @i{/tmp}, to help trace
any errors. The file names are port-specific, so only the last iteration
will be visible.
Here are two examples: a successful @i{wri17} authentication and a failed
@i{wri3} authentication:
@smallexample
nwt0075m66# grep . /tmp/radclient-wri17-*
/tmp/radclient-wri17-in:User-Name = "00267b0003d4"
/tmp/radclient-wri17-in:User-Password = "00267b0003d4"
/tmp/radclient-wri17-out:Received response ID 93, code 2, length = 50
/tmp/radclient-wri17-out: Tunnel-Type:0 = 13
/tmp/radclient-wri17-out: Tunnel-Medium-Type:0 = IEEE-802
/tmp/radclient-wri17-out: Framed-Protocol = PPP
/tmp/radclient-wri17-out: Service-Type = Framed-User
/tmp/radclient-wri17-out: Tunnel-Private-Group-Id:0 = "2984"
wrs# grep . /tmp/radclient-wri3-*
/tmp/radclient-wri3-in:User-Name = "90e2ba456c6b"
/tmp/radclient-wri3-in:User-Password = "90e2ba456c6b"
/tmp/radclient-wri3-out:radclient: no response from server for ID 98 socket 4
@end smallexample
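The decision described for the @t{RVLAN_AUTH} state can be reproduced
on the saved reply files. The following is a sketch, not the tool's
code: the function name and its arguments are hypothetical, and it
assumes the reply layout shown above.
@example
# Hypothetical helper: print the VLAN to apply for a saved reply.
#   $1  reply file (for example /tmp/radclient-wri17-out)
#   $2  value of NOAUTH_VLAN, applied when authorization failed
#   $3  optional VLAN from dot-config; if non-empty it overrides the
#       server's value, as with OBEY_DOTCONFIG
rvlan_choose_vlan() {
    reply=$1
    noauth=$2
    dotconfig_vid=${3:-}
    vid=$(sed -n \
        's/.*Tunnel-Private-Group-Id[^=]*= *"\([0-9]*\)".*/\1/p' \
        "$reply" | head -n 1)
    # Both markers must be present for authorization to succeed.
    if grep -q 'Framed-User' "$reply" && [ -n "$vid" ]; then
        echo "${dotconfig_vid:-$vid}"
    else
        echo "$noauth"
    fi
}
@end example
With the two replies above and a no-auth value of 4094, the
@i{wri17} file would yield 2984 and the @i{wri3} file would
yield 4094.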
@c ==========================================================================
@node Verbose Operation
@section Verbose Operation
Finally, if you set @t{RVLAN_VERBOSE} to a non-empty value in the
tool's environment, initial enumeration and state machine changes are
reported to @i{stdout}. This is an example on a running switch where
@i{radiusvlan} was already automatically run:
@smallexample
wrs# export RVLAN_VERBOSE=y; killall radiusvlan; /wr/bin/radiusvlan
device wri3 left promiscuous mode
Pmask = 0xffffffff
Interface "wri1": not access mode
Check wri2: up
Check wri5: down
Check wri6: down
Check wri7: down
Check wri3: up
Check wri4: down
Check wri8: down
[...]
FSM: device wri3 entered promiscuous mode
wri2: justup -> sniff
FSM: wri3: justup -> sniff
recvfrom(wri2): 0026-0008546f9863
FSM: wri2: sniff -> auth
recvfrom(wri3): 0800-90e2ba456c6b
FSM: wri3: sniff -> auth
dev wri2, got 55 bytes so far
wri2: reaped radclient: 0x00000100
dev wri2: vlan 4094
FSM: wri2: auth -> config
FSM: wri2: config -> configured
dev wri3, got 54 bytes so far
wri3: reaped radclient: 0x00000100
dev wri3: vlan 4094
FSM: wri3: auth -> config
FSM: wri3: config -> configured
@end smallexample
In the above example, two interfaces were up and authorization failed for
both (as seen, @i{radclient} did @t{exit(1)}). Both interfaces
are configured in vlan 4094.
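The value printed after ``reaped radclient'' looks like a raw
@i{wait(2)} status. On Linux the exit code sits in bits 8 to 15 of that
word, which is how @t{0x00000100} maps to @t{exit(1)}. The arithmetic
can be checked from the shell, with a hypothetical function name:
@example
# Hypothetical helper: extract the exit code from a Linux wait status
# (what WEXITSTATUS() does in C). Only meaningful for a normal exit.
wait_exit_code() {
    echo $(( ($1 >> 8) & 0xff ))
}

wait_exit_code 0x00000100    # prints 1
@end example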
@c ##########################################################################
@node Bugs and Missing Features
@chapter Bugs and Missing Features
A few, unfortunately:
@itemize @bullet
@item Only one Radius server is queried. The tool should use all
the servers found in @t{RVLAN_RADIUS_SERVERS}, moving to the next one
when a server is not replying (like in the @i{wri3} example above).
@item Currently a port is not moved to @i{vid} ``auth'' during authorization.
The value is saved to the internal state but not enacted.
@item It is not expected that MAC addresses change. Both identification
of self frames and blessing of peers (for authorization) have an
ever-lasting effect. Clearly, if you change the client on a port, the
link-down and link-up events will force authentication on the new MAC
address.
@item Vlan configuration only happens with @t{--pvid} configuration,
and no action is performed on the routing table.
@end itemize
The last item is tricky. The White Rabbit Switch must be informed
about vlan-sets, in order to correctly route frames, but those sets
sometimes cannot just be derived by the individual @i{vid} settings.
I was told that for the current application (an ``obey-dotconfig''
one) I should not touch the routing table, but I'm sure this is not
correct for a real multi-vlan setup (especially a dynamic
radius-driven environment). This should be investigated.
@bye