Commit 65dbe669 authored by Alessandro Rubini

userspace/libwr and doc: add shared memory support

This is the new IPC mechanism (not RPC) to pass port status and
other status-like information among processes, to avoid lengthy RPC
calls.

Each WR process has 32k of shared space.

Thanks to Adam for some fixes.
Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Signed-off-by: Adam Wujek <adam.wujek@cern.ch>
parent fe9f6db6
......@@ -1515,18 +1515,28 @@ For further details on the update procedure, please see
distributed in @t{userspace/rootfs_override/}.
@c ##########################################################################
@node Inter-Process Comnunication
@node Inter-Process Communication
@chapter Inter-Process Communication
This chapter describes the network of IPC/RPC communications that are
active inside the switch.
Currently there are two mechanisms in place: a simple RPC library
(``Remote Procedure Call''), so a process can ask another process
to perform an action, and a shared memory mechanism, so the status of
each @sc{wrs} process can be shared with other processes.
Initially, and up to release 4.1 of
@t{wr-switch-sw}, everything was RPC-based, including the passing
of status information. Starting in November 2014 we introduced
shared memory, to lower the CPU usage and increase the ability
to monitor overall Switch status.
@c ==========================================================================
@node mini-rpc
@section mini-rpc
Inter-process communication in the WR switch is based on remote
procedure calls (RPC), relying on the @i{mini-rpc} package.
The RPC mechanism in the switch relies on the @i{mini-rpc} package.
The package is a submodule, currently at this commit:
@smallexample
......@@ -1564,20 +1574,20 @@ from @i{wrpc-sw} since version 4.0 (Aug 2014).
@section RPC Sockets and Communication
This section describes the network of RPC calls that exist in the
whitre rabbir switch.
White Rabbit Switch.
First, @t{rtud} creates a socket with name @t{rtud}
which is only used by @t{rtu_stat}, to
Other RPC servers are created in the following places:
RPC servers are created in the following places:
@table @code
@item userspace/wrsw_rtud/rtud_exports.c
@c FIXME: rtud should use shmem for status
The socket is called @t{rtud} and is used by the @i{rtu_stat}
program to gather runtime information, and by @i{wrs_vlans}
to request actual actions.
@item userspace/ppsi/arch-wrs/wrs-startup.c
@c FIXME: ppsi should use shmem
The @t{ptpd} channel is created to report PTP status information.
......@@ -1598,6 +1608,7 @@ Clients are created in the following places:
@table @code
@item userspace/tools/rtu_stat.c
@c FIXME: rtu_stat should use shmem
The tool connects to @i{rtud} to get runtime information.
......@@ -1607,6 +1618,7 @@ Clients are created in the following places:
actions related to vlan setup.
@item userspace/wrsw_hal/hal_exports.c
@c FIXME: no more check using a socket, but with the shmem mechanism
A temporary client is created to check whether a HAL process
is already running.
......@@ -1617,13 +1629,14 @@ Clients are created in the following places:
soft-pll.
@item userspace/tools/wr_mon.c
@c FIXME: wr_mon should use shmem
The tty-based monitoring interface connects to @i{ptpd} (@i{ppsi})
to get run-time information.
userspace/tools/wr_management.c
@item userspace/tools/wr_management.c
To be verified. (FIXME)
To be removed as soon as possible, using wr_mon instead.
@item userspace/libwr/hal_client.c
......@@ -1641,8 +1654,6 @@ in two places, because @i{ppsi} is a separate package; the two
are identical and are expected to remain so. Same applies to @t{rt_ipc.h},
which appears both here and in @i{wrpc-sw}.
@c FIXME: check the headers...
@c ==========================================================================
@node The RT Subsystem
@section The RT Subsystem
......@@ -1650,7 +1661,78 @@ which appears both here and in @i{wrpc-sw}.
The in-FPGA processor running the real-time subsystem of the switch
is using a shared memory connection for communication. The details
are documented in the @t{mini-rpc} manual. Only the hal process
sends commands to the @t{rt} subsystem.
sends commands to the @sc{rt} subsystem.
@c ==========================================================================
@node WRS Shared Memory
@section WRS Shared Memory
The White Rabbit Switch has a shared memory system, with library
functions to access it. All status information is collected in a single
file, where each process has 32kB of storage for local structures.
The initial part of each process' area is a @t{struct wrs_shm_head}, which
makes it possible to make sense of the overall area. The structure
is filled by library functions and accessed by shared memory users.
You can see how it is used in @t{tools/wrs_dump_shmem.c}.
The following functions are defined. Please look in the source code
for details about how they are used:
@table @code
@item void *wrs_shm_get(enum wrs_shm_name name_id, char *name, unsigned long flags);
@itemx int wrs_shm_put(void *headptr);
Request access to a shared memory area, and stop using it.
Currently only @t{WRS_SHM_WRITE} and @t{WRS_SHM_READ} are defined as
flags. With @t{WRS_SHM_WRITE} the head is initialized, and the call
fails with @t{EBUSY} if a live process already owns the area.
On error @t{wrs_shm_get} returns NULL and sets @t{errno};
@t{wrs_shm_put} returns 0 on success.
@item void *wrs_shm_alloc(void *headptr, size_t size);
Allocate data space within the shared memory area. The returned
pointer can be used directly. Only the writer can allocate: the
function returns NULL if the caller does not own the area or if
no space is left. The function is used,
for example, in @t{wrsw_hal/hal_ports.c}.
@item void *wrs_shm_follow(void *headptr, void *ptr);
A reader can follow a pointer using this function. The writer
can allocate shared memory with @t{wrs_shm_alloc} and store the
pointer in the same shared memory area, thus instantiating
structures that point to other structures. But the reader processes
map the shared memory at a different address: this function
can be used to convert a pointer in the writer's address space
to a pointer in the reader's address space, or NULL on error.
Please see @t{tools/wrs_dump_shmem.c} about how this is used.
@item void wrs_shm_write(void *headptr, int begin);
Whenever internal consistency of data structure is needed, the
writer should call this function before modifying shared structures,
with @t{begin} set to 1. It should also call the function again
when all modifications are done and data is internally consistent,
this time with @t{begin} set to zero.
@item unsigned wrs_shm_seqbegin(void *headptr);
@itemx int wrs_shm_seqretry(void *headptr, unsigned start);
A reader can use these functions to ensure it reads
internally-consistent data from a shared structure. It relies
on proper use of @t{wrs_shm_write()} by the writer.
@item int wrs_shm_age(void *headptr);
The function returns the age, in seconds, of the last
modification to the memory area. It relies
on proper use of @t{wrs_shm_write()} by the writer.
@item void *wrs_shm_data(void *headptr, unsigned version);
Returns a pointer to data after the @t{struct wrs_shm_head}, or NULL
if @t{version} does not match the version stored in the head.
@end table
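The reader/writer protocol described for @t{wrs_shm_write()}, @t{wrs_shm_seqbegin()} and @t{wrs_shm_seqretry()} can be sketched with a small single-process model. This is only an illustration under stated assumptions: the @t{demo_} names and both structures are hypothetical stand-ins and not part of libwr, the model keeps only the sequence counter (the real writer also updates the stamp), and it ignores memory-ordering concerns that may matter between real processes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct wrs_shm_head: sequence counter only */
struct demo_head {
	unsigned sequence;
};

/* Hypothetical payload that a writer would publish in its area */
struct demo_status {
	int port;
	int link_up;
};

/* Writer: called before (begin = 1) and after (begin = 0) a modification.
 * The counter is odd while data is being changed, even when consistent. */
static void demo_write(struct demo_head *h, int begin)
{
	(void)begin;	/* the real function also sets the stamp at the end */
	h->sequence++;
}

static unsigned demo_seqbegin(const struct demo_head *h)
{
	return h->sequence;
}

static int demo_seqretry(const struct demo_head *h, unsigned start)
{
	if (start & 1)
		return 1;	/* a write was in progress: retry */
	return h->sequence != start;
}

/* Reader: copy the payload until a consistent snapshot is obtained.
 * Returns 0 on success, -1 if max_tries attempts were not enough. */
static int demo_read(const struct demo_head *h, const struct demo_status *src,
		     struct demo_status *dst, int max_tries)
{
	unsigned seq;

	assert(h && src && dst);
	while (max_tries-- > 0) {
		seq = demo_seqbegin(h);
		memcpy(dst, src, sizeof(*dst));
		if (!demo_seqretry(h, seq))
			return 0;
	}
	return -1;
}
```

Bounding the number of retries is a choice made for this sketch, so a reader cannot spin forever if a writer dies in the middle of an update; a real reader may prefer to loop, or to check @t{wrs_shm_age()} as well.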
@c ##########################################################################
@node Reboot/Reset Diagnostics
......
......@@ -5,7 +5,7 @@ CFLAGS = -Wall -I. -O2 -DDEBUG -ggdb -I./include -I../include \
OBJS = trace.o init.o fpga_io.o util.o pps_gen.o i2c.o shw_io.o i2c_bitbang.o \
i2c_fpga_reg.o pio.o libshw_i2c.o i2c_sfp.o fan.o i2c_io.o hwiu.o \
ptpd_netif.o hal_client.o
ptpd_netif.o hal_client.o shmem.o
LIB = libwr.a
......
/*
* This is the shared memory interface for multi-process cooperation
* within the whiterabbit switch. Everyone exports status information.
*/
#ifndef __WRS_SHM_H__
#define __WRS_SHM_H__
#include <stddef.h> /* size_t, used by wrs_shm_alloc() */
#include <stdint.h>
#define WRS_SHM_FILE "/dev/shm/wrs-shmem"
#define WRS_SHM_SIZE (32*1024) /* each */
/* Every process gets 8 pages (32k) to be safe for the future */
enum wrs_shm_name {
wrs_shm_ptp,
wrs_shm_rtu,
wrs_shm_hal,
wrs_shm_vlan,
WRS_SHM_N_NAMES, /* must be last */
};
/* Each area starts with this process identifier */
struct wrs_shm_head {
void *mapbase; /* In writer's addr space (to track ptrs) */
char name[7 * sizeof(void *)];
unsigned long stamp; /* Last modified, w/ CLOCK_MONOTONIC */
unsigned long data_off; /* Where the structure lives */
int shm_name; /* The enum above, for cross-checking */
int pid; /* The current pid owning the area */
unsigned pidsequence; /* Each new pid must increment this */
unsigned sequence; /* If we need consistency, this is it */
unsigned version; /* Version of the data structure */
unsigned data_size; /* Size of it (for binary dumps) */
};
/* flags */
#define WRS_SHM_READ 0x0000
#define WRS_SHM_WRITE 0x0001
/* get vs. put, like in the kernel. Errors are in errno (see source) */
void *wrs_shm_get(enum wrs_shm_name name_id, char *name, unsigned long flags);
int wrs_shm_put(void *headptr);
/* The writer can allocate structures that live in the area itself */
void *wrs_shm_alloc(void *headptr, size_t size);
/* The reader can track writer's pointers, if they are in the area */
void *wrs_shm_follow(void *headptr, void *ptr);
/* Before and after writing a chunk of data, act on sequence and stamp */
extern void wrs_shm_write(void *headptr, int begin);
/* A reader can rely on the sequence number (in the <linux/seqlock.h> way) */
extern unsigned wrs_shm_seqbegin(void *headptr);
extern int wrs_shm_seqretry(void *headptr, unsigned start);
/* A reader can check whether information is current enough */
extern int wrs_shm_age(void *headptr);
/* A reader can get the information pointer, for a specific version, or NULL */
extern void *wrs_shm_data(void *headptr, unsigned version);
#endif /* __WRS_SHM_H__ */
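The pointer translation declared above as wrs_shm_follow() is plain offset arithmetic: a pointer stored by the writer is rebased from the writer's mapping address to the reader's one. The following is a minimal sketch of that idea, not the library code. The demo_ names are hypothetical, and writer-side addresses are passed as integers (an assumption of this sketch) so that they are never treated as valid pointers in the reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_AREA_SIZE (32 * 1024)	/* same value as WRS_SHM_SIZE */

/*
 * Rebase an address from the writer's mapping to the reader's mapping.
 * reader_base: where this process mapped the area
 * writer_base: the "mapbase" value the writer stored in the head
 * writer_ptr:  an address the writer stored inside the area
 * Returns NULL if writer_ptr does not fall inside the writer's area.
 */
static void *demo_follow(void *reader_base, uintptr_t writer_base,
			 uintptr_t writer_ptr)
{
	assert(reader_base != NULL);
	if (writer_ptr < writer_base ||
	    writer_ptr >= writer_base + DEMO_AREA_SIZE)
		return NULL;	/* not in the area */
	return (char *)reader_base + (writer_ptr - writer_base);
}
```

The upper bound is exclusive in this sketch, because an address exactly one area past the base already lies outside the mapping.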
/* Alessandro Rubini for CERN 2014, LGPL-2.1 or later */
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <libwr/shmem.h>
/* Get wrs shared memory */
/* return NULL and set errno on error */
void *wrs_shm_get(enum wrs_shm_name name_id, char *name, unsigned long flags)
{
struct wrs_shm_head *head;
struct stat stbuf;
void *map;
int write_access = flags & WRS_SHM_WRITE;
int fd;
if (name_id >= WRS_SHM_N_NAMES) {
errno = EINVAL;
return NULL;
}
fd = open(WRS_SHM_FILE, O_RDWR | O_CREAT | O_SYNC, 0644);
if (fd < 0)
return NULL; /* keep errno */
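/* The file may be too short: enlarge it as needed */
if (fstat(fd, &stbuf) < 0) {
int saved_errno = errno;
close(fd); /* don't leak the descriptor */
errno = saved_errno;
return NULL;
}
if (stbuf.st_size < WRS_SHM_SIZE * (name_id + 1)) {
lseek(fd, WRS_SHM_SIZE * (name_id + 1) - 1, SEEK_SET);
write(fd, "", 1);
}
map = mmap(0, WRS_SHM_SIZE,
PROT_READ | (write_access ? PROT_WRITE : 0),
MAP_SHARED, fd, WRS_SHM_SIZE * name_id);
{
/* The mapping outlives the descriptor: close it, keeping errno */
int saved_errno = errno;
close(fd);
errno = saved_errno;
}
if (map == MAP_FAILED)
return NULL; /* keep errno */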
if (!write_access)
return map;
/* Init the fields */
head = map;
if (head->pid && kill(head->pid, 0) == 0) {
munmap(map, WRS_SHM_SIZE);
errno = EBUSY;
return NULL;
}
head->sequence = 1; /* a sort of lock */
head->mapbase = head;
strncpy(head->name, name, sizeof(head->name));
head->name[sizeof(head->name) - 1] = '\0';
head->stamp = 0;
head->data_off = sizeof(*head);
head->data_size = 0;
head->shm_name = name_id;
head->pid = getpid();
head->pidsequence++;
/* version and size are up to the user (or to allocation) */
head->sequence = 0; /* a sort of unlock */
return map;
}
/* Put wrs shared memory */
/* return 0 on success, !0 on error */
int wrs_shm_put(void *headptr)
{
struct wrs_shm_head *head = headptr;
int err;
if (head->pid == getpid())
head->pid = 0; /* mark that we are not writers any more */
if ((err = munmap(headptr, WRS_SHM_SIZE)) < 0)
return err;
return 0;
}
/* The writer can allocate structures that live in the area itself */
void *wrs_shm_alloc(void *headptr, size_t size)
{
struct wrs_shm_head *head = headptr;
void *nextptr;
if (head->pid != getpid())
return NULL; /* we are not writers */
if (head->data_off + head->data_size + size > WRS_SHM_SIZE)
return NULL; /* no space left */
nextptr = headptr + head->data_off + head->data_size;
head->data_size += (size + 7) & ~7; /* force 8-alignment */
return nextptr;
}
/* The reader can track writer's pointers, if they are in the area */
void *wrs_shm_follow(void *headptr, void *ptr)
{
struct wrs_shm_head *head = headptr;
if (ptr < head->mapbase || ptr >= head->mapbase + WRS_SHM_SIZE)
return NULL; /* not in the area */
return headptr + (ptr - head->mapbase);
}
/* Before and after writing a chunk of data, act on sequence and stamp */
void wrs_shm_write(void *headptr, int begin)
{
struct wrs_shm_head *head = headptr;
struct timespec tv;
if (!begin) {
/* At end-of-writing update the timestamp too */
clock_gettime(CLOCK_MONOTONIC, &tv);
head->stamp = tv.tv_sec;
}
head->sequence++;
return;
}
/* A reader can rely on the sequence number (in the <linux/seqlock.h> way) */
unsigned wrs_shm_seqbegin(void *headptr)
{
struct wrs_shm_head *head = headptr;
return head->sequence;
}
int wrs_shm_seqretry(void *headptr, unsigned start)
{
struct wrs_shm_head *head = headptr;
if (start & 1)
return 1; /* it was odd: retry */
return head->sequence != start;
}
/* A reader can check whether information is current enough */
int wrs_shm_age(void *headptr)
{
struct wrs_shm_head *head = headptr;
struct timespec tv;
clock_gettime(CLOCK_MONOTONIC, &tv);
return tv.tv_sec - head->stamp;
}
/* A reader can get the information pointer, for a specific version, or NULL */
void *wrs_shm_data(void *headptr, unsigned version)
{
struct wrs_shm_head *head = headptr;
if (head->version != version)
return NULL;
return headptr + head->data_off;
}
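The allocation scheme used by wrs_shm_alloc() above is a bump allocator: each request is served at the current end of the used data, and the used size grows by the request rounded up to 8 bytes. The sketch below models only that arithmetic. It is an illustration, not the library code: the demo_ names and the reduced structure are hypothetical, it returns offsets instead of pointers, and it omits the ownership check on the pid.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_AREA_SIZE (32 * 1024)	/* same value as WRS_SHM_SIZE */

/* Hypothetical reduced header: only the two fields the allocator uses */
struct demo_area {
	size_t data_off;	/* where user data starts in the area */
	size_t data_size;	/* bytes handed out so far, 8-aligned */
};

/* Return the offset of the new block inside the area, or -1 if full */
static long demo_alloc(struct demo_area *a, size_t size)
{
	size_t start;

	assert(a != NULL);
	start = a->data_off + a->data_size;
	if (start > DEMO_AREA_SIZE || size > DEMO_AREA_SIZE - start)
		return -1;	/* no space left */
	a->data_size += (size + 7) & ~(size_t)7;	/* round up to 8 bytes */
	return (long)start;
}
```

For example, assuming a 64-byte header, a 13-byte request is placed at offset 64 and consumes 16 bytes, so the next block starts at offset 80. There is no way to free a block in this scheme; space is reclaimed only when a new writer takes over the area and resets the size.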
#!/bin/sh
test -d /dev/shm && rmdir /dev/shm
mkdir -m 1777 /dev/shm
mount -t tmpfs -o nosuid,nodev,size=4096k none /dev/shm
\ No newline at end of file
......@@ -268,7 +268,6 @@ int hal_init_ports()
int index = 0, i;
char port_name[128];
TRACE(TRACE_INFO, "Initializing switch ports...");
/* default timeouts */
......