... | ... | @@ -56,7 +56,7 @@ the write over Wishbone. Finally, the WB interconnect routes the write |
|
|
to the correct
|
|
|
slave.
|
|
|
|
|
|
![](/uploads/4c176d2f9bc9c9639617b7c38411a8cf/software.jpg)
|
|
|
![](/uploads/b301407fc043b1ab537e8763bc6022cd/system.jpg)
|
|
|
|
|
|
Either or both of the devices may be replaced by software, as shown in
|
|
|
the figure below. In this scenario, the operating system buffers and
|
... | ... | @@ -70,51 +70,162 @@ may be mapped into the Wishbone bus of other Etherbone nodes, hardware |
|
|
or
|
|
|
software.
|
|
|
|
|
|
![](/uploads/b301407fc043b1ab537e8763bc6022cd/system.jpg)
|
|
|
![](/uploads/4c176d2f9bc9c9639617b7c38411a8cf/software.jpg)
|
|
|
|
|
|
# Introduction
|
|
|
|
|
|
Etherbone is an FPGA-core that connects Ethernet to internal on-chip
|
|
|
Wishbone buses, permitting any core to talk to any other across Ethernet.
|
|
|
A software library is provided that permits any computer with an
|
|
|
Ethernet card to easily communicate with remote cores on the Etherbone
|
|
|
network. The Etherbone core implements a Wishbone master and a Wishbone
|
|
|
slave bus controller. With this scheme, any number of Etherbone
|
|
|
FPGA-cores and application software tasks can be connected together to
|
|
|
implement hybrid distributed networks of arbitrary complexity such as
|
|
|
field-buses, timing systems, or testbeds for hardware debugging.
|
|
|
Etherbone provides basic read, write and addressing functions. Etherbone
|
|
|
data transfers are initiated either by FPGA cores connected to the
|
|
|
Etherbone Wishbone buses or by application software via the library. It
|
|
|
is within these Etherbone Accessible Devices (EAD) that specific
|
|
|
cores may implement other levels of abstraction on top of Etherbone as
|
|
|
required. More information is available in the [functional specification](/project/etherbone-core/wikis/Documents/Etherbone-core-functional-specifications).
|
|
|
Ebinternals.png
|
|
|
|
|
|
# Work so far
|
|
|
|
|
|
<table>
|
|
|
<tbody>
|
|
|
<tr class="odd">
|
|
|
<td><strong>Date</strong></td>
|
|
|
<td><strong>Event</strong></td>
|
|
|
</tr>
|
|
|
<tr class="even">
|
|
|
<td>19-08-2010</td>
|
|
|
<td>Project start.</td>
|
|
|
</tr>
|
|
|
<tr class="odd">
|
|
|
<td>11-10-2010</td>
|
|
|
<td>First functional spec draft released for comments.</td>
|
|
|
</tr>
|
|
|
</tbody>
|
|
|
</table>
|
|
|
|
|
|
# Outlook
|
|
|
|
|
|
The target date for a first implementation is March 2011. The functional
|
|
|
spec should be approved at the end of October 2010 and the technical
|
|
|
spec should be ready by December 2010.
|
|
|
## Addressing
|
|
|
|
|
|
Slaves on a Wishbone bus have a mapped address range. Masters read and
|
|
|
write to an address on the local bus and the Intercon routes the
|
|
|
operation to the matching slave device. However, with the introduction
|
|
|
of Etherbone, there are now multiple reachable Wishbone buses in the
|
|
|
facility-wide system. To select the destination slave, additional
|
|
|
address information is required.
|
|
|
|
|
|
When using the software interface, an application acquires a handle
|
|
|
object for the remote bus. Reads and writes are then performed via the
|
|
|
handle object, requiring only the WB bus address per operation. To
|
|
|
acquire the handle object, the application supplies the hostname and
|
|
|
port of the remote WB bus.
|
|
|
|
|
|
For a hardware implementation, the requests come from a local WB master,
|
|
|
which can only provide a WB bus address. To determine the missing
|
|
|
address information, an EB bridge must infer the destination WB bus
|
|
|
based only on the local WB address requested. To achieve this, the EB
|
|
|
bridge establishes a configurable mapping from local WB addresses to
|
|
|
destination hostname:ports and target WB addresses.
|
|
|
|
|
|
For example, consider an EB bridge occupying address range 0x1000-0x3000
|
|
|
on the local bus. There is a remote WB bus available on the host
|
|
|
example.com:3434. We would like to access the address range 0x100-0x200
|
|
|
on that bus. Thus, we configure our EB bridge to map this range as
|
|
|
0x2000-0x2100 on the local bus. Now, when a WB write on our local bus to
|
|
|
address 0x2050 is performed, the bridge transforms this into an EB write
|
|
|
destined for example.com:3434 at address 0x150.
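The mapping in this example can be sketched in a few lines of Python. This is only an illustration of the address arithmetic, not the bridge's actual configuration interface: the `Mapping` class, its field names, and the choice to treat the upper bound of the window as exclusive are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Mapping:
    """One hypothetical EB bridge mapping entry (names are illustrative)."""
    local_base: int   # first local WB address of the mapped window
    size: int         # window length; the upper bound is treated as exclusive
    host: str         # destination hostname
    port: int         # destination port
    remote_base: int  # first WB address on the remote bus

    def translate(self, local_addr: int) -> Optional[Tuple[str, int, int]]:
        """Return (host, port, remote WB address), or None if outside the window."""
        offset = local_addr - self.local_base
        if 0 <= offset < self.size:
            return (self.host, self.port, self.remote_base + offset)
        return None


# Remote range 0x100-0x200 on example.com:3434 appears at 0x2000-0x2100 locally.
window = Mapping(local_base=0x2000, size=0x100,
                 host="example.com", port=3434, remote_base=0x100)

target = window.translate(0x2050)  # ("example.com", 3434, 0x150)
```

A local address outside the window, such as 0x1000, yields `None` in this sketch; what a real bridge does with unmapped addresses is not specified here.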
|
|
|
|
|
|
## Pipelining
|
|
|
|
|
|
Unlike a local WB bus, where devices answer in a few clock cycles, a
|
|
|
remote bus accessed via EB has a much higher latency. For a 100 MHz bus and
|
|
|
a distance of only 20 km, the difference is 10 ns versus 100 µs. For
|
|
|
Internet-scale distances, the latency can easily rise to 100ms.
|
|
|
Therefore, an application which only issues a new read/write operation
|
|
|
when the previous operation completes will perform 10^4 to 10^7 times
|
|
|
slower over EB than direct WB.
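The slowdown factors quoted above follow directly from the latency figures. The short calculation below reproduces them; the constant names are made up for the illustration, and integer nanoseconds are used to keep the arithmetic exact.

```python
# Latency figures from the text, in nanoseconds.
LOCAL_WB_NS = 10             # one clock cycle of a 100 MHz bus
FIBRE_20KM_NS = 100_000      # about 100 us over 20 km
INTERNET_NS = 100_000_000    # about 100 ms at Internet scale


def slowdown(remote_ns: int, local_ns: int = LOCAL_WB_NS) -> int:
    """Factor by which a strictly one-at-a-time access pattern slows down."""
    return remote_ns // local_ns


short_haul = slowdown(FIBRE_20KM_NS)  # 10**4
long_haul = slowdown(INTERNET_NS)     # 10**7
```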
|
|
|
|
|
|
EB supports pipelining to overcome this significant performance
|
|
|
bottleneck. Instead of issuing a single operation at a time, an
|
|
|
application/device can issue new operations without waiting for the
|
|
|
previous operation to complete. The results of the operations will
|
|
|
arrive in the same order they were issued. Whenever new operations do
|
|
|
not depend on still-incomplete operations, this can almost entirely
|
|
|
mask the performance lost to remote access.
|
|
|
|
|
|
As an example, consider two applications using EB. The first
|
|
|
application is a firmware writing tool that needs to write the
|
|
|
firmware and confirm the firmware was written correctly. This problem
|
|
|
can be readily pipelined; the operations have an order requirement
|
|
|
(confirmation happens after write), but the choice of operation to issue
|
|
|
does not depend on previous results. The firmware writer can issue a
|
|
|
sequence of WWWW...RRRR... operations in the pipeline without waiting.
|
|
|
Alternatively, it might also use the sequence WRWRWR... to confirm each
|
|
|
word immediately after writing it. In both cases, the application can
|
|
|
issue all of the operations without waiting. This would not be possible
|
|
|
if the application were to iterate a remote function.
|
|
|
Suppose the second application wants to compute f(f(f(...f(x)...))) using a
|
|
|
remote WB slave to calculate function f.
|
|
|
Here, the application writes x to the remote slave and reads back y =
|
|
|
f(x). Then the application writes y to the remote slave and reads back z
|
|
|
= f(y). The access pattern WRWRWR... is the same as the firmware writer's,
|
|
|
but here we can only pipeline a single write-read operation pair
|
|
|
together. Until we have received the result f(x), we cannot issue f(y).
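The difference between the two applications can be made concrete with a deliberately simple timing model. It assumes a fixed round-trip latency, a fixed cost to issue one operation, and no bandwidth limit or packet loss; the function names and the numbers chosen (100 µs round trip, 10 ns per issue, 1000 operations) are illustrative assumptions, not measurements.

```python
def serial_time_ns(n_ops: int, round_trip_ns: int, issue_ns: int) -> int:
    """Each operation is issued only after the previous one has completed."""
    return n_ops * (issue_ns + round_trip_ns)


def pipelined_time_ns(n_ops: int, round_trip_ns: int, issue_ns: int) -> int:
    """Operations are issued back to back; only the last reply is waited for."""
    return n_ops * issue_ns + round_trip_ns


ROUND_TRIP_NS = 100_000  # assumed network round trip (100 us)
ISSUE_NS = 10            # assumed cost to issue one operation
N_OPS = 1000             # 500 writes plus 500 reads

# Firmware writer: no operation depends on an earlier result,
# so all 1000 operations share a single round trip.
firmware_writer_ns = pipelined_time_ns(N_OPS, ROUND_TRIP_NS, ISSUE_NS)

# Iterated function: each write-read pair can be pipelined internally,
# but the next pair must wait for the previous read result.
iterated_ns = (N_OPS // 2) * pipelined_time_ns(2, ROUND_TRIP_NS, ISSUE_NS)

# Naive application that waits after every single operation.
naive_ns = serial_time_ns(N_OPS, ROUND_TRIP_NS, ISSUE_NS)
```

Under these assumptions the firmware writer finishes in about 0.11 ms, the iterated function needs about 50 ms, and the fully serial pattern about 100 ms.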
|
|
|
|
|
|
In Wishbone, several operations can be grouped into a single bus cycle.
|
|
|
A particularly bad situation that can occur is when dependencies appear
|
|
|
within a WB cycle. Generally, WB cycles acquire the device for use until
|
|
|
cycle completion. On a local bus, any access pattern will work, as the
|
|
|
operations will complete quickly and release the cycle line. However,
|
|
|
when this happens with EB-sized latencies, a cycle might tie up a device
|
|
|
for a potentially unacceptable duration. Consider for example a WB cycle
|
|
|
that reads from one address and writes the result to another address.
|
|
|
Locally, there is no problem; the entire WB cycle executes in a few
|
|
|
nanoseconds. However, when that same access pattern runs over the
|
|
|
network, the slave device needs to wait for the read result to travel to
|
|
|
the master and the final write to travel back. This is a very bad design.
|
|
|
|
|
|
Dealing with data dependencies within a cycle is a major complication
|
|
|
addressed in the different implementation options described later.
|
|
|
|
|
|
## Config Space
|
|
|
|
|
|
In addition to remote bus access,
|
|
|
Etherbone also provides a configuration space.
|
|
|
This config space is used to specify transmission parameters,
|
|
|
recover bus error status codes,
|
|
|
and match read results to the requests.
|
|
|
|
|
|
The config space is a complementary 16-bit wide address space attached
|
|
|
to every EB slave.
|
|
|
EB requests can read/write to this configuration space in addition to
|
|
|
the normal WB bus.
|
|
|
The config space is divided up into two regions: the register space and
|
|
|
the implementation space.
|
|
|
All addresses in the register space correspond to EB control
|
|
|
registers,
|
|
|
specified in this document.
|
|
|
The register space spans addresses 0x0-0x7FFF.
|
|
|
The implementation space is guaranteed to be free for whatever use
|
|
|
a hardware/software implementation chooses.
|
|
|
The implementation space spans 0x8000-0xFFFF.
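The split of the 16-bit config space can be expressed as a small lookup. The sketch below only encodes the two address ranges given above; the function name and the string labels are chosen for the illustration.

```python
REGISTER_SPACE = range(0x0000, 0x8000)         # EB control registers
IMPLEMENTATION_SPACE = range(0x8000, 0x10000)  # free for the implementation


def config_region(addr: int) -> str:
    """Classify a 16-bit config-space address into one of the two regions."""
    if addr in REGISTER_SPACE:
        return "register"
    if addr in IMPLEMENTATION_SPACE:
        return "implementation"
    raise ValueError(f"address {addr:#x} is outside the 16-bit config space")


regions = [config_region(a) for a in (0x0000, 0x7FFF, 0x8000, 0xFFFF)]
```

Because the boundary sits at 0x8000, the top address bit alone distinguishes the two regions.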
|
|
|
|
|
|
Two important registers in the register space are the error status
|
|
|
register, which reports WB error status codes, and the WB device map
|
|
|
pointer,
|
|
|
which provides information about the slaves attached to a remote bus.
|
|
|
The implementation space is typically used by an EB master to receive
|
|
|
the data which it read.
|
|
|
Reads to an EB device trigger a write back to the source EB device.
|
|
|
Those writebacks are often sent to the implementation space
|
|
|
where they can be handled by the EB core/code and remain invisible to the WB
|
|
|
bus.
|
|
|
|
|
|
## Bus Widths
|
|
|
|
|
|
In Wishbone,
|
|
|
a bus may have a port width of 8, 16, 32, or 64 bits.
|
|
|
Thus, a master in one WB bus might write 32 bits at a time,
|
|
|
while a slave in another WB bus expects 16 bits at a time.
|
|
|
Etherbone makes no attempt to convert between differing port widths,
|
|
|
because converting a 32-bit write into two 16-bit writes might change
|
|
|
semantics.
|
|
|
|
|
|
However,
|
|
|
Etherbone does negotiate which port widths are acceptable to both
|
|
|
devices.
|
|
|
This mostly affects software,
|
|
|
which can meaningfully support access with different port widths.
|
|
|
Hardware implementations will typically advertise and accept only one
|
|
|
width.
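Conceptually, the negotiation amounts to intersecting the widths each side is willing to use. The sketch below models that with Python sets; it is an assumption made for illustration and says nothing about how the widths are encoded on the wire.

```python
from typing import FrozenSet, Iterable

WB_PORT_WIDTHS = frozenset({8, 16, 32, 64})  # widths Wishbone allows


def negotiate_port_widths(ours: Iterable[int],
                          theirs: Iterable[int]) -> FrozenSet[int]:
    """Return the port widths acceptable to both devices (may be empty)."""
    ours, theirs = frozenset(ours), frozenset(theirs)
    invalid = (ours | theirs) - WB_PORT_WIDTHS
    if invalid:
        raise ValueError(f"not a Wishbone port width: {sorted(invalid)}")
    return ours & theirs


# Software can usually support every width; hardware typically offers one.
software = {8, 16, 32, 64}
hardware = {32}
common = negotiate_port_widths(software, hardware)  # frozenset({32})

# Two single-width hardware devices with different widths share nothing.
mismatch = negotiate_port_widths({16}, {32})        # frozenset()
```

An empty result reflects the point made above: Etherbone does not convert between port widths, so two devices with no width in common cannot exchange data.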
|
|
|
|
|
|
Address spaces in WB are conceptually infinite,
|
|
|
but in practice are constrained to a fixed width.
|
|
|
Address width conversion, as opposed to port width conversion,
|
|
|
is relatively straightforward.
|
|
|
Address 0x0400 is the same as 0x00000400.
|
|
|
If a 32-bit device is accessed by a 16-bit device,
|
|
|
the 16-bit device can only see the low 16 bits of the larger device's
|
|
|
address space.
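Both statements reduce to zero extension and a range check. The helper names below are hypothetical and exist only to illustrate the rule.

```python
def zero_extend(addr: int, from_bits: int, to_bits: int) -> int:
    """Widen an address; the numeric value is unchanged, only the width grows."""
    if to_bits < from_bits:
        raise ValueError("zero_extend only widens addresses")
    if not 0 <= addr < (1 << from_bits):
        raise ValueError(f"{addr:#x} does not fit in {from_bits} bits")
    return addr


def reachable(addr: int, master_bits: int) -> bool:
    """True if a master with master_bits address lines can name this address."""
    return 0 <= addr < (1 << master_bits)


same = zero_extend(0x0400, 16, 32) == 0x00000400  # True
low = reachable(0x0000FFFF, 16)                   # True: inside the low 16 bits
high = reachable(0x00010000, 16)                  # False: beyond a 16-bit master
```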
|
|
|
|
|
|
Address width is negotiated by Etherbone simply to determine the amount
|
|
|
of
|
|
|
space to reserve for message exchanges.
|
|
|
A hardware implementation is free to only advertise address widths
|
|
|
whose message alignment is convenient for it.
|
|
|
|
|
|
# Work remaining
|
|
|
|
|
|
- Integrate hardware Etherbone core with White Rabbit spec nodes
|
|
|
- Finish port of Etherbone library to LM32
|
|
|
- Finish SSH-UDP Etherbone gateway
|
|
|
|
|
|
|
|
|
|
... | ... | |