Home
This collection of tools is intended for simulating SEU errors in Xilinx FPGAs. It is a fully automated system that randomly places errors into the contents of LUTs. It uses real hardware (for now, Spartan3 is supported and tested) and partial reconfiguration techniques.
The tools are divided into 3 parts:
- Fault Injection Script
- Test Control Script
- Test Harness
The fault injection script is written in Perl. The following libraries are required. All of them can be installed through CPAN:
Expect
threads
IO::Socket::INET
Getopt::Long
Pod::Usage
The Expect library is used for communication between the script and the Xilinx tools. The script relies on the following tools:
- fpga_edline – Xilinx ISE tool. This is used to manipulate netlists. This tool can read and make changes in .ncd proprietary netlists
- bitgen – Xilinx ISE tool. This tool is used to create partial bitstreams.
- xc3sprog – Custom tool. Will be explained later.
Because bitgen takes time to process a bitstream, programming the FPGA and preparing the next bitstream run in parallel.
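The overlap of the two steps can be pictured as a small producer/consumer pipeline. The following is only a sketch in Python (the real script is Perl and uses threads); the function names and the file names `fault_N.bit` / `fix_N.bit` are hypothetical placeholders, and no Xilinx tool is actually called.

```python
import queue
import threading

def generate_bitstreams(rounds, out_q):
    """Producer: stands in for the slow bitgen step."""
    for i in range(rounds):
        # Hypothetical names for the faulty and the correcting partial bitstream.
        out_q.put((f"fault_{i}.bit", f"fix_{i}.bit"))
    out_q.put(None)  # sentinel: no more work

def program_fpga(in_q, log):
    """Consumer: stands in for programming the FPGA and waiting for the test."""
    while True:
        item = in_q.get()
        if item is None:
            break
        faulty, fix = item
        log.append(("program", faulty))  # inject the fault, then the test runs
        log.append(("program", fix))     # restore the clean design

def run_pipeline(rounds):
    """Run both steps concurrently and return the programming log."""
    q = queue.Queue(maxsize=1)  # at most one prepared pair waits ahead
    log = []
    producer = threading.Thread(target=generate_bitstreams, args=(rounds, q))
    consumer = threading.Thread(target=program_fpga, args=(q, log))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return log
```

The bounded queue means the generator prepares the next pair while the current one is being programmed, but never runs far ahead of the hardware.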
The script has a configuration part at the beginning of the file. It contains the paths to the different tools used in the process. You must set them correctly.
The script also takes command line arguments as follows:
--ncd <file.ncd> = NCD file from Xilinx tools
--bit <file.bit> = Initial bitstream
--module <module_name> = Name of module (instance) into which inject faults
[--faultcount <num>] = How many faults to inject simultaneously
--help = This help
The script performs the following operations:
- Loads .ncd netlist
- Parses the netlist and finds all relevant LUTs using module_name. It uses a wildcard search, so if you use `mod` as the module name, instances such as `mod1` or `modX` are also considered for fault injection. If you don’t want this and want an exact match instead, place a dot at the end, as in `mod.`
- Programs the initial bitstream into the FPGA
- Starts two processes, which do the following in parallel
  - Generates a partial bitstream containing the fault and a partial bitstream that corrects this fault
  - Programs the faulty bitstream into the FPGA, waits for the test to complete and then programs the correction bitstream. At the end of each programming round, the FPGA is back in its initial state, so the next round always starts from a clean design.
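The module name matching described in the list above could look roughly like this. It is a sketch in Python that only mirrors the behaviour stated here (prefix match by default, exact match with a trailing dot); the function name is hypothetical, instance names are treated as flat strings, and the actual Perl implementation may differ.

```python
def matches_module(instance_name, module_name):
    """Return True if the instance should be considered for fault injection."""
    if module_name.endswith("."):
        # Trailing dot: exact match against the name without the dot.
        return instance_name == module_name[:-1]
    # Default: wildcard-style match on the beginning of the name.
    return instance_name.startswith(module_name)
```

For example, `matches_module("mod1", "mod")` is true, while `matches_module("mod1", "mod.")` is false.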
SEU simulation is done by randomly flipping bits in LUTs. The number of faults generated at once can be set on the command line. This is very useful for testing SEU mitigation techniques. For example, if you have a technique that can withstand one fault and you choose to generate 2 or more, you should see your system fail. If not, something is really wrong. If more than one fault is set, the faults are randomly distributed across the design. More than one fault can appear in a single LUT.
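As an illustration of the idea, here is a minimal Python sketch of flipping random LUT bits and undoing the flips afterwards. It assumes each LUT is a 4-input LUT whose truth table is held as a 16-bit integer; the dictionary layout and the function names are hypothetical and are not taken from the script.

```python
import random

LUT_BITS = 16  # a 4-input LUT has a 16-entry truth table

def inject_faults(lut_contents, fault_count, rng):
    """Flip fault_count randomly chosen bits.

    Returns the faulty LUT contents and the list of (lut_name, bit) faults.
    The same LUT may be picked more than once.
    """
    faulty = dict(lut_contents)
    names = sorted(faulty)
    faults = []
    for _ in range(fault_count):
        name = rng.choice(names)
        bit = rng.randrange(LUT_BITS)
        faulty[name] ^= 1 << bit  # flip one truth-table bit
        faults.append((name, bit))
    return faulty, faults

def correct_faults(faulty, faults):
    """Undo the recorded flips, restoring the original contents."""
    fixed = dict(faulty)
    for name, bit in faults:
        fixed[name] ^= 1 << bit
    return fixed
```

Flipping the recorded bits a second time restores the original contents, which corresponds to the correction bitstream that returns the FPGA to its clean state.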
The fault injection script doesn’t handle any tests. It only injects faults. This lets you use the script without modification on a wide variety of designs. The tests are controlled by an external program. Communication between this script and the so-called Test Control Script uses the UDP protocol. The port can be set in the configuration part at the beginning of the injection script source. The main tasks are as follows:
- Wait for the Test Control Script to connect – the fault injection script is always run first, then the test control script
- After each programming of a faulty bitstream, transmit the list of injected faults to test control. It is up to the test control script what to do with this information. You can use it in test procedures, log it to a file, or simply throw it away.
- Tell test control that the FPGA is ready for tests (programming completed) and wait until test control replies that the tests are completed and the whole operation can continue
The communication protocol uses ASCII messages with a few keywords. The keywords and their meanings are:
from fault injection to test control
OK - sent after the testbench driver has connected successfully
FSET - after this command, the list of generated faults follows
each fault is a separate packet
the list is terminated with the “###” escape sequence
FGEN - means that the target device is programmed and ready for test
from test control to fault injection
CONNECT - sent immediately after connecting to the fault generator
CONTINUE - sent as a reply to FGEN after all tests are finished
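The message exchange above can be sketched as follows. This is a Python illustration of a single round on localhost, not the project's Perl code: the function names, the 5-second timeout and the fault descriptions `fault-a` / `fault-b` are placeholders, since the exact text of a fault packet is not specified here. In the real flow the CONNECT/OK handshake happens once and the FSET ... FGEN ... CONTINUE part repeats for every round.

```python
import socket
import threading

TERMINATOR = b"###"  # ends the fault list

def expect(sock, keyword):
    """Receive one datagram and check that it carries the expected keyword."""
    msg, peer = sock.recvfrom(1024)
    if msg != keyword:
        raise RuntimeError(f"expected {keyword!r}, got {msg!r}")
    return peer

def fault_injector_round(sock, faults):
    """Fault-injection side: handshake, fault list, then wait for CONTINUE."""
    peer = expect(sock, b"CONNECT")
    sock.sendto(b"OK", peer)
    sock.sendto(b"FSET", peer)
    for fault in faults:                  # each fault is a separate packet
        sock.sendto(fault.encode("ascii"), peer)
    sock.sendto(TERMINATOR, peer)
    sock.sendto(b"FGEN", peer)            # FPGA programmed, ready for test
    expect(sock, b"CONTINUE")

def control_round(port):
    """Test-control side: connect, collect the fault list, run tests, reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(5)
    server = ("127.0.0.1", port)
    try:
        sock.sendto(b"CONNECT", server)
        expect(sock, b"OK")
        expect(sock, b"FSET")
        faults = []
        while True:
            msg, _ = sock.recvfrom(1024)
            if msg == TERMINATOR:
                break
            faults.append(msg.decode("ascii"))
        expect(sock, b"FGEN")
        # The design-specific tests would run here.
        sock.sendto(b"CONTINUE", server)
        return faults
    finally:
        sock.close()

def demo():
    """Run both sides on localhost with placeholder fault descriptions."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))         # port 0: let the OS pick a free port
    server.settimeout(5)
    port = server.getsockname()[1]
    injector = threading.Thread(
        target=fault_injector_round, args=(server, ["fault-a", "fault-b"]))
    injector.start()
    faults = control_round(port)
    injector.join()
    server.close()
    return faults
```

Because UDP gives no delivery guarantee, a real test control script would probably want timeouts like the ones used here so that a lost packet does not block the run forever.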
The main problem is that `impact` can’t program a bitstream without first erasing the old one. Because of this, it can’t be used to program partial bitstreams.
I used `xc3sprog` from http://sourceforge.net/projects/xc3sprog/ and patched it so that it doesn’t reset the FPGA before programming the bitstream. The patch is simple:
--- xc3sprog-r216/progalgxc3s.cpp 2009-07-02 11:05:13.000000000 +0200
+++ xc3sprog-r216-patched/progalgxc3s.cpp 2009-11-12 15:49:56.000000000 +0100
@@ -132,10 +131,10 @@
}
void ProgAlgXC3S::array_program(BitFile &file)
{
- flow_enable();
+// flow_enable();
/* JPROGAM: Trigerr reconfiguration, not explained in ug332, but
DS099 Figure 28: Boundary-Scan Configuration Flow Diagram (p.49) */
- jtag->shiftIR(&JPROGRAM);
+// jtag->shiftIR(&JPROGRAM);
switch(family)
{
@@ -159,5 +158,5 @@
/* use leagcy, if large transfers are faster then chunks */
flow_program_legacy(file);
/*flow_array_program(file);*/
- flow_disable();
+// flow_disable();
}
An already patched version is included in the distribution package.
With this patch, xc3sprog stops the operation of the FPGA (disconnects the clock signal and forces all outputs to high impedance), but the contents of the configuration memory, BRAMs and D flip-flops are left intact. It then programs the partial bitstream and starts the FPGA again. This procedure suits Spartan3, where a partial bitstream always reprograms a full column by shifting the new configuration in, which generates lots of glitches and corrupts flip-flop and BRAM contents if the FPGA is not stopped before programming. The Virtex family doesn’t behave like that; all unchanged logic is kept completely intact, so no glitches or corruption appear even if the FPGA is running.
This process was tested with Xilinx Parallel JTAG cable.
The Test Control Script is the design-dependent part. It can control the test itself or just trigger a BIST and then collect the results.
The Test Harness is located in the FPGA and can be anything from a simple register reader/writer to a full-blown BIST.
The example was developed and tested under Linux (Gentoo). All instructions are for Linux. This doesn’t mean it won’t run under Windows, but that wasn’t tested.
All examples are designed for Nexys board from Digilent with Spartan3 and RS232 PMOD.
Perl libraries: Expect, threads, IO::Socket, IO::Socket::INET, Getopt::Long, Pod::Usage, Device::SerialPort
To install these libraries, either use your distribution package manager or use CPAN.
The simplest way is to use CPAN. To install through CPAN, issue the following command:
pavel@pigster-pc ~ $ sudo cpan install Expect threads IO::Socket IO::Socket::INET Getopt::Long Pod::Usage Device::SerialPort
ISE WebPack – get it from http://www.xilinx.com/webpack
GIT – get it through your package manager or from http://git-scm.com/
ISE Make Based build system – more information on installation and configuration here http://github.com/pavels/ise-make-system
- Get the sources from GIT:
pavel@pigster-pc ~ $ mkdir Xilinx-SEU-Simulator
pavel@pigster-pc ~ $ cd Xilinx-SEU-Simulator
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ git init
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ git pull git://github.com/pavels/Xilinx-SEU-Simulator.git
- Build HW examples
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ cd reconf_demo_mult
pavel@pigster-pc ~/Xilinx-SEU-Simulator/reconf_demo_mult $ make
pavel@pigster-pc ~/Xilinx-SEU-Simulator/reconf_demo_mult $ cd ..
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ cd reconf_demo_mult_tmr
pavel@pigster-pc ~/Xilinx-SEU-Simulator/reconf_demo_mult_tmr $ make
pavel@pigster-pc ~/Xilinx-SEU-Simulator/reconf_demo_mult_tmr $ cd ..
If you have your ISE tools in $PATH, there is no need to configure anything in the generator. Otherwise, check the top of fault-generator.pl
Check the serial port settings for the testbench scripts (testbench.pl in reconf_demo_mult and reconf_demo_mult_tmr)
Find this line:
$port = Device::SerialPort->new("/dev/ttyS0");
and correct the path to the serial port device to fit your system.
In the fault-generator directory you can find a precompiled xc3sprog binary. Check whether it works for you.
With the Xilinx Parallel Cable connected to the board and power switched ON, try running this command:
pavel@pigster-pc ~/Xilinx-SEU-Simulator/fault-generator $ ./xc3sprog -j
It should detect your FPGA. If it fails, you may try to recompile xc3sprog from source. If it’s OK, skip this step and go straight to the next step.
To recompile xc3sprog, do the following:
- Get the source from SourceForge – http://sourceforge.net/projects/xc3sprog/files/xc3sprog/v216/xc3sprog-r216.tar.gz/download – always get release r216; other releases may not work.
- Extract the source and apply the patch that is part of the Xilinx-SEU-Simulator package:
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ tar -xzf xc3sprog-r216.tar.gz
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ cp xc3sprog.patch xc3sprog-r216
pavel@pigster-pc ~/Xilinx-SEU-Simulator $ cd xc3sprog-r216
pavel@pigster-pc ~/Xilinx-SEU-Simulator/xc3sprog-r216 $ patch -p1 < xc3sprog.patch
- Build xc3sprog:
pavel@pigster-pc ~/Xilinx-SEU-Simulator/xc3sprog-r216 $ mkdir build
pavel@pigster-pc ~/Xilinx-SEU-Simulator/xc3sprog-r216 $ cd build
pavel@pigster-pc ~/Xilinx-SEU-Simulator/xc3sprog-r216/build $ cmake ..
pavel@pigster-pc ~/Xilinx-SEU-Simulator/xc3sprog-r216/build $ make
- Copy the resulting binary to fault-generator:
pavel@pigster-pc ~/Xilinx-SEU-Simulator/xc3sprog-r216/build $ cp xc3sprog ../../fault-generator/
First of all, we need to prepare the test HW.
- Connect the RS232 PMOD adapter to the leftmost port on the Nexys board (`JA`)
- Connect an RS232 cable between the adapter and the PC
- Connect the parallel JTAG cable to the Nexys board and the PC
- Connect power to the Nexys board and turn the power switch ON
- The Nexys board must be set for JTAG programming
At this point, we are ready to run the first test.
- Copy `.ncd` and `.bit` from `reconf_demo_mult` to `fault-generator`
- From one terminal window, execute:
pavel@pigster-pc ~/Xilinx-SEU-Simulator/fault-generator $ ./fault-generator.pl --ncd reconf_demo_mult.ncd --bit reconf_demo_mult.bit --module datapath
If all goes well, you will see `Waiting for controller to connect`
- From a second terminal window, execute:
pavel@pigster-pc ~/Xilinx-SEU-Simulator/reconf_demo_mult $ ./testbench.pl
Now the test should be running. As `reconf_demo_mult` is not resistant to SEU in any way, you should see a lot of errors each time the test is run with a faulty bitstream.
Next, you can repeat the same procedure, but use `.ncd` and `.bit` from `reconf_demo_mult_tmr`. This is actually the same 8-bit multiplier, but this time it is protected using TMR. You should see no errors now – the system is secure. Now try to run this test with a modified `fault-generator` command:
pavel@pigster-pc ~/Xilinx-SEU-Simulator/fault-generator $ ./fault-generator.pl --ncd reconf_demo_mult.ncd --bit reconf_demo_mult.bit --module datapath --faultcount 2
Now you should sometimes see errors during the test. TMR can’t handle more than one error when these errors are distributed among different redundant parts.
The VHDL code for our examples is very simple. Basically it is just an 8-bit multiplier, which is forced to be synthesized as gates. Then there is a test controller and a UART controller.
The test controller takes two or three bytes from the serial port and acts accordingly. The first byte is always the command, followed by its parameters.
R(0x52) <A> - reads content of register A, registers are numbered from 0
W(0x57) <A> <B> - writes value B to register A
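The two commands can be built as raw byte strings before they are written to the serial port. A small Python sketch follows; the helper names are hypothetical, and it assumes the register number and the value each occupy exactly one byte, as the two-byte and three-byte command lengths suggest.

```python
CMD_READ = 0x52   # ASCII 'R'
CMD_WRITE = 0x57  # ASCII 'W'

def read_command(register):
    """Build the 2-byte command that reads register A."""
    return bytes([CMD_READ, register])

def write_command(register, value):
    """Build the 3-byte command that writes value B to register A."""
    return bytes([CMD_WRITE, register, value])
```

`bytes([...])` raises `ValueError` for anything outside 0-255, so out-of-range register numbers or values are rejected before anything is sent.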
The test bench is very simple. It just writes all possible combinations of operands in the range 0-32 and checks the results using the test controller in the FPGA through the UART. The range of values is limited because the UART is slow. It is meant just as an example.