Commit 434e7353 authored by Fabrizio Ferrandi

* Added flopoco.

parent 966fd278
Installation Instructions
*************************
See doc/flopoco_user_manual.html
# Makefile for flopoco
SUBDIRS = src
DIST_SUBDIRS = src
ACLOCAL_AMFLAGS = -I m4
bin_PROGRAMS = flopoco longacc2fp fp2bin bin2fp fpadder_example
flopoco_SOURCES = src/main.cpp
flopoco_CPPFLAGS = $(CPPFLAGS) -I$(top_srcdir)/src/
flopoco_CXXFLAGS = $(CXXFLAGS)
flopoco_LDADD = $(LIBS) src/libflopoco.la
bin2fp_SOURCES = \
src/Tools/bin2fp.cpp \
src/utils.cpp
bin2fp_CPPFLAGS = $(CPPFLAGS)
bin2fp_CXXFLAGS = $(CXXFLAGS)
bin2fp_LDADD = $(LIBS)
fp2bin_SOURCES = \
src/Tools/fp2bin.cpp \
src/utils.cpp
fp2bin_CPPFLAGS = $(CPPFLAGS)
fp2bin_CXXFLAGS = $(CXXFLAGS)
fp2bin_LDADD = $(LIBS)
longacc2fp_SOURCES = \
src/Tools/longacc2fp.cpp \
src/utils.cpp
longacc2fp_CPPFLAGS = $(CPPFLAGS)
longacc2fp_CXXFLAGS = $(CXXFLAGS)
longacc2fp_LDADD = $(LIBS)
fpadder_example_SOURCES = src/main_minimal.cpp
fpadder_example_CPPFLAGS = $(CPPFLAGS) -I$(top_srcdir)/src/
fpadder_example_CXXFLAGS = $(CXXFLAGS)
fpadder_example_LDADD = $(LIBS) src/libflopoco.la
FloPoCo is a generator of Floating-Point (but not only) Cores for FPGAs (but not only).
Copyright © ENS-Lyon, INRIA, CNRS, UCBL, 2008-2011
All rights reserved
Contact: Florent.de.Dinechin@ens-lyon.fr
The intent of the authors is to distribute FloPoCo as free software (in the FSF AGPL sense), while imposing that the source code generated by FloPoCo is also free software (also AGPL-like).
TODO file for FloPoCo
Bug (in FixRealKCM?) ./flopoco -verbose=3 FPExp 7 12
With Matei: see the nextCycles in FPExp and see if we can push them in IntMultiplier somehow
Here and there, fix VHDL style issues needed for whimsical simulators or synthesizers. See doc/VHDLStyle.txt
Clean up src/ file hierarchy:
Add LUT-based integer comparators
BitHeapization
(and provide a bitheap-only constructor for all the following):
* all the KCMs
* PolyEval
* FixSinCos
* FPExp
* HOTBM
* IntAddition/*
* Rework Guillaume Sergent's operators around the bit heap
* define a policy for enableSuperTile: default to false or true?
* Push this option to FPMult and other users of IntMult.
* Replace tiling exploration with cached/classical tilings
* More debogdanization: Get rid of
IntAddition/IntCompressorTree
IntAddition/NewIntCompressorTree
IntAddition/PopCount
after checking the new bit heap compression is at least as good...
* replace "Virtex4" tests in IntAddition with Target calls
Testbench
* Bug on outputs that are bits with isBus false and multiple-valued
(see the P output of Collision in release 2.1.0)
* Multiple valued outputs should always be intervals, shouldn't they?
Pipeline framework, Operator and Targets:
* global switch to ieee standard signed and unsigned libraries
* add FPU object and remove all global variables, to enable library build
* add semantic support of fixed-point to Signal
* fix obvious memory leaks
* fix the default getCycleFromSignal.
* define a Timing as a (Cycle, CriticalPath), and associate that cleanly to Signals with getTiming methods that set both cycle and critical path.
* insert getNewUId and comb/freq info directly in setName(), and clean up
Collision
* manage infinities etc
* decompose into FPSumOf3Squares and Collision
* Sum of n squares
Table and FunctionTable:
* FunctionEvaluator with degree=0 should generate a FunctionTable
* VHDL generation is broken when logicTable=0 (pipeline depth=1 but no register)
FPLog:
* Fix a few remaining last-bit accuracy problems, damn.
* compare with polynomial-based version.
FPExp:
* Everybody wants less-than-single-precision
HOTBM:
* true FloPoCoization, pipeline
* better (DSP-aware) architectural exploration
ConstMult:
* group KCM and shift-and-add in a single OO hierarchy (selecting the one with less hardware)
* For FPConstMult, don't output the LSBs of the IntConstMult
but only their sticky
* Try left to right and right to left; try variations of the initial recoding
* more clever, Lefevre-inspired algorithm
* Use DSP: find the most interesting constant fitting on 18 bits
* compare with Spiral.net and Gustafsson papers
* Implement the continued fraction stuff for FPCRConstMult
SumOfProducts, LongAcc
* add test bench generation
Shifters
* provide finer spec, see the TODOs inside Shifter.cpp
General
* Doxygenize while it's not too late
********************************************************************
Tentative roadmap
(the minor version number counts, more or less, the number of working operators.
We moved from 0.xx to 1.xx when all the basic operators had been backported with a working pipeline):
Version 0.1: IntConstMult, FPConstMult, FPCRConstMult
Version 0.4: LongAccumulator, FPMultiplier
Version 0.5 : HOTBM integration
Version 0.6 : FPLog
Version 0.7 : FPExp
Version 0.8 : DotProduct
Version 0.9 : LNS operators, thanks to Sylvain Collange
Version 0.11: FPDiv, IntSquarer, new pipeline framework
Version 1.15: FPSqrt FPSquarer InputIEEE Fix2FP
Version 2.0 : FunctionEvaluator
Version 2.2 : FPExp, FPPowr (experimental)
Version 2.3 : IntConstDiv, FPConstDiv and FPConstDivExpert, FPConstMultRational, TaMaDi architecture
-- we're here (faster than expected)
Version 2.4 RNGs, BitHeap, revamped multipliers, FPSinCos, CORDIC
FPAddSub (for FFT butterfly structure)
Version ??? Complex operators (in particular division)
Version ??? FixToFPUniformDist (2008 ASAP paper by Thomas)
Version ??? FPNorm2D
Version ??? FPNorm3D -- Almost there, see Collision
Version ?? FPBoxMuller
Version ??? Interval operators
(insert your wishlist here)
If we were to redo the pipeline framework from scratch, here is the proper way to do it.
The current situation has a history: we first added cycle management, then, as a refinement, critical-path based subcycle timing.
So we have to manage explicitly the two components of a lexicographic time (cycle and delay within a cycle).
But there is only one wallclock time, and the decomposition of this wallclock time into cycles and sub-cycles could be automatic. And should.
The following version of declare() could remove the need for manageCriticalPath as well as all the explicit synchronization methods.
declare(name, size, delay)
declares a signal, and associates its computation delay to it. This delay is what we currently pass to manageCriticalPath. Each signal now will have a delay associated to it (with a default of 0 for signals that do not add to the critical path).
The semantics is: this signal will not be assigned its value before the instant delta + max(instants of the RHS signals), where delta is the delay passed to declare().
This is all that the first pass, the one that populates the vhdl stream, needs to do. No explicit synchronization management needed. No need to setCycle to "come back in time", etc.
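The semantics of the proposed declare() can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name TimingSketch, the explicit list of RHS signal names (the real first pass would obtain them by parsing the vhdl stream), the omission of the size argument, and inputs being available at instant 0. It is not the FloPoCo Operator API.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposed declare(name, size, delay).
// It only records timing; it generates no VHDL.
class TimingSketch {
public:
    // Assumption: every input is available at instant 0.
    void addInput(const std::string& name) { instant_[name] = 0.0; }

    // Declares `name`, computed from the signals listed in `rhs`.
    // `delay` defaults to 0 for signals that do not add to the critical path.
    // Returns the instant delay + max(instants of the RHS signals).
    double declare(const std::string& name,
                   const std::vector<std::string>& rhs,
                   double delay = 0.0) {
        assert(delay >= 0.0);
        double latest = 0.0;
        for (const auto& s : rhs) {
            latest = std::max(latest, instant_.at(s));  // throws if s is undeclared
        }
        instant_[name] = latest + delay;
        return instant_[name];
    }

    double instant(const std::string& name) const { return instant_.at(name); }

private:
    std::map<std::string, double> instant_;  // signal name -> wallclock instant
};
```

With inputs X and Y, declaring S from {X, Y} with delay 2 and then T from {S, X} with delay 1.5 gives instants 2 and 3.5; no cycle is mentioned anywhere in this pass.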
Then we have a retiming procedure that must associate a cycle to each signal.
It will do both synchronization and cycle computation. According to Alain Darte there is an old retiming paper that shows that the retiming problem can be solved optimally in linear time for DAGs, which is not surprising.
Example of simple procedure:
first build the DAG of signals (all it takes is the same RHS parsing, looking for signal names, as we do)
Then sit on the existing scheduling literature...
For instance
1/ build the operator's critical path
2/ build the ASAP and ALAP instants for each signal
3/ progress from output to input, allocating a cycle to each signal, with ALAP scheduling (should minimize register count for compressing operators)
4/ possibly do a bit of Leiserson and Saxe retiming
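Steps 1/ and 2/ above can be sketched as two passes over the DAG. This is only an illustration under stated assumptions: the Node and Schedule types and the computeInstants function are hypothetical names, the nodes are assumed to be stored in topological order, and the cycle allocation of step 3/ and the retiming of step 4/ are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One signal of the DAG (hypothetical type).
// Assumption: nodes are stored in topological order, so every index in
// `preds` is smaller than the index of the node itself.
struct Node {
    double delay;                    // computation delay of this signal
    std::vector<std::size_t> preds;  // indices of the RHS signals
};

struct Schedule {
    std::vector<double> asap;  // earliest instant at which each signal is valid
    std::vector<double> alap;  // latest instant that keeps the critical path unchanged
    double criticalPath;       // length of the operator's critical path
};

Schedule computeInstants(const std::vector<Node>& dag) {
    const std::size_t n = dag.size();
    Schedule s;
    s.asap.assign(n, 0.0);
    s.criticalPath = 0.0;

    // Forward pass: ASAP instants and critical path (steps 1/ and 2/).
    for (std::size_t i = 0; i < n; ++i) {
        double latest = 0.0;
        for (std::size_t p : dag[i].preds) {
            assert(p < i);  // topological order
            latest = std::max(latest, s.asap[p]);
        }
        s.asap[i] = latest + dag[i].delay;
        s.criticalPath = std::max(s.criticalPath, s.asap[i]);
    }

    // Backward pass: ALAP instants (step 2/). A signal must be valid early
    // enough for each of its successors to finish by its own ALAP instant.
    s.alap.assign(n, s.criticalPath);
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t p : dag[i].preds) {
            s.alap[p] = std::min(s.alap[p], s.alap[i] - dag[i].delay);
        }
    }
    return s;
}
```

The difference alap[i] - asap[i] is the slack of signal i: signals with zero slack are on the critical path, and the others are the ones an ALAP allocation would push towards the output.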
We keep all the current advantages:
- still VHDL printing based
- When developing an operator, we initially leave all the deltas to zero to debug the combinatorial version. Then we incrementally add deltas, just like we currently add manageCriticalPath().
- etc
The difference is that the semantics are now much clearer. No more notion of a block following a manageCriticalPath(), etc
The question is: don't we lose some control on the circuit with this approach, compared to what we currently do?
Note that all this is so much closer to textbook literature, with simple DAGs labelled by delay...
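To make the automatic decomposition of wallclock time into a cycle and a delay within the cycle concrete, here is a minimal greedy sketch. It is an assumption-laden illustration, not the retiming procedure discussed above: it allocates cycles ASAP from input to output (not ALAP), it starts a new cycle whenever a signal's delay would not fit in the remainder of the clock period, and the names Timing, Signal and allocateCycles are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lexicographic time: a cycle, then a delay within that cycle.
struct Timing {
    int cycle;
    double offset;
};

// True if a is strictly later than b in lexicographic order.
inline bool later(const Timing& a, const Timing& b) {
    return a.cycle > b.cycle || (a.cycle == b.cycle && a.offset > b.offset);
}

// Hypothetical DAG node; signals are assumed to be in topological order.
struct Signal {
    double delay;                  // computation delay of this signal
    std::vector<std::size_t> rhs;  // indices of the RHS signals
};

// Greedy ASAP allocation: each signal starts at the latest timing of its
// RHS signals; if its delay does not fit in what is left of the clock
// period, a register is assumed and the signal starts the next cycle.
std::vector<Timing> allocateCycles(const std::vector<Signal>& dag, double period) {
    assert(period > 0.0);
    std::vector<Timing> t(dag.size());
    for (std::size_t i = 0; i < dag.size(); ++i) {
        assert(dag[i].delay <= period);  // a single signal must fit in one cycle
        Timing start{0, 0.0};
        for (std::size_t p : dag[i].rhs) {
            assert(p < i);  // topological order
            if (later(t[p], start)) start = t[p];
        }
        if (start.offset + dag[i].delay > period) {
            t[i] = Timing{start.cycle + 1, dag[i].delay};
        } else {
            t[i] = Timing{start.cycle, start.offset + dag[i].delay};
        }
    }
    return t;
}
```

The operator developer only supplies delays; the (cycle, offset) pairs fall out of the traversal, which is the point made above about there being only one wallclock time.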
dnl Process this file with autoconf to produce a configure script.
AC_INIT([FloPoCo], [svn-trunk], [BUG_REPORT_ADDRESS])
AC_CONFIG_MACRO_DIR([m4])
AM_INIT_AUTOMAKE([-Wall -Werror foreign subdir-objects])
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
dnl Check for programs.
AC_PROG_CXX
AC_PROG_CC
AC_PROG_CPP
AC_PROG_INSTALL
AC_PROG_LIBTOOL
AM_PROG_LEX
AC_PROG_YACC
# we need bison instead of yacc
YACC_TRIMMED=`echo ${YACC}| sed 's/-y//;s/ //'`
if test "x${YACC_TRIMMED}" != xbison; then
AC_MSG_ERROR([bison is required])
fi
#and flex instead of lex
if test "$LEX" != flex; then
AC_MSG_ERROR(flex is required)
fi
dnl Sets C++ as main language
AC_LANG([C++])
dnl Enable and Disable values
enableval="yes"
disableval="no"
dnl Disable executable compilation
AC_ARG_ENABLE( [exec],
[AS_HELP_STRING([--disable-exec], [disable executable generation, building FloPoCo as a dynamic library only])],
[EXEC=$enableval],
[EXEC=$enableval])
AM_CONDITIONAL(BUILD_FLOPOCO_EXEC, test "x$EXEC" = xyes)
dnl Check for libraries.
AC_CHECK_LIB([gmpxx], [main], [], [AC_MSG_ERROR(libgmpxx is missing.)])
AC_CHECK_LIB([gmp], [main], [], [AC_MSG_ERROR(libgmp is missing.)])
AC_CHECK_LIB([mpfr], [main], [], [AC_MSG_ERROR(libmpfr is missing.)])
AC_CHECK_LIB([sollya], [main], [], [AC_MSG_ERROR(libsollya is missing.)])
AC_DEFINE(HAVE_SOLLYA, 1, "Found Sollya")
dnl Check for headers.
AC_CHECK_HEADER([gmpxx.h], [], [AC_MSG_ERROR(gmpxx.h is missing.)])
AC_CHECK_HEADER([gmp.h], [], [AC_MSG_ERROR(gmp.h is missing.)])
AC_CHECK_HEADER([mpfr.h], [], [AC_MSG_ERROR(mpfr.h is missing.)])
dnl need to fix the sollya header test
dnl AC_CHECK_HEADER([sollya.h], [], [AC_MSG_ERROR(sollya.h is missing.)])
dnl These defines do not seem to be used anywhere
dnl AC_DEFINE(HAVE_HOTBM, 1, "HOTBM available")
dnl AC_DEFINE(HAVE_LNS, 1, "LNS available")
dnl Generate output
AC_CONFIG_HEADERS([config.h])
AC_CONFIG_FILES([src/Makefile Makefile])
AC_OUTPUT
#include <fstream>
#include <sstream>
#include "FixedComplexAdder.hpp"
using namespace std;
namespace flopoco{
extern vector<Operator *> oplist;
FixedComplexAdder::FixedComplexAdder(Target* target, int wI_, int wF_, bool signedOperator_, map<string, double> inputDelays)
: Operator(target), wI(wI_), wF(wF_), signedOperator(signedOperator_)
{
w = signedOperator ? 1 + wI + wF : wI + wF;
ostringstream name;
setCopyrightString ( "Istoan Matei, Florent de Dinechin (2008-2012)" );
if(target->isPipelined())
name << "FixedComplexAdder_" << w << "_f"<< target->frequencyMHz() << "_uid" << getNewUId();
else
name << "FixedComplexAdder_" << w << "_uid" << getNewUId();
setName( name.str() );
addInput( "Xi", w, true);
addInput( "Xr", w, true);
addInput( "Yi", w, true);
addInput( "Yr", w, true);
addInput( "Cinr", 1 );
addInput( "Cini", 1 );
addOutput("Zi", w, 2);
addOutput("Zr", w, 2);
setCriticalPath(getMaxInputDelays(inputDelays));
IntAdder* addOperator = new IntAdder(target, w, inDelayMap("X",getCriticalPath()));
oplist.push_back(addOperator);
inPortMap (addOperator, "X", "Xi");
inPortMap (addOperator, "Y", "Yi");
inPortMap (addOperator, "Cin", "Cini");
outPortMap(addOperator, "R", "Zi", false);
vhdl << instance(addOperator, "ADD_I");
inPortMap (addOperator, "X", "Xr");
inPortMap (addOperator, "Y", "Yr");
inPortMap (addOperator, "Cin", "Cinr");
outPortMap(addOperator, "R", "Zr", false);
vhdl << instance(addOperator, "ADD_R");
syncCycleFromSignal("Zr");
setCriticalPath( addOperator->getOutputDelay("R") );
}
FixedComplexAdder::~FixedComplexAdder()
{
}
void FixedComplexAdder::emulate(TestCase * tc)
{
mpz_class svXi = tc->getInputValue ( "Xi" );
mpz_class svYi = tc->getInputValue ( "Yi" );
mpz_class svCi = tc->getInputValue ( "Cini" );
mpz_class svXr = tc->getInputValue ( "Xr" );
mpz_class svYr = tc->getInputValue ( "Yr" );
mpz_class svCr = tc->getInputValue ( "Cinr" );
mpz_class svZi = svXi + svYi + svCi;
mpz_class svZr = svXr + svYr + svCr;
// Discard the carry out (bit w) so that the expected result fits on w bits
mpz_clrbit ( svZi.get_mpz_t(), w );
mpz_clrbit ( svZr.get_mpz_t(), w );
tc->addExpectedOutput ( "Zi", svZi );
tc->addExpectedOutput ( "Zr", svZr );
}
}
#ifndef FixedComplexAdder_HPP
#define FixedComplexAdder_HPP
#include <vector>
#include <sstream>
#include "../Operator.hpp"
#include "../IntAdder.hpp"
namespace flopoco{
/**
* Complex adder for fixed point numbers
*/
class FixedComplexAdder : public Operator
{
public:
FixedComplexAdder(Target* target, int wI, int wF, bool signedOperator = true, map<string, double> inputDelays = emptyDelayMap);
~FixedComplexAdder();
void emulate(TestCase * tc);
int wI, wF, w;
bool signedOperator;
};
}
#endif