Introducing CNets2000
I believe that to achieve near-optimal FPGA cores it is often necessary
to control both technology mapping and placement of the circuit --
and to do that properly you need the right tools.
(Technology mapping is the process of mapping abstract gate expressions
into FPGA device resources such as 4-LUTs, i.e. 4-input lookup tables.)
This is straightforward to do with schematics. What you see is what you
get. You control technology mapping using FMAPs (4000X) or LUTs (Virtex).
You control placement with RLOCs (relative location constraints). You can
see RLOCs and FMAPs liberally sprinkled throughout the XSOC/xr16 schematics.
The many disadvantages of schematics include that they are awkward to
share, excerpt, diff, and manage with version control. And they are
slow to draw (if you are fussy). I'm through designing with schematics.
Mapping and placement control is also possible in structural HDLs if
your HDL compiler allows the structure to be tagged with attributes that
in effect generate FMAPs (LUTs) and RLOCs. FPGA Express (>=3.x)
allows attributes (incl. RLOCs) but swallows FMAPs. Workarounds
are awkward (see www.fpgacpu.org/usenet/rope_pushing.html). Apparently
Synplicity and Exemplar give you decent control over both FMAPs and RLOCs.
The attribute syntax is lamentably different across tools, of course!
The schematic version of XSOC/xr16 was 25% faster and somewhat more
compact than the partially-floorplanned Verilog version.
C++ circuit generators
When I built j32 (www3.sympatico.ca/jsgray/homebrew.htm) in 1995, I first
wrote a C++ class library called CNets, which provided classes for nets,
buses, and circuit primitives, and which used C++ operator overloading to
provide a convenient design notation. ("Notation is a tool of thought" --
Ken Iverson.) This provided a simple and extensible text-based structural
design representation (see www.fpgacpu.org/usenet/cnets.html). Indeed,
all of j32, including processor, bus, serial port, simple assembler,
on-chip boot ROM builder, etc. was a single manageable .cpp file.
Whenever I discuss C++ and/or Java design representations, people confuse
structural circuit generators with behavioral synthesizers. I attempt
to explain the distinction in www.fpgacpu.org/usenet/generators.html.
CNets and CNets2000 are circuit generators.
In 1996 or so, I fiddled with a Java version of CNets. The allure of
Java is that it offers reflection, so it is easy to enumerate the members
of a class. If you represent each hierarchical sheet/module of your
design as a class, reflection makes it all too easy to traverse your
design hierarchy. Unfortunately, Java lacks operator overloading.
Where in CNets I could write
net(mux) = a&~sel | b&sel;
in my-JHDL I had to write
mux.is(a.andnot(sel).or(b.and(sel)));
which (although not terrible) I did not like.
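To make the notational point concrete, here is a minimal sketch of how operator overloading lets an ordinary C++ expression build a description of logic rather than compute a value. The Expr type and the mux2() helper are hypothetical names for illustration only; this is not the CNets implementation, which builds nets and gates rather than strings.

```cpp
#include <string>
#include <utility>

// Hypothetical expression type (not the CNets class): each operator
// returns a new Expr whose text records the logic it stands for.
struct Expr {
    std::string text;
    explicit Expr(std::string t) : text(std::move(t)) {}
};

inline Expr operator~(const Expr& a) {
    return Expr("~" + a.text);
}
inline Expr operator&(const Expr& a, const Expr& b) {
    return Expr("(" + a.text + " & " + b.text + ")");
}
inline Expr operator|(const Expr& a, const Expr& b) {
    return Expr("(" + a.text + " | " + b.text + ")");
}

// Builds the 2-1 mux expression in the same notation as the text.
// C++ precedence (~ over & over |) matches the intended logic.
inline Expr mux2(const Expr& a, const Expr& b, const Expr& sel) {
    return a & ~sel | b & sel;
}
```

Calling mux2(Expr("a"), Expr("b"), Expr("sel")) yields the text ((a & ~sel) | (b & sel)), so the source reads like the logic equation while the library decides what to do with it.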
In contrast, C++'s operator overloading provides a nice notation for
expressions, but requires that you somehow "introduce" each member of
a class to itself and to its class. To do this, I use a set of _()
macros that is not too ugly (see below).
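As a rough illustration of what "introducing" a member can mean, the sketch below shows one way such macros might work: each one expands to constructor initializers that hand every member a pointer to its owner and its own name as a string. This is a guess at the general technique, not the actual CNets2000 macros; Member, Module, and the REG1/REG2/REG3 names are all hypothetical.

```cpp
#include <string>
#include <vector>

struct Module;

// A named member (port, wire, or submodule) that registers itself
// with the module that owns it.
struct Member {
    std::string name;
    Member(Module* owner, const char* n);
};

struct Module {
    std::vector<Member*> members;  // filled in as members are constructed
};

inline Member::Member(Module* owner, const char* n) : name(n) {
    owner->members.push_back(this);
}

// Hypothetical stand-ins for the _N() macros: each expands to member
// initializers that pass the owner and the member's stringized name.
#define REG1(a)       a(this, #a)
#define REG2(a, b)    REG1(a), REG1(b)
#define REG3(a, b, c) REG2(a, b), REG1(c)

struct Mux2Sketch : Module {
    Member a, b, sel;
    Mux2Sketch() : REG3(a, b, sel) {}
};
```

After constructing a Mux2Sketch, its members list holds a, b, and sel in declaration order, which is the information Java reflection would otherwise supply for free.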
Later in 1996, I went back to C++ and designed a sketch of a new version
of CNets that handled hierarchical designs properly, but I never finished
it. I've picked at it over the years, but never got much traction.
Back in July I again revisited CNets, producing something I'm calling
CNets2000. It's very much a work in progress. I whipped up about
600 lines of new C++ code and now I can produce EDIF from a CNets2000
structural specification. For example, here is mux.h, which defines
2-input and 4-input muxes:
// mux.h
#include "cnets.h"

// define a 2-1 mux
module(Mux2) {
    In a, b, sel;
    Out o;

    imp(Mux2) _4(a,b,sel,o) is
        o = a&~sel | b&sel;
    endimp
};

// define a 4-1 mux as a composition of three 2-1 muxes
module(Mux4) {
    In a, b, c, d, sel1, sel2;
    Out o;
    Wire o1, o2;
    Mux2 m1, m2, m3;

    imp(Mux4) _7(a,b,c,d,sel1,sel2,o), _5(o1,o2,m1,m2,m3) is
        m1.a(a).b(b).sel(sel1).o(o1).rloc(0,0);
        m2.a(c).b(d).sel(sel1).o(o2).rloc(0,0);
        m3.a(o1).b(o2).sel(sel2).o(o).rloc(0,1);
    endimp
};

// main.cpp
#include "mux.h"

int main() {
    Mux4 m("m");
    m.addIOs();
    cnets.edif(cout, m);
    return 0;
}
(I am especially pleased with the 'portmap'-like dot notation for
specifying port-wire associations: m2.a(c), etc.)
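The dot notation is ordinary method chaining: each port method records a binding and returns a reference to the same instance. Here is a minimal sketch under stated assumptions. Mux2Inst is a hypothetical class, wires are represented as plain strings, and reading rloc's arguments as (row, column) is an assumption, since the text does not say which is which.

```cpp
#include <map>
#include <string>

// Hypothetical submodule instance: every port method records one
// port-to-wire binding and returns *this so calls can be chained.
class Mux2Inst {
public:
    Mux2Inst& a(const std::string& wire)   { return bind("a", wire); }
    Mux2Inst& b(const std::string& wire)   { return bind("b", wire); }
    Mux2Inst& sel(const std::string& wire) { return bind("sel", wire); }
    Mux2Inst& o(const std::string& wire)   { return bind("o", wire); }

    // Assumed argument order: (row, column).
    Mux2Inst& rloc(int row, int col) {
        row_ = row;
        col_ = col;
        return *this;
    }

    const std::map<std::string, std::string>& bindings() const { return bindings_; }
    int row() const { return row_; }
    int col() const { return col_; }

private:
    Mux2Inst& bind(const std::string& port, const std::string& wire) {
        bindings_[port] = wire;
        return *this;
    }

    std::map<std::string, std::string> bindings_;
    int row_ = 0;
    int col_ = 0;
};
```

With this, a statement such as m3.a("o1").b("o2").sel("sel2").o("o").rloc(0, 1); leaves four bindings and one placement on m3, in whatever order the ports are named.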
Behind the scenes, each module maps into a C++ class. Members can
be In and Out port declarations, Wires, and submodules. The imp
(implementation) of each module is actually its class's constructor.
The implementation defines the port-wire connectivity of the submodules
as well as (in Mux2 above) any other logic. This example doesn't show
any registers; I'm still fiddling with them.
Here is an excerpt of its output, which is approximately correct EDIF.
Note that 'm.addIOs()' above has inserted IBUF and OBUFs and _i and _o
nets into the design automatically:
(cell Mux4 (cellType generic)
(view net (viewType netlist)
(interface
(wire a (direction INPUT))
(wire b (direction INPUT))
(wire c (direction INPUT))
(wire d (direction INPUT))
(wire o (direction OUTPUT))
(wire sel1 (direction INPUT))
(wire sel2 (direction INPUT))
(contents
(instance a_ibuf (viewRef net (cellRef IBUF)))
(instance b_ibuf (viewRef net (cellRef IBUF)))
(instance c_ibuf (viewRef net (cellRef IBUF)))
(instance d_ibuf (viewRef net (cellRef IBUF)))
(instance m1 (viewRef net (cellRef Mux2)))
(instance m2 (viewRef net (cellRef Mux2)))
(instance m3 (viewRef net (cellRef Mux2)))
(instance o_obuf (viewRef net (cellRef OBUF)))
(instance sel1_ibuf (viewRef net (cellRef IBUF)))
(instance sel2_ibuf (viewRef net (cellRef IBUF)))
(net a (joined
(portRef a)
(portRef I (instanceRef a_ibuf))))
(net a_i (joined
(portRef a (instanceRef m1))
(portRef O (instanceRef a_ibuf))))
(net b (joined
(portRef b)
(portRef I (instanceRef b_ibuf))))
(net b_i (joined
(portRef b (instanceRef m1))
(portRef O (instanceRef b_ibuf))))
(net c (joined
(portRef c)
(portRef I (instanceRef c_ibuf))))
(net c_i (joined
(portRef a (instanceRef m2))
(portRef O (instanceRef c_ibuf))))
(net d (joined
(portRef d)
(portRef I (instanceRef d_ibuf))))
(net d_i (joined
(portRef b (instanceRef m2))
(portRef O (instanceRef d_ibuf))))
(net o (joined
(portRef o)
(portRef O (instanceRef o_obuf))))
(net o1 (joined
(portRef o (instanceRef m1))
(portRef a (instanceRef m3))))
(net o2 (joined
(portRef o (instanceRef m2))
(portRef b (instanceRef m3))))
(net o_o (joined
(portRef o (instanceRef m3))
(portRef I (instanceRef o_obuf))))
(net sel1 (joined
(portRef sel1)
(portRef I (instanceRef sel1_ibuf))))
(net sel1_i (joined
(portRef sel (instanceRef m1))
(portRef sel (instanceRef m2))
(portRef O (instanceRef sel1_ibuf))))
(net sel2 (joined
(portRef sel2)
(portRef I (instanceRef sel2_ibuf))))
(net sel2_i (joined
(portRef sel (instanceRef m3))
(portRef O (instanceRef sel2_ibuf))))))))
Since CNets2000 structurally models the elements of a schematic, it can
do anything a schematic can do. And yet it has all the advantages of
a text representation but with a modern high-level language substrate.
The best thing about CNets2000 (and similar tools) is it builds a data
structure of the design in memory that you can manipulate in C++. So it
becomes possible to simulate it using a simple cycle-based simulator.
Or emit an optimized C program that should simulate it at high speed.
Or write/read designs as XML. Or massage the design (as Module::addIOs()
does to insert IBUFs and OBUFs). Or directly emit it using jbits.
Or ... .
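To illustrate the "massage the design" case, here is a minimal sketch of an addIOs-style pass over a toy in-memory netlist. The Port, PortRef, Instance, and Design types are hypothetical and far simpler than whatever CNets2000 actually uses; only the naming scheme (name_ibuf, name_obuf, name_i, name_o) is taken from the EDIF output above.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory design; not the CNets2000 data structure.
struct Port     { std::string name; bool isInput; };
struct PortRef  { std::string port; std::string net; };  // instance port -> net
struct Instance {
    std::string name;
    std::string cell;
    std::vector<PortRef> conns;
};
struct Design {
    std::vector<Port> ports;
    std::vector<Instance> instances;
};

// Insert an IBUF on every input port and an OBUF on every output port,
// rewiring existing instances onto the new internal nets (name_i, name_o).
inline void addIOs(Design& d) {
    std::vector<Instance> bufs;
    for (const Port& p : d.ports) {
        const std::string inner = p.name + (p.isInput ? "_i" : "_o");

        // Move every existing connection from the pad net to the inner net.
        for (Instance& inst : d.instances)
            for (PortRef& c : inst.conns)
                if (c.net == p.name) c.net = inner;

        // Create the buffer that bridges the pad net and the inner net.
        Instance buf;
        buf.name = p.name + (p.isInput ? "_ibuf" : "_obuf");
        buf.cell = p.isInput ? "IBUF" : "OBUF";
        if (p.isInput)
            buf.conns = {PortRef{"I", p.name}, PortRef{"O", inner}};
        else
            buf.conns = {PortRef{"I", inner}, PortRef{"O", p.name}};
        bufs.push_back(buf);
    }
    d.instances.insert(d.instances.end(), bufs.begin(), bufs.end());
}
```

Because the design is just data, a pass like this is a short loop; the same traversal style would serve a simulator or a netlist writer.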
I don't know how far I will polish the current CNets2000, or whether
I will apply it to XSOC/xr in the future. Perhaps I should open source it
and move it to SourceForge.net.
Generating HDLs from CNets2000
Last time, I wrote:
The best thing about CNets2000 (and similar tools) is it builds a data
structure of the design in memory that you can manipulate in C++. So it
becomes possible to simulate it using a simple cycle-based simulator.
Or emit an optimized C program that should simulate it at high speed.
Or write/read designs as XML. Or massage the design (as Module::addIOs()
does to insert IBUFs and OBUFs). Or directly emit it using jbits.
Or ... .
Or even emit structural HDL! For this CNets2000 code:
// mux.h:
#include "cnets.h"

module(Mux2) { // define a 2-1 mux
    In a, b, sel;
    Out o;

    imp(Mux2) _4(a,b,sel,o) is
        o = a&~sel | b&sel;
    endimp
};

// define a 2-1 mux with explicit prims
module(Mux2a) {
    In a, b, sel;
    Out o;
    Wire sel_n, asel, bsel;
    INV sel_inv;
    AND2 a_and, b_and;
    OR2 or;

    imp(Mux2a) _4(a,b,sel,o),
               _7(sel_n,asel,bsel,sel_inv,a_and,b_and,or) is
        sel_inv.I(sel).O(sel_n);
        a_and.I0(a).I1(sel_n).O(asel);
        b_and.I0(b).I1(sel).O(bsel);
        or.I0(asel).I1(bsel).O(o);
    endimp
};

// define a 4-1 mux as a composition of three 2-1 muxes
module(Mux4) {
    In a, b, c, d, sel1, sel2;
    Out o;
    Wire o1, o2;
    Mux2 m1, m2;
    Mux2a m3;

    imp(Mux4) _7(a,b,c,d,sel1,sel2,o), _5(o1,o2,m1,m2,m3) is
        m1.a(a).b(b).sel(sel1).o(o1).rloc(0,0);
        m2.a(c).b(d).sel(sel1).o(o2).rloc(0,0);
        m3.a(o1).b(o2).sel(sel2).o(o).rloc(0,1);
    endimp
};

// main.cpp
#include "mux.h"

int main() {
    Mux4 m("m");
    // m.addIOs();
    cnets.verilog(cout, m);
    return 0;
}
we get this structural Verilog:
/* generator: CNets2000 1.0.0 */
module Mux4(a, b, c, d, sel1, sel2, o);
in a;
in b;
in c;
in d;
in sel1;
in sel2;
out o;
wire o1;
wire o2;
Mux2 m1(.a(a), .b(b), .sel(sel1), .o(o1));
Mux2 m2(.a(c), .b(d), .sel(sel1), .o(o2));
Mux2a m3(.a(o1), .b(o2), .sel(sel2), .o(o));
endmodule
module Mux2(a, b, sel, o);
in a;
in b;
in sel;
out o;
assign o = a & ~sel | b & sel;
endmodule
module Mux2a(a, b, sel, o);
in a;
in b;
in sel;
out o;
wire sel_n;
wire asel;
wire bsel;
INV sel_inv(.I(sel), .O(sel_n));
AND2 a_and(.I0(a), .I1(sel_n), .O(asel));
AND2 b_and(.I0(b), .I1(sel), .O(bsel));
OR2 or(.I0(asel), .I1(bsel), .O(o));
endmodule
and if we modify main.cpp to enable the m.addIOs(); call,
which inserts IBUFs and OBUFs into m, this generates:
module Mux4(a, b, c, d, sel1, sel2, o);
in a;
in b;
in c;
in d;
in sel1;
in sel2;
out o;
wire o1;
wire o2;
wire a_i;
wire b_i;
wire c_i;
wire d_i;
wire sel1_i;
wire sel2_i;
wire o_o;
Mux2 m1(.a(a_i), .b(b_i), .sel(sel1_i), .o(o1));
Mux2 m2(.a(c_i), .b(d_i), .sel(sel1_i), .o(o2));
Mux2a m3(.a(o1), .b(o2), .sel(sel2_i), .o(o_o));
IBUF a_ibuf(.I(a), .O(a_i));
IBUF b_ibuf(.I(b), .O(b_i));
IBUF c_ibuf(.I(c), .O(c_i));
IBUF d_ibuf(.I(d), .O(d_i));
IBUF sel1_ibuf(.I(sel1), .O(sel1_i));
IBUF sel2_ibuf(.I(sel2), .O(sel2_i));
OBUF o_obuf(.I(o_o), .O(o));
endmodule
Similarly, it should be straightforward to emit structural VHDL.
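As a sketch of why a second back end is cheap once the design lives in memory, the two functions below emit a Verilog module header and the matching VHDL entity from the same port list. PortDecl, verilogHeader(), and vhdlEntity() are hypothetical names, not part of CNets2000. The sketch uses the standard Verilog keywords input and output, types every VHDL port as std_logic (which assumes the usual ieee.std_logic_1164 context clause precedes the entity), and expects at least one port.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct PortDecl { std::string name; bool isInput; };  // hypothetical port record

// Emit a Verilog module header followed by its port declarations.
inline std::string verilogHeader(const std::string& module,
                                 const std::vector<PortDecl>& ports) {
    std::ostringstream os;
    os << "module " << module << "(";
    for (std::size_t i = 0; i < ports.size(); ++i)
        os << (i ? ", " : "") << ports[i].name;
    os << ");\n";
    for (const PortDecl& p : ports)
        os << "  " << (p.isInput ? "input " : "output ") << p.name << ";\n";
    return os.str();
}

// Emit the matching VHDL entity declaration from the same port list.
inline std::string vhdlEntity(const std::string& module,
                              const std::vector<PortDecl>& ports) {
    std::ostringstream os;
    os << "entity " << module << " is\n  port (\n";
    for (std::size_t i = 0; i < ports.size(); ++i)
        os << "    " << ports[i].name << " : "
           << (ports[i].isInput ? "in" : "out") << " std_logic"
           << (i + 1 < ports.size() ? ";" : "") << "\n";  // no ';' after the last port
    os << "  );\nend " << module << ";\n";
    return os.str();
}
```

Both writers walk the same data, so a vendor-specific attribute syntax would be one more small variation on the same loop.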
My hope is that, in time, this system will prove an acceptable way to
maintain just one design representation and still satisfy requests for
Verilog and VHDL versions, and perhaps also the anticipated requests for
synthesis-vendor-specific attributes, including mapping and
placement constraint attributes.
"I need a VHDL version with Exemplar attribute syntax." "No problem!"
"Well I need a Verilog version with Synplicity attribute syntax." "No problem!"
...
I am open sourcing CNets, moving it to SourceForge.net, and inviting you
to collaborate on its evolution.
Stay tuned for code check-ins, new documentation, specs, schedules, etc.