Lecture 10
Memories (RAM/ROM)
11/11/08
Outline
• Memory
   • Distributed RAM
   • Block RAM
• Instantiation versus Inference
• VHDL Inference Code
   • Distributed RAM
   • Block RAM
   • ROM
• VHDL Instantiation Code
Memory Types
[Figure: memory types, with memory divided into RAM and ROM]
CLB Slice
[Figure: CLB slice containing two look-up tables (inputs G1-G4 and F1-F4), each followed by carry and control logic and a D flip-flop (D, Q, CK, EC, R); slice signals include CIN, COUT, CLK, CE, SR, BY, F5IN, and outputs X, XB, Y, YB]
Xilinx Multipurpose LUT
• 4-input LUT
• 16 x 1 RAM
• 16-bit SR (shift register)
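The 16-bit SR configuration can be pictured as a shift register whose output tap is selected by the four LUT address inputs. The following is a minimal behavioral sketch, not taken from the lecture: the entity name lut_sr16 and its port names are hypothetical, and whether a synthesis tool maps it onto a single LUT depends on the tool and the target device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical entity: 16-bit shift register with an addressable tap.
entity lut_sr16 is
  port (clk : in  std_logic;
        ce  : in  std_logic;                     -- shift enable
        d   : in  std_logic;                     -- serial data input
        a   : in  std_logic_vector(3 downto 0);  -- tap select
        q   : out std_logic);
end lut_sr16;

architecture behavioral of lut_sr16 is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (ce = '1') then
        sr <= sr(14 downto 0) & d;  -- newest bit enters at position 0
      end if;
    end if;
  end process;
  -- Asynchronous read of the selected tap; a = "0000" returns the newest bit.
  q <= sr(conv_integer(unsigned(a)));
end behavioral;
```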
Distributed RAM
• CLB LUT configurable as distributed RAM
• A LUT equals a 16 x 1 RAM
• Asynchronous read
• 16 x 2 single-port RAM
• 16 x 1 dual-port RAM
[Figure: RAM16X1S primitive built from one LUT (inputs D, WE, WCLK, A0-A3; output O), and a dual-port arrangement of two LUTs (outputs SPO and DPO, second read address DPRA0-DPRA3)]
FPGA Block RAM
Block RAM
[Figure: Spartan-3 dual-port block RAM with Port A and Port B]
• Most efficient memory implementation
• Dedicated blocks of memory
• Ideal for most memory requirements
• 4 to 104 memory blocks
• 18 kbits = 18,432 bits per block (16 k without parity bits)
• Use multiple blocks for larger memories
• Builds both single and true dual-port RAMs
• Synchronous write and read (different from distributed RAM)
RAM Blocks and Multipliers in Xilinx FPGAs
[Figure: FPGA layout showing RAM blocks, multipliers, and logic blocks]
Spartan-3 Block RAM Amounts
Block RAM can have various configurations (port aspect ratios)
• 16k x 1 (addresses 0 to 16,383)
• 8k x 2 (addresses 0 to 8,191)
• 4k x 4 (addresses 0 to 4,095)
• 2k x (8+1) (addresses 0 to 2,047)
• 1024 x (16+2) (addresses 0 to 1,023)
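As a quick check on these configurations, every aspect ratio covers the same block: the narrow ratios address 16,384 data bits, and the byte-wide and wider ratios add one parity bit per data byte, which gives the 18,432-bit total quoted earlier. The arithmetic below is a worked illustration rather than part of the original slides.

```latex
\begin{align*}
16\text{k} \times 1 &= 16{,}384 \times 1 = 16{,}384 \text{ bits} \\
8\text{k} \times 2 &= 8{,}192 \times 2 = 16{,}384 \text{ bits} \\
4\text{k} \times 4 &= 4{,}096 \times 4 = 16{,}384 \text{ bits} \\
2\text{k} \times (8+1) &= 2{,}048 \times 9 = 18{,}432 \text{ bits} \\
1024 \times (16+2) &= 1{,}024 \times 18 = 18{,}432 \text{ bits}
\end{align*}
```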
Block RAM Port Aspect Ratios
Single-Port Block RAM
Dual-Port Block RAM
Dual-Port Bus Flexibility
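[Figure: RAMB4_S16_S8 dual-port block RAM. Port A is 1K-bit deep and 18 bits wide, with WEA, ENA, RSTA, CLKA, ADDRA[9:0], DIA[17:0], and DOA[17:0]. Port B shows WEB and ENB; its remaining signals are not recoverable from the figure]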
Two Independent Single-Port RAMs
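[Figure: RAMB4_S1_S1 used as two independent single-port RAMs, each 8K-bit deep and 1 bit wide. Port B shows WEB, ENB, RSTB, CLKB, ADDRB[12:0], DIB[0], and DOB[0], with its address driven as 1, ADDR[12:0], that is, with the most significant address bit tied to 1]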
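The idea of splitting one dual-port memory into two independent single-port memories can be sketched in inferred VHDL by prefixing a fixed most significant address bit on each port. This is an illustrative sketch under stated assumptions: the entity name split_ram, the signal names full_a and full_b, and the use of a single shared clock (the figure has separate CLKA and CLKB) are all simplifications introduced here, and mapping onto a block RAM is up to the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical entity: one memory array shared by two single-port users.
entity split_ram is
  generic (bits      : integer := 1;    -- bits per word
           addr_bits : integer := 13);  -- address bits seen by each user
  port (clk   : in  std_logic;
        wea   : in  std_logic;
        web   : in  std_logic;
        addra : in  std_logic_vector(addr_bits-1 downto 0);
        addrb : in  std_logic_vector(addr_bits-1 downto 0);
        dia   : in  std_logic_vector(bits-1 downto 0);
        dib   : in  std_logic_vector(bits-1 downto 0);
        doa   : out std_logic_vector(bits-1 downto 0);
        dob   : out std_logic_vector(bits-1 downto 0));
end split_ram;

architecture behavioral of split_ram is
  -- The array is twice as deep as each user's address space.
  type ram_type is array (2**(addr_bits+1)-1 downto 0)
    of std_logic_vector(bits-1 downto 0);
  signal RAM    : ram_type;
  signal full_a : std_logic_vector(addr_bits downto 0);
  signal full_b : std_logic_vector(addr_bits downto 0);
begin
  full_a <= '0' & addra;  -- port A always addresses the lower half
  full_b <= '1' & addrb;  -- port B always addresses the upper half

  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (wea = '1') then
        RAM(conv_integer(unsigned(full_a))) <= dia;
      end if;
      if (web = '1') then
        RAM(conv_integer(unsigned(full_b))) <= dib;
      end if;
      -- Registered reads; each returns the contents from before this edge.
      doa <= RAM(conv_integer(unsigned(full_a)));
      dob <= RAM(conv_integer(unsigned(full_b)));
    end if;
  end process;
end behavioral;
```

Because the two ports can never address the same word, no write collision handling is needed in this sketch.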
Inference vs. Instantiation
Generic Inferred RAM
Distributed versus Block RAM Inference
• Examples:
1. Distributed RAM with asynchronous read
Distributed RAM with asynchronous read
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_unsigned.all;
entity raminfr is
generic ( bits : integer := 32; -- number of bits per RAM word
addr_bits : integer := 3); -- 2^addr_bits = number of words in RAM
port (clk : in std_logic;
we : in std_logic;
a : in std_logic_vector(addr_bits-1 downto 0);
di : in std_logic_vector(bits-1 downto 0);
do : out std_logic_vector(bits-1 downto 0));
end raminfr;
architecture behavioral of raminfr is
type ram_type is array (2**addr_bits-1 downto 0) of std_logic_vector (bits-1
downto 0);
signal RAM : ram_type;
begin
process (clk)
begin
if (clk'event and clk = '1') then
if (we = '1') then
RAM(conv_integer(unsigned(a))) <= di;
end if;
end if;
end process;
do <= RAM(conv_integer(unsigned(a)));
end behavioral;
Report from Synthesis
Mapping Summary:
Total LUTs: 32 (2%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Logic Distribution:
Number of occupied Slices: 16 out of 768 2%
Number of Slices containing only related logic: 16 out of 16 100%
Number of Slices containing unrelated logic: 0 out of 16 0%
*See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs: 32 out of 1,536 2%
Number used as 16x1 RAMs: 32
Number of bonded IOBs: 69 out of 124 55%
Number of GCLKs: 1 out of 8 12%
Distributed RAM with "false" synchronous read
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_unsigned.all;
entity raminfr is
generic ( bits : integer := 32; -- number of bits per RAM word
addr_bits : integer := 3); -- 2^addr_bits = number of words in RAM
port (clk : in std_logic;
we : in std_logic;
a : in std_logic_vector(addr_bits-1 downto 0);
di : in std_logic_vector(bits-1 downto 0);
do : out std_logic_vector(bits-1 downto 0));
end raminfr;
architecture behavioral of raminfr is
type ram_type is array (2**addr_bits-1 downto 0) of std_logic_vector (bits-1
downto 0);
signal RAM : ram_type;
begin
process (clk)
begin
if (clk'event and clk = '1') then
if (we = '1') then
RAM(conv_integer(unsigned(a))) <= di;
end if;
do <= RAM(conv_integer(unsigned(a)));
end if;
end process;
end behavioral;
Report from Synthesis
Mapping Summary:
Total LUTs: 32 (2%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Number of Slice Flip Flops: 32 out of 1,536 2%
Logic Distribution:
Number of occupied Slices: 16 out of 768 2%
Number of Slices containing only related logic: 16 out of 16 100%
Number of Slices containing unrelated logic: 0 out of 16 0%
*See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs: 32 out of 1,536 2%
Number used as 16x1 RAMs: 32
Number of bonded IOBs: 69 out of 124 55%
Number of GCLKs: 1 out of 8 12%
Block RAM with synchronous read (read through)
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_unsigned.all;
library synplify; -- XST does not need this
entity raminfr is
generic ( bits : integer := 32; -- number of bits per RAM word
addr_bits : integer := 3); -- 2^addr_bits = number of words in RAM
port (clk : in std_logic;
we : in std_logic;
a : in std_logic_vector(addr_bits-1 downto 0);
di : in std_logic_vector(bits-1 downto 0);
do : out std_logic_vector(bits-1 downto 0));
end raminfr;
architecture behavioral of raminfr is
type ram_type is array (2**addr_bits-1 downto 0) of std_logic_vector (bits-1
downto 0);
signal RAM : ram_type;
signal read_a : std_logic_vector(addr_bits-1 downto 0);
attribute syn_ramstyle : string; -- XST does not need this
attribute syn_ramstyle of RAM : signal is "block_ram"; -- XST does not need this
begin
process (clk)
begin
if (clk'event and clk = '1') then
if (we = '1') then
RAM(conv_integer(unsigned(a))) <= di;
end if;
read_a <= a;
end if;
end process;
do <= RAM(conv_integer(unsigned(read_a)));
end behavioral;
Report from Synthesis
Mapping Summary:
Total LUTs: 0 (0%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Logic Distribution:
Number of Slices containing only related logic: 0 out of 0 0%
Number of Slices containing unrelated logic: 0 out of 0 0%
*See NOTES below for an explanation of the effects of unrelated logic
Number of bonded IOBs: 69 out of 124 55%
Number of Block RAMs: 1 out of 4 25%
Number of GCLKs: 1 out of 8 12%
Distributed dual-port RAM with asynchronous read
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.std_logic_arith.all;
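entity raminfr is
generic ( bits : integer := 32; -- number of bits per RAM word
addr_bits : integer := 3); -- 2^addr_bits = number of words in RAM
port (clk : in std_logic;
we : in std_logic;
a : in std_logic_vector(addr_bits-1 downto 0); -- write address and first read address
dpra : in std_logic_vector(addr_bits-1 downto 0); -- second (dual-port) read address
di : in std_logic_vector(bits-1 downto 0);
spo : out std_logic_vector(bits-1 downto 0);
dpo : out std_logic_vector(bits-1 downto 0));
end raminfr;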
architecture syn of raminfr is
type ram_type is array (2**addr_bits-1 downto 0) of std_logic_vector (bits-1
downto 0);
signal RAM : ram_type;
begin
process (clk)
begin
if (clk'event and clk = '1') then
if (we = '1') then
RAM(conv_integer(unsigned(a))) <= di;
end if;
end if;
end process;
spo <= RAM(conv_integer(unsigned(a)));
dpo <= RAM(conv_integer(unsigned(dpra)));
end syn;
Report from Synthesis
Mapping Summary:
Total LUTs: 64 (4%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Logic Distribution:
Number of occupied Slices: 32 out of 768 4%
Number of Slices containing only related logic: 32 out of 32 100%
Number of Slices containing unrelated logic: 0 out of 32 0%
*See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs: 64 out of 1,536 4%
Number used for Dual Port RAMs: 64
(Two LUTs used per Dual Port RAM)
Number of bonded IOBs: 104 out of 124 83%
Number of GCLKs: 1 out of 8 12%
Specification of memory types recognized
by Synplify Pro
SIGNAL memory : vector_array;
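In the block RAM inference example earlier in this lecture, Synplify Pro is steered with the syn_ramstyle attribute attached to the memory signal. The sketch below applies the same attribute to a signal named memory of type vector_array, as on this slide. The entity name, the sizes, and the port names are hypothetical. The alternative values mentioned in the comment are quoted from recollection of the Synplify documentation and should be checked against the tool version in use.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical entity used only to show where the attribute goes.
entity ramstyle_sketch is
  port (clk : in  std_logic;
        we  : in  std_logic;
        a   : in  std_logic_vector(3 downto 0);
        di  : in  std_logic_vector(7 downto 0);
        do  : out std_logic_vector(7 downto 0));
end ramstyle_sketch;

architecture behavioral of ramstyle_sketch is
  type vector_array is array (0 to 15) of std_logic_vector(7 downto 0);
  signal memory : vector_array;
  signal read_a : std_logic_vector(3 downto 0);

  -- Requested implementation style for the memory signal.
  -- Assumed alternatives: "select_ram" (distributed RAM) and "registers".
  attribute syn_ramstyle : string;
  attribute syn_ramstyle of memory : signal is "block_ram";
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (we = '1') then
        memory(conv_integer(unsigned(a))) <= di;
      end if;
      read_a <= a;  -- registered read address, as in the read-through example
    end if;
  end process;
  do <= memory(conv_integer(unsigned(read_a)));
end behavioral;
```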
Generic Inferred ROM
Inferred ROM with asynchronous read
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_unsigned.all;
entity rominfr is
generic ( bits : integer := 10; -- number of bits per ROM word
addr_bits : integer := 3); -- 2^addr_bits = number of words in ROM
port (a : in std_logic_vector(addr_bits-1 downto 0);
do : out std_logic_vector(bits-1 downto 0));
end rominfr;
architecture behavioral of rominfr is
type rom_type is array (2**addr_bits-1 downto 0) of std_logic_vector
(bits-1 downto 0);
constant ROM : rom_type :=
("0000110001",
"0100110100",
"0100110110",
"0110110000",
"0000111100",
"0111110101",
"0100110100",
"1111100111");
begin
do <= ROM(conv_integer(unsigned(a)));
end behavioral;
Report from Synthesis
I/O ports: 13
I/O primitives: 13
IBUF 3 uses
OBUF 10 uses
Mapping Summary:
Total LUTs: 9 (0%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Number of 4 input LUTs: 9 out of 1,536 1%
Logic Distribution:
Number of occupied Slices: 5 out of 768 1%
Number of Slices containing only related logic: 5 out of 5 100%
Number of Slices containing unrelated logic: 0 out of 5 0%
*See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs: 9 out of 1,536 1%
Number of bonded IOBs: 13 out of 124 10%
FPGA Specific Memories (Instantiation)
Distributed RAM 16x1 (1)
library IEEE;
use IEEE.STD_LOGIC_1164.all;
library UNISIM;
use UNISIM.all;
entity RAM_16X1_DISTRIBUTED is
port(
CLK : in STD_LOGIC;
WE : in STD_LOGIC;
ADDR : in STD_LOGIC_VECTOR(3 downto 0);
DATA_IN : in STD_LOGIC;
DATA_OUT : out STD_LOGIC
);
end RAM_16X1_DISTRIBUTED;
Distributed RAM 16x1 (2)
architecture RAM_16X1_DISTRIBUTED_STRUCTURAL of RAM_16X1_DISTRIBUTED is
-- part used by the synthesis tool, Synplify Pro, only; ignored during
-- simulation
attribute INIT : string;
attribute INIT of RAM_16x1s_1: label is "0000";
------------------------------------------------------------------------
component ram16x1s
generic(
INIT : BIT_VECTOR(15 downto 0) := X"0000");
port(
O : out std_ulogic; -- note std_ulogic not std_logic
A0 : in std_ulogic;
A1 : in std_ulogic;
A2 : in std_ulogic;
A3 : in std_ulogic;
D : in std_ulogic;
WCLK : in std_ulogic;
WE : in std_ulogic);
end component;
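Distributed RAM 16x1 (3)
begin
RAM_16x1s_1: ram16x1s
generic map (INIT => X"0000")
port map
(O => DATA_OUT,
A0 => ADDR(0),
A1 => ADDR(1),
A2 => ADDR(2),
A3 => ADDR(3),
D => DATA_IN,
WCLK => CLK,
WE => WE
);
end RAM_16X1_DISTRIBUTED_STRUCTURAL;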
Distributed RAM 16x8 (1)
library IEEE;
use IEEE.STD_LOGIC_1164.all;
library UNISIM;
use UNISIM.all;
entity RAM_16X8_DISTRIBUTED is
port(
CLK : in STD_LOGIC;
WE : in STD_LOGIC;
ADDR : in STD_LOGIC_VECTOR(3 downto 0);
DATA_IN : in STD_LOGIC_VECTOR(7 downto 0);
DATA_OUT : out STD_LOGIC_VECTOR(7 downto 0)
);
end RAM_16X8_DISTRIBUTED;
Distributed RAM 16x8 (2)
architecture RAM_16X8_DISTRIBUTED_STRUCTURAL of RAM_16X8_DISTRIBUTED is
-- part used by the synthesis tool, Synplify Pro, only; ignored during
-- simulation
attribute INIT : string;
--attribute INIT of RAM_16x1s_1: label is "0000";
component ram16x1s
generic(
INIT : BIT_VECTOR(15 downto 0) := X"0000");
port(
O : out std_ulogic;
A0 : in std_ulogic;
A1 : in std_ulogic;
A2 : in std_ulogic;
A3 : in std_ulogic;
D : in std_ulogic;
WCLK : in std_ulogic;
WE : in std_ulogic);
end component;
Distributed RAM 16x8 (3)
begin
GENERATE_MEMORY:
for I in 0 to 7 generate
RAM_16x1_S_1: ram16x1s
generic map (INIT => X"0000")
port map
(O => DATA_OUT(I),
A0 => ADDR(0),
A1 => ADDR(1),
A2 => ADDR(2),
A3 => ADDR(3),
D => DATA_IN(I),
WCLK => CLK,
WE => WE
);
end generate;
end RAM_16X8_DISTRIBUTED_STRUCTURAL;
Distributed ROM 16x1 (1)
library IEEE;
use IEEE.STD_LOGIC_1164.all;
library UNISIM;
use UNISIM.all;
entity ROM_16X1_DISTRIBUTED is
port(
ADDR : in STD_LOGIC_VECTOR(3 downto 0);
DATA_OUT : out STD_LOGIC
);
end ROM_16X1_DISTRIBUTED;
Distributed ROM 16x1 (2)
architecture ROM_16X1_DISTRIBUTED_STRUCTURAL of ROM_16X1_DISTRIBUTED is
-- part used by the synthesis tool, Synplify Pro, only; ignored during
-- simulation
attribute INIT : string;
attribute INIT of rom16x1s_1: label is "F0C1";
component ram16x1s
generic(
INIT : BIT_VECTOR(15 downto 0) := X"0000");
port(
O : out std_ulogic;
A0 : in std_ulogic;
A1 : in std_ulogic;
A2 : in std_ulogic;
A3 : in std_ulogic;
D : in std_ulogic;
WCLK : in std_ulogic;
WE : in std_ulogic);
end component;
Distributed ROM 16x1 (3)
begin
rom16x1s_1: ram16x1s
generic map (INIT => X"F0C1")
port map
(O=>DATA_OUT,
A0=>ADDR(0),
A1=>ADDR(1),
A2=>ADDR(2),
A3=>ADDR(3),
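D=>'0', -- unused write data tied low
WCLK=>'0', -- unused write clock tied low
WE=>'0' -- writes disabled, so the LUT behaves as a ROM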
);
end ROM_16X1_DISTRIBUTED_STRUCTURAL;
Block RAM library components
Component Data Cells Parity Cells Address Bus Data Bus Parity Bus
Component declaration for BRAM (1)
General template of BRAM instantiation (1)
Initializing Block RAMs 256x16
INIT_00 : BIT_VECTOR := X"014A0C0F09170A04076802A800260205002A01C5020A0917006A006800060040";
INIT_01 : BIT_VECTOR := X"000000000000000008000A1907070A1706070A020026014A0C0F03AA09170026";
INIT_02 : BIT_VECTOR := X"0000000000000000000000000000000000000000000000000000000000000000";
INIT_03 : BIT_VECTOR := X"0000000000000000000000000000000000000000000000000000000000000000";
……………………………………………………………………………………………………………………………………
INIT_0F : BIT_VECTOR := X"0000000000000000000000000000000000000000000000000000000000000000")
Mapping of INIT data to addresses (each data word is listed above its address):
INIT_00   014A 0C0F ... 0917 006A 0068 0006 0040
ADDRESS   0F   0E   ... 04   03   02   01   00
INIT_01   0000 0000 ... 014A 0C0F 03AA 0917 0026
ADDRESS   1F   1E   ... 14   13   12   11   10
...
INIT_0F   0000 0000 ... 0000 0000 0000 0000 0000
ADDRESS   FF   FE   ... F4   F3   F2   F1   F0
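The address mapping for the 256x16 case can be checked mechanically: each INIT row holds sixteen 16-bit words, and the word at offset k within a row occupies bits 16k+15 down to 16k, so the rightmost four hex digits belong to the lowest address. The following simulation-only sketch uses the INIT_00 value from this slide; the package name init_word_pkg, the function word_at, and the entity init_word_check are hypothetical names introduced for illustration.

```vhdl
-- Hypothetical helper package: extracts one 16-bit word from an INIT row.
package init_word_pkg is
  subtype init_row is bit_vector(255 downto 0);
  function word_at(row : init_row; k : natural) return bit_vector;
end package init_word_pkg;

package body init_word_pkg is
  function word_at(row : init_row; k : natural) return bit_vector is
  begin
    -- Word k of the row occupies bits 16*k+15 downto 16*k.
    return row(16*k + 15 downto 16*k);
  end function word_at;
end package body init_word_pkg;

use work.init_word_pkg.all;

entity init_word_check is
end entity init_word_check;

architecture sim of init_word_check is
  constant INIT_00 : init_row :=
    X"014A0C0F09170A04076802A800260205002A01C5020A0917006A006800060040";
begin
  -- Addresses 00, 04 and 0F of the 256x16 memory, as listed on the slide.
  assert word_at(INIT_00, 0) = X"0040"
    report "address 00 mismatch" severity failure;
  assert word_at(INIT_00, 4) = X"0917"
    report "address 04 mismatch" severity failure;
  assert word_at(INIT_00, 15) = X"014A"
    report "address 0F mismatch" severity failure;
end architecture sim;
```

For a full address A in this configuration, the row is INIT_(A / 16) and the offset within the row is A mod 16.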
Component declaration for BRAM (2)
VHDL Instantiation Template for RAMB16_S9, S18 and S36
-- Component Declaration for RAMB16_{S9 | S18 | S36}
component RAMB16_{S9 | S18 | S36}
-- synthesis translate_off
generic (
INIT : bit_vector := X"0";
INIT_00 : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
INIT_3E : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
INIT_3F : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
INITP_00 : bit_vector :=
X"0000000000000000000000000000000000000000000000000000000000000000";
INITP_07 : bit_vector :=
X"0000000000000000000000000000000000000000000000000000000000000000";
SRVAL : bit_vector := X"0";
WRITE_MODE : string := "WRITE_FIRST");
-- synthesis translate_on
-- Bus widths below are placeholders; size DO/DI, DOP/DIP, and ADDR for the chosen variant
port (DO : out STD_LOGIC_VECTOR (0 downto 0);
DOP : out STD_LOGIC_VECTOR (1 downto 0);
ADDR : in STD_LOGIC_VECTOR (13 downto 0);
CLK : in STD_ULOGIC;
DI : in STD_LOGIC_VECTOR (0 downto 0);
DIP : in STD_LOGIC_VECTOR (0 downto 0);
EN : in STD_ULOGIC;
SSR : in STD_ULOGIC;
WE : in STD_ULOGIC);
end component;
General template of BRAM instantiation (2)
-- Component Attribute Specification for RAMB16_{S9 | S18 | S36}
-- Component Instantiation for RAMB16_{S9 | S18 | S36}
-- Should be placed in architecture after the begin keyword
RAMB16_{S9 | S18 | S36}_INSTANCE_NAME : RAMB16_{S9 | S18 | S36}
-- synthesis translate_off
generic map (
INIT => bit_value,
INIT_00 => vector_value,
. . . . . . . . . .
INIT_3F => vector_value,
INITP_00 => vector_value,
……………
INITP_07 => vector_value,
SRVAL => bit_value,
WRITE_MODE => user_WRITE_MODE)
-- synthesis translate_on
port map (DO => user_DO,
DOP => user_DOP,
ADDR => user_ADDR,
CLK => user_CLK,
DI => user_DI,
DIP => user_DIP,
EN => user_EN,
SSR => user_SSR,
WE => user_WE);
Block RAM Waveforms – WRITE_FIRST
Block RAM Waveforms – READ_FIRST
Block RAM Waveforms – NO_CHANGE
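The three waveform slides correspond to the three values of the WRITE_MODE generic in the RAMB16 template. They differ only in what the data output shows during a write cycle: WRITE_FIRST presents the newly written data, READ_FIRST presents the previous contents of the addressed word, and NO_CHANGE holds the output at its last value. The following is a behavioral sketch of that difference, not a Xilinx primitive and not a guaranteed inference template; the entity name bram_write_mode and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical behavioral model of the three block RAM write modes.
entity bram_write_mode is
  generic (bits       : integer := 16;
           addr_bits  : integer := 8;
           WRITE_MODE : string  := "WRITE_FIRST");
  port (clk : in  std_logic;
        en  : in  std_logic;
        we  : in  std_logic;
        a   : in  std_logic_vector(addr_bits-1 downto 0);
        di  : in  std_logic_vector(bits-1 downto 0);
        do  : out std_logic_vector(bits-1 downto 0));
end bram_write_mode;

architecture behavioral of bram_write_mode is
  type ram_type is array (2**addr_bits-1 downto 0)
    of std_logic_vector(bits-1 downto 0);
  signal RAM : ram_type;
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (en = '1') then
        if (we = '1') then
          RAM(conv_integer(unsigned(a))) <= di;
          if WRITE_MODE = "WRITE_FIRST" then
            do <= di;                               -- output follows the new data
          elsif WRITE_MODE = "READ_FIRST" then
            do <= RAM(conv_integer(unsigned(a)));   -- old contents: RAM is not yet updated
          end if;                                   -- NO_CHANGE: do keeps its value
        else
          do <= RAM(conv_integer(unsigned(a)));     -- ordinary synchronous read
        end if;
      end if;
    end if;
  end process;
end behavioral;
```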