Xilinx MIG Tutorial

Using external memory with Xilinx Spartan-6 FPGAs, ISE and Core Generator.

I'd quite like to be able to use the external memory that comes on most FPGA development boards! Unfortunately, most of the newer Avnet and Digilent self-test applications are based on the EDK and use Microblaze to exercise various functions. I don't have an EDK license and would rather do things from ISE for now. The problem is that there's scant information on getting external memory working using the Core Generator wizard, other than reading the manual (too long and baffling) or the auto-generated example code (too long and baffling).

After a few days of fiddling around, I've distilled all the information you'll need to get the DDR2 memory working with the Digilent Atlys through ISE.

Generate a core

  • In general, follow Xilinx UG416 for this. I called my component 'ddr2'.
  • Ignore the pin-compatible parts option.
  • Atlys has DDR2 on Bank 3, so set that. Don't use AXI. It's for interconnection with certain processors, so if you're not doing that then you don't need it.
  • Atlys has an MT47H64M16xx-25E.
  • Leave frequency at 3200 ps, all MCB options as default.
  • Port configuration will depend on your application. I suggest working through UG416 so that you understand what it means. For the example code that follows, I used two 64-bit R/W ports.
  • Default timeslot options. Use calibrated inputs: On the Atlys they are RZQ (L6) and ZIO (C2).
  • System clock is single ended (100 MHz) on the Atlys.

On the Xilinx forum, user gloomy suggests that there is a timing parameter mistake in the MIG, which can be overcome by generating a custom part and setting tRAS to 45 ns. I haven't tried this yet. Additionally, enabling the extended MCB performance range allows the speed to be set to 2500 ps (400 MHz), which Digilent claim is supported.

Fix clock generation

The core that MIG generates assumes that the memory clock frequency is equal to the input clock. On the Atlys, this means it'll run at only 100 MHz (though that's doubled to 200 MHz for DDR). This is pretty lame: you paid for more than that!

The PLL parameters are defined using localparam, so the only way to change them, at least in the MIG version that ships with ISE 12.4, is to edit the generated files directly. The recommended way to determine the PLL parameters is to create a new PLL_BASE using the Clocking Wizard core generator. There's no need to actually save the core: just plug in the numbers, view the parameters at the end and cancel.

I wanted to use a 312.50 MHz clock (doubled to 625 MHz, because the MCB uses double the clock speed to generate DDR signals). Read "Modifying the Clock Setup" in UG388. The parameters I ended up with were:

localparam C3_CLKOUT0_DIVIDE       = 1; // sysclk_2x = 625 MHz
localparam C3_CLKOUT1_DIVIDE       = 1; // sysclk_2x_180 = 625 MHz
localparam C3_CLKOUT2_DIVIDE       = 8; // user clock = 78.125 MHz
localparam C3_CLKOUT3_DIVIDE       = 4; // calibration clock = 156.25 MHz - seems a bit high, actually!
localparam C3_CLKFBOUT_MULT        = 25;
localparam C3_DIVCLK_DIVIDE        = 4; // 100 * 25 / 4 = 625 MHz
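The arithmetic behind those comments can be double-checked with a throwaway simulation-only module. This is only a sketch: the module name pll_freq_check and the CLKIN_MHZ parameter are made up for illustration, and it merely repeats the divide and multiply arithmetic rather than modelling the PLL itself.

```verilog
`timescale 1ns / 1ps

// Simulation-only sanity check of the PLL arithmetic. The module name and
// CLKIN_MHZ are made up for this sketch; the other values are the
// localparams listed above.
module pll_freq_check;

	localparam real    CLKIN_MHZ         = 100.0; // Atlys oscillator
	localparam integer C3_CLKFBOUT_MULT  = 25;
	localparam integer C3_DIVCLK_DIVIDE  = 4;
	localparam integer C3_CLKOUT0_DIVIDE = 1;     // sysclk_2x
	localparam integer C3_CLKOUT2_DIVIDE = 8;     // user clock
	localparam integer C3_CLKOUT3_DIVIDE = 4;     // calibration clock

	// VCO frequency = input clock * feedback multiplier / input divider
	localparam real VCO_MHZ = CLKIN_MHZ * C3_CLKFBOUT_MULT / C3_DIVCLK_DIVIDE;

	initial begin
		$display("VCO          : %f MHz", VCO_MHZ);
		$display("sysclk_2x    : %f MHz", VCO_MHZ / C3_CLKOUT0_DIVIDE);
		$display("memory clock : %f MHz", VCO_MHZ / C3_CLKOUT0_DIVIDE / 2.0);
		$display("user clock   : %f MHz", VCO_MHZ / C3_CLKOUT2_DIVIDE);
		$display("calib clock  : %f MHz", VCO_MHZ / C3_CLKOUT3_DIVIDE);
		// Memory clock period in ps, for comparison with C3_MEMCLK_PERIOD
		$display("memclk period: %f ps", 2.0e6 * C3_CLKOUT0_DIVIDE / VCO_MHZ);
		$finish;
	end

endmodule
```

With the values above this works out to 625 MHz for the VCO and sysclk_2x, 312.5 MHz for the memory clock, 78.125 MHz for the user clock, 156.25 MHz for the calibration clock, and a 3200 ps memory clock period.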

There's a discussion on the minimum clock rate necessary to achieve the highest bandwidth in "Clocking".

Change these in your ipcore_dir/ddr2/user_design/rtl/ddr2.v file (replacing ddr2 with whatever you named your core). Note that these probably get overwritten if you regenerate your core.

Instantiation

Using the "View HDL Instantiation Template" command in ISE, find the template for your new core and paste it into your project somewhere, such as the top level of your project or in the functional wrapper you're planning to write. I opted for the latter option.

Here's my complete wrapper module, ddr2_interface. It has two 64-bit R/W ports, so it won't work directly if you're doing something different.


module ddr2_interface(

	// These pins are named after the nets given in the Atlys UCF file.
	output DDR2CLK_P,
	output DDR2CLK_N,
	output DDR2CKE,
	output DDR2RASN,
	output DDR2CASN,
	output DDR2WEN,
	inout DDR2RZQ,
	inout DDR2ZIO,
	output [2:0] DDR2BA, // With the exception of these three. I changed the UCF file to do DDR2BA[0] etc., which is neater

	output [12:0] DDR2A,
	inout [15:0] DDR2DQ,

	inout DDR2UDQS_P,
	inout DDR2UDQS_N,
	inout DDR2LDQS_P,
	inout DDR2LDQS_N,
	output DDR2LDM,
	output DDR2UDM,
	output DDR2ODT,

	input clk, // 100 MHz oscillator = 10ns period (top level pin)

	input wire [2:0] c3_p0_cmd_instr,
	input wire [5:0] c3_p0_cmd_bl,
	input wire [29:0] c3_p0_cmd_byte_addr,
	input wire [7:0] c3_p0_wr_mask,
	input wire [63:0] c3_p0_wr_data,
	output wire [6:0] c3_p0_wr_count,
	output wire [63:0] c3_p0_rd_data,
	output wire [6:0] c3_p0_rd_count,
	input wire c3_p0_rd_en,
	output wire c3_p0_rd_empty,
	input wire c3_p0_wr_en,

	input wire [2:0] c3_p1_cmd_instr,
	input wire [5:0] c3_p1_cmd_bl,
	input wire [29:0] c3_p1_cmd_byte_addr,
	input wire [7:0] c3_p1_wr_mask,
	output wire [6:0] c3_p1_wr_count,
	input wire [63:0] c3_p1_wr_data,
	output wire [63:0] c3_p1_rd_data,
	output wire [6:0] c3_p1_rd_count,
	input wire c3_p1_rd_en,
	output wire c3_p1_rd_empty,
	input wire c3_p1_wr_en,
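	input wire c3_p0_cmd_en,
	input wire c3_p1_cmd_en, // Port 1 command enable; drive it yourself (the test bench only exercises port 0)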

	output wire c3_calib_done,
	input wire reset
	);

	wire c3_rst0; // It's an output
	wire c3_clk0; // 32 MHz clock generated by PLL. Actually, this should be 78 MHz! Investigate this.

	ddr2 # (
		.C3_P0_MASK_SIZE(8),
		.C3_P0_DATA_PORT_SIZE(64),
		.C3_P1_MASK_SIZE(8),
		.C3_P1_DATA_PORT_SIZE(64),
		.DEBUG_EN(1),
		.C3_MEMCLK_PERIOD(3200),
		.C3_CALIB_SOFT_IP("TRUE"),
		.C3_SIMULATION("TRUE"), // FALSE in the template. TRUE is intended for simulation only (it shortens start-up); set it back to FALSE before building for hardware
		.C3_RST_ACT_LOW(0),
		.C3_INPUT_CLK_TYPE("SINGLE_ENDED"),
		.C3_MEM_ADDR_ORDER("BANK_ROW_COLUMN"),
		.C3_NUM_DQ_PINS(16),
		.C3_MEM_ADDR_WIDTH(13),
		.C3_MEM_BANKADDR_WIDTH(3)
	)
	u_ddr2 (

		.c3_sys_clk           	(clk), // 100 MHz system clock
		.c3_sys_rst_n           (reset),                        

		.mcb3_dram_dq           (DDR2DQ),  
		.mcb3_dram_a            (DDR2A),  
		.mcb3_dram_ba           (DDR2BA),
		.mcb3_dram_ras_n        (DDR2RASN),                        
		.mcb3_dram_cas_n        (DDR2CASN),                        
		.mcb3_dram_we_n         (DDR2WEN),                          
		.mcb3_dram_odt          (DDR2ODT),
		.mcb3_dram_cke          (DDR2CKE),                          
		.mcb3_dram_ck           (DDR2CLK_P),                          
		.mcb3_dram_ck_n         (DDR2CLK_N),       
		.mcb3_dram_dqs          (DDR2LDQS_P),                          
		.mcb3_dram_dqs_n        (DDR2LDQS_N),
		.mcb3_dram_udqs         (DDR2UDQS_P),    // for X16 parts                        
		.mcb3_dram_udqs_n       (DDR2UDQS_N),  // for X16 parts
		.mcb3_dram_udm          (DDR2UDM),     // for X16 parts
		.mcb3_dram_dm           (DDR2LDM),

		.c3_clk0		        		(c3_clk0), // This is the user clock output generated by the PLL
		.c3_rst0		        		(c3_rst0),
		.c3_calib_done          (c3_calib_done),
		.mcb3_rzq               (DDR2RZQ),
		.mcb3_zio               (DDR2ZIO),

		// Here we're feeding the user clock into the port FIFOs. You will not want to do this if you are running 
		// the FIFOs (three per port!) in different clock domains, but then you'll need to bring clocks in from elsewhere.
		.c3_p0_cmd_clk                          (c3_clk0), 
		.c3_p0_cmd_en                           (c3_p0_cmd_en),
		.c3_p0_cmd_instr                        (c3_p0_cmd_instr),
		.c3_p0_cmd_bl                           (c3_p0_cmd_bl),
		.c3_p0_cmd_byte_addr                    (c3_p0_cmd_byte_addr),
		.c3_p0_cmd_empty                        (c3_p0_cmd_empty),
		.c3_p0_cmd_full                         (c3_p0_cmd_full),
		.c3_p0_wr_clk                           (c3_clk0), // A clock!
		.c3_p0_wr_en                            (c3_p0_wr_en),
		.c3_p0_wr_mask                          (c3_p0_wr_mask),
		.c3_p0_wr_data                          (c3_p0_wr_data),
		.c3_p0_wr_full                          (c3_p0_wr_full),
		.c3_p0_wr_empty                         (c3_p0_wr_empty),
		.c3_p0_wr_count                         (c3_p0_wr_count),
		.c3_p0_wr_underrun                      (c3_p0_wr_underrun),
		.c3_p0_wr_error                         (c3_p0_wr_error),
		.c3_p0_rd_clk                           (c3_clk0), // A clock!
		.c3_p0_rd_en                            (c3_p0_rd_en),
		.c3_p0_rd_data                          (c3_p0_rd_data),
		.c3_p0_rd_full                          (c3_p0_rd_full),
		.c3_p0_rd_empty                         (c3_p0_rd_empty),
		.c3_p0_rd_count                         (c3_p0_rd_count),
		.c3_p0_rd_overflow                      (c3_p0_rd_overflow),
		.c3_p0_rd_error                         (c3_p0_rd_error),

		.c3_p1_cmd_clk                          (c3_clk0), // A clock!
		.c3_p1_cmd_en                           (c3_p1_cmd_en),
		.c3_p1_cmd_instr                        (c3_p1_cmd_instr),
		.c3_p1_cmd_bl                           (c3_p1_cmd_bl),
		.c3_p1_cmd_byte_addr                    (c3_p1_cmd_byte_addr),
		.c3_p1_cmd_empty                        (c3_p1_cmd_empty),
		.c3_p1_cmd_full                         (c3_p1_cmd_full),
		.c3_p1_wr_clk                           (c3_clk0), // A clock!
		.c3_p1_wr_en                            (c3_p1_wr_en),
		.c3_p1_wr_mask                          (c3_p1_wr_mask),
		.c3_p1_wr_data                          (c3_p1_wr_data),
		.c3_p1_wr_full                          (c3_p1_wr_full),
		.c3_p1_wr_empty                         (c3_p1_wr_empty),
		.c3_p1_wr_count                         (c3_p1_wr_count),
		.c3_p1_wr_underrun                      (c3_p1_wr_underrun),
		.c3_p1_wr_error                         (c3_p1_wr_error),
		.c3_p1_rd_clk                           (c3_clk0), // A clock!
		.c3_p1_rd_en                            (c3_p1_rd_en),
		.c3_p1_rd_data                          (c3_p1_rd_data),
		.c3_p1_rd_full                          (c3_p1_rd_full),
		.c3_p1_rd_empty                         (c3_p1_rd_empty),
		.c3_p1_rd_count                         (c3_p1_rd_count),
		.c3_p1_rd_overflow                      (c3_p1_rd_overflow),
		.c3_p1_rd_error                         (c3_p1_rd_error)
	);

endmodule
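The comment on c3_clk0 above leaves its real frequency as an open question. One way to investigate it in simulation is a small probe that measures the clock period. This is a sketch only: clk_period_probe is a made-up helper, not something MIG generates, and it assumes a 1 ns time unit.

```verilog
`timescale 1ns / 1ps

// Simulation-only helper (hypothetical, not part of the MIG output):
// reports the period of 'clk' once 'enable' is high.
module clk_period_probe (
	input wire clk,
	input wire enable
);

	real t0, t1;

	initial begin
		wait (enable);                  // e.g. c3_calib_done, so the PLL has locked
		@(posedge clk) t0 = $realtime;  // first rising edge
		@(posedge clk) t1 = $realtime;  // next rising edge
		$display("clock period = %f ns (%f MHz)", t1 - t0, 1000.0 / (t1 - t0));
	end

endmodule
```

In a test bench that instantiates the wrapper as uut, it could be hooked up with a hierarchical reference, for example clk_period_probe probe (.clk(uut.c3_clk0), .enable(c3_calib_done));. If the PLL dividers took effect as intended, the reported period should be around 12.8 ns.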

Test bench

The critical (and totally obvious in retrospect) detail about simulating RAM is that you can't simulate RAM without actually having a RAM connected. Fortunately, a RAM model is provided when you generate the core! I found one in ipcore_dir/corename/user_design/sim/ddr2_model_c3.v. This might differ if you chose a different manufacturer's RAM, so have a poke around. Add it to your project.

The example project generated by the core sets up the RAM parameters using some defines given on the command line. I couldn't find a way to set these through the GUI, and editing the .prj file for the test bench seemed a little naff, so I just edited the ddr2_model_parameters_c3.vh include file that came with the model. Add the following lines to the top (customised for your RAM):

	`define x1Gb
	`define sg25E
	`define x16

Now, add a test bench for this module. I called mine tb_ddr2_interface.v.

`timescale 1ns / 1ps

module tb_ddr2_interface;

	// Inputs
	reg clk;
	reg [2:0] c3_p0_cmd_instr;
	reg [5:0] c3_p0_cmd_bl;
	reg [29:0] c3_p0_cmd_byte_addr;
	reg [7:0] c3_p0_wr_mask;
	reg [63:0] c3_p0_wr_data;
	reg [2:0] c3_p1_cmd_instr;
	reg [5:0] c3_p1_cmd_bl;
	reg [29:0] c3_p1_cmd_byte_addr;
	reg [7:0] c3_p1_wr_mask;
	reg [63:0] c3_p1_wr_data;
	reg c3_p0_wr_en, c3_p1_wr_en;
	reg c3_p0_rd_en, c3_p1_rd_en;
	reg c3_p0_cmd_en;
	reg reset;

	// Outputs
	wire DDR2CLK_P;
	wire DDR2CLK_N;
	wire DDR2CKE;
	wire DDR2RASN;
	wire DDR2CASN;
	wire DDR2WEN;
	wire DDR2RZQ;
	wire DDR2ZIO;
	wire [2:0] DDR2BA;
	wire [12:0] DDR2A;
	wire DDR2UDQS_P;
	wire DDR2UDQS_N;
	wire DDR2LDQS_P;
	wire DDR2LDQS_N;
	wire DDR2LDM;
	wire DDR2UDM;
	wire DDR2ODT;
	wire [6:0] c3_p0_wr_count;
	wire [6:0] c3_p1_wr_count;
	wire [63:0] c3_p0_rd_data;
	wire [63:0] c3_p1_rd_data;
	wire [6:0] c3_p0_rd_count;
	wire [6:0] c3_p1_rd_count;
	wire c3_p0_rd_empty, c3_p1_rd_empty;
	wire c3_calib_done;
	wire [15:0] DDR2DQ;

	// Instantiate the Unit Under Test (UUT)
	ddr2_interface uut (
		.DDR2CLK_P(DDR2CLK_P), 
		.DDR2CLK_N(DDR2CLK_N), 
		.DDR2CKE(DDR2CKE), 
		.DDR2RASN(DDR2RASN), 
		.DDR2CASN(DDR2CASN), 
		.DDR2WEN(DDR2WEN), 
		.DDR2RZQ(DDR2RZQ), 
		.DDR2ZIO(DDR2ZIO), 
		.DDR2BA(DDR2BA), 
		.DDR2A(DDR2A), 
		.DDR2DQ(DDR2DQ), 
		.DDR2UDQS_P(DDR2UDQS_P), 
		.DDR2UDQS_N(DDR2UDQS_N), 
		.DDR2LDQS_P(DDR2LDQS_P), 
		.DDR2LDQS_N(DDR2LDQS_N), 
		.DDR2LDM(DDR2LDM), 
		.DDR2UDM(DDR2UDM), 
		.DDR2ODT(DDR2ODT), 
		.clk(clk), 
		.c3_p0_cmd_instr(c3_p0_cmd_instr), 
		.c3_p0_cmd_bl(c3_p0_cmd_bl), 
		.c3_p0_cmd_byte_addr(c3_p0_cmd_byte_addr), 
		.c3_p0_wr_mask(c3_p0_wr_mask), 
		.c3_p0_wr_data(c3_p0_wr_data), 
		.c3_p0_wr_count(c3_p0_wr_count), 
		.c3_p0_rd_data(c3_p0_rd_data), 
		.c3_p0_rd_count(c3_p0_rd_count),
		.c3_p0_wr_en(c3_p0_wr_en),
		.c3_p1_cmd_instr(c3_p1_cmd_instr), 
		.c3_p1_cmd_bl(c3_p1_cmd_bl), 
		.c3_p1_cmd_byte_addr(c3_p1_cmd_byte_addr), 
		.c3_p1_wr_mask(c3_p1_wr_mask), 
		.c3_p1_wr_count(c3_p1_wr_count), 
		.c3_p1_rd_data(c3_p1_rd_data), 
		.c3_p1_wr_data(c3_p1_wr_data), 
		.c3_p1_rd_count(c3_p1_rd_count),
		.c3_p1_wr_en(c3_p1_wr_en),
		.c3_p0_rd_empty(c3_p0_rd_empty),
		.c3_p1_rd_empty(c3_p1_rd_empty),
		.c3_p0_rd_en(c3_p0_rd_en),
		.c3_p1_rd_en(c3_p1_rd_en),
		.c3_p0_cmd_en(c3_p0_cmd_en),
		.c3_calib_done(c3_calib_done),
		.reset(reset)
	);

	PULLDOWN zio_pulldown3 (.O(DDR2ZIO));
	PULLDOWN rzq_pulldown3 (.O(DDR2RZQ));

	// The Micron DDR2 SDRAM simulation model
	ddr2_model_c3 u_mem_c3(
		.ck         (DDR2CLK_P),
		.ck_n       (DDR2CLK_N),
		.cke        (DDR2CKE),
		.cs_n       (1'b0),
		.ras_n      (DDR2RASN),
		.cas_n      (DDR2CASN),
		.we_n       (DDR2WEN),
		.dm_rdqs    ({DDR2UDM,DDR2LDM}),
		.ba         (DDR2BA),
		.addr       (DDR2A),
		.dq         (DDR2DQ),
		.dqs        ({DDR2UDQS_P,DDR2LDQS_P}),
		.dqs_n      ({DDR2UDQS_N,DDR2LDQS_N}),
		.rdqs_n     (),
		.odt        (DDR2ODT)
	);

	localparam DATA_CLOCK = 1/(312.5/4)*1000; // 12.8 ns at 78 MHz

	initial begin
		// Initialize Inputs
		clk = 0;
		c3_p0_cmd_instr = 0;
		c3_p0_cmd_bl = 0;
		c3_p0_cmd_byte_addr = 0;
		c3_p0_wr_mask = 0;
		c3_p0_wr_data = 0;
		c3_p1_cmd_instr = 0;
		c3_p1_cmd_bl = 0;
		c3_p1_cmd_byte_addr = 0;
		c3_p1_wr_mask = 0;
		c3_p1_wr_data = 0;
		c3_p0_wr_en = 0;
		c3_p1_wr_en = 0;
		c3_p0_rd_en = 0;
		c3_p1_rd_en = 0;
		c3_p0_cmd_en = 0;
		reset = 1;

		#200;
		reset = 0;

		// Calibration
		wait (c3_calib_done);
		$display("Calibration done");

		#100;

		// Put some data into FIFO in preparation for writing it.
		c3_p0_wr_en = 1; // Start writing to the FIFO
		c3_p0_wr_data = 20; // .. a 20

		#DATA_CLOCK; c3_p0_wr_data = 21; // .. a 21
		#DATA_CLOCK; c3_p0_wr_data = 22;
		#DATA_CLOCK; c3_p0_wr_data = 23;
		#DATA_CLOCK; c3_p0_wr_data = 24;
		#DATA_CLOCK; c3_p0_wr_data = 25;
		#DATA_CLOCK; c3_p0_wr_data = 26;
		#DATA_CLOCK;
		c3_p0_wr_en = 0; // Stop writing to the FIFO.

		$display("WR count: %d", c3_p0_wr_count);

		// Write to memory
		#DATA_CLOCK;
		c3_p0_cmd_instr = 3'b000; // Prepare to write
		c3_p0_cmd_bl = 6; // a total of seven words (burst length is encoded as words - 1)
		c3_p0_cmd_byte_addr = 16; // to address 16

		#DATA_CLOCK;
		c3_p0_cmd_en = 1; // Write to command FIFO

		#DATA_CLOCK;
		c3_p0_cmd_en = 0; // Stop writing to command FIFO

		// Perform a read, some time later
		#32.5;
		c3_p0_cmd_bl = 15; // Read 16 words (note, 0 will read one word)
		c3_p0_cmd_byte_addr = 16; // From address 16

		#DATA_CLOCK;
		c3_p0_cmd_instr = 3'b001; // Issue a read command

		#DATA_CLOCK;
		c3_p0_cmd_en = 1; // Write to command FIFO

		#DATA_CLOCK;
		c3_p0_cmd_en = 0; // Stop writing to command FIFO

		#DATA_CLOCK;

		wait(~c3_p0_rd_empty); // Wait until the RAM has retrieved the data
		c3_p0_rd_en = 1; // Start reading data
		repeat (18) begin
			#DATA_CLOCK; $display("Read: %d, empty %d", c3_p0_rd_data, c3_p0_rd_empty);
		end
		c3_p0_rd_en = 0; // Stop reading data

		#10000; 

	end

	always begin
		#5; clk = ~clk; // 100 MHz system clock
	end

endmodule

Note that the MIG has udqs and ldqs ports, while the Micron model only has a 2-bit dqs port. For this reason, the two MIG signals are concatenated in this test bench.

When running the simulation, be aware that calibration takes a long time: around 75821 ns of simulated time. With the WebPACK version of ISim, this may actually take a couple of days. There are a few other parameters in the MIG that don't seem to be documented but look promising, such as C_MC_CALIB_BYPASS.
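The test bench drives the FIFOs with # delays, which is fine for a first look but isn't how real user logic would do it. As a rough sketch, the same write sequence (queue the data, then issue one command) could be written as clocked logic along these lines. The module name p0_write_burst, the start input and the 20..26 data pattern are made up for illustration, and the sketch doesn't check the FIFO full flags, which the wrapper above doesn't bring out anyway.

```verilog
// Hypothetical user logic: fill the port 0 write FIFO with seven words, then
// issue a single write command. Port names mirror the wrapper; the module
// name, 'start' and the 20..26 data pattern are made up for illustration.
module p0_write_burst (
	input  wire        clk,        // same clock that feeds c3_p0_wr_clk and c3_p0_cmd_clk
	input  wire        rst,        // synchronous, active high
	input  wire        start,      // request a burst; ignored until calibration is done
	input  wire        calib_done, // c3_calib_done
	output reg         wr_en,      // c3_p0_wr_en
	output reg  [63:0] wr_data,    // c3_p0_wr_data
	output wire [7:0]  wr_mask,    // c3_p0_wr_mask
	output reg         cmd_en,     // c3_p0_cmd_en
	output wire [2:0]  cmd_instr,  // c3_p0_cmd_instr
	output wire [5:0]  cmd_bl,     // c3_p0_cmd_bl
	output wire [29:0] cmd_byte_addr
);

	localparam integer WORDS = 7;

	assign wr_mask       = 8'h00;     // all zeros, as in the test bench
	assign cmd_instr     = 3'b000;    // write
	assign cmd_bl        = WORDS - 1; // burst length is encoded as words - 1
	assign cmd_byte_addr = 30'd16;

	localparam [1:0] IDLE = 2'd0, FILL = 2'd1, CMD = 2'd2;

	reg [1:0] state;
	reg [2:0] count;

	always @(posedge clk) begin
		if (rst) begin
			state   <= IDLE;
			count   <= 3'd0;
			wr_en   <= 1'b0;
			cmd_en  <= 1'b0;
			wr_data <= 64'd0;
		end else begin
			wr_en  <= 1'b0; // deasserted unless set below
			cmd_en <= 1'b0;
			case (state)
				IDLE: if (start && calib_done) begin
					count <= 3'd0;
					state <= FILL;
				end
				FILL: begin
					wr_en   <= 1'b1;          // push one word per clock
					wr_data <= 64'd20 + count;
					count   <= count + 3'd1;
					if (count == WORDS - 1)
						state <= CMD;
				end
				CMD: begin
					cmd_en <= 1'b1;           // one command, after the data is queued
					state  <= IDLE;
				end
				default: state <= IDLE;
			endcase
		end
	end

endmodule
```

The command is deliberately issued only after the last data word has been queued, mirroring the order used in the test bench.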

UCF file

Unfortunately the UCF that Digilent supplies isn't quite enough (though the one buried in the EDK example probably works, if it uses DDR2). Following is the minimum you need for the DDR pads:

NET "DDR2CLK_P"   LOC = "G3"; # Bank = 3, Pin name = IO_L46P_M3CLK,     		  Sch name = DDR-CK_P
NET "DDR2CLK_N"   LOC = "G1"; # Bank = 3, Pin name = IO_L46N_M3CLKN,    		  Sch name = DDR-CK_N
NET "DDR2CKE"    LOC = "H7"; # Bank = 3, Pin name = IO_L53P_M3CKE,       		  Sch name = DDR-CKE
NET "DDR2RASN"   LOC = "L5"; # Bank = 3, Pin name = IO_L43P_GCLK23_M3RASN,		  Sch name = DDR-RAS
NET "DDR2CASN"   LOC = "K5"; # Bank = 3, Pin name = IO_L43N_GCLK22_IRDY2_M3CASN, Sch name = DDR-CAS
NET "DDR2WEN"    LOC = "E3"; # Bank = 3, Pin name = IO_L50P_M3WE,   			  Sch name = DDR-WE
NET "DDR2RZQ"	  LOC = "L6"; # Bank = 3, Pin name = IO_L31P,   				  Sch name = RZQ
NET "DDR2ZIO"	  LOC = "C2"; # Bank = 3, Pin name = IO_L83P,   				  Sch name = ZIO
NET "DDR2BA[0]"    LOC = "F2"; # Bank = 3, Pin name = IO_L48P_M3BA0,        		  Sch name = DDR-BA0
NET "DDR2BA[1]"    LOC = "F1"; # Bank = 3, Pin name = IO_L48N_M3BA1,        		  Sch name = DDR-BA1
NET "DDR2BA[2]"    LOC = "E1"; # Bank = 3, Pin name = IO_L50N_M3BA2,       		  Sch name = DDR-BA2
NET "DDR2A[0]"     LOC = "J7"; # Bank = 3, Pin name = IO_L47P_M3A0,        		  Sch name = DDR-A0
NET "DDR2A[1]"     LOC = "J6"; # Bank = 3, Pin name = IO_L47N_M3A1,        		  Sch name = DDR-A1
NET "DDR2A[2]"     LOC = "H5"; # Bank = 3, Pin name = IO_L49N_M3A2,     			  Sch name = DDR-A2
NET "DDR2A[3]"     LOC = "L7"; # Bank = 3, Pin name = IO_L45P_M3A3,     			  Sch name = DDR-A3
NET "DDR2A[4]"     LOC = "F3"; # Bank = 3, Pin name = IO_L51N_M3A4,     			  Sch name = DDR-A4
NET "DDR2A[5]"     LOC = "H4"; # Bank = 3, Pin name = IO_L44P_GCLK21_M3A5,     	  Sch name = DDR-A5
NET "DDR2A[6]"     LOC = "H3"; # Bank = 3, Pin name = IO_L44N_GCLK20_M3A6,    	  Sch name = DDR-A6
NET "DDR2A[7]"     LOC = "H6"; # Bank = 3, Pin name = IO_L49P_M3A7,    			  Sch name = DDR-A7
NET "DDR2A[8]"     LOC = "D2"; # Bank = 3, Pin name = IO_L52P_M3A8,    			  Sch name = DDR-A8
NET "DDR2A[9]"     LOC = "D1"; # Bank = 3, Pin name = IO_L52N_M3A9,   			  Sch name = DDR-A9
NET "DDR2A[10]"    LOC = "F4"; # Bank = 3, Pin name = IO_L51P_M3A10,        		  Sch name = DDR-A10
NET "DDR2A[11]"    LOC = "D3"; # Bank = 3, Pin name = IO_L54N_M3A11,   			  Sch name = DDR-A11
NET "DDR2A[12]"    LOC = "G6"; # Bank = 3, Pin name = IO_L53N_M3A12,       		  Sch name = DDR-A12
NET "DDR2DQ[0]"    LOC = "L2"; # Bank = 3, Pin name = IO_L37P_M3DQ0,       		  Sch name = DDR-DQ0
NET "DDR2DQ[1]"    LOC = "L1"; # Bank = 3, Pin name = IO_L37N_M3DQ1,       		  Sch name = DDR-DQ1
NET "DDR2DQ[2]"    LOC = "K2"; # Bank = 3, Pin name = IO_L38P_M3DQ2,       		  Sch name = DDR-DQ2
NET "DDR2DQ[3]"    LOC = "K1"; # Bank = 3, Pin name = IO_L38N_M3DQ3,       		  Sch name = DDR-DQ3
NET "DDR2DQ[4]"    LOC = "H2"; # Bank = 3, Pin name = IO_L41P_GCLK27_M3DQ4,        Sch name = DDR-DQ4
NET "DDR2DQ[5]"    LOC = "H1"; # Bank = 3, Pin name = IO_L41N_GCLK26_M3DQ5,        Sch name = DDR-DQ5
NET "DDR2DQ[6]"    LOC = "J3"; # Bank = 3, Pin name = IO_L40P_M3DQ6,       		  Sch name = DDR-DQ6
NET "DDR2DQ[7]"    LOC = "J1"; # Bank = 3, Pin name = IO_L40N_M3DQ7,       		  Sch name = DDR-DQ7
NET "DDR2DQ[8]"    LOC = "M3"; # Bank = 3, Pin name = IO_L36P_M3DQ8,    			  Sch name = DDR-DQ8
NET "DDR2DQ[9]"    LOC = "M1"; # Bank = 3, Pin name = IO_L36N_M3DQ9,        		  Sch name = DDR-DQ9
NET "DDR2DQ[10]"   LOC = "N2"; # Bank = 3, Pin name = IO_L35P_M3DQ10,        	  Sch name = DDR-DQ10
NET "DDR2DQ[11]"   LOC = "N1"; # Bank = 3, Pin name = IO_L35N_M3DQ11,        	  Sch name = DDR-DQ11
NET "DDR2DQ[12]"   LOC = "T2"; # Bank = 3, Pin name = IO_L33P_M3DQ12,       		  Sch name = DDR-DQ12
NET "DDR2DQ[13]"   LOC = "T1"; # Bank = 3, Pin name = IO_L33N_M3DQ13,    		  Sch name = DDR-DQ13
NET "DDR2DQ[14]"   LOC = "U2"; # Bank = 3, Pin name = IO_L32P_M3DQ14,        	  Sch name = DDR-DQ14
NET "DDR2DQ[15]"   LOC = "U1"; # Bank = 3, Pin name = IO_L32N_M3DQ15,        	  Sch name = DDR-DQ15
NET "DDR2UDQS_P"   LOC = "P2"; # Bank = 3, Pin name = IO_L34P_M3UDQS,              Sch name = DDR-UDQS_P
NET "DDR2UDQS_N"   LOC = "P1"; # Bank = 3, Pin name = IO_L34N_M3UDQSN,             Sch name = DDR-UDQS_N
NET "DDR2LDQS_P"   LOC = "L4"; # Bank = 3, Pin name = IO_L39P_M3LDQS,              Sch name = DDR-LDQS_P
NET "DDR2LDQS_N"   LOC = "L3"; # Bank = 3, Pin name = IO_L39N_M3LDQSN,             Sch name = DDR-LDQS_N
NET "DDR2LDM"      LOC = "K3"; # Bank = 3, Pin name = IO_L42N_GCLK24_M3LDM,        Sch name = DDR-LDM
NET "DDR2UDM"      LOC = "K4"; # Bank = 3, Pin name = IO_L42P_GCLK25_TRDY2_M3UDM,  Sch name = DDR-UDM
NET "DDR2ODT"      LOC = "K6"; # Bank = 3, Pin name = IO_L45N_M3ODT,               Sch name = DDR-ODT

NET "DDR2DQ[*]"    IN_TERM = NONE;
NET "DDR2LDQS_P"   IN_TERM = NONE;
NET "DDR2LDQS_N"   IN_TERM = NONE;
NET "DDR2UDQS_P"   IN_TERM = NONE;
NET "DDR2UDQS_N"   IN_TERM = NONE;

NET "DDR2DQ[*]"    IOSTANDARD = SSTL18_II;
NET "DDR2A[*]"     IOSTANDARD = SSTL18_II;
NET "DDR2BA[*]"    IOSTANDARD = SSTL18_II;
NET "DDR2LDQS_P"   IOSTANDARD = DIFF_SSTL18_II;
NET "DDR2LDQS_N"   IOSTANDARD = DIFF_SSTL18_II;
NET "DDR2UDQS_P"   IOSTANDARD = DIFF_SSTL18_II;
NET "DDR2UDQS_N"   IOSTANDARD = DIFF_SSTL18_II;
NET "DDR2CLK_P"    IOSTANDARD = DIFF_SSTL18_II;
NET "DDR2CLK_N"    IOSTANDARD = DIFF_SSTL18_II;
NET "DDR2CKE"      IOSTANDARD = SSTL18_II;
NET "DDR2RASN"     IOSTANDARD = SSTL18_II;
NET "DDR2CASN"     IOSTANDARD = SSTL18_II;
NET "DDR2WEN"      IOSTANDARD = SSTL18_II;
NET "DDR2ODT"      IOSTANDARD = SSTL18_II;
NET "DDR2LDM"      IOSTANDARD = SSTL18_II;
NET "DDR2UDM"      IOSTANDARD = SSTL18_II;
NET "DDR2RZQ"      IOSTANDARD = SSTL18_II;
NET "DDR2ZIO"      IOSTANDARD = SSTL18_II;

CONFIG MCB_PERFORMANCE= STANDARD;
NET "*/memc3_wrapper_inst/mcb_ui_top_inst/mcb_raw_wrapper_inst/selfrefresh_mcb_mode" TIG;
NET "*/c?_pll_lock" TIG;
NET "*/memc?_wrapper_inst/mcb_ui_top_inst/mcb_raw_wrapper_inst/gen_term_calib.mcb_soft_calibration_top_inst/mcb_soft_calibration_inst/CKE_Train" TIG;

The final four lines are taken from ipcore_dir/ddr2/user_design/par/ddr2.ucf and are necessary to prevent your design from failing timing constraints deep within the MIG code.

Synthesis

It builds! I haven't tried it, though.

The code

Here's an example project that runs on the Digilent Atlys. It contains the test bench above and a simple design that endlessly writes and reads a memory location. The test bench has been tested and works. I haven't yet tried it out on the Atlys, since it currently does not do anything particularly interesting.

Note that there are some errors in it - reset will never be deasserted because it relies on c3_clk0, which is never generated because the PLL is held in reset. Change line 64 of atlys_ddr_test.v to reg reset = 0; and design a proper reset controller.
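As a starting point for that reset controller, here is a minimal sketch of a power-on reset generator that runs from the raw oscillator input instead of c3_clk0, so it can't deadlock on the PLL. The module name por_reset and the CYCLES parameter are made up, and it assumes that the register's initial value is honoured when the FPGA is configured.

```verilog
// Hypothetical power-on reset generator. It runs from the raw 100 MHz
// oscillator input, so it does not depend on c3_clk0 or on the PLL locking.
module por_reset #(
	parameter integer CYCLES = 16   // reset length in clk cycles, must be below 256
) (
	input  wire clk,    // board oscillator, not c3_clk0
	output wire reset   // active high, matching C3_RST_ACT_LOW(0)
);

	// Relies on the register initial value being honoured at configuration
	// (an assumption; check your synthesis tool's settings).
	reg [7:0] count = 8'd0;

	// Hold reset until the counter has reached CYCLES, then release it for good
	assign reset = (count != CYCLES);

	always @(posedge clk)
		if (count != CYCLES)
			count <= count + 8'd1;

endmodule
```

Its reset output would then drive the reset input of ddr2_interface. A more careful design would probably also take an external reset button into account, which this sketch leaves out.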