How to design a 64 x 64 bit array multiplier in Verilog?

Asked 23/5, 2013 at 7:45 Answered 4/8, 2019 at 6:40

I know how to design a 4x4 array multiplier , but if I follow the same logic , the coding becomes tedious.

4 x 4 - 16 partial products
64 x 64 - 4096 partial products.

Along with 8 full adders and 4 half adders, How many full adders and half adders do I need for 64 x 64 bit. How do I reduce the number of Partial products? Is there any simple way to solve this ?

Paganini answered 23/5, 2013 at 7:45 Comment(0)

Whenever tediously coding a repetitive pattern you should use a generate statement instead:

module array_multiplier(a, b, y);

parameter width = 8;
input [width-1:0] a, b;
output [width-1:0] y;

wire [width*width-1:0] partials;

genvar i;
assign partials[width-1 : 0] = a[0] ? b : 0;
generate for (i = 1; i < width; i = i+1) begin:gen
    assign partials[width*(i+1)-1 : width*i] = (a[i] ? b << i : 0) +
                                   partials[width*i-1 : width*(i-1)];
end endgenerate

assign y = partials[width*width-1 : width*(width-1)];

endmodule

I've verified this module using the following test-bench: http://svn.clifford.at/handicraft/2013/array_multiplier/array_multiplier_tb.v

EDIT:

As @Debian has asked for a pipelined version - here it is. This time using a for loop in an always-region for the array part.

module array_multiplier_pipeline(clk, a, b, y);

parameter width = 8;

input clk;
input [width-1:0] a, b;
output [width-1:0] y;

reg [width-1:0] a_pipeline [0:width-2];
reg [width-1:0] b_pipeline [0:width-2];
reg [width-1:0] partials [0:width-1];
integer i;

always @(posedge clk) begin
    a_pipeline[0] <= a;
    b_pipeline[0] <= b;
    for (i = 1; i < width-1; i = i+1) begin
        a_pipeline[i] <= a_pipeline[i-1];
        b_pipeline[i] <= b_pipeline[i-1];
    end

    partials[0] <= a[0] ? b : 0;
    for (i = 1; i < width; i = i+1)
        partials[i] <= (a_pipeline[i-1][i] ? b_pipeline[i-1] << i : 0) +
                partials[i-1];
end

assign y = partials[width-1];

endmodule

Note that with many synthesis tools it's also possible to just add (width) register stages after the non-pipelined adder and let the tools register balancing pass do the pipelining.

Fanfaron answered 24/5, 2013 at 11:20 Comment(5)

What if I have to pipeline it ? How would I do that , isn't it a bit more difficult ? – Paganini 29/5, 2013 at 0:53

I've now also added a pipelined version to my answer (see EDIT above). – Fanfaron 29/5, 2013 at 5:0

I know its a long time.Can you re-evaluate your code ? output [width-1:0] y; // shouldn't it be [2*width - 1] y ; – Paganini 22/11, 2014 at 13:40

@Paganini you can of course also create a multiplier that has 2*width output size. But it would need more changes than simply the size of 'y': for unsigned multiply it should be enough to also increase the size of the partials. for signed multiply it becomes a bit more complicated. (as long as input and output width are the same we do not need to distinguish between signed and unsigned multiply) I also like to add that in practice you would use something like a carry save adder tree instead of "normal" adders to implement multiplication. – Fanfaron 23/11, 2014 at 13:48

Wouldn't this result in loss of precision ? Also if I want to write verilog code for multiplication of 64-bit unsigned number , the precision will matter a lot right ? Can we discuss about this in chatroom ? – Paganini 23/11, 2014 at 14:3

[how to] reduce the number of partial products?
A method somewhat common used to be modified Booth encoding:
At the cost of more complicated addend selection, it at least almost halves their number.
In its simplest form, considering groups of three adjacent bits (overlapping by one) from one of the operands, say, b, and selecting 0, a, 2a, -2a or -a as an addend.

Tay answered 21/2, 2019 at 13:46 Comment(0)

-1

The code below generates only half of expected the output.

module arr_multi(a, b, y);      
parameter w = 8;       
input [w-1:0] a, b;                // w-width       
output [(2*w)-1:0] y;              // p-partials       
wire [(2*w*w)-1:0] p;        //assign width as input bits multiplied by 
 output bits
 genvar i;        
 assign p[(2*w)-1 : 0] = a[0] ? b : 0;  //first output size bits          
  generate            
      for (i = 1; i < w; i = i+1)       
         begin        
             assign p[(w*(4+(2*(i-1))))-1 : (w*2)*i] = (a[i]?b<<i :0) + p[(w*(4+(2* 
                (i-2))))-1 :(w*2)*(i-1)];       
         end     
   endgenerate          
  assign y=p[(2*w*w)-1:(2*w)*(w-1)];     //taking last output size bits            
   endmodule

Selie answered 4/8, 2019 at 6:40 Comment(1)

It's always a good idea to put some text in your answer to explain what you're doing, especially when there already is an accepted answer or a highly rated answer. Read how to write a good answer – Toname 4/8, 2019 at 7:14

Recommended topics

Hot tags