Why is the knapsack problem pseudo-polynomial?
118

I know that Knapsack is NP-complete, yet it can be solved by DP. They say that the DP solution is pseudo-polynomial, since it is exponential in the "length of input" (i.e. the number of bits required to encode the input). Unfortunately I did not get it. Can anybody explain that pseudo-polynomial thing to me slowly?

Magyar asked 27/12, 2010 at 12:19 Comment(1)
Possible duplicate of How to understand the knapsack problem is NP-complete?Tug
117

The running time is O(NW) for an unbounded knapsack problem with N items and knapsack of size W. W is not polynomial in the length of the input though, which is what makes it pseudo-polynomial.

Consider W = 1,000,000,000,000. It only takes 40 bits to represent this number, so input size = 40, but the computational runtime uses the factor 1,000,000,000,000, which is O(2^40).

So the runtime is more accurately said to be O(N * 2^(bits in W)), which is exponential.
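To make that concrete, here is a minimal sketch of the unbounded knapsack DP (illustrative code, not from the original answer); the capacity loop is where the factor W comes from:

#include <vector>
#include <algorithm>

// Unbounded knapsack: each item may be taken any number of times.
// best[c] = maximum value achievable with capacity c.
// The capacity loop runs W times, so the total work is O(N*W):
// polynomial in the *value* of W, exponential in its bit-length.
int unboundedKnapsack(const std::vector<int>& weights,
                      const std::vector<int>& values, int W) {
    std::vector<int> best(W + 1, 0);
    for (int c = 1; c <= W; ++c)                                     // W iterations...
        for (int i = 0; i < static_cast<int>(weights.size()); ++i)  // ...times N items
            if (weights[i] <= c)
                best[c] = std::max(best[c], best[c - weights[i]] + values[i]);
    return best[W];
}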

Overbid answered 27/12, 2010 at 12:35 Comment(5)
Link #3 referring to "Complexity of dynamic programming for the 0-1 knapsack problem" is dead.Herd
Sorry, I didn't get it. Let's say if we have an algorithm with time complexity O(N), then we have O(2^(bits in N)), which is exponential? Thanks~Postdoctoral
@LushaLi This helped me: youtube.com/watch?v=9oI7fg-MIpE. If N is an array where each element has a fixed max size input (say each element in the array is no more than 32 bits), and you run a for loop on this array once, then it is a polynomial time algorithm in the input size N of the array. However, if N is an integer and you run a loop over N, then N is unbounded and grows exponentially in the number of bits it takes to represent it. So a simple for loop on N is, actually, exponential. Note that in the case of the array, the size of each element in the array was upper-bounded.Topeka
I wasn’t convinced. There are a lot of algorithms with the same properties that are not “pseudo-polynomial”. Say, what is the complexity of Sieve of Eratosthenes (or any other prime numbers finder)?Callisto
It is a really weird way to describe the running time of an algorithm indeed. If you have an outer-loop with N iterations and an inner-loop with W iterations, then the running time of your algorithm is O(NW)... no? The fact that the input to your program will consist of N integers and an integer W seems to be a separate issue - your algorithm will still make N * W iterations.Residential
38

In most of our problems, we're dealing with large lists of numbers which fit comfortably inside standard int/float data types. Because of the way most processors are built to handle 4-8 byte numbers at a time at no additional cost (relative to numbers that fit in, say, 1 byte), we rarely encounter a change in running time from scaling our numbers up or down within ranges we encounter in real problems - so the dominant factor remains just the sheer quantity of data points, the n or m factors that we're used to.

(You can imagine that the Big-O notation is hiding a constant factor that divides out 32 or 64 bits-per-datum, leaving only the number-of-data-points whenever each of our numbers fits in that many bits or fewer.)

But try reworking other algorithms to act on data sets involving big ints - numbers that require more than 8 bytes to represent - and see what that does to the runtime. The magnitude of the numbers involved always makes a difference, even in other algorithms like binary search, once you expand beyond the buffer of safety conventional processors give us "for free" by handling 4-8 byte batches.

The trick with the Knapsack algorithm that we discussed is that it's unusually sensitive (relative to other algorithms) to the magnitude of a particular parameter, W. Add one bit to W and you double the running time of the algorithm. We haven't seen that kind of dramatic response to changes in value in other algorithms before this one, which is why it might seem like we're treating Knapsack differently - but that's a genuine analysis of how it responds in a non-polynomial fashion to changes in input size.

Magnificence answered 16/10, 2013 at 6:11 Comment(1)
This is the best response I have read so far.Mcmullen
21

The way I understand it, the algorithm would count as polynomial if the capacity input were an array [1, 2, ..., W], which has size W. But the capacity input is not an array of numbers; it is a single integer. Time complexity is about the relationship to the size of the input. The size of an integer is NOT the value of the integer, but the number of bits representing it. We do later convert this integer W into an array [1, 2, ..., W] in the algorithm, leading people to mistakenly think W is the size, but this array is not the input; the integer itself is.

Think of input as "an array of stuff", and the size as "how many stuff in the array". The item input is actually an array of n items, so its size is n. The capacity input is NOT an array with W numbers in it, but a single integer, represented by an array of log(W) bits. Increase its size by 1 (adding one meaningful bit) and W doubles, so the run time doubles, hence the exponential time complexity.
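As a quick illustration (a hypothetical program, not from the answer), the loop below stands in for the DP's capacity loop; every extra bit in W doubles the number of steps:

#include <cstdio>

int main() {
    for (int bits = 1; bits <= 10; ++bits) {
        long long W = (1LL << bits) - 1;    // largest value that fits in `bits` bits
        long long steps = 0;
        for (long long c = 1; c <= W; ++c)  // the O(W) capacity loop
            ++steps;
        // steps == W: adding one bit to W roughly doubles the work
        std::printf("bits=%2d  W=%5lld  steps=%5lld\n", bits, W, steps);
    }
    return 0;
}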

Pawpaw answered 22/1, 2018 at 19:38 Comment(2)
This clears it up a whole lot, thank you.Rossie
This makes no sense. By this logic, a for loop that sums the first n natural numbers is pseudo-polynomial?Isolecithal
12

The Knapsack algorithm's run-time is bound not only to the size of the input (n - the number of items) but also to the magnitude of the input (W - the knapsack capacity): O(nW), which is exponential in the number of bits used to represent W in binary. Computational complexity (i.e. how processing is done inside a computer through bits) is concerned only with the size of the inputs, not their magnitudes/values.

Disregard the value/weight list for a moment. Let's say we have an instance with knapsack capacity 2. W would take two bits in the input data. Now we shall increase the knapsack capacity to 4, keeping the rest of the input. Our input has only grown by one bit, but the computational complexity has increased twofold. If we increase the capacity to 1024, we would have 11 bits of input for W instead of 2, and the complexity has increased by a factor of 512. Time complexity grows exponentially in the size of W in binary (or decimal) representation.

Another simple example that helped me understand the pseudo-polynomial concept is the naive primality testing algorithm. For a given number n we check whether it is evenly divisible by each integer in the range 2..√n, so the algorithm takes about √n − 1 steps. But here, n is the magnitude of the input, not its size.
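A minimal sketch of that trial-division test (illustrative code, not from the answer):

// Trial division: roughly sqrt(n) steps. For a b-bit input that is
// about 2^(b/2) steps -- pseudo-polynomial, since n is the value,
// not the size, of the input.
bool isPrime(long long n) {
    if (n < 2) return false;
    for (long long d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}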

Now the regular O(n) case:

By contrast, searching an array for a given element runs in polynomial time: O(n). It takes at most n steps and here n is the size of the input (the length of the array).
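For contrast, a sketch of that linear scan (illustrative code, not from the answer):

#include <vector>

// Linear search: at most n comparisons, where n is the *size* of the
// input (the array length) -- genuinely polynomial in the input size.
int findIndex(const std::vector<int>& a, int target) {
    for (int i = 0; i < static_cast<int>(a.size()); ++i)
        if (a[i] == target) return i;
    return -1;  // not found
}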

Magnificence answered 16/10, 2013 at 5:50 Comment(1)
So for your last searching example, why don't consider n as binary too? if n=1024, it also only takes 10bits, so shouldn't it be pseudo-polynomial?Vouchsafe
2

Complexity is based on the input. In the knapsack problem, the inputs are the size, the max capacity W, and the profit and weight arrays. We construct the DP table as size * W, so it feels like polynomial time complexity. But the input W is an integer, not an array, and it takes only log(W) bits to encode, so the running time is O(size * 2^(number of bits in W)). If the number of bits increases by 1, the running time doubles. Thus it is exponential, thereby pseudo-polynomial.

Rafe answered 9/7, 2021 at 13:1 Comment(0)
0

Time complexity is defined as a function of the input size (not the input value).

Knapsack has a complexity of O(nW), where n is the size of the coins array. To represent W, we need log2(W) bits, so W = 2^(log2 W). This gives us O(n * 2^(log2 W)) = O(n * 2^(bits in W)), hence exponential in the input size.

By the same logic, the coin change DP runs in O(n * amount) and is likewise pseudo-polynomial (a sketch of it follows the code below). However, the greedy algorithm with a canonical coin system is polynomial:

#include <vector>

// Greedy coin change; assumes the coins are sorted in ascending order
// and form a canonical system (e.g. 1, 5, 10, 25).
int CoinChange(const std::vector<int>& coins, int amount) {
    int coinsUsed = 0;
    for (int i = static_cast<int>(coins.size()) - 1; i >= 0; --i) {
        coinsUsed += amount / coins[i];  // take as many of this coin as fit
        amount %= coins[i];
    }
    return coinsUsed;
}
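For comparison, here is a minimal sketch of the O(n * amount) coin-change DP mentioned above (illustrative code, not from the original answer); the amount-sized table is the pseudo-polynomial factor:

#include <vector>
#include <algorithm>
#include <climits>

// Minimum number of coins to make `amount`.
// The table has amount+1 entries and the outer loop runs `amount`
// times, so the work is O(n * amount): polynomial in the value of
// `amount`, exponential in its bit-length.
int CoinChangeDP(const std::vector<int>& coins, int amount) {
    const int INF = INT_MAX / 2;              // "unreachable" sentinel
    std::vector<int> dp(amount + 1, INF);
    dp[0] = 0;
    for (int a = 1; a <= amount; ++a)         // `amount` iterations...
        for (int c : coins)                   // ...times n coins
            if (c <= a)
                dp[a] = std::min(dp[a], dp[a - c] + 1);
    return dp[amount] >= INF ? -1 : dp[amount];
}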
Minster answered 12/1 at 2:20 Comment(0)
