The Knapsack Problem
Written by Mike James   
Tuesday, 01 July 2014
Article Index
The Knapsack Problem
Knapsack solved

 

Knapsack solved

There are recursive ways to solve the Knapsack problem without explicitly coding a variable number of nested loops but it is instructive to do things this way. 

Before we can create a Knapsack function we need some data to test it.

All we have to do is to load an array with a reasonable set of integers to try to make the target up from and it then calls the recursive SUM function which tries to find a subset:

int N = 50;
Random R = new Random();
int Target = R.Next(1,N * N);
int[] A = new int[N];
for (int i = 0; i < N; i++)
               A[i] = R.Next(1,N * N / 2);

Now we have an integer array loaded with random values and a randomly generated target.  Notice that there might be repeated values in the array - as they say "it is left to the reader ....". 

All we have to do next is write the target so we know what it is and call the Sum function which solves the problem and prints the answer.


Console.WriteLine("Target: " +
                    Target.ToString());
int CurrentSum = 0;
Sum(0,N,ref CurrentSum,A,Target);
Console.WriteLine("done");

Sum is recursive but not difficult to follow – it is just a set of nested for loops one per call of the function and is based on the Loop function listed earlier. 

Now as well as nest and N we also need the CurrentSum, i.e. what the selected elements add up to, the array, and the Target value:

static void Sum(int nest,
         int N,
         ref int CurrentSum,
         int[] A,
         int Target){
 if (nest >= N)return;
 for (int i = nest; i < N; i++) {

It the nesting level is greater than or equal to N it is time to stop the nesting as in the previous example. We are going to generate N nested loops. Also notice that we are starting the loop off from the value of nest rather than 1. The reason is that we only want to generate half of the possible arrangements and this removes repeats or 1234 as 4321. That is the outer loop starts at 1, the next most inner loop at 2 and the next at 3 and so on.  

The CurrentSum has to be passed by ref because it is changed by the function and this change needs to be returned to show what the result so far is. It is also what brings all of the loops to a stop. When we find a set of values that add to the Target that's enough, i.e. we are looking for the first solution not all of the solutions.

The logic in each loop is to first test to see if adding the next selected element will take the CurrentSum over the Target. If so we simply move on to the next element in the loop and see if this is worth considering:

if (CurrentSum + A[i] > Target) continue;

If the CurrentSum is made too big by this element we simply move on to consider the next using the continue. 

If the current element can be added to the CurrentSum without exceeding the Target then we update the CurrentSum and call the function to find the next element using the next nested loop:

CurrentSum += A[i];
Sum(nest+1,N,ref CurrentSum,A,Target);

Notice we increment the nesting level as you should expect. 

When this function returns it has either tested all of the elements and failed or it has found the correct sum, i.e. CurrentSum equals the Target and we test for this and write out the element tested at this level:

if (CurrentSum == Target){
  Console.WriteLine(nest.ToString() + "," +
                    A[i].ToString());
  break;
}

If the CurrentSum is equal to the target we print the current index and the array element that was used in the sum and break out of the loop.

Notice that once the CurrentSum is equal to the target it stays equal to the target and all of the loops come to an end by executing the break.

If the inner loop fails to find a solution we have to correct CurrentSum by subtracting the failed element and moving on to try the next element in the loop.

  CurrentSum -= A[i];
 }
}

If you run it and it finds a solution it prints the elements of the array. If it fails to find a solution you just see the magic word "done" – you can't always make the target up from a subset of the values.

Solving The Bounded Problem

There are minor problems in this program – in particular it can generate repeated values in the set of numbers but this gets increasingly unlikely as N gets bigger. We are not really solving the bounded knapsack problem nor are we solving the unbounded knapsack problem because each element is only tried a maximum of N times and in principle it could take N+1, N+2 and so on repeats to find a solution. 

To change the program so that it doesn't generate any repeats isn't difficult but it is a bit messy. The solution is to set any element in the array that has been used as part of the solution to zero. We don't have to add a test for zero elements because they don't change the outcome when added to the CurrentSum. However we do have to remember to restore the original array value when it has been rejected in the search for a solution at each nested level. Notice that by zeroing the array element at each level this means the element can only be used in one level hence eliminating all repeats.

The code changes needed are slight. For efficiency we need to reject any zero elements as they have been used at a level above the current one:

 if (CurrentSum + A[i] > Target || A[i]==0)
                                   continue;

Next we need to zero the current element and then restore it after the levels below have been completed.

int temp = A[i];
CurrentSum += A[i];
A[i] = 0;
Sum(nest+1, N, ref CurrentSum, A, Target);
A[i] = temp;

Simple but some how messy.

Now all that remains is to fix the random array generator not to create repeats. Look up How not to shuffle - the Knuth Fisher-Yates algorithm.  

For convenience the entire program is;

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace knapsack
{
 class Program
 {
  static void Main(string[] args)
  {
   int N = 50;
   Random R = new Random();
   int Target = R.Next(1, N * N);
   int[] A = new int[N];
   for (int i = 0; i < N; i++)
               A[i] = R.Next(1, N * N / 2);
   Console.WriteLine("Target: " +
   Target.ToString());
   int CurrentSum = 0;
   Sum(0, N, ref CurrentSum, A, Target);
   Console.WriteLine("done");
}

  static void Sum(
   int nest,
   int N,
   ref int CurrentSum,   
   int[] A,
   int Target)
   {
    if (nest >= N) return;

    for (int i = nest; i < N; i++) 
    {
     if (CurrentSum + A[i] > Target ||
            A[i]==0) continue;
     int temp = A[i];
     CurrentSum += A[i];
     A[i] = 0;
     Sum(nest+1, N, ref CurrentSum, A, Target);
     A[i] = temp;
     if (CurrentSum == Target)
     {
      Console.WriteLine(nest.ToString()+","+
                           i.ToString()+","+
                        A[i].ToString());
      break;
     }
    CurrentSum -= A[i];
   }
  }
 }
}

Speed And Heuristics

If you try this program out you will find that at first everything works quickly with answers flashing up after a few seconds. However if you try N=30,000 say then things slow to a stop.

The problem is that the knapsack algorithm slows down as 2^N which makes the problem for 30,000 number about 10^10000 (yes 1 followed by 10,000 zeros) times longer to solve than the 1000 number problem – so don't wait for a solution too long. 

To see this in action try N=50, 500 and 5000. In most cases you will give up waiting for the 5000 case. 

It is the huge number of different combinations that have to be considered that makes the task so difficult.

It is a the definition of NP problems that finding a solution takes a long time but verifying that you have a solution is very quick. In this case you have to search all possible combinations to find a solution but checking that you have a solution is just a matter of adding the values provided up and checking that the sum to the target and are in the array. 

 

However, how do you know that there isn't a short cut that means you don't have to consider every possible combination?

There are algorithms that attempt to solve the problem more quickly. For example there are variations on the greedy algorithm - pick the first number closest to the target, then the next closest and so on. A more sophisticated greedy algorithm is based on dynamic programming. The key factor in all of these algorithms is that they usually find the solution in reasonable time but there is no promise that they will ever find the solution without examining every possible combination. They may be advanced but they are still "heuristics", i.e. algorithms that usually work quicker than an exhaustive search but this isn't guaranteed.

 

knapsack

Uses

The knapsack problem is the basis for one method of public key cryptography.

A set of numbers in a given order is made public and any user can code a message by multiplying each number by a zero or a one according to the messages binary representation.

All of the values are added up to give a single value and this is of course the "target" value in a knapsack problem using the given set of numbers.

Of course you can't decode the message because this would mean solving the knapsack problem for very very large N. However the intended recipient of the message can decode it because they have a "shadow" set of numbers with a special properties that make the solution of the knapsack problem very easy. So anyone can code a message using the public numbers but only the recipient can decode it without solving the full knapsack problem.

Unfortunately a few years ago it was discovered that the need to use a "shadow" set of values made the knapsack problem posed by the public values rather easier to solve than was originally thought – i.e. there was a way of doing the job much faster than 2^N. So now the knapsack code is no longer used but it is still interesting and with a better set of shadow values might even be resurrected one day.

 

 

The program can be downloaded from the CodeBin. Note you have to register with I Programmer to access this.  

Related Articles:

How not to shuffle - the Knuth Fisher-Yates algorithm       

Recursion

Public Key Encryption

 

To be informed about new articles on I Programmer, install the I Programmer Toolbar, subscribe to the RSS feed, follow us on, Twitter, FacebookGoogle+ or Linkedin,  or sign up for our weekly newsletter.

 

Banner


Getting started with Google Maps JavaScript

Google Maps is the basis of all Google Mapping products and it's easy to use in your website or web app. We take a quick look at how to get started and what features you can take advantage of.



Gmail, Spreadsheets and Google App Script

If you have to administer an email list, creating a Google App Script to process email bounces and send the relevant data to a spreadsheet is not only useful but also a good example of using scripts.

 [ ... ]


Other Projects

 

blog comments powered by Disqus

<ASIN:9810221207>

<ASIN:059651610X>

<ASIN:3540613560>



Last Updated ( Wednesday, 02 July 2014 )
 
 

   
RSS feed of all content
I Programmer - full contents
Copyright © 2014 i-programmer.info. All Rights Reserved.
Joomla! is Free Software released under the GNU/GPL License.