Program Purpose: Integration. I am implementing an adaptive quadrature (aka numerical integration) algorithm for high dimensions (up to 100). The idea is to randomly break the volume up into smaller sections by evaluating points using a sampling density proportional to an estimate of the error at that point. Early on I "burn-in" a uniform sample, then randomly choose points according to a Gaussian distribution over the estimated error. In a manner similar to simulated annealing, I "lower the temperature" and reduce the standard deviation of my Gaussian as time goes on, so that low-error points initially have a fair chance of being chosen, but later on are chosen with steadily decreasing probability. This enables the program to stumble upon spikes that might be missed due to imperfections in the error function. (My algorithm is similar in spirit to Markov-Chain Monte-Carlo integration.)
Function Characteristics. The function to be integrated is estimated insurance policy loss for multiple buildings due to a natural disaster. Policy functions are not smooth: there are deductibles, maximums, layers (e.g. zero payout up to 1 million dollars loss, 100% payout from 1-2 million dollars, then zero payout above 2 million dollars) and other odd policy terms. This introduces non-linear behavior and functions that have no derivative in numerous planes. On top of the policy function is the damage function, which varies by building type and strength of hurricane and is definitely not bell-shaped.
Problem Context: Error Function. The difficulty is choosing a good error function. For each point I record measures that seem useful for this: the magnitude of the function, how much it changed as a result of a previous measuremnent (a proxy for the first derivative), the volume of the region the point occupies (larger volumes can hide error better), and a geometric factor related to the shape of the region. My error function will be a linear combination of these measures where each measure is assigned a different weight. (If I get poor results, I will contemplate non-linear functions.) To aid me in this effort, I decided to perform an optimization over a wide range of possible values for each weight, hence the Microsoft Solution Foundation.
What to Optimize: Error Rank. My measures are normalized, from zero to one. These error values are progressively revised as the integration proceeds to reflect recent averages for function values, changes, etc. As a result, I am not trying to make a function that yields actual error values, but instead yields a number that sorts the same as the true error, i.e. if all sampled points are sorted by this estimated error value, they should receive a rank similar to the rank they would receive if sorted by the true error.
Not all points are equal. I care very much if the point region with #1 true error is ranked #1000 (or vice versa), but care very little if the #500 point is ranked #1000. My measure of success is to MINIMIZE the sum of the following over many regions at a point partway into the algorithm's execution:
ABS(Log2(trueErrorRank) - Log2(estimatedErrorRank))
For Log2 I am using a function that returns the largest power of two less than or equal to the number. From this definition, come useful results. Swapping #1 and #2 costs a point, but swapping #2 and #3 costs nothing. This has the effect of stratifying points into power of two ranges. Points that are swapped within a range do not add to the function.
How I Evaluate. I have constructed a class called Rank that does this:
Ranks all regions by true error once.
For each separate set of parameterized weights, it computes the trial (estimated) error for that region.
Sorts the regions by that trial error.
Computes the trial rank for each region.
Adds up the absolute difference of logs of the two ranks and calls this the value of the parameterization, hence the value to be minimized.
C# Code. Having done all that, I just need a way to set up Microsoft Solver Foundation to find me the best parameters. The syntax has me stumped. Here is my C# code that I have so far. In it you will see comments for three problems I have identified. Maybe you can spot even more! Any ideas how to make this work?
public void Optimize()
{
// Get the parameters from the GUI and figures out the low and high values for each weight.
ParseParameters();
// Computes the true rank for each region according to true error.
var myRanker = new Rank(ErrorData, false);
// Obtain Microsoft Solver Foundation's core solver object.
var solver = SolverContext.GetContext();
var model = solver.CreateModel();
// Create a delegate that can extract the current value of each solver parameter
// and stuff it in to a double array so we can later use it to call LinearTrial.
Func<Model, double[]> marshalWeights = (Model m) =>
{
var i = 0;
var weights = new double[myRanker.ParameterCount];
foreach (var d in m.Decisions)
{
weights[i] = d.ToDouble();
i++;
}
return weights;
};
// Make a solver decision for each GUI defined parameter.
// Parameters is a Dictionary whose Key is the parameter name, and whose
// value is a Tuple of two doubles, the low and high values for the range.
// All are Real numbers constrained to fall between a defined low and high value.
foreach (var pair in Parameters)
{
// PROBLEM 1! Should I be using Decisions or Parameters here?
var decision = new Decision(Domain.RealRange(ToRational(pair.Value.Item1), ToRational(pair.Value.Item2)), pair.Key);
model.AddDecision(decision);
}
// PROBLEM 2! This calls myRanker.LinearTrial immediately,
// before the Decisions have values. Also, it does not return a Term.
// I want to pass it in a lambda to be evaluated by the solver for each attempted set
// of decision values.
model.AddGoal("Goal", GoalKind.Minimize,
myRanker.LinearTrial(marshalWeights(model), false)
);
// PROBLEM 3! Should I use a directive, like SimplexDirective? What type of solver is best?
var solution = solver.Solve();
var report = solution.GetReport();
foreach (var d in model.Decisions)
{
Debug.WriteLine("Decision " + d.Name + ": " + d.ToDouble());
}
Debug.WriteLine(report);
// Enable/disable buttons.
UpdateButtons();
}
UPDATE: I decided to look for another library as a fallback, and found DotNumerics (http://dotnumerics.com/). Their Nelder-Mead Simplex solver was easy to call:
Simplex simplex = new Simplex()
{
MaxFunEvaluations = 20000,
Tolerance = 0.001
};
int numVariables = Parameters.Count();
OptBoundVariable[] variables = new OptBoundVariable[numVariables];
//Constrained Minimization on the intervals specified by the user, initial Guess = 1;
foreach (var x in Parameters.Select((parameter, index) => new { parameter, index }))
{
variables[x.index] = new OptBoundVariable(x.parameter.Key, 1, x.parameter.Value.Item1, x.parameter.Value.Item2);
}
double[] minimum = simplex.ComputeMin(ObjectiveFunction, variables);
Debug.WriteLine("Simplex Method. Constrained Minimization.");
for (int i = 0; i < minimum.Length; i++)
Debug.WriteLine(Parameters[i].Key + " = " + minimum[i].ToString());
All I needed was to implement ObjectiveFunction as a method taking a double array:
private double ObjectiveFunction(double[] weights)
{
return Ranker.LinearTrial(weights, false);
}
I have not tried it against real data, but I created a simulation in Excel to setup test data and score it. The results coming back from their algorithm were not perfect, but gave a very good solution.