|
|
Question : Find a function which best suits the data
|
|
Hi,
I have data which when I plot , I get a curve. The data keeps changing and I have to find a function, which essentially satisfies the data. Is there any way in perl, where we can estimate a function to the data obtained. Thanks in advance vijay
|
Answer : Find a function which best suits the data
|
|
Ok. You can create a polynomial based on the points you have. If you have seven points, you will need to create six-degree polynomial. Perhaps your data is best represented not by a single high-order polynomial, but by a sucession of low-order polynomials joined together. That's called a spline.
This is the reference I capture from the book stated earlier:
The two subroutines that follow are used in tandem to generate and evaluate cubic splines. Like poly_fit(), they expect a series of points from which to generate the spline spline_generate() produces what we'll call the spline coefficients: the second derivatives at each of the input points. spline_evaluate() allows you to interpolate along the spline: given a new x value and the spline coefficients, it tells you the corresponding y value. # spline_generate(@points) calculates the coefficients for # the cubic spline that hits all of the points in @points # (provided as an array of arrays, each being an [x, y] pair). # It returns the coefficients as a reference to an array. # sub spline_generate { my @points = @_; my ($i, $delta, $temp, @factors, @coeffs); $coeffs[0] = $factors[0] = 0;
# Decomposition phase of the tridiagonal system of equations for ($i = 1; $i < @points - 1; $i++) { $delta = ($points[$i][0] - $points[$i-1][0]) / ($points[$i+1][0] - $points[$i-1][0]); $temp = $delta * $coeffs[$i-1] + 2; $coeffs[$i] = ($delta - 1) / @points; $factors[$i] = ($points[$i+1][1] - $points[$i][1]) / ($points[$i+1][0] - $points[$i][0]) - ($points[$i][1] - $points[$i-1][1]) / ($points[$i][0] - $points[$i-1][0]); $factors[$i] = ( 6 * $factors[$i] / ($points[$i+1][0] - $points[$i-1][0]) - $delta * $factors[$i-1] ) / $temp; }
# Backsubstitution phase of the tridiagonal system # $coeffs[$#points] = 0; for ($i = @points - 2; $i >= 0; $i--) { $coeffs[$i] = $coeffs[$i] * $coeffs[$i+1] + $factors[$i]; } return \@coeffs; }
# spline_evaluate($x, $coeffs, @points) returns the y-value # for the given x-value, along the spline generated by # $coeffs = spline_generate(@points). # sub spline_evaluate { my ($x, $coeffs, @points) = @_; my ($i, $delta, $mult);
# Which section of the spline are we in? # for ($i = @points - 2; $i >= 1; $i--) { last if $x >= $points[$i][0]; }
$delta = $points[$i+1][0] - $points[$i][0]; $mult = ( $coeffs->[$i]/2 ) + ($x - $points[$i][0]) * ($coeffs->[$i+1] - $coeffs->[$i]) / (6 * $delta); $mult *= $x - $points[$i][0]; $mult += ($points[$i+1][1] - $points[$i][1]) / $delta; $mult -= ($coeffs->[$i+1] + 2 * $coeffs->[$i]) * $delta / 6; return $points[$i][1] + $mult * ($x - $points[$i][0]); }
@points = ( [-1,1], [0,2], [1,-1], [2, 2] ); my $coeffs = spline_generate @points; print "Spline coefficients: @$coeffs\n"; for (my $i = -1; $i <= 3; $i += .5) { printf "[%.2f, %.2f]\n", $i, spline_evaluate($i, $coeffs, @points); } We provide four points (in @points) to spline_generate(). To verify that the resulting spline coefficients are accurate, we step along the x in intervals of 0.5 and determine the value of the cubic spline at that point. Spline coefficients: 0 -7.35483870967742 10.8387096774194 0 [-1.00, 1.00] [-0.50, 1.96] [0.00, 2.00] [0.50, 0.28] [1.00, -1.00] [1.50, -0.18] [2.00, 2.00] [2.50, 4.18] [3.00, 5.00]
If you rather prefer a method that can generate polynomial,
sub poly_fit { my @points = @_; my ($i, $j, $y, @solution); for ($i = 0; $i < @points; $i++) { ($x, $y) = ( @ {$points[$i]} ); for ($j = 0; $j < @points-1; $j++) { $points[$i][$j] = $x ** (@points - $j - 1); } $points[$i][@points-1] = 1; $points[$i][@points] = $y; } return linear_solve( @points ); } What polynomial includes the points (2, 7000), (3, 6000), and (4, 9000)? We can fit any three points to a parabolaa polynomial of degree 2and we can find out those points as follows: @solution = poly_fit( [2,7000], [3,6000], [4,9000] ); print "@solution\n"; # 2000 -11000 21000 This tells us that the parabola hitting those three points is 2000x2-11000x+21000.
|
|
|
|
|