Simultaneous optimization of harvest schedule and measurement strategy

In many recent studies, the value of forest inventory information in the harvest scheduling has been examined. Usually only the profitability of measuring simultaneously all the stands in the area is examined. Yet, it may be more profitable to concentrate the measurement efforts to some subset of them. In this paper, the authors demonstrate that stochastic optimization can be used for defining the optimal measurement strategy simultaneously with the harvest decisions. The results show that without end-inventory constraints, it was most profitable to measure the stands that were just below the medium age. Measuring the oldest stands was not profitable at all. It turned out to be profitable to postpone the measurements until just before the potential harvests. Introducing a strict end-inventory constraint increased the number of stands that could be profitably measured. In this case, also the length of the planning horizon had a clear effect on what stands were profitable to measure. With a 15-year planning horizon, measuring the oldest stands was profitable while with longer planning horizons it was not. The interest rate did not affect the number of stands measured much, but it had a clear effect on the timing of the measurements.


Introduction
Any non-trivial decision typically includes uncertainty concerning the prevailing state of nature (Hirsleifer & Riley 1979). The decision maker (DM) can then either make an optimal choice between different alternatives with the current information or reduce the uncertainty by collecting more information. Value of information (VOI) in decision making (ex ante) can hereby be defined as the difference between the expected value of this choice with and without the information (e.g. Hirshleifer and Riley 1979, Lawrence 1999, Birchler and Bütler 2007, Kangas 2010. Collecting information is only profitable if VOI is greater than the collecting costs. In forestry, a few analyses on VOI have been carried out (e.g. Knoke 2002, Amacher et al. 2005, Kangas et al. 2012. A more common approach has been the cost-plus-loss (CPL) analysis, in which the expected losses due to sub-optimal decisions caused by inaccurate data are added to the total costs of data acquisition (e.g. Holmström et al. 2003, Eid et al. 2004). These two approaches are closely related, however, as the expected losses of CPL are equal to the expected value of perfect information (EVPI) for the remaining uncertainty with a given set of data.
In most cases, VOI has been analysed in a case where the utility of the decision maker only consists of the net present value (NPV) of the decision without any area-level constraints, such as even-flow or end inventory constraints. In that case the losses can be defined at stand level, and it is possible to analyse how the properties of the stands affect the losses (e.g. Eid 2000, Islam et al. 2010). Eid (2000) concluded that it is most profitable to measure the stands which were close to the age where they would be ready for final harvest. On the other hand, measuring very old or very young stands would not be that profitable.
It is also possible to formulate the forest planning problem in such a way that the data acquisition decisions are optimized at the same time as the harvest decisions. For instance, Ståhl (1994) formulated the problem so that the inventory design parameters were to be decided upon at the same time as the harvest decisions were made. Duvemo et al. (2012) formulated a problem where the stands that were eligible to be harvested during the first years of planning period in a tactical plan were selected for a pre-harvest inventory, for which more accurate data was acquired before the final operational decisions. The selection of the stands for a pre-harvest inventory was not optimized, though.
If it were assumed that the uncertainty can be resolved, i.e., perfect information of the forests will become available at some time point, two-stage optimization (also programming with recourse) could be utilized. In the first stage, decisions are carried out under uncertainty, but the second stage decisions are assumed to be carried out under certainty. In a multi-stage problem, new information becomes available after each stage and the uncertainty involved diminishes in time. If the decision of forest inventory in a given stand is optimized at the same time as the harvests, the harvest decision gives the optimal timing of the harvest and the inventory decision means that a stand is either measured or not. This decision depends on the costs of the inventory per hectare. This means that if the stand is measured, optimization with perfect information is carried out for that stand, and if it is not measured, optimization is carried out under uncertainty. An obvious generalization of this problem is that besides the harvest and inventory decisions we also decide about the timing of the inventory. Thus, it means that in each stand it is possible to move from the first-stage decision mode to the second-stage decision mode in the beginning of each period.
In a previous study (Kangas unpublished) the VOI was very small if the DM was assumed to just maximize the NPV without constraints: from 0.5 €/ha to 2.6 €/ha. This was assumed to be partly due to the differences between the alternative options being quite small, when they only affect the period in which the stand is harvested but none of the stands remained unharvested. Partly it was assumed to be due to the growth models used, which were not too sensitive to the errors in the initial input values, as in previous studies the VOIs have been much higher also for unconstrained cases (e.g. Mäkinen et al. 2010). The conclusion was that in a case of no area-level constraints, it would not be profitable to invest on forest inventory, as making an inventory would probably cost more than 2.6 €/ha. The even-flow and end-inventory constraints increased the value of information markedly, up to 344.9 €/ha. However, even in the case of no area-level constraints it could be profitable to invest on measuring a part of the stands. In this study, we examine a) whether it is profitable to make measurements in a part of the stands, b) whether it is useful to postpone the measurements in some stands and c) how introducing an end inventory constraint affects the results. We study this by formulating two optimization problems: 1) a problem, where we optimize simultaneously both the decision to measure or not to measure a stand and the cutting schedule, and 2) a problem, where in addition to the previous we optimize the timing of the measuring of stands, with and without end inventory constraints. We solve the above-mentioned problems using linear optimization and study the effects of inventory errors, costs, interest rate and planning horizon on the decisions. We also analyse the characteristics of the stands selected to be measured in different cases. The uncertainty of forest stand inventory is modelled with a scenario-based approach, where each scenario is assumed equally probable. If the inventory is available, cutting decisions are made for each scenario separately, and if the inventory is unavailable, there is a single cutting decision that is common for all scenarios. From the results of optimization, we are able to draw some conclusions that we believe to be generalizable to other forest estates than the one studied in this paper.

Material
The study site consists of 29 stands and 67.29 hectares located in the municipality of Juuka in Eastern Finland. This estate is part of a larger estate owned by UPM-Kymmene. The stands were measured using a compartment-wise inventory (for details, see Mustonen et al. 2008).
The development of the stands was predicted by a simple stand-level simulator. It consisted of a model for the relative growth of basal area (G) as a function of stand age (T), mean height (Hgm), site index (dominant height at the age of 100 years, H100) and basal area, and of a growth model for the relative growth of dominant height (Hdom) as a function of stand age and dominant height for spruce and pine (Vuokila and Väliaho 1980). The latter model was also used to predict the site index.
Stand volume (V) was calculated with a model V=G·Hgm·f, where f is the form factor. The form factor was predicted using models by Gustavsen & Fagerström (1983) was estimated from the dataset in the beginning of the simulation.
This simulator was used to predict the development of the stands for K ( {3, 6, 9} K Î ) 5-year periods. The alternative decisions available for the DM were the different timing options for clearcut. No thinnings were included into the planning. The net revenues from the clearcut were obtained by multiplying the total volume with the price 35 €/m 3 . Thus, every produced cubic meter was assumed to be equally valuable. The clearcutting was assumed to happen at the end of the period. In the case without an end inventory constraint, the net revenues were discounted with a 3% interest to the beginning of the first period.
In the Monte Carlo simulation, correlated multinormal errors were introduced to the initial variables using the rmvnorm package of R (R Development Core Team 2011). The relative standard error of dominant height was assumed to be 10%, and that of basal area varied from 2.5% to 27.5%. Stand age was assumed to be known without error, but the error in initial height also introduced error to the predicted H100. Correlation was assumed to be 0.1. It highly depends on the method used for acquiring the dataset. In traditional visual forest inventory the correlation was about 0.l calculated from the data used by Haara and Korhonen (2004) and in two ALS-datasets used by Mäkinen et al. (2010) it varied from -0.22 to 0.11. These assumptions resulted with a standard error for volume from 5.2% to 38.2%. All errors were assumed unbiased. For each of the error levels, I = 100 scenarios of stand characteristics were simulated.

The optimization problem formulations
The optimization problem is to maximize the NPV of the forest area. Each of the stands can be measured to make better decisions. The objective is where cjik is the net income in stand j (€/ha), scenario i and period k, aj is the stand area (ha), and 5k d is the discount factor for period k. The decision variables for the decisions without forest inventory information are denoted by xj.k and those for decisions with forest inventory information by xjik, to emphasise the fact that the decisions without the information are common for all scenarios while the decisions with information only concern one specific scenario i. The decision variables here are proportions of stands harvested at each period, which can have only value 0 or 1, meaning that we have integer variables. The variable yj describes the inventory decision, which has value 0 if the stand is not measured and 1 if it is measured, and the inventory has a cost p (€/ha). The constraint (3) indicates that if the stand is not measured, the decision is made without the information and (4) indicates that if the stand is measured a separate decision for each stand j and for each scenario i is made. The total number of planning periods is K, the total number of stands is J and the total number of scenarios is I.
The problem can of course involve also other constraints, for instance, an end inventory constraint as .1 1 11 In a more complicated case we allow the measurements to happen at the beginning of any planning period k. Then, the problem is formulated as .
where pjk is inventory cost discounted from the beginning of period k to the year 1 and yjk is the inventory decision variable, which has value 0 if the stand is not measured at period k and 1 if it is measured. The constraint (7) indicates that whether the stand is measured or not, it can only be cut once within a scenario and constraint (8) indicates that if the stand is not measured before period k, the decision for that period must be made without inventory information.

Solving the Optimization Problems
All the optimization problems were solved using either the lpSolve package in R (R Development Core Team 2011) or the IBM ILOG CPLEX Optimization Studio V12.4 (IBM® ILOG® CPLEX® Optimization Studio Information Center 2011), which is commonly known merely as Cplex. Cplex was running on a computing server with 64 cores and approx. 1032GB of Random-access memory (RAM), whenever the lpSolve package was not able to solve the problems. The R package was running on an ordinary office laptop with 3.48 GB of RAM. Cplex was able to take advantage of extensive parallelization and was simultaneously running parallel on up to 32 cores.
Overall, the R lpSolve package could not solve the problems where the measurement period was also treated as a variable. When the number of periods is K=9, the number of scenarios is I = 100 and the number of stands is J=29, then the optimization problem (2)-(3) has a reasonable number of 2929 constraints, while the optimization problem (6)-(9) has 29100 constraints, i.e., almost ten times as many constraints, even without the end inventory constraint. However, the number of scenarios is clearly the main contributor in the computational cost of the optimization problems: the problems with 100 scenarios are much harder to solve than those with 50 scenarios. Even with the optimization problem (2)-(3), lpSolve was not able to consider more than 50 scenarios because of memory exhaustion.
The Cplex integer optimizer is highly configurable: for the optimization the following non-default parameters were used: falls under 0.05 %. By integer linear optimization theory, the actual optimal value always lies between the objective values of the best integer solution and the overall best solution found so far.
2. Optimizer time limit at 5000 seconds (default 10 ), i.e., approximately 1 hour 23 minutes, which means that after this time the optimization is stopped, no matter what the solution is.
The relative gap tolerance of 0.0005 was adequately small for our analysis, since as the optimal values of e.g., the optimization problem (6) -(9) were between approximately 178 000 € (for interest 10%) and approximately 511 000 € (for interest 0%), the relative mixed integer gap tolerance meant that the optimal value was at most 256 € away. This small a difference was negligible for our analysis.
In addition, we added an extra constraint to the problem, which helped us to get slightly (but not much) better results. The problems were solvable even without this new constraint, but by including the constraint we were able to achieve a slightly better precision. Adding the constraint corresponds to the technique called "model strengthening" used in (e.g., Andalaft et al. 2003) to optimization problem (6)-(9). The meaning of the constraint is that, if the stand j is measured on period k (i.e., 1 jk y = ), then on that period, in at least one scenario, the stand has to be cut (i.e., it must be that 1 1 I jik i x = ³ å . It can be easily seen that if a feasible solution to the optimization problem does not satisfy constraint (11), then the objective function value can be improved by postponing the measuring until the constraint is satisfied. In (Andalaft et al. 2003), the model strengthening was more extensive than in our case and their improvements in computational efficiency were also more drastic than in our case, where the changes were at most 0.0002 difference in the relative gap after the time limit of 5000 seconds had been reached. Interestingly, the effects were greater in problems with a lower interest than in problems with a higher interest.
Optimization problem (6)-(9) with additional constraint (11) was solved for interests between 0 and 10%, by increasing the interest by 1% after each optimization run. A so-called warm start was used to reduce computational time, where the optimal solution of the previous run was given as an initial solution to the next run of the Cplex optimizer. Since the interest only affected the values of the objective functions in the problems solved (as the interest only appears in the objective functions), the warm start provided a feasible solution for the new run.
Cplex runs on the optimization problem (6)-(9) with interest 0%, 1% and 2% terminated within the time limit of 5000 seconds with the relative gap below the limit of 0.0005. This did not happen without the model strengthening constraint (11). The Cplex runs with other interests did not reach the relative gap. However, even in these problems the relative difference was below 0.0012 and the guaranteed absolute maximal difference between the objective value of the found solution and the optimal solution under 293 € for interest 4%. Clearly, this difference is insignificant for our analysis, as the optimal value of the optimization problem with interest at 4% was approximately 236156 €.

Results
In the first analyses (i.e., in optimization problem (2)-(4)) the measurements were only allowed at the beginning of the planning horizon and end inventory constraints were not included. Although the VOI for measuring the whole area in the previous article was as low as 2.6 €/ha with the highest initial variance of volume, measuring some stands was profitable even with costs/ha as high as 18€/ha when the initial variance of volume was assumed to be 23% (Figure 1). It is notable that even when the cost of measuring stands was 0€/ha, it was not profitable to measure 20 out of the 29 stands.
The profitability of measuring stands clearly depended on the age of the stand in the beginning of the planning horizon. It can be noted that the stands that were measured were on average below the mean age, and the stands that were not measured were above the mean age, independent of the cost of the measurements (Figure 2). The higher the measurement costs per hectare, the lower the mean age of the measured stands. As the decisions were made for the whole 30-year planning horizon with the same information, it was profitable to measure quite young stands where the harvests were to be carried out at the last periods.
It was also analyzed how the variance of volume affected the profitability of measurements. In this analysis, the cost of the measurements was fixed as 5€/ha. In this case, with the highest initial variance of volume it was profitable to measure at most five stands (Figure 3). The number of measured stands did not increase when the initial variance of volume was increased from 23% to 38%.
The rest of the analyses have been calculated with Cplex, using a dataset with 100 scenarios and an initial variance of volume 38%, i.e., the highest variance in Figure 3. The measurement costs were fixed to 5€/ha. The same problem as above was solved using a longer planning period (K=9). Lengthening the planning horizon and increasing the number of scenarios increased the number of stands profitably measured from five to six. The most probable reason is that in the youngest stands the decision can be postponed more with the longer planning horizon, and it is not clear anymore that the final harvest will happen at the 6 th period as in the previous case. The possibility of postponing the measurements is often profitable because the measurement costs are discounted the same way as incomes. When postponing the measurements was possible, it was profitable to measure nine stands, of which only one stand in the first period. Two stands were scheduled for measurement only at the 8 th period, but none at the 9 th period: measuring at the last period is not profitable as the only possible decision then is to cut at the last period (Figure 4). This same result can be seen from Figure 5, where the measurement period is plotted with the stand age. In general, measuring the oldest stands was not profitable. The younger the stands were, the longer it was profitable to postpone the measurements. Two young stands make an exception here, as for these stands the best decision was always to harvest as late as possible and, therefore, measuring them was never profitable ( Figure 5).
When the strict end inventory constraint was included, the number of stands that could be profitably measured increased markedly. Using the same interest rate, measurement cost and number of planning periods as without the end inventory constraint, the number of measured stands increased from nine to 18 ( Figure 6). Measuring the additional stands was required to meet the end inventory constraint in all scenarios, as the end inventory constraint was strict. Moreover, many of the measured stands were measured in the beginning of the planning horizon already. In this case, the profitability of measuring the young stands could be seen even more clearly than in the problem without any end inventory constraint: measuring stands over 80 years old was never profitable ( Figure 7). However, when the planning period was only 15 years, i.e. K=3, it turned out that it was profitable to measure the oldest stands, i.e., the result was just the opposite to the result obtained with nine periods (Figure 8).
The interest rate did not much affect the number of measured stands, which varied from 18 to 21 when the interest rate varied from 0% to 5% and the planning horizon was K=9. With the shorter planning horizon K=3, the number of stands to be measured was also smaller: it varied from 15 to 19. In both cases the highest number occurred with the interest rate 2%. However, the interest rate had a marked effect on the timing of the measurements. When the interest rate was zero and there was no end inventory constraint, it was not profitable to measure any of the stands, as the optimal decision was always to harvest in the last period. With the end inventory constraint and a long planning horizon, however, it was profitable to measure 20 stands, and each of them was optimal to measure at the first period. The obvious reason for the result is that since the measurement costs are not discounted, it is equally profitable to measure at any period. When the interest rate was 1%, on the other hand, in ten stands it was profitable to measure the stand in the last period.
Without the end inventory constraint, no stands were measured at the last period, but no end inventory constraint means that it is still useful to measure the stands in the last period, as it has to be decided whether these stands can be harvested at all or not. When the interest rate increases, the measurements tend to move closer to the first period again. For instance, in one example stand the measurement period moved from nine to one as the interest rate increased from 1% to 5% ( Figure   9). The probable reason is that maximizing the net present value with a higher interest rate tends to move also the harvests closer to period one and the measurements need to be taken before the decisions in order to benefit from them.
When the harvest and measurement decisions are carried out at the same time, the VOI is not explicitly calculated. However, when we solved the optimization problem (2)-(4) with a long planning horizon (K=9) and an end inventory constraint so that measurements were not possible, the objective function values were clearly lower than when the measurements were possible. The VOI calculated from the difference then varied from 309.6 €/ha (interest rate 1%) to 430.0 €/ha (interest rate 0%). These values are quite high, because the end inventory constraint was strict. It is notable that the VOI is this high, even though it only comes from a part of the stands, but is averaged over the whole estate. Moreover, the measurement costs have already been deducted from the objective function value, i.e., this VOI is the net worth of the measurements. (1994) was the first to propose a system where the survey design -which dictates the quality of information -was to be decided simultaneously with the harvest decisions. However, due to the complexity of the problem and calculation capacity of computers at the time, no practical applications have been proposed until now. In this paper, we have examined the possibility of making the measurement decisions simultaneously with the actual harvest decisions. In the studied case, we assumed that the measurement decision for each stand is made independently of the others.

Ståhl
Moreover, we assumed that the decision maker can decide the timing of the measurements as well as the timing of the harvests.
The results showed that even when the measurement cost was negligible, it was not profitable to measure all the stands. When the net present value with a 3% interest rate was maximized, the stands in which measurements were not profitable were mostly either young stands with a good growth rate or old stands with a low growth rate. In old stands, the reason was that it was clear that the optimal decision would be to harvest immediately. In young stands, the reason was that it was clear that the final harvest should be postponed as long as possible. Thus, stands with an age below the mean age were most profitable to measure. Including the end inventory constraint made this relationship even more clear, if the planning horizon was long.
With a long planning horizon, the younger the stand, the later the measurements were to be postponed. In one very young stand, however, it was optimal to measure the stand at the 8 th period when the planning horizon was 9 periods, as the stand was in some scenarios mature for the final harvest before the end of the planning horizon. The optimal measurement period was the period in which for at least one or more scenarios the harvest was profitable.
When the planning horizon was short and the problem included the end inventory constraint, the results were just the opposite: it was profitable to measure the oldest stands. One reason for this is that when the planning horizon is short, decisions of timings of harvests are (in relative terms) less important than with a longer planning horizon and, on the other hand, the end inventory constraint is more important than with the longer planning horizon. Moreover, in young stands the harvests will be carried out after the planning horizon and, therefore, it is not profitable to measure them. In that case, it was sensible to measure the old stands in order to ensure meeting the end inventory constraint. Duvemo et al. (2012) presented an application where the stands that were deemed as most probable selections for harvest in the next three-year period were selected to additional measurements. Our study confirms their result to some extent. It is only rational to measure the stands before the potential harvest time. However, if the measurements have no effect on the harvest timing as in the oldest stands in the problem with no end inventory constraint, measurements are not profitable.
Thus, there has to be uncertainty about the decision to make the measurement profitable. Likewise, if harvests are profitable in none of the scenarios in the planning horizon as was the case for the young stands with a 15-year planning horizon, the measurements are not profitable. Thus, if no harvest decisions are made, measurements are not profitable, either. Holmströn et al. (2003) compared 2% and 4% interest rates on a cost-plus-loss analysis. In their study, increasing the interest rate could either reduce or increase the losses. In this study, it was obvious that increasing the interest rate moved both the measurement and harvest decisions closer to the beginning of the planning horizon, which would explain increased losses in the cost-plus-loss analysis.
In our study, we assumed a prior distribution for the initial volume, but assumed that after the stand is measured, the information will be perfect. In a real life situation, this is not the case, but also after the additional measurements, there would be some uncertainty included. Even if the initial volume could be measured perfectly, there would always be uncertainty due to growth and yield predictions involved (Pietilä et al. 2010, Mäkinen et al. 2012). However, the remaining uncertainty does not affect the decisions, as we can interpret that the perfect information used here is the expected value under the remaining uncertainty. In the future, it would be possible to analyse the profitability of an inventory with varying accuracies. To solve that problem, Ståhl (1994) proposed using Bayesian multitemporal decision making. Then it will be possible to analyse the potential posterior distributions of measurements after the measurement. Such formulation would make the optimization problem again harder to solve than in the studied case.