A New Paradigm in Interactive Evolutionary Multiobjective Optimization

.


Introduction
Many multiobjective optimization problems (MOPs) are encountered in real life applications.Due to the conflicting nature of objectives in these problems, often there does not exist a single optimal solution.Instead, there exists a set of Pareto optimal solutions which represent trade-offs among the various objectives.
One family of methods, known as a posteriori methods, solve MOPs by finding a set of solutions which adequately represents the entire set of Pareto optimal solutions [18].Evolutionary algorithms (EAs) have been employed for this with a varying degree of success.An example of such methods is NSGA-II [6], which works well for solving problems with a low number of objectives, but the performance degrades as the number of objectives increases [9].Recent a posteriori EAs have tackled this problem in various ways [14].
However, increasing the number of objectives brings forth new challenges.As the number of objectives increases, the number of solutions required to adequately represent the set of Pareto optimal solutions (which may have an infinite number of solutions) increases exponentially [9,14].Regardless of the number of objectives, only one or few of these solutions are useful to a decision maker (DM) who wants to find and implement the desirable solution.Hence, when using a posteriori methods, computational resources are wasted on finding solutions that are not relevant.If objective function evaluations require time-consuming simulations or physical experiments, this problem is compounded and may lead to a waste of monetary resources as well.Moreover, these algorithms leave the task of choosing the final solution to the DM.As each solution is a vector in a high-dimensional objective space, comparing potentially thousands of solutions is a difficult task.This process can set a high cognitive load on the DM.
As DMs are experts in their domain, they usually have opinions or preferences regarding which solutions are desirable or undesirable to them.The preference information may be elucidated in the form of desirable objective function values, ranking of importances of objectives, pair-wise comparison of solutions and many other techniques [16].Recent advances in EAs try to incorporate this information to limit the scope of search of the EA.As the DM may learn new information about the problem during the solution process, allowing them to change their preferences during the solution process is desirable [11].Methods which allow such change are known as interactive methods [17][18][19]30].Ideally, this leads to less waste of resources as only solutions that are preferable to the DM are focused upon.Moreover, as only a small subset of the Pareto optimal solutions is to be represented at a time, the number of solutions to be shown to the DM is smaller, hence reducing the cognitive load.However, many interactive EAs have problems ranging from addition of hyperparameters to lack of diversity in the population, which can impair the optimization process [1,10].
The concept of utilizing the preferences of a DM in the solution process of an MOP is very popular in the field of multiple criteria decision making [18,19].One of the methods adopted is to use scalarization functions [18,20].These functions utilize the preferences of the DM to map the objective function values of solutions to scalar values, hence converting the MOP to one or more single objective optimization problems.Different scalarization functions interpret the same preference information differently, and may lead to different results [20].Different solutions can hence be obtained by solving multiple scalarization functions with the same preference information, as done in synchronous NIMBUS [21], or by slightly modifying the preference information multiple times and optimizing the same scalarization function, as done in the reference point algorithm [28].
In this paper, we explore the concept of using multiple scalarization functions to create a new space: Preference Incorporated Space (PIS).First, we study the mathematical properties of this new space.More specifically, we study the effect of optimizing in the PIS, introducing a new paradigm in preference based optimization: Interactive Optimization using Preference Incorporated Space (IOPIS) algorithm.The IOPIS algorithm enables us to make use of preference informa-tion with any a posteriori EA in an interactive way, as the preference information is encoded directly in the optimization problem in the PIS.It also enables us to control the dimension of the space in which dominance is judged, equal to the number of chosen scalarization functions.We then introduce the IOPIS algorithm, a modular algorithm that takes a given number of specified scalarization functions, and uses a DM's preferences to convert a generic MOP to an MOP in the preference incorporated space.This can then be solved interactively with any appropriate non-interactive EA together with DM's preferences.As examples, we implement two versions of the new algorithm: IOPIS/RVEA and IOPIS/NSGA-III, where the new problem in the PIS is optimized using decomposition based EAs RVEA [4] and NSGA-III [5], respectively.
The rest of the paper is organized as follows.Section 2 discusses the background of multiobjective optimization, EAs, and scalarization functions.Section 3 discusses the mathematical properties of the PIS and introduces the IOPIS algorithm with a visual explanation.In Section 4, we conduct an experimental study to compare the performances of the two implementations of the IOPIS algorithm with state of the art a posteriori EAs and their interactive variants and discuss the results.Finally, we draw conclusions in Section 5.All implementations and experimental data presented in this paper are open source and publicly available at https://desdeo.it.jyu.fi as a part of the DESDEO framework.

Multiobjective Optimization
An MOP can be defined as: where x = (x 1 , . . ., x n ) T are vectors of decision variables belonging to the feasible set S ⊂ R n .The k (≥ 2) objective functions f i map vectors of S to R. The objective function values f (x) = (f 1 (x), . . ., f k (x)) form objective vectors in the objective space R k .A solution x 1 ∈ S of problem (1) is said to dominate another solution x 2 ∈ S (written as f ) for all i = 1, . . ., k and f j (x 1 ) < f j (x 2 ) for at least one j = 1, . . ., k. Pareto optimal solutions are solutions of the MOP which are not dominated by any other solution in S. For this reason, they are also referred to as non-dominated solutions.Sometimes, it is desirable for DMs to consider a subset of Pareto optimal solutions with bounded trade-offs [18].Such solutions are called properly Pareto optimal solutions.We can define the set of solutions of problem (1), known as a Pareto set, as: where the subscript OS refers to the fact that the set was obtained by considering the objective vectors in the objective space.We can now define an ideal point and a nadir point of problem (1).These points represent the lower and upper bounds of the ranges of the objective function values among the Pareto optimal solutions, respectively.The ideal point z * = (z * 1 , . . ., z * k ) can be calculated as z * i = min x∈S f i (x).The nadir point z nad = (z nad 1 , . . ., z nad k ) can be calculated as z nad i = max x∈P S OS f i (x).It should be noted that calculating the nadir point requires the calculation of the P S OS .Hence, the calculation of the nadir point is tricky in problems with more than two objectives and needs to be estimated [8,18].Any objective vector z is defined to be achievable if z belongs to the set: By definition, the nadir point is an achievable point, while the ideal point is not.

Evolutionary Algorithms
Decomposition-based methods such as NSGA-III [5], RVEA [4], and many variants of MOEA/D [31] have become popular in the evolutionary multiobjective optimization community.These methods decompose the objective space into sections using directional vectors called reference vectors, reference points, or weights.For simplicity, in what follows, we will be using the term reference vectors (RVs).These RVs, usually spread uniformly in the objective space, represent individual single-objective optimization problems.The RVs are typically generated using a simplex lattice design, and the number of RVs is equal to where l is a parameter controlling the density of the RVs.Subsets of the population which lie in the decomposed region associated with an RV (in the objective space) evolve in the direction of that RV based on scalar fitness values calculated using the RV and their objective function values.
As mentioned in the introduction, EAs which approximate the entire Pareto front exhibit many downsides.Methods have been proposed to get around those downsides by incorporating the preferences of the DM in an interactive fashion, see, e.g.[23,26,30].One of the ways to incorporate a DM's preferences in decomposition-based EAs is to manipulate the spread of the RVs to account for the preferences [4,15].In many such methods, the DM is required to provide their preferences in the form of a reference point in the objective space [12,26,27].The components of a reference point are desirable values of each objective function, which may or may not be achievable.Then, uniformly spread RVs are translated towards this point.This translation introduces a new scalar hyperparameter which controls the final spread of the RVs around the reference point.This method introduces a few new problems, though.Firstly, the effect of changing the value of the newly introduced hyperparameter may be difficult for a DM to understand.But an appropriate value for this hyperparameter is important as it has been observed that a small spread of RVs may lead to a degradation in population diversity, which prohibits the convergence of the EA [1,10].

Achievement Scalarizing Functions
As mentioned, scalarization functions are functions that map a vector to a realvalued scalar.The weighted sum function and the Chebyshev function used by MOEA/D and the angle-penalized distance function used by RVEA are examples of scalarization functions [4,31].To be regarded as a good scalarization function, it must have some desirable properties [25].Firstly, the solutions obtained by optimizing the scalarization function should be Pareto optimal.Secondly, these solutions should be satisfactory according to the preferences of a DM, if the preferences are feasible.Finally, any Pareto optimal solution should be discoverable by changing the preferences provided by the DM.
Unfortunately, no single scalarization function satisfies the three conditions concurrently [25].However, if we relax the conditions to only account for properly Pareto optimal solutions, rather than all Pareto optimal solutions, then all three conditions can be satisfied by some scalarization functions.In this paper, we focus on a subclass of scalarization functions, known as achievement scalarizing functions (shortened to achievement function) [29].An achievement function is a continuous function s : R k → R. Achievement functions are characterized by either being strictly increasing and order-representing, or strongly increasing and order-approximating [29].We will focus on the latter kind as they satisfy all three relaxed desirable properties.
Then for any order-approximating achievement function s : R k → R, we have From Theorem 1, it can be concluded that solving the following problem: will lead to a Pareto optimal solution of problem (1) [28,29].
A general formulation of an (order-approximating) achievement function is: where ρ is a small positive scalar and µ i are positive scalars and z ∈ R k is a reference point provided by the DM [24].Minimizing s(f (x), z) has the effect of optimizing problem (1) by sliding a cone along the line z + λµ, where λ ∈ R, so that a minimum (> 0) number of solutions lie in the cone [20].Bounds of the trade-offs in the solutions obtained by ( 5) can be controlled by changing ρ [28].
The general formulation (6) represents achievement functions that can take preferences in other forms, not just reference points [24].Different achievement functions differ in how µ is set, which means they are optimizing along different directions, albeit starting from the same reference point z.Hence, they may lead to different solutions even if the same reference point is provided to them.For the implementation of the IOPIS algorithm, we focus on the GUESS [2] and STOM [22] scalarization functions (based on e.g., [3,20]).For the GUESS function, µ i = z nad i −z i and for the STOM function, µ i = zi −z * i .As µ i > 0 for all i = 1, . . ., k, it follows that for these two achievement functions, z * i < zi < z nad i for all i = 1, . . ., k.Another achievement function of note is the achievement scalarizing function (ASF) used in the reference point method [28], which is used in the experimental study section.For ASF, 3 Optimization in Preference Incorporated Space

Properties of Preference Incorporated Space
Let there be a set of achievement functions s = {s 1 , . . ., s q } with q ≥ 2. Then we can define a PIS as the set {s(f (x), z) ∈ R q }, and a new MOP in the PIS as: Two solutions x 1 and x 2 can now be compared in two spaces.As stated in Section 2.1, a solution x 1 is said to dominate another solution x 2 in the objective space if f (x 1 ) f (x 2 ).A solution x 1 is said to dominate x 2 in the PIS if s(f (x 1 ), z) s(f (x 2 ), z).Similar to (2), we can define the solutions to problem (7), i.e., the Pareto set obtained by optimizing in the PIS as: We modify the desirable properties of scalarization functions as stated in [25] to reflect properties related to the PIS as: 1. Pareto optimal solutions in the PIS remain Pareto optimal in the objective space.2. Pareto optimal solutions in the PIS follow the preference given by the DM in the objective space.3. Any properly Pareto optimal solution of problem (1) can be discovered by changing the reference point of problem (7).
It can be shown that the first condition is true regardless of the choice or number of the achievement functions.
Theorem 2. Let P S P IS be the set of Pareto optimal solutions of problem (7).Let P S OS be the set of Pareto optimal solutions of problem (1).Then, Proof.Suppose x ∈ P S P IS but x ∈ P S OS .Therefore, there exists some x * such that f (x * ) f (x).Thus, according to Theorem 1, s i z(f (x * )) < s i z(f (x)) for all i ∈ {1, . . ., q}.Hence, s z(f (x * )) s z(f (x)), which contradicts x ∈ P S P IS .
The set P S P IS represents the trade-offs between the values of the various achievement functions in problem (7).As the different achievement functions are different interpretations of the same preference information obtained from a DM, the solutions in the set P S P IS represent the trade-offs between those interpretations.Hence, it can be said that solutions obtained by solving problem (7) follow the preferences given by the DM.Moreover, as P S P IS includes solutions which minimize individual achievement functions present in PIS, and as any properly Pareto optimal solution in the objective space can be found using the achievement functions by changing the reference points, it follows that the third condition also holds.Note that these results are valid for all order-approximating scalarization functions, and not just STOM and GUESS functions.
Solving the MOP in the PIS has a few benefits compared to solving the MOP in the objective space.Firstly, we can control the dimension of the PIS, which is equal to the number of achievement functions chosen.This means that, given some number (≥ 2) of achievement functions, we can use any multiobjective EA (or biobjective EA, as PIS can be a two dimensional space) to solve problem (7), regardless of the number of objectives in the original problem.Secondly, controlling the dimension of the optimization problem also gives us an indirect control over the number of function evaluations needed by the EA during the optimization process.This is because the number of solutions required to adequately represent the set of Pareto optimal solutions increases with increasing the dimension of the objective space (for problem (1)) or PIS (for problem (7)).Hence, choosing fewer achievement functions than k is an easy way to reduce the number of function evaluations needed by an EA to solve an optimization problem.Thirdly, by incorporating the preference information in the PIS, we gain the ability to use any non-interactive EA in an interactive fashion.This modularity enables easy use of well-tested EAs without needing to change them to enable interaction with a DM.

The IOPIS algorithm
The IOPIS algorithm takes a formulation of the optimization problem of the form (1) as input.The algorithm also takes as its input a set of achievement functions s.The ideal point z * and the nadir point z nad of the problem are also taken as inputs.As the calculation of the nadir point can be tricky in problems with more than two objectives, approximate values of the nadir point (and ideal point) can also be used.The solutions are generated between these two points.Hence, the DM can use their expertise to give the approximate values of the points within which to search.The interactive solution process begins when these points are shown to the DM.The following four steps are repeated iteratively until the DM has received a satisfactory solution: 1. Preference elicitation: The DM is asked to give their preferences as a reference point based on the information currently available to them.2. Problem creation: Using the original objectives, the known estimates of the ideal and nadir points, the reference point, and the set of achievement functions, a new optimization problem is created in the PIS, as shown in (7). 3. Problem solution: Solve the problem created in the previous step with an EA.If this is the first iteration of the algorithm, start the EA with a new population, generated in a manner specific to the selected EA.In subsequent iterations, the population from the previous iteration is used as the starting population.4. Display solutions: Display the solutions obtained in step 3 to the DM.The DM can indicate the maximum number of solutions to be shown at a time in step 4. If the number of solutions generated in step 3 exceeds the limit, e.g., clustering can be applied before displaying solutions.

Visual Interpretation
Even though the IOPIS algorithm is designed for optimization problems with more than two objectives, a biobjective problem is easily visualizable to demonstrate the algorithm.Here we use the ZDT1 problem [32] to study the effect of the choice of the reference point on the solutions returned by one implementation of the IOPIS algorithm.In this implementation, STOM and GUESS scalarization functions are used with NSGA-III to solve the resulting MOP in the PIS.Four different reference points are given to the algorithm.Each subfigure in Figure 1 shows the corresponding objective vectors returned by the IOPIS algorithm.In each subfigure, the blue dashed curve represents the true Pareto front of the problem and the red point is the reference point.The green line represents the direction along which the STOM function optimizes, whereas the red line represents the direction along which the GUESS function optimizes.1a): There is no solution that achieves values close to the reference point.Minimizing each achievement function individually returns the solution corresponding to the point of intersection of the line representing that achievement function and the Pareto front.As can be seen, the solutions returned by the algorithm include those solutions, and nondominated solutions in between.2. Reference point is on the line joining the ideal and nadir point (Figure 1b):

The reference point is not achievable (Figure
Due to the nature of the chosen achievement functions, only a single solution is returned by the algorithm.This is because if the reference point is on the line connecting the ideal and nadir, both achievement functions optimize along the same line.This behaviour can be changed by choosing a different set of achievement functions to form the PIS, or by shifting the reference point slightly to increase the diversity of the solutions.3. The reference point is achievable and dominated (Figure 1c): The algorithm returns a set of solutions that satisfy the given reference point.As in the first case, optimal solutions of the individual achievement functions are included.4. The reference point is close to the front (Figure 1d): Bringing the reference point closer to the front has the effect of reducing the spread of the solutions returned by the algorithm, hence solutions are returned in a narrower region.
The spread of the solutions is controlled by the position of the reference point.A DM who does not know very well what is realistic may provide a reference point far from the front.In such cases, the algorithm will return a diverse set of solutions (with an exception and possible resolutions discussed in point 2. above).After being provided with such solutions, the DM will have more knowledge about the trade-offs involved among the solutions, and may want to fine-tune their search in a narrow region.This is easily accomplished by providing a reference point closer to the now known region of the front.This methodology of control is similar to the one proposed in the reference point method [28].

Experimental Setup
In this study, two versions of the interactive IOPIS algorithm were implemented.IOPIS/NSGA-III uses NSGA-III to solve the problem in the PIS, while IOPIS/ RVEA uses RVEA.In both implementations, the STOM and GUESS functions are used as achievement functions to form the PIS.These algorithms were compared against a posteriori RVEA [4] and NSGA-III [5].Interactive versions of the two a posteriori algorithms (iRVEA and iNSGA-III) were also implemented and included in this study.The details of iRVEA can be found in [12] and iNSGA-III was implemented in a similar manner.RVEA and NSGA-III were chosen for this study as they have been shown to work well in problems with k > 2 [4,5].Even though the problem in the PIS here is biobjective, the implementations of the IOPIS algorithm use RVEA and NSGA-III to ensure that only the effect of optimizing in the PIS is reflected in the results, and not the choice of the EA.
The algorithms were compared using the DTLZ{2-4} [7] and WFG{1-9} [13] problems, with 3-9 objectives each.The number of variables was kept as 10+k−1, as recommended in [7].For the IOPIS EAs, each component of the nadir point was randomly generated from a halfnormal distribution with the underlying normal distribution centered around 1 and having a scale of 0.15, then being scaled up by a factor equal to the true nadir point components (1 for the DTLZ problems, varying values for the WFG problems).This led to the generation of nadir points with components up to 50% worse than the true nadir point.This was done to test the performance of the IOPIS EAs in cases where only approximate values of the nadir point are available.The true ideal point was provided to the IOPIS EAs, as the calculation of it is relatively simpler.
Each EA was run for four iterations.For each EA, an iteration consisted of a constant number of generations: One of {100, 150, 200, 250} for the DTLZ problems, 100 for the WFG problems (The reason for using only 100 generations per iteration for WFG problems will be discussed in the next subsection.).All other hyperparameters, such as the number of solutions or algorithm specific hyperparameters, were set to values recommended in their respective papers.In each iteration, all interactive EAs received a common reference point randomly generated in a hyperbox with the ideal and nadir points as opposing vertices.The non-interactive EAs were ran through the iterations uninterrupted.Hence 336 tests with the DTLZ problems 1 and 252 tests with the WFG problems 2 were conducted for each of the six EAs.The algorithms were compared based on the optimality and the preferability of the solutions returned at the end of each iteration, and the number of function evaluations conducted.

Experimental Results
The Pareto optimal solutions of the DTLZ2-4 problems form a hypersphere in the objective space, centered around the ideal point, which is the origin, and a radius of one.Hence, calculating the Euclidean norm of the objective vectors of the solutions returned by the EAs is a measure of optimality, with lower values of the norm being closer to the Pareto front.However, these values cannot be compared between problems, nor can they be compared for the same problem but with a different number of objectives.Instead, the median of the norm of the solutions returned at the end of each test was calculated for each of the six EAs.These values were then used to rank each EA from 1 to 6 for every test, lower ranks being given to better (lower) median norm values.A similar procedure was followed for comparing methods based on the preferability of the solutions returned by the EAs.The achievement function used in the reference point method (ASF) [28] was chosen as the metric of preferability.The median ASF values of the solutions were then used to rank the different EAs.The true nadir point was used in the calculation of the ASF values.For the tests involving the WFG 1-9 problems, ranks were only calculated based on median ASF values.The Pareto fronts for these problems are not spherical, hence ranks based on median norm values are not relevant.Heatmaps of the ranks based on ASF and norm are shown in Figures 2a and  2b, respectively.A paired colormap was used in the creation of the heatmaps which gave ranks 1 and 2 a blue hue, ranks 3 and 4 a green hue, and ranks 5 and 6 a red hue.This choice brings forward a clear clustering in the rankings of the 6 EAs.As seen in Figure 2a, iRVEA and iNSGA-III tend to return more preferable solutions than their non-interactive counterparts.This behaviour is expected as RVEA and NSGA-III focus on the entire Pareto front, whereas iRVEA and iNSGA-III focus on a limited region.However, as seen in Figure 2b, the solutions returned by iRVEA were farther away from the Pareto front compared to RVEA.This is because as iRVEA has a much lower diversity of solutions compared to RVEA, which hampers the optimization process.
In both heatmaps, the PIS based EAs get ranks 1 or 2 in most tests, i.e., these algorithms returned solutions that were more preferable, and closer to the Pareto front than the other four algorithms.It should also be noted that IOPIS/NSGA-III performed better than IOPIS/RVEA in most cases.Further investigation of the PIS is required on this.The results obtained on the DTLZ3 problem are also interesting.RVEA returned solutions which were closer to the Pareto front, compared to the other methods.While the IOPIS EAs still outperformed RVEA based on the preferability of the solutions, RVEA outperformed an interactive method iNSGA-III.This is happening as iNSGA-III failed to converge to the Pareto front because of the lack of diversity of the solutions, and got stuck on one of the local fronts of the problem.While there was a correlation between the problem type and the performance of the methods (IOPIS EAs got ranks one or two more often in the WFG problems compared to the DTLZ problems), there was no correlation between the performance of the method and the number of objectives.In the case of the DTLZ problems, the performance was also not dependent on the number of generations per iteration, i.e., there was no improvement in the results after a hundred generations (per iteration).This is why the number of generations per iterations was fixed to 100 for the tests involving the WFG problems.
The final metric of comparison is the number of function evaluations conducted.Given a constant number of generations, the number of function evaluations is linearly correlated with the population size, which is equal to the number of RVs in the EAs considered in this paper.For RVEA, NSGA-III, iRVEA and iNSGA-III, the RVs (and hence the number of function evaluations) increase exponentially with an increasing number of objectives.As the IOPIS algorithms operate in the low-dimensional PIS, the number of reference vectors, and hence the number of function evaluations, is independent of the number of objectives, and significantly lower than that for the other algorithms considered in the study.It should also be noted that for all of the tests, only an approximate nadir point was provided to the IOPIS EAs, and yet the IOPIS EAs obtain better results than the current state of the art algorithms.

Conclusions
A new space PIS, where preferences are incorporated, was proposed as a new paradigm of solving MOPs interactively.This new space makes the creation of interactive EAs very modular, as the algorithm only needs to modify the problem to enable interactivity, rather than the EA itself.As examples, this enabled easy creation of the IOPIS/NSGA-III and IOPIS/RVEA implementations.
The results obtained in the numerical experiments were very promising.The new interactive EAs outperformed standalone NSGA-III and RVEA, as well as their interactive versions.The solutions obtained by the IOPIS EAs were closer to the Pareto optimal front, more preferable based on the reference point and spent less computational resources in the form of function evaluations.Further study of the landscape of the PIS is needed.The effect of choosing different achievement functions, their implications on the interaction mechanism by a DM and the solutions returned by the algorithm also needs to be studied.

Fig. 1 .
Fig. 1.Solutions obtained for various reference points for the ZDT1 problem.

Fig. 2 .
Fig. 2. Heatmaps of ranks of algorithms based on the median ASF value or median norm value of the solutions obtained.