Supporting Flexible Processes Through Recommendations Based on History

Helen Schonenberg, Barbara Weber, Boudewijn van Dongen and Wil van der Aalst

Eindhoven University of Technology, Eindhoven, The Netherlands
{m.h.schonenberg, b.f.v.dongen, w.m.p.v.d.aalst}@tue.nl
Department of Computer Science, University of Innsbruck, Austria
Barbara.Weber@uibk.ac.at

Abstract. In today's fast-changing business environment, flexible Process Aware Information Systems (PAISs) are required to allow companies to rapidly adjust their business processes to changes in the environment. However, increasing flexibility in large PAISs usually leads to less guidance for its users and consequently requires more experienced users. To allow for flexible systems with a high degree of support, intelligent user assistance is required. In this paper we propose a recommendation service, which, when used in combination with flexible PAISs, can support end users during process execution by giving recommendations on possible next steps. Recommendations are generated based on similar past process executions by considering the specific optimization goals. In this paper we also evaluate the proposed recommendation service by means of experiments.

1 Introduction

In today's fast-changing business environment, flexible Process Aware Information Systems (PAISs) are required to allow companies to rapidly adjust their business processes to changes in the environment [7]. PAISs offer promising perspectives and there are several paradigms, e.g., adaptive process management [13], case handling systems [16] and declarative processes [11, 12] (for an overview see [3, 18, 20]). In flexible PAISs, users working on a case, i.e., a process instance, frequently have the option to decide between several activities that are enabled for that case. However, for all flexibility approaches, the user support provided by the PAIS decreases with increasing flexibility (cf. Fig. 1), since more options are available, requiring users to have in-depth knowledge about the processes they are working on. Traditionally, this problem is solved by educating users (e.g., by making them more aware of the context in which a case is executed), or by restricting the PAIS by introducing more and more constraints on the order of activities and thus sacrificing flexibility. Both options, however, are not satisfactory and limit the practical application of flexible PAISs.

In this paper, we present an approach for intelligent user assistance which allows PAISs to overcome this problem and to provide a better balance between flexibility and support. We use event logs of PAISs to gain insights into the process being supported without involving a process analyst, and we propose a tooling framework to provide continuously improving support for users of flexible PAISs. At the basis of our approach lie so-called recommendations. A recommendation provides information to a user about how he should proceed with a partial case (i.e., a case that was started but not completed yet) to achieve a certain goal (e.g., minimizing cycle time, or maximizing profit). In this paper we discuss several methods for calculating log-based recommendations. In addition, we describe the implementation of our approach as a recommendation service and its evaluation.

The remainder of this paper is structured as follows. In Section 2, we present the requirements and an overview of the recommendation service. Then, in Section 3, we define a log-based recommendation service. In Section 4, we describe the experiment we conducted to evaluate whether recommendations indeed help to achieve a particular goal. Finally, we discuss related work in Section 5 and provide conclusions in Section 6.

2 Overview

Fig. 2 illustrates the envisioned support of users of flexible PAISs through a recommendation service. In general, each business process to be supported is described as a process model in the respective PAIS. We consider both imperative and declarative process models. In fact, our approach is most useful when the process model provides the user a lot of freedom to manoeuvre, i.e., multiple activities are enabled during execution of a case. At run-time, cases are created and executed considering the constraints imposed by the process model. In addition, the PAIS records information about executed activities in event logs. Typically, event logs contain information about start and completion of activities, their ordering, the resources which executed them and the case they belong to [1].

As illustrated in Fig. 2, the recommendation service is initiated by a request from the user for recommendations on possible next activities to execute. In this request, the user sends the recommendation service information about the partially executed case, i.e., (1) the currently enabled activities, and (2) the history of executed activities, which we call the partial trace. Information about the partial trace is required because the decision which activities to perform next for a particular case usually depends on the activities already performed for this case. In addition, only enabled activities are considered to ensure that no recommendations are made that violate the constraints imposed by the process model. The recommendation service then provides the PAIS a recommendation result, i.e., an ordering of recommendations where each recommendation refers to one activity and some quality attributes (e.g., expected outcome) explaining the recommendation. Recommendations are ordered such that the first recommendation in the list is most likely to help the user achieve his goal, i.e.,
optimizing a certain target, such as profit, cost, or cycle time. Different users can have different targets, resulting in different recommendations.

Fig. 1. PAIS trade-offs [5]. (Figure labels: ad-hoc workflow, groupware, production workflow, case handling; axes: flexibility and support, from low to high.)

Fig. 2. An overview of the recommendation service. (Figure labels: Recommendation Request with partial case and enabled activities; Recommendation Result with ordering of enabled activities; Optimization goal; Event log; logs; uses.)

As an example we describe a fictitious process of applying for a building permit at a town hall. Initially, the employee has to do several tasks: (A) bill the registration fee, (B) register the application details, (C) initiate the permission procedure, (D) announce the application in the local newspaper, and (E) inform the applicant. The employee can decide in which order to execute these tasks. Ideally, the employee finishes them as soon as possible. All tasks have a fixed duration. However, tasks B and C use the same database application, and if B is directly followed by C, the combined duration of the two tasks is much shorter: there is no closing time for B and no set-up time for C, and C can use the data provided by B without data re-entry. The recommendation service can guide employees towards the faster order of tasks.

In this simple example the use of recommendations seems to be overkill, as the user only has to select among a limited set of options. In real-life flexible processes of increasing complexity, there are so many options for users that user support becomes fundamental. At the same time, giving recommendations based on knowledge extracted from execution logs can provide knowledge that was not available during the design of the process.

3 Log-Based Recommendation Service

In this section, we present a concrete definition of a log-based recommendation service for providing users with recommendations on next possible activities to execute. Recommendations for an enabled activity provide predictive information about the user goal, based on observations from the past, i.e., fully completed traces accompanied by their target value (e.g., cost, cycle time, or profit), that have been stored in an event log. The log-based recommendation service requires the presence of an event log that contains such information about cases that have been executed for a certain process.
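
The request and result exchanged between a PAIS and the recommendation service, as described in Section 2, can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the class names, the single `expected_outcome` quality attribute, and the cycle-time numbers are hypothetical and are not taken from the implementation discussed in this paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class RecommendationRequest:
    """What the PAIS sends: the partial trace and the enabled activities."""
    partial_trace: Tuple[str, ...]
    enabled: FrozenSet[str]


@dataclass(frozen=True)
class Recommendation:
    """One enabled activity plus one quality attribute (hypothetical field)."""
    activity: str
    expected_outcome: float


def order_result(recs: List[Recommendation], minimize: bool = True) -> List[Recommendation]:
    """Order recommendations so that the most promising activity comes first."""
    return sorted(recs, key=lambda r: r.expected_outcome, reverse=not minimize)


# Building-permit example: A has been executed, B to E are still enabled.
request = RecommendationRequest(("A",), frozenset({"B", "C", "D", "E"}))

# Expected cycle times are made-up illustration values, not measured data.
candidates = [
    Recommendation("C", 14.0),
    Recommendation("B", 9.5),
    Recommendation("D", 12.0),
    Recommendation("E", 12.0),
]
result = order_result(candidates)  # goal assumed here: minimize cycle time
print([r.activity for r in result])
```

With a goal that is to be maximized (e.g., profit), the same sketch would be called with `minimize=False`.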

3.1 Preliminaries

Let $A$ be a set of activities. $A^*$ denotes the set of finite sequences over $A$. A trace $\sigma \in A^*$ is a finite sequence of activities, where $|\sigma| = n$ is the length of the sequence. Sequences are denoted as $\sigma = \langle a_1, a_2, \ldots, a_n \rangle$, and we write $\sigma(i) = a_i$ for all $1 \le i \le n$. On traces, we define the standard set of operators.

Definition 1 (Trace operators). Let $\sigma : \{1, \ldots, n\} \to A$ and $\sigma' : \{1, \ldots, m\} \to A$ be traces with $\sigma = \langle a_1, a_2, \ldots, a_n \rangle$ and $\sigma' = \langle b_1, b_2, \ldots, b_m \rangle$.

\[
\begin{array}{ll}
\text{Prefix} & \sigma \le \sigma' \iff n \le m \,\wedge\, \forall_{1 \le i \le n}\; a_i = b_i \\
\text{Concatenation} & \sigma \frown \sigma' = \langle a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_m \rangle \\
\text{Membership} & a \in \sigma \iff \exists_{1 \le i \le n}\; a_i = a \\
\text{Parikh vector} & \mathit{par}(\sigma)(a) = \#_{1 \le i \le n}\; a_i = a
\end{array}
\]

The Parikh vector $\mathit{par}(\sigma)(a)$ denotes the number of occurrences of $a$ in a trace $\sigma$, e.g., $\mathit{par}(\langle a, b, c, a, b, c, d \rangle)(a) = 2$.

For multi-sets (bags), we introduce standard notation to denote the universe of multi-sets over a given set. Let $S$ be a set; then the universe of multi-sets over $S$ is denoted by $\mathcal{B}(S)$. An element $X \in \mathcal{B}(S)$, with $X : S \to \mathbb{N}$, is a multi-set, where for all $s \in S$, $X(s)$ denotes the number of occurrences of $s$ in $X$. We will use $[a, b^2, c^3]$ to denote the multi-set of one $a$, two $b$'s and three $c$'s, as a shorthand for the multi-set $X \in \mathcal{B}(A)$ where $A = \{a, b, c\}$, $X(a) = 1$, $X(b) = 2$, $X(c) = 3$. Furthermore, multi-set operators such as union $\uplus$, intersection $\cap$, and sub-multi-set $\sqsubseteq$, $\sqsubset$ are defined in a straightforward way and can handle a mixture of sets and multi-sets.

Definition 2 (Event log). Let $A$ be a set of activities. An event log $L \in \mathcal{B}(A^*)$ is a multi-set of traces referring to the activities in $A$.

Recall that each recommendation contains predictive information regarding the user goal. For now, we assume that this goal can be captured by a function on a trace, i.e., each trace $\sigma$ in an event log has a target value (e.g., cost, cycle time, or profit) attached to it.

Definition 3 (Target Function). Let $A$ be a set of activities and $\sigma \in A^*$ a sequence of activities. We define $\tau(\sigma) \in \mathbb{R}^+$ to represent the target value of the sequence $\sigma$.

Note that $\tau$ is not a function, as similar sequences might have different values attached to them. However, $\tau$ is total, i.e., it provides a value for all sequences.

3.2 Recommendations

A recommendation is initiated by a recommendation request, which consists of a partial trace and a set of enabled activities. Formally, we define a recommendation request as follows.

Definition 4 (Recommendation request). Let $A$ be a set of activities and $\rho \in A^*$ a partial trace. Furthermore, let $E \subseteq A$ be a set of enabled activities. We call $r = (\rho, E)$ a recommendation request.

An activity accompanied by predictive information regarding the user goal is called a recommendation. For each enabled activity, we determine the expected target value when doing this activity (do), and the expected target value for alternatives of the enabled activity, i.e., other enabled activities (dont). Precise definitions of do and dont are given in Definitions 10 and 11. A recommendation result is an ordering over recommendations.

Definition 5 (Recommendations). Let $A$ be a set of activities and $L \in \mathcal{B}(A^*)$ an event log over $A$. Furthermore, let $(\rho, E)$ be a recommendation request with $E \subseteq A$, $|E| = n$ and $e \in E$ an enabled activity.

- $\big(e, \mathit{do}(e, \rho, L), \mathit{dont}(e, \rho, L)\big) \in E \times \mathbb{R} \times \mathbb{R}$ is a recommendation. We use $\mathcal{R}$ to denote the universe of recommendations.
- A recommendation result $R = \big\langle \big(e_1, \mathit{do}(e_1, \rho, L), \mathit{dont}(e_1, \rho, L)\big), \big(e_2, \mathit{do}(e_2, \rho, L), \mathit{dont}(e_2, \rho, L)\big), \ldots, \big(e_n, \mathit{do}(e_n, \rho, L), \mathit{dont}(e_n, \rho, L)\big) \big\rangle$ is a sequence of recommendations, such that $R \in \mathcal{R}^*$ and $\forall_{1 \le i < j \le n}\; e_i \neq e_j$.

The nature of the ordering over recommendations is kept abstract; however, we provide a possible ordering for a recommendation result in Example 1, Section 3.6. In the next section we describe how recommendations are generated by the recommendation service based on an existing event log $L$.

3.3 Trace Abstraction

When generating log-based recommendations, only those traces from the event log should be considered which are relevant for determining the predictive information of an enabled activity. Among those traces, the ones with a high degree of matching with the partial trace should be weighted higher than those with a small or no match.

To determine which log traces are relevant to provide recommendations for a given partial trace, and to weight them according to their degree of matching, we need suitable comparison mechanisms for traces. Our recommendation service provides three different trace abstractions based on which traces can be compared, namely, the prefix, set and multi-set abstractions. The prefix abstraction basically allows for a direct comparison between the partial trace and a log trace. In practice such a direct comparison is not always relevant, e.g., when the ordering or frequency of activities is not important. Therefore, we provide two additional abstractions: set and multi-set. They are independent of the domain context, e.g., they do not assume the process to be a procurement process or an invoice handling process [17].

Definition 6 (Trace abstraction). Let $A$ be a set of activities, $L \in \mathcal{B}(A^*)$ be an event log and $\sigma \in L$ be a trace. $\sigma_p = \sigma$ denotes the prefix abstraction of $\sigma$, $\sigma_s = \{a \mid a \in \sigma\}$ denotes the set abstraction of $\sigma$, and $\sigma_m = \mathit{par}(\sigma)$ denotes the multi-set abstraction of $\sigma$, i.e., for all $a \in \sigma$ it holds that $\sigma_m(a) = \mathit{par}(\sigma)(a)$.
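
The trace operators of Definition 1 and the three abstractions of Definition 6 map naturally onto built-in Python types. The following is a minimal sketch, assuming traces are represented as tuples of activity names; the function names are illustrative and not part of the paper's implementation.

```python
from collections import Counter
from typing import FrozenSet, Sequence, Tuple


def is_prefix(sigma: Sequence[str], sigma2: Sequence[str]) -> bool:
    """sigma <= sigma2: sigma is a prefix of sigma2 (Definition 1)."""
    return len(sigma) <= len(sigma2) and tuple(sigma2[:len(sigma)]) == tuple(sigma)


def concat(sigma: Sequence[str], sigma2: Sequence[str]) -> Tuple[str, ...]:
    """Concatenation of two traces."""
    return tuple(sigma) + tuple(sigma2)


def parikh(sigma: Sequence[str]) -> Counter:
    """Parikh vector: number of occurrences of each activity in sigma."""
    return Counter(sigma)


def prefix_abstraction(sigma: Sequence[str]) -> Tuple[str, ...]:
    """sigma_p: the trace itself, order and frequency preserved."""
    return tuple(sigma)


def set_abstraction(sigma: Sequence[str]) -> FrozenSet[str]:
    """sigma_s: which activities occur, ignoring order and frequency."""
    return frozenset(sigma)


def multiset_abstraction(sigma: Sequence[str]) -> Counter:
    """sigma_m: how often each activity occurs, ignoring order."""
    return parikh(sigma)


trace = ("a", "b", "c", "a", "b", "c", "d")
print(parikh(trace)["a"])             # occurrences of a, as in the example of Section 3.1
print(sorted(set_abstraction(trace)))
```

Representing the multi-set abstraction as a `Counter` is convenient because `Counter` already supports multi-set intersection (`&`) and union-like addition (`+`).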

In Section 3.4 we explain how we determine which log traces are relevant for obtaining predictive information of an enabled activity. In Section 3.5 we describe how we calculate the weighting of log traces.

3.4 Support

The relevance of log traces for a recommendation is determined on the basis of support. Typically, traces that are relevant are those that support the enabled activity for which the recommendation is computed. What support exactly means depends on the trace abstraction used. For the prefix abstraction, we say that a log trace $\sigma$ supports enabled activity $e$ if and only if $e$ occurs in $\sigma$ at position $|\rho| + 1$, i.e., at the index $e$ would have in the partial trace $\rho$ once it is executed. For the set abstraction, we consider a log trace $\sigma$ to support the enabled activity $e$ whenever activity $e$ has been observed at least once in the log trace. To support an enabled activity $e$ in the multi-set abstraction of trace $\sigma$, the frequency of activity $e$ in the partial trace $\rho$ must be less than its frequency in the log trace $\sigma$, i.e., by executing $e$ after $\rho$, the total number of $e$'s does not exceed the number of $e$'s in $\sigma$.

Definition 7 (Activity support functions). Let $A$ be a set of activities, $\rho, \sigma \in A^*$ and enabled activity $e \in A$. We use the predicate $s(\rho, \sigma, e)$ to state that log trace $\sigma$ supports the execution of $e$ after partial trace $\rho$. The predicate is defined for the three abstractions by:

\[
\begin{array}{lcl}
s_p(\rho, \sigma, e) & \iff & \sigma_p(|\rho| + 1) = e \\
s_s(\rho, \sigma, e) & \iff & e \in \sigma_s \\
s_m(\rho, \sigma, e) & \iff & \rho_m(e) < \sigma_m(e)
\end{array}
\]

The support predicate is used to filter the event log by removing all traces that do not support an enabled activity.

Definition 8 (Support filtering). Let $A$ be a set of activities and $L \in \mathcal{B}(A^*)$ an event log over $A$. Furthermore, let $(\rho, E)$ be a recommendation request with $\rho \in A^*$ and $E \subseteq A$. We define the log filtered on support of enabled activity $e \in E$ and partial trace $\rho$ as

\[
L^s_{(\rho,e)} = [\, \sigma \in L \mid s(\rho, \sigma, e) \,].
\]

Note that $\sigma$ ranges over a multi-set of traces.

Log traces from $L^s_{(\rho,e)}$ support enabled activity $e$ and are used for the recommendation of $e$. Next, we define a weighting function ($\omega$) to express the relative importance of each of these log traces for the recommendation of an enabled activity $e$.
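
The support predicates and the support filter of Section 3.4 can be sketched as follows. This is a minimal illustration, assuming traces are tuples and the event log is a list (so that duplicate traces are kept, as in a multi-set); the function names and the small example log are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

Trace = Tuple[str, ...]


def supports_prefix(rho: Trace, sigma: Trace, e: str) -> bool:
    """s_p: sigma has e at position |rho| + 1 (1-based), i.e. index len(rho)."""
    return len(sigma) > len(rho) and sigma[len(rho)] == e


def supports_set(rho: Trace, sigma: Trace, e: str) -> bool:
    """s_s: e occurs at least once in sigma."""
    return e in set(sigma)


def supports_multiset(rho: Trace, sigma: Trace, e: str) -> bool:
    """s_m: e occurs fewer times in rho than in sigma."""
    return Counter(rho)[e] < Counter(sigma)[e]


def filter_on_support(
    log: List[Trace],
    rho: Trace,
    e: str,
    supports: Callable[[Trace, Trace, str], bool],
) -> List[Trace]:
    """Filtered log: all traces (with duplicates) that support e after rho."""
    return [sigma for sigma in log if supports(rho, sigma, e)]


# Hypothetical event log; the first trace occurs twice.
log = [("A", "B", "C"), ("A", "C", "B"), ("A", "B", "C"), ("B", "A")]
rho = ("A",)

print(len(filter_on_support(log, rho, "B", supports_prefix)))    # B at position 2
print(len(filter_on_support(log, rho, "B", supports_set)))       # B occurs somewhere
print(len(filter_on_support(log, rho, "B", supports_multiset)))  # more B's than in rho
```

Note that the prefix predicate, as defined, only inspects position $|\rho| + 1$ of the log trace; whether the log trace actually starts with $\rho$ is expressed separately by the weight.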
3.5 Trace Weight

The support of an enabled activity determines the part of the log that serves as a basis for a recommendation. However, of the traces supporting an enabled activity, not every one is equally important, i.e., some log traces match the partial trace better than others. Hence, we define weighting functions that assign a weight to each log trace. The weight of a trace lies between 0 and 1, where a value of 1 indicates that two traces fully match and 0 that they do not match at all. The calculation of the degree of matching depends on the trace abstraction. For prefixes, the weight of a log trace is 1 if the partial trace is a prefix of the log trace; otherwise, the weight is 0. For the set abstraction, the weight of the log trace is defined as the fraction of distinct partial trace activities that the partial trace abstraction and log trace abstraction have in common. The weight of a trace for the multi-set abstraction is similar to the set-weight; however, the frequency of activities is also considered.

Definition 9 (Weight functions). Let A be a set of activities and σ, ρ ∈ A*. We define ω(ρ, σ), i.e., the relative importance of a log trace σ when considering the partial trace ρ, as follows:

\[
\omega_p(\rho,\sigma) =
\begin{cases}
1, & \text{if } \rho_p \le \sigma_p \\
0, & \text{otherwise}
\end{cases},
\qquad
\omega_s(\rho,\sigma) = \frac{|\rho_s \cap \sigma_s|}{|\rho_s|},
\qquad
\omega_m(\rho,\sigma) = \frac{|\rho_m \cap \sigma_m|}{|\rho_m|}
\]

3.6 Expected Outcome

Definition 5 states that a recommendation for enabled activity e contains predictive information about the target value. We define the expected outcome of the target value (do value), when e is executed in the next step, as a weighted average over target values of log traces from $L^{s}_{(\rho,e)}$, the log filtered on support of e. The target value of each trace from $L^{s}_{(\rho,e)}$ is weighted (ω) on the basis of the degree of matching with the partial trace.

Definition 10 (do calculation). Let A be a set of activities, τ a target function, ρ, σ ∈ A*, L ∈ B(A*) and e ∈ E ⊆ A an enabled activity. The expected target value when ρ is completed by the user after performing activity e next is defined as:

\[
do(e,\rho,L) =
\frac{\sum_{\sigma \in L^{s}_{(\rho,e)}} \omega(\rho,\sigma)\cdot\tau(\sigma)}
     {\sum_{\sigma \in L^{s}_{(\rho,e)}} \omega(\rho,\sigma)}
\]

Similarly, we define the expected target value of not doing an enabled activity e.² The dont function determines the weighted average over all alternatives of e, i.e., all traces that do not support the execution of e after ρ, but do support any of the alternatives e′ after ρ.

2 Note that in both do and dont Σ ranges over a multi-set of traces.
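The weight functions of Definition 9 and the do and dont values can be sketched in Python as follows. This is a minimal illustration, not the ProM plug-in described later: all function names are hypothetical, the treatment of an empty partial trace (weight 1) is an assumption, and the support of each log trace is supplied as an explicit set of activities (read off the example log of Fig. 3) rather than computed, because the support functions are defined in an earlier section. The dont value follows the double sum of Definition 11, so a trace that supports several alternatives is counted once per alternative.

```python
from collections import Counter

def weight_prefix(rho, sigma):
    """1 if the partial trace rho is a prefix of the log trace sigma, else 0."""
    return 1.0 if tuple(sigma[:len(rho)]) == tuple(rho) else 0.0

def weight_set(rho, sigma):
    """Fraction of the distinct activities of rho that also occur in sigma."""
    rho_s = set(rho)
    if not rho_s:  # assumption: an empty partial trace matches fully
        return 1.0
    return len(rho_s & set(sigma)) / len(rho_s)

def weight_multiset(rho, sigma):
    """Like weight_set, but activity frequencies are taken into account."""
    rho_m, sigma_m = Counter(rho), Counter(sigma)
    if not rho_m:  # same assumption as in weight_set
        return 1.0
    common = rho_m & sigma_m  # multiset intersection: minimum of the counts
    return sum(common.values()) / sum(rho_m.values())

def _weighted_average(entries, rho, weight):
    """Weighted average of target values; None if the total weight is 0."""
    num = sum(weight(rho, sigma) * target for sigma, target in entries)
    den = sum(weight(rho, sigma) for sigma, _ in entries)
    return num / den if den > 0 else None

def do_value(e, rho, log, weight):
    """Expected target value over the traces that support e."""
    entries = [(s, t) for s, t, support in log if e in support]
    return _weighted_average(entries, rho, weight)

def dont_value(e, rho, enabled, log, weight):
    """Expected target value over traces supporting an alternative but not e."""
    entries = [(s, t)
               for alt in enabled if alt != e
               for s, t, support in log
               if alt in support and e not in support]
    return _weighted_average(entries, rho, weight)

def recommend(rho, enabled, log, weight, minimize=True):
    """Order enabled activities by do - dont (ascending when minimizing)."""
    recs = [(e, do_value(e, rho, log, weight),
             dont_value(e, rho, enabled, log, weight)) for e in enabled]
    scored = [r for r in recs if r[1] is not None and r[2] is not None]
    unscored = [r for r in recs if r[1] is None or r[2] is None]
    scored.sort(key=lambda r: r[1] - r[2], reverse=not minimize)
    return scored + unscored  # activities without evidence go last

# Example log of Fig. 3: (trace, cost, activities supported for rho = <D, F>).
LOG = [
    (tuple("ABC"), 900, {"A", "B", "C"}),
    (tuple("DBC"), 500, {"B", "C"}),
    (tuple("FBC"), 500, {"B", "C"}),
    (tuple("DFA"), 1000, {"A"}),
    (tuple("DFB"), 1500, {"B"}),
    (tuple("DFC"), 2000, {"C"}),
    (tuple("DFH"), 1260, set()),
    (tuple("CCA"), 1680, {"A", "C"}),
]
RHO = tuple("DF")
ENABLED = ["A", "B", "C"]

if __name__ == "__main__":
    # prints [('B', 1000.0, 1500.0), ('A', 1000.0, 1125.0), ('C', 1250.0, 1250.0)]
    print(recommend(RHO, ENABLED, LOG, weight_set))
```

Passing `weight_prefix` or `weight_multiset` instead of `weight_set` changes only the weights; the support sets in `LOG` would also have to be recomputed for the corresponding abstraction.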
Log          | Weight   | Support ss(ρ, σ, e)
σ     | cost | ωs(ρ, σ) | e = A | e = B | e = C
ABC   |  900 | 0        |   ⊤   |   ⊤   |   ⊤
DBC   |  500 | 0.5      |       |   ⊤   |   ⊤
FBC   |  500 | 0.5      |       |   ⊤   |   ⊤
DFA   | 1000 | 1        |   ⊤   |       |
DFB   | 1500 | 1        |       |   ⊤   |
DFC   | 2000 | 1        |       |       |   ⊤
DFH   | 1260 | 1        |       |       |
CCA   | 1680 | 0        |   ⊤   |       |   ⊤

Fig. 3. Example log, with weight and support values for ρ = ⟨D, F⟩.

Definition 11 (dont calculation). Let A be a set of activities, τ a target function, ρ, σ ∈ A*, L ∈ B(A*) and e, e′ ∈ E ⊆ A enabled activities. The expected target value when ρ is completed by the user after not performing activity e next is defined as:

\[
dont(e,\rho,L) =
\frac{\sum_{e' \in E \setminus \{e\}} \; \sum_{\sigma \in L^{s}_{(\rho,e')} \setminus L^{s}_{(\rho,e)}} \omega(\rho,\sigma)\cdot\tau(\sigma)}
     {\sum_{e' \in E \setminus \{e\}} \; \sum_{\sigma \in L^{s}_{(\rho,e')} \setminus L^{s}_{(\rho,e)}} \omega(\rho,\sigma)}
\]

Next, we provide an example calculation for a recommendation, based on a concrete partial trace, a set of enabled events and a log.

Example 1 (Recommendation). Suppose ρ = ⟨D, F⟩ is a partial trace and E = {A, B, C} is the set of enabled activities. Together, they form a recommendation request (ρ, E). The log is given by L = [⟨A, B, C⟩, ⟨D, B, C⟩, . . .], with τ(⟨A, B, C⟩) = 900, τ(⟨D, B, C⟩) = 500, etc. (cf. Fig. 3). For convenience, we also provide the values for support (ss(ρ, σ, e)) and trace weight (ωs(ρ, σ)). For each log trace, support is denoted by ⊤. The user wants to minimize the cost and uses set abstraction. The do and dont values for the recommendation are calculated as follows.

\[
\begin{aligned}
do(A, \langle D,F\rangle, L) &= \frac{0\cdot 900 + 1\cdot 1000 + 0\cdot 1680}{0 + 1 + 0} = 1000 \\
do(B, \langle D,F\rangle, L) &= \frac{0\cdot 900 + 0.5\cdot 500 + 0.5\cdot 500 + 1\cdot 1500}{0 + 0.5 + 0.5 + 1} = 1000 \\
do(C, \langle D,F\rangle, L) &= \frac{0\cdot 900 + 0.5\cdot 500 + 0.5\cdot 500 + 1\cdot 2000 + 0\cdot 1680}{0 + 0.5 + 0.5 + 1 + 0} = 1250 \\
dont(A, \langle D,F\rangle, L) &= \frac{(0.5\cdot 500 + 0.5\cdot 500 + 1\cdot 1500) + (0.5\cdot 500 + 0.5\cdot 500 + 1\cdot 2000)}{0.5 + 0.5 + 1 + 0.5 + 0.5 + 1} = 1125 \\
dont(B, \langle D,F\rangle, L) &= \frac{(1\cdot 1000 + 0\cdot 1680) + (1\cdot 2000 + 0\cdot 1680)}{1 + 0 + 1 + 0} = 1500 \\
dont(C, \langle D,F\rangle, L) &= \frac{(1\cdot 1000) + (1\cdot 1500)}{1 + 1} = 1250
\end{aligned}
\]

The implementation of our recommendation service orders the enabled activities on the difference between do and dont, i.e., the bigger the difference
the more attractive the activity is. The recommendations for the enabled activities are (A, 1000, 1125), (B, 1000, 1500) and (C, 1250, 1250), with differences of -125, -500 and 0 respectively. Thus, the recommendation result is ⟨(B, 1000, 1500), (A, 1000, 1125), (C, 1250, 1250)⟩. If the user goal were to maximize costs, the order would be reversed.

[Figure 4. A.) Experiment Procedure: 1.) Log Creation Based on Recommendations (for object "Set-Up Time Model"): traces 1..30 are created using the recommendation service from a log L of size k with abstraction abs, giving a sample abs/k with 30 observations and its mean cycle time. 2.) Random Log Creation (for object "Set-Up Time Model"): traces 1..30 are created randomly, giving a random sample with 30 observations and its mean cycle time. B.) Experiment Design: levels used for log size k = {5, 30, 60, 120} and abstractions abs = {prefix, set, multiSet}, resulting in 12 samples = {Pref5, Pref30, …, MultiSet60, MultiSet120}, compared against the random sample.]

Fig. 4. The experiment design.

4 Evaluation Based on a Controlled Experiment

To evaluate the effectiveness of our recommendation service we conducted a controlled experiment. Section 4.1 describes the design underlying our experiment and Section 4.2 describes the preparatory steps we conducted. Section 4.3 explains the experiment procedure including data analysis. The results of our experiment are presented in Section 4.4. Factors threatening the validity of our experiment are discussed in Section 4.5.

In our experiment we use the recommendation service to support the business process that has been explained in Section 2. The process has five activities (A, B, C, D, E) that have to be executed exactly once and can be executed in any order. Each activity has a cycle time of 10 time units; however, if C is directly executed after B, then the cycle time of the trace will be 35 time units instead of 50. For the experiment we assume that the user goal is to minimize the cycle time and that the recommendation service is used for support.

4.1 Experiment Design

This section describes the design underlying our experiment.

– Object: The object studied in our experiment is the set of traces created for the set-up time model with the help of our recommendation service.
– Independent Variables: In our experiment we consider the log abstraction and the log size as independent variables. For the variable log abstraction we consider the levels abs ∈ {prefix, set, multiset} (cf. Section 3.3). The variable log size k represents the number of instances in the event log, i.e., the amount of learning material based on which recommendations are made. As levels, k ∈ {5, 30, 60, 120} are considered.

– Response Variable: The response variable in our experiment is the cycle time of a trace created by the recommendation service using a log of a given size and a given abstraction.

– Experiment Goal: The main goal of our experiment is to investigate whether changes in the log significantly affect the cycle time³ of the created traces given an abstraction. Another goal is to investigate whether the traces created by our recommendation service yield significantly better results than randomly created traces.

4.2 Experiment Preparation

This section describes the preparatory steps we conducted for the experiment.

– Implementing the Recommendation Service in ProM. As a preparation for our experiment we implemented the recommendation service described in Section 3 as a plug-in for the (Pro)cess (M)ining framework ProM⁴. ProM is a pluggable framework that provides a wide variety of plug-ins to extract information about a process from event logs [19], e.g., a process model, an organizational model or decision point information can be discovered. To implement the recommendation service we had to make several extensions to ProM, as the recommendation service, in contrast to other plug-ins, is not an a posteriori mining technique; instead, recommendations are provided in real time during process execution. The implementation of our recommendation service is able to provide a process engine with recommendations on possible next steps, knowing the enabled activities and the partial trace. In addition, the recommendation service provides means to add finished cases to the event log to make them available for recommendations in future executions.

– Implementing a Log Creator and Log Simulator. In addition to the recommendation service we implemented a log creator and a log simulator. While the log creator allows us to randomly create logs of size k for a given process model, the log simulator can be used to create traces using the recommendation service with a log of size k and an abstraction abs. The log simulator takes the constraints imposed by the process model into consideration and ensures that no constraint violations can occur. Thus, the log simulator can be seen as a simulation of a process engine. Both the log creator and the log simulator have been implemented in Java using FitNesse⁵.

3 Note that our approach can also be used for costs, quality, utilization, etc. However, for simplicity we focus on the cycle time only.
4 The ProM framework can be downloaded from www.processmining.org.
5 FitNesse Acceptance Testing Framework, fitnesse.org.
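The set-up time model used in the experiment and the random log creation can be sketched as follows. This is an illustrative Python sketch rather than the Java implementation mentioned above: the names `cycle_time`, `create_random_log` and `mean_cycle_time` are hypothetical, and drawing each trace as a uniformly random permutation of the five activities is an assumption about how "randomly created" traces are generated.

```python
import random

ACTIVITIES = ("A", "B", "C", "D", "E")

def cycle_time(trace):
    """Cycle time of a complete trace in the set-up time model.

    Five activities of 10 time units each give 50; if C is executed
    directly after B, the cycle time of the trace is 35 instead.
    """
    for first, second in zip(trace, trace[1:]):
        if (first, second) == ("B", "C"):
            return 35
    return 50

def create_random_log(k, seed=None):
    """Create k random traces; each executes every activity exactly once.

    Assumption: every ordering of the activities is equally likely.
    """
    rng = random.Random(seed)
    log = []
    for _ in range(k):
        trace = list(ACTIVITIES)
        rng.shuffle(trace)
        log.append(tuple(trace))
    return log

def mean_cycle_time(log):
    """Mean cycle time over all traces of a log (the response variable)."""
    return sum(cycle_time(trace) for trace in log) / len(log)

if __name__ == "__main__":
    sample = create_random_log(30, seed=42)  # random sample, 30 observations
    print(mean_cycle_time(sample))
```

Under the uniform-permutation assumption, C directly follows B in 24 of the 120 possible orderings, so the expected cycle time of a random trace is 0.2 · 35 + 0.8 · 50 = 47 time units, which gives a rough baseline for the random sample.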