The group's objective is to produce relevant contributions in the areas of expertise of its members and to disseminate them in international journals and conferences of recognized prestige. It is the group's intention that these contributions have a significant long-term impact. Technology transfer is regarded as a consequence of research excellence and should be carried out as a means to increase the impact of results, obtain resources for the group, and explore new topics for future research.
The study of partial match queries on random hierarchical multidimensional data structures dates back to Ph. Flajolet and C. Puech's 1986 seminal paper on partial match retrieval. Only recently were fixed (as opposed to random) partial match queries studied for random relaxed K-d trees, random standard K-d trees, and random 2-dimensional quad trees. Based on those results it seemed natural to classify the general form of the cost of fixed partial match queries into two families: that of random hierarchical structures and that of perfectly balanced structures, as conjectured by Duch, Lau and Martínez (On the Cost of Fixed Partial Match Queries in K-d Trees, Algorithmica, 75(4):684-723, 2016). Here we show that this conjecture does not hold: we introduce relaxed K-dt trees and provide the average-case analysis of random partial match queries on them, as well as some advances on the average-case analysis of fixed partial match queries. In fact, the cost of fixed partial match queries in relaxed K-dt trees does not follow the conjectured forms.
We introduce Max celebrity games, a new variant of celebrity games defined in . In both models players have weights, and there is a critical distance β as well as a link cost α. In the max celebrity model the cost of a player depends on the cost of establishing links to other players and on the maximum of the weights of those nodes that are farther away than β (instead of the sum of weights, as in celebrity games). The main results for β > 1 are the following: computing a best response for a player is NP-hard; the optimal social cost of a max celebrity game depends on the relation between α and w_max; NE always exist, and NE graphs are either connected or a set of r > 1 connected components where at most one of them is not an isolated node; for the class of connected NE graphs we obtain a general upper bound of 2β + 2 for the diameter. We also analyze the price of anarchy (PoA) of connected NE graphs and show that there exist games G such that PoA(G) = Θ(n/β); modifying the cost of a player, we can guarantee that all NE graphs are connected, but the diameter might be n - 1. Finally, when β = 1, computing a best response for a player is polynomial-time solvable and PoA = O(w_max/w_min).
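Schematically, and using only the ingredients stated in the abstract (the notation s_u for the set of links bought by player u is ours, and the exact normalization in the paper may differ), the cost of a player u in a max celebrity game on a graph G can be written as

\[ c_u(G) \;=\; \alpha\,|s_u| \;+\; \max\{\, w_v : d_G(u,v) > \beta \,\}, \qquad \text{with } \max\emptyset = 0, \]

whereas the original celebrity games charge the sum \(\sum_{v : d_G(u,v) > \beta} w_v\) instead of the maximum.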
Coma-Puig, B.; Carmona, J.; Gavaldà, R.; Alcoverro, S.; Martín, V. IEEE International Conference on Data Science and Advanced Analytics p. 120-129 DOI: 10.1109/DSAA.2016.19 Presentation date: 2016-10-17 Conference presentation
Data from utility meters (gas, electricity, water) is a rich source of information for distribution companies, beyond billing. In this paper we present a supervised technique, which feeds primarily but not exclusively on meter information, to detect meter anomalies and customer fraudulent behavior (meter tampering). Our system detects anomalous meter readings on the basis of models built with machine learning techniques on past data. Unlike most previous work, it can incrementally incorporate the results of field checks to grow the database of fraud and non-fraud patterns, therefore increasing model precision over time and potentially adapting to emerging fraud patterns. The full system has been developed with a company providing electricity and gas, and it has already been used to carry out several field checks, with large improvements in fraud detection over previous checks that used simpler techniques.
Beek, M.; Carmona, J.; Kleijn, J. International Symposium on Leveraging Applications of Formal Methods, Verification and Validation p. 784-805 DOI: 10.1007/978-3-319-47166-2_55 Presentation date: 2016-10 Conference presentation
We consider systems composed of reactive components that collaborate through synchronised execution of common actions. These multi-component systems are formally represented as team automata, a model that allows a wide spectrum of synchronisation policies to combine components into higher-level systems. We investigate the correct-by-construction engineering of such systems of systems from the point of view of correct communications between the components (no message loss or deadlocks due to indefinite waiting). This leads to a proposal for a generic definition of compatibility of components relative to the adopted synchronisation policy. This definition appears to be particularly appropriate for so-called master-slave synchronisations, by which input actions (for `slaves') are driven by output actions (from `masters').
We present a generalized method to define the Divide & Conquer paradigm so that processors act on their own data and are scheduled in parallel. MapReduce is a programming model that follows this paradigm and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Although used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide & Conquer paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph, a problem that is especially interesting when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. The observed results suggest that pipeline allows for an efficient solution of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.
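To make the pipeline idea concrete, here is a minimal Go sketch under simplifying assumptions (the vertex set is known in advance and one stage is created per vertex, whereas the paper's dynamic pipeline grows while the graph streams in; all names are ours, not the paper's). Each stage learns the neighbourhood of its vertex from the edge stream and counts the triangles that a passing edge closes:

package main

import "fmt"

type edge struct{ u, v int }

// stage owns one vertex: it learns the neighbours of that vertex from the
// edge stream, counts the triangles its vertex closes, and forwards every
// edge downstream.
func stage(vertex int, in <-chan edge, out chan<- edge, counts chan<- int) {
	nbrs := make(map[int]bool)
	triangles := 0
	for e := range in {
		switch {
		case e.u == vertex:
			nbrs[e.v] = true
		case e.v == vertex:
			nbrs[e.u] = true
		case nbrs[e.u] && nbrs[e.v]:
			triangles++ // this edge joins two known neighbours
		}
		out <- e
	}
	close(out)
	counts <- triangles
}

func main() {
	edges := []edge{{0, 1}, {0, 2}, {1, 2}, {1, 3}, {2, 3}} // triangles {0,1,2} and {1,2,3}
	n := 4
	counts := make(chan int, n)
	head := make(chan edge)
	in := head
	for v := 0; v < n; v++ { // chain one stage per vertex
		out := make(chan edge, 1)
		go stage(v, in, out, counts)
		in = out
	}
	go func() { // drain the tail of the pipeline
		for range in {
		}
	}()
	for _, e := range edges {
		head <- e
	}
	close(head)
	total := 0
	for v := 0; v < n; v++ {
		total += <-counts
	}
	fmt.Println("triangles:", total) // prints: triangles: 2
}

Each triangle is counted exactly once, at the stage owning the vertex opposite the last of its three edges to enter the stream, since channels preserve the stream order through every stage.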
In this paper we tackle the problem of extending the logic of nested graph conditions with paths. This means, for instance, that we may state properties about the existence of paths between some given nodes. As a main contribution, a sound and complete tableau method is defined for reasoning about this kind of properties.
The holy grail in process mining is an algorithm that, given an event log, produces fitting, precise, properly generalizing and simple process models. While there is consensus on the existence of solid metrics for fitness and simplicity, current metrics for precision and generalization have important flaws, which hamper their applicability in a general setting. In this paper, a novel approach to measuring precision and generalization is presented, which relies on the notion of anti-alignments. An anti-alignment describes highly deviating model traces with respect to observed behavior. We propose metrics for precision and generalization that resemble leave-one-out cross-validation techniques: individual traces of the log are removed, and the computed anti-alignment assesses the model's capability to describe precisely or generalize the observed behavior. The metrics have been implemented in ProM and tested on several examples.
This work presents a set of methods to improve the understandability of process models. Traditionally, simplification methods trade off quality metrics, such as fitness or precision. Conversely, the methods proposed in this paper produce simplified models while preserving or even increasing fidelity metrics. The first problem addressed in the paper is the discovery of duplicate tasks. A new method is proposed that avoids overfitting by working on the transition system generated by the log. The method is able to discover duplicate tasks even in the presence of concurrency and choice. The second problem is the structural simplification of the model by identifying optional and repetitive tasks. The tasks are substituted by annotated events that allow the removal of silent tasks and reduce the complexity of the model. An important feature of the methods proposed in this paper is that they are independent of the actual miner used for process discovery.
Sanchez, D.; Muntés, V.; Carmona, J.; Sole, M. International Conference on Business Process Management p. 141-158 DOI: 10.1007/978-3-319-45468-9_9 Presentation date: 2016-09 Conference presentation
The automated comparison of process models has received increasing attention in the last decade, due to the growing number of process models and repositories, and the consequent need to assess similarities between the underlying processes. Current techniques for process model comparison are either structural (based on graph edit distances) or behavioural (through activity profiles or the analysis of the execution semantics). Accordingly, there is a gap between the quality of the information provided by these two families: structural techniques may be fast but inaccurate, whilst behavioural ones are accurate but complex. In this paper we present a novel technique based on a well-known method to compare labeled trees through the notion of cophenetic distance. The technique lies between the two families of methods for comparing process models: it has a structural nature, but it can provide accurate information on the differences and similarities of two process models. An experimental evaluation on various benchmark sets is reported, positioning the proposed technique as a valuable tool for process model comparison.
The alignment of observed and modeled behavior is a crucial problem in process mining, since it opens the door for conformance checking and enhancement of process models. The state-of-the-art techniques for the computation of alignments rely on a full exploration of the combination of the model state space and the observed behavior (an event log), which hampers their applicability for large instances. This paper presents a fresh view of the alignment problem: the computation of alignments is cast as the resolution of Integer Linear Programming models, where the user can decide the granularity of the alignment steps. Moreover, a novel recursive strategy is used to split the problem into small pieces, exponentially reducing the complexity of the ILP models to be solved. The contributions of this paper represent a promising alternative to fight the inherent complexity of computing alignments for large instances.
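A schematic sketch of the kind of ILP model involved, based on the standard Petri net marking equation (the actual models in the paper additionally synchronize model moves with the observed trace and depend on the chosen granularity):

\[ \min \sum_{t \in T} c(t)\,x_t \quad \text{s.t.} \quad m_0 + N\,x = m_f, \qquad x \in \mathbb{Z}_{\ge 0}^{|T|}, \]

where N is the incidence matrix of the net, m_0 and m_f are the initial and final markings, x counts how often each transition fires, and the cost c(t) penalizes moves that deviate from the observed trace.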
Atserias, A.; Torunczyk, S. Annual Conference of the European Association for Computer Science Logic p. 1-16 DOI: 10.4230/LIPIcs.CSL.2016.16 Presentation date: 2016-08-29 Conference presentation
Homogenization is a powerful way of taming a class of finite structures with several interesting applications in different areas, from Ramsey theory in combinatorics to constraint satisfaction problems (CSPs) in computer science, through (finite) model theory. A few sufficient conditions for a class of finite structures to allow homogenization are known, and here we provide a necessary condition. This lets us show that certain natural classes are not homogenizable: 1) the class of locally consistent systems of linear equations over the two-element field or any finite Abelian group, and 2) the class of finite structures that forbid homomorphisms from a specific MSO-definable class of structures of treewidth two. In combination with known results, the first example shows that, up to pp-interpretability, the CSPs that are solvable by local consistency methods are distinguished from the rest by the fact that their classes of locally consistent instances are homogenizable. The second example shows that, for MSO-definable classes of forbidden patterns, treewidth one versus two is the dividing line to homogenizability.
This paper quantitatively analyzes the different measures adopted in the continuous assessment of the CS1 course at the Facultat d'Informàtica de Barcelona of the Universitat Politècnica de Catalunya since 2010, when the new degree in Informatics Engineering began. The effect of those measures on the marks obtained by the students is analyzed with respect to the increment of the course load for the faculty members. These two values are compared over time through a classical marginal cost-benefit approach. Using tools commonly accepted in the social sciences, this study is complemented with a preliminary analysis of inequality. Both approaches aim at introducing new techniques for an objective analysis of the impact of teaching measures.
We consider two collective decision-making models associated with influence games : the oblivious and the non-oblivious influence decision models. The difference is that in oblivious influence models the initial decision of the actors that are neither leaders nor independent is never taken into account, while in the non-oblivious model it is taken into account when the leaders cannot exert enough influence to deviate from it. Following , we consider the power measure for an actor in a decision-making model. Power is ascribed to an actor when, by changing its inclination, the collective decision can be changed. We show that computing the power measure is #P-hard in both oblivious and non-oblivious influence decision models. We present two subfamilies of (oblivious and non-oblivious) influence decision models in which the power measure can be computed in polynomial time.
We analyze the computational complexity of the power measure in models of collective decision-making: the generalized opinion leader-follower model and the oblivious and non-oblivious influence models. We show that computing the power measure is #P-hard in all these models, and we provide two subfamilies in which the power measure can be computed in polynomial time.
Elffers, J.; Johannsen, J.; Lauria, M.; Magnard, T.; Nordström, J.; Vinyals, M. International Conference on Theory and Applications of Satisfiability Testing p. 160-176 DOI: 10.1007/978-3-319-40970-2_11 Presentation date: 2016-07 Conference presentation
A long line of research has studied the power of conflict-driven clause learning (CDCL) and how it compares to the resolution proof system in which it searches for proofs. It has been shown that CDCL can polynomially simulate resolution even with an adversarially chosen learning scheme as long as it is asserting. However, the simulation only works under the assumption that no learned clauses are ever forgotten, and the polynomial blow-up is significant. Moreover, the simulation requires very frequent restarts, whereas the power of CDCL with less frequent or entirely without restarts remains poorly understood. With a view towards obtaining results with tighter relations between CDCL and resolution, we introduce a more fine-grained model of CDCL that captures not only time but also memory usage and number of restarts. We show how previously established strong size-space trade-offs for resolution can be transformed into equally strong trade-offs between time and memory usage for CDCL, where the upper bounds hold for CDCL without any restarts using the standard 1UIP clause learning scheme, and the (in some cases tightly matching) lower bounds hold for arbitrarily frequent restarts and arbitrary clause learning schemes.
Teaching computer science in degrees that are not computer science related presents an important challenge: motivating the students and achieving good average grades. The students' complaints are always based on a lack of motivation: What is this subject useful for? This is especially relevant when the subject is not easy to learn. In this paper we show the case of computer courses in the Statistics degree taught at the Universitat de Barcelona (UB) and the Universitat Politècnica de Catalunya (UPC), two Catalan universities. We initially tried to reduce the complexity of the contents in order to obtain better average grades. Yet, it did not work out as expected. Therefore, we changed our strategy: instead of making the contents easier (less complex), we changed the tools used to teach and tried to adapt them to the students' interests. In this particular case, we decided to use the R programming language, a language widely used by statisticians, to explain the basics of programming. In short, we moved from less (simpler contents) to different (more elaborate and nontrivial contents adapted to meet the students' expectations).
Chatain, T.; Carmona, J. International Conference on Application and Theory of Petri Nets and Concurrency p. 240-258 DOI: 10.1007/978-3-319-39086-4_15 Presentation date: 2016-06-19 Conference presentation
Conformance checking techniques assess the suitability of a process model for representing an underlying process, observed through a collection of real executions. These techniques suffer from the well-known state space explosion problem; hence, handling process models exhibiting large or even infinite state spaces remains a challenge. One important metric in conformance checking is the precision of the model with respect to the observed executions, i.e., a characterization of the ability of the model to produce behavior unrelated to the one observed. By avoiding the computation of the full state space of a model, current techniques only provide estimations of the precision metric, which in some situations tend to be very optimistic, thus hiding real problems a process model may have. In this paper we present the notion of anti-alignment as a concept to help unveil traces in the model that may deviate significantly from the observed behavior. Using anti-alignments, current estimations can be improved, e.g., in precision checking. We show how to express the problem of finding anti-alignments as the satisfiability of a Boolean formula, and we provide a tool which can deal with large models efficiently.
In this paper we follow an alternative approach, named pipeline, to produce a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph does not fit in memory or is dynamically generated. Concretely, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version using the Go language, exploiting Go's ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. The observed results suggest that pipeline allows for an efficient solution of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.
This paper studies the complexity of computing a representation of a simple game as the intersection (or union) of weighted majority games, as well as the complexity of computing the dimension or the codimension. We also present some examples with linear dimension and exponential codimension with respect to the number of players.
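For reference, the standard definitions behind these notions (a coalition wins in an intersection iff it wins in every component game, and in a union iff it wins in some component game):

\[ \dim(\Gamma) = \min\{\, k : \Gamma = \Gamma_1 \cap \dots \cap \Gamma_k \,\}, \qquad \mathrm{codim}(\Gamma) = \min\{\, k : \Gamma = \Gamma_1 \cup \dots \cup \Gamma_k \,\}, \]

where each \(\Gamma_i\) is a weighted majority game.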
Imposing access control onto workflows considerably reduces the set of users authorized to execute the workflow tasks. Further constraints (e.g. Separation of Duties) as well as unexpected unavailability of users may ultimately obstruct the successful workflow execution. To still complete the execution of an obstructed workflow, we envisage a hybrid approach. If a log is provided, we partition its traces into "successful" and "obstructed" ones by analysing the given workflow and its authorizations. An obstruction should then be solved by finding its nearest match from the list of successful traces. If no log is provided, we flatten the workflow and its authorizations into a Petri net and encode the obstruction with a corresponding "obstruction marking". The structural theory of Petri nets shall then be tweaked to provide a minimized Parikh vector that may violate the given firing rules but reaches a complete marking, and by that completes the workflow.
Conformance checking confronts process models with real process executions to detect and measure deviations between modelled and observed behaviour. The core technique for conformance checking is the computation of an alignment. Current approaches for alignment computation rely on a shortest-path technique over the product of the state space of a model and the observed trace, thus suffering from the well-known state explosion problem. This paper presents a fresh alternative for alignment computation of acyclic process models that encodes the alignment problem as a Constraint Satisfaction Problem. Since modern solvers for this framework are capable of dealing with large instances, this contribution has a clear potential. Remarkably, our prototype implementation can handle instances that represent a real challenge for current techniques. The main advantages of using the Constraint Programming paradigm lie in the possibility of adapting parameters such as the maximum search time or the maximum misalignment allowed. Moreover, the search and propagation algorithms incorporated in Constraint Programming solvers make it possible to find solutions for problems unsolvable with other techniques.
Cortadella, J.; Lupon, M.; Moreno, A.; Roca, A.; Sapatnekar, S. IEEE International Symposium on Asynchronous Circuits and Systems p. 19-26 DOI: 10.1109/ASYNC.2016.14 Presentation date: 2016-05 Conference presentation
How much margin do we have to add to the delay lines of a bundled-data circuit? This paper is an attempt to give a methodical answer to this question, taking into account all sources of variability and the existing EDA machinery for timing analysis and sign-off. The paper is based on the study of the margins of a ring oscillator that substitutes a PLL as clock generator. A timing model is proposed that shows that a 12% margin for delay lines can be sufficient to cover variability in a 65nm technology. In a typical scenario, performance and energy improvements between 15% and 35% can be obtained by using a ring oscillator instead of a PLL. The paper concludes that a synchronous circuit with a ring oscillator clock shows similar benefits in performance and energy as those of bundled-data asynchronous circuits.
San Pedro, J. de; Bourgeat, T.; Cortadella, J. IEEE International Symposium on Asynchronous Circuits and Systems p. 107-114 DOI: 10.1109/ASYNC.2016.10 Presentation date: 2016-05 Conference presentation
The paper presents a first effort at exploring a novel area in the domain of asynchronous controllers: specification mining. Rather than synthesizing circuits from specifications, we aim at doing reverse engineering, i.e., discovering safe specifications from the circuits that preserve a set of pre-defined behavioral properties (e.g., hazard freeness). The specifications are discovered without any previous knowledge of the behavior of the circuit environment.
This area may open new opportunities for re-synthesis and verification of asynchronous controllers. The effectiveness of the proposed approach is demonstrated by mining concurrent specifications (Signal Transition Graphs) from multiple implementations of 4-phase handshake controllers and some controllers with choice.
Consider the following geometric infection model: individuals are placed at random points according to a Poisson point process in some appropriate metric space. Each individual has two states, infected or healthy. Any infected individual passes the infection to any other individual at distance d according to a Poisson process whose rate is a function f(d) of d that decays as d increases. Any infected individual heals at rate 1. An epidemic is said to occur when, starting from one infected individual placed at some point, the infection has positive probability of lasting forever; otherwise, we say extinction occurs. We investigate conditions under which the function f(d) = (d + 1)^(-α) is epidemic.
Joint work with Xavier Perez and Nick Wormald.
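In symbols, restating the model above (α denotes the decay exponent of the infection rate):

\[ u \text{ infects } v \text{ at rate } f(d(u,v)) = \bigl(d(u,v)+1\bigr)^{-\alpha}, \qquad u \text{ recovers at rate } 1, \]

and f is called epidemic when, starting from a single infected individual, the infection survives forever with positive probability.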
Visualization is essential for understanding the models obtained by process mining. Clear and efficient visual representations make the embedded information more accessible and analyzable. This work presents a novel approach for generating process models with structural properties that induce visually friendly layouts. Rather than generating a single model that captures all behaviors, a set of Petri net models is delivered, each one covering a subset of traces of the log. The models are mined by extracting slices of labelled transition systems with specific properties from the complete state space produced by the process logs. In most cases, few Petri nets are sufficient to cover a significant part of the behavior produced by the log.
Chirita, C.; Fiadeiro, J.; Orejas, F. International Conference on Fundamental Approaches to Software Engineering p. 359-376 DOI: 10.1007/978-3-662-49665-7_21 Presentation date: 2016-04 Conference presentation
We advance a general technique for enriching logical systems with soft constraints, making them suitable for specifying complex software systems where parts are put together not just based on how they meet certain functional requirements but also on how they optimise certain constraints. This added expressive power is required, for example, for capturing quality attributes that need to be optimised or, more generally, for formalising what are usually called service-level agreements. More specifically, we show how institutions endowed with a graded semantic consequence can accommodate soft-constraint satisfaction problems. We illustrate our approach by showing how, in the context of service discovery, one can quantify the compatibility of two specifications and thus formalise the selection of the most promising provider of a required resource.
Quad-K-d trees were introduced by Bereczky et al.  as a generalization of several well-known hierarchical multidimensional data structures such as K-d trees and quad trees. One of the interesting features of quad-K-d trees is that they provide a unified framework for the analysis of associative queries in hierarchical multidimensional data structures. In this paper we consider partial match, one of the fundamental associative queries, and prove that the expected cost of a random partial match in a random quad-K-d tree of size n is of the form Θ(n^α), with 0 < α < 1.
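For orientation, a classical instance of this form (from Flajolet and Puech's analysis, not a result of this paper): for random 2-dimensional K-d trees or quadtrees with one of the two coordinates specified, the expected partial match cost is

\[ \Theta\!\left(n^{\alpha}\right), \qquad \alpha = \frac{\sqrt{17}-3}{2} \approx 0.5616. \]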
In this paper we give a graph semantics to a fundamental part of the semantics of the service modeling language SRML: business configurations. To achieve this goal we use symbolic graph transformation systems. We formalize the semantics using such a graph transformation system and illustrate it with a simple running example of a trip-booking agent.
By graphical logics we mean logics whose formulas are not text but consist of graphs and graph morphisms, used to express graph properties. In this presentation, I will review previous work in this area and present the main problems found when extending the logic to allow for the specification of the existence of paths in given graphs. In particular, I will present a proof calculus for this extension that is shown to be sound and is conjectured to be complete.
Cortadella, J.; Lavagno, L.; López, P.; Lupon, M.; Moreno, A.; Roca, A.; Sapatnekar, S. IEEE International Conference on Computer Design p. 511-518 DOI: 10.1109/ICCD.2015.7357159 Presentation date: 2015-10-20 Conference presentation
The growing variability in nanoelectronic devices, due to uncertainties from the manufacturing process and environmental conditions (power supply, temperature, aging), requires increasing design guardbands, forcing circuits to work with conservative clock frequencies. Various schemes for clock generation based on ring oscillators and adaptive clocks have been proposed with the goal of mitigating the power and performance losses attributable to variability. However, there has been no systematic analysis to quantify the benefits of such schemes, and no sign-off method has been proposed for timing correctness. This paper presents and analyzes a Reactive Clocking scheme with Variability-Tracking Jitter (RClk) that uses variability as an opportunity to reduce power by continuously adjusting the clock frequency to the varying environmental conditions, thus reducing guardband margins significantly. Power can be reduced between 20% and 40% at iso-performance, and performance can be boosted by similar amounts at iso-power. Additionally, energy savings can be translated into substantial advantages in terms of reliability and thermal management. More importantly, the technology can be adopted with minimal modifications to conventional EDA flows.
Ponce de León, H.; Rodríguez, C.; Carmona, J.; Heljanko, K.; Haar, S. International Symposium on Automated Technology for Verification and Analysis p. 31-47 DOI: 10.1007/978-3-319-24953-7_4 Presentation date: 2015-10-12 Conference presentation
This paper presents a novel technique for process discovery. In contrast to the current trend, which only considers an event log for discovering a process model, we assume two additional inputs: an independence relation on the set of logged activities, and a collection of negative traces. After deriving an intermediate net unfolding from them, we perform a controlled folding giving rise to a Petri net which contains both the input log and all independence-equivalent traces arising from it. Remarkably, the derived Petri net cannot execute any trace from the negative collection. The entire chain of transformations is fully automated. A tool has been developed and experimental results are provided that witness the significance of the contribution of this paper.
Navarro, M.; Orejas, F.; Pino, E. Logic, Rewriting, and Concurrency: Festschrift Symposium in Honor of José Meseguer p. 539-561 DOI: 10.1007/978-3-319-23165-5_25 Presentation date: 2015-09-24 Conference presentation
José Meseguer is one of the earliest contributors to the area of Algebraic Specification. In this paper, which we are happy to dedicate to him on the occasion of his 65th birthday, we use ideas and methods from that area to present an approach for specifying the structure of classes of XML documents and for reasoning about them. More precisely, we specify the structure of documents using sets of constraints based on XPath, and we present inference rules that are shown to define a sound and complete refutation procedure for checking the satisfiability of a given specification using tableaux.
San Pedro, J. de; Carmona, J.; Cortadella, J. International Conference on Business Process Management p. 457-474 DOI: 10.1007/978-3-319-23063-4_30 Presentation date: 2015-09-03 Conference presentation
The visualization of models is essential for user-friendly human-machine interactions during process mining. A simple graphical representation helps convey intuitive information about the behavior of a system. However, complex systems cannot always be represented with succinct models that can be easily visualized. Quality-preserving model simplifications can be of paramount importance to alleviate the complexity of finding useful and attractive visualizations.
This paper presents a collection of log-based techniques to simplify process models. The techniques trade off visual-friendly properties with quality metrics related to logs, such as fitness and precision, to avoid degrading the resulting model. The algorithms, either cast as optimization problems or heuristically guided, find simplified versions of the initial process model, and can be applied in the final stage of the process mining life-cycle, between the discovery of a process model and the deployment to the final user. A tool has been developed and tested on large logs, producing simplified process models that are one order of magnitude smaller while keeping fitness and precision under reasonable margins.
Current process discovery techniques focus on the derivation of a process model on the basis of the activity relations extracted from an event log. However, there are situations where more knowledge can be provided to the discovery algorithm, thus alleviating the discovery challenge. In particular, the (partial) characterization of the independence or concurrency between a pair of activities may be well-known. In this paper we present POD, a tool for discovery of Petri nets that can incorporate this type of additional information. We believe requiring independence/concurrency information is very natural in many scenarios, e.g., when there is some knowledge of the underlying process. We show with one example how this extra information can effectively deal with problems such as log incompleteness.
Lizárraga, E.; Blesa, M.; Blum, C.; Raidl, G. International Symposium on Innovations in Intelligent SysTems and Applications p. 1-8 DOI: 10.1109/INISTA.2015.7276795 Presentation date: 2015-09 Conference presentation
The most strings with few bad columns problem is an NP-hard combinatorial optimization problem from the bioinformatics field. This paper presents the first integer linear programming model for this problem. Moreover, a simple greedy heuristic and a more sophisticated extension, namely a greedy-based pilot method, are proposed. Experiments show that, as expected, the greedy-based pilot method improves over the greedy strategy. For problem instances of small and medium size the best results were obtained by solving the integer linear programming model with CPLEX, while the greedy-based pilot method scales much better to large problem instances.
Ponce de León, H.; Carmona, J.; vanden Broucke, S.K.L.M. International Conference on Business Process Management p. 126-143 DOI: 10.1007/978-3-319-23063-4_8 Presentation date: 2015-09 Conference presentation
The discovery of a formal process model from event logs describing real process executions is a challenging problem that has been studied from several angles. Most of the contributions consider the extraction of a model as a semi-supervised problem where only positive information is available. In this paper we present a fresh look at process discovery where negative information can also be taken into account. This feature may be crucial for deriving process models which are not only simple, fitting and precise, but also good at generalizing the right behavior underlying an event log. The technique is based on numerical abstract domains and Satisfiability Modulo Theories (SMT), and can be combined with any process discovery technique. As an example, we show in detail how to supervise a recent technique that uses numerical abstract domains. Experiments performed with our prototype implementation show the effectiveness of the techniques and the ability to improve the results produced by selected discovery techniques.
Stewart, A.; Gabarro, J.; Keenan, A. Symbolic and Quantitative Approaches to Reasoning with Uncertainty p. 141-150 DOI: 10.1007/978-3-319-20807-7_13 Presentation date: 2015-07-16 Conference presentation
Uncertainty profiles are used to study the effects of contention within cloud and service-based environments. An uncertainty profile provides a qualitative description of an environment whose quality of service (QoS) may fluctuate unpredictably.
For example, the performance of an application running on a virtual machine can be affected both by the way in which resources are allocated at run-time and by hardware contention issues. The aim of this paper is to model the influence that cloud (or service-based) environments can have on an application's performance.
Uncertain environments are modelled by strategic games with two agents; a daemon is used to represent overload and high resource contention; an angel is used to represent an idealised resource allocation situation with no underlying contention.
Assessments of uncertainty profiles are useful in two ways: firstly, they provide a broad understanding of how environmental stress can affect an application's performance (and reliability); secondly, they allow the effects of introducing redundancy into a computation to be assessed.
Castro, J.; Gabarro, J.; Serna, M.; Stewart, A. European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty p. 129-140 DOI: 10.1007/978-3-319-20807-7_12 Presentation date: 2015-07-16 Conference presentation
A framework for assessing the robustness of long-duration repetitive orchestrations in uncertain evolving environments is proposed. The model assumes that service-based evaluation environments are stable over short time-frames only; over longer periods service-based environments evolve as demand fluctuates and contention for shared resources varies.
The behaviour of a short-duration orchestration E in a stable environment is assessed by an uncertainty profile U and a corresponding zero-sum angel-daemon game Γ(U).
Here the angel-daemon approach is extended to assess evolving environments by means of a subfamily of stochastic games. These games are called strategy oblivious because their transition probabilities are strategy independent. It is shown that the value of a strategy oblivious stochastic game is well defined and that it can be computed by solving a linear system. Finally, the proposed stochastic framework is used to assess the evolution of the Gabrmn IT system.
In this paper we show that the logical framework proposed by Becker et al. to reason about security policy behavior in a trust management context can be captured by an operational framework based on the language proposed by Miller in 1989 to deal with scoping and/or modules in logic programming. The framework of Becker et al. uses propositional Horn clauses to represent both policies and credentials, implications in clauses are interpreted in counterfactual logic, a Hilbert-style proof system is defined, and a SAT-based system is used to prove whether properties about credentials, permissions and policies are valid, i.e. true under all possible policies. Our contributions in this paper are threefold. First, we show that this kind of validation can rely on an operational semantics (derivability relation) of a language very similar to Miller's, which is very close to derivability in logic programs. Second, we establish that, as in propositional logic, validity of formulas is a co-NP-complete problem. And third, we present a provably correct implementation of a goal-oriented algorithm for validity.
This work is a follow-up of results given in . Here we present some computational complexity results for specific problems on simple games. For instance, we consider the complexity of determining trade robustness for a given simple game in the four natural explicit representations: winning, losing, minimal winning, and maximal losing. Our results show that the problem is solvable in polynomial time in some cases but NP-hard in others, depending on the input and the output. We also define the j-trade application for a given simple game and analyze how to find such a j-trade application in those natural forms of representation. We conclude by stating some conjectures and open problems. For instance, given a simple game, we consider how to compute the dimension and the codimension [2,3], and how to represent such a game by a union or an intersection of weighted games.
An opinion leader-follower (OLF) model is a two-action collective decision-making model for societies that include opinion leaders, who exert a certain influence over the decision of other actors; followers, who might be convinced to modify their decisions; and independent actors. We extended the OLF model by introducing two collective decision-making models associated with influence games : the oblivious and the non-oblivious influence decision models. The difference is that in oblivious influence models the initial decision of the actors that are neither leaders nor independent is never taken into account, while in the non-oblivious model it is taken into account when the leaders cannot exert enough influence to deviate from it. The satisfaction measure of an actor in a decision-making model is the number of the society's initial decisions for which the collective decision coincides with the initial decision of the actor. We show that computing the satisfaction measure is #P-hard in both oblivious and non-oblivious influence decision models. We present two subfamilies, for both types of influence decision models, in which the satisfaction measure can be computed in polynomial time.
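Schematically (notation ours, not the paper's): if D(x) denotes the collective decision reached from the vector x of the n actors' initial decisions, the satisfaction of actor i is

\[ \mathrm{sat}(i) \;=\; \bigl|\{\, x \in \{0,1\}^n : D(x) = x_i \,\}\bigr|, \]

a counting quantity, which is where #P-hardness enters the picture.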
We study a natural variant of the implicational fragment of propositional logic. Its formulas are pairs of conjunctions of positive literals, related by an implication-like connective; the semantics of this sort of implication is defined in terms of a threshold on the conditional probability of the consequent given the antecedent: we are dealing with what the data analysis community calls the confidence of partial implications or association rules. Existing studies of redundancy among these partial implications have so far characterized only entailment from one premise and entailment from two premises. By exploiting a previously noted alternative view of this entailment in terms of linear programming duality, we characterize exactly the cases of entailment from arbitrary numbers of premises. As a result, we obtain decision algorithms of better complexity; additionally, for each potential case of entailment, we identify a critical confidence threshold and show that it is, actually, intrinsic to each set of premises and antecedent of the conclusion.
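As a concrete illustration of the confidence semantics (a standard definition, with hypothetical numbers):

\[ \mathrm{conf}(A \to B) \;=\; \frac{\mathrm{supp}(A \wedge B)}{\mathrm{supp}(A)} \;=\; \Pr(B \mid A); \]

for instance, in a dataset of 100 transactions where 40 satisfy A and 30 satisfy both A and B, conf(A → B) = 30/40 = 0.75, so the partial implication A → B holds at every confidence threshold γ ≤ 0.75.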
Given an event log L, a control-flow discovery algorithm f, and a quality metric m, this paper faces the following problem: which parameters of f most influence its results in terms of m when applied to L? This paper proposes a method to solve this problem based on sensitivity analysis, a theory which has been successfully applied in other areas. Clearly, a satisfactory solution to this problem will be crucial to bridge the gap between process discovery algorithms and final users. Additionally, recommendation techniques and meta-techniques, like determining the representational bias of an algorithm, may benefit from solutions to the problem considered in this paper. The method has been evaluated over a set of logs and the flexible heuristic miner, and the preliminary results witness the applicability of the general framework described in this paper.
Process mining techniques rely on event logs: the extraction of a process model (discovery) takes an event log as input, the adequacy of a process model (conformance) is checked against an event log, and the enhancement of a process model is performed by using the data available in the log. Several notations and formalisms for event log representation have been proposed in recent years to enable efficient algorithms for the aforementioned process mining problems. In this paper we show how Conditional Partial Order Graphs (CPOGs), a recently introduced formalism for the compact representation of families of partial orders, can be used in the process mining field, in particular for addressing the problem of compact and easy-to-comprehend visualisation of event logs with data. We present algorithms for extracting both the control flow and the relevant data parameters from a given event log, and we show how CPOGs can be used for efficient and effective visualisation of the obtained results. We demonstrate that the resulting representation can be used to reveal the hidden interplay between the control and data flows of a process, thereby opening the way for new process mining techniques capable of exploiting this interplay.