# 2nd Estonian Summer School in Computer and Systems Science (ESSCaSS'03) II Eesti Arvuti- ja Süsteemiteaduse Suvekool (EASTS'03)

Taagepera Castle, August 10-14, 2003.

## Student Talks

### Master Thesis Work in Progress: Practical Mobility in Python Based on the Join-calculus

#### Jørgen Hermanrud Fjeld

References
[1] Join-calculus, C. Fournet and G. Gonthier. The Join-calculus: a language for distributed mobile programming, 2000, http://join.inria.fr/
[2] Python, Guido van Rossum. The Python language. http://www.python.org/
[5] Pyro, http://pyro.sourceforge.net/

### Heuristics for Speeding Up Agglomerative Hierarchical Clustering

#### Meelis Kull

The main idea of clustering is to split a set of objects into a few classes so that similar objects appear in the same class. Hereby we run into several problems. First we have to define what similarity actually means. But more serious problem is how to determine the appropriate number of classes. This is a matter of taste in some sense and any solution can be considered correct. Though, some solutions conform better to our intuitive understanding about what a cluster is. Hierarchical clustering faces the problem quite differently. It does not specify the number of classes and gives answers for all possible numbers at the same time. The result of hierarchical clustering is a tree that can be cut at different heights to obtain any number of classes. The resulting tree can also be very useful when visualising the data containing complex objects, for example high-dimensional vectors.

Hierarchical clustering is accomplished in two steps. Firstly, the distance between every pair of objects is calculated (small distance means similar objects). Secondly, the clustering is performed starting from the leaves of the tree (every object is a cluster) and constructing the tree by connecting iteratively the two nearest clusters to a new bigger cluster.

Hierarchical clustering can become quite time consuming, when the number of objects exceeds the order of ten thousand. Things get even worse, when the data is high-dimensional and the calculation of the distance between two objects takes a considerable amount of time. Unfortunately, this is the case in some applications, for example when clustering the genes according to their expression levels in different experiments.

Many efforts have been made to speed up the clustering process, but when the data is high-dimensional then the calculation of distances becomes a bottleneck. One solution could be calculating only the distances between some random pairs of objects. Unfortunately, this reduces the quality of the results because some of the similar objects are not found to be similar. Quality can be enhanced using an algorithm which finds more pairs of similar objects than the random algorithm would find with the same number of distances calculated. The presentation will be about speeding up hierarchical clustering by making use of this algorithm in the process of calculating distances.

### Feature Matching in Model-Based Software Engineering

#### Alar Raabe

Today the business processes become more and more dependent on the software, and must change very rapidly in response to the market changes. Initial results from software development should be delivered with a very short delay, and have to be deployable with minimal costs. When the business volume grows, or the business processes change, supporting software systems must be able to grow and change along, without impeding the business process, and without a major reimplementaion effort. The achievement of different non-functional requirements (e.g. qualities of service), required for business information systems, requires different implementation technologies, which themselves are rapidly evolving, to be used and combined.
As a result, there is a growing need to make business information systems development cycle shorter, and independent of underlying technologies, which themselves often evolve without offering backward compatibility. Therefore main body of reusable software assets of an enterprise should be independent of implementation technologies, and from them the implementation artifacts should be automatically synthesized.
Model driven synthesis of software (e.g. OMG MDA), where main developed artifacts are the implementation technology independent models of required software, deals with these problems.
In the article we describe a method, applicable for synthesising business software implementations from technology independent business models. This method is based on establishing a common feature space for problem and solution domains for the business information systems, and on the extended meta-models and separate step of solution domain analysis, described in the previous articles. In the proposed method synthesis of business software implementation from the technology independent business analysis model is performed in two steps.
First a solution domain or software architecture style is selected by matching the explicitly required features of given software system and implicitly required features of given problem domain to the features provided by the software architecture style. Next all the elements of given business analysis model are transformed into elements or sets of interconnected elements of selected architecture style, by matching their required features to the features provided by the elements of selected architecture style. During this step the structure of common feature model drives the decomposition of implementation.
In both steps it is possible to define the cost functions for selecting between different alternatives, which provide same features. Presented method is applicable to OMG MDA for transformation or mapping of platform independent model (PIM) to platform specific models (PSMs). The difference of our method from other domain specific and model-based methods is the separate step of solution domain analysis, and using common feature space to select architecture style and specific implementations.

### Towards Extendable UML and Integrating UML Modeling Tools into MDA Suites

#### Darius Silingas

Model Driven Architecture (MDA) is a new way of developing applications based on implementation platform independent model. MDA specification includes a suite of various standards (UML, XMI, MOF, CWM), where UML is a key technology enabling to define platform independent models of applications. Therefore MDA suite tools require a very good support for recent UML specification. Since UML has been widely used for a number of years, multiple UML tools have been developed: Rational Rose, TogetherJ, MagicDraw UML and many others. Therefore MDA suite vendors are able to make use from these products and integrate them into MDA suites rather than implementing UML themselves. Support for open API is essential for integration, but it is not a big problem since most UML tools provide it. A more severe problem is that most UML tools provide just partial support of standard UML features, while MDA need a very good support for all standard UML features. MDA suites also need support for UML profiles that can be implemented based on stereotypes, which is a standard UML extension mechanism. Every single profile can be reused among different models (partitioning functionality of MagicDraw). MagicDraw allows assigning several stereotypes to model elements. Graphical representation for different stereotypes should be customizable. An effective way to cope with this is a capability to define icons and tag definitions for stereotypes. Icons can specified in scalable image format (svg). As a sample for demonstrating these features, we provide a web application profile. Another issue that is very important for performance reasons is capability to split model into multiple files. Since MDA suites usually work with large models, this is necessary for several reasons:
a) reusability of model part;
b) teamwork support (another way to solve this problem is Teamwork Server);
c) scalability of model ? working on smaller part requires less system resources. Having solved the above mentioned issues - a good support for standard UML, a support for UML profiles based on stereotypes with customizable graphical presentation, and capability to partition model into multiple files - UML tools can be successfully integrated into MDA suites.

### Fault Tolerance in MPI parallel programs

#### Konstantin Skaburskas

Current HPC systems increase in size with higher potential levels of individual node failure. Thus, the need of having fault tolerant implementations of long-running parallel computational programs(packages) rises. Massage Passing Interface (MPI) is a straightforward and effective way for writing parallel programs for distributed computer. Currently, from the relationship between MPI Standard, its implementations and a parallel program it turns out that the fault tolerance must be a property of the latter. MPI offers the programmer a flexibility by specifying master/slave and client/server models for dynamic process management. Attempt to achieve some degree of fault-tolerant behavior by using dynamic process model of MPI in implementation of DOUG computational package is discussed. We expect that given approach can also be utilized for computations on Grid in the future. We give also an overview of other methods of fault tolerant MPI programs.

### Fuzzy-information-Based Risk Assessment

#### Aleksandrs Vali&shacek;evskis

The results that I would like to present are that of my Master’s Thesis, which I will be defending in the beginning of June. In my Thesis I propose a methodology for fuzzy-information-based risk analysis. In other words, we propose a methodology, which allows us to evaluate risk of alternatives in a case when the latter are described using constructions called fuzzy granules. Entropy is used as a measure of risk in our approach. Thus, first we present the risk analysis task in information theoretic form. We show how we can describe each alternative with the help of a set of systems. Each system has a set of states. Moreover, for each state we know what is the probability that the system is in this state. Hence, we can calculate entropy for these systems. However, the probabilities are interval, so we present a possible way of generalizing Shannon’s entropy to the case when probabilities are interval.
Roughly the methodology proposed can be divided into four stages. Firstly we have to determine a set of criteria that are of interest for us. Secondly, we describe each alternative with the help of fuzzy granules. Thirdly, we present our task in information theoretic form. We do it by presenting different criteria as different systems and different values of criteria as different states of these systems. What is particularly important is that we can evaluate what is the probability that a criterion will take some value, which is equivalent to the probability that a system, corresponding to the criterion, will be in a state, corresponding to the value. Fourthly, we evaluate entropy for each system obtained, which is considered as an evaluation of risk.
We can use this risk assessment approach in cases when we the information that is available about the alternatives is incomplete or imprecise. Actually, our approach towards risk analysis is based on measuring how little we know about each alternative. However, this approach is not meant to be a substitute for other risk evaluation methods, as the idea behind risk in our approach is not the same as, for example, in dispersion-based risk assessment.

### Computer-aided Modeling and Control Design of a Laboratory Scale Helicopter

#### Xianghua Wei

Helicopter toy is a laboratory and research pilot manufactured by Quanser Consulting, Inc. [3]. The pilot provides a wealth of control system design and analysis features that make it very useful for education and research purposes. The mathematical model of the helicopter exhibits natural nonlinearity, instability and significant cross-coupling between three control channels. The range of possible laboratory research works covers such areas as computer-aided study, design, modeling, and analysis of dynamical systems, building the nonlinear mathematical models, linearization, simplification, identification, state feedback design etc. In this paper, a nonlinear dynamic mathematical based on the physical system is derived via Lagrange dynamics equations and Newton?s second law. Because the model helicopter has significant cross-couplings, a state-space multi-input multi-output (MIMO) model is preferred, a appropriate linearized model is generated to take place of the original nonlinear model. Both of those two modeling approaches are experimentally identified. In addition a LQR controller was used to tune for optimal pilot flight which can be achieved by simulation and its performance was also evaluated.

### Modeling and Analyzing Railway Systems Using Coloured Petri Nets

#### Ingrid Chieh Yu

Coloured Petri Nets is a modeling language, used for systems where synchronization, communication, and resource sharing are important. Design/CPN is a graphical computer tool supporting the use of Coloured Petri Nets. The tool supports construction, simulation and functional and performance analysis of CPN models.
A Coloured Petri Net (CP-net or CPN) model is a description of the modeled system, and can be used as a specification of a system to be built or as a representation of an existing system. The behaviour of a CP-net model can be analyzed by means of informal and formal methods. By creating a CP-net model, we can investigate a new system before we construct it. This is an advantage since design errors in some systems can be expensive to correct and may even jeopardize security. Through the process of creating a description and performing an analysis, a model designer can also achieve a better understanding of the modeled system.
Railway systems (RWS) are complex and large concurrent systems. They consist of different components; track arrangement, signaling equipment, locomotives etc. which represent the "hardware" of a railway system whereas the operating rules and procedures for a safe and efficient operation may be seen as the "software" of a railway system. Since CP-nets have a semantic which builds upon true concurrency and can be extended with time concept, all conceivable behaviour of the railway system can be modelled and simulated in a natural manner. The performance of the system can be investigated. The elements of RWSs can be constructed using CPN because of its generalicity, and by using hierarchies we can model the RWS in a tractable way. By using CPN we may model both the "hardware" and the "software" aspects of a railway system. I will show how such systems can be modelled using Petri Net components with a graphical representation. The ability to visualize structure and behaviour of an RWS model promotes understanding of the system, and the model of an RWS may be executed and the dynamic behaviour can be observed graphically.
Simulation is often used in the design phase and during the early investigation of system design. This can reveal design errors at an early stage, but can never sufficiently prove the correctness of a railway system. Safety, deadlock and fairness are some of the properties we would like to analyze. To prove and validate a RWS CPN model, we can use more formal analysis methods like State Space Analysis and Place Invariants.
State Space Analysis, also called occurrence graphs or reachability graphs, are often complemented with simulations. The basic idea underlying state spaces is (in it's simplest form) to compute all reachable states, all possible occurrence sequences and state changes of the system, and represent these as a directed graph. The main problem of state space analysis is the size of the state spaces. The use of symmetries in RWS models may reduce the size and complexity in the analysis of railway systems. Such techniques, along with showing how an RWS can be modeled using CP-Nets are the main objects of this lecture.

## Student Posters

### Obuscation

#### Anna Brikmane

Regarding modern platform-independent software projects, unauthorized reverse-engineering has become an acute problem because of facile analysis of the code decompiled into the original programming language. Obfuscation is suggested among other methods of avoiding the situation and making such analysis as costly as building new application from scratch. Or maybe software developers are being mislead and proposed obfuscator solutions do not lead to the desired result? The presentation draws contemporary obfuscation fruit and attempts to answer the question as stated above.

### Approaches for modeling adaptive workflows

#### Carsten Eichholz

[1] Dittmar, Anke: "Ein formales Metamodell für den Aufgabenbasierten Entwurf interaktiver Systeme", PhD. Thesis, University of Rostock, Department of Computer Science, 2001
[2] Jablonski, S.; Böhm, M.: "Workflow Management - Entwicklung von Anwendungen und Systemen", dpunkt-Verlag, 1997.

### Shared resources: "5 hungry philosophers problem"

The management of shared resources in the systems sensitive to time is tried to be analysed in multy-agent environment - JADE. As a model for that – “5 hungry philosophers” problem (firstly formulated by Deikstra) was taken to be investigated. Classical description of the “5 hungry philosophers” problem assumes that there are five philosophers who work and feel hungry from time to time. When a philosopher feels hungry he goes to the table where there are 5 plates with rise and only one stick between neighbour plates. One of the possible (worst) states of such system is achieved when all 5 philosophers go to the table simultaneously, take one stick each to eat rise and no one can eat (we also assume that to be able to eat rise person needs 2 sticks).

As there is no need to be strictly followed by classical description, we have slightly modified the task and refused from ?sticks?. Instead we have table with maximum 3 simultaneously available free places for 5 philosophers. More parameters can be added to the system in order to simulate some real object. For instance, the level of unconsciousness can be added to the philosopher to represent the state when he is unable to go for the lunch by himself and external help (like hospital) is needed.

There are many goals which we may achieve solving this problem. How to increase time the philosophers are working? How to avoid the worst situation (when all philosophers die)? How to keep satisfaction of the philosopher at the maximum level?

In the The main goal is to build up the system to be capable to investigate the behaviour of the agents under different conditions. Since the environment is represented also by the agent it is possible to model not only static but also highly dynamic environment. Basically agents are not capable to learn and don?t have the memory of passed events, but the system can be extended to do that in the future.

### Modelling of co-operation in an organisation using a multi-agent system

#### Raul Savimaa

Formal organisations are designed for fulfilment of certain tasks. Often the outcome does not correspond to the desires of the stakeholders. There can be multiple reasons for this. In many cases there is missed out that the members of the organisation are real people with own interests, desires, and views. Usually only work processes and general capabilities of employees are considered during planning, personal interests are not modelled and therefore the resulting behaviour is predicted on partial basis.

The current paper analyses the key issues for modelling of co-operation between actors in an organisation. For this an organisation is considered as a multi-agent system and its employees as well as structural units are presented as agents. This paper also illustrates an implemented example of agent co-operation and interactions, based on a multi-agent system.

The organisations, considered in the paper, must be able to fulfil its tasks in a dynamic environment. Resources (personnel, technical, etc.) are used to achieve several sub-goals. Involvement of employees plays great role in organisation efficiency. Each individual have to choose in which amount to contribute to the production of a common good and to which amount to satisfy own interests according to personal motivations, attitudes, etc. Methods of co-operation, considered here, are coordination of actions and collaboration by sharing tasks and resources. Co-ordination of actions is necessary for managing a group of agents on order to make them together to act for fulfilment of a task. Collaboration by sharing tasks and resources is common method of co-operation in human organisations. In this case several agents work together on a common task. Tasks and resources can be shared centrally or in decentralised way.

An agent-oriented approach is suggested for modelling of previously described organisations. An organisation can be seen as a multi-agent system. Employees and/or structural units are modelled as agents. An agent (an employee, a unit, or the organisation) have methods that can be seen as job or work processes for that agent (or multiple agents). Co-operation between agents is implemented by sending messages and performing co-operative actions.

An example of a multi-agent system is introduced and demonstrated. The system consists of 6 agents, implemented in the JADE environment (see JADE, 2003). The agents has own tasks and responsibilities and they can communicate with each other by sending messages. One agent has more sophisticated behaviour that depends on its motivations and current internal state. The example illustrates how attitude and readiness for co-operation determines the actual outcome of the work processes. The general notice is that if personal characteristics as motivation, attitude, etc are modelled, the outcome is more realistic: work processes are performed more slowly and actual behaviour with different disturbing factors will come out. The concrete agents used in the example are composed on the basis of described work processes. The description of work processes is made using use case diagrams and activity diagrams in the UML. An agent class diagram is developed from a use case diagram. Sequence diagrams are used to model inter-agent behaviour and interaction protocols. Because using JADE, the executable code must be implemented manually and looked by modeller that it corresponds to the UML diagrams.