Ongoing PhD Projects

Title: The effect of firms reporting to the Carbon Disclosure Project on their CO2 emissions
Student: Adéla Turková (Applied Statistics and Modelling Group)
Description: (An empirical study based on the synthetic control approach) Awareness of climate change is growing, and with it the sensitivity to “green” economics as a way to secure the future. Consequently, companies are being pushed to cut their CO2 emissions rapidly and significantly and to review their policies. The objective of our study is to assess the relevance of introducing green policies at the business level. For our analysis, we use a unique data set of firms’ CO2 emissions, which we built by adding several firm characteristics to an initial database provided by South Pole Company. Based on the companies’ specific characteristics, we were then able to select suitable treated and control groups. In our study we intend to evaluate whether signing up to the Carbon Disclosure Project, as one of the binding reporting standards, has a positive effect on firms’ emissions. This is a typical causal effect evaluation problem, which we solve using a relatively new method called the “synthetic control approach”, developed by Abadie (2001). This approach is an econometric method used for program or treatment evaluation. Almer and Winkler (2012) used this method on environmental questions, but to our knowledge it has never been applied to evaluate firms’ policies; we will use it to analyse an environmental programme at the company level. We chose this method because it allows researchers to analyse phenomena that occur in a limited population or that apply to only a small number of firms, which suits our research question well. We are able to present preliminary results of our study.
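As a rough illustration of the idea behind the synthetic control approach, the sketch below builds a "synthetic" firm as a weighted average of control firms, with non-negative weights that sum to one and are chosen to match the treated firm's emissions before treatment. The estimated effect is then the gap between the treated firm and its synthetic counterpart after treatment. This is a simplified sketch, not the study's implementation: it fits outcomes only (no covariates), uses plain projected gradient descent, and all function names and numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def synthetic_control_weights(treated_pre, donors_pre, steps=500, lr=0.01):
    """Weights minimising the squared pre-treatment gap (projected gradient descent).

    The step size lr is tuned for the toy data below, not a general default.
    """
    n = len(donors_pre)
    w = [1.0 / n] * n
    for _ in range(steps):
        synthetic = [sum(w[j] * donors_pre[j][t] for j in range(n))
                     for t in range(len(treated_pre))]
        residual = [y - s for y, s in zip(treated_pre, synthetic)]
        grad = [-2.0 * sum(r * x for r, x in zip(residual, donors_pre[j]))
                for j in range(n)]
        w = project_to_simplex([w_j - lr * g_j for w_j, g_j in zip(w, grad)])
    return w


def estimated_effect(treated_post, donors_post, w):
    """Treated outcome minus synthetic outcome for each post-treatment period."""
    return [y - sum(w_j * d[t] for w_j, d in zip(w, donors_post))
            for t, y in enumerate(treated_post)]


# Hypothetical emissions (arbitrary units); not data from the study.
treated_pre = [9.0, 11.0, 13.0]
donors_pre = [[8.0, 10.0, 12.0], [12.0, 14.0, 16.0]]
treated_post = [12.0, 12.0]
donors_post = [[14.0, 16.0], [18.0, 20.0]]

weights = synthetic_control_weights(treated_pre, donors_pre)
effect = estimated_effect(treated_post, donors_post, weights)
```

A negative entry in `effect` means the treated firm emitted less than its synthetic counterpart in that period.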
Title: Building a Knowledge Carrier Finder System
Student: Aleksandar Drobnjak (Information Systems Group)
Description: The aim of my research is to develop a Knowledge Carrier Finder System based on a Granular Knowledge Cube. This makes it possible to identify and suggest Knowledge Carriers to Seekers based on their degree of knowledge in particular domains. For the decision and suggestion process, an Adaptive Neuro-Fuzzy Inference System is used, which combines a learning algorithm with a Sugeno-type Fuzzy Inference Engine.
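To make the Sugeno-type inference step concrete, here is a minimal sketch of how such an engine turns fuzzy rules into a crisp score: each rule fires with a strength given by its membership degrees, and the output is the firing-strength-weighted average of the rule consequents. The two inputs, the membership functions and the rule parameters are all hypothetical; in an adaptive neuro-fuzzy system they would be tuned by the learning algorithm, which is not shown here.

```python
def low(x):
    """Membership degree of x in the fuzzy set 'low' on [0, 1] (assumed shape)."""
    return max(0.0, min(1.0, 1.0 - x))


def high(x):
    """Membership degree of x in the fuzzy set 'high' on [0, 1] (assumed shape)."""
    return max(0.0, min(1.0, x))


# Each rule: (membership for input a, membership for input b, (p, q, r)),
# with the Sugeno consequent z = p*a + q*b + r. Parameters are illustrative only.
RULES = [
    (low, low, (0.0, 0.0, 0.1)),
    (low, high, (0.0, 0.0, 0.4)),
    (high, low, (0.0, 0.0, 0.5)),
    (high, high, (0.0, 0.0, 1.0)),
]


def sugeno_score(a, b, rules=RULES):
    """Weighted average of rule outputs, weighted by rule firing strength."""
    numerator = 0.0
    denominator = 0.0
    for mf_a, mf_b, (p, q, r) in rules:
        strength = mf_a(a) * mf_b(b)  # product t-norm for the rule's AND
        output = p * a + q * b + r
        numerator += strength * output
        denominator += strength
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical inputs: a carrier's domain knowledge and availability, both in [0, 1].
score = sugeno_score(0.5, 0.5)
```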
Title: A granular knowledge cube
Student: Marcel Wehrle (Information Systems Group)
Description: I apply the concept of granular computing to knowledge structures. With this, I build a so-called Granular Knowledge Cube, whose aim is to hierarchically structure information from the Web. To achieve this, I developed an algorithm that is capable of granulating information. This framework gives people an instrument to enrich their knowledge in a specific domain.
Title: Efficient, Scalable, and Provenance-Aware Management of RDF Data
Student: Marcin Wylot (eXascale Infolab Group)
Description: The proliferation of heterogeneous semantic data on the Web requires RDF database systems to constantly improve their scalability and transactional efficiency. Despite recent advances in distributed RDF data management, processing large amounts of RDF data in a scalable way is still very challenging. In spite of their seemingly simple data models, RDF and RDFS actually encode rich and complex graphs mixing both instance and schema-level data. At the same time, users are increasingly interested in investigating or visualizing large collections of online data by performing complex analytic queries. The heterogeneity of Linked Data on the Web also poses new challenges to database systems. The capacity to store, track, and query provenance data is becoming a pivotal feature of modern triple stores. We tackled challenges arising from both of these areas: processing queries on big unstructured RDF graphs and on heterogeneous Linked Data. We introduced a new hybrid storage model for RDF data based on recurring graph patterns and RDF graph partitioning to enable complex cloud computation on massive webs of data. We also extended our techniques to efficiently handle the problem of storing, tracking, and querying provenance in RDF data.
Title: Properties of k-counting automata
Student: Samuel Vonlanthen (Foundations of Dependable Systems Group)
Description: ω-regular languages play an important role in linear-time temporal verification and can be represented using different frameworks such as Büchi automata or S1S. The subject of my thesis is to elaborate and investigate frameworks that describe languages beyond ω-regularity. The main focus lies on so-called k-counting automata: loosely speaking, Büchi automata equipped with k blind counters that can be influenced by any transition within a run and that make it possible to enhance regular Büchi acceptance with logical formulae expressing the counters' (un-)boundedness. The class of ω-regular languages is a proper subset of the class accepted by k-counting automata. So far, k-counting automata are known to be closed under Boolean operations (union, intersection and complement), the emptiness check has been proved decidable, and increasing the number of counters yields an infinite hierarchy of ω-languages.
Title: DIVAServices – A RESTful Web Service for Document Image Analysis Methods
Student: Marcel Würsch (Document, Image and Voice Analysis Group)
Description: In my research I try to build a bridge between computer science and the humanities. In my project I try to make document image analysis (DIA) algorithms easier to use for people without much domain knowledge.
For this I introduce DIVAServices, a RESTful web service API that makes it possible to access DIA algorithms over the web without any local installation. Further plans include building a tool for adding new algorithms semi-automatically and making it possible to run algorithms on big datasets.
Title: Entity-centric data processing
Student: Alberto Tonon (eXascale Infolab Group)
Description: «Using Semantic Web technologies together with IR and NLP techniques to provide information and to help people understand Web content.
In this context, I prefer to work on retrieving, elaborating and integrating data from different knowledge bases rather than on the way they are stored or transferred.»
Title: Quality of Service in Crowd-Powered Systems
Student: Djellel Difallah (eXascale Infolab Group)
Description: My research topic focuses on combining a) the intelligence of humans in solving complex problems and b) the scalability of machines in processing large amounts of data. In particular, I try to mix the two worlds by creating solutions to manage crowd workers efficiently and to deliver timely inputs to machine requests. In my research, I developed a hybrid human-machine system for entity extraction, disambiguation and linking to external knowledge bases. In my system, the crowd is leveraged to provide machine learning labels as well as to disambiguate cases where the machine-based approaches have low confidence.
Title: HisDoc 2.0: Towards Computer-Assisted Palaeography
Description: My PhD project “HisDoc 2.0” is positioned at the crossroads of computer vision (modelling how humans perceive and understand an image), pattern recognition (recognising patterns and irregularities in data), and palaeography (the study of historical handwriting). The objective of HisDoc 2.0 is the computational palaeographical analysis of historical manuscripts, more specifically the analysis of scripts and personal writing styles and the identification of writers. Existing approaches presume laboratory conditions (e.g. high-quality binarization or pre-segmented text regions) and focus on sub-tasks, treating interrelated tasks independently. In real-world applications, for example, reliable script analysis depends on the exact localization of text regions on a page, which, in the presence of various kinds of scripts or personal writing styles, in turn depends on discriminating scripts. HisDoc 2.0 will analyze, develop, and integrate methods for text localization, script discrimination, and scribe identification into one holistic approach in order to obtain a flexible, robust, and generic approach for historical manuscript analysis in the presence of complex layouts.
Title: Document Analysis and Representation
Description: My research topic is the analysis and representation of documents. I work on electronic (InDesign, PDF) and bitmap documents; extract various features about the content (text, image, graphics, fonts), layout, and visual appearance (script features such as outline orientation); and represent them as three-dimensional and colorbar visualizations. The potential applications cover the entire spectrum of document users: designers and publishers, readers, libraries and archives, historians, and computer scientists.
Title: Scalable HDFS storage for GPS trajectories
Student: Ruslan Mavlyutov (eXascale Infolab Group)
Description: My long-standing research interests lie in the fields of Natural Language Processing and Text Mining. Since 2004 I have been creating algorithms that solve tasks related to fact extraction, named-entity recognition, text classification and Web search. At the eXascale Infolab, my goal is to build solid expertise in modern Big Data infrastructures and leverage this knowledge for Text Mining tasks. My research topic is scalable low-latency data structures that reside in the Cloud (distributed tries, hash tables, interval indices, etc.).
Title: Search Assistance, Personalized Categorization and Visual Exploration of Big Data
Student: Artem Lutov (eXascale Infolab Group)
  • Multi-Scale Hierarchical Community Detection with Overlaps in Evolving Networks
  • Automatic building of the Ontology/Taxonomy for semantic datasets
  • Search and Navigation assistance in the ontology and underlying data
  • Community Structure Discovery and Clustering algorithms
  • Graph Editor for ontologies
Title: Collaborative visualizations for self-impacted data 
Student: Pierre Vanhulst (Human-IST)
Description: Commercial sensors are now common in people's lives: monitoring heart rate, quality of sleep or energy consumption is within everyone's reach. However, few people know how to get the best out of all these personal data. This thesis focuses on the way we interact with those data: how can we analyze, improve and measure our performance? The project aims to build new visualizations and systems that would allow groups of people to work together and make advanced findings.
Title: Template-Based Semi-Automatic Web Service Composition for Smart Environments (e.g. Smart Home)
Student: Abdaladhem Albreshne (Software Engineering Group)
Description: The presence of software systems in every aspect of our life results in increased requirements and expectations addressed by new development projects. Our research aims to propose an approach based on semantic description, generic process templates and ontologies in order to discover and compose services in a “Smart Environment” (e.g. Smart Home). Furthermore, our goal is to offer a partial automation of web service composition, with a human controller. We propose to enable customers to select and configure available services to meet their requirements and reach a specific goal.
Title: Complementation of Büchi Automata 
Student: Joël Allred (Foundations of Dependability Group)
Description: Automata on infinite structures, and in particular Büchi automata, which operate on sequences of infinite length, are used in formal verification to model reactive systems, namely systems with infinite running time. The principle of verifying a system S against a property P is to check whether the language accepted by the automaton A_S representing the system is included in the language accepted by the automaton A_P representing the property. The inclusion problem L(A_S) ⊆ L(A_P) can be translated into the emptiness problem L(A_S) ∩ L(A_P)^c = ∅, which involves a complementation operation. Büchi complementation is a PSPACE-complete problem, and it yields an automaton with 2^O(n log n) states.
The goal of the thesis is to develop a new complementation method that, despite having doubly exponential worst-case complexity, uses reasonable amounts of time and space and yields complement automata with a moderate blow-up in common practical cases.
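Once the complement automaton has been built and intersected with the system automaton, the verification question reduces to an emptiness test on the resulting Büchi automaton: its language is non-empty exactly when some accepting state is reachable from the initial state and lies on a cycle. The sketch below shows only this final test on a hypothetical, already-constructed product automaton; the complementation itself, which is the subject of the thesis, is not shown, and transition labels are omitted because they do not matter for emptiness.

```python
def reachable(graph, sources):
    """All states reachable from the given sources (sources included)."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        state = stack.pop()
        for successor in graph.get(state, ()):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def buchi_is_empty(graph, initial, accepting):
    """True if no infinite run from `initial` visits an accepting state infinitely often."""
    from_initial = reachable(graph, [initial])
    for state in accepting:
        if state not in from_initial:
            continue
        # States reachable from `state` in at least one step; if `state` is
        # among them, it lies on a cycle and can be visited infinitely often.
        if state in reachable(graph, list(graph.get(state, ()))):
            return False
    return True


# Hypothetical product automata: state -> set of successor states.
violating = {"q0": {"q1"}, "q1": {"q1"}}  # accepting state q1 has a self-loop
safe = {"q0": {"q1"}, "q1": set()}        # accepting state q1 is a dead end

property_violated = not buchi_is_empty(violating, "q0", {"q1"})
property_holds = buchi_is_empty(safe, "q0", {"q1"})
```

This quadratic reachability check is chosen for readability; nested depth-first search or SCC-based algorithms perform the same test in linear time.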
Title: Diva-DIA - An Integrated Framework for Historical Document Image Analysis
Description: We propose Diva-DIA, a novel framework for Historical Document Image Analysis. Diva-DIA aims to help users easily produce large amounts of ground-truth document images. It will contain a set of basic image processing algorithms. Furthermore, different kinds of features will be available to aid recognition, and various types of classifiers will be integrated. The annotation result will be visualized in Diva-DIA. A user-friendly interface will be available for manual error correction and for tuning the classifier parameters to improve the annotation. We propose employing online learning for ground-truth creation, that is, using existing and generated ground-truth data validated by the user to produce new ground truth.
Title: Support of Human Activities in a Pervasive Environment
Description: The goal of the Pervasive Coordination (PeCo) project is to develop a coordination model for pervasive and smart environments. The model is implemented in our middleware framework, which can be used to:
  • design pervasive top level applications
  • design context-, motion- and activity-aware HMI
  • provide pervasive services in a smart environment by the use of calm and shy interfaces
  • coordinate human and machine tasks
Title: LOCUS - Local Communities and Online Tools in Syria
Description: With the increasing spread and availability of computation and the internet, Information and Communication Technology (ICT) holds promising potential for improving the realities of underprivileged communities. Our position is that development is served by contributing to the empowerment of local communities. Local community members experience local problems from a first-hand perspective, and they have direct cultural and working knowledge of the specific context of their situation. Therefore, we believe that enabling local communities to collaborate and take action, aided by innovative ICT tools, is a promising approach towards development. The project focuses on observing the communal use of online tools in Syria, and specifically on learning how different tools are appropriated by community members to support community activities and growth. The study shall inform the design of new tools, which we aim to develop through close collaboration with the target community. Ultimately, these tools are intended to support community building by fostering participation, capacity building and collaboration.
Title: Entity-based Knowledge Extraction and Discovery Systems
Student: Roman Prokofyev (eXascale Infolab Group)
Description: The modern world is built around knowledge and it is impossible to think without knowledge. As the volume of accumulated knowledge continues to grow, the need for innovation in the domain of knowledge extraction and discovery is stronger than ever. This is equally important for both general and domain-specific technical knowledge, such as physics or computer science. While there are many ongoing efforts to provide structured storage for such domain-specific data, a large part of it still remains in unstructured form (e.g. text). My research concentrates on bringing together existing methods and developing new ones to provide effective knowledge extraction from unstructured technical data, knowledge processing and discovery. Specifically, the focus is on performing named-entity recognition, identifying semantic relationships between those entities, and providing effective document discovery based on the extracted entities.
Title: Signal Processing and Modelization for Time Series in the Energy and Medical Domain
Description: The topic of my Ph.D. thesis is in the Smart Living domain. The aim of my thesis is to analyse and process data coming from sensors placed in a Smart Environment and from wearable sensors in order to optimize energy consumption in buildings while preserving human comfort. The analysis includes the use of machine learning algorithms, in particular state-based models such as Hidden Markov Models.
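As an illustration of how a state-based model such as a Hidden Markov Model can be applied to sensor data, the sketch below uses the Viterbi algorithm to recover the most likely sequence of hidden states (here, whether a room is occupied) from a sequence of coarse sensor readings. The states, observation symbols and all probabilities are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for the given observations."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical model: room occupancy inferred from a power-meter reading.
STATES = ["absent", "present"]
START = {"absent": 0.5, "present": 0.5}
TRANS = {
    "absent": {"absent": 0.9, "present": 0.1},
    "present": {"absent": 0.1, "present": 0.9},
}
EMIT = {
    "absent": {"low": 0.9, "high": 0.1},
    "present": {"low": 0.2, "high": 0.8},
}

decoded = viterbi(["low", "low", "high", "high"], STATES, START, TRANS, EMIT)
```

For long sequences the products of probabilities underflow, so a practical version would work with log-probabilities instead.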
Title: FEOGARM - A Framework to Evaluate and Optimize Gesture Acquisition and Recognition Methods
Student: Simon Ruffieux (HES-FR and Document, Image and Voice Analysis Group)
Description: The goal of this thesis is to provide a powerful methodology and a supporting framework to researchers working in the field of upper-body gesture recognition, so that they can quantitatively compare the efficiency of their tracking and recognition algorithms as independently as possible of the technologies and devices used.
Title: An Architecture for the Web of Things
Student: Andreas Ruppen (Software Engineering Group)
Description: Technological advances allow for cheap devices with networking capabilities. These devices make the Internet of Things possible. However, there is no standard way of communicating between these devices. This makes building applications that rely on devices of different owners difficult and beyond the reach of most users. The Internet of Things is more a network of islands with very little communication between the different islands. The Web of Things (WoT) tries to bring standards into this world, thus allowing simpler creation of mashup applications and the integration of things into the Web as first-class citizens.
On the other hand, industry has adopted WS-* and SOA approaches for computationally heavy and B2B applications. At first glance they have nothing in common with the WoT. However, by combining the two we can bring the power of today's SOA infrastructure to the WoT. This starts with simple computational services like unit conversion and ends with full-blown business processes. Conversely, this combination allows business processes to communicate with smart devices and integrate them into their models, which in turn makes the models richer.
Title: Mid-air hand gesture human-machine interaction: design, recognition and evaluation
Student: Matthias Schwaller (Document, Image and Voice Analysis Group)
Description: Nowadays, touch screens are very common. Users like the very intuitive way of using them (iPhone, Android, etc.), mainly because no new knowledge is needed to operate them. We would like to build on this simplicity and move to the next generation of gestural interaction, namely mid-air hand gestures. The goal of this thesis is to explore mid-air hand gestures for human-computer interaction. For this purpose, the research concentrates on the design, development and evaluation of mid-air hand gesture recognizers for different kinds of gestures. Furthermore, the thesis will explore the use of various forms of feedback to improve usability and precision, and also to recover from recognition errors. Finally, our evaluations will concentrate not only on the performance of the recognizer systems, but also on usability and effort metrics, so that gestures can be performed over long periods while avoiding movements that are tiring or uncomfortable for users.
Title: Printed Multi-Font and Multi-Size Arabic Word Recognition at Ultra-Low Resolution
Student: Fouad Slimane (Document, Image and Voice Analysis Group)
Description: The objective of this thesis is to develop a multi-font and multi-size recognition system for Arabic text images at ultra-low resolution that can be extended easily to the recognition of Arabic handwritten or scanned documents. The system is based on hidden Markov models and Gaussian mixture models. A further goal is to assemble a wide vocabulary in order to develop an open-vocabulary recognition system that can recognize any Arabic word. The system is benchmarked on the Arabic Printed Text Image (APTI) database.
Title: HisDoc - Historical Document Analysis, Recognition, and Retrieval
Description: The project brings together three research groups working in the fields of document image analysis, handwriting recognition, and information retrieval, respectively. It aims to offer tools that support cultural heritage preservation by making historical documents, particularly medieval documents, electronically available via the Internet. It intends to propose generic tools that can be adapted, without effort, to other types of documents and languages. My work is to use machine learning methods to segment historical document images. The methods I am using include support vector machines and neural networks, among others.