Rashedur Rahman joined SystemX in 2015 to work on his thesis on knowledge base population based on entity graph analysis. He looks back on his PhD carried out within SystemX and supervised by LIMSI.
What is the subject of your thesis?
My thesis subject is automatic Knowledge Base Population (KBP) from texts. A Knowledge Base (KB) is a structured collection of information that represents facts about the world. A fact in a KB is a relationship between two real-world entities. A KB gives us precise information about different types of entities such as persons, organizations and locations, and thus facilitates factual question answering. KBP is the task of constructing a KB. Since the amount of textual information on the web keeps growing, the KBP task has to be done automatically by extracting new facts (relations between two entities) from free text. My PhD thesis focuses on validating the relation hypotheses generated by multiple systems for the KBP task. I treat relationship validation as a binary classification task: each hypothesis is either accepted or rejected. The objective is to discard a large number of false relations while keeping the correct ones, in order to construct a more accurate KB.
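To make the framing concrete, here is a minimal Python sketch of relation hypotheses and their validation as a yes/no decision. It is an illustration only and is not taken from the thesis: the `Fact` type, the system names, the example facts and the simple vote-counting rule are all assumptions, standing in for the feature-based classifier described in the interview.

```python
from collections import Counter
from typing import NamedTuple


class Fact(NamedTuple):
    """A relation hypothesis: two entities linked by a relation (hypothetical type)."""
    subject: str
    relation: str
    obj: str


def validate_by_votes(hypotheses_per_system, min_votes=2):
    """Label each hypothesis True/False using a naive vote count.

    This is a placeholder baseline, not the thesis method: a hypothesis is
    accepted when at least `min_votes` systems proposed it.
    """
    votes = Counter()
    for hypotheses in hypotheses_per_system.values():
        votes.update(set(hypotheses))  # each system votes once per fact
    return {fact: count >= min_votes for fact, count in votes.items()}


# Toy input: hypotheses produced by three imaginary extraction systems.
curie_spouse = Fact("Marie Curie", "spouse", "Pierre Curie")
curie_born = Fact("Marie Curie", "born_in", "Paris")
hypotheses_per_system = {
    "system_a": [curie_spouse, curie_born],
    "system_b": [curie_spouse],
    "system_c": [curie_spouse],
}

labels = validate_by_votes(hypotheses_per_system)
# Only the accepted hypotheses would be added to the knowledge base.
accepted = [fact for fact, is_valid in labels.items() if is_valid]
```

In a real setting the vote count would be only one signal among others; the point here is simply that validation maps each hypothesis to an accept or reject label.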
I explored how to use the surrounding (neighbor) entities in the relationship validation task. A neighbor of a given entity is an entity that co-occurs with it in a sentence. The neighbors of the two entities in a relation hypothesis give clues as to whether the pair really holds the claimed relationship. A graph is a convenient way to represent different types of entities and their co-occurrence relationships in texts. An entity graph is constructed from a text corpus and is used to identify the neighbors of a given entity. I proposed several features computed on the entity graph to decide whether two entities are in a true relationship. In addition, some linguistic features and trustworthiness features are taken into account for validating a claimed relation. The proposed graph-based features have shown a great impact on the KBP task.
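The neighbor idea can be sketched in a few lines of Python. This is a hedged illustration, not the thesis implementation: it assumes entity mentions have already been recognized in each sentence, the example sentences are invented, and the Jaccard overlap of neighbor sets is just one plausible graph feature chosen for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_entity_graph(sentences):
    """Build an undirected co-occurrence graph.

    `sentences` is a list of entity lists, one per sentence. Two entities
    are neighbors if they appear together in at least one sentence.
    """
    graph = defaultdict(set)
    for entities in sentences:
        for a, b in combinations(set(entities), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shared_neighbor_score(graph, e1, e2):
    """Jaccard overlap of the two entities' neighbor sets (assumed feature).

    The two entities themselves are excluded so that the score reflects
    only the surrounding entities.
    """
    n1 = graph.get(e1, set()) - {e2}
    n2 = graph.get(e2, set()) - {e1}
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


# Toy corpus: each inner list holds the entities found in one sentence.
sentences = [
    ["Marie Curie", "Sorbonne", "Paris"],
    ["Pierre Curie", "Sorbonne", "Paris"],
    ["Marie Curie", "Pierre Curie", "Nobel Prize"],
    ["Albert Einstein", "Princeton"],
]

graph = build_entity_graph(sentences)
related = shared_neighbor_score(graph, "Marie Curie", "Pierre Curie")
unrelated = shared_neighbor_score(graph, "Marie Curie", "Albert Einstein")
```

On this toy corpus the first pair shares all of its neighbors while the second shares none, which is the kind of signal a classifier could combine with linguistic features.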
What do you particularly remember about your PhD?
During my PhD, I learned how to define a research problem and how to stay focused on a specific research topic. I also learned how to prepare the resources needed for research work. Moreover, I had the opportunity to meet people from both academia and industry at SystemX and at conferences, which helped me understand how academic research is directed towards an industrial product.
What is your fondest memory of your time at SystemX?
I do not have one particular memory but several great ones. In particular, I recall the IMM (Multimedia Multilingual Integration) project team, which was very welcoming and cooperative. At the beginning of my PhD, the IMM team members helped me learn about the whole project and prepare the resources required for my research work. I also enjoyed ThesisDay@SystemX, which gave me an opportunity to present my work and to meet people from different domains. Moreover, I found a nice environment at SystemX, where all the people have beautiful minds.
What do you plan to do next?
I have just started working at LIMSI, CNRS as a postdoctoral researcher. Here I work on biomedical data analysis for the GoAsq project. It is a very interesting task because it builds a bridge between information science and medical science. Moreover, it offers me the opportunity to contribute to human lives through advanced research.
Find out more about Rashedur Rahman
Thesis subject: Knowledge Base Population based on Entity Graph Analysis
Last degree before PhD: Master of Engineering in Information Technology (Frankfurt University of Applied Sciences, Germany)