Personalized Text Summarization

Róbert Móro, Supervisor: Prof. Mária Bieliková



Information overload is one of the most serious problems of the present-day Web. There are various approaches addressing this problem; we are interested mainly in two: automatic text summarization and personalization.

Automatic text summarization aims to extract the most important information from a document, which helps readers (users) decide whether the document is relevant to them and whether they should read the whole text. However, conventional (generic) summarization methods summarize the content of a document without considering differences among users, their needs or characteristics. Personalized summarization, on the other hand, uses this additional information about users’ characteristics to produce summaries better suited to a particular user’s needs.

Our Approach

We have proposed a method of personalized summarization based on latent semantic analysis (LSA). The main idea of our approach is to identify sources of adaptation and personalization and to combine them to produce personalized summaries that extract the information from the document that is most important or interesting for a particular user. For this purpose, we have proposed a set of specific raters and a method of combining them, which allows various parameters or the context of the summarization to be taken into account.
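The idea of combining raters before applying LSA can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the text: each rater assigns a score to every term, the raters are combined by a weighted sum, and sentences are scored by the length of their vectors in the latent space (one common LSA summarization variant, not necessarily the one used here). The rater names "frequency" and "annotations", the function names and the toy sentences are all hypothetical.

```python
import numpy as np

def combine_raters(term_scores_by_rater, rater_weights):
    """Weighted sum of per-rater term scores (assumed combination scheme)."""
    combined = {}
    for rater, scores in term_scores_by_rater.items():
        weight = rater_weights.get(rater, 0.0)
        for term, score in scores.items():
            combined[term] = combined.get(term, 0.0) + weight * score
    return combined

def lsa_sentence_scores(sentences, term_weights, k=2):
    """Score tokenized sentences by their vector length in a k-dimensional latent space."""
    vocab = sorted({t for sent in sentences for t in sent})
    row = {t: i for i, t in enumerate(vocab)}
    # Term-by-sentence matrix; each cell holds the combined weight of the term.
    A = np.zeros((len(vocab), len(sentences)))
    for j, sent in enumerate(sentences):
        for t in sent:
            A[row[t], j] = term_weights.get(t, 0.0)
    _, sigma, vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(sigma))
    # Sentence vectors in the latent space, scaled by the singular values.
    return np.sqrt(((sigma[:k, None] * vt[:k, :]) ** 2).sum(axis=0))

def summarize(sentences, term_weights, n=1, k=2):
    """Return the indices of the n best-scoring sentences in document order."""
    scores = lsa_sentence_scores(sentences, term_weights, k)
    return [int(i) for i in sorted(np.argsort(-scores)[:n])]

# Toy document: two tokenized sentences (hypothetical data).
sentences = [["list", "head", "tail"], ["recursion", "base", "case", "list"]]
raters = {
    "frequency": {t: 1.0 for sent in sentences for t in sent},
    "annotations": {"head": 1.0, "tail": 1.0},  # terms the user highlighted
}

generic = combine_raters(raters, {"frequency": 1.0, "annotations": 0.0})
personal = combine_raters(raters, {"frequency": 0.5, "annotations": 0.5})
print(summarize(sentences, generic))   # [1]
print(summarize(sentences, personal))  # [0]
```

In this toy example, the generic weighting selects the longer second sentence, while giving weight to the user's highlights shifts the selection to the first sentence.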

Our method is domain- and language-independent. However, we have focused on the domain of learning and the specific scenario of summarization for knowledge revision during evaluation.

In the knowledge revision scenario, we have to consider other aspects as well, such as the time of revision and the way of selecting the documents to revise. That is why we have also proposed a personalized method of selecting documents for revision. It takes into account various characteristics, e.g. recent changes in a student’s knowledge, favoring concepts of which the student has recently gained or, on the contrary, lost knowledge.
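One way to picture such a selection is the sketch below. It assumes that each document is linked to a set of concepts and that the student's knowledge of each concept is available as a number at two points in time; documents are then ranked by the mean absolute change in knowledge of their concepts, so that both gained and lost knowledge raise the priority. The data model, the scoring formula and all names are hypothetical and serve only as an illustration.

```python
def revision_priority(doc_concepts, knowledge_before, knowledge_now):
    """Rank documents by the mean absolute recent change in concept knowledge."""
    ranking = []
    for doc, concepts in doc_concepts.items():
        # Both gains and losses count, hence the absolute difference.
        changes = [
            abs(knowledge_now.get(c, 0.0) - knowledge_before.get(c, 0.0))
            for c in concepts
        ]
        score = sum(changes) / len(changes) if changes else 0.0
        ranking.append((doc, score))
    # Documents with the largest recent change come first.
    return sorted(ranking, key=lambda pair: pair[1], reverse=True)

# Hypothetical documents, concepts and knowledge levels in the range 0..1.
doc_concepts = {"lists": ["list", "recursion"], "io": ["stream"]}
before = {"list": 0.2, "recursion": 0.8, "stream": 0.5}
now = {"list": 0.7, "recursion": 0.4, "stream": 0.5}

for doc, score in revision_priority(doc_concepts, before, now):
    print(doc, round(score, 2))
```

Here the document on lists is offered first, because the student's knowledge of both of its concepts has changed recently, while the knowledge of the only concept of the other document has stayed the same.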


We have experimented with our proposed method in the educational system ALEF. The participants of our two experiments were 75 students attending the Functional and Logic Programming and Principles of Software Engineering courses.

Our experimental results suggest that considering the relevant domain terms, as well as users' personal and popular annotations (highlights), leads to selecting representative sentences capable of summarizing the document, even for revision.