Answer Type Detection with German Text

This short note, originally published on my private blog, is part of the preparation for my master’s thesis. The thesis proposes a Question Answering system that pursues two goals: (1) replace a static FAQ section with an input field for searching unstructured data; (2) act as part of a dialog system […]

Edit Distance revisited

Hi folks. Having introduced some Natural Language Processing topics on my blog, I want to use this article to point to a distance metric that is commonly used to correct misspelled words. Some mobile phones use this algorithm to correct the input for SMS messages and so on. I am talking about Edit Distance[1]. This metric is well-proven […]
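
To make the idea concrete, here is a minimal sketch of how an edit distance can be used to correct a misspelled word. It assumes the classic Levenshtein variant with a cost of 1 for each insertion, deletion and substitution; the excerpt does not say which variant the article uses. The names `edit_distance` and `suggest_correction` are hypothetical and chosen only for this illustration.

```python
from typing import Iterable


def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance with unit costs (an assumption, see above).

    Returns the minimum number of single-character insertions,
    deletions and substitutions needed to turn source into target.
    """
    # previous_row[j] is the distance between the already processed
    # prefix of source and the first j characters of target.
    previous_row = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        # Turning the first i characters of source into "" takes i deletions.
        current_row = [i]
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current_row.append(min(
                previous_row[j] + 1,                      # delete source_char
                current_row[j - 1] + 1,                   # insert target_char
                previous_row[j - 1] + substitution_cost,  # substitute or match
            ))
        previous_row = current_row
    return previous_row[-1]


def suggest_correction(word: str, vocabulary: Iterable[str]) -> str:
    """Hypothetical helper: pick the vocabulary entry closest to word.

    Ties are resolved in favour of the entry that appears first.
    """
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))
```

For example, `suggest_correction("mesage", ["message", "massage", "phone"])` picks `"message"`, because a single insertion is enough to repair the typo. A real spell checker would typically add word frequencies and a maximum distance threshold on top of this.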

Text Classification of Natural Language

(Disclaimer: this document neither covers text classification methods exhaustively nor claims to be a scientific work.) This article is a brief introduction to my research seminar at HTW Dresden. It covers the simplest algorithms used for text classification: Edit Distance, Normalized Compression Distance, and a modified Edit Distance method called Substitution Distance (modified and […]

Finding Multiplier nodes without graph analysis

This article provides an overview of statistical indicators for finding users who have a significantly high impact on other users. In the past, this was mostly done by graph analysis. The approach described here uses indicators that need no graph analysis to produce their results. Introduction: Multiplier nodes are users in social networks that have a significant number […]

Introduction to Random Forests

This text is a short excerpt from my research activities and can be seen as a brief introduction to the topic. If you have any questions or suggestions, don’t hesitate to contact me. A Random Forest is an ensemble of separately trained binary decision trees. Together, these trees are trained to solve a problem as well as possible. […]