Fabian Kostadinov

Introduction to Lucene 7 OpenNLP - Part 1

For quite some time I’ve carried around the idea of using OpenNLP for part-of-speech tagging and indexing the resulting POS tags with Lucene. It turns out that Lucene 7 ships with support for OpenNLP. Of course I had to try it out.

Rhizomes as transistors

There is a natural similarity between a rhizome’s relation and how modern transistors work. As it turns out, we could actually build a rhizome (with limitations) using transistors. Have a look at the following example.

Yes, but I'm a software engineer...

Being a software engineer is no excuse for complaining about your data science colleagues. You don’t have to become an expert in machine learning or statistical analysis, but it’s actually a lot of fun to dive a little deeper into some of these topics and learn more about, let’s say, categorisation algorithms. And it even looks sexy on your CV.

Yes, but I'm a data scientist...

Being a data scientist is no excuse for writing sloppy code. Yeah, I know that Java is not your first coding language, but you should really not write spaghetti code.

How to implement a text mining engine

There are various text mining libraries, packages and tools available, many of them free of charge. Yet when it comes to putting it all together in an enterprise environment, there is surprisingly little information available on the web. This article describes how I would design a general-purpose text mining engine that fits today’s standard Java-stack enterprise environment and the typical problems one encounters there. Much of what I write below comes from hands-on experience with existing tools and the difficulties I ran into.
