Basics of vector algebra
Understanding vector algebra is a prerequisite to selecting meaningful distance metrics for text embeddings. For the fun of it, let’s recall some of the basics.
I have written three blog posts about how to use Lucene 7 and OpenNLP to index part-of-speech tags and then use phrase queries to search on these tags. What I haven’t shown so far is what’s so cool about having such a capability.
Now that we can search on indexed part-of-speech tags, what’s still missing is a way to impose an order on the search terms. Remember: all POS tags in our query are simply ORed together. So, how can we achieve this?
In my previous post I promised I’d describe how to perform searches on indexed part-of-speech data with Lucene 7 and OpenNLP. Let’s have a look. (Thanks to Koji on this one!)
For quite some time I’ve carried around the idea of using OpenNLP for part-of-speech tagging and indexing the POS tags with Lucene. It turns out Lucene 7 ships with support for OpenNLP. Of course I had to try it out.