Interpreting LSI Document Similarity

04 Nov 2016

In this post I’m sharing a technique I’ve found for showing which words in a piece of text contribute most to its similarity with another piece of text when both documents are represented using Latent Semantic Indexing (LSI). This has proven valuable to me in debugging bad search results from LSI-based “concept search”. You’ll find the equations for the technique as well as example Python code.

Word2Vec Resources

27 Apr 2016

While researching Word2Vec, I came across a lot of different resources of varying usefulness, so I thought I’d share my collection of links and notes on what they contain.

Word2Vec Tutorial - The Skip-Gram Model

19 Apr 2016

This tutorial covers the skip-gram neural network architecture for Word2Vec. My intention with this tutorial was to skip over the usual introductory and abstract insights about Word2Vec, and get into more of the details of how the skip-gram model actually works.

Google's trained Word2Vec model in Python

12 Apr 2016

In this post I’m going to describe how to get Google’s pre-trained Word2Vec model up and running in Python to play with.

Latent Semantic Analysis (LSA) for Text Classification Tutorial

25 Mar 2016

In this post I’ll provide a tutorial on Latent Semantic Analysis as well as some Python example code that shows the technique in action.