Natural language processing, together with the many topics it encompasses (summarization, full-text search, sentiment analysis, content categorization, etc.), is one of the fastest-growing, most complex, and most in-demand areas of expertise in the software industry today.
From spell checking in your SMS client to programmatically evaluating what your Twitter followers think of you, there is no shortage of real-world text processing and linguistic analysis problems waiting to be solved. As Rubyists and software engineers, it’s important for us to know which NLP tools are available to us and how to use them most effectively.
While there are a number of great open-source libraries for natural language processing in Ruby, and great strides have been made in recent years, there’s still often a need to leverage tools and libraries from outside the Ruby ecosystem. Some of the best open-source NLP frameworks rely heavily on contributions from the academic world, where Ruby doesn’t have the same presence as languages like Python or Java.
In this talk, I’ll provide a beginner-friendly introduction to NLP and give a quick overview of the tools and related projects currently available in the Ruby community. In addition, using real-world examples, I’ll demonstrate how to painlessly leverage high-performance, mature, and well-established NLP libraries directly from your Ruby application using JRuby and JDK 7.