
2020 NLP and NeurIPS Highlights

In this post I wanted to share some of the main themes from NLP over the past year, as well as a few interesting highlights from NeurIPS, the largest annual machine learning conference.

How to Apply BERT to Arabic and Other Languages

Up to this point, our tutorials have focused almost exclusively on NLP applications in English. While the general algorithms and ideas extend to every language, the huge ecosystem of resources supporting English-language NLP does not. For example, BERT and BERT-like models are incredibly powerful tools, but model releases are almost always in English, perhaps followed by Chinese, Russian, or Western European language variants.
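As a rough illustration (not code from the post itself), the sketch below loads a multilingual BERT checkpoint with the Hugging Face transformers library; bert-base-multilingual-cased is one publicly released checkpoint that covers roughly 100 languages, including Arabic.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library is installed.
# "bert-base-multilingual-cased" is a publicly released checkpoint pre-trained on
# Wikipedia text in roughly 100 languages, including Arabic.
from transformers import AutoTokenizer, AutoModel

model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode an Arabic sentence and run it through the model to get
# contextual embeddings for each token.
text = "اللغة العربية جميلة"  # "The Arabic language is beautiful."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Shape: (batch_size, num_tokens, hidden_size)
print(outputs.last_hidden_state.shape)
```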

Domain-Specific BERT Models

If your text data is domain-specific (e.g., legal, financial, academic, or industry-specific), or otherwise different from the “standard” text corpora used to train BERT and other language models, you might want to consider either continuing to pre-train BERT on some of your own text or looking for a domain-specific language model.
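As a hedged sketch (not code from the post), the snippet below compares how a general-purpose BERT tokenizer and a domain-specific one split the same sentence. SciBERT (allenai/scibert_scivocab_uncased), a publicly released BERT variant pre-trained on scientific papers, stands in here for whatever domain applies to your data.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library. SciBERT
# (allenai/scibert_scivocab_uncased) is used as an example of a domain-specific
# model; substitute a checkpoint trained on your own domain.
from transformers import AutoTokenizer, AutoModel

general_name = "bert-base-uncased"
domain_name = "allenai/scibert_scivocab_uncased"

general_tokenizer = AutoTokenizer.from_pretrained(general_name)
domain_tokenizer = AutoTokenizer.from_pretrained(domain_name)

# Comparing tokenizations shows how a domain vocabulary handles
# specialized terminology differently from the general-purpose one.
sentence = "The enzyme catalyzes phosphorylation of the substrate."
print("bert-base :", general_tokenizer.tokenize(sentence))
print("scibert   :", domain_tokenizer.tokenize(sentence))

# Loading the domain-specific model itself works exactly like loading BERT.
model = AutoModel.from_pretrained(domain_name)
```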