Up to this point, our tutorials have focused almost exclusively on NLP applications using the English language. While the general algorithms and ideas extend to all languages, the huge number of resources that support English-language NLP do not. For example, BERT and BERT-like models are incredibly powerful tools, but model releases are almost always in English, perhaps followed by Chinese, Russian, or Western European language variants.
In this blog post / Notebook, I’ll demonstrate how to dramatically reduce BERT’s training time by creating batches of samples with different sequence lengths.
While working on my recent Multi-Class Classification Example, I kept running out of GPU memory in Colab, which was a pretty frustrating issue!
If your text data is domain-specific (e.g. legal, financial, academic, industry-specific) or otherwise different from the “standard” text corpus used to train BERT and other language models, you might want to consider either continuing to train BERT with some of your text data or looking for a domain-specific language model.
In conjunction with our tutorial for fine-tuning BERT on Named Entity Recognition (NER) tasks here, we wanted to provide some practical guidance and resources for building your own NER application since fine-tuning BERT may not be the best solution for every NER application.