Transformers Offers NLP For TensorFlow and PyTorch
Written by Kay Ewbank   
Monday, 07 October 2019

A Python library offering Natural Language Processing for TensorFlow 2.0 and PyTorch has been released by HuggingFace.

Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides state-of-the-art general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, and XLNet) for Natural Language Understanding (NLU) and Natural Language Generation (NLG).
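
As a taste of how the library is used, here is a minimal sketch, assuming the transformers package has been installed with pip and using the bert-base-uncased checkpoint with the PyTorch backend:

import torch
from transformers import BertModel, BertTokenizer

# Download (or load from cache) the pretrained tokenizer and weights
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Encode a sentence and run it through the model
input_ids = tokenizer.encode("Transformers makes NLP easy.", return_tensors="pt")
with torch.no_grad():
    last_hidden_state = model(input_ids)[0]

# One contextual vector per input token, hidden size 768 for BERT base
print(last_hidden_state.shape)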


Facebook developed RoBERTa, and the researchers describe their model as a robustly optimized method for pretraining natural language processing (NLP) systems that improves on Bidirectional Encoder Representations from Transformers, or BERT, the self-supervised method released by Google in 2018.

BERT was created by Google, while XLNet came from researchers at Carnegie Mellon University and Google Brain. BERT uses pre-training and fine-tuning to create NLP models for tasks such as question answering, sentiment analysis, and language inference, and is designed to pre-train deep bidirectional representations from unlabeled text. XLNet is an auto-regressive language model.
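
The bidirectional objective is easy to demonstrate: mask out a word and BERT predicts it from the context on both sides. A minimal sketch, again assuming the bert-base-uncased checkpoint:

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask one token; the model fills it in using both left and right context
input_ids = tokenizer.encode("The capital of France is [MASK].", return_tensors="pt")
mask_position = (input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(input_ids)[0]

predicted_id = logits[0, mask_position].argmax(-1).item()
print(tokenizer.decode([predicted_id]))  # typically "paris"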

OpenAI created GPT-2, a transformer-based generative language model that was trained on 40GB of curated text from the internet.
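
A short generation sketch, assuming the publicly released gpt2 checkpoint and a recent version of the library's generate() helper:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Greedy decoding: extend the prompt one most-likely token at a time
input_ids = tokenizer.encode("The future of NLP is", return_tensors="pt")
output_ids = model.generate(input_ids, max_length=30, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))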

HuggingFace themselves developed DistilBERT, which is based on BERT but uses a smaller language model with about half the total number of parameters of BERT base while retaining 95 percent of BERT’s performance on the language understanding benchmark GLUE.
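
The size difference is easy to verify by counting parameters, as in this sketch, assuming the bert-base-uncased and distilbert-base-uncased checkpoints:

from transformers import BertModel, DistilBertModel

bert = BertModel.from_pretrained("bert-base-uncased")
distil = DistilBertModel.from_pretrained("distilbert-base-uncased")

# Roughly 110M parameters for BERT base versus roughly 66M for DistilBERT
print(sum(p.numel() for p in bert.parameters()))
print(sum(p.numel() for p in distil.parameters()))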

The PyTorch version of the library has clocked up more than 500,000 pip installs this year. The library also includes an abstraction layer for each model to make it easier to integrate the model into a project. PyTorch-Transformers is already being used by large organisations including Microsoft and Apple.
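
One example of that abstraction layer is the library's Auto classes, which resolve the right architecture from the checkpoint name, so the same integration code works across models. A sketch:

from transformers import AutoModel, AutoTokenizer

# The same two lines load BERT, RoBERTa or DistilBERT
for checkpoint in ("bert-base-uncased", "roberta-base", "distilbert-base-uncased"):
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    print(checkpoint, "->", type(model).__name__)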

The models included in Transformers are among the best-performing options for a range of NLP tasks, and some are very new. Their inclusion means anyone can make use of the many hours of training on large datasets that the original model creators carried out on expensive GPU hardware, resources that would otherwise be out of reach for developers not working at a big technology company or research lab. The library makes all of this available to anyone.

The library comes with 32 pretrained models in more than 100 languages, and the developers say it offers deep interoperability between TensorFlow 2.0 and PyTorch.
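
In practice, that interoperability means weights saved from one framework can be reloaded in the other. A sketch, assuming both PyTorch and TensorFlow 2.0 are installed:

from transformers import BertModel, TFBertModel

# Save a PyTorch checkpoint...
BertModel.from_pretrained("bert-base-uncased").save_pretrained("./bert-checkpoint")

# ...and load the same weights into the TensorFlow 2.0 version of the model
tf_model = TFBertModel.from_pretrained("./bert-checkpoint", from_pt=True)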


More Information

Transformers On GitHub

Related Articles

Facebook Open Sources Natural Language Processing Model  

PyTorch Adds TorchScript API

NVIDIA Updates Free Deep Learning Software

TensorFlow - Google's Open Source AI And Computation Engine

TensorFlow 2 Offers Faster Model Training

Rule-Based Matching In Natural Language Processing  

Zalando Flair NLP Library Updated

Intel Open Sources NLP Architect

Google SLING: An Open Source Natural Language Parser

Spark Gets NLP Library

Microsoft Expands Cognitive Services APIs

 



