Boris Ginsburg – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins

Develop Smaller Speech Recognition Models with the NVIDIA NeMo Framework (2019-12-10)
http://www.open-lab.net/blog/?p=16063

As computers and other personal devices have become increasingly prevalent, interest in conversational AI has grown because of its many potential applications. Each conversational AI framework comprises several basic modules, such as automatic speech recognition (ASR), and the models behind these modules need to be lightweight so they can be deployed effectively on…

Source

Pretraining BERT with Layer-wise Adaptive Learning Rates (2019-12-05)
http://www.open-lab.net/blog/?p=15981

Training with larger batches is a straightforward way to scale deep neural network training to larger numbers of accelerators and to reduce training time. However, as the batch size grows, numerical instability can appear during training. This post provides an overview of one class of solutions to this problem: layer-wise adaptive optimizers, such as LARS, LARC…
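
As a rough illustration of the layer-wise idea, the sketch below applies a LARS-style trust ratio in PyTorch: each layer's update is rescaled by the ratio of its weight norm to its gradient norm, so no single layer takes a disproportionately large step at large batch sizes. This is one common variant of the rule, not the implementation described in the post; the function name lars_step and the hyperparameter values are illustrative assumptions.

    import torch

    def lars_step(params, lr=0.1, eta=0.001, weight_decay=5e-4):
        """One LARS-style update: scale each layer's step by its
        weight-norm-to-gradient-norm trust ratio."""
        with torch.no_grad():
            for p in params:
                if p.grad is None:
                    continue
                g = p.grad + weight_decay * p      # gradient with L2 term
                w_norm, g_norm = p.norm(), g.norm()
                # Trust ratio: layers with small gradients relative to
                # their weights are allowed proportionally larger steps.
                if w_norm > 0 and g_norm > 0:
                    local_lr = eta * w_norm / g_norm
                else:
                    local_lr = 1.0
                p -= lr * local_lr * g

    # Usage: each layer of this tiny model gets an individually scaled step.
    model = torch.nn.Linear(10, 2)
    loss = model(torch.randn(4, 10)).pow(2).mean()
    loss.backward()
    lars_step(model.parameters())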

Source

Mixed Precision Training for NLP and Speech Recognition with OpenSeq2Seq (2018-10-09)
http://www.open-lab.net/blog/?p=12300

The success of neural networks thus far has been built on bigger datasets, better theoretical models, and reduced training time. Sequential models, in…
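
As a loose sketch of the mixed precision recipe the post's title refers to (FP16 compute with dynamic loss scaling to protect small gradient values), the example below assumes PyTorch's AMP utilities on a CUDA-capable GPU rather than OpenSeq2Seq's TensorFlow implementation:

    import torch

    model = torch.nn.Linear(512, 512).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    scaler = torch.cuda.amp.GradScaler()   # dynamic loss scaling

    for _ in range(10):
        x = torch.randn(32, 512, device="cuda")
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():    # ops run in FP16/FP32 as appropriate
            loss = model(x).pow(2).mean()
        scaler.scale(loss).backward()      # scale loss to avoid FP16 underflow
        scaler.step(optimizer)             # unscale grads, then take the step
        scaler.update()                    # adjust the scale for the next step

The weights and optimizer state stay in FP32, while most of the forward and backward math runs in FP16, which is what makes the approach both fast on tensor cores and numerically safe.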

Source
