Controlled generation: how to tame powerful language models

Introduction

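Unless you have slept through the last few years, you have certainly heard of transformers, the architecture from the canonical "Attention Is All You Need". Why are transformers so good? For one thing, they avoid recurrence. That lets them efficiently build data representations that pack in a lot of contextual information, which benefits both their ability to generate text and their unmatched capacity for transfer learning.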

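Transformers set off an avalanche of work on language modeling: the task in which a model picks the next word given the previous ones, that is, learns the probability p(x), where x is the current token. As you might guess, this task needs no labels at all, so it can be trained on huge unannotated text corpora. Trained language models generate text so well that their authors sometimes refuse to release the trained models.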
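To make "learning p(x) from unlabeled text" concrete, here is a minimal sketch. It uses a count-based bigram model rather than a transformer, which is an assumption made purely for brevity; the function name `train_bigram_lm` and the toy corpus are illustrative and do not come from any paper.

```python
from collections import Counter, defaultdict

def train_bigram_lm(tokens):
    """Estimate p(next token | previous token) by counting adjacent pairs.

    No labels are needed: the text itself supplies the targets.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    # Normalize the counts for each context into a probability distribution.
    return {
        prev: {tok: c / sum(following.values()) for tok, c in following.items()}
        for prev, following in counts.items()
    }

# Toy corpus (hypothetical): plain, unannotated text.
corpus = "the cat sat on the mat the cat ran".split()
lm = train_bigram_lm(corpus)
next_token_probs = lm["the"]  # distribution over tokens that follow "the"
```

A real LM replaces the counting with a neural network and conditions on the whole prefix, but the training signal is the same: predict the next token of raw text.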

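But what if we want to add some "knobs" to text generation? For example, conditional generation in which we set the topic or control other attributes. That calls for the conditional probability p(x|a), where a is the desired attribute. Interesting? Then let's go on!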

Plug and Play Language Models: a Simple Approach to Controlled Text Generation

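The authors of the paper offer a simple (hence plug and play) and elegant method for conditional generation. It combines a heavy pretrained language model (LM from here on) with a few simple classifiers, and in this way samples from a distribution of the form p(x|a) ∝ p(a|x)p(x). Note that the original LM is not modified in any way. The authors propose two kinds of classifier, which the paper calls attribute models: BoW (bag of words) for topic control and a linear classifier for tone (sentiment) control. They analyze their main contribution in some detail and compare the ideas behind their method with other papers. The most important point is ease of use, and here the comparison table says it all: PPLM beats all of its competitors on parameter count.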
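The relation p(x|a) ∝ p(a|x)p(x) can be illustrated on a single next-token distribution. The sketch below only demonstrates the Bayes reweighting itself and is not the PPLM procedure; the three-word vocabulary and all the probabilities are made-up numbers, and `condition_on_attribute` is a hypothetical helper name.

```python
def condition_on_attribute(p_x, p_a_given_x):
    """Return p(x|a), proportional to p(a|x) * p(x), over a small vocabulary."""
    unnormalized = {tok: p_x[tok] * p_a_given_x[tok] for tok in p_x}
    z = sum(unnormalized.values())  # normalizing constant, equal to p(a)
    return {tok: value / z for tok, value in unnormalized.items()}

# Hypothetical numbers: what the LM expects after some prefix ...
p_x = {"pet": 0.5, "toy": 0.3, "weapon": 0.2}
# ... and how likely the attribute (say, a military topic) is for each token.
p_a_given_x = {"pet": 0.1, "toy": 0.1, "weapon": 0.8}

p_x_given_a = condition_on_attribute(p_x, p_a_given_x)
```

With these numbers the conditional distribution moves most of its mass to "weapon", even though the unconditional LM preferred "pet".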

Weighted decoding 2.0

The approach from Uber can be seen as a further development of weighted decoding.

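The idea from Uber: leave the weights of the LM untouched and instead shift its cached activations (the key and value pairs of the past context) at every generation step. In the authors' code this is the job of the perturb_past function.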

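How is the shift chosen? It should raise two log-likelihoods at once: that of the LM, p(x), and that of the attribute model, p(a|x). The direction is obtained with a backward pass.

Why the log-likelihood of p(x) as well? Pushing only toward the attribute would pull the text away from what the LM considers natural; the p(x) term is there to preserve the fluency of the LM.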

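So, each step of the algorithm looks like this: a forward pass through the LM, after which p(a|x) is computed by the attribute model; a backward pass, in which the gradients from the attribute model are used to update the internal activations of the LM; and finally sampling the next token from the updated distribution.

This has a price: every "step" takes k forward and backward passes, and there are n such steps, one per generated token, whereas a plain LM needs a single forward pass per token (for example, with num of iterations=3 and gen length=5).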
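The forward pass, backward pass and update loop can be sketched on a toy model. This is an illustration under heavy assumptions, not the authors' implementation: the "LM" is a single hidden vector h with a linear output layer W, the attribute model is a bag of token indices, the gradient is written out analytically instead of using autograd, and the fluency-preserving terms of the real method are omitted. All names (`perturb_hidden`, `bag`, `step`) are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from_hidden(W, h):
    # Toy output layer: one row of W per vocabulary entry.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def bow_log_prob(p, bag):
    # Attribute model: log of the probability mass on the bag's tokens.
    return math.log(sum(p[i] for i in bag))

def perturb_hidden(W, h, bag, k=3, step=0.5):
    """Run k forward/backward iterations that nudge h toward the attribute."""
    h = list(h)
    for _ in range(k):
        # Forward pass: hidden state -> next-token distribution -> p(a|x).
        p = softmax(logits_from_hidden(W, h))
        s = sum(p[i] for i in bag)
        # Backward pass: gradient of log p(a|x) w.r.t. the logits ...
        grad_z = [(p[j] / s if j in bag else 0.0) - p[j] for j in range(len(p))]
        # ... and then w.r.t. the hidden state (chain rule through W).
        grad_h = [sum(W[j][d] * grad_z[j] for j in range(len(W)))
                  for d in range(len(h))]
        norm = math.sqrt(sum(g * g for g in grad_h)) or 1.0
        # Update only the activations; W (the "LM weights") never changes.
        h = [x + step * g / norm for x, g in zip(h, grad_h)]
    return h

W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.5, -1.0]]  # 4 tokens, 2 dims
h0 = [1.0, 0.0]
bag = {1, 2}  # indices of the "on-topic" tokens
before = bow_log_prob(softmax(logits_from_hidden(W, h0)), bag)
h1 = perturb_hidden(W, h0, bag)
after = bow_log_prob(softmax(logits_from_hidden(W, h1)), bag)
```

After the update the on-topic tokens carry more probability mass (`after > before`), while the weights `W` are exactly what they were. The loop also makes the cost visible: the body runs k times for every generated token.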

Running the model (there is a colab for this) with the prefix "the kitten" and the topic "military" gives the following:

The kitten is a creature with no real personality, it is just a pet. You can use it as a combat item.
The kitten that is now being called the "suspected killer" of a woman in a San Diego apartment complex was shot by another person who then shot him, according to authorities.

combat, shot, killer: clearly military. For comparison, here is what the unmodified LM produces:

The kitten that escaped a cage has been rescued from a cat sanctuary in Texas.
The cat, named "Lucky," was found wandering in the back yard of the Humane Society at the time of the incident on Friday.

Attribute models

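There are two attribute models, BoW and discriminator. For BoW the log-likelihood of the attribute is

log p(a|x) = log Σ_i p_t+1[w_i],

where p_t+1 is the output distribution of the LM and w_i is the i-th word in the bag.
The discriminator model is more involved than BoW: it is a classifier (a linear one, as noted earlier) that has to be trained on labeled data.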
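For the BoW attribute model, the score is the log of the probability mass that the LM's next-token distribution places on the words of the bag. A minimal sketch follows, assuming the distribution is available as a plain dictionary; the example distribution and the word list are invented for illustration.

```python
import math

def bow_log_likelihood(next_token_probs, bag_of_words):
    """log p(a|x) = log of the summed probability of the bag's words."""
    mass = sum(next_token_probs.get(word, 0.0) for word in bag_of_words)
    # If the distribution puts no mass on the bag, the log-likelihood is -inf.
    return math.log(mass) if mass > 0 else float("-inf")

# Hypothetical next-token distribution from the LM.
p_next = {"pet": 0.5, "toy": 0.3, "combat": 0.15, "shot": 0.05}
# Hypothetical bag for the "military" topic; "killer" is absent from p_next.
military_bag = ["combat", "shot", "killer"]

score = bow_log_likelihood(p_next, military_bag)
```

No training is involved here: the only thing that defines the attribute is the word list, which is what makes this attribute model so cheap.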

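The authors compare their method with the plain LM, a fine-tuned LM, weighted decoding and CTRL (a conditional LM). Fluency is judged by human annotators and, automatically, by perplexity. The PPLM variants and baselines in the comparison are: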

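B: baseline, the unmodified GPT-2 LM;
BR: the same as B, but r samples are generated and ranked by log-likelihood;
BC: the latent representations are updated, then one sample is drawn;
BCR: the same as BC, but r samples are generated and ranked by log-likelihood;
CTRL: Keskar et al, 2019;
GPT2-FT-RL: GPT2, fine-tuned with RL;
WD: weighted decoding, in which the output distribution is reweighted with p(a|x);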
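The "generate r samples and rank them" idea behind BR and BCR can be sketched as follows. The scoring function here is a crude stand-in (smoothed share of on-topic words), assumed only so that the example runs without a model; the sample sentences, the word list and both function names are hypothetical.

```python
import math

def toy_attribute_ll(text, bag=("combat", "shot", "killer")):
    """Stand-in for log p(a|x): smoothed share of on-topic words in the text."""
    tokens = [tok.strip('.,"') for tok in text.lower().split()]
    hits = sum(tok in bag for tok in tokens)
    return math.log((hits + 1) / (len(tokens) + len(bag)))

def rank_by_attribute_ll(samples, attribute_log_likelihood):
    """Order candidate generations from most to least on-attribute."""
    return sorted(samples, key=attribute_log_likelihood, reverse=True)

# r = 3 hypothetical samples for the same prefix.
samples = [
    "the kitten is a pet",
    "the kitten is a combat item",
    "the kitten was shot by a killer",
]
ranked = rank_by_attribute_ll(samples, toy_attribute_ll)
best = ranked[0]
```

Ranking needs only forward passes through the attribute model, so it can be layered on top of either the unmodified LM or the perturbed one.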

— , LM, . , , - :)
