Training Verifiers to Solve Math Word Problems

The GSM8K dataset and the paper Training Verifiers to Solve Math Word Problems

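Training Verifiers to Solve Math Word Problems is a 2021 paper from OpenAI. At the time, language models performed poorly on multi-step mathematical reasoning, and the paper uses a generator + verifier approach to improve accuracy. The widely used GSM8K (Grade School Math 8.5K) benchmark was also introduced in this paper.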
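As a rough illustration of the generator + verifier idea: sample several candidate solutions from the generator, score each one with the verifier, and keep the highest-scoring candidate. The sketch below is not the paper's implementation. The function name `select_best`, the candidate strings, and the fixed scores are hypothetical stand-ins, and a real verifier would be a trained model rather than a lookup table.

```python
from typing import Callable, Sequence


def select_best(candidates: Sequence[str],
                verifier_score: Callable[[str], float]) -> str:
    """Return the candidate solution that the verifier scores highest."""
    if not candidates:
        raise ValueError("need at least one candidate solution")
    return max(candidates, key=verifier_score)


# Hypothetical candidates, standing in for samples drawn from a generator.
candidates = [
    "The waiter ate 6 slices ... The answer is: 52",
    "The waiter ate 48 slices ... The answer is: 28",
]

# Hypothetical verifier scores (assumed values, not real model outputs).
toy_scores = {candidates[0]: 0.12, candidates[1]: 0.91}

best = select_best(candidates, lambda s: toy_scores[s])
```

Here `best` is the second candidate, because it carries the higher toy score. The design point is that the generator only needs to produce a correct solution somewhere among its samples, and the verifier is responsible for picking it out.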

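GSM8K: 8.5K high-quality grade school math word problems, split into a 7.5K training set and a 1K test set. Each problem takes 2 to 8 steps to solve, and the steps require only basic arithmetic (addition, subtraction, multiplication, division).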

Revisiting OpenAI and Anthropic papers, part 1: Training Verifiers to Solve Math Word Problems

Data format. The input looks like this:

If Buzz bought a pizza with 78 slices at a restaurant and then decided to share it with the waiter in the ratio of 5:8, with Buzz's ratio being 5, what's twenty less the number of slices of pizza that the waiter ate? 

Step 1: The total ratio representing the pizza is 5+8 = <<5+8=13>>13. ки

Step 2: The waiter ate 13 x 8 / 13 = <<13*8/13=6>>6 slices of the pizza. ки

Step 3: Buzz ate 78 - 6 = <<78-6=72>>72 slices of the pizza. ки

Step 4: The waiter ate 20 less than the number of slices that Buzz ate which is 72 - 20 = 52. ки

Step 5: The waiter ate 52 slices of the pizza. The answer is: 52 ки
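A sample in this format can be taken apart mechanically. The sketch below assumes that `ки` marks the end of each step (as it appears to in the sample above), that `<<expr=result>>` is a calculator annotation, and that the final answer follows the phrase `The answer is:`. All function names are hypothetical. Checking the annotations shows, for instance, that the one in Step 2 does not hold arithmetically: 13*8/13 is 8, not 6.

```python
import ast
import operator
import re
from typing import List, Optional, Tuple

STEP_TAG = "ки"  # assumed step-end marker, as seen in the sample
CALC_RE = re.compile(r"<<([^=<>]+)=([^<>]+)>>")
ANSWER_RE = re.compile(r"The answer is:\s*(-?\d+(?:\.\d+)?)")

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def eval_arith(expr: str) -> float:
    """Evaluate +, -, *, / over numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr.strip(), mode="eval"))


def split_steps(solution: str) -> List[str]:
    """Split a solution into steps on the step-end marker."""
    return [s.strip() for s in solution.split(STEP_TAG) if s.strip()]


def check_annotations(step: str) -> List[Tuple[str, bool]]:
    """For each <<expr=result>> in a step, report whether expr equals result."""
    return [(expr, abs(eval_arith(expr) - float(claimed)) < 1e-9)
            for expr, claimed in CALC_RE.findall(step)]


def final_answer(solution: str) -> Optional[str]:
    """Extract the number following 'The answer is:', if present."""
    match = ANSWER_RE.search(solution)
    return match.group(1) if match else None


sample = (
    "Step 1: The total ratio representing the pizza is 5+8 = <<5+8=13>>13. ки\n"
    "Step 2: The waiter ate 13 x 8 / 13 = <<13*8/13=6>>6 slices of the pizza. ки\n"
    "Step 3: Buzz ate 78 - 6 = <<78-6=72>>72 slices of the pizza. ки\n"
    "Step 4: The waiter ate 20 less than the number of slices that Buzz ate "
    "which is 72 - 20 = 52. ки\n"
    "Step 5: The waiter ate 52 slices of the pizza. The answer is: 52 ки"
)

steps = split_steps(sample)
checks = [check_annotations(step) for step in steps]
answer = final_answer(sample)
```

With this sample, `steps` has five entries, `checks[1]` flags the Step 2 annotation as inconsistent, and `answer` is the string `"52"`. A check like this only validates the arithmetic inside the annotations. It says nothing about whether the reasoning around them is sound.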

Labels:

