
OpenAI Product Presentation Deck

Unsupervised NLP (2018)

Before: AI progress is driven by labeled datasets.

The result: a 117M-parameter transformer trained by reading 7,000 self-published books which, with a small amount of supervised fine-tuning, sets state-of-the-art on a huge variety of NLP datasets.

After: Unlabeled data can be even more important than labeled.

Dataset           Task                       SOTA   Ours
SNLI              Textual Entailment
MNLI Matched      Textual Entailment         80.6   82.1
MNLI Mismatched   Textual Entailment         80.1
SciTail           Textual Entailment                88.3
QNLI              Textual Entailment         82.3
RTE               Textual Entailment                56.0
STS-B             Semantic Similarity               82.0
QQP               Semantic Similarity        66.1   70.3
MRPC              Semantic Similarity
RACE              Reading Comprehension             59.0
ROCStories        Commonsense Reasoning      77.6
COPA              Commonsense Reasoning             78.6
SST-2             Sentiment Analysis                91.3
CoLA              Linguistic Acceptability          45.4
GLUE              Multi-Task Benchmark
