Recent posts

BERT (memo)

2 minute read

BERT (Pre-training of Deep Bidirectional Transformers for Language Understanding). Introduction. The earlier GPT model: uses a Transformer decoder for autoregressive lang...