Dataset paper review #1 CoQA: A Conversational Question Answering Challenge

REVIEW 2020. 4. 16. 11:32

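Abstract
- 127k Q&A pairs, 8k conversation passages, with evidence for each QA pair.
- Because the questions are conversational, they show phenomena not seen in existing reading comprehension datasets.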
Introduction
- The dataset as a whole is not dialogue; only the QA part is in conversational form.
- Goal 1: capture the natural questions that arise in a conversation. → The meaning should be recoverable even when the question is short (e.g. "Who?").
- Goal 2: give answers that are natural in a conversation. Existing QA picks out a span from the given passage. → Free-form answers (datasets e.g. MS MARCO, NarrativeQA) → BLEU, ROUGE metrics.
- Goal 3: applicability across multiple domains. The dataset covers children's stories, literature, middle and high school English exams, news, Wikipedia, Reddit, and science.
- Three characteristics
  - It consists of 127k conversation turns collected from 8k conversations over text passages. The average conversation length is 15 turns, and each turn consists of a question and an answer.
  - It contains free-form answers and each answer has a span-based rationale highlighted in the passage.
  - Its text passages are collected from seven diverse domains: five are used for in-domain evaluation and two are used for out-of-domain evaluation.
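A minimal sketch of how one turn with a free-form answer and a span-based rationale could be represented. The `Turn` class, the `locate_rationale` helper, and the sample passage are all made up for illustration; none of this is the official CoQA file format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Turn:
    question: str   # conversational question, may be very short
    answer: str     # free-form answer, not necessarily a passage span
    rationale: str  # span copied from the passage that supports the answer


def locate_rationale(passage: str, turn: Turn) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the rationale, or None if absent."""
    start = passage.find(turn.rationale)
    if start == -1:
        return None
    return start, start + len(turn.rationale)


# Invented example data (assumption, not taken from the dataset).
passage = "Mia adopted a small grey cat. She named him Pepper."
turn = Turn(
    question="Did Mia get a pet?",
    answer="yes",  # free-form: the word "yes" never appears in the passage
    rationale="Mia adopted a small grey cat",
)
span = locate_rationale(passage, turn)
```

Here the answer is free-form while the rationale is still a literal span of the passage, which is the combination described in the second characteristic above.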
Task Definition
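- Given: a passage and a conversation (in effect, a sequence of questions and answers)
- question Q, answer A, evidence R (a part of the passage)
- From the second question on, the answer is produced with the preceding questions and answers taken into account
- ⇒ e.g. a question may use a word like "He" that refers back to the previous answer.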
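The history-conditioned setup above can be sketched roughly as follows. This is only an illustration under my own assumptions: `build_input`, the "Q:"/"A:" prefixes, and the sample conversation are hypothetical and are not the input format of any particular model in the paper.

```python
from typing import List, Tuple


def build_input(passage: str, history: List[Tuple[str, str]], question: str) -> str:
    """Concatenate the passage, all earlier (question, answer) turns, and the current question."""
    parts = [passage]
    for prev_q, prev_a in history:
        parts.append(f"Q: {prev_q}")
        parts.append(f"A: {prev_a}")
    parts.append(f"Q: {question}")  # the question the model must answer now
    return "\n".join(parts)


# Invented example data (assumption, not taken from the dataset).
passage = "Tom lives in Paris. He works as a baker."
history = [("Who is the story about?", "Tom")]
model_input = build_input(passage, history, "Where does he live?")
```

Without the earlier turn in the input, "he" in the second question has nothing to refer to; with it, the referent "Tom" is available to the model.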
Dataset collection
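- Two people (a questioner and an answerer) look at the passage and hold a natural question-and-answer exchange
- Workers who ask vague questions or give wrong answers are identified as bad workers
- The two workers can discuss the guidelines with each other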
Collection Interface
- passage selection: MCTest, RACE, CNN, Wikipedia, Reddit, AI2 Science
Models
- conversational response generation: seq2seq, PGNet
- reading comprehension: DrQA, Augmented DrQA ⇒ performs well in the general case
- combined model: DrQA+PGNet ⇒ performs well in complex cases (Multiple, etc.)
Related Work
- Knowledge source : table or graph database or image and video(human friendly)
- Naturalness: QA with varied question forms
- Conversational modeling: introduces datasets of various forms; in most of them, QA is attached to a passage that is read first
- reasoning : algebraic reasoning, logical reasoning, common sense reasoning, multi-fact reasoning
- Recent progress : FlowQA, BERT
