
This thesis proposes Augmented TextQA, a method that uses data augmentation to improve TextQA performance on a given domain without using any additional external text data. As a result, the model learns not only from the given data but also by anticipating the key words in the passage that could serve as answers, producing a model that can answer more questions correctly.

The Augmented TextQA method resembles how a person studies by looking at keywords and thinking up the questions they are likely to be asked, and it follows the reasoning process advocated by Bottou (2014): building up modules that solve the sub-problems of keyword extraction, expected-question generation, and additional training.
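The three-stage process above (keyword extraction, expected-question generation, additional training on the augmented data) can be sketched as a simple augmentation loop. The sketch below is illustrative only: the frequency-based keyword extractor and the `generate_question` stub stand in for the thesis's actual extraction step and Seq2Seq question-generation model, and all function names are assumptions.

```python
from collections import Counter
import re

# Toy stopword list; the real pipeline would use a proper NLP toolkit.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "and", "to"}

def extract_keywords(context, k=3):
    # Toy keyword extractor: the k most frequent content words.
    words = re.findall(r"[A-Za-z]+", context)
    content = [w for w in words if w.lower() not in STOPWORDS]
    return [w for w, _ in Counter(content).most_common(k)]

def generate_question(context, keyword):
    # Placeholder for the trained Seq2Seq question-generation model:
    # given a context and a candidate answer keyword, emit a question.
    return f"What does the passage say about {keyword}?"

def augment_dataset(dataset, k=3):
    # dataset: list of (context, question, answer) triples.
    # For each context, add (context, generated_question, keyword)
    # triples so the TextQA model trains on more question-answer pairs.
    augmented = list(dataset)
    for context, _, _ in dataset:
        for kw in extract_keywords(context, k):
            augmented.append((context, generate_question(context, kw), kw))
    return augmented
```

In the actual method, `generate_question` would be the model trained on (context, answer) → question pairs from the original dataset, and the augmented triples would be appended to the TextQA training set.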

Augmented TextQA has the drawback of requiring an additional question-generation model compared with conventional TextQA, but in TextQA competitions, where rankings shift on decimal-point differences in performance, it is a method well worth trying. Errors will inevitably arise in training the question-generation model and in extracting keywords, but if both sources of error are sufficiently minimized, the method can be expected to generalize the TextQA model so that it does not overfit the original data.

As further work on TextQA, we hope to combine the three Seq2Seq-based models presented earlier into a better-performing model, and, in keeping with the characteristics of SQuAD, to add an engineering step that locates the model-generated answer within the passage, thereby improving Exact Match and F1 score and submitting the result to the leaderboard.

References

Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12(Aug), 2493-2537.

Chai, K. M. A., Chieu, H. L., & Ng, H. T. (2002, August). Bayesian online classifiers for text classification and filtering. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 97-104). ACM.

Brown, P. F., Desouza, P. V., Mercer, R. L., Pietra, V. J. D., & Lai, J. C. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4), 467-479.

Eddy, S. R. (1996). Hidden Markov models. Current Opinion in Structural Biology, 6(3), 361-365.

Rieger, B. B. (1991). On distributed representation in word semantics. Berkeley, CA: International Computer Science Institute.

Kumar, A., Irsoy, O., Ondruska, P., Iyyer, M., Bradbury, J., Gulrajani, I., ... & Socher, R. (2016, June). Ask me anything: Dynamic memory networks for natural language processing. In International Conference on Machine Learning (pp. 1378-1387).

Wang, W., Yang, N., Wei, F., Chang, B., & Zhou, M. (2017). Gated self-matching networks for reading comprehension and question answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.

Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345-1359.

Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) (pp. 1188-1196).

Pennington, J., Socher, R., & Manning, C. D. (2014, October). GloVe: Global vectors for word representation. In EMNLP (Vol. 14, pp. 1532-1543).

Wan, J., Wang, D., Hoi, S. C. H., Wu, P., Zhu, J., Zhang, Y., & Li, J. (2014, November). Deep learning for content-based image retrieval: A comprehensive study. In Proceedings of the 22nd ACM International Conference on Multimedia (pp. 157-166). ACM.

Zhang, X., & LeCun, Y. (2015). Text understanding from scratch. arXiv preprint arXiv:1502.01710.

Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (pp. 3104-3112).

Luong, M. T., Le, Q. V., Sutskever, I., Vinyals, O., & Kaiser, L. (2015). Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114.

Bahdanau, D., Chorowski, J., Serdyuk, D., Brakel, P., & Bengio, Y. (2016, March). End-to-end attention-based large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on (pp. 4945-4949). IEEE.

Mostafazadeh, N., Brockett, C., Dolan, B., Galley, M., Gao, J., Spithourakis, G. P., & Vanderwende, L. (2017). Image-grounded conversations: Multimodal context for natural question and response generation. arXiv preprint arXiv:1701.08251.

Dong, L., & Lapata, M. (2016). Language to logical form with neural attention. arXiv preprint arXiv:1601.01280.

Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J. R., Bethard, S., & McClosky, D. (2014, June). The Stanford CoreNLP natural language processing toolkit. In ACL (System Demonstrations) (pp. 55-60).

Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.

LeCun, Y., & Bengio, Y. (1995). Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks, 3361(10), 1995.

Bottou, L. (2014). From machine learning to machine reasoning. Machine Learning, 94(2), 133-149.

ABSTRACT

Data Augmentation Technique with Knowledge Extraction for Text Question Answering by Deep Neural Networks

Hwiyeol Jo

School of Computer Science & Engineering, The Graduate School, Seoul National University

Teaching a machine to communicate with people in natural language is a dream of artificial intelligence researchers. However, due to the numerous exceptions and ambiguities in language, language modeling with traditional rule-based approaches has several limitations. Recently, language modeling research has continued to overcome these limitations by introducing deep learning algorithms. Among natural language processing tasks, TextQA, which generates an answer from a context and a question, is well suited to testing how well a language model understands natural language. Despite many studies, it has not yet reached human performance. In this paper, we propose Augmented TextQA to improve the performance of TextQA models. Augmented TextQA first trains a question-generation model from answers and contexts, and then generates questions through the model from keywords extracted from the context. Finally, we improve the performance of the TextQA model by adding the augmented data to training. Experimental results on SQuAD (the Stanford Question Answering Dataset) show that Augmented TextQA improves model performance compared with the original TextQA.

Furthermore, because Augmented TextQA augments data within the domain of the training data without using external data, the model can learn more appropriate representations of words in a specific domain. The method can also generalize the TextQA model so that it is not limited to the original training data but instead fits the extended training data.

Keywords: Data Augmentation, Deep Learning, Natural Language Processing, Text Question-Answering

Student Number: 2015-22915

Appendix

Appendix 1. Example TextQA Results

Appendix 2. Example Question Generation Results

Appendix 3. Example Augmented TextQA Results

[Appendix 1] Example TextQA Results (in order: good - average - bad)

[Appendix 2] Example Question Generation Results (in order: good - average - bad)
