from GPT-4o
The development of Seq2Seq models involved several key milestones, important papers, and key contributors. Here are the major ones:
Long Short-Term Memory
Hochreiter, S., & Schmidhuber, J. (1997). "Long Short-Term Memory." Neural Computation.
This paper introduced the basic structure and principles of the LSTM.
https://deeplearning.cs.cmu.edu/F23/document/readings/LSTM.pdf
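To make the gating idea concrete, here is a minimal numpy sketch of a single LSTM time step. Note it uses the modern formulation with a forget gate, which Gers et al. added in 1999 and is not in the 1997 paper; all weight names and dimensions are illustrative, not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the four gates
    (input i, forget f, cell candidate g, output o) along axis 0."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # pre-activations for all gates, shape (4n,)
    i = sigmoid(z[0:n])               # input gate: how much new info to write
    f = sigmoid(z[n:2*n])             # forget gate: how much old memory to keep
    g = np.tanh(z[2*n:3*n])           # candidate cell state
    o = sigmoid(z[3*n:4*n])           # output gate: how much memory to expose
    c = f * c_prev + i * g            # new cell state (the long-term "memory")
    h = o * np.tanh(c)                # new hidden state
    return h, c

# Toy usage: input dim 3, hidden dim 4, random weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 3)); U = rng.normal(size=(16, 4)); b = np.zeros(16)
h, c = lstm_step(rng.normal(size=3), np.zeros(4), np.zeros(4), W, U, b)
```

The additive update of c is the key design choice: gradients flow through it without repeated squashing, which is what mitigates the vanishing-gradient problem the paper addresses.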
Sequence to Sequence Learning with Neural Networks
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). "Sequence to Sequence Learning with Neural Networks." NIPS.
This paper proposed the Seq2Seq model and demonstrated its application to machine translation.
https://ar5iv.labs.arxiv.org/html/1409.3215
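A minimal PyTorch sketch of the encoder-decoder idea: one LSTM compresses the source into a fixed-size state, and a second LSTM decodes the target conditioned on it. This is a single-layer toy; the paper used deep 4-layer LSTMs and fed the source sequence reversed, and all sizes here are illustrative.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder-decoder in the spirit of Sutskever et al. (2014)."""
    def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # The encoder's final (h, c) is the fixed-size summary of the source.
        _, (h, c) = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), (h, c))
        return self.out(dec_out)  # per-step vocabulary logits

model = Seq2Seq(src_vocab=100, tgt_vocab=100)
logits = model(torch.randint(0, 100, (2, 7)), torch.randint(0, 100, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 100])
```

The fixed-size bottleneck between encoder and decoder is exactly the limitation that motivated the attention mechanism in the next paper.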
Neural Machine Translation by Jointly Learning to Align and Translate
Bahdanau, D., Cho, K., & Bengio, Y. (2015). "Neural Machine Translation by Jointly Learning to Align and Translate." ICLR.
This paper introduced the attention mechanism, which significantly improved Seq2Seq performance on long sequences.
https://ar5iv.labs.arxiv.org/html/1409.0473
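A sketch of the additive ("Bahdanau") attention score e = v^T tanh(W s + U h): each decoder step scores every encoder state, softmaxes the scores into weights, and takes the weighted sum as a context vector, so the model no longer depends on a single fixed summary. Class names and dimensions here are illustrative.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Bahdanau-style additive attention (sketch)."""
    def __init__(self, dec_dim, enc_dim, attn_dim=64):
        super().__init__()
        self.W = nn.Linear(dec_dim, attn_dim, bias=False)
        self.U = nn.Linear(enc_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_states):
        # dec_state: (batch, dec_dim); enc_states: (batch, src_len, enc_dim)
        scores = self.v(torch.tanh(
            self.W(dec_state).unsqueeze(1) + self.U(enc_states)))  # (batch, src_len, 1)
        weights = torch.softmax(scores.squeeze(-1), dim=-1)        # soft alignment
        context = (weights.unsqueeze(-1) * enc_states).sum(dim=1)  # weighted sum
        return context, weights

attn = AdditiveAttention(dec_dim=64, enc_dim=64)
ctx, w = attn(torch.randn(2, 64), torch.randn(2, 7, 64))
print(ctx.shape, w.shape)  # torch.Size([2, 64]) torch.Size([2, 7])
```

The returned weights are the "soft alignment" of the paper's title: the model learns which source positions matter for each target word.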
Attention Is All You Need
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). "Attention Is All You Need." NIPS.
This paper proposed the Transformer architecture, fundamentally changing how Seq2Seq models are built: recurrence is dropped entirely in favor of attention.
https://ar5iv.labs.arxiv.org/html/1706.03762
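The core operation of the paper is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. Below is a single-head sketch; the full architecture additionally runs h heads in parallel and adds residual connections, layer normalization, positional encodings, and feed-forward sublayers.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(QK^T / sqrt(d_k)) V, as in Vaswani et al. (2017)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (..., q_len, k_len)
    if mask is not None:
        # Positions where mask == 0 are excluded from attention.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

out = scaled_dot_product_attention(
    torch.randn(2, 5, 64), torch.randn(2, 7, 64), torch.randn(2, 7, 64))
print(out.shape)  # torch.Size([2, 5, 64])
```

Because every position attends to every other in one step, the sequential bottleneck of RNN-based Seq2Seq disappears, enabling far better parallelism during training.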
Together, these milestones, papers, and researchers drove the development of Seq2Seq models, leading to their wide adoption and remarkable progress in natural language processing and other sequence-to-sequence tasks.