Learning phrase representations using RNN encoder-decoder for statistical machine translation

By Kyunghyun Cho · Bart van Merriënboer · Caglar Gulcehre · Dzmitry Bahdanau · Fethi Bougares · Holger Schwenk +1 more

Original abstract

In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder-Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.
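The encode-then-decode scoring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain tanh recurrence where the paper uses its own hidden unit, uses random untrained weights, and every name in it (`init_params`, `encode`, `log_prob`, `bos_id`, the parameter keys) is hypothetical. It shows only the data flow: a variable-length source is compressed into one fixed-length vector, and the decoder sums per-token log-probabilities conditioned on that vector.

```python
import math
import random


def matvec(M, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(vocab_src, vocab_tgt, hidden, seed=0):
    # Random, untrained weights; all names here are illustrative assumptions.
    rng = random.Random(seed)
    return {
        "E_src": rand_matrix(rng, vocab_src, hidden),  # source embeddings
        "W_enc": rand_matrix(rng, hidden, hidden),     # encoder recurrence
        "E_tgt": rand_matrix(rng, vocab_tgt, hidden),  # target embeddings
        "W_dec": rand_matrix(rng, hidden, hidden),     # decoder recurrence
        "C_dec": rand_matrix(rng, hidden, hidden),     # projection of the summary
        "W_out": rand_matrix(rng, vocab_tgt, hidden),  # output layer
    }


def encode(p, src_ids):
    # Read the source one symbol at a time; the final state is the
    # fixed-length representation, whatever the source length.
    h = [0.0] * len(p["W_enc"])
    for i in src_ids:
        rec = matvec(p["W_enc"], h)
        h = [math.tanh(e + r) for e, r in zip(p["E_src"][i], rec)]
    return h


def log_softmax(z):
    m = max(z)
    lse = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - lse for v in z]


def log_prob(p, src_ids, tgt_ids, bos_id=0):
    # log p(target | source) as a sum of per-token conditional log-probabilities.
    c = encode(p, src_ids)
    ctx = matvec(p["C_dec"], c)
    h = [math.tanh(v) for v in ctx]  # decoder state initialised from the summary
    prev, total = bos_id, 0.0
    for y in tgt_ids:
        rec = matvec(p["W_dec"], h)
        h = [math.tanh(e + r + k) for e, r, k in zip(p["E_tgt"][prev], rec, ctx)]
        total += log_softmax(matvec(p["W_out"], h))[y]
        prev = y
    return total
```

Under these assumptions, joint training would adjust all six matrices to increase `log_prob` on observed source/target pairs, and the resulting score for a phrase pair is the kind of quantity the abstract describes adding as a feature to a log-linear translation model.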

Related work: refs + corpus + external arXiv

Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows are weighted higher.

Statistical Machine Translation for Indic Languages
Sudhansu Bala Das, Divyajoti Panda, Tapas Kumar Mishra, Bidyut Kr. Patra
2026
≈ 75%
Survey on reinforcement learning for language processing
Victor Uc-Cetina, Nicolas Navarro-Guerrero, Anabel Martin-Gonzalez, Cornelius Weber, Stefan Wermter
2026
≈ 74%
SAE-RNA: A Sparse Autoencoder Model for Interpreting RNA Language Model Representations
Taehan Kim, Sangdae Nam
2025
≈ 73%
Improved acoustic word embeddings for zero-resource languages using multilingual transfer
Herman Kamper, Yevgen Matusevych, Sharon Goldwater
2021
≈ 73%
Pairing Orthographically Variant Literary Words to Standard Equivalents Using Neural Edit Distance Models
Craig Messner and Tom Lippincott
2026
≈ 73%
Representation Learning on a Random Lattice
Aryeh Brill
2025
≈ 73%
Improving Normative Modeling for Multi-modal Neuroimaging Data using mixture-of-product-of-experts variational autoencoders
Sayantan Kumar, Philip Payne, Aristeidis Sotiras
2026
≈ 73%
Transfer Learning for Improving Speech Emotion Classification Accuracy
Siddique Latif, Rajib Rana, Shahzad Younis, Junaid Qadir, and Julien Epps
2020
≈ 73%
Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study
Najibesadat Sadati, Milad Zafar Nezhad, Ratna Babu Chinnam, Dongxiao Zhu
2019
≈ 73%
Interpreting Language Models Through Concept Descriptions: A Survey
Nils Feldhus, Laura Kopf
2026
≈ 73%
Machine Learning Construction: implications to cybersecurity
Waleed A. Yousef
2026
≈ 73%
Probing Task-Oriented Dialogue Representation from Language Models
Chien-Sheng Wu and Caiming Xiong
2020
≈ 73%
Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
in corpus
≈ 73%
Discriminative Dictionary Learning based on Statistical Methods
G. Madhuri, Atul Negi
2026
≈ 72%
Bayesian Neural Networks: An Introduction and Survey
Ethan Goan and Clinton Fookes
2026
≈ 72%
Model Alignment Search
in corpus
2025
≈ 69%
Interpreting Language Model Parameters
in corpus
2026
≈ 69%
Paper Summary: Interpreting Language Model Parameters
in corpus
≈ 68%
Relating transformers to models and neural representations of the hippocampal formation
in corpus
2021
≈ 68%
The Platonic Representation Hypothesis
in corpus
2024
≈ 68%
Addressing divergent representations from causal interventions on neural networks
in corpus
2025
≈ 68%
Mechanistic Interpretability of EEG Foundation Models via Sparse Autoencoders
in corpus
2026
≈ 67%
Simulators — LessWrong
in corpus
≈ 67%
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
in corpus
2025
≈ 67%
Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders
in corpus
2026
≈ 67%
A Mathematical Framework for Transformer Circuits
in corpus
2021
≈ 67%

Cited by (2)

Model Alignment Search
Model Alignment Search (MAS) establishes bidirectional causal similarity between neural networks by learning a per-model orthogonal rotation matrix that isolates behaviorally relevant subspaces and us
Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
Manifold steering — intervening on model activations along paths constrained to lie on a learned activation manifold M_h rather than along Euclidean linear directions — produces behavioral trajectorie