
Add support for transductive word alignment models #322

Description

@ddaspit

The current word alignment model interface is inductive: alignment operates in two stages, training and inference. To align an entire corpus, you first train the model on the corpus and then align the corpus. For most standard statistical models, this approach does not add extra expense, since an inference pass is essentially the same as a training pass. This is not true for Eflomal, which uses Gibbs sampling. When aligning, Eflomal needs to perform multiple iterations on each sentence pair to burn in the coupled alignment from a cold start and to average samples. That work is already done as part of the training pass, so if you want to align the training corpus, it is much cheaper to produce the alignments during training. Thot models now support this, so Machine needs to be updated to take advantage of transductive alignment. It would be used in the word_align_corpus function.
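To make the difference concrete, here is a minimal Python sketch of how a corpus-alignment helper could prefer a transductive path when the model offers one. Every name in it (`TransductiveAligner`, `train_and_align`, `ToySamplingAligner`, `align_corpus_sketch`) is hypothetical and is not the Machine or Thot API. The toy model only counts sampling sweeps and returns a diagonal alignment, so it illustrates the cost argument, not Eflomal itself.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple, runtime_checkable

SentencePair = Tuple[Sequence[str], Sequence[str]]
Alignment = List[Tuple[int, int]]  # (source index, target index) links


@runtime_checkable
class TransductiveAligner(Protocol):
    """Hypothetical interface: train on a corpus and return its alignments."""

    def train_and_align(self, corpus: Sequence[SentencePair]) -> List[Alignment]: ...


@dataclass
class ToySamplingAligner:
    """Stand-in for a sampling-based model; it only counts sampling sweeps."""

    iterations: int = 4  # assumed number of sweeps per sentence pair
    sweeps: int = 0

    def _diagonal(self, pair: SentencePair) -> Alignment:
        # Placeholder output: link position i to position i.
        src, trg = pair
        return [(i, i) for i in range(min(len(src), len(trg)))]

    def train(self, corpus: Sequence[SentencePair]) -> None:
        self.sweeps += self.iterations * len(corpus)

    def align(self, pair: SentencePair) -> Alignment:
        # Inductive inference restarts from a cold start, so it pays
        # the per-pair sweeps again.
        self.sweeps += self.iterations
        return self._diagonal(pair)

    def train_and_align(self, corpus: Sequence[SentencePair]) -> List[Alignment]:
        # Transductive path: the alignments fall out of the training sweeps.
        self.train(corpus)
        return [self._diagonal(pair) for pair in corpus]


def align_corpus_sketch(model, corpus: Sequence[SentencePair]) -> List[Alignment]:
    """Use the transductive path if the model has one, else train then align."""
    if isinstance(model, TransductiveAligner):
        return model.train_and_align(corpus)
    model.train(corpus)
    return [model.align(pair) for pair in corpus]
```

With three sentence pairs and the assumed four sweeps per pair, the transductive path in this toy costs 12 sweeps, while calling `train` and then `align` on every pair costs 24. Dispatching on the presence of `train_and_align` lets models without a transductive mode keep the existing two-stage behavior.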


Projects

Status
👀 In review
