The current word alignment model interface is inductive: alignment operates in two stages, training and inference. To align an entire corpus, you first train the model on the corpus and then run inference over that same corpus. For most standard statistical models, this approach adds little extra expense, since an inference pass costs essentially the same as a single training pass. This is not true for Eflomal, which uses Gibbs sampling. To align a sentence pair at inference time, Eflomal has to run multiple sampling iterations, both to burn in the coupled alignment from a cold start and to average samples. That work is already done as part of the training pass, so when the goal is to align the training corpus, it is much cheaper to produce the alignments during training. Thot models now support this, so Machine needs to be updated to take advantage of transductive alignment. It would be used in the word_align_corpus function.
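To make the cost difference concrete, here is a minimal toy sketch of the two approaches. Everything in it is an assumption for illustration: `ToySamplingAligner`, `train_and_align`, and the two `word_align_corpus_*` helpers are hypothetical names, not the actual Machine or Thot API, and the "sampling" is a placeholder that only counts per-sentence passes.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

SentencePair = Tuple[Sequence[str], Sequence[str]]
Alignment = List[Tuple[int, int]]


@dataclass
class ToySamplingAligner:
    """Hypothetical stand-in for a sampling-based aligner (not a real API)."""

    iterations: int = 5  # sampling iterations needed per sentence pair
    passes: int = 0      # per-sentence sampling passes, used as a cost proxy
    _training_alignments: List[Alignment] = field(default_factory=list, init=False)

    def _sample(self, pair: SentencePair) -> Alignment:
        # Placeholder for Gibbs sampling: returns a diagonal alignment,
        # but charges `iterations` passes to mimic burn-in and averaging.
        src, trg = pair
        self.passes += self.iterations
        return [(i, i) for i in range(min(len(src), len(trg)))]

    def train(self, corpus: Sequence[SentencePair]) -> None:
        # Training already samples an alignment for every training pair.
        self._training_alignments = [self._sample(p) for p in corpus]

    def align(self, pair: SentencePair) -> Alignment:
        # Inductive inference: starts cold, so it pays for sampling again.
        return self._sample(pair)

    def train_and_align(self, corpus: Sequence[SentencePair]) -> List[Alignment]:
        # Transductive: return the alignments produced during training.
        self.train(corpus)
        return list(self._training_alignments)


def word_align_corpus_inductive(model, corpus):
    model.train(corpus)
    return [model.align(p) for p in corpus]


def word_align_corpus_transductive(model, corpus):
    return model.train_and_align(corpus)


corpus = [(["a", "b"], ["x", "y"]), (["c"], ["z"])]

inductive_model = ToySamplingAligner()
inductive_result = word_align_corpus_inductive(inductive_model, corpus)

transductive_model = ToySamplingAligner()
transductive_result = word_align_corpus_transductive(transductive_model, corpus)
```

In this toy, both paths yield the same alignments, but the inductive path performs twice as many sampling passes because every training pair is sampled once during training and again during inference.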