How to get phrase tables from word alignments? - McMap

Asked 26/7, 2014 at 11:15 Answered 3/8, 2014 at 21:10

Solved algorithm nlp machine-translation moses

The output of my word alignment step looks like this (source sentence, target sentence, then source-target index pairs):

I wish to say with regard to the initiative of the Portuguese Presidency that we support the spirit and the political intention behind it . In bezug auf die Initiative der portugiesischen Präsidentschaft möchte ich zum Ausdruck bringen , daß wir den Geist und die politische Absicht , die dahinter stehen , unterstützen .   0-0 5-1 5-2 2-3 8-4 7-5 11-6 12-7 1-8 0-9 9-10 3-11 10-12 13-13 13-14 14-15 16-16 17-17 18-18 16-19 20-20 21-21 19-22 19-23 22-24 22-25 23-26 15-27 24-28
It may not be an ideal initiative in terms of its structure but we accept Mr President-in-Office , that it is rooted in idealism and for that reason we are inclined to support it .    Von der Struktur her ist es vielleicht keine ideale Initiative , aber , Herr amtierender Ratspräsident , wir akzeptieren , daß sie auf Idealismus fußt , und sind deshalb geneigt , sie mitzutragen .   0-0 11-2 8-3 0-4 3-5 1-6 2-7 5-8 6-9 12-11 17-12 15-13 16-14 16-15 17-16 13-17 14-18 17-19 18-20 19-21 21-22 23-23 21-24 26-25 24-26 29-27 27-28 30-29 31-30 33-31 32-32 34-33

How can I produce the phrase tables that Moses uses from this output?

These slides explain consistent phrase extraction: http://www.inf.ed.ac.uk/teaching/courses/mt/lectures/phrase-model.pdf but what is the algorithm for actually extracting the phrases? (slides 16-21)

Sarinasarine asked 26/7, 2014 at 11:15 Comment(8)

I've tried iterating over all possible cell sizes with all possible combinations, but that gives me n! * m! * n * m cells to check for every sentence, where n and m are the lengths of the source and target sentences. – Sarinasarine 26/7, 2014 at 11:16

I don't understand your question. Are you trying to get the alignment itself? How does your alignment work? – Meadors 27/7, 2014 at 8:27

@Daniel, word alignment != phrase table. I've found the algorithm but it's not working somehow, #25109501 – Sarinasarine 3/8, 2014 at 21:7

What do you mean by "not working somehow"? You implemented the algorithm below in the response, and it is giving wrong answers? – Meadors 4/8, 2014 at 3:9

yes, it's not giving the right output... – Sarinasarine 4/8, 2014 at 6:26

well, it seems like the alignment below is just an approximation, and not guaranteed to give consistent results. – Meadors 4/8, 2014 at 10:0

Is this a standard input format? Looks pretty ad-hoc and hard to use. – Coextensive 9/8, 2014 at 17:3

Yes, it's the Pharaoh output format. One could also use the GIZA++ output format instead, e.g. rali.iro.umontreal.ca/rali/?q=en/node/1325#ali. – Sarinasarine 9/8, 2014 at 17:11

To get a phrase table, first extract the phrase pairs with the following algorithm from Philipp Koehn's Statistical Machine Translation book, p. 133:

[Image: pseudocode of the phrase extraction algorithm]
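
For reference, here is a minimal Python sketch of consistent phrase extraction along the lines of that algorithm. It is my own reading, not a transcription of the book's pseudocode. The names `parse_alignment` and `extract_phrases` and the `max_len` cap are assumptions made for illustration, and indices are 0-based as in the Pharaoh-style alignment from the question.

```python
def parse_alignment(text):
    """Parse Pharaoh-style 'i-j' pairs into a set of (src_idx, tgt_idx) tuples."""
    return {tuple(int(k) for k in pair.split("-")) for pair in text.split()}


def extract_phrases(src, tgt, alignment, max_len=7):
    """Return the set of phrase pairs consistent with the word alignment.

    src, tgt: lists of tokens; alignment: set of (src_idx, tgt_idx).
    max_len is an illustrative cap on phrase length (an assumption).
    """
    tgt_aligned = {j for _, j in alignment}
    phrases = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Minimal target span covering every link from the source span.
            linked = [j for i, j in alignment if s_start <= i <= s_end]
            if not linked:
                continue  # a phrase pair needs at least one alignment point
            t_start, t_end = min(linked), max(linked)
            # Consistency check: no target word inside the span may be
            # linked to a source word outside the source span.
            if any(t_start <= j <= t_end and not (s_start <= i <= s_end)
                   for i, j in alignment):
                continue
            # Also emit variants that absorb adjacent unaligned target words.
            ts = t_start
            while True:
                te = t_end
                while True:
                    if te - ts < max_len:
                        phrases.add((" ".join(src[s_start:s_end + 1]),
                                     " ".join(tgt[ts:te + 1])))
                    te += 1
                    if te >= len(tgt) or te in tgt_aligned:
                        break
                ts -= 1
                if ts < 0 or ts in tgt_aligned:
                    break
    return phrases
```

Usage would be something like `extract_phrases(src_line.split(), tgt_line.split(), parse_alignment(align_field))` for each sentence pair. This only loops over source spans, so it avoids enumerating every possible cell combination.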

Then estimate the probabilities for the phrases with their relative frequencies, i.e.

[Image: relative-frequency formula for the phrase translation probability]
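
As a rough sketch of the relative-frequency estimate: the probability of a target phrase given a source phrase is count(src, tgt) divided by the sum of count(src, t) over every target phrase t extracted with that source phrase. The function name `relative_frequencies` is hypothetical, and the input is assumed to be one (source phrase, target phrase) entry per extraction over the whole corpus, so repeats matter.

```python
from collections import Counter


def relative_frequencies(phrase_pairs):
    """Estimate phi(tgt | src) = count(src, tgt) / sum over t of count(src, t).

    phrase_pairs: iterable of (src_phrase, tgt_phrase) tuples collected
    from all sentence pairs (duplicates are counted).
    """
    pair_counts = Counter(phrase_pairs)
    # Total number of extractions per source phrase.
    src_counts = Counter()
    for (src, _), count in pair_counts.items():
        src_counts[src] += count
    return {(src, tgt): count / src_counts[src]
            for (src, tgt), count in pair_counts.items()}
```

Swapping the elements of each pair before calling the function gives the estimate in the other direction.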

Note that the original printed version of the book has an error on line 4 of the extract() function; it is corrected in the errata.

Also see "Phrase extraction algorithm for statistical machine translation" for the details.

Sarinasarine answered 3/8, 2014 at 21:10 Comment(0)
