The splitta library is a sentence boundary detector. I'll have to incorporate something like it, since segmentation is an extremely important function of any translation system. So it should be Perl-ized here.
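To make the segmentation task concrete, here is a deliberately naive sketch in Python. It is not splitta's method or API (splitta uses trained statistical classifiers); it is just a regex plus a small, hypothetical abbreviation list, and the names `ABBREVIATIONS` and `split_sentences` are mine, for illustration only.

```python
import re

# Hypothetical abbreviation list, for illustration only; a real
# segmenter would learn or load a far larger set.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "e.g", "i.e", "etc", "vs"}


def split_sentences(text):
    """Split text at ., ! or ? when followed by whitespace and a capital."""
    sentences = []
    start = 0
    # Candidate boundary: end punctuation, whitespace, then an uppercase letter.
    for match in re.finditer(r"[.!?]\s+(?=[A-Z])", text):
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        # A period right after a known abbreviation is not a boundary.
        if text[match.start()] == "." and last_word in ABBREVIATIONS:
            continue
        sentences.append(text[start:match.start() + 1].strip())
        start = match.end()
    # Whatever remains after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences("Dr. Smith arrived. He was late! Was it raining?"):
        print(sentence)
```

A rule like this breaks quickly on real text (initials, decimals, quotes, sentences that start in lowercase), which is exactly why a trained boundary detector is worth porting.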
The Topia term extractor is the other library I wanted to point out here.
It's also worth noting that both of these libraries are written in Python. An awful lot of natural language work ends up in Python. That's kind of interesting, actually.