Nathan D. Smith d4301ec570 Requirements upgrade 2 years ago
..
legacy 8d496db058 Refactored things a bit 5 years ago
sblgnt-corpus d053ea24b8 Add corpus files 5 years ago
.gitignore 05a7eaa40d PEP8 compliance 3 years ago
COPYING 446543f56e Added sblgnt-nltk.py 5 years ago
README d4301ec570 Requirements upgrade 2 years ago
requirements.txt 616c74ad9f Correct requirements 3 years ago
sblgnt-corpus.py f527c5f302 pep8 corections 3 years ago

README

sblgnt-corpus.py creates a tagged NLTK corpus from the MorphGNT [0]
files. It uses the py-sblgnt module for source data and will output
the files in "morphgnt-corpus/". These files can then be loaded as a
tagged, categorized corpus using something like the following:

>>> import koine, nltk
>>> sblgnt = nltk.corpus.reader.CategorizedTaggedCorpusReader('sblgnt-corpus/'
,'.*',encoding=u'utf8',tag_mapping_function=koine.simplify_tag,
cat_file='cats.txt')

Requirements: These scripts require Python 3, NLTK 3.0, and py-sblgnt.
To quickly install the requirements with pip:

$ pip install -r requirements.txt

[0] https://github.com/morphgnt/sblgnt

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see .