categories.json
categories.json is an optional file. It can be used to classify grammatical tags into categories. Category labels are strings, just like tags. For example, N and V could belong to a category called pos, nom and acc to case, and sg and pl to number. You will only need this file if you want CoNLL-like output (format='conll' when calling analysis functions).
categories.json is a dictionary. Its keys are tags and values are their categories. Here is an example:
{
    "A": "pos",
    "S": "pos",
    "V": "pos",
    "acc": "case",
    "gen": "case",
    "nom": "case"
}
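To check which tags ended up in which category, it can help to invert the mapping. The following is a minimal sketch that uses only Python's standard json module; the helper name tags_by_category is hypothetical and is not part of the analyzer's API. The JSON is embedded as a string here so that the example is self-contained; in practice you would read it from your categories.json file.

```python
import json
from collections import defaultdict

# Same content as the example categories.json (tag -> category).
CATEGORIES_JSON = """
{
    "A": "pos",
    "S": "pos",
    "V": "pos",
    "acc": "case",
    "gen": "case",
    "nom": "case"
}
"""


def tags_by_category(categories):
    """Invert a tag -> category dict into category -> sorted list of tags.

    Hypothetical helper for inspection only; not part of the analyzer.
    """
    grouped = defaultdict(list)
    for tag, category in categories.items():
        grouped[category].append(tag)
    return {category: sorted(tags) for category, tags in grouped.items()}


categories = json.loads(CATEGORIES_JSON)
grouped = tags_by_category(categories)
print(grouped)
```

A listing like this makes it easy to spot a tag that was assigned to the wrong category or left out of the file.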
If you have categories.json and use conll formatting, this is what you get:
The POS column is filled with the tag(s) that are categorized as pos in the file.
All the rest goes to the next column as key-value pairs joined by |. So instead of nom,sg you will get something like Case=nom|Number=sg, provided both nom and sg are in the file.
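The split described above can be illustrated with a short sketch. This is not the analyzer's actual implementation: the function split_tags is hypothetical, capitalizing the category name (case becoming Case) is an assumption based on the sample output above, and returning uncategorized tags in a separate list is an arbitrary choice made for the illustration, since the behavior for such tags is not specified here.

```python
def split_tags(tags, categories):
    """Split a list of tags into a POS string and a key=value feature string.

    Hypothetical illustration only. Assumptions:
    - category names are capitalized in the output (case -> Case);
    - tags missing from `categories` are returned separately.
    """
    pos_tags = []
    features = []
    uncategorized = []
    for tag in tags:
        category = categories.get(tag)
        if category is None:
            uncategorized.append(tag)
        elif category == "pos":
            pos_tags.append(tag)
        else:
            features.append(f"{category.capitalize()}={tag}")
    return ",".join(pos_tags), "|".join(features), uncategorized


# Example mapping; "sg" and "pl" are added so that number is covered too.
categories = {
    "S": "pos",
    "V": "pos",
    "nom": "case",
    "acc": "case",
    "sg": "number",
    "pl": "number",
}

pos, feats, rest = split_tags(["S", "nom", "sg"], categories)
print(pos)    # S
print(feats)  # Case=nom|Number=sg
print(rest)   # []
```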