Standard Tokenizer
A tokenizer of type standard that provides grammar-based tokenization and works well for most European-language documents.
The tokenizer implements the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
max_token_length — The maximum token length. If a token exceeds this length, it is split at max_token_length intervals. Defaults to 255.
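As a sketch of how this tokenizer might be configured, the settings below define a custom analyzer built on the standard tokenizer with a lowered max_token_length. The index name and analyzer/tokenizer names (my_analyzer, my_tokenizer) are illustrative assumptions, not part of the original text.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "standard",
          "max_token_length": 5
        }
      }
    }
  }
}
```

With max_token_length set to 5, a token longer than five characters would be split into five-character chunks rather than emitted whole; the exact behavior can be checked against a running cluster with the _analyze API.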