Hi Hassan, thank you for your kind words. If I understand correctly, what you are describing is a language model trained at the sentence level. I don’t think we will be able to use this to increase accuracy in the no-punctuation case. Also, I believe that for the purely no-punctuation case, whatever model we implement, the accuracy will not be able to cross a certain level (maybe 94-95%). This is because some sentences make sense when combined too. If you download the model and the test data, you will see that many of the sentences which the model failed to split are ambiguous, i.e. they still read as valid text when joined together.

That being said, there is still scope for improvement. I trained this model on data generated from just 1 million sentences because of limited server time. I plan to train the same model on a larger amount of data from different sources (Tatoeba, UD Treebank, etc.) in the future to increase the real-world generalisability of the model.

Written by

Senior NLP Engineer - DeepAffects

