Updates for the N-gram generator

— Posted in Taal by

Once in a while I receive emails from researchers all over the world with thanks and/or suggestions for the scripts I provide online, such as the frequency list and n-gram generators. About the latter tool, I had a nice email conversation with a researcher from overseas, which led to the following enhancements and updates. I really enjoy these kinds of exchanges, so if you have any suggestions or feedback, you know where to find me.

  • Slight efficiency rewrite of output rendering. (2024-01-26)
  • Added feature for respecting or ignoring sentence boundaries. (2024-01-25)
  • Added feature for including or excluding numbers. (2024-01-25)
  • Added top limits above 1,000 (2,000, 3,000, 4,000, 5,000, 10,000) to respect or ignore sentence boundaries. (2024-01-25)
  • Added feature for (virtually) unlimited results. (2024-01-22)
  • Added feature for unigrams. (2024-01-22)
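
For readers curious how the options listed above fit together, here is a minimal Python sketch. It is not the code behind the online tool: the function name `ngrams` and its parameters (`n`, `respect_sentences`, `include_numbers`, `top`) are hypothetical, and the sentence splitting and tokenisation are deliberately naive assumptions.

```python
import re
from collections import Counter


def ngrams(text, n=2, respect_sentences=True, include_numbers=True, top=None):
    """Count n-grams in text (illustrative sketch, not the tool's actual code).

    n                 -- size of the n-gram; n=1 yields unigrams
    respect_sentences -- if True, n-grams never span a sentence boundary
    include_numbers   -- if False, purely numeric tokens are dropped
    top               -- maximum number of results; None means no limit
    """
    # Naive assumption: a sentence ends at one or more of . ! ?
    segments = re.split(r"[.!?]+", text) if respect_sentences else [text]

    counts = Counter()
    for segment in segments:
        # Naive tokenisation: lowercase runs of word characters.
        tokens = re.findall(r"\w+", segment.lower())
        if not include_numbers:
            tokens = [t for t in tokens if not t.isdigit()]
        # Slide a window of size n over the tokens of this segment.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1

    # most_common(None) returns every n-gram, sorted by frequency.
    return counts.most_common(top)
```

With `respect_sentences=True`, the text "The cat sat. The cat ran." produces no bigram "sat the", because those two words sit on either side of a sentence boundary; with `respect_sentences=False` that bigram is counted once.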