I had a project a while back that made a safe dictionary for password generation for school kids. I simply gave a computer to a bunch of teenagers and got them to think of any and all inappropriate words. I believe my project is/was the most complete open-source project of its kind right now, but I may be wrong about that (I also combined other English dictionaries to ensure that it is as complete as possible). If you guys (or some teenagers you know) come up with any other words, feel free to submit them!

I've also got it separated into multiple levels of filtering:

Light - Removes all obscene words and profanity. Will not remove words that might be offensive.
Medium - Removes all negative slang, derogatory phrases, profanity, and references to dangerous items (substances, weapons, etc.). All potentially offensive words are removed.
Full - Removes anything remotely controversial, including some references to religion (only those that could be offensive), and everything included in medium. In addition, any word not suitable for explaining to a baby is removed. In addition, any nuisance words (like one-letter entries except for 'a' and 'i') will be removed.

It's surprising to me that there are several words in words_alpha.txt that are not in words.txt. I noticed a commit in the history that added true and false to words_alpha.txt, and yet words.txt has true but not false, or thunder, or many others, which I can see by comparing safedict_full.txt to words.txt using my favorite comparison tool, Beyond Compare.

It would be trivial for me to write a simple C# app to merge the words in safedict_full.txt into words.txt (and words_alpha.txt), and to ensure that the words are properly sorted (since there are cases where the sort is off by a little in words.txt).
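The merge described above (folding the words from safedict_full.txt into words.txt and re-sorting the result) might look roughly like the sketch below. It is written in Python rather than C#, and it is only an illustration under stated assumptions: the function names are hypothetical, the files are assumed to hold one word per line in UTF-8, and the sort order (case-insensitive, with the raw string as a tie-breaker) is a guess, since the text does not say what order words.txt is meant to follow.

```python
from pathlib import Path


def read_words(path):
    """Return the non-empty, whitespace-stripped lines of a word list file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def missing_words(source, target):
    """Words in source that target lacks, compared case-insensitively."""
    have = {w.lower() for w in target}
    return sorted({w for w in source if w.lower() not in have}, key=str.lower)


def merge_word_lists(target, source):
    """Add the words target is missing, drop exact duplicates, and sort.

    Assumed sort order: case-insensitive, raw string as tie-breaker.
    """
    merged = set(target) | set(missing_words(source, target))
    return sorted(merged, key=lambda w: (w.lower(), w))


def merge_files(target_path, source_path):
    """Rewrite target_path in place with source_path's words merged in."""
    merged = merge_word_lists(read_words(target_path), read_words(source_path))
    Path(target_path).write_text("\n".join(merged) + "\n", encoding="utf-8")
    return merged
```

Calling `missing_words(read_words("safedict_full.txt"), read_words("words.txt"))` would list the gaps a comparison tool shows, such as false or thunder, and `merge_files("words.txt", "safedict_full.txt")` would then write the merged, re-sorted list back. Because the merge rewrites the file, it is worth running it on a copy first.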