How to remove characters from a text file?
Anyone who has worked a lot with text files has probably run into this problem: a text file filled with unwanted characters. Some are “hidden” (that is, characters your system can’t represent as symbols on your screen) and some are simply redundant.
For the visible unwanted characters you can do a plain search-and-replace, but that gets tedious in some documents, and for “hidden” characters, which you can’t see to select, it is harder still.
My solution for this is to use regular expressions with Notepad++.
Notepad++ is one of the most powerful tools an experienced computer user can have. It can do a lot of things, even ones that don’t seem to be within its “job description”.
Regular expressions and Notepad++
Basically, what you need to do is copy the line of code below, paste it into the “Find what” text box of the Replace dialog (Ctrl + H), leave “Replace with” empty, set “Search Mode” to “Regular expression”, and click “Replace All”.
[^a-z A-Z0-9\n!@#$%^&*{}()_+=<>\[\]\t:;'",.?/]
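This line of code tells Notepad++ to match everything that is not listed inside the square brackets; the ^ right after the opening bracket is what negates the list.
When you search and replace for everything that is not in that list, all other characters (the unwanted ones) are replaced with nothing, that is, deleted, as long as the “Replace with” box is left empty.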
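If you would rather script the same clean-up than run it by hand, the idea carries over to other regex engines. Here is a minimal Python sketch that uses the same bracket expression; the names UNWANTED and strip_unwanted and the sample string are made up for illustration, and this is not something Notepad++ itself runs.

```python
import re

# Same negated character class as the Notepad++ expression:
# every character NOT listed inside the brackets is a match.
UNWANTED = re.compile(r"""[^a-z A-Z0-9\n!@#$%^&*{}()_+=<>\[\]\t:;'",.?/]""")


def strip_unwanted(text: str) -> str:
    """Delete every character that is not in the allowed list."""
    # Replacing with the empty string removes the match entirely.
    return UNWANTED.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample: zero-width space, non-breaking space, bell character.
    sample = "Hello\u200b, w\u00a0orld!\x07\n"
    print(repr(strip_unwanted(sample)))
```

Replacing with an empty string is what mirrors the empty “Replace with” box.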
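Of course, you’ll probably need to customize this expression in order to keep some characters or remove others. For example, the hyphen (-), the backslash (\) and the carriage return (\r) of Windows line endings are not in the list, so they are removed as well unless you add \-, \\ or \r inside the brackets.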
Please note: in the code, \t means Tab, \n means Newline, and a-z, A-Z and 0-9 are ranges of characters (lowercase letters, uppercase letters and digits, respectively).
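Customizing the bracket expression by hand gets fiddly once you add characters that have a special meaning inside brackets, such as ], ^ or -. One way around that, sketched here in Python under the assumption that you are scripting the clean-up instead of using the Replace dialog, is to build the expression from a plain string of allowed characters. The helper name build_cleaner and its parameters are hypothetical.

```python
import re
import string

# The characters kept by the original expression, written as a plain string.
DEFAULT_ALLOWED = (
    string.ascii_letters + string.digits + " \n\t" + "!@#$%^&*{}()_+=<>[]:;'\",.?/"
)


def build_cleaner(extra_allowed: str = "", also_remove: str = "") -> re.Pattern:
    """Return a compiled pattern matching every character outside the allowed set."""
    allowed = set(DEFAULT_ALLOWED)
    allowed |= set(extra_allowed)   # characters you want to keep in addition
    allowed -= set(also_remove)     # characters you want stripped after all
    # re.escape keeps characters such as ], ^ and - literal inside the brackets.
    body = "".join(re.escape(ch) for ch in sorted(allowed))
    return re.compile(f"[^{body}]")


if __name__ == "__main__":
    # Keep hyphens and Windows carriage returns; the ellipsis character is still removed.
    keep_dashes = build_cleaner(extra_allowed="-\r")
    print(repr(keep_dashes.sub("", "well-known\u2026\r\n")))

    # Strip digits in addition to the usual unwanted characters.
    no_digits = build_cleaner(also_remove=string.digits)
    print(repr(no_digits.sub("", "abc123")))
```

Keeping the allowed characters in an ordinary string means you can add or drop one without worrying about where it sits between the brackets.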