Manipulating large Wordlists & Dictionary files
If you have ever tried to remove duplicates and junk from a large file, you have probably spent a lot of time on it and realized that Excel does NOT like that.
A very good way to do it is to use TextWrangler, a free app by Bare Bones Software that can do the job in almost no time.
You can download TextWrangler from here
How To remove duplicates:
Open the file, then open the “#!” tab and use one of the preloaded filters.
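If you would rather script this step, here is a minimal Python sketch of the same idea. It is not what the TextWrangler filter runs internally; the function names and file paths are hypothetical, and it assumes the set of distinct lines fits in memory.

```python
def unique_lines(lines):
    """Yield each line the first time it appears, keeping the original order."""
    seen = set()  # every distinct line met so far (held in memory)
    for line in lines:
        word = line.rstrip("\r\n")  # ignore line-ending differences
        if word not in seen:
            seen.add(word)
            yield word


def dedupe_file(src_path, dst_path):
    """Stream src_path line by line and write its unique lines to dst_path.

    Both paths are placeholders chosen for this example. The
    'surrogateescape' handler lets bytes that are not valid UTF-8 pass
    through unchanged, which is common in scraped wordlists.
    """
    with open(src_path, encoding="utf-8", errors="surrogateescape") as src, \
         open(dst_path, "w", encoding="utf-8", errors="surrogateescape") as dst:
        for word in unique_lines(src):
            dst.write(word + "\n")
```

Calling dedupe_file("wordlist.txt", "wordlist_unique.txt") would read the first (hypothetical) file and write the second. Because the input is streamed, only the distinct lines are kept in memory, not the whole file.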
How To remove lines longer than xx characters:
Tab “Text”
Select “Process Line Containing”
Select “Use Grep”
Select one or more of the options you want
(“Copy to clipboard”, “Copy to new document”, etc.)
To remove lines longer than 63 characters
type in the box .{64,} (this matches any line of 64 or more characters; .{63,} would also catch lines of exactly 63)
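The same length filter can be sketched in Python with the standard re module. This is only an illustration under my own assumptions: the name drop_long_lines and the max_len parameter are made up for the example. Note that a pattern of the form .{N,} matches lines of N or more characters, so dropping lines longer than 63 characters means matching on .{64,}.

```python
import re


def drop_long_lines(lines, max_len=63):
    """Yield only the lines that are at most max_len characters long."""
    # .{64,} (for max_len=63) matches any line of 64 or more characters.
    too_long = re.compile(r".{%d,}" % (max_len + 1))
    for line in lines:
        word = line.rstrip("\r\n")  # do not count the line ending
        if not too_long.match(word):
            yield word
```

For example, list(drop_long_lines(open("wordlist.txt"))) would return the kept lines of a hypothetical wordlist.txt. A plain len(word) <= max_len check would do the same job; the regular expression is used here only to mirror the grep pattern typed into TextWrangler.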
How To Count Occurrences
Open TextWrangler
Download this file Count Occurrences.pl and save it on the Desktop
In TextWrangler, in the “#!” tab, click on “Unix Filters”, then on “Open Filters Folder”
Drag the file “Count Occurrences.pl” into the folder
You now have a filter called “Count Occurrences”
The filter counts the occurrences of each word. You then just need to sort the result and reverse the sorting to get the list ordered by number of occurrences or, in effect, a file sorted the smart way.
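For readers without the Perl filter at hand, the counting and sorting can be approximated in Python. This is a sketch of the general technique, not a port of Count Occurrences.pl, whose contents are not shown here; the function name is hypothetical and each line is assumed to hold one word.

```python
from collections import Counter


def count_occurrences(lines):
    """Return (word, count) pairs, most frequent word first."""
    counts = Counter(line.rstrip("\r\n") for line in lines)
    # most_common() sorts by count in descending order, which covers the
    # "sort, then reverse" step; ties keep their first-seen order.
    return counts.most_common()
```

Writing the words back out in the returned order gives a wordlist with the most common entries at the top.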