Creating Custom Wordlists For Password Brute Forcing

November 25, 2008

By Paul Asadoorian
This is a nice, easy way to build a custom dictionary for your target. I got some of the original code from SANS Security 560 by Ed Skoudis. With his permission, I’ve published some of my enhancements. The first step is to grab the entire web site:

wget -r -l 2 www.targetwebsite.com
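A slightly more defensive variant of this command is sketched below. The flags are standard GNU wget options, but the wrapper function, its name, and the DRY_RUN variable are my own additions for illustration, not part of the original code.

```shell
# Hypothetical wrapper around wget. With DRY_RUN=1 it only prints the command.
mirror_site() {
    site=$1
    depth=${2:-2}
    # -r           recurse through links
    # -l DEPTH     maximum recursion depth
    # --no-parent  never ascend above the starting directory
    # --domains    restrict followed hosts to this domain list
    #              (mostly matters if host spanning, -H, is also enabled)
    set -- wget -r -l "$depth" --no-parent --domains="$site" "$site"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Print the command that would be run, without touching the network.
DRY_RUN=1 mirror_site www.targetwebsite.com 2
```

Dropping DRY_RUN=1 runs the crawl for real; only do that against sites you are authorized to test.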

I’m going two levels deep here; you can adjust that with the “-l” flag. How deep to go depends on how big a dictionary you want and how big your target site is. [Editor’s note: If wget is allowed to follow links to other hosts (GNU wget needs the “-H” flag for that), the crawl can take you outside of the target website. As Paul pointed out, this may be valuable. If the sites are linked, there is something in common and valuable between them.] Next, we replace the whitespace with newline characters and produce a unique list:

grep -hr "" www.targetwebsite.com/ | tr '[:space:]' '\n' | sort | uniq > wordlist.lst
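To see what this pipeline produces, here is a small sketch run against a made-up page rather than a real mirror. The temporary directory and the HTML content are invented for illustration, and the extra grep that drops empty lines is my own addition.

```shell
# Build a throwaway directory that stands in for the wget mirror (hypothetical content).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/www.targetwebsite.com"
printf '<p>Acme  Widgets\tAcme rocks</p>\n' > "$demo_dir/www.targetwebsite.com/index.html"

# grep -hr "" prints every line of every file without filename prefixes;
# tr turns each whitespace character into a newline; grep -v '^$' drops the
# empty lines left by consecutive whitespace; sort -u removes duplicates.
grep -hr "" "$demo_dir/www.targetwebsite.com/" \
    | tr '[:space:]' '\n' \
    | grep -v '^$' \
    | sort -u > "$demo_dir/wordlist.lst"

cat "$demo_dir/wordlist.lst"
```

The result still contains tokens with markup glued on, such as "<p>Acme", which is why a cleanup pass is needed afterwards.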

The next step is to throw out the words that contain weird characters. Don’t worry, we can put special characters back later. This primarily removes the HTML tags and such:

grep -Ev '[][,;{}<>:="/]' wordlist.lst | sort -u > wordlist.clean.lst
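If you also want to drop words containing parentheses, one option that stays within grep is a bracket expression, where "(" and ")" are ordinary characters. The sketch below is my own variation, not the original SANS code, and the sample words are made up.

```shell
# Sample input standing in for wordlist.lst (invented words).
work_dir=$(mktemp -d)
printf '%s\n' 'widgets' 'function(x)' 'href="/about"' 'a[0]' 'Acme2008' \
    > "$work_dir/wordlist.lst"

# Inside [...] most regex metacharacters lose their special meaning, so ( and )
# can be listed directly. A ] placed first in the set is taken literally.
grep -v '[][(),;{}<>:="/]' "$work_dir/wordlist.lst" | sort -u \
    > "$work_dir/wordlist.clean.lst"

cat "$work_dir/wordlist.clean.lst"
```

Only "widgets" and "Acme2008" survive here; every other sample line contains at least one listed character.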

Note: I do not remove the parentheses characters “()”. We probably need to move to perl regex or something similar to do that. I get a syntax error when I try to remove the “(” or “)”. Also, different versions of grep (and wget) will behave differently, so you might have to tweak. Below, we append the default John the Ripper password list to our custom list:

cat password.lst >> wordlist.clean.lst

Now we might have duplicates, and since we removed all special characters (well, most of them anyhow), we need to put them back. Below we run John to regenerate our unique wordlist, apply some rules, and write the result to standard output:

john --wordlist=wordlist.clean.lst --rules --stdout | uniq > final.wordlist.lst
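Keep in mind that uniq only collapses adjacent duplicate lines, and John's output is not sorted, so some repeats can survive. If that matters, an order-preserving filter is one option. This sketch uses awk on made-up candidate words in place of real John output; both function names are hypothetical.

```shell
# Stand-in for `john --wordlist=... --rules --stdout` output (invented candidates).
candidates() {
    printf '%s\n' 'acme' 'widgets' 'Acme' 'acme' 'widgets1' 'Acme'
}

# awk prints a line only the first time it is seen, so the original
# order is kept and non-adjacent duplicates are removed as well.
dedupe_keep_order() {
    awk '!seen[$0]++'
}

candidates | dedupe_keep_order
```

In the real pipeline, dedupe_keep_order would take the place of uniq. The trade-off is memory: awk holds every distinct line it has seen, which can add up on very large wordlists.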

For bonus points you can modify the rules so that it does a better job of adding in special characters (such as replacing all “i” with “1”). We’ll leave this exercise up to the reader.
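As a starting point for that exercise, the substitution can be approximated outside of John with sed. This is a rough sketch, not a John rule: the function name and sample words are made up. Inside John itself, the rule command sXY (replace every X in the word with Y), for example si1, performs the same kind of replacement.

```shell
# Hypothetical helper: emit each word unchanged, plus a variant with every
# "i" replaced by "1" when that variant differs from the original.
add_i_to_1_variants() {
    while IFS= read -r word; do
        printf '%s\n' "$word"
        variant=$(printf '%s\n' "$word" | sed 's/i/1/g')
        if [ "$variant" != "$word" ]; then
            printf '%s\n' "$variant"
        fi
    done
}

printf '%s\n' 'widgets' 'acme' 'mississippi' | add_i_to_1_variants
```

Words without an "i", like "acme", pass through once; the others are followed by their substituted form.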
Passwords are just so easy to abuse…

Larry Pesce

Larry’s core specialties include hardware and wireless hacking, architectural review, and traditional pentesting. He also regularly gives talks at DEF CON, ShmooCon, DerbyCon, and various BSides. Larry holds the GAWN, GCISP, GCIH, GCFA, and ITIL certifications, and has been a certified instructor with SANS for 5 years, where he trains the industry in advanced wireless and Industrial Control Systems (ICS) hacking. Larry’s independent research for the show has led to interviews with the New York Times with MythBusters’ Adam Savage, hacking internet-connected marital aids on stage at DEFCON, and having his RFID implant cloned on stage at Shmoocon. Larry is also a Principal Instructor and Course Author for the SANS Institute for SEC617: Wireless Penetration Testing and Ethical Hacking and SEC556: IoT Penetration Testing. When not hard at work, Larry enjoys long walks on the beach weighed down by his ham radio, (DE KB1TNF), and thinking of ways to survive the impending zombie apocalypse.
