Creating Custom Userlists from Document Metadata

In the past on the podcast we’ve talked about a number of tools for gathering document metadata and how we can use them to collect useful information.
I’ve talked about EXIFtool for examining and deleting metadata from JPEGs. This was helpful for some info, but only on images.
I’ve covered Metagoofil, which we use to download all sorts of common data and word processing documents and analyze them for interesting information. Unfortunately, Metagoofil will only download documents from the web and then process them. We have no ability to process documents from our own store on disk.
By accident I discovered that we can get much of the same information by using EXIFtool not on JPEGs, but on Word, Excel and PowerPoint documents! EXIFtool has the ability to parse metadata as defined by the FlashPix standard, introduced in 1996 and developed by Kodak, Hewlett-Packard and Microsoft. Microsoft still uses the format for storing document data. We can use EXIFtool to gather usernames from the documents.
Note: This will only work on Office documents that were not created with Office 2007 (.docx), as the new version relies on a different metadata storage format. I’ll have a solution for this one soon!
We can start down and dirty by getting the information on Office documents. In the directory that contains our supported Office documents, we can execute the following command:

$ exiftool -r -h -a -u -g1 * >output.html

This will execute EXIFtool to extract all EXIF metadata recursively in the current directory (-r), with all output including duplicates (-a) and unknown tags (-u), organizing by EXIF tag category (-g1), for all files, with HTML friendly formatting (-h), into a file named output.html in the current directory (>output.html). With this we get a handy little HTML report!
But we may only want the info on usernames/authors. We can trim the output down to just the appropriate data elements:

$ exiftool -r -a -u -Author -LastSavedBy * >users.txt

We’ve removed the HTML and sorting options, as they will only serve to make any additional processing difficult. I’ve also only grabbed the Author and LastSavedBy tags, as these are the most common places for usernames. Now we can take our users.txt, and remove all of the extra information with some unix text processing:

$ strings users.txt | cut -d":" -f2 | grep -v "=" | grep -v "image files read" | tr '[:space:]' '\n' | sort | uniq >cleanusers.txt

Now all we are left with is a list of potential usernames, one per line. We’ve dropped all of the extra text up to the first delimiter (:), dropped the lines that contain “=” and “image files read”, converted spaces to newlines, sorted alphabetically and removed the duplicates. This will introduce some need for manual culling, as sometimes the author is listed as “Firstname Lastname”, and each name gets kept individually. However, in some smaller companies just a first or last name is perfectly acceptable as a username, so you may not want to cull your list at all.
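If you would rather keep each name whole instead of splitting on spaces, something like the following may work. It is only a sketch: the function name extract_names is made up for illustration, and it assumes users.txt has the default EXIFtool text layout, with "======== file" header lines, "Label : value" lines, and a trailing summary line.

```shell
# Hypothetical helper: read exiftool text output on stdin and print each
# Author/LastSavedBy value whole, one per line, with duplicates removed.
extract_names() {
    awk '
        /^=/ { next }                      # skip "======== file" header lines
        !/:/ { next }                      # skip summary lines with no "Label : value" pair
        {
            sub(/^[^:]*:[ \t]*/, "")       # drop the label up to and including the first colon
            sub(/[ \t\r]+$/, "")           # trim trailing whitespace
            if ($0 != "") print
        }
    ' | LC_ALL=C sort -u                   # fixed locale so the ordering is predictable
}
```

With the function defined in the current shell, usage might look like this (fullnames.txt is just an example name):

$ extract_names <users.txt >fullnames.txt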
Now, we are left with a list of potential usernames that we can utilize for password brute force attempts for other services, such as VPNs or web based applications.
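Before using the list, it can help to expand any full names into a few common username formats. Here is a rough sketch, assuming a file with one name per line. The function name name_candidates and the formats chosen (first.last, flast, firstl) are illustrative guesses, not a convention any particular organization is known to follow.

```shell
# Hypothetical helper: read names on stdin (one per line) and print
# lowercase username candidates in a few common formats.
name_candidates() {
    awk '
        NF == 1 { print tolower($1) }          # single token: keep as-is, lowercased
        NF >= 2 {
            first = tolower($1)                # first word on the line
            last  = tolower($NF)               # last word on the line
            print first
            print last
            print first "." last               # john.smith
            print substr(first, 1, 1) last     # jsmith
            print first substr(last, 1, 1)     # johns
        }
    ' | LC_ALL=C sort -u                       # de-duplicate with a predictable ordering
}
```

Usage might look like this, with names.txt and candidates.txt as example file names:

$ name_candidates <names.txt >candidates.txt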

Larry Pesce

A self-professed, lifelong “tinkerer and explorer,” Larry always wanted to know how things work. “I found myself getting to engage in deep dives of technology from an early age: My dad built the family television from a kit, and I helped. It caught fire. Twice. I helped fix it both times.”

The help and advice received from the infosec community throughout his career inspired him to share what he had learned to help others secure their networks and improve their craft. Part of that ongoing sharing has been as the co-founder and co-host of the international award winning Paul’s Security Weekly podcast for more than 19 years.

Larry has spent the last 15 years as a penetration tester, spending lots of time focused on Healthcare, ICS/OT, Wireless, and IoT/IIoT/Embedded Devices, but now focuses his efforts on securing the software supply chain at Finite State.
