When you receive a suspicious PDF these days, it could be just a scam without malicious code. Let’s see how to analyze such samples with PDF Tools.
As always, we first take a look with pdfid:
There’s nothing special to see, but pdfid does not look inside compressed object streams (/ObjStm), so we have to check their content too:
Still nothing special to see. This could be a malicious PDF document with a pure binary exploit (e.g. without using JavaScript), but nowadays, it’s more likely that we received a PDF containing links to a malicious website, like a phishing website.
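To make the triage step more concrete, here is a minimal Python sketch of the kind of name counting that a first look relies on. It is not pdfid's implementation: the function name `count_names` and the short name list are assumptions chosen for illustration, and the sketch only scans the raw bytes, so it ignores obfuscated (hex-escaped) names and anything stored inside compressed streams.

```python
import re

# Illustrative subset of PDF names worth a closer look (an assumption, not pdfid's full list).
NAMES = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA",
         b"/Launch", b"/EmbeddedFile", b"/ObjStm", b"/URI"]


def count_names(data):
    """Count occurrences of each name in the raw bytes of a PDF file."""
    counts = {}
    for name in NAMES:
        # The negative lookahead stops /JS from also matching /JavaScript.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

A non-zero count for /ObjStm combined with zeros everywhere else is exactly the situation described above: the raw view looks clean, but the object streams still have to be inspected.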
To check for URLs, use pdf-parser’s search option (-s) to search for the string uri (the search option is not case sensitive):
And indeed we find objects with URIs. These are links tied to a rectangle, i.e. a zone that the user must click to “activate” the URL. Adobe Reader will then display a warning, and after user acceptance, the default browser will be launched to visit the given URL.
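The same idea can be sketched in a few lines of Python, for cases where you want to script the URI hunt yourself. This is only a rough approximation and not how pdf-parser works internally: the names `find_uris` and `inflate_streams` are hypothetical, the sketch only understands literal strings without nested or escaped parentheses, and it only tries plain zlib (FlateDecode) decompression on streams.

```python
import re
import zlib

# "/URI" followed by a literal string, e.g. /URI (https://example.com/)
URI_RE = re.compile(rb"/URI\s*\(([^)]*)\)")
# Raw bytes between the "stream" and "endstream" keywords
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(data):
    """Yield the decompressed content of every stream that inflates cleanly."""
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream"
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed, or damaged


def find_uris(data):
    """Return the unique URIs found in the raw bytes and in inflated streams."""
    found = []
    for chunk in [data, *inflate_streams(data)]:
        for match in URI_RE.finditer(chunk):
            uri = match.group(1).decode("latin-1")
            if uri not in found:
                found.append(uri)
    return found
```

Calling `find_uris(open("sample.pdf", "rb").read())` (with `sample.pdf` standing in for your own file) gives a quick list of link targets to review, including those tucked away inside compressed object streams.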
pdf-parser also has an option to select key-value pairs from the dictionaries of PDF objects: option -k. This is useful to generate a quick overview. This option is case sensitive, and the full key name must be provided:
When we open the PDF document with Adobe Reader, we get visual confirmation that it is a phishing PDF:
And this is the phishing website:
Conclusion: if pdfid reports nothing suspicious, before looking for binary exploits (for example with pdf-parser’s YARA support), search first for URIs with pdf-parser.