This is the second blog post in the Cyber Security Challenge Belgium 2015 (CSCBE) solutions series. This time, we’re taking a look at the Data Extraction challenge.
Data Extraction
The challenge
The following challenge description was given to the students:
“We messed up and contacted the wrong forensic department. They say they found data, but we can’t really make anything out of it. Can you?”
The students were also given the following image:
The challenge was designed to test the students’ out-of-the-box thinking, as well as their ability to research a given subject.
Analyzing the image
There are two major ways to hide information inside an image. Either you modify the internals of the image file to carry some extra data, or you encode your information visually and place it in the image where everyone can see it. Because the image appears to contain a certain pattern, the second approach is the most likely one.
Four different kinds of shapes were used to create the pattern, and every shape has its own color. If you paid attention in your high school biology class, you may recognize the shapes. When you learned about the birds and the bees, you should have also learned about deoxyribonucleic acid, which is just a fancy way of saying DNA.
Research
DNA consists of two biopolymer strands coiled around each other, forming the well-known double helix:
![]()
Image taken from classroom.synonym.com
Each strand is a chain of nucleotides, and the bases on the two strands lock together in a fixed way. There are two base pairs: adenine (A) pairs with thymine (T), and guanine (G) pairs with cytosine (C).
![]()
Image taken from thinglink.com
The colors of the nucleotides may differ depending on which textbook you use, but the shapes are usually the same. That means we can convert our image into a string of letters consisting of A, T, G and C using a little bit of Python:
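A minimal sketch of such a conversion is shown below. It is not the script used for the challenge: it assumes that one representative RGB color has already been sampled per shape (for example with an imaging library such as Pillow), and the palette values and the names `PALETTE`, `nearest_letter` and `colors_to_dna` are hypothetical.

```python
# Hypothetical palette: the real RGB values and the color-to-letter mapping
# depend on the challenge image and must be read off the shapes by hand.
PALETTE = {
    (255, 0, 0): "A",
    (0, 255, 0): "T",
    (0, 0, 255): "G",
    (255, 255, 0): "C",
}


def nearest_letter(rgb, palette=PALETTE):
    """Return the letter whose palette color is closest to the sampled color."""
    def squared_distance(color):
        return sum((a - b) ** 2 for a, b in zip(rgb, color))

    return palette[min(palette, key=squared_distance)]


def colors_to_dna(colors):
    """Turn a sequence of sampled (R, G, B) tuples into a string of A/T/G/C."""
    return "".join(nearest_letter(rgb) for rgb in colors)
```

Matching on the nearest color rather than on exact values tolerates small variations introduced by anti-aliasing or image compression.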
This is already looking good, but we can’t really see anything that resembles a flag.
Digging a little deeper
If we do some more research, we find that your cells read nucleotides in groups of three. Each triplet (a codon) is translated into one amino acid, and every amino acid has a standard one-letter abbreviation, so a DNA sequence can be read as a string of letters. The mapping from triplets to amino acids is shown in the following diagram:
![]()
Image taken from commons.wikimedia.org
This tool also automatically solves the last hurdle of the challenge: If you start combining from the first nucleotide, you only get a bunch of random letters. However, if you start combining from the second nucleotide, you can read “THE PASS IS METAPHYSIC LIGHTYEARS”, which is the solution to this challenge.
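The reading-frame step described above can be sketched in a few lines of Python. This is an illustration, not the tool's implementation: it assumes the standard genetic code, renders each amino acid as its one-letter abbreviation, and shows stop codons as `*`. The names `translate` and `CODON_TABLE` are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acid codes for all 64 codons of the standard genetic code,
# in the order TTT, TTC, TTA, TTG, TCT, ..., GGG. "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string into one-letter amino acid codes.

    `frame` is the number of leading nucleotides to skip (0, 1 or 2).
    Trailing nucleotides that do not form a full triplet are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)
```

For example, `translate("GACTCATGAA", frame=0)` gives `DS*`, while `translate("GACTCATGAA", frame=1)` gives `THE`. Trying all three frames on the extracted sequence is how the readable text shows up.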
What a great challenge! I love the fact that you utilized DNA!
I completely agree, it is a challenge that is not only difficult, but one that almost all people tend to lose interest in. However, the problem is when these techniques are used for malicious purposes.
This is a good read for everyone that was interested…
http://www.xylibox.com/2014/04/zeusvm-and-steganography.html
http://arxiv.org/abs/1503.05904