This is the fourth and final blog post in the Cyber Security Challenge Belgium 2015 (CSCBE) solutions series. This time, we’re taking a look at one of the more programming oriented challenges: The NVISO Lottery.
The NVISO Lottery
The students were given the following info:
“Come and throw away your money at the NViso Lottery!“
They also received the IP address for the NVISO Lottery service.
Gathering information
Once again we take out our trusty pocket knife named netcat.
We have to guess the correct number from a set of 1000 possibilities. If we guess the right number, we get $75, but each guess costs us $10. If we want to win the prize, we have to earn $1337. This means we have to guess correctly at least 20 times without making too many mistakes. Let’s try!
We weren’t able to guess the correct number. We do get an ID, which we can use to get feedback from the NVISO casino. The ID looks completely random, but the last character (=) is a typical tell-tale for Base64 encoding. The equals sign is used as extra padding when the amount of bytes to encode is not dividable by 8. Decoding this using the Base64 algorithm gives the following:
The decoded string doesn’t give us the answer to the random number, but the content does appear to be structured and further decoding may be necessary.
As was explained at the beginning of this write up, this challenge is programming oriented. If you’ve worked with the Python language a lot, you may already recognize the decoded string as being a specific python file format.
In Python, you can use the Pickle module to serialize data objects. Serializing (or marshaling) objects is the processes of converting arbitrary data to a byte stream. This byte stream can then safely be transported over a network, or stored in a file.
Serializing is a reversible process. That means we can deserialize (or ‘unpickle’) the data we got from the Base64 decode:
This is very promising. The unpickled value consists of a nested list with three random numbers.
Random number generators
Lets take a look at how random numbers are usually generated. Algorithms can not generate truly random numbers. An algorithm will always perform the exact same steps given the same input. Many software implementations therefore rely on Pseudo-Random Number Generators (PRNGs). These algorithms do not generate true random numbers, but they do share many properties with true random numbers. For example, a good PRNG will make it extremely difficult to determine the next random number based on the random number that was just received.
An example of a PRNG is a Linear Congruential Generator (LCG). The most simple LCG needs three numbers to calculate a random number sequence. These three numbers are called the seed of the LCG. Given these numbers (a, c, m), the LCG will calculate the sequence as follows:
The next number in the sequence is calculated by multiplying the current number by a, adding the result to c and taking the remainder of division by m.
From a programming point of view, PRNGs are very useful as they can be reverted to a certain state. If the application suddenly crashes based on a specific random input, it would be very hard to debug the application if the same random input can not be generated. For security critical implementations, of course, a PRNG should not be used.
Since we have to guess a random number, it may be a good guess to say that the decoded value is the seed for a PRNG.
Exploiting the vulnerability
Python allows the programmer to set the state of the random number generator. To confirm we’re on the right track, let’s print out the current state of the default random number generator:
Unfortunately, this seed appears to be a lot bigger than the seed we recovered from the lottery service. Python’s random module actually uses the Mersene Twister algorithm, which is not an LCG,
But there is good news, the output of the getstate() command is very similar to our decoded value. Python has a few other random libraries: random.SystemRandom() and random.WichmannHill(). According to the documentation, SystemRandom() doesn’t have a getstate() method. WichmannHill() does:
This is exactly what we were looking for. By using setstate() with our decoded lottery ID, we should be able to predict the number that will be generated:
Great! That was the solution we were looking for. Because we get the ticket ID before we have to enter our guess, we can predict the value that the server will expect and get our prize!
We could do this manually since there’s no timeout for our answer, but we can just as easily create a python script that does this for us:
We got the flag, which is “I’m_going_to_be_a_professional_gambler!“
Statistics
We had many different connections to the server, so a lot of teams tried to solve the challenge. Most teams told us they managed to decipher the Base64 encoding, and some teams also found the Python pickle format. In the end, only four teams were able to completely solve this challenge: HacknamStyle Jr, ISW, Turla Tech Support and Vrije Universiteit Leuven. All of these teams made it to the finals.
Final thoughts
This challenge was partly aimed at testing the student’s programming skills. Although Python is a very popular programming language, some students may have never used it before, making this challenge a little bit harder. Even so, a security researcher will often encounter unknown file formats or protocols, and finding out what the data means or how to use it may be critical to a successful security audit or forensics investigation. Being able to automate custom tasks can often save lots of time or solve problems that would be impossible to do manually. Having some experience with any programming language is an invaluable tool in every security expert’s toolkit!