Write-up on Blockchain data exfiltration (CSCBE18 qualifiers) challenge

This article describes the analysis of data exfiltration using blockchain as it was used in a challenge for the CSCBE 2018 qualifiers. The Cyber Security Challenge Belgium (CSCBE) is a typical Capture-The-Flag (CTF) competition aimed at students from universities and colleges all over Belgium. All of the CSCBE’s challenges are created by security professionals from many different organisations. This challenge was created by Kris Boulez, one of NVISO’s employees, during his commutes and it allowed him to play around with blockchain and cryptography, and improve his Golang and Perl skills.

The challenge consisted of three subchallenges for which a single pcap file was provided

Each of these subchallenges contains a separate flag, has increasing complexity and  builds on the previous one. We’ll start with a general analysis of the network traffic and then dive into each subchallenge ad describe how to get different flags.

A write-up by one of the participants was published shortly after the Qualifiers describing a solution based on enumeration. Here a solution is given solely based on reversing the algorithms.

A Github repository (KrisBoulez/CSCBE18-dataexfill) contains all the code for setting up and running this challenge. It also includes some scripts that were used to perform the analysis in this write-up.

Generic pcap analysis

All requests are GET and only two type of URL’s are accessed on the same server, paths starting with /block and /address

The UserAgent is “Satoshi:0.15.1” -> could this challenge be linked to Bitcoin ?

To list all HTTP GET requests

# tshark -Y 'http.request.method == "GET"' -r data_exfil.pcap

To extract all the results from the response packets (further analysis is based on this extraction)

# open pcap file in Wireshark / File / Export objects / HTTP

/block request – responses

A total of 11 different /block requests are made. All block indices requested, are in the range 312300 – 312399 [all other /block requests for this range would return a block from the BTC blockchain and could have been used for solving the challenges]

http://6.6.6.6/block/312373?format=json

The response is JSON formatted and a cursory examination indicates this is part of the Bitcoin blockchain. Online searching the BTC blockchain, shows this is indeed a single block.

https://blockchain.info/block/312373?format=json

The content of all the blocks served, does correspond with the online version of the blocks.

/address

[As the /blok requests/responses seem genuine, from here on we will only discuss the /address requests/response. So “all the requests” must be read as “all the /address requests”]

Blockshark (Flag 1)

One of your employees is believed to be leaking information to a competitor in return for Bitcoins. Fortunately, we were able to capture network traffic when he was exfiltrating data!

It is believed he was transferring a super secret nota which starts with the following string: Top Secret !!!

The last GET request seems special (the address part is longer then all the others)

# tshark -Y 'http.request.method == "GET"' -r data_exfil.pcap | grep address
[...]
 3154 127.666978 192.168.0.2 → 6.6.6.6 HTTP 238 GET /address/4808354973345d9b3e485f2a58210409c514042030042a3e48201f48902c58b0285d570f5d9f04?format=json HTTP/1.1
 3180 129.848436 192.168.0.2 → 6.6.6.6 HTTP 238 GET /address/5d58091f4f093d660758982e1fb82209e72f585c3e1fe51d09371c5d593c2e200c040334192116?format=json HTTP/1.1<span id="mce_SELREST_start" style="overflow:hidden;line-height:0;"></span>
 3205 131.033759 192.168.0.2 → 6.6.6.6 HTTP 310 GET /address/5d5119494e31581e35585e3e04fe065d6d3504f404480e3b048c0a5d441309fb021fb00c588714042e3b58842d1f811019242f5d9e211fef3a486b2b09cd39044a255dee025d280304980c?format=json HTTP/1.1

Looking at the exported files the file extracted from this last requesst is also special (small size)

‑rw‑r‑‑r‑‑ 1 xyz zyx 19385 Feb 8 18:28 4808354973345d9b3e485f2a58210409c514042030042a3e48201f48902c58b0285d570f5d9f04%3fformat=json
‑rw‑r‑‑r‑‑ 1 xyz zyx 19385 Feb 8 18:28 5d58091f4f093d660758982e1fb82209e72f585c3e1fe51d09371c5d593c2e200c040334192116%3fformat=json
‑rw‑r‑‑r‑‑ 1 xyz zyx 206 Feb 8 18:28 5d5119494e31581e35585e3e04fe065d6d3504f404480e3b048c0a5d441309fb021fb00c588714042e3b58842d1f811019242f5d9e211fef3a486b2b09cd39044a255dee025d280304980c%3fformat=json

Examining this last result file in detail we see

{
    "hash160":"5cfb25c53220ec02913648a380e2b07fe0287ef2",
    "address":"19Ue2gkjFb7K3nodHSpz3DUqrVUNeNNkj2",
    "return_code":201
    "return_value":"RkxhZyAxOiBDU0NCRXtERkhKS0oqJigqVVlURyUjJDI0M30="
}

The return_code seems to indicate some sort of success (HTTP 200 range of return codes)

The return_value seems like a Base-64 encoded string (trailing ‘=’), testing this on https://www.base64decode.org/ we get the following result and have found the value for Flag 1

FLag 1: CSCBE{DFHJKJ*&(*UYTG%#$243} 

BlocksharkNado (Flag 2)

This challenge is the follow up of Blockshark

The server the exfiltrator was communicating with has been seized and a copy of the software has been installed on: 52.214.111.33

To obtain the flag, only the same type of traffic as the exfiltrator is needed/allowed. To prove your knowledge send the following message to the server: “gimme the second flag …” (minus the ” quotes)

To find Flag 2 we have to ask “gimme the second flag …” , so we will need to at least partly understand the communication protocol. All other (apart from the last one which we analysed above) /address requests seem very similar wrt to size of request and response. Let’s look at the first one

1393 17.270151 192.168.0.2 → 6.6.6.6 HTTP 238 GET /address/1f620e09863c3d9321492c06096b3e492012489b3b5dc03d3d6d305d082d5d9e2148301109a211?format=json HTTP/1.1

The /address requests do not seem to be in line with the info in the public BTC blcokchain. A typical address request for the real Blockchain is

https://blockchain.info/address/19Ue2gkjFb7K3nodHSpz3DUqrVUNeNNkj2?format=json

In regular BTC requests, the address part is base56 encoded, while here the address part only consists of hex chars (the real address can be found in the “address” field of the returned json file).

Comparing an /address file from the capture and comparing it to the real address response, we find two additional fields

$ diff 1f620e09863c3d9321492c06096b3e492012489b3b5dc03d3d6d305d082d5d9e2148301109a211%3fformat\=json ../../19Ue2gkjFb7K3nodHSpz3DUqrVUNeNNkj2.json
4,5d3
&lt; &quot;return_code&quot;:200
&lt; &quot;return_value&quot;:&quot;VG9wIFNlY3JldCAhIQ==&quot;

Building on the knowledge we gathered from analysing the last packet of the series which gave us  Flag 1:

  • return_code: looks like an HTTP status response
  • return_value: a Base64 econded value, which decodes to “Top Secret !!“, which coincides with the first line of text the intruder is sending

From the analysis of the “return_value” of the first /adress file we know it encodes the text

Top Secret !!  (13 characters, including spaces)

and the request is

1f620e09863c3d9321492c06096b3e492012489b3b5dc03d3d6d305d082d5d9e2148301109a211

(78 hex chars = 13 hextets)

Rewriting this for legibility

1f620e T
09863c o
3d9321 p
492c06 [space]
096b3e S
492012 e
489b3b c
5dc03d r
3d6d30 e
5d082d t
5d9e21 [space]
483011 !
09a211 !

It appears not be a simple substitution  cipher as both 492012 and 3d6d30 code for “e“.

But we now have enough ciphertext available to create the message for Flag 2 (“gimme the second flag …”). By looking at all the address requests and correlating with the return_value, one of the possible solutions is

g  5d9f04
i  3d2b33
m  3d742a
m  3d742a
e  492012
  492c06
t  5d082d
h  494e31
e  492012
  492c06
s  09371c
e  492012
c  489b3b
o  09863c
n  04ee2f
d  58b32e
  492c06
f  5d9b3e
l  58ab23
a  581e35
g  5d9f04
  492c06
.  5d973b
.  5d973b
.  5d973b

Concatenating all these values and submitting it against the server

$ curl ‑A Satoshi:0.15.1 http://6.6.6.6/address/5d9f043d2b333d742a3d742a492012492c065d082d494e31492012492c0609371c492012489b3b09863c04ee2f58b32e492c065d9b3e58ab23581e355d9f04492c065d973b5d973b5d973b?format=json

{
 "hash160":"5cfb25c53220ec02913648a380e2b07fe0287ef2",
 "address":"19Ue2gkjFb7K3nodHSpz3DUqrVUNeNNkj2",
 "return_code":202
 "return_value":"RmxhZyAyOiBDU0NCRXtzZGZJVWVydzg5MzQ3NSMkJlkmI30="
}

And by Base64 decoding the return_value we get the following result and have found the value for Flag 2

 Flag 2: CSCBE{sdfIUerw893475#$&Y&#}  

BlocksharkNado vs BlockSharcopus (Flag 3)

This challenge is the follow up of Blockshark & BlocksharkNado

The server the exfiltrator was communicating with has been seized and a copy of the software has been installed on: 52.214.111.33

To obtain the flag, only the same type of traffic as the exfiltrator is needed/allowed. To prove your knowledge send the following message to the server: “Dear 0r&cl# (what is Flag3)” (minus the ” quotes)

So, to get Flag 3 we need to send “Dear 0r&cl# (what is Flag3)”  ( minus the quotes) to the server. As a lot of these characters are not present in the available plain/ciphertext we need to reverse the communication protocol

The address part (leaving of the “format=json” part) is 78 chars long, except for the last one which is 150 chars long.

  • 78 can only be divided by 13, 3 and 2;
  • while 150 is also divisable by 3 and 2

The following command allows extracting all the address parts

# tshark ‑Y 'http.request.method == "GET"' ‑r data_exfil.pcap | grep address | awk '{print $9}' | sed ‑e 's/\/address\///' | sed ‑e 's/\?format=json//' &gt; address_reqs

Frequency analysis of the duplets (i.e. two consecutive hex chars) shows that the distribution is not random, we see monotonously increasing numbers, while the duplets which occur more then 40 times are rare.

Duplet #occ
[...]
08  18
21  18
20  19
2f  21
19  23
2e  32
49  37
48  48
3d  49
5d  67
1f  70
04  90
58  91
09  93

Indicating where the duplets, which occur more then 40 times, appear in the request we get (only three requests shown)

04ad1e3428051459124be42b04d6321418213f792a2da10a45953c3f24162d363e477a1145eb04
^                 ^     ^           ^     ^     ^     ^     ^     ^     ^   ^
04da3945f7263f39054b08072d590047590845b71204380647620b4bcb3d141c0947900e04b828
^     ^     ^     ^     ^     ^     ^     ^     ^     ^           ^     ^
04e326474d2e4b5d304b7a08453d3214100b47441247f0102d7c384b703b4b080747a20747632a
^     ^     ^     ^     ^           ^     ^     ^     ^     ^     ^     ^

Looking at the output, clearly a pattern emerges. The high occurring duplets seem to located at specific places in the request. Zooming on the ones which reoccur in the different requests, we note that the majority are located at positions which are 6 (or a multiple thereof) position separated from each other.

Hextet xxyyzz

For the rest of the analysis we will indicate the hextet as “xxyyzz” to discuss its sub-parts.

splitting one request line in the different hextets (i.e. 6 consecutive hex chars), we get the following, where we see the repetition of the ‘xx’ part

1f620e
09863c
3d9321
492c06
096b3e
492012
489b3b
5dc03d
3d6d30
5d082d
5d9e21
483011
09a211

‘xx’ part

Doing a frequency analysis on the ‘xx’ part we get (columns are: xx duplet, number of occurrence, decimal representation of xx duplet)

xx | #occ | decimal(xx)
04   77      4
09   77      9
16    5     22
19   12     25
1f   57     31
2e   17     46
3d   41     61
48   45     72
49   36     73
58   86     88
5d   66     93

The “decimal(xx)” part correlates directly with the variable part of the /block requests

312304?format=json

312309?format=json

312322?format=json

[…]

‘yy’ part

Following the logic for the ‘xx’ part, the ‘yy’ part would be something in each of the different block files. A quick frequency analysis shows that all ‘yy’ duplets (decimal 1 – 255) occur roughly at the same frequency.

Looking at a /block file, we see it is made up of “key”:value pairs. Frequency analysis on the “key” entries (for the 312304?format=json) gives [annotations added]

[...]
main_chain      1
mrkl_root       1
addr_tag      222  [ free text form ]
addr_tag_link 222  [ URL ]
weight        289  [ 4-5 digit number ]
vin_sz        289  [ 1-2 digit number ]
lock_time     289  [ 0 ]
out           289  [ just a place holder ]
inputs        289  [ just a place holder ]
vout_sz       289  [ 2 ]
hash          290  [ 64 hex char hash value ] <span id="mce_SELREST_start" style="overflow:hidden;line-height:0;"></span>
time          290  [ unix time stamp, all in the same range ]
ver           290  [ 1 ]
relayed_by    290  [ IP address ]
size          290  [ 3-4 digit number ]
prev_out      496  [ just a place holder ]
witness       497  [ empty]
sequence      497  [ the same digit for all: 4294967295 ]
value        1087
type         1087
[...]

As there seem to be at least 255 ‘yy’ duplets, we look at the ones annotated above. Off these “hash” looks the most promising (64 hex chars; can be used to hide lots of stuff).

We know that “1f620e” codes for “T” (first encoded character)

  • xx: 1f (dec 31) indicates the 312331.json file
  • yy: 62 (dec 98) probably somehow related to the “hash” lines
  • zz: 0e (dec 16unknown
  • the hexadecimal ascii code for “T” is “54”

Looking for this hex code (54) in 312331.json reveals there are 52 of the “hash” lines which are matching. When analysing the “how many-th” of these hash line matches, we see it also matches on the 98th line, which corresponds to the “yy” part.

[...]
 72 "hash":"b8634bb572ae8696c50a0654451c212f75866bbad9877948309c529f50bb5d4d",
 85 "hash":"10c42099c9908546bf5eb4db835be43487818d34e9334f20390c083b45418bcc",
 90 "hash":"2b3833205d2a5b2aa3d7950c822111afb791189abc74b54c6d6f12d7600f58f8",
 92 "hash":"39116dd5d36773fb8bffe3250f71782c1ba0bda9eb0302f542dfca8881f14592",
 98 "hash":"43cffe4060bf2e543ce6b60e5714ea7ff8ef60162c3615be9dd194054e3dbdca",
107 "hash":"4154ccce0964f0f4062c919377161d130a135d9495d981c7e875be94c8eef031",
109 "hash":"e75f7df1662ccd2d8501f0d911aa18d12d1cf1a2a1ef76bc16eb0b070caf1542",
[...]

Examining the 98th hash line in more detail and looking for “54”

43cffe4060bf2e543ce6b60e5714ea7ff8ef60162c3615be9dd194054e3dbdca
             <span id="mce_SELREST_start" style="overflow:hidden;line-height:0;"></span> ^^ (position 15-160

zz” has decimal value 16, so this let’s us assume that this is an array which starts at “0”.

Quickly checking this assumption we see that it also holds for the other hextets

09863c” codes for “o

checking our assumption

  •  xx: 09 (dec 9)
  •  yy: 86 (dec 134)
  •  zz: 3c (dec 60)
  •  “o”: ascii code “6f”

312309.json, looking for “6f”

[...]
131 "hash":"efd8e356f05dbc396ac6a9fe9ad143b61e7ab39b091d70f7e2126c708373ced4",
133 "hash":"0b83dde3c9bbdb06f438db9c503ee060bde941c4f1f6176f1c3b66c559d1d4f1",
134 "hash":"e12217881e2b80950a86bbc071b1800c65edcc11d4431cbcd1810a4afda46fbb",
149 "hash":"71b69de1368e73b66f1ccffd036440a422f6828db85c0c6d3ac7809eca86b31c",
155 "hash":"d4b3f72b3d84378be66fa9b9bf26ff8e2cc5efc8f4aece7290f6f52a65c15f2c",
[...]

and looking at the 134-th “hash” line we find “6f”

e12217881e2b80950a86bbc071b1800c65edcc11d4431cbcd1810a4afda46fbb
                                           (position 61-62) ^^

Now that we know how the encryption work we can look up all symbols we need to create the message for Flag 3

$ curl ‑A Satoshi:0.15.1 http://6.6.6.6/address/2e0c3c58da28585d065894275db6135d320648250f1606275d2a044960362e2c1c58cb1858d51a5db73258c0354871165d621c492419491a3a58be344942254971315d8b10589c08160e37493d17491004?format=json

{
 "hash160":"5cfb25c53220ec02913648a380e2b07fe0287ef2",
 "address":"19Ue2gkjFb7K3nodHSpz3DUqrVUNeNNkj2",
 "return_code":203
 "return_value":"RmxhZyAzOiBDU0NCRXtqejJoNDc4ZGZnXiMlJSQlamRman0="
}

Base64 decoding the return_value we get the third flag

 Flag 3: CSCBE{jz2h478dfg^#%%$%jdfj}

[Another approach for finding the missing characters was brute force. Once the concept of the hextets was clear, one could iterate over the different yy and zz values. The webserver implemented rate limiting which hindered this approach, but made it not impossible]

About the author

Kris BoulezKris Boulez has extensive experience in Information Security. He joined NVISO in early 2017. The last decade he has mainly worked on Enterprise Security Architectures (ESA), PKI and (Web) Application Security. For ESA, he strongly believes in a business-driven approach (SABSA) and for human well-being in the healing power of coffee. You can find Kris on LinkedIn.

 

One thought on “Write-up on Blockchain data exfiltration (CSCBE18 qualifiers) challenge

Leave a Reply