El Equilibrio: May 2020

Friday, May 22, 2020

Why Receipt Notifications Increase Security In Signal

This blog post aims to express and explain my surprise that Signal is more secure than I thought (thanks to receipt acknowledgments). I hope you find it interesting, too.

Signal, and especially its state update protocol, the Double Ratchet algorithm, is widely known for significantly increasing the security of instant messaging. While most users first see the end-to-end security that comes from employing Signal in messaging apps, the properties achieved by ratcheting go far beyond protecting communication against (active) attackers on the wire. Because the local device secrets are updated via the Double Ratchet algorithm, the protocol ensures that attackers who temporarily obtain the local storage of a device (on which Signal runs) only compromise the confidentiality of parts of the communications with this device. Thus, the leakage of local secrets from a device only affects the security of a short frame of communication. The exact duration of the compromise depends on the messaging pattern among the communicating parties (i.e., who sends and receives when), as the state update is conducted during the sending and receiving of payload messages.

The Double Ratchet

The Double Ratchet algorithm consists of two different update mechanisms: the symmetric ratchet and the asymmetric ratchet. The former updates symmetric key material by hashing it and then overwriting it with the hash output (i.e., k:=H(k)). Thus, an attacker who obtains key material can only predict future versions of the state but, due to the one-wayness of the hash function, cannot recover past states. The asymmetric ratchet consists of Diffie-Hellman key exchanges (DHKE). If, during the communication, party A receives a new DH share g^b as part of a message from the communication partner B, then A samples a new DH exponent a and responds with the respective DH share g^a in the next sent message. On receipt of this DH share, B will again sample a new DH exponent b' and attach the DH share g^b' to the next message to A. With every new DH share, a new DHKE g^ab is computed between A and B and mixed into the key material (i.e., k:=H(k,g^ab)). For clarity, I leave out a lot of details and accuracy. As new DH shares g^a and g^b are generated from randomly sampled DH exponents a and b, and the computation of g^ab is hard if neither a nor b is known, the key material recovers from an exposure of the local secrets to an attacker once a new value g^ab has been freshly established and mixed into it. Summing up this mechanism, if an attacker obtains the local state of a Signal client, then this attacker cannot recover any previously received message (if the message itself was not contained in the local state), nor can it read messages that are sent after a new g^ab was established and mixed into the state. The latter case happens with every full round-trip between A and B (i.e., A receives from B, A sends to B, and A receives again from B).
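The two update rules described above can be sketched in a few lines of Python. This is only a toy illustration: the group (a Mersenne prime modulus `P` with generator `G`), the bare SHA-256 hash, and the function names are assumptions made for this sketch. Signal itself uses elliptic-curve Diffie-Hellman and a key derivation function, and, as noted above, many details are left out.

```python
import hashlib
import secrets

# Toy parameters for illustration only (assumption, not what Signal uses).
P = 2**127 - 1  # a Mersenne prime used as modulus
G = 3           # generator chosen for the sketch


def symmetric_step(k: bytes) -> bytes:
    """Symmetric ratchet: k := H(k). One-way, so past keys stay hidden."""
    return hashlib.sha256(k).digest()


def dh_keypair():
    """Sample a fresh DH exponent x and return (x, g^x mod P)."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)


def asymmetric_step(k: bytes, own_exponent: int, peer_share: int) -> bytes:
    """Asymmetric ratchet: k := H(k, g^ab), mixing a fresh DH value into k."""
    shared = pow(peer_share, own_exponent, P)
    return hashlib.sha256(k + shared.to_bytes(16, "big")).digest()


if __name__ == "__main__":
    k0 = bytes(32)              # key material both parties already share
    b, g_b = dh_keypair()       # B's current exponent and share
    a, g_a = dh_keypair()       # A samples a fresh exponent on receiving g^b
    k_a = asymmetric_step(k0, a, g_b)
    k_b = asymmetric_step(k0, b, g_a)
    print("keys agree:", k_a == k_b)
```

An attacker who learned `k0` but neither `a` nor `b` cannot compute the new key, which is the recovery property the paragraph above describes.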

Conceptual depiction of Double Ratchet in Signal two years ago (acknowledgments were only protected between client and server). The asymmetric ratchet fully updates the local secrets after one round-trip of payload messages.

Research on Ratcheting

During the last two years, the Signal protocol inspired the academic research community: First, a formal security proof of Signal was conducted [1] and then ratcheting was formalized as a generic primitive (independent of Signal) [2,3,4]. This formalization includes security definitions that are derived by 1. defining an attacker and 2. requiring security unless it is obvious that security cannot be reached. Protocols that meet this optimal notion of security were less performant than the Double Ratchet algorithm [3,4]. However, it became evident that the Double Ratchet algorithm is not as secure as it could be (e.g., recovery from exposure could be achieved more quickly than after a full round-trip; see, e.g., Appendix G of our paper [3]). Afterwards, protocols (for slightly weakened security notions) were proposed that are about as performant as Signal but also a bit more secure [5,6,7].

Protecting Acknowledgments ...

In our analysis of instant messaging group chats [8] two years ago (blog posts: [9,10]), we found that none of the group chat protocols (Signal, WhatsApp, Threema) actually achieves real recovery from an exposure (thus the asymmetric ratchet is not really effective in groups; a good motivation for the MLS project) and that receipt acknowledgments were not integrity protected in either Signal or WhatsApp. The latter issue allowed an attacker to drop payload messages in transmission and forge receipt acknowledgments to the sender, such that the sender falsely thinks the message was received. Signal quickly reacted to our report by treating acknowledgments as normal payload messages: they are now authenticated(-encrypted) using the Double Ratchet algorithm.
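To illustrate why integrity protection stops the forged-acknowledgment attack, here is a minimal sketch that authenticates an acknowledgment with a key known only to the two endpoints. It uses HMAC-SHA256 from the Python standard library as a stand-in; Signal actually protects acknowledgments with the authenticated encryption of the Double Ratchet, and the function names and the `b"ack:"` message format are assumptions made for this example.

```python
import hashlib
import hmac


def tag_ack(key: bytes, message_id: bytes) -> bytes:
    """Receiver side: compute a tag over the acknowledged message id."""
    return hmac.new(key, b"ack:" + message_id, hashlib.sha256).digest()


def verify_ack(key: bytes, message_id: bytes, tag: bytes) -> bool:
    """Sender side: accept the acknowledgment only if the tag is valid."""
    return hmac.compare_digest(tag_ack(key, message_id), tag)


if __name__ == "__main__":
    key = bytes(range(32))                 # shared between the two endpoints
    genuine = tag_ack(key, b"msg-1")
    forged = tag_ack(bytes(32), b"msg-1")  # attacker does not know the key
    print(verify_ack(key, b"msg-1", genuine))  # True
    print(verify_ack(key, b"msg-1", forged))   # False
```

An attacker on the wire who drops the payload message cannot produce a valid tag without the key, so the sender no longer accepts a forged acknowledgment.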

... Supports Asymmetric Ratchet

Two years after our analysis, I recently looked into the Signal code again. For a training on ratcheting, I wanted to create an exercise in which participants have to find the lines in the code that execute the symmetric and the asymmetric ratchet, respectively. While debugging the app live, I observed that the pure symmetric ratchet (which only updates via hash functions) was almost never executed (especially not when I expected it); instead, new DH shares were almost always sent or received. I realized that, because receipt acknowledgments are now encrypted, the app conducts a full round-trip with every payload message. In order to observe the symmetric ratchet, I needed to temporarily turn on flight mode on my phone so that acknowledgments are not immediately returned.

Conceptual depiction of Double Ratchet in Signal now (acknowledgments encrypted). The asymmetric ratchet fully updates the local secrets after an acknowledgment for a message is received.

Consequently, Signal conducts a full DHKE on every sent payload message (provided the receiving device is not offline) and mixes the result into the state. However, a new DH exponent is always already sampled on the previous receipt (see the sketch of the protocol above). Thus, the exponent used for computing a DHKE may have remained in the local device state for a while. In order to fully update the state's key material, two round-trips must be initiated by sending two payload messages and receiving the resulting two acknowledgments. Please note that not only the mandatory receipt acknowledgments are encrypted but also the notifications on typing and on reading a message.

If you didn't understand exactly what that means, here is a tl;dr: If an attacker obtains your local device state, then with Signal all previous messages stay secure and (if the attacker does not immediately use these secrets to actively manipulate future conversations) all future messages are secure after you have written two messages (and received the receipt acknowledgments) in each of your conversations. Even though this is very secure (in practice certainly sufficiently so), recent protocols provide stronger security (as mentioned above), and it remains an interesting research goal to increase their performance.

[1] https://eprint.iacr.org/2016/1013.pdf
[2] https://eprint.iacr.org/2016/1028.pdf
[3] https://eprint.iacr.org/2018/296.pdf
[4] https://eprint.iacr.org/2018/553.pdf
[5] https://eprint.iacr.org/2018/889.pdf
[6] https://eprint.iacr.org/2018/954.pdf
[7] https://eprint.iacr.org/2018/1037.pdf
[8] https://eprint.iacr.org/2017/713.pdf
[9] https://web-in-security.blogspot.com/2017/07/insecurities-of-whatsapps-signals-and.html
[10] https://web-in-security.blogspot.com/2018/01/group-instant-messaging-why-baming.html

Deepin Or UbuntuDDE

I'm sure many Deepin users are currently thinking of switching to UbuntuDDE, so let's explain some differences between the two Linux distros.

1. Community

At least in the main Telegram channel, Deepin has more than 2,000 users, while UbuntuDDE is new, still in beta, and has about 500 users.

2. Boot

Although the boot sound is the same in both distros, Deepin's animation is nicer than Ubuntu's, which uses a background that is too bright.

3. Default memory and CPU usage

The CPU usage is similar, but by default Deepin runs more processes and uses more network connections and more drivers than UbuntuDDE.

4. Workspaces

UbuntuDDE allows up to 7 workspaces, whereas Deepin currently allows only 4.
UbuntuDDE not only offers more workspaces, it also displays them more efficiently.

5. Software Versions

Deepin is based on Debian, so the program versions in the store and in apt are old but stable, and the old libraries installed on the system can cause problems when compiling new software.

We can see below that Ubuntu's compiler version is quite recent, 9.3.0, whereas Deepin's version is 6.3.0.

Regarding the kernels, UbuntuDDE has 5.4.0.21 and Deepin has 4.15.0-30. The libc is up to date on both systems.

6. The store

Deepin's store is fast and polished and contains the main software, but and the UbuntuDDE

Conclusions

Deepin is the more widely used of the two and it is the original, but many users are trying UbuntuDDE (which is in beta for now) because they need recent software versions; the limit of 4 workspaces on Deepin is another restriction for some Linux users. Deepin v20 will probably overcome these limitations, but the main decision is between Debian and Ubuntu as the base system, and for many users the trend on workstations is Ubuntu.

Thursday, May 21, 2020

Open Sesame (Dlink - CVE-2012-4046)

A couple of weeks ago a vulnerability was posted for the D-Link DCS-9xx series of cameras. The author of the disclosure found that the setup application that comes with the camera is able to send a specially crafted request to a camera on the same network and receive its password in plaintext. I figured this was a good chance to do some analysis, figure out exactly how the application carries out this functionality, and possibly create a script to pull the password out of a camera.

The basic functionality of the application is as follows:

Application sends out a UDP broadcast on port 5978.
Camera sees the broadcast on port 5978 and inspects the payload. If the initial part of the payload contains "FF FF FF FF FF FF", it responds (UDP broadcast, port 5978) with an encoded payload containing its own MAC address.
Application retrieves the camera's response and creates another UDP broadcast, this time with a payload that contains the target camera's MAC address. This encoded value contains the command to send over the password.
Camera sees the broadcast on port 5978 and checks that it is meant for it by inspecting the MAC address specified in the payload. It responds with an encoded payload that contains its password (base64 encoded).

After spending some time with the application in a debugger, I found what looked like the routine responsible for decoding the encoded values that are passed:

super exciting screen shot.

After spending some time documenting the functionality I came up with the following notes (messy wall of text):


	Command	Comments
	.JGE SHORT 0A729D36	; stage1
	./MOV EDX,DWORD PTR SS:[LOCAL.2]	; set EDX to our 1st stage half decoded buffer
	.\|MOV ECX,DWORD PTR SS:[LOCAL.4]	; set ECX to our current count/offset
	.\|MOV EAX,DWORD PTR SS:[LOCAL.3]	; set EAX to our base64 encoded payload
	.\|MOVSX EAX,BYTE PTR DS:[EAX]	; set EAX to the current value in our base64 payload
	.\|MOV AL,BYTE PTR DS:[EAX+0A841934]	; set AL to the lookup-table entry for that value (the table is at 0A841934)
	.\|MOV BYTE PTR DS:[ECX+EDX],AL	; ECX = Offset, EDX = start of our half-decoded buffer, write our current byte there
	.\|INC DWORD PTR SS:[LOCAL.4]	; increment our offset/count
	.\|INC DWORD PTR SS:[LOCAL.3]	; increment our base64 buffer to next value
	.\|MOV EDX,DWORD PTR SS:[LOCAL.4]	; set EDX to our counter
	.\|CMP EDX,DWORD PTR SS:[ARG.2]	; compare EDX (counter) to our total size
	.\JL SHORT 0A729D13	; jump back if we have not finished half decoding our input value
	.MOV ECX,DWORD PTR SS:[ARG.3]	; Looks like this will point at our decoded buffer
	.MOV DWORD PTR SS:[LOCAL.5],ECX	; set LOCAL.5 to our decoded destination
	.MOV EAX,DWORD PTR SS:[LOCAL.2]	; set EAX to our half-decoded buffer
	.MOV DWORD PTR SS:[LOCAL.3],EAX	; set LOCAL.3 to point at our half-decoded buffer
	.MOV EDX,DWORD PTR SS:[ARG.4]	; ???? 1500 decimal
	.XOR ECX,ECX	; clear ECX
	.MOV DWORD PTR DS:[EDX],ECX	; clear out arg4 value
	.XOR EAX,EAX	; clear out EAX
	.MOV DWORD PTR SS:[LOCAL.6],EAX	; clear out local.6
	.JMP SHORT 0A729DAE	; JUMP

	./MOV EDX,DWORD PTR SS:[LOCAL.3]	; move our current half-decoded dword position into EDX
	.\|MOV CL,BYTE PTR DS:[EDX]	; move our current byte into ECX (CL) (dword[0])
	.\|SHL ECX,2	; shift left 2 dword[0]
	.\|MOV EAX,DWORD PTR SS:[LOCAL.3]	; move our current dword position into EAX
	.\|MOVSX EDX,BYTE PTR DS:[EAX+1]	; move our current dword position + 1 (dword[1]) into EDX
	.\|SAR EDX,4	; shift right 4 dword[1]
	.\|ADD CL,DL	; add (shift left 2 dword[0]) + (shift right 4 dword[1])
	.\|MOV EAX,DWORD PTR SS:[LOCAL.5]	; set EAX to our current decoded buffer position
	.\|MOV BYTE PTR DS:[EAX],CL	; write our decoded (dword[0]) value to our decoded buffer
	.\|INC DWORD PTR SS:[LOCAL.5]	; increment our position in the decoded buffer
	.\|MOV EDX,DWORD PTR SS:[LOCAL.3]	; set EDX to our current dword position
	.\|MOV CL,BYTE PTR DS:[EDX+1]	; set ECX to dword[1]
	.\|SHL ECX,4	; left shift 4 dword[1]
	.\|MOV EAX,DWORD PTR SS:[LOCAL.3]	; set EAX to our current dword position
	.\|MOVSX EDX,BYTE PTR DS:[EAX+2]	; set EDX to dword[2]
	.\|SAR EDX,2	; shift right 2 dword[2]
	.\|ADD CL,DL	; add (left shift 4 dword[1]) + (right shift 2 dword[2])
	.\|MOV EAX,DWORD PTR SS:[LOCAL.5]	; set EAX to our next spot in the decoded buffer
	.\|MOV BYTE PTR DS:[EAX],CL	; write our decoded value into our decoded buffer
	.\|INC DWORD PTR SS:[LOCAL.5]	; move to the next spot in our decoded buffer
	.\|MOV EDX,DWORD PTR SS:[LOCAL.3]	; set EDX to our current half-decoded dword
	.\|MOV CL,BYTE PTR DS:[EDX+2]	; set ECX dword[2]
	.\|SHL ECX,6	; shift left 6 dword[2]
	.\|MOV EAX,DWORD PTR SS:[LOCAL.3]	; set EAX to our current half-decoded dword
	.\|ADD CL,BYTE PTR DS:[EAX+3]	; add (shifted left 6 dword[2]) + dword[3]
	.\|MOV EDX,DWORD PTR SS:[LOCAL.5]	; set EDX to point at our next spot in our decoded buffer
	.\|MOV BYTE PTR DS:[EDX],CL	; write our decoded byte to our decoded buffer
	.\|INC DWORD PTR SS:[LOCAL.5]	; move to the next spot in our decoded buffer
	.\|ADD DWORD PTR SS:[LOCAL.3],4	; increment our encoded buffer to point at our next dword
	.\|MOV ECX,DWORD PTR SS:[ARG.4]	; set ECX to our current offset?
	.\|ADD DWORD PTR DS:[ECX],3	; add 3 to our current offset?
	.\|ADD DWORD PTR SS:[LOCAL.6],4	; add 4 to our byte counter??
	.\|MOV EAX,DWORD PTR SS:[ARG.2]	; move total size into EAX
	.\|ADD EAX,-4	; subtract 4 from total size
	.\|CMP EAX,DWORD PTR SS:[LOCAL.6]	; compare our total bytes to read bytes
	.\JG SHORT 0A729D50	; jump back if we are not done

	.MOV EDX,DWORD PTR SS:[LOCAL.3]	; set EDX to our last DWORD of encoded buffer
	.MOVSX ECX,BYTE PTR DS:[EDX+3]	; set ECX to dword[3] last byte of our half-decoded dword (dword + 3)
	.INC ECX	; increment the value of dword[3]
	.JE SHORT 0A729E1E
	.MOV EAX,DWORD PTR SS:[LOCAL.3]	; set EAX to our current half-decoded dword
	.MOV DL,BYTE PTR DS:[EAX]	; set EDX (DL) to dword[0]
	.SHL EDX,2	; shift left 2 dword[0]
	.MOV ECX,DWORD PTR SS:[LOCAL.3]	; set ECX to our current encoded dword position
	.MOVSX EAX,BYTE PTR DS:[ECX+1]	; set EAX to dword[1]
	.SAR EAX,4	; shift right 4 dword[1]
	.ADD DL,AL	; add (shifted left 2 dword[0]) + (shifted right 4 dword[1])
	.MOV ECX,DWORD PTR SS:[LOCAL.5]	; set ECX to point at our next spot in our decoded buffer
	.MOV BYTE PTR DS:[ECX],DL	; write our decoded value (EDX/DL) to our decoded buffer
	.INC DWORD PTR SS:[LOCAL.5]	; move to the next spot in our decoded buffer
	.MOV EDX,DWORD PTR SS:[LOCAL.3]	; set EDX to point at our dword
	.MOV AL,BYTE PTR DS:[EDX+1]	; set EAX/AL to dword[1]
	.SHL EAX,4	; shift left 4 dword[1]
	.MOV EDX,DWORD PTR SS:[LOCAL.3]	; set EDX to our current dword
	.MOVSX ECX,BYTE PTR DS:[EDX+2]	; set ECX to dword[2]
	.SAR ECX,2	; shift right 2 dword[2]
	.ADD AL,CL	; add (shifted left 4 dword[1]) + (shifted right 2 dword[2])
	.MOV EDX,DWORD PTR SS:[LOCAL.5]	; set EDX to point at our current spot in our decoded buffer
	.MOV BYTE PTR DS:[EDX],AL	; write our decoded value to the decoded buffer
	.INC DWORD PTR SS:[LOCAL.5]	; move to the next spot in our decoded buffer
	.MOV EAX,DWORD PTR SS:[LOCAL.3]	; set EAX to point at our current dword
	.MOV CL,BYTE PTR DS:[EAX+2]	; set ECX/CL to dword[2]
	.SHL ECX,6	; shift left 6 dword[2]
	.MOV EAX,DWORD PTR SS:[LOCAL.3]	; point EAX at our current dword
	.ADD CL,BYTE PTR DS:[EAX+3]	; add dword[3] + (shifted left 6 dword[2])
	.MOV EDX,DWORD PTR SS:[LOCAL.5]	; point EDX at our current decoded buffer
	.MOV BYTE PTR DS:[EDX],CL	; write our decoded value to the decoded buffer
	.INC DWORD PTR SS:[LOCAL.5]	; increment our decoded buffer position
	.MOV ECX,DWORD PTR SS:[ARG.4]	; set ECX to our current offset?
	.ADD DWORD PTR DS:[ECX],3	; add 3 to our current offset?
	.JMP 0A729EA6	; jump

Translated into English: the application first uses a lookup table to translate every byte in the input string; to do this, it uses the value of the current byte as an offset into the table. After it is done with "stage1", it traverses the translated input buffer a dword at a time and does some bit shifting and addition to fully decode the value. The following roughly shows the "stage2" routine, where Dword[n] is the n-th byte of the current dword and each result is truncated to one byte:

(Dword[0] << 2) + (Dword[1] >> 4) = unencoded byte 1

(Dword[1] << 4) + (Dword[2] >> 2) = unencoded byte 2

(Dword[2] << 6) + Dword[3] = unencoded byte 3
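The two stages can be sketched in Python as follows. This is not the script mentioned in this post, only an illustration of the routine. It assumes that the lookup table at 0A841934 holds the standard base64 alphabet (the listing does not show the table contents), and it only handles input without "=" padding whose length is a multiple of 4. The function names are made up for this sketch.

```python
import string

# Assumption: the lookup table maps the standard base64 alphabet to 6-bit values.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
TABLE = {ord(c): i for i, c in enumerate(ALPHABET)}


def stage1(encoded: bytes) -> list:
    """Translate every input byte through the lookup table."""
    return [TABLE[b] for b in encoded]


def stage2(values: list) -> bytes:
    """Combine each group of four 6-bit values into three decoded bytes."""
    if len(values) % 4 != 0:
        raise ValueError("this sketch only handles unpadded groups of 4")
    out = bytearray()
    for i in range(0, len(values), 4):
        d0, d1, d2, d3 = values[i:i + 4]
        # The assembly adds 8-bit registers, so mask each result to one byte.
        out.append(((d0 << 2) + (d1 >> 4)) & 0xFF)
        out.append(((d1 << 4) + (d2 >> 2)) & 0xFF)
        out.append(((d2 << 6) + d3) & 0xFF)
    return bytes(out)


def decode(encoded: bytes) -> bytes:
    """Run both stages on an encoded value."""
    return stage2(stage1(encoded))


if __name__ == "__main__":
    print(decode(b"YWRtaW4x"))  # b'admin1'
```

Under that assumption the routine is ordinary base64 decoding, so the result can be cross-checked against `base64.b64decode` from the Python standard library.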

I then confirmed that this routine worked on an "encoded" value that went over the wire from the application to the camera. After confirming that the encoding scheme worked, I recreated the network transaction the application performs with the camera in order to create a standalone script that retrieves the password from a camera on the same LAN as the "attacker". The script can be found here. Thanks to Jason Doyle for the original finding (@jasond0yle).