Differences

This shows you the differences between two versions of the page.

--- pgpfan:seip [2024/07/10 20:11] – [MAC] Thank Stephan Verbücheln b.walzer
+++ pgpfan:seip [2024/10/10 15:37] (current) – [SEIP for TLS] Consistency. b.walzer
@@ Line 40: / Line 40: @@
 //<sub>OpenPGP's CFB prefix</sub>//
-The original IV is replaced with zeros and the IV is shifted down to the first part of the message. Encrypting zeros is a common way to generate a new key from the existing key. For example, the popular [[wp>Galois/Counter_Mode|Galois/counter]] (GCM) block cipher mode does this to generate the key for the GHASH used. Encrypting zeros can be thought of as hashing the key using the [[wp>One-way_compression_function#Davies–Meyer|Davies-Meyer]] method. So we can redraw the diagram to make it easier to understand:
+The original IV is replaced with zeros and the IV is shifted down to the first part of the message. Encrypting zeros is a common way to generate a new key from the existing key. For example, the popular [[wp>Galois/Counter_Mode|Galois/counter]] (GCM) block cipher mode does this to generate the key for the GHASH used. The [[https://www.cs.ucdavis.edu/~rogaway/ocb/|OCB]] mode does this to prevent attacker knowledge of the Δ chain. Encrypting zeros can be thought of as hashing the key using the [[wp>One-way_compression_function#Davies–Meyer|Davies-Meyer]] method. So we can redraw the diagram to make it easier to understand:
 {{ocfb_hash_se.svg}}\\
@@ Line 57: / Line 57: @@
 Normally it is trivial to remove the first part of a CFB message, but if an attacker chopped off the first part of a SE message, the check value would break. But if this were an intended feature, the OpenPGP standard should mention that the check value is critical to security, and it doesn't. This isn't the sort of thing that anyone using encrypted email in the 1990s would have cared about anyway; messages were either signed and guaranteed intact or not.
-It isn't even clear here why an IV, encrypted or not, is required at all for the SE formatted message. Generally, an IV is useful when the same key is used to encrypt two or more messages. Think TLS where thousands/millions of messages might be encrypted with the same key during a connection. In the sort of usage that OpenPGP targets there is normally only one message. So from this it is possible to imagine that at one time the IV was set to zero simply because it was not considered useful. That is actually done in another part of the OpenPGP standard for that exact reason. Perhaps it was later considered important to add in an IV and it made the most sense at the time to put it as the first block of the plaintext.
+It isn't even clear here why an IV, encrypted or not, is required at all for the SE formatted message. Generally, an IV is useful when the same key is used to encrypt two or more messages. Think TLS where thousands/millions of messages might be encrypted with the same key during a connection. In the sort of usage that OpenPGP targets there is normally only one message and that message is encrypted with a new/unique key. So from this it is possible to imagine that at one time the IV was set to zero simply because it was not considered useful. That is actually done in another part of the OpenPGP standard for that exact reason. Perhaps it was later considered important to add in an IV and it made the most sense at the time to put it as the first block of the plaintext.
 I have drifted well into baseless speculation here. I shouldn't be required or even be allowed to indulge in such speculation. This seems to be an excellent example to support a principle near and dear to my heart. When creating a standard (or any specification really) you are not done once you have defined //what// it is and //how// it works. You should really also explain //why// it works the way it does. Explaining //why// puts you at risk of someone later coming along and successfully attacking your reasoning, but if you don't provide an explanation they will tend to assume that there wasn't any reasoning.
@@ Line 111: / Line 111: @@
 This sort of attack is not possible in the case of SEIP. The attacker can get the victim to create a new MDC for them (the MDC is not protected against creation). The secret IV is not encrypted in the same way as the Plaintext (and is thus protected). If the attacker embeds an entire valid message, IV and all, inside the message the victim creates for them, then the IV will eventually get decrypted in a different way than it was encrypted. Alternatively, if the attacker lets the victim encrypt the IV for them, then the attacker will not know what the IV was to start with (the victim provides it) and will not be able to create a valid MDC.
-If the attacker knows the secret IV then they can probably create a new and valid message. So IV disclosure is significant at this level of threat. Still, the situation where the attacker can make the victim encrypt messages for them is fairly unlikely. It seems to me that, overall, SEIP is better than GCM((I acknowledge that GCM is not all that popular in some circles. But what other popular cipher block mode is available right now for use as a counterexample?)) for IV misuse considering how much worse GCM is than SEIP on IV reuse.
+If the attacker knows the secret IV then they can probably create a new and valid message. So IV disclosure is significant at this level of threat.
+If an attacker could somehow come up with an encrypted suffix with an appropriately encrypted MDC, splicing that suffix onto the end of an existing message to extend it would result in a block of damage. The MDC would pick that block up and fail the modification detection in a way the attacker could not anticipate. So it is very possible that damage amplification could save the day for message creation if an attacker becomes aware of the secret IV.
 ====Luck====
@@ Line 121: / Line 123: @@
 ====MAC====
-Speaking of missing things, it had to be pointed out to me that the SEIP mode, which I had been thinking of as a //hash// then encrypt scheme, could be interpreted as a //MAC// then encrypt scheme((Thanks to Stephan Verbücheln for pointing out that the modification detection of SEIP can be seen as the naïve textbook version of an HMAC: ''HMAC(k, m) = H(k || m)'')). A [[wp>Message_authentication_code|MAC]] is a way of detecting modifications using a secret key shared between the originator and receiver of a message. So the secret IV is the shared secret key here. From the perspective of a //MAC// then encrypt scheme things look like this (terminology inconsistent with the rest of the article):
+Speaking of missing things, it had to be pointed out to me that the SEIP mode, which I had been thinking of as a //hash// then encrypt scheme, could be interpreted as a //MAC// then encrypt scheme((Thanks to [[https://verbuecheln.ch/|Stephan Verbücheln]] for pointing out that the modification detection of SEIP can be seen as the naïve textbook version of an HMAC: ''HMAC(k, m) = H(k || m)'')). A [[wp>Message_authentication_code|MAC]] is a way of detecting modifications using a secret key shared between the originator and receiver of a message. So the secret IV is the shared secret key here. From the perspective of a //MAC// then encrypt scheme things look like this (terminology inconsistent with the rest of the article):
   - The key is hashed by encrypting zeros producing a new key.
@@ Line 131: / Line 133: @@
 I have to admit that this seems more straightforward than all the stuff about protection. But be careful here. Most people are not very knowledgeable about the properties of a MAC buried under the encryption. A MAC is most often used where the material being MACed and the MAC tag are available to the attacker. That is not the case here. I would like to think that the protection discussion is better for ultimately understanding //why// things work the way they do versus //what// results things provide. That was certainly the case for me while developing the protection discussion.
-It should be pointed out that SEIP, while using a hash, is not an [[wp>HMAC|HMAC]] as such things are normally understood. The most significant difference is that the internal state of the hash is disguised in a different way to prevent length extension attacks. An HMAC does this by hashing in a secret key to the output of the hash of the protected material. SEIP instead encrypts the output of the hash of the protected material, thus preventing attacker access in the first place.
+It should be pointed out that SEIP, while using a hash, is not an [[wp>HMAC|HMAC]] as such things are normally understood. The most significant difference is that the internal state of the hash is disguised in a different way to prevent length extension attacks. An HMAC does this by hashing in a secret key to the output of the hash of the protected material. SEIP instead encrypts the output of the hash of the protected material, thus preventing attacker access in the first place. It seems, to me at least, that it is easy to understand why encrypting the hash prevents length extension. Preventing attacker knowledge of a value is after all the primary function of encryption. That is easier than understanding why rehashing the hash in an HMAC prevents length extension, where we are ultimately using the output of the same type of hash we are protecting.
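The three constructions compared above can be put side by side in a rough Python sketch. This is only an illustration, not code from any OpenPGP implementation: the function names are made up for this example, the HMAC is written out by hand following the usual inner/outer key padding so the "rehash the hash with the key" step is visible, and the SEIP-style function only computes the SHA-1 value that the MDC carries (prefix, plaintext, then the MDC packet header octets 0xD3 0x14). The encryption step that actually hides that value from the attacker is left out.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-1 processes its input in 64-byte blocks


def naive_mac(key: bytes, message: bytes) -> bytes:
    """Textbook H(k || m). If this tag is visible, length extension applies."""
    return hashlib.sha1(key + message).digest()


def manual_hmac_sha1(key: bytes, message: bytes) -> bytes:
    """HMAC spelled out: the inner hash output is hashed again with the key."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha1(inner_key + message).digest()
    return hashlib.sha1(outer_key + inner).digest()


def seip_style_mdc(prefix: bytes, plaintext: bytes) -> bytes:
    """Hash value carried by the MDC. In SEIP this value is then encrypted
    along with the rest of the message (encryption not shown here)."""
    return hashlib.sha1(prefix + plaintext + b"\xd3\x14").digest()


# Illustrative values only; a real secret IV would be random.
secret_iv = bytes(range(16))
prefix = secret_iv + secret_iv[-2:]  # IV block plus its last two octets repeated
message = b"attack at dawn"

print(naive_mac(secret_iv, message).hex())
print(manual_hmac_sha1(secret_iv, message).hex())
print(seip_style_mdc(prefix, message).hex())
```

The hand-written HMAC should agree with Python's own `hmac` module, which is a quick way to check that the sketch really is the standard construction and not something invented here.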
+The topic of key committing block cipher modes is popular right now (2024)((Much of the excitement seems to have come from: [[https://eprint.iacr.org/2019/016.pdf|Fast Message Franking: From Invisible Salamanders to Encryptment]])). This property ends up being related to MAC behaviour in most contemporary block cipher modes so this is as good a place as any to have the discussion. A key committing block cipher mode will make it impossible for an attacker to produce a message that will pass the integrity check with two different keys. The SEIP mode is key committing for a simple, easy to understand, reason. SEIP decrypts first and then checks for modifications using the MDC (MAC then encrypt). Changing the key used to decrypt a message with SEIP changes, well, everything. The plaintext and the MDC change to entirely different values in a very complicated, unpredictable, way. The encrypted IV changes as well, but in a predictable way. Then the MDC check is applied to the wreckage. An attacker would have a very difficult time coming up with two keys that would cause the eventual MDC to pass. If the MAC had been done first on decryption (encrypt then MAC) as, say, GCM does, then the attacker would only have to figure out how to come up with two keys that would produce the same MAC value. With the GHASH used for GCM this is fairly easy. As a result, GCM is not key committing.
 ====Hash Properties====
@@ Line 160: / Line 164: @@
   - If the MDC passes then proceed with processing the plaintext.
-So we need to decrypt the message before we can check it for modifications. The MDC is based on SHA-1 which is inherently constant time due to the primitives it is constructed from. The decryption function, on the other hand, might not be constant time and could conceivably leak information about the plaintext due to timing differences. Cipher implementations that can leak information due to timing differences are considered seriously uncool these days and are rare. So such leakage with our SEIP for TLS scheme would also be rare.
+So we need to decrypt the message before we can check it for modifications (MAC then encrypt). The CBC padding oracle vulnerability is often used as an example of the weakness of such a scheme but, as already mentioned, it is specifically not relevant here. Instead let's consider timing. The MDC is based on SHA-1 which is inherently constant time due to the primitives it is constructed from. The decryption function, on the other hand, might not be constant time and could conceivably leak information about the plaintext due to timing differences when used in TLS. A block cipher mode that checked the MAC first and then refused to decrypt (encrypt then MAC) would be resistant to this sort of thing. The attacker would be restricted to messages generated legitimately and could not just generate their own.
 We would preserve the authentication established at the start of the connection based on the shared key. This is normally how this is done in TLS where there is no separate authentication key. GCM in TLS is handled this way for example. When the SEIP mode was created, the specific objective was to handle the case of anonymous/unauthenticated/unsigned messages/files. So the term "modification detection" was used. When block cipher modes targeting things like TLS were created, it was assumed that every message would be authenticated. So the term "authenticated encryption" was used. There is no difference other than the terminology. GCM can be used in a modification detection situation in the same way that SEIP can be used in an authenticated encryption situation. This difference in terminology sometimes causes confusion. It is perfectly appropriate to categorize the SEIP mode as an example of authenticated encryption.
@@ Line 169: / Line 173: @@
 Up to this point we assumed a cipher with 128 bit blocks. The OpenPGP standard supports ciphers with 64 bit blocks. Use of a 64 bit cipher in a SEIP formatted message would result in the secret IV only being 64 bits long. So the strength of the modification detection would be reduced. It's not clear to me that SEIP with 64 bit blocks is something that anyone ever implemented and that such messages/files were ever generated. This is still an important point as a dependency on block size for modification detection strength is not something that anyone would expect.
+A bit of insight from this discussion... The damage amplification of CFB is helpful here but would not have been as helpful if "encrypt then MAC" had been used. So "MAC/hash then encrypt" is synergistic with damage amplification. The OCB mode seems to me to be an extreme version of this principle. It could be classed as an "XOR then encrypt" mode. There is no MAC/hash used at all; OCB simply XORs the plaintext blocks together to create a checksum and then encrypts the whole message. It seems to rely on the superior damage amplification of the ECB (electronic code book) mode to make this effective.
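To make the "XOR then encrypt" idea concrete, here is a minimal Python sketch of just the checksum step. It is not an OCB implementation: the function names and the 16-byte block size are assumptions for the example, only whole blocks are handled, and the offset (Δ) masking and the encryption of the checksum into a tag are left out entirely.

```python
from functools import reduce

BLOCK = 16  # assumed block size in bytes (a 128-bit cipher)


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_checksum(plaintext: bytes) -> bytes:
    """XOR all plaintext blocks together. Whole blocks only in this sketch."""
    if len(plaintext) % BLOCK:
        raise ValueError("sketch handles whole blocks only")
    blocks = [plaintext[i:i + BLOCK] for i in range(0, len(plaintext), BLOCK)]
    return reduce(xor_blocks, blocks, bytes(BLOCK))


original = b"A" * BLOCK + b"B" * BLOCK
swapped = b"B" * BLOCK + b"A" * BLOCK

# Reordering the plaintext blocks leaves a bare XOR checksum unchanged.
print(xor_checksum(original) == xor_checksum(swapped))
```

On its own such a checksum is very weak, since swapping blocks or flipping the same bit in two blocks goes unnoticed. That is the sense in which the scheme has to lean on the cipher: any change to the ciphertext must turn into unpredictable damage in the plaintext for the checksum to catch it.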
 [[pgpfan:index|PGP FAN index]]\\