--- pgpfan:mdc [2022/05/14 20:21] – TLA identified. Can't leave blocks entirely unexplained. b.walzer
+++ pgpfan:mdc [2024/06/29 21:25] (current) – The technical article now exists. b.walzer
@@ Line 1: / Line 1: @@
 ======The OpenPGP Modification Detection Code is Actually Good======
+//A more detailed (and technical) article covering the same ground as this one exists: [[pgpfan:seip]].//
 I once worked for a company that had a strange and intriguing dilemma. They had a popular Product. Marketing determined that the popularity was due to the fact that the Product lasted significantly longer than competing products. No one in the company had the faintest idea why that was the case. The design did not differ in any obvious way from the design used by the competition. While I was there, an engineering project was initiated with the hope of understanding why the Product was better. I left the company before any definite result. For all I know the mystery still remains.
-The situation with the OpenPGP modification detection code (MDC) very much reminds me of the story of the Product. Legend has it that the MDC was created as a kind of an afterthought. It works very well but it is not obvious why it does. I have never seen an inclusive explanation. Here I will attempt to produce such an explanation.
+The situation with the OpenPGP modification detection code (MDC) very much reminds me of the story of the Product. Legend has it that the MDC was created as a kind of an afterthought((Since I first wrote this, I have come to believe that this is //just// a legend. The principles that make the MDC work were known at the time of its design. See the [[pgpfan:intptxt]] article for a related discussion.)). It works very well but it is not obvious why it does. I have never seen a comprehensive explanation. Here I will attempt to produce such an explanation.
-When used for something like email, the messages are authenticated directly with a signature. So the MDC is not relevant in the most common use case. So the MDC is not that important. It would still simplify things and eliminate much pointless discussion if the MDC could in fact be shown as strong. It would eliminate having to go through the more obscure uses of OpenPGP to determine how applicable the MDC was to each.
+Note that there is another legend floating around that states that the MDC only has the equivalent of "16 bits of security". This is simply wrong and was probably the result of failing to read to the bottom of an email thread(([[https://mailarchive.ietf.org/arch/msg/openpgp/UYEBC7hnZNbMoNWrfz9zJQb_FUk/|The misread IETF OpenPGP discussion thread about the security properties of the MDC]])).
+When OpenPGP is used for something like email, the messages are authenticated directly with a signature. So the MDC is not relevant in the most common use case and, to that extent, is not that important. It would still simplify things and eliminate much pointless discussion if the MDC could in fact be shown as strong. It would eliminate having to go through the more obscure uses of OpenPGP to determine how applicable the MDC was to each.
 If you were to encrypt, say, a message text, an attacker would have no way to determine what was said in that message just by looking at the encrypted result. That, after all, is the whole point of encrypting it in the first place. An attacker might still be able to make changes to the text if they can get access to it along the way. If they know what text is encrypted they might, by duplicating, deleting and moving the encrypted text be able to change the meaning of the text. Many encryption systems allow an attacker to selectively flip the bits in the eventual decrypted message. By that I mean that an attacker can change a 0 to a 1 or a 1 to a 0. The attacker doesn't know what the bit is to start with by looking at the encrypted message, but if they know what text is encrypted they can change it to whatever they want by picking the right bits to flip. The MDC is used to detect any sort of malicious changes to OpenPGP encrypted messages and files.
-We will start with a simple example in the style of the MDC and then improve it. For the sake of this example we will first assume that some sort of block((Things are usually encrypted in chunks (16 bytes long in most cases). The "blocks" referred to here are those chunks. This fact is ignored as much as possible to make the concepts clearer.)) cipher mode is used that allows any bit to be flipped in the message. The popular counter block cipher mode has that property(([[wp>Block_cipher_mode_of_operation#Counter_(CTR)|Counter Block Cipher Mode]])). This is our oversimplified MDC:
+We will start with a simple example in the style of the MDC and then improve it. For the sake of this example we will first assume that some sort of block((Things are usually encrypted in chunks (16 bytes long in most cases). The "blocks" referred to here are those chunks. This fact is ignored as much as possible in this article to make the concepts clearer.)) cipher mode is used that allows any bit to be flipped in the message. The popular counter block cipher mode has that property(([[wp>Block_cipher_mode_of_operation#Counter_(CTR)|Counter Block Cipher Mode]])). This is our oversimplified MDC:
 {{mdc1.svg}}
-We create this by hashing the message. Then we append the hash to the end of the message. After that we encrypt everything; message and hash.
+We create this by [[wp>Cryptographic_hash_function|hashing the message]]. Then we append the hash to the end of the message. After that we encrypt everything; message and hash. To check for modification we hash the message and compare that hash to the hash appended to the message.
-Let's consider the easiest situation for the attacker and assume they know the entire message. Then the attacker can hash that known message and will then know what the hash was before encryption. As a result they can modify the hash to any value they want by flipping bits as required. So the attacker can change the message to anything they want without restriction and can change the hash so that their changes would not be detected. If they can somehow arrange to have an appropriate hash encrypted in the message they can chop off the end of the message without detection.
+Let's consider the easiest situation for the attacker and assume they know the entire message. Then the attacker can hash that known message and will then know what the hash was before encryption. As a result they can modify the hash to any value they want by flipping bits as required. So the attacker can change the message to anything they want without restriction and can change the hash so that their changes would not be detected. If their target message is shorter than the original they can just generate the hash early and drop the extra part. So this is not entirely secure.
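The attack just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ''keystream'' function is a toy counter-style keystream built from HMAC-SHA256 (standing in for a real counter mode cipher), and the names ''seal'', ''check'' and ''forge'' are made up for this example. The attacker's ''forge'' function never sees the key.

```python
import hashlib
import hmac


def keystream(key: bytes, length: int) -> bytes:
    """Toy counter-mode keystream (HMAC-SHA256 of a counter); not a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def seal(key: bytes, message: bytes) -> bytes:
    """Oversimplified MDC: append SHA-1 of the message, then encrypt both."""
    body = message + hashlib.sha1(message).digest()
    return xor(body, keystream(key, len(body)))


def check(key: bytes, ciphertext: bytes):
    """Decrypt, then compare the recomputed hash with the appended one."""
    body = xor(ciphertext, keystream(key, len(ciphertext)))
    message, stored = body[:-20], body[-20:]
    return message, hmac.compare_digest(hashlib.sha1(message).digest(), stored)


def forge(ciphertext: bytes, known: bytes, wanted: bytes) -> bytes:
    """Attacker's view: no key, only the known plaintext.

    Works for any wanted message that is no longer than the known one.
    """
    old_body = known + hashlib.sha1(known).digest()
    new_body = wanted + hashlib.sha1(wanted).digest()
    # Flipping the bits where old and new differ turns one into the other.
    return xor(ciphertext, xor(old_body, new_body))


key = b"0123456789abcdef"
original = b"Pay Alice 100 dollars on Friday."
forged = forge(seal(key, original), original, b"Pay Mallory 9999 dollars.")
print(check(key, forged))
```

The forged message is shorter than the original, so this also shows the "generate the hash early and drop the extra part" case: the check passes even though the attacker never had the key.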
-We can make it harder to maliciously reduce the length of the message by adding an explicit message length:
+It might be good to switch to a block encryption mode that is not so inherently easy to modify. The cipher feedback (CFB) block mode imposes a penalty on modification in the form of completely unpredictable random garbage in the block after the modified block((See the [[pgpfan:cipherfeedback|Cipher Feedback]] article for a more detailed discussion.)):
 {{mdc2.svg}}
-Now the attacker has to come up with a new encrypted length, and they don't have the encryption key.
+Now the hash only has to detect the random garbage triggered by the attempt to change the CFB protected data. Since predicting the random garbage would require knowledge of the encryption key, the attacker has no real way to fix either the garbage or the hash. Attempts to change the last block of the message will cause unpredictable damage to the hash itself. So the modification detection code and the cipher feedback block mode work together.
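A small sketch of that penalty, assuming a toy block function (truncated HMAC-SHA256) in place of a real block cipher such as AES. CFB only ever uses the forward direction of the cipher, so a keyed one-way function is enough for a demonstration. The names ''block_fn'' and ''cfb'' are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16


def block_fn(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for block cipher encryption (truncated HMAC-SHA256)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cfb(key: bytes, iv: bytes, data: bytes, decrypt: bool) -> bytes:
    """Full-block CFB: each block is XORed with block_fn(previous ciphertext)."""
    out = b""
    prev = iv
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        result = bytes(d ^ k for d, k in zip(chunk, block_fn(key, prev)))
        out += result
        prev = chunk if decrypt else result  # feedback is always ciphertext
    return out


key = b"0123456789abcdef"
iv = bytes(BLOCK)
message = b"A" * 16 + b"B" * 16 + b"C" * 16      # three full blocks
body = message + hashlib.sha1(message).digest()  # message plus hash
ciphertext = cfb(key, iv, body, decrypt=False)

# The attacker flips one bit in the first ciphertext block.
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
result = cfb(key, iv, tampered, decrypt=True)

flipped_block = result[0:16]   # exactly one bit differs from b"A" * 16
garbage_block = result[16:32]  # unpredictable without the key
untouched = result[32:]        # third block and the stored hash survive
hash_ok = hashlib.sha1(result[:-20]).digest() == result[-20:]
print(flipped_block, garbage_block.hex(), hash_ok)
```

The bit flip lands where the attacker wanted it, but the following block turns into garbage that the attacker can neither predict nor repair, and the hash comparison fails.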
-It might be good to switch to a block encryption mode that is not so inherently easy to modify. The cipher feed back (CFB) mode imposes a penalty on modification in the form of completely unpredictable random garbage in the block after the modified block((See the [[pgpfan:cipherfeedback|Cipher Feedback]] article for a more detailed discussion.)):
+Changing the last block of the hash would not cause any random garbage because there is no place for that random garbage to go. This means that at least a portion of the hash is modifiable. Let's fix that:
 {{mdc3.svg}}
-Now the hash only has to detect the random garbage triggered by the attempt to change the CFB protected data. Since predicting the random garbage would require knowledge of the encryption key, the attacker has no real way to fix either the garbage or the hash. Attempts to change the last block of the message will cause unpredictable damage to the hash itself.
+We have added some random data to the start of the message. The random data prefix is included in the hash. That means that the attacker can never know the entire message and as a result will not know what the hash is to start with. As a result they will not be able to change the hash in a rational way by flipping bits. So the hash is protected by first randomizing it and then encrypting it.
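The effect of the random prefix on the modifiable last block can be sketched as follows. As before, this assumes a toy block function (truncated HMAC-SHA256) rather than a real block cipher, an all-zero IV, and hypothetical helper names. The attacker knows the whole message and tries to set the last four bytes of the stored hash to a value of their choosing.

```python
import hashlib
import hmac
import os

BLOCK = 16


def block_fn(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for block cipher encryption (truncated HMAC-SHA256)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cfb(key: bytes, iv: bytes, data: bytes, decrypt: bool) -> bytes:
    """Full-block CFB: each block is XORed with block_fn(previous ciphertext)."""
    out = b""
    prev = iv
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        result = bytes(d ^ k for d, k in zip(chunk, block_fn(key, prev)))
        out += result
        prev = chunk if decrypt else result
    return out


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def seal(key: bytes, message: bytes, prefix: bytes = b"") -> bytes:
    """Hash (prefix + message), append the hash, encrypt everything."""
    body = prefix + message
    body += hashlib.sha1(body).digest()
    return cfb(key, bytes(BLOCK), body, decrypt=False)


key = b"0123456789abcdef"
message = b"M" * 48             # fully known to the attacker
target = b"\xde\xad\xbe\xef"    # value the attacker wants in the hash tail
known_hash = hashlib.sha1(message).digest()

# No prefix: the last (partial) block holds the final 4 hash bytes and
# can be rewritten cleanly because nothing follows it to be garbled.
ct = seal(key, message)
forged = ct[:-4] + xor(ct[-4:], xor(known_hash[-4:], target))
plain_tail = cfb(key, bytes(BLOCK), forged, decrypt=True)[-4:]

# Random prefix: the attacker's idea of the hash is now wrong, so the
# same bit flips produce a value the attacker did not choose.
prefix = os.urandom(BLOCK)
real_hash = hashlib.sha1(prefix + message).digest()
ct2 = seal(key, message, prefix)
forged2 = ct2[:-4] + xor(ct2[-4:], xor(known_hash[-4:], target))
blind_tail = cfb(key, bytes(BLOCK), forged2, decrypt=True)[-4:]
print(plain_tail.hex(), blind_tail.hex())
```

Without the prefix the attacker gets exactly the four bytes they asked for. With the prefix they can still flip bits in the last block, but they are flipping them blind.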
-Changing the last block of the hash would not cause any random garbage because there is no place for that random garbage to go. This means that at least a portion of the hash is modifiable. Let's fix that:
+There are some mostly theoretical attacks that involve getting the victim to encrypt messages created by the attacker so that the attacker then can modify them by chopping off the start and/or the end of the message without detection. The version of cipher feedback used by OpenPGP((See the [[pgpfan:ocfb|OpenPGP's Improved Cipher Feedback Mode]] article for some more detail.)) (OCFB) prevents that sort of attack by preventing attacker knowledge of the random prefix data and requiring the key to create a new random prefix. This is the OCFB-MDC (OpenPGP Cipher FeedBack - Modification Detection Code) mode used by OpenPGP (irrelevant detail omitted):
 {{mdc4.svg}}
-We have added some random data before the length. That random data is included in the hash. That means that the attacker can never know the entire message and will not know what the hash is to start with. As a result they will not be able to change it in a rational way by flipping bits.
+Now both the hash and the random data are protected by first randomizing them and then encrypting them.
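Putting the pieces together, here is a rough sketch of the layout that RFC 4880 section 5.13 describes: a random prefix one block long plus a repeat of its last two bytes, then the data, then the two bytes ''0xD3 0x14'' (the MDC packet header), with a SHA-1 hash over all of that appended, and everything encrypted with CFB using an all-zero IV. The block function is again a toy stand-in (truncated HMAC-SHA256) rather than a real cipher, packet framing is left out, and the function names are made up for this example.

```python
import hashlib
import hmac
import os

BLOCK = 16
MDC_HEADER = b"\xd3\x14"  # MDC packet tag and length (20), per RFC 4880


def block_fn(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for block cipher encryption (truncated HMAC-SHA256)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cfb(key: bytes, iv: bytes, data: bytes, decrypt: bool) -> bytes:
    """Full-block CFB: each block is XORed with block_fn(previous ciphertext)."""
    out = b""
    prev = iv
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        result = bytes(d ^ k for d, k in zip(chunk, block_fn(key, prev)))
        out += result
        prev = chunk if decrypt else result
    return out


def ocfb_mdc_encrypt(key: bytes, plaintext: bytes) -> bytes:
    prefix = os.urandom(BLOCK)
    prefix += prefix[-2:]                      # repeat of the last two bytes
    hashed = prefix + plaintext + MDC_HEADER   # everything the hash covers
    body = hashed + hashlib.sha1(hashed).digest()
    return cfb(key, bytes(BLOCK), body, decrypt=False)


def ocfb_mdc_decrypt(key: bytes, ciphertext: bytes):
    """Return (plaintext, ok). The caller decides what to do if ok is False."""
    body = cfb(key, bytes(BLOCK), ciphertext, decrypt=True)
    hashed, stored = body[:-20], body[-20:]
    ok = hashed[-2:] == MDC_HEADER and hmac.compare_digest(
        hashlib.sha1(hashed).digest(), stored
    )
    return hashed[BLOCK + 2:-2], ok


key = b"0123456789abcdef"
ct = ocfb_mdc_encrypt(key, b"attack at dawn")
print(ocfb_mdc_decrypt(key, ct))

# Flipping any single bit anywhere in the ciphertext trips the check.
tripped = all(
    not ocfb_mdc_decrypt(key, ct[:i] + bytes([ct[i] ^ 1]) + ct[i + 1:])[1]
    for i in range(len(ct))
)
print(tripped)
```

Returning the plaintext together with the check result, rather than refusing to return anything, is a deliberate choice in this sketch: it leaves the decision about suspect data to the caller.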
-This brings us to the end of our journey. The previous diagram represents the MDC (some irrelevant detail omitted). If you want to attack the MDC and modify a message without triggering the MDC you will have to deal with all of the following challenges:
+If you want to attack OCFB-MDC and modify a message without triggering the MDC you will have to deal with the following challenges:
   * Everything is encrypted. You will have to work through the encryption. You don't know the key.
   * The hash will detect your change directly.
-  * CFB will cause unpredictable damage to the next block which will also be detected by the hash.
+  * OCFB will cause unpredictable damage to the next block which will also be detected by the hash.
-  * Even if you can somehow figure out how to make a rational change to the message/file you will not know how to change the hash due to the random data.
+  * Even if you can somehow figure out how to make a rational change to the message/file you will not know how to change the hash due to the random data prefix.
+  * The random data prefix is very well protected by the OpenPGP version of cipher feedback (OCFB).
-This might seem clunky and redundant. The same result is achieved in multiple ways. But that doesn't reduce the security; it seems very unlikey that one method could be used to overcome another. It also makes sense in an OpenPGP context. All of this was preexisting in the OpenPGP standard:
+This might seem inelegant but it makes complete sense in the OpenPGP context. This was preexisting in the OpenPGP standard:
-  * The length is the regular OpenPGP packet length.
+  * The OCFB block mode is the standard mode used in OpenPGP.
-  * The CFB block mode is the standard mode used in OpenPGP.
+  * The random data is used by the OCFB block mode to prevent similar/identical messages/files from leaking data after encryption.
-  * The random data is used by the CFB block mode to prevent similar/identical messages/files from leaking data after encryption.
 All that was required to make the MDC was the addition of a single hash. The MDC is actually an example of minimalist and appropriate design.
-I am not a professional cryptographer, but the MDC seems pretty secure. No one can say for sure that the MDC is secure. Anyone can prove it is //not// by demonstrating that they can modify messages/files without tripping the MDC. In the 20 years that the MDC has existed (2022) no one has managed to do this.
+I am not a professional cryptographer, but the MDC seems pretty secure. No one can say for sure that the MDC is completely secure. Anyone can prove it is //not// by demonstrating that they can modify messages/files without tripping the MDC. In the 20 years that the MDC has existed (2022) no one has managed to do this. I doubt that was because of a lack of effort. OpenPGP gets a fair bit of academic scrutiny.
-The combination of CFB and MDC is effectively authenticated encryption. It detects changes in messages based on the shared secret of the encryption key. There is a definition of authenticated encryption that makes refusal to release suspect data mandatory, but that is not relevant for the sort of offline applications that OpenPGP is used for. There is only one encrypted message/file available when working with an offline system. Eventually someone is going to have to look at a suspect message to try to determine if they are under some sort of attack. Someone might want to try to recover the data in a corrupted file. If you want to define CFB-MDC-NR (NR for No Release) for some situation where that would make sense then feel free to do so; there is nothing intrinsic to CFB-MDC that would prevent you from doing that.
+The combination of OCFB and MDC is effectively authenticated encryption. It detects changes in messages based on the shared secret of the encryption key. There is a definition of authenticated encryption that makes refusal to release suspect data mandatory, but that is not relevant for the sort of offline applications that OpenPGP is used for. There is only one encrypted message/file available when working with an offline system. Eventually someone is going to have to look at a suspect message to try to determine if they are under some sort of attack. Someone might want to try to recover the data in a corrupted file. If you want to define OCFB-MDC-NR (NR for No Release) for some situation where that would make sense then feel free to do so; there is nothing intrinsic to OCFB-MDC that would prevent you from doing that.
-Most authenticated encryption schemes use some sort of counter block encryption mode and as a result depend heavily on the implementation refusing to release data because counter mode is completely modifiable without penalty. In an offline encryption environment where such implementation behaviour can't be guaranteed, the inherent modification deterrence of the CFB mode becomes important. So the MDC is specifically suited to the offline encryption environment in a way that other schemes are not.
+Most authenticated encryption schemes use some sort of counter block encryption mode and as a result depend heavily on the implementation refusing to release suspect data because counter mode is completely modifiable without penalty. In an offline encryption environment where such implementation behaviour can't be guaranteed, the inherent modification deterrence of the OCFB mode becomes important. So the MDC is specifically suited to the offline encryption environment in a way that other schemes are not.
-The MDC uses the SHA1 method for the hash. Not everyone knows that the discovered weakness in SHA1 is irrelevant to the MDC. I suppose you could redefine it as the "MDC hash" and include the weakness in the specification to prevent unnecessary angst. In general, the MDC is likely to be resistant to weaknesses in the hash due to the fact that the hash is encrypted and randomized by the random data which makes it very hard to mess with.
+The MDC uses the SHA1 method for the hash. Not everyone knows that the discovered weakness in SHA1 is irrelevant to the MDC. I suppose you could redefine it as the "MDC hash" and specify that it only needs to be irreversible to prevent unnecessary angst. In general, the MDC is likely to be resistant to weaknesses in the hash due to the fact that the stored hash is encrypted and randomized by the random data which makes it very hard to mess with.
-The MDC is secure and is well suited to the sort of offline encryption that the OpenPGP standard embodies. Proposals to add one or more encrypted authenticated modes and depreciate the MDC don't make sense to me. We would be better off if we simply did nothing.
+The MDC is secure and is well suited to the sort of offline encryption that the OpenPGP standard embodies. [[pgpfan:no_new_ae|Proposals to add one or more authenticated encryption modes and deprecate the MDC don't make sense to me]]. We would be better off if we simply did nothing.
 =====References=====
@@ Line 64: / Line 68: @@
   * [[https://www.rfc-editor.org/rfc/rfc4880#section-5.13|RFC-4880 sec 5.13 (Symmetrically Encrypted Integrity Protected Data packet)]]
   * [[https://www.rfc-editor.org/rfc/rfc4880#section-5.14|RFC-4880 sec 5.14 (Modification Detection Code packet)]]
-  * [[https://mailarchive.ietf.org/arch/msg/openpgp/UYEBC7hnZNbMoNWrfz9zJQb_FUk/|IETF OpenPGP email list thread about the security properties of the MDC)]]
 [[pgpfan:index|PGP FAN index]]\\
 [[em:index|Encrypted Messaging index]]