Linux random number generator updates #158
Any update on the merging status of this? The intent of the proposal is to get this section to apply to modern kernels and reduce the semantic complexity discussed. I find the value of giving a simple story of the current semantics of the devices significant, as a newcomer to Linux will not have to increase their limited mental load with details that are only of historical interest. The only "change" is the mentioning of |
I verified the POSIX inclusion: https://pubs.opengroup.org/onlinepubs/9799919799/functions/getentropy.html The Linux kernel text looks correct to me. I've asked Greg KH to verify the Linux kernel situation. Yes, I should have done that a while ago, but I'm on the case now :-). |
Greg KH asked me to look at the docs & talk directly with the kernel implementers. First, looking at available docs... POSIX does have The /dev stuff needs more research. It's true that commit torvalds/linux@30c08ef from 2020 was earlier merged in. However, there were problems in the unification noted in 2022. So I want to check and make sure that these statements about /dev/*random are correct. I believe the people I eventually need to contact about Linux kernel random number generation are Jason A. Donenfeld and Theodore Ts'o. Jason seems to be more active recently on it, and tytso has a long background on it. |
The most authoritative source is the code itself. I've identified the key source file to be reviewed: |
The problem is that /dev/random blocks. In many cases blocking is a much worse problem. See: “Myths about /dev/urandom” The urandom(4) man page says:
|
The same problem hits
|
<tr> | ||
<td>C (POSIX)</td> | ||
<td><tt>rand(), *rand48()</tt></td> | ||
<td><tt>getentropy()</tt></td> |
This is fine!
For example, the Linux kernel provides cryptographically secure random number values via its `getrandom` system call, as well as the special files `/dev/urandom` and `/dev/random`. In most cases you would want to use the `getrandom` system call where practical, or the `/dev/urandom` special file if `getrandom` is hard to access (e.g., from a shell script). These generate cryptographically secure random values using a CSPRNG and entropy gathered by the kernel. In special circumstances, such as creating a long-lived cryptographic key, you might instead want to use `/dev/random` or the equivalent option in `getrandom`; this forces the kernel to wait (block) until it has a high estimated amount of internal entropy. The purpose of `/dev/random` is to ensure there is a large amount of internal entropy, but the blocking may be indefinite in some circumstances and it’s usually not necessary. What's important is that an attacker can't practically guess the random value, not the value of this internal entropy estimate. (see [“Myths about /dev/urandom”](https://www.2uo.de/myths-about-urandom/) by Thomas). In the future there may be no difference between `/dev/random` and `/dev/urandom`. | ||
|
||
For example, the Linux kernel provides cryptographically secure random number values via its **/dev/urandom** special file, its **/dev/random** special file, and its **getrandom** system call. In most cases you would want to use the **/dev/urandom** special file or the **getrandom** system call. These generate cryptographically secure random values using a CSPRNG and entropy gathered by the kernel. In special circumstances, such as creating a long-lived cryptographic key, you might instead want to use **/dev/random** or the equivalent option in **getrandom**; this forces the kernel to wait (block) until it has a high estimated amount of internal entropy. The purpose of **/dev/random** is to ensure there is internal entropy, but the blocking may be indefinite in some circumstances and it’s usually not necessary | ||
For example, POSIX defines the `getentropy` interface to access a cryptographically secure pseudo-random number generator. On Linux systems `getentropy` is wrapper over the `getrandom` system call. The Linux kernel additionally provides the special files `/dev/urandom` and `/dev/random` that have been similar since Linux kernel version 5.6. In most cases you would want to use the `getentropy` interface where practical, or the `/dev/random` special file if `getentropy` is hard to access (e.g., from a shell script). Both generate cryptographically secure random values using a CSPRNG and entropy gathered by the kernel. The key difference between `/dev/urandom` and `/dev/random`, is that `/dev/urandom` may provide output during early boot even before the random generator is fully seeded, while `getentropy` and `/dev/random` block until the generator is fully initialized. |
It's good to note `getentropy` is in POSIX (we should probably note it's a standard). Noting that on Linux it calls `getrandom` is also good. However, this text misses a key issue: `getentropy`, `/dev/random`, and some uses of `getrandom` can block. If you're not creating keys for long-term use, the usual recommendation is to avoid the blocking versions, because having a system catastrophically fail usually means that people will do work-arounds that are even worse. The "early boot" text doesn't really explain things.
How about this instead? (I've made some edits since my first try):
For example, the POSIX specification defines the `getentropy` interface to access a cryptographically secure pseudo-random number generator. On Linux systems `getentropy` is a wrapper around the Linux-specific and more flexible `getrandom` system call. The Linux kernel additionally provides the special readable files `/dev/urandom` and `/dev/random`. The Linux kernel, and many other systems, gather entropy from hardware and mix that into random values using a CSPRNG to provide these cryptographically secure random values. Typical Linux systems record old seeds on shutdown so they will continue to produce highly random values when they reboot. Unfortunately, some of these requests will block - possibly forever - until the underlying system's estimate of internal entropy is high enough (`getentropy`, `/dev/random`, and some uses of `getrandom` will all block). In most cases on Linux kernel based systems you should avoid blocking indefinitely, and instead use a non-blocking form of `getrandom` or the non-blocking `/dev/urandom`. However, for creating long-lived secrets (like GPG/SSL/SSH keys) where the quality of randomness is especially critical, consider using the blocking versions. This is a potential trade-off between availability and other aspects of security. See “Myths about /dev/urandom” by Thomas and Linux man page urandom(4).
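To make the "non-blocking form of `getrandom`" concrete, here is a minimal sketch, assuming Python on Linux, where `os.getrandom` and `os.GRND_NONBLOCK` expose the system call and its flag. The helper name `nonblocking_random_bytes` is hypothetical and only for illustration; it is not part of the proposed guide text.

```python
import os


def nonblocking_random_bytes(n: int) -> bytes:
    """Return n random bytes, failing fast instead of blocking.

    Hypothetical helper for illustration only.
    """
    if not hasattr(os, "getrandom"):
        # Assumption: on platforms without getrandom(), os.urandom() is the
        # closest portable substitute.
        return os.urandom(n)

    chunks = []
    remaining = n
    while remaining > 0:
        try:
            # GRND_NONBLOCK makes the call fail (EAGAIN) rather than wait
            # when the kernel CSPRNG has not been initialized yet.
            chunk = os.getrandom(remaining, os.GRND_NONBLOCK)
        except BlockingIOError as exc:
            raise RuntimeError("kernel CSPRNG is not initialized yet") from exc
        # getrandom() may return fewer bytes than requested, so loop.
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(nonblocking_random_bytes(16).hex())
```

With this shape the caller gets an explicit error when the generator is not ready, and can decide whether to wait, retry, or fall back, rather than hanging silently.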
Randomness is quite technical and challenging. I've proposed some alternative text, building on what you proposed earlier. Once we agree on something, I intend to bring it to the Linux kernel developers who focus on the random number generator, to get their take. Security is confidentiality, integrity, and availability. If a system can't start, nobody cares about it. So we need to make it clear that there's a trade-off. What's more, in a lot of cases the secrets are short-lived (e.g., session keys) and the protections provided by entropy estimates are usually not worth making the system completely fail. |
I see that the availability aspect is important to you, and rightly so, but this introduces additional complexity that the reader of the guidance would have to manage. The key aspects that led me to simplify the guidance initially to an unconditional endorsement of getentropy, and thus towards simplicity:
As such, I was led to believe that the only practical recommendation for the average developer is to avoid all nuances and details and use At the same time, there is the issue of availability, which is important, but I do not feel that getrandom() with the flag to not block is the unconditional answer: in the case of OpenSSH keys, for example, it will produce output that is predictable and with a long-lasting effect. These keys will be predictable even years after they have been generated. The availability issue is something that could happen, probably when the system hasn't gathered enough entropy after a boot (this could be on custom-designed boards or in VMs, as in the example I used above). In these cases using getrandom() with the option to not block will hide the issue during the design phase and prevent a fix (e.g., using something like virtio_rng on a VM or a hardware random generator on a board). During the production phase it can prevent the issue of (un)availability. As such, my recommendation would be to cover the potential availability issue, maybe with a pointer, as after writing the above I'm not sure that getrandom() and /dev/urandom are even solutions (as opposed to quick hacks) to the availability issue for that type of random generator. PS. This is more an expression of my point of view than a text suggestion. If you also see value in the aspects above and we get on the same page, I can propose something. |
We're talking about "no/low estimated entropy" which isn't quite the same as "uninitialized" in its usual sense. In many cases the kernel doesn't normally credit entropy, and believes it's 0 or low, yet there's no known way for an attacker to predict the random value. In particular, "Writing to /dev/random or /dev/urandom will update the entropy pool with the data written, but this will not result in a higher entropy count." per urandom. This is a common case; distros generally write out the seed on shutdown, and read it back, so in cases where it can "start where it left off" this matters. It's really common for systems to reload the seed they previously had on shutdown, where it has a seed no one can know yet its estimated entropy is 0.
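For anyone who wants to look at the kernel's own estimate while following this discussion, here is a small sketch, assuming a Linux system that exposes `/proc/sys/kernel/random/entropy_avail` (documented in random(4)). The function name `estimated_entropy_bits` is hypothetical. Note that this only reads the kernel's estimate; as described above, a low number does not by itself mean an attacker can predict the output.

```python
from pathlib import Path
from typing import Optional

# Assumption: this procfs file is present, as on typical Linux systems.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def estimated_entropy_bits(path: Path = ENTROPY_AVAIL) -> Optional[int]:
    """Return the kernel's entropy estimate in bits, or None if unavailable.

    Hypothetical helper for illustration only.
    """
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux, restricted container) or unexpected content.
        return None


if __name__ == "__main__":
    print("estimated entropy (bits):", estimated_entropy_bits())
```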
Obviously that's bad :-). We already cover that, though. SSH keys are cases where we already recommend the use of blocking calls, like /dev/random. I agree that this is tricky :-). We have a fundamental trade-off. I think we should clearly present it as a trade-off. If a system is unavailable that's often a non-starter. One of the most authoritative guidance about this trade-off when using Linux is the text from urandom(4):
I guess one option is to directly quote this as part of the text. |
That's certainly correct, and since you have checked the internals recently you have a better overview than I had. However, my concern is whether there is a distinction between the uninitialized RNG and the RNG with an entropy level estimated as zero in the Linux kernel. Without a distinction it seems to me very hard to define the expectation of what the output will be at this early boot stage. (And just to make it clear: it is not my position that the /dev/random behavior was good when it had the notion of entropy loss and could block at any time, effectively causing a DoS; my focus is on the current behavior, where the RNG will only unblock when it considers itself initialized, irrespective of whether I agree with the criteria chosen.)
Let's not forget that it is an assumption that this happens (being in the embedded world right now has changed my perspective a little). Maybe that's a good point to include in the text.
To my understanding you suggest to split the recommendation based on the use cases such as:
Thinking about it, this separation is not enough. On the second case we only have an assumption that a short term effect will come from using that short-term key. If this short term key is used to transport the "crown jewels", the assumption of suitability goes away --again thinking that there is no strict definition of the unblocked behavior strength. It feels that any distinction between blocking and non-blocking behavior usage must come with very concrete assumptions that will in turn limit the usefulness of the advice. |
And let me propose some text to see how a recommendation based on use case can look:
The second would have to have an example, but I'm not sure I can find a useful one. I hope that this demonstrates a little better why I think that the usefulness of advice no. 2 is limited. |
Part 1: I agree that using blocking calls for long-term keys is the general consensus. However, Part 2: The "early boot" isn't quite right. The Linux kernel developers are pretty conservative about what counts as entropy - and understandably so. As a result, the kernel can report "0 entropy" when in fact there's no way an attacker can determine the seed value (so it's not REALLY 0 entropy). It's not clear how to fix this; the kernel devs only have so much information and it is reasonable for them to be conservative. However, this also means that many systems hang if you force them to use blocking calls. It's frustrating. I'd love to say "always use blocking calls for cryptographic randomness" - it is much easier to understand - but the real world makes that impractical to simply require. |
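To illustrate what a use-case split could look like in code, here is a rough sketch, assuming Python on Linux. The function `random_secret` and its `long_lived` parameter are hypothetical names chosen for this example; the mapping (blocking `getrandom` for long-lived keys, `/dev/urandom` otherwise) follows the trade-off discussed above and is not a definitive recommendation.

```python
import os


def random_secret(n: int, long_lived: bool) -> bytes:
    """Return n random bytes, choosing the source by use case.

    Hypothetical helper for illustration only.
    """
    if long_lived:
        if hasattr(os, "getrandom"):
            # flags=0: wait until the kernel CSPRNG is initialized,
            # trading availability for assurance.
            buf = b""
            while len(buf) < n:
                buf += os.getrandom(n - len(buf))
            return buf
        # Assumption: without getrandom(), fall back to the blocking device.
        source = "/dev/random"
    else:
        # Short-lived secrets (e.g. session keys): never block.
        source = "/dev/urandom"
    with open(source, "rb") as dev:
        return dev.read(n)


if __name__ == "__main__":
    print("session key:", random_secret(16, long_lived=False).hex())
    print("long-term key seed:", random_secret(32, long_lived=True).hex())
```

The weakness raised earlier in the thread still applies: the caller has to decide correctly which category a secret falls into.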
Quick note: It may not look it, but I really appreciate this discussion & deep dive into random number generation. It seems simple at first, but this is an area that is absolutely fraught with complications. Many successful attacks have involved the cryptographic random number generator, so it matters. On the other hand, making entire systems fail because they think they don't have enough entropy is a great way to ensure that the system, and any thought of security, is removed immediately. Threading these issues is complex. |
Same here, and this is a difficult case. There is a need for balance between being conservative by suggesting blocking and being efficient by not blocking, and I, at least, rely more on "gut" feeling than on clear data. What I am missing is whether the blocking issue, as communicated in myths-about-urandom, is still as relevant today as when it was written. Your suggestion is a good balance, and I'll update the MR with it. Would it be OK to remove the reference to myths-about-urandom, as the urandom/random environment was significantly different? The myths document provides a very nice historical context for the devices, but to someone learning Linux and security today it will be of limited practical value. |
This includes the POSIX interface getentropy, that is simpler to use than getrandom, and in practice it is available for as long as getrandom is available in glibc, in addition to being part of OpenBSD before that. https://pubs.opengroup.org/onlinepubs/9799919799/ This patch set also removes the long discussion about /dev/random and /dev/urandom which I loved, but today these interfaces function similarly. torvalds/linux@30c08ef Further discussion at: ossf#158 Signed-off-by: Nikos Mavrogiannopoulos <[email protected]>
This includes the POSIX interface getentropy, that is simpler to use than getrandom, and in practice it is available for as long as getrandom is available in glibc, in addition to being part of OpenBSD before that. https://pubs.opengroup.org/onlinepubs/9799919799/ This patch set also removes the long discussion about /dev/random and /dev/urandom which I loved, but today these interfaces function similarly. torvalds/linux@30c08ef This still recommends for now the non-blocking interfaces based on discussion at: ossf#158 Signed-off-by: Nikos Mavrogiannopoulos <[email protected]>