What is the most secure seed for random number generation?
Asked Answered
L

19

70

What are the most secure sources of entropy to seed a random number generator? This question is language and platform independent and applies to any machine on a network. Ideally I'm looking for sources available to a machine in a cloud environment or server provided by a hosting company.

There are two important weaknesses to keep in mind. The use of time for sending a random number generator is a violation of CWE-337. The use of a small seed space would be a violation of CWE-339.

Leenaleeper answered 8/8, 2010 at 22:43 Comment(12)
What do you mean by "secure"?Laceration
If you attach a geiger counter to your computer and pull the information from there, that would probably be the most random that is "commonly" available.Idiophone
@Neil Butterworth Security in the sense of random number generation is always going to be a value that is difficult for the attacker to guess.Leenaleeper
Unless you are doing encryption, why do you need such randomness?Idiophone
@James Black this cannot be done in a cloud environment and the use of random numbers is outside of the scope of this question.Leenaleeper
How about idquantique.com/true-random-number-generator/…Laceration
@The Rock - So the server isn't generating the seed? Without more details this is a hard question to answer, as any passing of a seed will involve encrypting it with more than https, and then you have problems with key management. This is especially true for cloud computing, as you have to assume the sysadmins are not to be trusted. You may want to read "Programming Satan's Computer" to understand more about his: lambda-the-ultimate.org/node/3482Idiophone
The two standards suggest a hardware device.Wintertide
@James Black first and foremost my name is the rook. 2nd of all I am clearly generating a seed on the host side. Hardware random number generators, such as geiger counters are very interesting but do not apply for cloud based servers. Random numbers are very important for security, not only for encryption keys but also for session id's, csrf tokens, registration keys to verify email address, capthca type challenge responses and the list goes on.Leenaleeper
If you look three comments above this it looks different, at least on my laptop, but sorry about getting your name wrong.Idiophone
@James Black no worries.Leenaleeper
I did see a mention of using digitized noise from a disconnected audio input for a real random number. I don't know if anyone released any code to implement it though...Hyaloplasm
P
94

Here are a few thoughts. If you are impatient, skip to the conclusion, at the end.

1. What is a secure seed ?

Security is defined only relatively to an attack model. We want here a sequence of n bits, that has n bits of entropy with regards to the attacker: in plain words, that any of the possible 2n values for that sequence are equally probable from the attacker point of view.

This is a model which relates to the information available to the attacker. The application which generates and uses the seed (normally in a PRNG) knows the exact seed; whether the seed is "secure" is not an absolute property of the seed or even of the seed generation process. What matters is the amount of information that the attacker has about the generation process. This level of information varies widely depending on the situation; e.g. on a multi-user system (say Unix-like, with hardware-enforced separation of applications), precise timing of memory accesses can reveal information on how a nominally protected process reads memory. Even a remote attacker can obtain such information; this has been demonstrated (in lab conditions) on AES encryption (typical AES implementations use internal tables, with access patterns which depend on the key; the attacker forces cache misses and detects them through precise timing of responses of the server).

The seed lifetime must be taken into account. The seed is secure as long as it remains unknown to the attacker; this property must hold true afterwards. In particular, it shall not be possible to recover the seed from excerpts of the subsequent PRNG output. Ideally, even obtaining the complete PRNG state at some point should offer no clue as to whatever bits the PRNG produced beforehand.

The point I want to make here is that a seed is "secure" only if it is used in a context where it can remain secure, which more or less implies a cryptographically secure PRNG and some tamper-resistant storage. If such storage is available, then the most secure seed is the one that was generated once, a long time ago, and used in a secure PRNG hosted by tamper-resistant hardware.

Unfortunately, such hardware is expensive (it is called a HSM and costs a few hundreds or thousands of dollars), and that cost usually proves difficult to justify (a bad seed will not prevent a system from operating; this is the usual problem of untestability of security). Hence it is customary to go for "mostly software" solutions. Since software is not good at providing long-term confidential storage, the seed lifetime is arbitrarily shortened: a new seed is periodically obtained. In Fortuna, such reseeding is supposed to happen at least once every megabyte of generated pseudo-random data.

To sum up, in a setup without a HSM, a secure seed is one that can be obtained relatively readily (since we will do it quite often) using data that cannot be gathered by the attacker.

2. Mixing

Random data sources do not produce nice uniform bits (each bit having value 1 with probability exactly 0.5, and bit values are independent of each other). Instead, random sources produce values in a source-specific sets. These values can be encoded as sequences of bits, but you do not get your money worth: to have n bits of entropy you must have values which, when encoded, uses much more than n bits.

The cryptographic tool to use here is a PRF which accepts an input of arbitrary length, and produces an n-bit output. A cryptographically secure PRF of that kind is modeled as a random oracle: in short terms, it is not computationally feasible to predict anything about the oracle output on a given input without trying it.

Right now, we have hash functions. Hash functions must fulfill a few security properties, namely resistance to preimages, second preimages, and collisions. We usually analyze hash functions by trying to see how they depart from the random oracle model. There is an important point here: a PRF which follows the random oracle model will be a good hash function, but there can be good hash functions (in the sense of resistance to preimages and collisions) which nonetheless are easy to distinguish from a random oracle. In particular, the SHA-2 functions (SHA-256, SHA-512...) are considered to be secure, but depart from the random oracle model due to the "length extension attack" (given h(m), it is possible to compute h(m || m') for a partially constrained message m' without knowing m). The length extension attack does not seem to provide any shortcut into the creation of preimages or collisions, but it shows that those hash functions are not random oracles. For the SHA-3 competition, NIST stated that candidates should not allow such "length extension".

Hence, the mixing step is not easy. Your best bet is still, right now, to use SHA-256 or SHA-512, and switch to SHA-3 when it is chosen (this should happen around mid-2012).

3. Sources

A computer is a deterministic machine. To get some randomness, you have to mix in the result of some measures of the physical world.

A philosophical note: at some point you have to trust some smart guys, the kind who may wear lab coats or get paid to do fundamental research. When you use a hash function such as SHA-256, you are actually trusting a bunch of cryptographers when they tell you: we looked for flaws, real hard, and for several years, and found none. When you use a decaying bit of radioactive matter with a Geiger counter, you are trusting some physicists who say: we looked real hard for ways to predict when the next atom kernel will go off, but we found none. Note that, in that specific case, the "physicists" include people like Becquerel, Rutherford, Bohr or Einstein, and "real hard" means "more than a century of accumulated research", so you are not exactly in untrodden territory here. Yet there is still a bit of faith in security.

Some computers already include hardware which generates random data (i.e. which uses and measures a physical process which, as far as physicist can tell, is random enough). The VIA C3 (a line of x86-compatible CPU) have such hardware. Strangely enough, the Commodore 64, home computer from 30 years ago, also had a hardware RNG (or so says Wikipedia, at least).

Barring special hardware, you have to use whatever physical events you may get. Typically, you would use keystrokes, incoming ethernet packets, mouse movements, harddisk accesses... every event comes with some data, and occurs at a measurable instant (modern processors have very accurate clocks, thanks to cycle counters). Those instants, and the event data contents, can be accumulated as entropy sources. This is much easier for the operating system itself (which has direct access to the hardware) than for applications, so the normal way of collecting a seed is to ask the operating system (on Linux, this is called /dev/random or /dev/urandom [both have advantages and problems, choose your poison]; on Windows, call CryptGenRandom()).

An extreme case is pre-1.2 Java applets, before the addition of java.security.SecureRandom; since Java is very effective at isolating the application code from the hardware, obtaining a random seed was a tough challenge. The usual solution was to have two or three threads running concurrently and thread-switching madly, so that the number of thread switches per second was somewhat random (in effect, this tries to extract randomness through the timing of the OS scheduler actions, which depend on what also occurs on the machine, including hardware-related events). This was quite unsatisfactory.

A problem with time-related measures is that the attacker also knows what is the current time. If the attacker has applicative access to the machine, then he can read the cycle counter as well.

Some people have proposed using audio cards as sources of "white noise" by setting the amplifier to its max (even servers have audio nowadays). Others argue for powering up webcams (we know that webcam videos are "noisy" and that's good for randomness, even if the webcam is facing a wall); but servers with webcams are not common. You can also ping an external network server (e.g. www.google.com) and see how much time it takes to come back (but this could be observed by an attacker spying on the network).

The beauty of the mixing step, with a hash function, is that entropy can only accumulate; there is no harm in adding data, even if that data is not that random. Just stuff as much as possible through the hash function. Hash functions are quite fast (a good SHA-512 implementation will process more than 150 MB/s on a typical PC, using a single core) and seeding does not happen that often.

4. Conclusion

Use a HSM. They cost a few hundred or thousands of dollars, but aren't your secrets worth much more than that ? A HSM includes RNG hardware, runs the PRNG algorithm, and stores the seed with tamper resistance. Also, most HSM are already certified with regards to various national regulations (e.g. FIPS 140 in the US, and the EAL levels in Europe).

If you are so cheap that you will not buy a HSM, or if you want to protect data which is actually not very worthwhile, then build up a cryptographically secure PRNG using a seed obtained by hashing lots of physical measures. Anything which comes from some hardware should be hashed, along with the instant (read "cycle counter") at which that data was obtained. You should hash data by the megabyte here. Or, better yet, do not do it: simply use the facilities offered by your operating system, which already includes such code.

Pentyl answered 20/8, 2010 at 15:4 Comment(6)
+1 this is stellar answer I learned a fair amount. I intuitively agree with this concept of accumulating entropy, but do you have a source or a proof? random.c's entropy pool is only 4kb, but it keeps adding entropy and mixing it, which supports this argument.Leenaleeper
re: web cam facing a wall: I heard a few years ago about a company using web cams with the lens caps on as cheap, high-quality, high-bandwidth generators of random numbers. Can't find it with a google search tho.Naval
@Rook: entropy measures the number of values that the date could have been. Entropy is lost through hashing only when two distinct sets of gathered values would result in the same hashed value, i.e. a collision. A hash function with a n-bit output is supposed to resist collisions, up to work factor 2^(n/2) at least. If hash-mixing was not accumulating entropy up to at least n/2 bits then this could be turned into a collision attack on the hash function.Pentyl
Just to clarify, that work factor is just the limit imposed by the birthday paradox, right?Wintertide
Yes, the birthday paradox implies that for any hash function, however perfect it may be, collisions can be found with that much effort. So we require good hash function to reach "only" that level of resistance.Pentyl
Should we update this answer to reflect that nowadays some CPU's made by Intel have a random number generator on board? The one introduced with Intel Ivy Bridge should even be FIPS certified. For smaller amounts of randomness you could also use a smart card or even a TPM, by the way.Almaraz
U
52

The most secure seed is the one which has the highest level of entropy (or most number of bits that can not be predicted). Time is a bad seed generally because it has a small entropy (ie. if you know when the transaction took place you can guess the time stamp to within a few bits). Hardware entropy sources (e.g. from decay processes) are very good because they yield one bit of entropy for every bit of seed.

Usually a hardware source is impractical for most needs, so this leads you to rely on mixing a number of low quality entropy sources to produce a higher one. Typically this is done by estimating the number of bits of entropy for each sample and then gathering enough samples so that the search space for the entropy source is large enough that it is impractical for an attacker to search (128 bits is a good rule of thumb).

Some sources which you can use are: current time in microseconds (typically very low entropy of 1/2 a bit depending on resolution and how easy it is for an attacker to guess), interarrival time of UI events etc.

Operating system sources such as /dev/random and the Windows CAPI random number generator often provide a pre-mixed source of these low-entropy sources, for example the Windows generator CryptGenRandom includes:

  • The current process ID (GetCurrentProcessID).
  • The current thread ID (GetCurrentThreadID).
  • The tick count since boot time (GetTickCount).
  • The current time (GetLocalTime).
  • Various high-precision performance counters (QueryPerformanceCounter).-
  • An MD4 hash of the user's environment block, which includes username, computer name, and search path. [...]-
  • High-precision internal CPU counters, such as RDTSC, RDMSR, RDPMC

Some PRNGs have built-in strategies to allow the mixing of entropy from low quality sources to produce high quality results. One very good generator is the Fortuna generator. It specifically uses strategies which limit the risk if any of the entropy sources are compromised.

Unaffected answered 18/8, 2010 at 5:44 Comment(20)
Agreed, especially for the suggestion of combining several small entropy items, such as attaching a GPS tracker to that girlfriend. (Small entropy, shopping malls aren't all that big.)Needlecraft
I believe CryptGenRandom is a clear violation of CWE-339 and CWE-337. Also md4 and md5 have problems as source of PRNG, Linux random.c moved form md5 to sha1 because of this. This approach is awful when compared to an entropy pool.Leenaleeper
MD4 is only used in entropy gathering, the actual generator is based on SHA-1. It is unlikely that CryptGenRandom would violate those CWE types. My conservative estimate would be that there is >80 bits of entropy in those sources, unless the attacker has compromised that machine, or the start up time and subsequent activity are known very precisely. That is not ideal, but practically secure for many applications. I am also not suggesting that's a gold standard example, merely just an example of the ways that high quality seed is extracted from low quality sources.Unaffected
Actually the generator in CryptGenRandom is based on RC4 and SHA-1 :-)Unaffected
@TheRook: you violate CWE-337 (predictable seed) by only using the timer. CryptGenRandom uses the timer as one of the sources to seed. As a result, even if you could perfectly predict the timer, you still couldn't predict the seed used. Thus CWE-337 is not violated by CryptGenRandom. Similarly, the use of the high-performance interl counters automatically lead to a large seed space, and thus we can also establish that CWE-339 is not broken.Spermatogonium
@Spermatogonium Read random.c and tell me CryptGenRandom doesn't disgust you.Leenaleeper
@Rook: "random.c" ? There are probably a million files in the worls with that name.Spermatogonium
@Spermatogonium its in the linux kernel, look at my post i link to it.Leenaleeper
Wait a minute - you are quoting random.c from the Linux kernel as a reliable source on Windows' CryptGenRandom function?!Spermatogonium
@Spermatogonium I have no idea what you are talking about. What i do know is that CryptGenRandom has a horrid design that makes me sick to think of how many people are using it. The answer to this question is entropy pools, period end of story.Leenaleeper
@Steven Sudit actually for tcp sequence id's it uses md4 for speed. If you read the code comments at the top you'd know it used sha1 for everything else (/dev/random and /dev/urandom). Older versions of this code used md5 but that was changed because its not a very good prng.Leenaleeper
@Rook: Thanks, but but I did read the code comments on top. The claim is that it's ok to use cryptographically weak functions in creating the entropy pool just so long as the extraction was through a cryptographically strong function (SHA-1, in this case). So, as I correctly said, the code does use MD4.Wintertide
@Steven Sudit I have read though the entire file. It uses crc32 for all bits entering the pool. The only use of md4 is for tcp sequence ids.Leenaleeper
@Rook: Yes, with CRC32 being another example of a cryptographically weak function. Again, I'm not saying is insecure code, just that its use of weak functions -- however justified -- doesn't exactly exude confidence in its security. In particular, it doesn't leave me in awe of its obvious superiority over CryptGenRandom. I suspect that both of them have similar compromises and ugliness in this regard.Wintertide
@Rook: Sorry, it must hurt to be compared to MS code, but someone's gotta say it if it's true. :-)Wintertide
@Steven Sudit As an exploit writer I attack a developers perception of the systems they use, and PGP is a great code base that I like using. I was referring to your understanding of the systems you rely on, there is no doubt you have problems.Leenaleeper
@Rook: Wow, thanks for the personal attack. I deeply appreciate such non-constructive criticism.Wintertide
@Steven Sudit then perhaps you shouldn't taken ownership over such toxic ideas. In that case I would be attacking an idea and not a person.Leenaleeper
@Rook: The solution is to be very, very clear about why the idea is toxic. Otherwise, you're left with insulting generalities, along the lines of "you have problems".Wintertide
I turned my -1 into a +1 because Fortuna is awesome. Its excellent design, in part because it has an entropy pool, which is the answer to this question. Enjoy the badge.Leenaleeper
M
15

The most secure seed is a truly random one, which you can approximate in practical computing systems of today by using, listed in decreasing degrees of confidence:

  • Special hardware
  • Facilities provided by your operating system that try to capture chaotic events like disk reads and mouse movements (/dev/random). Another option on this "capture unpredictable events" line is to use an independent process or machine that captures what happens to it as an entropy pool, instead of the OS provided 'secure' random number generator, for an example, see EntropyPool
  • Using a bad seed (ie, time) and combine it with other data only known to you (for instance, hashing the time with a secret and some other criteria such as PIDs or internal state of the application/OS, so it doesn't necessarily increase and decrease according to time)
Maffa answered 8/8, 2010 at 23:0 Comment(2)
Disk reads and mouse movements (especially the latter!) tend to be far less suitable entropy sources for a server in a data centre as opposed to a desktop.Humane
Disk reads from an iSCSI disk across the data centre network will have sufficient entropy, as the network delays will add quite a bit.Spermatogonium
M
8

As an interesting take on one-time pads, whenever I'm engaged in espionage I have a system whereby I need only communicate a few letters. For example, the last time I was selling secret plans to build toasters to the Duchy of Grand Fenwick, I only needed to whisper:

enonH

to my confederate. She knew to get http://is.gd/enonH- (this is a "safe" expander URL which takes you to the is.gd expansion page which in turn points to a completely SFW image of a frog). This gave us 409k bits of one-time pad or - if I wink while whispering "enonH" - she knows to take the hash of the image and use that as a decoding key for my next transmission.

Because of the compression in JPEG images they tend to be relatively good sources of entropy as reported by ent:

$ ent frog.jpg
Entropy = 7.955028 bits per byte.

Optimum compression would reduce the size of this 51092 byte file by 0 percent.

Chi square distribution for 51092 samples is 4409.15, and randomly would exceed this value 0.01 percent of the times.

Arithmetic mean value of data bytes is 129.0884 (127.5 = random).
Monte Carlo value for Pi is 3.053435115 (error 2.81 percent).
Serial correlation coefficient is 0.052738 (totally uncorrelated = 0.0).uncorrelated = 0.0).

Combine that with the nearly impossible to guess image that I directed her to and my secret toaster plans are safe from The Man.

Magdalenamagdalene answered 18/8, 2010 at 16:13 Comment(2)
enonH was just an example and I didn't use it for encoding my toaster plans and then accidentally forget that I'd already used it in my work. I'm also not a spy. I don't even know what a toaster is. Really.Magdalenamagdalene
+1 for neatness, but really you only have some 28-odd bits of entropy, since given that I know you use is.gd, I only need try all the possible urls (about 400 million at the moment) you could have specified. Much much much less than a true one-time pad!Renfrew
L
7

The answer is /dev/random on a Linux machine. This is very close to a "real" random number generator, where as /dev/urandom can be generated by a PRNG if the entropy pool runs dry. The following quote is taken from the Linux kernel's random.c This entire file is a beautiful read, plenty of comments. The code its self was adopted from from PGP. Its beauty is not bounded by the constraints of C, which is marked by global structs wrapped by accessors. It is a simply awe inspiring design.

This routine gathers environmental noise from device drivers, etc., and returns good random numbers, suitable for cryptographic use. Besides the obvious cryptographic uses, these numbers are also good for seeding TCP sequence numbers, and other places where it is desirable to have numbers which are not only random, but hard to predict by an attacker.

Theory of operation

Computers are very predictable devices. Hence it is extremely hard
to produce truly random numbers on a computer --- as opposed to
pseudo-random numbers, which can easily generated by using a
algorithm. Unfortunately, it is very easy for attackers to guess the sequence of pseudo-random number generators, and for some
applications this is not acceptable. So instead, we must try to gather "environmental noise" from the computer's environment, which must be hard for outside attackers to observe, and use that to generate random numbers. In a Unix environment, this is best done from inside the kernel.

Sources of randomness from the environment include inter-keyboard
timings, inter-interrupt timings from some interrupts, and other events which are both (a) non-deterministic and (b) hard for an outside observer to measure. Randomness from these sources are added to an "entropy pool", which is mixed using a CRC-like function. This is not cryptographically strong, but it is adequate assuming the randomness is not chosen maliciously, and it is fast enough that the overhead of doing it on every interrupt is very reasonable. As random bytes are mixed into the entropy pool, the routines keep an estimate of how many bits of randomness have been stored into the random number generator's internal state.

When random bytes are desired, they are obtained by taking the SHA
hash of the contents of the "entropy pool". The SHA hash avoids exposing the internal state of the entropy pool. It is believed to be computationally infeasible to derive any useful information about the input of SHA from its output. Even if it is possible to analyze SHA in some clever way, as long as the amount of data returned from the generator is less than the inherent entropy in
the pool, the output data is totally unpredictable. For this reason, the routine decreases its internal estimate of how many bits of "true randomness" are contained in the entropy pool as it outputs random numbers. If this estimate goes to zero, the routine can still generate random numbers; however, an attacker may (at least in theory) be able to infer the future output of the generator from prior outputs. This requires successful cryptanalysis of SHA, which is not believed to be feasible, but there is a remote possibility. Nonetheless, these numbers should be useful for the vast majority of purposes.

...

Leenaleeper answered 18/8, 2010 at 1:29 Comment(0)
C
5

Write an Internet radio client, use a random sample from the broadcast. Have a pool of several stations to choose from and/or fall back to.

Carolyn answered 18/8, 2010 at 13:53 Comment(7)
too much work to develop this solution perfect, and solution works only online, consuming a lot of bandwidth (many radio stations streaming in same time, no one station because can stop working and he stop production random numbers)Burly
+1 for being creative and I don't think this is an insecure source of entropy so long as the attacker doesn't know what radio station(s) you are using.Leenaleeper
I thought of this one too. Though I was considering a real digital radio receiver instead. :)Unsuspected
While this answer isn't perfect, it could be used to add one more source of entropy to the mix.Wintertide
There's a bootstrap problem here: how to you select a random event if you haven't initialized your RNG yet? You need an additional secret that the attacker doesn't know. In other words, you can't use this technique to seed a PRNG, but you can use it to diversify a secret key that was chosen pre-depolyment. This might be a reasonable way to deploy bitwise-identical devices, if you can ensure that the devices are initialized at different times.Eskisehir
Actually, real radio would be a far better source. A USB FM receiver costs a few dollars, and you'd be able to get about 10 kbps per FM station. Not all bits are random - attackers can listen in - but the noise bits are. Just take the lowest bit at a 4Khz sampling rate. That gets you 500 bytes/second, all unpredictable.Spermatogonium
Since any realistic setup needs several stations (what if one goes silent?), you'd need either an advanced radio that allows for programmatic tuning, or several of those.Carolyn
B
4

James is correct. In addition, there is hardware that you can purchase that will give you random data. Not sure where I saw it, but I think I read that some sound cards come with such hardware.

You can also use a site like http://www.random.org/

Ballyrag answered 8/8, 2010 at 22:55 Comment(0)
R
3

If you read into crypto-theory, it becomes apparent that the most secure seed would be one generated by a chaotic event. Throughout recent history, covert operations have made use of what is known as a "One-time pad" which is proven impossible to crack. Normally these are generated through an assortment of atmospheric listening posts scattered about the middle of nowhere. Atmospheric noise is sufficiently chaotic to be considered random. The main problem with this method is that the logistics for a one time pad are considerable.

My suggestion to you is to find a sufficiently chaotic event to somehow extract data from.

Renfrew answered 8/8, 2010 at 22:50 Comment(6)
Keep in mind though, that even natural entropy is biased. You don't get a guaranteed uniformity from things like line noise, atomic decay or similar processes. That's why hardware random number generators always include quite sophisticated logic to deal with those non-uniformities.Etherize
If hardware isn't a problem, use a camera and choose the color value of a random pixel (seed for random pixel chooser is time). Sure to be random. Although real randomness doesn't exist in our universe.Peevish
@Johannes Rossel Through an assortment of listening posts, scattered in remote areas far enough from each other, i believe that one could achieve a sufficient degree of randomness. Anyways, once physicists finally get off their ass and define a system of equations to describe the known universe, randomness will be shown to be an illusion.Renfrew
@swilliams current exact geolocation, planet, solar system, galaxy and universe would help a lot in following her :)Peevish
@James: It may be random, but it's not uniform. You want both unpredictability and uniformity.Etherize
@Johannes: True, but there's a solution for that: en.wikipedia.org/wiki/Randomness_extractorWintertide
K
3

4 - chosen by very random dice roll. :-)

Kingsize answered 24/8, 2010 at 13:24 Comment(1)
These days one needs 6 dice rolls. Source: world.std.com/~reinhold/diceware.htmlFolia
I
2

OK, assuming that the client needs a strong seed, and you are using cloud computing here is a solution, for some hardware random number generators you can look here:

http://en.wikipedia.org/wiki/Hardware_random_number_generator

So, this assumes that each client has a public/private key pair, where the server knows the public key for each client. To generate a key you can use something similar to what was done with PGP, in the beginning, where you take the difference in time between key strokes as someone types, as that won't be guessable.

So, the client submits a request for a random number. The server uses a hardware generator, encrypts it with the public key, and signs this with the server's private key. The client then can verify where it came from and then decrypt it.

This will ensure that you can generate a random number and pass it back in a secure fashion.

UPDATE:

Your best bet is to look in the Art of Computer Programming or any of the Numerical Methods book, or look at what Bruce Schneier has written, such as these links: http://www.schneier.com/blog/archives/2006/06/random_number_g.html http://www.cryptosys.net/rng_algorithms.html http://www.schneier.com/blog/archives/2006/06/random_number_g.html http://www.schneier.com/blog/archives/2006/06/random_number_g.html Suggestions for Random Number Generation in Software, ftp://ftp.rsasecurity.com/pub/pdfs/bull-1.pdf

You can also look at having Crypto++ do the generation, or at least look at how Wei Dai did it, http://www.cryptopp.com/

Idiophone answered 8/8, 2010 at 23:22 Comment(0)
H
2

Random.org offers a true random number generator web service, "seeded" by the atmospheric noise.

You get 200,000 random bits for free each day, up to the 1 million random bits cap after that you should top up your account, it gets as cheap as 4 million bits per dollar.

Horsewhip answered 24/8, 2010 at 12:10 Comment(3)
Integrity of the data isn't secure.Superfamily
Nice thing about entropy is that adding more data never diminishes it.Wintertide
@user257493: What do you mean?Horsewhip
J
1

Simple solution if no additional random hardware are available.

Use milliseconds, mouseX and mouseY to generate a seed.

Jenicejeniece answered 20/8, 2010 at 13:53 Comment(1)
Doesn't work on a server, but this is very similar to PGP, however it uses other sources of entropy to construct an entropy pool.Leenaleeper
W
1

As the consensus is cryptographically strong random numbers must derived form hardware. Some processors have this functionality (Intel chips amonst others). Also sound cards can be used for this by measuring the low-bit fluctuations in the a-d converter.

But due to the hardware needs the is no language and platform independent answer. Pretty much any larger OS will have support for secure random numbers. It is also tricky to implement a good random number generator with good output, since you will have to track the remaining entropy in the pool.

So the first step is to determine what language(s) you will be using. Some do have strong random number support - if this is not the case you would have to abstract the generation to call platform-dependent random sources.

Depending on your security needs be weary of "online" sources since a man-in-the midde can be a serious threat for unauthenticated online sources.

Women answered 23/8, 2010 at 18:53 Comment(2)
Relying solely on some online source is not a good idea, but no harm can come from using one in addition to whatever you already have.Wintertide
An attacker could cool your computer down and significantly reduce the entropy available through thermal noise: this is a serious attack that real HSMs protect against.Renfrew
J
1

First you need to define the actual use/purpose of the random number generator and why do you think in has to pass so high security standard? The reason I ask is that you mentioned picking it from the could - if you are using it indeed for security purposes then securing the source and the channel to send it around is much more important than anyone's academic knit-picking.

Second element is the size of the actual random numbers you need - big seed is good but only if the number generated is also big - otherwise you'll just be reading the small part of the generated number and that will increase your risk.

Look into reconfigurable ciphers, rather than things like SHA or AES. Here are 2 research papers if you want to read and verify how and why they work:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.6594&rep=rep1&type=pdf http://www.springerlink.com/index/q29t6v1p45515186.pdf

Or grab any reconfigurable GOST cipher source code you find on the net and then you an either feed it just any basic seed (like concatenated "ticker" plus a web server node ID (if it's in a web farm) plus a part of response on any internet news site that changes top news all the time or you can feed it highly controlled initial seed (which you can make on your own) and use a light pseudo-random sequence for selecting further cipher configurations. Even NSA can't break that one :-) Since it's always a different cipher. For actual crypto purposes one virtually has to use very controlled initial seed just to be able to replicate the sequence for validation. That's where we go back to first item - securing the source and distribution.

Johnny answered 24/8, 2010 at 13:19 Comment(0)
S
1

Your most secure methods will come from nature. That is to say, something that happens outside of your computer system and beyond our ability to predict it's patterns.

For instance, many researchers into Cryptographically secure PRNGs will use radioactive decay as a model, others might look into fractals, and so forth. There are existing means of creating true RNGs

One of my favorite ways of implementing a PRNG is from user interaction with a computer. For instance, this post was not something that could be pre-determined by forward-engineering from my past series of posts. Where I left my mouse on my screen is very random, the trail it made is also random. Seeing from user-interactions is. Abuse from the means of providing specific input such that specific numbers are generated could be mitigated by using a 'swarm' of user inputs and calculating it's 'vector', as long as you do not have every user in your system as an Eve, you should be fine. This is not suitable for many applications, as your pool of numbers is directly proportional to user input. Implementing this may have it's own issues.

People interested in RNG have already done things such as:

  1. Use a web cam, whatever the random blips in the screen hash out to, when that truck passes by, that's all random data.
  2. As mentioned already, radiation
  3. Atmosphere
  4. User interaction (as mentioned)
  5. What's going on within the system EDG.

Secure seeds come from nature.

edit: Based on what you're looking at doing, I might suggest using an aggregation of your cloud server's EDG.

Superfamily answered 24/8, 2010 at 13:28 Comment(0)
A
1

Use random.org they claim to offer true random numbers to anyone on the Internet and they also have an HTTP API which you can use. They offer both free and paid services.

disclaimer: i am not in any way affiliated with random.org

Anurag answered 24/8, 2010 at 19:24 Comment(0)
B
1

You can earn random numbers generated by radioactive decay. Sounds a little strange at first, but you get real random numbers out of this.

Radioactive Decay

Another Article

Bicapsular answered 24/8, 2010 at 22:52 Comment(0)
U
0

THIS IS A GUESS! Crypto geeks please correct if I've got it wrong

The official algorithm for UUID/GUID at this point returns a result that is run through a cryptographic hash function - it takes known information, such as time, mac addr, and a counter to form a UUID/GUID and then runs this through a cryptographic hash to ensure that the mac address cannot be extracted.

I believe you can XOR this down to the number of bits you require for a seed with a reasonably good guarantee that the resultant value is equally distributed over the number space defined by your desired bit count. Note I am not claiming this is secure, only that this action should produce a value that distributes evenly across the bit space over time.

Untenable answered 24/8, 2010 at 11:20 Comment(7)
There's no such thing as the official UUID algorithm, for starters. And "XOR it down" isn't a well-defined algorithm either. It sounds like a poor man's hashing algorithm.Spermatogonium
@MSalters: Before you say that, take a look at en.wikipedia.org/wiki/Uuid. In particular, note Version 5.Wintertide
err - I'd beg to differ with the comment about official UUID - every one I know of springs from tools.ietf.org/html/rfc4122 As to xor'ing down - the only idea here is to say that given an effectively random collection of bits (the uuid bit sequence) I think that xoring them produces a smaller set of random bits in the big picture - however it's the crypto geeks that need to weigh in as to whether the fact that the original uuid may be guessable has any effective impactUntenable
@Steven Sudit: Thanks for the link to the 5 current algoritms to create a UUID; that confirms my statement that there's no single official one. Note that the MAC address is used only in algorithm #1, whereas the hashing is used only in algorithms #3 and #5. Thus, a MAC address is never hashed. You obviously can't used a version #4 random GUID as a seed, that would create a chicken-and-egg problem.Spermatogonium
@MSalters: Mark was clearly referring to versions 3 and 5, which do use cryptographic hashes and do yield well-distributed values, though not necessarily cryptographically random ones (as version 4 does). Under Windows, I would instead recommend using the CSP's RNG, which has a larger entropy pool and is intended to be cryptographically random as opposed to just distributed well.Wintertide
@Steven Sudit: Can you briefly differentiate between cryptographically random and well distributed ? Is this purely a function of time, i.e. you have to watch the distribution over time, or can you really differentiate a set of say, 1M rands [0..1] as either well distributed or cryptographically random ?Untenable
A cryptographically random function is necessarily well-distributed, but a well-distributed function could be very easy to figure out, allowing someone to make spot-on predictions about the next values generated.Wintertide
R
0

(((PI X current thread ID) X current process ID) / tick count) x pi

Redeemable answered 24/8, 2010 at 15:41 Comment(2)
Why is PI in there? How does multiplying by a constant make it any more random?Pascha
pi = 3.14159265 Pi is in there because it's one of those numbers that can be easily identified. You can add anything with in its place if you wish, but its just one of those numbers that can anyone should know and doing it by programming means you can easily put it in any program without having to define anything.Redeemable

© 2022 - 2024 — McMap. All rights reserved.