## A Programmer’s Quest for Primes: Part 4 – Special Primes

10 03 2009

Continuing my playing with primes, I now have 1 billion of them sitting on my hard drive. That’s 11GB in a text file, actually split across 10 files of 100 million primes each so they are easier to load in and out of RAM. They were generated in only 30 minutes using the Sieve of Eratosthenes and a modified version of the prime program found here, http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm.

In this post I went through my files finding some statistics on special primes. The graphs below have groups of 1 million primes on the x-axis, and the number of special primes found within that 1 million on the y-axis.

Twin primes: prime numbers that differ by two. For example, 17 and 19. It is conjectured, but not proven, that there are an infinite number of twin primes. Looking at the graph, though, the number of twin primes per million primes keeps falling, and it certainly looks like it heads toward 0, if very slowly. In the first 1 billion primes, there are 58,047,180 twin primes (according to my program’s results at least), or about 5.8% of the total.

Cousin primes: prime numbers that differ by four. Interestingly, the Hardy–Littlewood conjecture from number theory predicts they have about the same distribution as twin primes. My results show it’s quite close: 58,040,262 of them in the first billion primes. The graph is about the same.

Sexy primes: primes that differ by 6. No, I didn’t come up with the name; the Latin word for 6 is sex, and the math people love to name things in Latin and use Greek symbols. There are 105,002,853 of them in the first billion, quite a bit more than twin or cousin primes.

Answer primes: pairs of primes that differ by 42. Okay, I did make this one up. Any sci-fi fans out there will see the reference. There are 22,729,810 of them in the first billion primes. The graph is at least more interesting than the others because 42 is a larger gap, and the distance between primes increases as the primes get bigger.

Some random facts of interest (to weird people like me anyway).

The longest distance between two consecutive primes out of the first billion: 394, which means a run of 393 consecutive composite integers. There are no prime numbers between 22367084959 and 22367085353. If you split the 1 billion primes into blocks of 100 million and then find the longest distance between consecutive primes in each block, you get:

292 between 1453168141 and 1453168433
336 between 3842610773 and 3842611109
354 between 4302407359 and 4302407713
340 between 8605261447 and 8605261787
382 between 10726904659 and 10726905041
354 between 11705477863 and 11705478217
346 between 14066075347 and 14066075693
376 between 16161669787 and 16161670163
372 between 20404137779 and 20404138151
394 between 22367084959 and 22367085353

Since the average distance between primes increases as the primes get bigger, you’d think the max distance in each block would increase more steadily. Such is one of the mysteries of the primes.
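Both kinds of statistic in this post, the counts of primes a fixed distance apart and the largest gap in a block, can be computed in one pass over a sorted list of primes. The following is only a sketch of that idea, not the program used for the numbers above. The names `count_gap_pairs`, `largest_gap` and `prime_gap` are made up for illustration, and it assumes that a pair is counted only when the two primes are adjacent in the list. Note that 64-bit integers are needed, since the primes here go past 22 billion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count adjacent entries of a sorted prime list that
 * differ by exactly `gap` (2 = twin, 4 = cousin, 6 = sexy, 42 = "answer").
 * Assumption: only consecutive primes are compared. */
size_t count_gap_pairs(const uint64_t *primes, size_t n, uint64_t gap)
{
    assert(primes != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 1; i < n; i++) {
        if (primes[i] - primes[i - 1] == gap)
            count++;
    }
    return count;
}

/* The two primes bounding a gap, plus their difference. */
typedef struct {
    uint64_t start;
    uint64_t end;
    uint64_t gap;
} prime_gap;

/* Hypothetical helper: find the largest difference between consecutive
 * primes in the list. Ties keep the first occurrence. Returns all zeros
 * if the list has fewer than two entries. */
prime_gap largest_gap(const uint64_t *primes, size_t n)
{
    assert(primes != NULL || n == 0);
    prime_gap best = {0, 0, 0};
    for (size_t i = 1; i < n; i++) {
        uint64_t d = primes[i] - primes[i - 1];
        if (d > best.gap) {
            best.start = primes[i - 1];
            best.end = primes[i];
            best.gap = d;
        }
    }
    return best;
}
```

Calling `largest_gap` once per 100 million prime block would give a table like the one above, and calling `count_gap_pairs` per group of 1 million primes would give the data points for the graphs.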

## A Programmer’s Quest for Primes: Part 2 – Sieve of Eratosthenes

6 03 2009

This is part 2 of my quest to calculate prime numbers using small and efficient programs written in C. In the first part I wrote a program to find primes by brute force: it tested sequential numbers to see if they were divisible by any of the primes found before. I made the process a bit faster by only checking odd numbers and by stopping once I reached a prime greater than the square root of the number in question.
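For reference, the brute force approach described above can be sketched roughly as follows. This is not the code from part 1, and `first_primes` is a hypothetical name. It also compares `p * p` against the candidate instead of computing a square root, which is an equivalent stopping test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill `primes` with the first `count` primes using
 * trial division by the primes already found. */
void first_primes(uint64_t *primes, size_t count)
{
    if (count == 0)
        return;
    assert(primes != NULL);

    primes[0] = 2;
    size_t found = 1;

    /* Only odd candidates are tested, so division by 2 is never needed. */
    for (uint64_t candidate = 3; found < count; candidate += 2) {
        int is_prime = 1;
        for (size_t i = 1; i < found; i++) {
            uint64_t p = primes[i];
            if (p * p > candidate)      /* past the square root: stop */
                break;
            if (candidate % p == 0) {   /* found a divisor: composite */
                is_prime = 0;
                break;
            }
        }
        if (is_prime)
            primes[found++] = candidate;
    }
}
```

Every candidate costs one modulus operation per prime up to its square root, which is why this gets slow for large counts.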

For my second prime calculating program I make use of an algorithm developed 2,200 years ago by the Greek mathematician Eratosthenes. A friend once told me all the great computer science ideas were developed 20 years ago. I often tend to think it was further back than that.

Division and modulus operations are very costly in terms of CPU cycles. Multiplication and addition are much faster. Rather than checking to see if a number is prime by dividing it by all prime numbers found before it, it’s much faster to generate the multiples of each prime, mark them as composite, and then treat anything left unmarked as prime.

A bit array would be the ideal data structure for this algorithm, but C doesn’t have a bit array structure built in. The simplest alternative is to use the bool datatype (C99) or an unsigned char (ANSI), but either way you’re wasting 7 bits of memory for each element (you only need one bit). While searching for bit arrays in C I found another programmer’s quest for calculating prime numbers here, http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm.
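For the curious, a bit array can be built by hand by packing 8 flags into each unsigned char. This is a minimal sketch, and the helper names (`bit_array_new`, `bit_set`, `bit_clear`, `bit_test`) are made up here rather than taken from the linked program.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate a zeroed array big enough to hold `nbits` one-bit flags.
 * Returns NULL if the allocation fails; the caller must free() it. */
unsigned char *bit_array_new(size_t nbits)
{
    return calloc((nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
}

/* Set flag number `i` to 1. */
void bit_set(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Set flag number `i` back to 0. */
void bit_clear(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    bits[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/* Return 1 if flag number `i` is set, otherwise 0. */
int bit_test(const unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    return (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}
```

With this layout a sieve over N numbers needs about N/8 bytes instead of N, at the cost of a shift and a mask on every access.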

Rather than reinventing the wheel, I just finished up my much simpler version, which uses char arrays to find the first 100 million primes. The code can be found here, http://pherricoxide.jumbofx.com/code/eratosthenesprimecalc.c. After this point it starts running out of RAM though, so to find more primes you’ll have to use a paging or cluster method like the one coded in the troubleshooters.com program linked above.
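To give an idea of the shape of a char array sieve, here is a minimal sketch. It is not the linked program; `sieve_count` is a hypothetical name, and it takes an upper limit on the numbers to test, not a count of primes to find.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: count the primes <= limit with a Sieve of
 * Eratosthenes, using one unsigned char per number as a "composite" flag.
 * Returns 0 if limit < 2 or if the allocation fails. */
size_t sieve_count(size_t limit)
{
    assert(limit < SIZE_MAX);   /* limit + 1 must not overflow */
    if (limit < 2)
        return 0;

    unsigned char *composite = calloc(limit + 1, 1);
    if (composite == NULL)
        return 0;

    /* Cross off the multiples of each prime p, starting at p * p.
     * Smaller multiples were already crossed off by smaller primes.
     * The inner loop uses only addition, no division or modulus. */
    for (size_t p = 2; p * p <= limit; p++) {
        if (composite[p])
            continue;
        for (size_t m = p * p; m <= limit; m += p)
            composite[m] = 1;
    }

    /* Anything left unmarked is prime. */
    size_t count = 0;
    for (size_t i = 2; i <= limit; i++) {
        if (!composite[i])
            count++;
    }

    free(composite);
    return count;
}
```

The array holds one byte for every number up to the limit, which is where the RAM goes.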

Using the brute force program it took over 2 hours to compute the first 100 million primes.

Using this method, it took less than 7 minutes. Quite a difference to say the least. The only downside is that this algorithm uses much more RAM.