Jump in prime counting
srdeaton opened this issue · 7 comments
Hi,
Was working with your implementation, doing some detailed calculations. Thought had bug in my code, but tracked it to the prime counting function; specifically to region [110^9,1.110^9]. The start is hard to find, has somewhat gradual increase, but the end is not, a sharper fall. The end jump is in [1.07210^9,1.07410^9] with an error of about 12,400.
Attached some code showing this, and a plot of my function's output which alerted me. For the paper I'm working on, plan to show output for entire computable range; I can skip over this region, but would look strange in plots w/o explanation. Have not decide how to proceed; skip & explain or show & explain. Either way, it's still a good implementation. Don't think there a too many such regions, < 10^16, especially that far off. At least from what I could tell so far.
If you can fix it within the next month, at most, then I can redo calculations in my paper before submission.
Thanks,
Shaun
PrimeCounting_ErrorDeviations.txt
Hi @srdeaton,
Thanks for reporting this. Could you provide your sessionInfo()
(Just type sessionInfo()
in the console)?
Thanks again,
Joseph
TL;DR
This was fixed in version 2.3.0
. N.B. the current version is 2.4.1
. You might want to update R as well. The current version is 4.0.0 (Arbor Day)
.
The Nitty Gritty
Your problem is that you are using a very old version (almost 2 years old). I recall running into the very same problem while I was doing a major overhaul to primeCount
and primeSieve
. The code change was pretty drastic from version 2.2.0
to 2.3.0
. I scoured the commits to see if there were any notes like "Fixed prime count issue around a billion", but was unsuccessful. I should have opened up an issue when I found it (that's on me). Since then, I've tried to be better about documenting bugs fixes and documentation in general.
I will update the NEWS
to reflect this correction as of version 2.3.0
.
I can confirm your results. Below is an example using version 2.2.0
:
packageurl <- "http://cran.r-project.org/src/contrib/Archive/RcppAlgos/RcppAlgos_2.2.0.tar.gz"
install.packages(packageurl, repos=NULL, type="source")
library(RcppAlgos)
M1 <- 0.9 * 10^9
M2 <- 1.1 * 10^9
n <- round(seq(from=M1,to=M2,by=100000))
length(n)
# [1] 2001
plot(n,primeCount_Eval(n),type='l')
Here is an example using version 2.3.0
(I first restarted my R
session):
packageurl <- "http://cran.r-project.org/src/contrib/Archive/RcppAlgos/RcppAlgos_2.3.0.tar.gz"
install.packages(packageurl, repos=NULL, type="source")
library(RcppAlgos)
M1 <- 1.073*10^9
M2 <- 1.0735*10^9
n <- round(seq(from=M1,to=M2,by=200))
length(n)
# [1] 2500
plot(n,primeCount_Eval(n),type='l')
Validate Using Kim Walisch's primecount library
First, we generate some random data:
set.seed(123)
a <- sample(M1:M2, 10000)
write.csv(sort(a), file = "testWithWalisch.csv", row.names = F)
Then, on the command line, we run the Kim Walisch primecount
on all of the values generated above:
touch primecount_walisch_output.txt
while IFS=, read -r col1; do primecount-4/primecount "$col1" >> primecount_walisch_output.txt; done < RcppAlgos/testWithWalisch.csv
Finally, we read the results into R
and validate RcppAlgos::primeCount
on the same input:
fromWalisch <- read.table("../primecount_walisch_output.txt")
dim(fromWalisch)
# [1] 10000 1
head(fromWalisch)
# V1
# 1 54364435
# 2 54364436
# 3 54364441
# 4 54364444
# 5 54364445
# 6 54364447
canon <- fromWalisch$V1
head(canon)
# [1] 54364435 54364436 54364441 54364444 54364445 54364447
algosImpl <- sapply(sort(a), primeCount)
head(algosImpl)
# [1] 54364435 54364436 54364441 54364444 54364445 54364447
identical(canon, algosImpl)
# [1] TRUE
The shout-out is much appreciated, however I would be remiss if I did not mention @kimwalisch. Quite simply, the prime number capabilities in RcppAlgos
would not be possible without his excellent work.
Also, to address your question: "Is there a version of the prime count function for
vector inputs, containing increasing integers, such that the function will
not restart its count for each integer?"
First off, that is a great question. I wondered that myself, and sadly to my knowledge, I don't think it is possible to use Legendre's method (the method used in RcppAlgos::primeCount
) to count primes between two bounds.
I think your best bet is to call primeCount
first then use primeSieve
in chunks. That sounds like what you are already doing.
Anywho, I hope that helps.
Best of luck,
Joseph