R's t-distribution says "full precision may not have been achieved"
Asked Answered
N

2

6

I am working with a problem that routinely needs to compute the density of the t distribution rather far in the tails in R.

For example, using R's t distribution function, dt(1.424781, 1486, -5) returns [1] 2.75818e-10. Some of my final outputs (using this density as an input) do not match a reference value from analogous computations performed in MATLAB by my colleague, which I think may be due to imprecision in the t distribution's tails in R.

If I compare to MATLAB's t distribution function, nctpdf(1.424781, 1486, -5) returns ans = 4.3932e-10, which is quite a bit different than R's output.

edit: R prints two identical warning messages

In dt(1.424781, 1486, -5) : full precision may not have been achieved
in 'pnt{final}'

This is on Mac, R version 3.3.1

Nyberg answered 27/8, 2016 at 17:35 Comment(5)
Ugh.. I am getting the same result as the OP. What he doesn't mention is that the dt in question prints out a warning (two identical warnings for me): "In dt(1.424781, 1486, -5) : full precision may not have been achieved in 'pnt{final}'" This warning is present in this post on qbeta. I using openSuse 13.1 and R 3.3.1 Patched (2016-08-04 r71028). I'll perform an update right now and see if the result changes.Gumbotil
@Hack-R True, it could be the other way around. Any advice on how to check which is the more accurate?Nyberg
@Gumbotil Thanks, the warning makes it much more clear. OK, so, it seems this effects the upper tail only I believe, based on this bugs.r-project.org/bugzilla/show_bug.cgi?id=16680Bryner
@AlexanderEtz Now that we have the warning I think we can just focus on R, but if you wanted to check which one is correct I'd recommend referencing a value from some highly lauded textbook that's been around for many editions.Bryner
@Gumbotil Yea, that sounds good, give it a goBryner
G
5

It appears that the issue comes from the algorithm that R implements for the non-central t-distribution for this case. Two factors combine to produce the result:

  1. the supplied non-centrality parameter, -5, is an "extreme value."

From the Note section of the help file, ?pt,

The code for non-zero ncp is principally intended to be used for moderate values of ncp: it will not be highly accurate, especially in the tails, for large values.

So the algorithm that calculates these values is not intended to calculate extreme values like -5. We can see this by reducing the ncp value to a more moderate level, say -1:

dt(1.424781, 1486, -1)
[1] 0.0211143
  1. The requested value is in the upper tail of the distribution:

The Source section of ?pt says

For the non-central case of pt based on a C translation of

Lenth, R. V. (1989). Algorithm AS 243 — Cumulative distribution function of the non-central t distribution, Applied Statistics 38, 185–189.

This computes the lower tail only, so the upper tail suffers from cancellation and a warning will be given when this is likely to be significant.

For example, the same ncp value, -5 with the negative of the x value returns

dt(-1.424781, 1486, -5)
[1] 0.0006719519

without a warning.

Gumbotil answered 27/8, 2016 at 18:50 Comment(0)
P
3

You could use Boost (available through the BH package) via Rcpp as an alternative:

// [[Rcpp::depends(BH)]]

#include <Rcpp.h>
#include <boost/math/distributions/non_central_t.hpp> 

using namespace boost::math;

// [[Rcpp::export]]
double dnct(const double x, const double df, const double ncp) {
  non_central_t dist(df, ncp);
  return pdf(dist, x);
}


/*** R
dnct(1.424781, 1486, -5)
*/

This returns:

[1] 4.393078e-10

I don't know if Boost or Matlab is more precise here, but results are at least similar. The boost docs give some information regarding precision.

Papyraceous answered 27/8, 2016 at 19:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.