The only function exported by BioFindr is the findr function itself. Nevertheless, many of the internal functions are useful when digging deeper into the results for specific genes. The package documentation describes all package functions in detail, interleaved with the methods section of the original paper, and gives a good overview of what is available. To illustrate how these functions can be used, we will reproduce the following figure (Supplementary Fig. S1 from the original paper):
Figure: LLR distribution of the relevance test for hsa-miR-200b-3p on 23722 potential targets in the Geuvadis dataset.
Internally, all BioFindr functions use matrix-based inputs and supernormalized data. The easiest way to convert our data is to run supernormalize on the initial data:
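To make concrete what supernormalization does, here is a minimal sketch of a rank-based inverse normal transformation applied column by column. The helper names `rank_inverse_normal` and `supernormalize_sketch`, the rank offset of 0.5, and the toy matrix are assumptions made for illustration. They are not BioFindr's own implementation, which may use a different offset or tie handling. The sketch assumes Distributions.jl is installed.

```julia
using Distributions, Statistics, Random

# Hypothetical helper (not part of BioFindr): map one vector to standard
# normal scores through its ranks, then rescale to mean 0 and variance 1.
function rank_inverse_normal(x::AbstractVector{<:Real})
    n = length(x)
    ranks = invperm(sortperm(x))                   # ordinal ranks 1..n; ties broken by position
    z = quantile.(Normal(), (ranks .- 0.5) ./ n)   # assumed offset of 0.5
    return (z .- mean(z)) ./ std(z; corrected=false)
end

# Apply the transformation to every column of a samples × genes matrix.
supernormalize_sketch(X::AbstractMatrix) = mapslices(rank_inverse_normal, X; dims=1)

Random.seed!(1)
X = exp.(randn(100, 3))        # toy, strongly skewed "expression" data
Z = supernormalize_sketch(X)   # same shape, each column approximately N(0, 1)
```

After the transformation every column has the same set of values and only the ordering of the samples differs, which is why the downstream statistics depend on the data only through ranks.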
Since all log-likelihood ratios are computed from the same summary statistics, a single function computes them all. To compute the log-likelihood ratios for a specific A-gene (here: hsa-miR-200b-3p, with column vector of expression data Ym) and its causal instrument (the best eQTL, with genotype vector E), run:
For the method of moments, the null and real log-likelihood ratio distributions are available in the form of distribution objects, and we can simply evaluate their pdfs on a range of values:
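As a generic illustration of evaluating pdfs on a grid, the sketch below broadcasts `pdf` from Distributions.jl over a range of LLR values. The two `Gamma` distributions are placeholders chosen only to have positive support. They are assumptions, not the distribution objects that BioFindr fits, and the names `dnull_demo` and `dreal_demo` are hypothetical.

```julia
using Distributions

# Placeholder distributions (assumptions, NOT BioFindr's fitted objects).
dnull_demo = Gamma(0.5, 0.01)   # stand-in for a null LLR distribution
dreal_demo = Gamma(1.0, 0.05)   # stand-in for a fitted real LLR distribution

# Grid of LLR values on which to evaluate both densities.
xs = range(0.001, 0.5; length=200)

# Broadcasting `pdf` over the grid gives one density value per grid point.
pnull_demo = pdf.(dnull_demo, xs)
preal_demo = pdf.(dreal_demo, xs)
```

Any object that implements the Distributions.jl interface can be evaluated this way, so the same broadcast pattern should apply to fitted distribution objects.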
Compared to the figure at the top of the page, we see that the method of moments provides a smooth fit to the histogram and consequently also posterior probabilities that increase more smoothly with increasing LLR values.
KDE estimates
For the KDE method, we don’t have a distribution object fitting the histogram. Instead we use kernel density estimation, which returns estimated pdf values at every value of the LLR input vector:
preal_kde = BioFindr.fit_kde(llr4);
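To show what "estimated pdf values at every value of the input vector" means, here is a minimal hand-written Gaussian kernel density estimate using only the standard library. The helper `kde_at_samples`, the normal-reference bandwidth rule, and the toy data are assumptions for illustration. The actual kernel, bandwidth, and any boundary handling inside `BioFindr.fit_kde` may differ.

```julia
using Statistics, Random

# Hypothetical helper (not part of BioFindr): Gaussian KDE evaluated at
# each sample point. Cost is O(n^2), fine for a sketch but slow for large n.
function kde_at_samples(x::AbstractVector{<:Real})
    n = length(x)
    h = 1.06 * std(x) * n^(-1/5)             # normal-reference rule-of-thumb bandwidth
    gauss(u) = exp(-u^2 / 2) / sqrt(2π)      # standard normal kernel
    return [sum(gauss((xi - xj) / h) for xj in x) / (n * h) for xi in x]
end

Random.seed!(42)
llr_demo = abs.(randn(500)) .* 0.05   # toy positive values standing in for LLRs
p_demo = kde_at_samples(llr_demo)     # one density estimate per input value
```

The output has the same length and ordering as the input, so it can be paired element by element with the LLR vector.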
For plotting, we filter a relevant range of values from all vectors:
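A minimal sketch of this filtering step, using a boolean mask on toy vectors. The names `llr`, `p`, `lo`, and `hi` and all values are hypothetical and only illustrate the pattern. Sorting by LLR afterwards is an added convenience so that a line plot connects points from left to right.

```julia
# Toy data: LLR values and a matching vector of density estimates.
llr = [0.40, 0.01, 0.07, 0.002, 0.03, 1.2]
p   = [0.1, 9.0, 2.5, 12.0, 5.0, 0.01]

# Keep only the entries whose LLR falls inside the range of interest.
lo, hi = 0.0, 0.1
keep = (llr .> lo) .& (llr .< hi)

# Apply the same mask to every vector, then order by LLR for plotting.
order = sortperm(llr[keep])
llr_plot = llr[keep][order]
p_plot = p[keep][order]
```

Using one mask for all vectors keeps them aligned, which matters because each density value belongs to a specific LLR value.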