You cannot put references in pdf_output?

Finally, after a long (definitely too long) break, I am back with something hopefully useful. Recently I was working on some report templates for my previous job. Long story short, I wanted to create a bookdown package containing a full report template that would be converted into HTML and PDF files. Since I was working with people who had no knowledge of R, LaTeX, HTML, or other programming stuff, I decided to have a single RMarkdown file that compiles into both HTML and TeX. To be honest, I wasn’t 100% successful, since I couldn’t find a way to put a very specific table in the abstract without some hard-coded LaTeX or HTML syntax, but I will describe that in another post (if I find the time, of course).

One of the most annoying problems I had was that I couldn’t find a way to get proper references in PDF output when using the @ref syntax. All the solutions I found on SO pointed out that, because of how RMarkdown is compiled into LaTeX, @ref is not recognized as a reference indicator. Even worse, when your output is HTML, this problem does not occur. So, how to have your cake and eat it too? That is, how to have a single table imported into an RMarkdown file, with simple RMarkdown syntax, that converts correctly into both PDF and HTML? And what to do if there is more than one table?

Solution

Simply write some functions! But before you do that, you need to make some preparations. Unfortunately I cannot show you the results here because they are only meaningful for PDF output; however, you can simply copy-paste the code and see what happens for yourself.

YAML and preamble.tex

These are the two things that need to be set up first. If you are not familiar with them: YAML is the funny thing between the triple dashes at the beginning of your RMarkdown file (when you use bookdown, the part concerning output options is placed in a file called _output.yml). Place the following code in your Rmd file (or, when using bookdown, the part from pdf_document: to pandoc_args: [ "--csl", "elsevier-harvard.csl"] in _output.yml):

output: 
    pdf_document:
        includes:
            in_header: preamble.tex
        keep_tex: yes
        dev: "cairo_pdf"
        latex_engine: pdflatex
        citation_package: none
        pandoc_args: [ "--csl", "elsevier-harvard.csl"]
bibliography: [pra.bib]
link-citations: yes

You will of course need a pra.bib file, which contains the list of references in BibTeX format. You can easily google how to make such a file. You should also google and download the elsevier-harvard.csl file, which tells pandoc which citation style to use. The third file referenced in the YAML header is preamble.tex. This file indicates which LaTeX packages should be included in the tex file. Place the following lines in it:

\usepackage{hyperref}
\usepackage[backend=bibtex,citestyle=authoryear]{biblatex}
\addbibresource{pra.bib}

Be careful, since for some packages the order in which they are loaded matters. Generally, once these three things are set up, you are ready to place some functions at the beginning of your Rmd file (I usually place them in the first code chunk).
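
Before knitting, it can save some head-scratching to check that the three helper files are really sitting next to the Rmd file. This is only a minimal sketch: missing_files() is a made-up helper name, and the file names are simply the ones used in this post.

```r
# Hypothetical helper: return the names of required files that are not found
# in the given directory.
missing_files <- function(files, dir = '.') {
  files[!file.exists(file.path(dir, files))]
}

needed <- c('preamble.tex', 'pra.bib', 'elsevier-harvard.csl')
absent <- missing_files(needed)
if (length(absent) > 0) {
  message('Missing before knitting: ', paste(absent, collapse = ', '))
}
```

If nothing is reported, all three files were found in the working directory.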

Convert @ref to \citeauthor{x}, \citeyear{x}

# converts a citation key (the @ref without the @) to LaTeX citation syntax
formatingCite <- function(x, hyperlink = TRUE) {
  if (hyperlink) {
    paste0('\\citeauthor{', x, '}, ', '\\hyperlink{ref-', x, '}',
           '{\\citeyear{', x, '}}')
  } else {
    paste0('\\citeauthor{', x, '}, ', '{\\citeyear{', x, '}}')
  }
}

This function converts a string into a LaTeX command in such a straightforward way that only the hyperlink argument needs a comment. It controls whether references in the text should be clickable (i.e. whether they act as hyperlinks to the entry in the References part of the document).
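
To see what the function produces, it can be called on a single key. The snippet below is a small sketch that repeats the definition so it runs on its own; rcore2018 is just the example key used later in this post.

```r
# Same definition as above, repeated so the snippet is self-contained.
formatingCite <- function(x, hyperlink = TRUE) {
  if (hyperlink) {
    paste0('\\citeauthor{', x, '}, ', '\\hyperlink{ref-', x, '}',
           '{\\citeyear{', x, '}}')
  } else {
    paste0('\\citeauthor{', x, '}, ', '{\\citeyear{', x, '}}')
  }
}

linked <- formatingCite('rcore2018')
plain  <- formatingCite('rcore2018', hyperlink = FALSE)

# cat() shows the strings as LaTeX will see them (single backslashes)
cat(linked, '\n')
# \citeauthor{rcore2018}, \hyperlink{ref-rcore2018}{\citeyear{rcore2018}}
cat(plain, '\n')
# \citeauthor{rcore2018}, {\citeyear{rcore2018}}
```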

Convert references in a column

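# converts a whole column of '@key1; @key2' strings to LaTeX syntax
ref_in_table <- function(x, hyperlink = TRUE) {
  x %>%
    gsub('@', '', .) %>%                   # drop the @ signs
    strsplit('; ') %>%                     # one vector of keys per cell
    lapply(formatingCite, hyperlink) %>%   # keys -> LaTeX commands
    lapply(paste, collapse = '; ') %>%     # glue each cell back together
    unlist()
}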

This function is also really simple. First it removes @ from the strings, then it splits the references (they need to be separated with a semicolon followed by a space). Next it applies formatingCite() to each element, i.e. this is the part where every reference is translated into LaTeX syntax. Finally it pastes the strings back together, separated with semicolons, and unlists the result.
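
The pipeline can be followed step by step on a single cell. This is only an illustration in base R (no magrittr needed); the intermediate variable names are made up for the example and are not part of the functions above.

```r
cell <- '@eppo2018; @rcore2018'

# 1. drop the @ signs
no_at <- gsub('@', '', cell)

# 2. split on '; ' into individual keys (strsplit returns a list)
keys <- strsplit(no_at, '; ')[[1]]

# 3. wrap each key in LaTeX commands (non-hyperlinked form, for brevity)
cites <- paste0('\\citeauthor{', keys, '}, ', '{\\citeyear{', keys, '}}')

# 4. glue the pieces back into one cell
result <- paste(cites, collapse = '; ')
cat(result, '\n')

# An empty cell splits into a zero-length vector ...
empty_keys <- strsplit('', '; ')[[1]]
length(empty_keys)
```

Because paste0() treats a zero-length argument as an empty string, a cell without references ends up as a pair of empty cite commands, which is why such cells are replaced with NA when the table is built.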

Using the functions in a document

So how to use these functions to render beautiful references in a table? Below you will find two code chunks that you can paste into your Rmd file. If you have preamble.tex, elsevier-harvard.csl, pra.bib and the correct YAML header, everything should work fine.

library('magrittr')
library('dplyr')

# converts a citation key (the @ref without the @) to LaTeX citation syntax
formatingCite <- function(x, hyperlink = TRUE) {
  if (hyperlink) {
    paste0('\\citeauthor{', x, '}, ', '\\hyperlink{ref-', x, '}',
           '{\\citeyear{', x, '}}')
  } else {
    paste0('\\citeauthor{', x, '}, ', '{\\citeyear{', x, '}}')
  }
}

# converts reference syntax from RMarkdown to LaTeX
# within a whole column
ref_in_table <- function(x, hyperlink = TRUE) {
  x %>%
    gsub('@', '', .) %>%
    strsplit('; ') %>%
    lapply(formatingCite, hyperlink) %>%
    lapply(paste, collapse = '; ') %>%
    unlist()
}

# table with the raw @ref syntax, for comparison
test_table <- data.frame(col1 = c('Value1', 'Value2', 'Value3'),
                         col2 = c(12, 14, 16),
                         references = c('@eppo2018; @rcore2018', '@rcore2018', ''),
                         stringsAsFactors = FALSE)
knitr::kable(test_table, 'latex', escape = FALSE)

test_table2 <- data.frame(col1 = c('Value1', 'Value2', 'Value3'),
                          col2 = c(12, 14, 16),
                          references = c('@eppo2018; @rcore2018', '@rcore2018', ''),
                          stringsAsFactors = FALSE)

# convert the whole references column to LaTeX syntax
test_table2 <- test_table2 %>% mutate(references = ref_in_table(references, TRUE))

# rows without references come out as empty cite commands; replace them with NA
# (first pattern: hyperlink = FALSE, second pattern: hyperlink = TRUE)
test_table2$references[test_table2$references ==
                         '\\citeauthor{}, {\\citeyear{}}' |
                       test_table2$references ==
                         '\\citeauthor{}, \\hyperlink{ref-}{\\citeyear{}}'] <- NA

knitr::kable(test_table2, 'latex', escape = FALSE)
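
Since the goal was a single Rmd file for both outputs, the conversion can also be made conditional on the output format. The following is only a sketch under assumptions: prepare_refs() is a hypothetical helper name, it relies on knitr::is_latex_output() to detect PDF output, and it handles empty cells up front instead of patching them afterwards.

```r
# Hypothetical helper: convert a reference column only when the target is LaTeX.
# `to_latex` defaults to knitr's own detection when knitr is installed.
prepare_refs <- function(df, col = 'references', hyperlink = TRUE,
                         to_latex = requireNamespace('knitr', quietly = TRUE) &&
                           knitr::is_latex_output()) {
  if (!to_latex) return(df)            # leave the @key syntax untouched
  df[[col]] <- vapply(df[[col]], function(cell) {
    if (is.na(cell) || !nzchar(cell)) return('')   # no references in this row
    keys <- strsplit(gsub('@', '', cell), '; ')[[1]]
    link <- if (hyperlink) paste0('\\hyperlink{ref-', keys, '}') else ''
    paste(paste0('\\citeauthor{', keys, '}, ', link, '{\\citeyear{', keys, '}}'),
          collapse = '; ')
  }, character(1), USE.NAMES = FALSE)
  df
}

demo <- data.frame(col1 = c('Value1', 'Value2'),
                   references = c('@eppo2018; @rcore2018', ''),
                   stringsAsFactors = FALSE)

pdf_version  <- prepare_refs(demo, to_latex = TRUE)   # LaTeX commands
html_version <- prepare_refs(demo, to_latex = FALSE)  # unchanged
cat(pdf_version$references[1], '\n')
```

In a chunk this could be combined with something like knitr::kable(prepare_refs(test_table), format = if (knitr::is_latex_output()) 'latex' else 'pipe', escape = FALSE). The 'pipe' format name assumes a reasonably recent knitr, and whether the @key citations are resolved in HTML depends on the table format pandoc receives, so the HTML output should be checked as well.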

Final remarks

I found these functions extremely useful when working with RMarkdown and including many tables with references. I think I should put them in a package.