Inter_Intra_Variation_Executable/Notes.org at master · MattNolanLab/Inter_Intra_Variation_Executable

Executable document project

Goal

To generate a version of Pastoll et al. 2020 in which code runs to generate the figures.

Format

Insert code blocks into the R markdown document. Stencila will convert this document back into an eLife formatted article document.

Notes

Inserted URL for the article at https://hub.stenci.la/open/ and then downloaded the R markdown document.

Set up a GitHub repository to contain the converted documents as well as project data and functions.

Make sure the ‘bookdown’ package is installed in R.

This enables documents generated with Knit to have functioning links to figures / code chunks. Restart RStudio after installing the bookdown package.

Insert code blocks

Set up a code block at the top of the document. This calls libraries and loads data.
Place the other code blocks near the figures they generate.

Bookdown note

To stop the server run ‘servr::daemon_stop(1)’

To link from text to code for figures

Add ‘\@ref(fig:label)’ where label is the chunk label. This will generate a link when the project is compiled with bookdown (although it will not with Knit). The chunk header also needs to include a figure caption for the link to be generated correctly by bookdown. In bookdown the link is to the figure rather than the code chunk.

Bookdown appears unable to reference code chunks (see here: rstudio/bookdown#238).

For links to work it seems the label can only be text (e.g. no spaces, symbols, etc.).
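
One way to catch problem labels before compiling is to scan the chunk headers. The sketch below assumes the rule given in the bookdown documentation (labels of cross-referenced chunks limited to letters, digits, `/` and `-`), which is slightly more permissive than "text only". `list_bad_labels` is a hypothetical helper name, and chunks whose first token is an option such as `fig.cap=` are treated as unlabelled and skipped.

```shell
# Hypothetical helper: print chunk labels that contain anything other than
# letters, digits, "/" or "-" (assumed bookdown rule for cross-references).
list_bad_labels() {
    # 1. Pull out the first token after "{r " in each chunk header.
    # 2. Trim trailing whitespace.
    # 3. Drop tokens that are chunk options (contain "=") rather than labels.
    # 4. Keep only labels that contain a disallowed character.
    sed -n 's/^```{r[[:space:]]\{1,\}\([^,}]*\).*/\1/p' "$1" |
        sed 's/[[:space:]]*$//' |
        grep -v '=' |
        grep -v '^[A-Za-z0-9/-]*$'
}
```

Usage: `list_bad_labels pastoll-et-al-2020.Rmd` prints one offending label per line, and prints nothing if every label is acceptable.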

Download the Stencila app

See instructions here: https://github.com/stencila/stencila#linux

Run:

curl -L https://raw.githubusercontent.com/stencila/stencila/master/install.sh | bash

Gives the following output:

Installing stencila to home/matt.local/bin/stencila-v0.34.0
🚨 ERROR encoda:puppeteer Chromium does not exist in expected location: home/matt.local/bin/stencila-v0.34.0/node_modules/puppeteer/.local-chromium/linux-768783/chrome-linux/chrome
🛈 INFO stencila Setup complete.
Pointing stencila to home/matt.local/bin/stencila-v0.34.0/stencila

Add stencila to the path using:

export PATH=$PATH:/home/matt/.local/bin/stencila-v0.34.0/stencila

Still reports command not found.
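
A possible cause, assuming the last component of the path added above is the executable itself rather than a directory: PATH entries are searched as directories, so the containing directory would need to be added instead. A minimal sketch (the directory layout is inferred from the installer output, not confirmed):

```shell
# Assumption: the installer placed the binary at
#   $HOME/.local/bin/stencila-v0.34.0/stencila
# PATH entries must be directories, so add the directory that contains it.
STENCILA_DIR="$HOME/.local/bin/stencila-v0.34.0"
export PATH="$PATH:$STENCILA_DIR"

# Report where the shell now finds the command, if anywhere.
command -v stencila || echo "stencila not found in $STENCILA_DIR"
```

The export only affects the current shell session. It would need to go in a startup file such as ~/.bashrc to persist.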

Run the Stencila app

Run with:

stencila convert pastoll-et-al-2020.Rmd article.html --theme elife

Generates a folder containing images but no html file. Output is:

🚨 ERROR encoda:puppeteer Chromium does not exist in expected location: usr/local/bin/node_modules/puppeteer.local-chromium/linux-768783/chrome-linux/chrome
🛈 INFO stencila Setup complete.
⚠ WARN encoda html Properties of `Article` not supported by encode: `editors`, `dateReceived`, `dateAccepted`, `isPartOf`, `licenses`, `keywords`, `fundedBy`, `bibliography`
🚨 ERROR stencila givenNames.map is not a function

Error probably caused by the format of the YAML header (the region marked by ---).

Replace YAML with original. Running stencila convert now generates an html file.

Questions

<2020-08-14 Fri>

I’m not sure that removing the original figures works, at least in the pdf version.

There are two problems here.

In general the originals are more readable (which is presumably the purpose of the pdf). For example, the panels now extend over multiple pages in the pdf. This makes it difficult to read. This may not be such an issue for the html version, but I haven’t seen this yet.
There is potential for confusion where the original figure and the code-generated figure appear superficially different. For example, in some cases the formatting of the figures in the pdf differs slightly from the original. This can be either because I’ve remade the figure in R, or because the figure panels were assembled from R generated panels but had additional text/formatting added.

There are a few issues (minor I think) with running the code in the .Rmd file from your branch.

Neither is fatal but I haven’t had time to look for causes. I’ll merge the pull request once I can figure these out.

`Error: attempt to use zero-length variable name`

The figure is generated but inserted after the first line of the code chunk.

In RStudio the other figures are generated but are partially transparent and have a circling indicator over them.

Is there a style template the pdf is supposed to conform to?

It’s a little unappealing to look at. For example, underlined citations, arbitrarily sized figure panels, figure legends crossing multiple pages.

Some citations are broken.

E.g. Introduction line 4.

There are no page numbers.

There are boxes in the text that state “No output to show”

E.g. Second page.

<2020-07-03 Fri>

Table 1 from the manuscript is missing from the .Rmd file generated by Stencila.

Where to put code used for analyses with results that are shown across multiple panels.

Figure 4D, 4E and Table 1 all present results from the same analysis.

To avoid repetition I’ve placed this analysis in a separate block at the top of the document after the initialisation code. I placed it there because no particular panel has priority.

I could copy the code into each chunk but I find this problematic as someone playing with the code could end up with inconsistent plots across the locations (Figure 4D, 4E and Table 1). Or they could make modifications in one location and then find their changes overwritten when they run the code from a different location. If the code is in one place then at least they will know that they are changing code used in multiple panels.

What is the format for referencing figures?

In Bookdown a figure is referenced with \@ref(fig:label) where label is specified at the start of the code chunk for the figure. The way the Stencila generated .Rmd document references figures, e.g. [Figure 1](#fig1), appears different to Bookdown. It’s not clear here what fig1 is referring to. Perhaps the file name for the image (‘fig1.jpg’)? Also, when compiled to html with Knitr the links don’t work in the Stencila generated .Rmd document (this isn’t a major issue but might be a source of confusion).

Nokome suggests using the following at the start of a code chunk for a figure: {r fig.cap="(ref:figure3g)"}. It’s not clear here how this is referenced from the text part of the document, perhaps using (ref:figure3g)? This appears to differ from the format used in the Stencila generated document.

What to include in the document and how to organise it?

First, should the images from the original version of the manuscript remain in the document?

I think Nokome might be suggesting to remove completely the original figures and instead generate figures by running code at start up. I think this would be very slow. It also has the drawbacks that not all figures could be generated from code, so they would have to be added back as images, and that it’s often helpful to see panels side by side. We could do this from code with packages like cowplot, but this often takes a lot of tweaking to look nice and it could be a lot of work for figures that were not originally made this way.

Second, where should the relevant code chunks be placed and referenced from?

I think what Nokome is suggesting is to insert the code chunks within the legend immediately above the relevant panel label, so the legend for each figure would be broken up by code chunks. I worry this might be difficult to read.

If the original figures remain, then the code chunks could instead be referenced from within the legend, either after the panel label, or at the end of the legend, e.g. using \@ref(fig:label). The code chunk(s) could then be inserted after the legend for the figure.

<2020-06-30 Tue>

When I Knit pastoll-et-al.Rmd the references are given as ???.

Adding, in the header section, the line ‘bibliography: pastoll-et-al-2020.references.bib’ fixes this. Would be nice not to have to do this. Presumably ‘references: pastoll-et-al-2020.references.bib’ is not read by Knitr.

Wish list: it would be great to have an RStudio extension to build an eLife version of the document.

This might work in the same way as Knit or Bookdown commands.

How best to refer to figures?

Goal: At the moment the document links to and loads an image stored in .media. Nokome previously indicated we’d like to replace this with a link to the code. This is option 1 below. I think it has some disadvantages. I’ve suggested an alternative.

Option 1

Link to the code from within the figure legend. Clicking the link for a particular panel runs the code and plots a new version of the panel. The link in the text could reference the code chunk rather than the figure. Advantages: the figures are already visible without seeing code; keeping the original panel could be useful for comparison with panels generated after changing the code; does not require all panels to be converted to code. Disadvantages: end up with multiple versions of the same plot; the original panel may look (or be) different to the code generated panel.

Option 2

Link to the code from within the text. Original panels removed from the document. The link in the text would reference the code block in the same way that figures are referenced in a Bookdown document. Advantages: only one version of each panel; code executable from main text. Disadvantages: delay between clicking the link and seeing the figure could be very long (would not promote readability); because many figures have multiple panels that relate to one another, it would either be necessary to click on links to each panel (e.g. Figure 1A and then Figure 1B) if you want to see both, or if one link runs code for all panels it would be necessary to wait for each panel to be generated, or there would need to be multiple links (e.g. Figure 1, and Figure 1A) or some kind of menu; formats for raw (unprocessed) data in some figure panels are not easily loaded into or viewed in R / Python, e.g. left panels in Figure 2A-C.

<2020-06-06 Sat>

In the document generated by the Stencila converter some of the figures are shown but others are missing. I can manually add them back but I’m not sure how best to do this without breaking the formatting or something else downstream when the document converts back to the publication format. Please advise.
I see that the text for the figure legends is included as a subheading at a level below the section heading. However, this is missing for several of the figures in the converted document. Is there a fix for this that doesn’t involve manually replacing it?
Is there a model for how / where code blocks should be inserted so that they format correctly in the final document?

NB: I envisaged that you would replace the image tags for the figures with the usual Rmd code chunks. As long as the correct identifiers are used to link the figures to their caption, the parser should be able to reconstitute the structured figure.

Will executable figures replace the original figures or be generated alongside them?

NB: They will replace them.

How should code blocks associate with Figures and Tables? E.g. Should I add links to the main text in the same way as in a standard .Rmd document? Or just leave the code blocks as standalone elements?

NB: As above, please use Bookdown convention for linking.

Do we want to execute everything? E.g. Numbers given in the manuscript that come from analyses could be linked directly to the data they come from? Happy to try this but it will add to the time commitment.

NB: This is really up to you. We do support inline code chunks and it would be great to showcase that, but I understand that it will be more work for you. Perhaps, just go for the easiest ones?

Can I refer to analysis functions outside the R markdown document? Will ‘source’ work to run a .R file containing these functions? If so, should I source from the setup code block or somewhere else?

NB: Yes you can use source (obviously this will require that you upload the sourced files to the project so we can include them in the container at runtime). There is a tradeoff however to using source in that it makes that source code less visible to the reader. So maybe the best approach is to put preparatory code in the setup block, and code relevant to generating a particular figure there.

I’m assuming that I should refer to other files using paths relative to the directory containing the .Rmd file. Will this be ok?

NB: Yes, absolutely, they will also get included in the project so they are available at runtime.

Should I load packages from the setup block or somewhere else? Can I source a separate initialisation script to do this?
Is there a quick way to convert the document back to the final format so I can check things are ok as I go along?

As mentioned above, you can use Encoda for this. If you do not have Node.js installed and would prefer a standalone executable let us know and we should be able to prioritize a new release of the Stencila CLI (which includes Encoda).

Minor. Having the bibliography at the top of the .Rmd document is a bit annoying. Will it break things later if I move it? Can it be loaded from a separate document?

To do

Figure 3B

This looks pretty ugly. Update figure labels, etc.
