<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Alessandro Gasparini</title>
<link>https://www.ellessenne.xyz/blog.html</link>
<atom:link href="https://www.ellessenne.xyz/blog.xml" rel="self" type="application/rss+xml"/>
<description>{{&lt; meta description-meta &gt;}}</description>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Sat, 22 Feb 2025 23:00:00 GMT</lastBuildDate>
<item>
  <title>Using simulation to study operating characteristics of stepped wedge cluster-randomised trials</title>
  <link>https://www.ellessenne.xyz/blog/2025/02/23/</link>
  <description><![CDATA[ 




<p>Hi everyone,</p>
<p>We recently published an article in Statistics in Medicine on the <a href="https://onlinelibrary.wiley.com/doi/10.1002/sim.10347">use of joint longitudinal-survival modelling to accommodate informative dropout in longitudinal stepped wedge cluster-randomised trials</a>. It’s the result of many moons of work, and I am extremely pleased that it is finally out - make sure to check it out!</p>
<p>While the paper focuses on the use of joint modelling to obtain unbiased treatment effect estimates, there is an interesting aspect of this work that (I think) is probably a bit underrated. Specifically, we developed an algorithm to simulate data from stepped wedge trials with different outcome types (continuous, binary, count) in the presence of dropout, either informative or not. This is implemented in the <a href="https://github.com/RedDoorAnalytics/simswjm">{simswjm} R package</a>, which is openly available from the <a href="https://github.com/RedDoorAnalytics">Red Door Analytics GitHub page</a>.</p>
<p>Using this implementation of the simulation algorithm, we can use Monte Carlo simulation methods to study the operating characteristics of stepped wedge longitudinal trials in the presence of dropout. For instance, how much efficiency do we lose if some subjects do not complete the trial? What if the dropout rate is different between treated and non-treated participants? Well, we now have the possibility of studying this empirically using these new tools that we developed.</p>
<p>Say you have a stepped wedge trial with a continuous outcome, 4 intervention sequences, 5 periods, and 2 clusters randomised to each intervention sequence. Let’s assume a within-individual intra-class correlation coefficient (ICC) <img src="https://latex.codecogs.com/png.latex?%5Crho_a%20=%200.588"> and a between-individuals ICC <img src="https://latex.codecogs.com/png.latex?%5Crho_d%20=%200.021"> (the same parameters used in the simulation study in the paper). Moreover, let’s assume non-informative dropout at a constant rate, both for simplicity and to highlight a side effect of dropout that arises even when ignoring it does not bias the analysis.</p>
<p>We can now simulate 10,000 trials, where each cluster recruits between 15 and 150 participants, and study the inflation of standard errors in the presence of dropout. Specifically, we can analyse both the complete data (if no dropout occurred) and the actual observed data (where dropout happens) for each trial using the usual extension of the Hussey and Hughes model introduced by <a href="https://link.springer.com/article/10.1186/s13063-015-0840-9">Baio et al.</a> and compare the estimated standard errors of the estimated treatment effect.</p>
<p>If we plot the ratio of these standard errors for each simulated trial, we obtain the following plot:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2025/02/23/simswjm.png" class="img-fluid quarto-figure quarto-figure-center figure-img"></p>
</figure>
</div>
<p>What we can see here is that, irrespective of how many subjects are recruited by each cluster, standard errors are 15–20% larger in the presence of dropout, on average. Given that dropout is not informative, we are not getting biased estimates of the treatment effect (check out the results in the published paper for reference), but we lose efficiency, which affects statistical inference. This kind of simulation is a powerful tool that can be used at the design stage to ensure that study power is preserved if dropout were to happen.</p>
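<p>To make the idea concrete without the full machinery, here is a deliberately simplified sketch in R. It is <em>not</em> a stepped wedge design and it does not use {simswjm}: it assumes a toy two-arm trial with individual randomisation, a continuous outcome analysed with <code>lm()</code>, and outcomes that go missing completely at random. The function name <code>simulate_se_ratio()</code> and all parameter values are made up for illustration.</p>
<pre class="sourceCode r"><code class="sourceCode r">set.seed(20250223)

# Simulate one toy trial and return the ratio of the standard errors of the
# treatment effect: observed data (with dropout) over complete data.
# Function name and default values are hypothetical.
simulate_se_ratio &lt;- function(n = 200, effect = 0.5, p_dropout = 0.3) {
  dat &lt;- data.frame(trt = rbinom(n, size = 1, prob = 0.5))
  dat$y &lt;- effect * dat$trt + rnorm(n)
  # Non-informative dropout: each outcome is missing completely at random
  observed &lt;- runif(n) &gt; p_dropout
  fit_complete &lt;- lm(y ~ trt, data = dat)
  fit_observed &lt;- lm(y ~ trt, data = dat[observed, ])
  se_complete &lt;- coef(summary(fit_complete))["trt", "Std. Error"]
  se_observed &lt;- coef(summary(fit_observed))["trt", "Std. Error"]
  se_observed / se_complete
}

# Repeat over many simulated trials and summarise the inflation
ratios &lt;- replicate(1000, simulate_se_ratio())
mean(ratios)</code></pre>
<p>In this toy setting, with 30% of outcomes missing, the ratio should sit at roughly the square root of 1 / 0.7, that is about 1.2. The stepped wedge case is more involved, but the logic of comparing complete and observed data across simulated trials is the same.</p>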
<p>Obviously, this is somewhat naive and for illustration purposes only - you can of course tweak all the data-generating parameters above and focus on different aspects or operating characteristics of a trial, but you get the idea. This is not rocket science, but I think it shows (once again!) how powerful of a tool statistical simulation can be. As someone much smarter than I am once said, <em>if you can simulate it then you can understand it</em>. Or something like that, I don’t know.</p>
<p>Anyway, remember to check out the <a href="https://onlinelibrary.wiley.com/doi/10.1002/sim.10347">paper</a> and the <a href="https://github.com/RedDoorAnalytics/simswjm">R package</a>, and just get in touch if you have any thoughts on this. You can also replicate the example above using the R script archived <a href="https://gist.github.com/ellessenne/135b1b11e2b7267f3ae62d5566335175">here</a> and posted below:</p>
<script src="https://gist.github.com/ellessenne/135b1b11e2b7267f3ae62d5566335175.js"></script>



 ]]></description>
  <category>rstats</category>
  <category>statistics</category>
  <category>stepped wedge trials</category>
  <category>longitudinal data</category>
  <category>simulation</category>
  <guid>https://www.ellessenne.xyz/blog/2025/02/23/</guid>
  <pubDate>Sat, 22 Feb 2025 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Standardised survival probabilities in R</title>
  <link>https://www.ellessenne.xyz/blog/2024/03/26/</link>
  <description><![CDATA[ 




<p>Hi everyone,</p>
<p>You might already know that a couple of years ago <a href="https://www.bsyr.me/">Betty Syriopoulou</a> published a tutorial paper on standardised survival probabilities, a useful tool to supplement and enhance the reporting of regression models for time-to-event (survival) outcomes.</p>
<p>The background, in short, is that while hazard ratios are ubiquitous in medical research with time-to-event outcomes, they are often misreported and misinterpreted, and they have several well-known limitations. I will not go into much detail, but some relevant references, if you want to read more about this topic, are:</p>
<ul>
<li><a href="https://www.nature.com/articles/s41416-022-01949-6">Betty’s tutorial paper</a> on standardised survival probabilities (of course);</li>
<li>The popular paper by Miguel Hernán on <a href="https://doi.org/10.1097/EDE.0b013e3181c1ea43"><em>The Hazards of Hazard Ratios</em></a>;</li>
<li>The paper by Mats Stensrud and Miguel Hernán questioning <a href="https://jamanetwork.com/journals/jama/article-abstract/2763185"><em>Why Test for Proportional Hazards</em></a>.</li>
</ul>
<p>Standardised survival probabilities provide a useful and easy-to-interpret measure to complement hazard ratios, can be calculated after fitting survival models (and can therefore include covariates to adjust for, e.g., because they are confounders for a certain exposure-outcome association of interest), and can be contrasted to produce standardised survival probability differences (i.e., risk differences) and ratios (i.e., risk ratios). As a bonus, if the confounders that you decide to include in your regression model are sufficient to control for confounding, then the estimated risk differences (or ratios) can be interpreted as population causal effects: <em>nice</em>! More on this causal interpretation <a href="https://link.springer.com/article/10.1007/s10654-018-0375-y">here</a> and <a href="https://link.springer.com/article/10.1007/s10654-016-0157-3">here</a>.</p>
<p>The tutorial paper is easy to follow and illustrates the utility of such measures very nicely, but most importantly, Stata code to reproduce the analyses and use this method in practice is provided in the supplementary material.</p>
<p>Unfortunately, only Stata code was published with the paper, but Betty and I have been talking about porting this to R for a while now, and guess what: today is your lucky day! We have just published on my GitHub profile a repository with R code that can be used to replicate every result from the paper, including every plot that was included in the tutorial. The GitHub repository can be found <a href="https://github.com/ellessenne/standsurv-tutorial-r">here</a> (<a href="https://github.com/ellessenne/standsurv-tutorial-r" class="uri">https://github.com/ellessenne/standsurv-tutorial-r</a>), and the code is released under the MIT license, so you can pretty much do whatever you want with it.</p>
<p>Here is a little preview of what you will be able to accomplish:</p>
<p><img src="https://www.ellessenne.xyz/blog/2024/03/26/05-stdcomp-stddcomp-plot.png" class="img-fluid"></p>
<p>…and there you have it, don’t forget to check out the code and go test this out for your next project, predict standardised probabilities, make some nice plots, complement your hazard ratios, and take your survival analyses to the next level.</p>
<p>Finally, please note that the code in the repository is quite specific to this project and not very general, so you will have to do a little bit of work to get this going, but there should be enough comments to give you all the information you will possibly need to extend it. If you have any questions, feel free to get in touch or just open an issue in the GitHub repository. As always, any feedback is more than welcome.</p>
<p>Before we go, here is one more treat for you, if you are still reading: an R package with a general, easy to use implementation of regression standardisation <em>may or may not</em> be in development. Stay tuned for more… 👀</p>



 ]]></description>
  <category>rstats</category>
  <category>statistics</category>
  <category>survival analysis</category>
  <category>tutorial</category>
  <guid>https://www.ellessenne.xyz/blog/2024/03/26/</guid>
  <pubDate>Mon, 25 Mar 2024 23:00:00 GMT</pubDate>
</item>
<item>
  <title>New {rsimsum} who this?</title>
  <link>https://www.ellessenne.xyz/blog/2024/03/03/</link>
  <description><![CDATA[ 




<p>Hi everyone,</p>
<p>Let’s get straight to business: a new release of the {rsimsum} package just landed on CRAN, and you can install it with</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsimsum"</span>)</span></code></pre></div></div>
<p>It’s quite a large release, more than a year in the making, with lots of new features, a few bug fixes, and a lot of internal housekeeping (which you don’t need to worry about: it should not affect any user-facing behaviour, but please get in touch if it does). I’ll summarise a few things worth highlighting in this blog post; everything else is listed in the <a href="https://ellessenne.github.io/rsimsum/news/index.html">changelog page</a> on the package website.</p>
<p>As always, I am extremely grateful to all users of {rsimsum} who use and break the package and come up with bug reports and feature suggestions: the package is much better because of you. And if you have further bug reports or suggestions, or any feedback on the package really, feel free to get in touch (contact details <a href="../../../../about.html">here</a>) or post on <a href="https://github.com/ellessenne/rsimsum">GitHub</a>.</p>
<section id="relative-bias" class="level2">
<h2 class="anchored" data-anchor-id="relative-bias">Relative bias</h2>
<p>We start with relative bias, which is now a supported summary measure that is reported by default, including Monte Carlo errors. I hinted at this in a <a href="../../../../blog/2022/11/13/index.html">previous blog post</a>, and it’s finally here!</p>
<p>Let’s have a look at this in more detail. Specifically, we use the <code>MIsim</code> dataset, which is bundled with {rsimsum}:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsimsum)</span>
<span id="cb2-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MIsim"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsimsum"</span>)</span>
<span id="cb2-3">s <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb2-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> MIsim,</span>
<span id="cb2-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>,</span>
<span id="cb2-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.50</span>,</span>
<span id="cb2-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb2-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"method"</span>,</span>
<span id="cb2-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ref =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CC"</span></span>
<span id="cb2-10">)</span></code></pre></div></div>
</div>
<p>If we summarise this, we can now pick <code>bias</code> and <code>rbias</code> as performance measures:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(s, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rbias"</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Values are:
    Point Estimate (Monte Carlo Standard Error)

Bias in point estimate:
              CC         MI_LOGT             MI_T
 0.0168 (0.0048) 0.0009 (0.0042) -0.0012 (0.0043)

Relative bias in point estimate:
              CC         MI_LOGT             MI_T
 0.0335 (0.0096) 0.0018 (0.0083) -0.0024 (0.0085)</code></pre>
</div>
</div>
<p>…and that’s it, as easy as that.</p>
<p>Of course, this won’t work if the true value of the estimand is zero, but that is not surprising (as we divide by zero):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb5-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> MIsim,</span>
<span id="cb5-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>,</span>
<span id="cb5-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,</span>
<span id="cb5-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb5-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"method"</span>,</span>
<span id="cb5-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ref =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CC"</span></span>
<span id="cb5-8">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb5-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rbias"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Values are:
    Point Estimate (Monte Carlo Standard Error)

Relative bias in point estimate:
        CC   MI_LOGT      MI_T
 NaN (NaN) NaN (NaN) NaN (NaN)</code></pre>
</div>
</div>
<p>Nonetheless, this provides a useful alternative that allows quantifying bias on a relative scale (compared to the absolute scale of <em>plain</em> bias). If you are looking for more details, formulae are described in the <a href="https://ellessenne.github.io/rsimsum/articles/A-introduction.html">introductory vignette</a> (which has been updated accordingly).</p>
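<p>As a rough sketch of what is being computed, relative bias can be reproduced by hand in a few lines of base R. This assumes the usual definition, i.e. the average of (estimate − true) / true across repetitions, with the Monte Carlo standard error taken as the standard error of that average; the vignette remains the reference for the exact formulae used by {rsimsum}. The helper <code>relative_bias()</code> and the input values are hypothetical.</p>
<pre class="sourceCode r"><code class="sourceCode r"># Relative bias and its Monte Carlo standard error, computed by hand.
# `relative_bias()` is a hypothetical helper, not part of {rsimsum}.
relative_bias &lt;- function(estimates, true) {
  # Error of each repetition, expressed relative to the true value
  rel &lt;- (estimates - true) / true
  c(
    rbias = mean(rel),
    mcse = sd(rel) / sqrt(length(rel))
  )
}

# Three made-up point estimates of a parameter whose true value is 0.50
relative_bias(c(0.45, 0.55, 0.52), true = 0.50)</code></pre>
<p>Applied to each method in <code>MIsim</code> separately, a calculation along these lines should give values close to those printed by <code>summary()</code> above.</p>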
</section>
<section id="nested-loop-plots" class="level2">
<h2 class="anchored" data-anchor-id="nested-loop-plots">Nested loop plots</h2>
<p>Nested loop plots can now accommodate designs that are not fully factorial. Let’s start with a fully factorial design, as in the <code>nlp</code> dataset:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"nlp"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsimsum"</span>)</span>
<span id="cb7-2">s.nlp <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb7-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> nlp,</span>
<span id="cb7-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>,</span>
<span id="cb7-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,</span>
<span id="cb7-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb7-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"model"</span>,</span>
<span id="cb7-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">by =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"baseline"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ss"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"esigma"</span>)</span>
<span id="cb7-9">)</span>
<span id="cb7-10"></span>
<span id="cb7-11"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb7-12"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(s.nlp, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"nlp"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2024/03/03/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This has not changed since the previous release. If, however, we have an incomplete design, missing scenarios will now be left empty:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">nlp.subset <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">subset</span>(nlp, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(nlp<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>ss <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&amp;</span> nlp<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>esigma <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb8-2">s.nlp.subset <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb8-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> nlp.subset,</span>
<span id="cb8-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>,</span>
<span id="cb8-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,</span>
<span id="cb8-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb8-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"model"</span>,</span>
<span id="cb8-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">by =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"baseline"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ss"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"esigma"</span>)</span>
<span id="cb8-9">)</span>
<span id="cb8-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(s.nlp.subset, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"nlp"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2024/03/03/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This should allow using nested loop plots in a wider variety of scenarios. Many thanks to <a href="https://github.com/mikesweeting">Mike Sweeting</a> for the suggestion.</p>
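<p>If you are not sure whether your own design is fully factorial, a quick check in base R is to compare the scenarios you have against the full grid of factor combinations. Here is a minimal sketch, using a made-up two-factor design (the data and the <code>scenario_key()</code> helper are hypothetical and not part of {rsimsum}):</p>
<pre class="sourceCode r"><code class="sourceCode r"># Made-up design: every combination of two factors except (ss = 100, esigma = 2)
design &lt;- expand.grid(ss = c(100, 200), esigma = c(1, 2))
incomplete &lt;- subset(design, !(ss == 100 &amp; esigma == 2))

# Full factorial grid implied by the observed levels of each factor
full &lt;- expand.grid(lapply(incomplete, unique))

# Build one string key per scenario and keep the combinations that are absent
scenario_key &lt;- function(d) do.call(paste, c(d, sep = "|"))
is_missing &lt;- !(scenario_key(full) %in% scenario_key(incomplete))
missing_scenarios &lt;- full[is_missing, , drop = FALSE]
missing_scenarios</code></pre>
<p>Any rows returned are the scenarios that would show up as empty in the nested loop plot.</p>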
</section>
<section id="zipper-plots" class="level2">
<h2 class="anchored" data-anchor-id="zipper-plots">Zipper plots</h2>
<p>Lastly, it is now possible to further customise the appearance of zipper plots; specifically, the horizontal lines that are used to denote the estimated confidence intervals (based on Monte Carlo errors) of coverage probability under each scenario. The default behaviour hasn’t changed, with yellow bands:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(relhaz, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsimsum"</span>)</span>
<span id="cb9-2">relhaz <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">subset</span>(relhaz, relhaz<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>baseline <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Weibull"</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&amp;</span> relhaz<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>model <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"RP(2)"</span>)</span>
<span id="cb9-3">s.zip <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb9-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> relhaz,</span>
<span id="cb9-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"theta"</span>,</span>
<span id="cb9-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb9-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.50</span>,</span>
<span id="cb9-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"model"</span>,</span>
<span id="cb9-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">by =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"n"</span>,</span>
<span id="cb9-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb9-11">)</span>
<span id="cb9-12"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(s.zip, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"zip"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2024/03/03/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>What is nice is that you can now use the <code>zip_ci_colours</code> argument of <code>autoplot()</code> to customise this. If you pass a single colour name (or hex value), that will be used throughout:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(s.zip, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"zip"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">zip_ci_colours =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2024/03/03/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>If you pass two values, those will be used for optimal and sub-optimal coverage, respectively:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(s.zip, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"zip"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">zip_ci_colours =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2024/03/03/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Finally, if you pass three values, those will be used for optimal coverage, under-coverage, and over-coverage, respectively:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(s.zip, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"zip"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">zip_ci_colours =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2024/03/03/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Of course, I would suggest picking better colours for a publication-worthy plot, but you get the gist. Many thanks to <a href="https://github.com/lorenzo-guizzaro">Lorenzo Guizzaro</a> for the suggestion.</p>
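<p>On the topic of better colours, one option could be the colour-blind-friendly Okabe-Ito palette. The sketch below is meant to run on its own, so it assumes the <code>MIsim</code> dataset bundled with {rsimsum} (with a true value of 0.5) rather than the <code>s.zip</code> object used above; the object name <code>okabe_ito</code> and the choice of which colour goes with which kind of coverage are just illustrative assumptions:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(rsimsum)
library(ggplot2)

# Assumption: MIsim (bundled with {rsimsum}) stands in for the data used above
data("MIsim", package = "rsimsum")
s &lt;- simsum(
  data = MIsim, estvarname = "b", se = "se", true = 0.50,
  methodvar = "method", x = TRUE
)

# Okabe-Ito hex values: bluish green, vermillion, blue, used here for
# optimal coverage, under-coverage, and over-coverage, in that order
okabe_ito &lt;- c("#009E73", "#D55E00", "#0072B2")
autoplot(s, type = "zip", zip_ci_colours = okabe_ito)</code></pre></div>
</div>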


</section>

 ]]></description>
  <category>rstats</category>
  <category>rsimsum</category>
  <guid>https://www.ellessenne.xyz/blog/2024/03/03/</guid>
  <pubDate>Sat, 02 Mar 2024 23:00:00 GMT</pubDate>
</item>
<item>
  <title>The blog is back</title>
  <link>https://www.ellessenne.xyz/blog/2024/01/03/</link>
  <description><![CDATA[ 




<p>Hi everyone,</p>
<p>Long time no see, right?</p>
<p>The last post I wrote on this blog is almost a year old, and a lot has happened in the meantime, behind the scenes. I won’t go into details (if you know, you know!), but 2023 has been a tough year for a bunch of different reasons. But <em>we are back, baby!</em>, and this time I will try to contribute content more consistently.</p>
<p>If you have been here before, you’ll notice that things look a bit different. For the keen observers in the audience this won’t be a surprise, but this blog is now powered by the amazing <a href="https://quarto.org/">Quarto</a>, with a custom, hand-crafted theme to bring the old website design into 2024. Gone is the old <a href="https://en.wikipedia.org/wiki/International_orange">International Orange</a>; welcome to <a href="https://create.vista.com/colors/color-names/very-peri/">Very Peri</a>, the Pantone Color of the Year for 2022. The new main typeface is <a href="https://brailleinstitute.org/freefont">Atkinson Hyperlegible</a> for improved accessibility, and <a href="https://fonts.google.com/specimen/Ubuntu+Condensed">Ubuntu Condensed</a> and <a href="https://fonts.google.com/specimen/Fira+Code">Fira Code</a> are used for headers and code, respectively.</p>
<p>I think the new design is great, and I hope you’ll like it too. Please have a look around and let me know if you spot anything broken.</p>
<p>I am keeping this post short and sweet, so that’s it for now, but I’ll be back very soon with some news I have already started working on. In the meantime, hope you had a nice holiday season and a happy new year!</p>



 ]]></description>
  <category>rstats</category>
  <category>quarto</category>
  <guid>https://www.ellessenne.xyz/blog/2024/01/03/</guid>
  <pubDate>Tue, 02 Jan 2024 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Creating dumbbell plots in R</title>
  <link>https://www.ellessenne.xyz/blog/2023/01/10/</link>
  <description><![CDATA[ 




<p>Hi everyone,</p>
<p>Happy new year! I hope you had a relaxing holiday season and that 2023 is treating you well so far.</p>
<p>Well, here’s another treat for you: today we are going to make a <em>dumbbell plot</em> from scratch, using our dear old friend {ggplot2}. Something quick and easy to get going in 2023, but fun nonetheless - and hopefully, useful too. Let’s start by defining what a dumbbell plot actually is:</p>
<blockquote class="blockquote">
<p>A dumbbell plot (also known as a dumbbell chart, or connected dot plot) is great for displaying changes between two points in time, two conditions, or differences between two groups.</p>
<p>Source: <a href="https://www.amcharts.com/demos/dumbbell-plot/">amcharts.com</a></p>
</blockquote>
<p>You might have seen this before in one of the nice visualisations that the OECD publishes from time to time:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/oecd.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:70.0%" alt="OECD Example Dumbbell Plot"></p>
</figure>
</div>
<p>As you can see, this is an intuitive way of showing how a certain metric has changed between two points in time. Let’s get going then, shall we?</p>
<p>For this example, we will use data on monthly step counts that yours truly logged in 2021 and 2022. One of the things I wanted to do more of in 2022, compared to 2021, was walking; did I succeed? Well, we’ll find out soon.</p>
<p>I have been wearing a Garmin watch for the past few years, so I extracted this data from Garmin Connect and stored it in a dataset named <code>dt</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(dt)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 6 × 3
  Month     X2021  X2022
  &lt;fct&gt;     &lt;dbl&gt;  &lt;dbl&gt;
1 January  114171 194624
2 February 118548 223310
3 March    105853 224946
4 April    172499 206213
5 May      158913 246563
6 June     166119 244314</code></pre>
</div>
</div>
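<p>The code that builds <code>dt</code> is not shown here. If you want to follow along, a minimal sketch is to type in the six rows printed above by hand; note that this assumes a base R <code>data.frame</code> rather than a tibble, and that the remaining months are not printed above and are therefore left out:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Assumption: only the six months printed by head(dt) are reproduced here
dt &lt;- data.frame(
  # month.name is a base R constant; using it as levels keeps calendar order
  Month = factor(month.name[1:6], levels = month.name),
  X2021 = c(114171, 118548, 105853, 172499, 158913, 166119),
  X2022 = c(194624, 223310, 224946, 206213, 246563, 244314)
)
head(dt)</code></pre></div>
</div>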
<p>A very simple dataset, not much to see here. Let’s start building our plot: first, we create a <code>ggplot</code> object and put the different months on the vertical axis:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb3-2"></span>
<span id="cb3-3">db_plot <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month))</span>
<span id="cb3-4">db_plot</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p1-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>Not much to see yet. Then, we add a set of points for 2021 data:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">db_plot <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> db_plot <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>))</span>
<span id="cb4-3">db_plot</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p2-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>Same as before, but we add data for 2022:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">db_plot <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> db_plot <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2022, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>))</span>
<span id="cb5-3">db_plot</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p3-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>It’s coming together nicely, isn’t it? Now, we add a segment to join the two sets of points:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">db_plot <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> db_plot <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022))</span>
<span id="cb6-3">db_plot</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>And there you have it. Thank you for reading and… wait! We are not done here, of course - now we need to turn this into a nice plot.</p>
<p>One problem here is that the segment is drawn on top of the data points, which looks ugly. To solve this, we need to rebuild the plot, this time adding the segment geometry first:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2022, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>))</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p5-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
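<p>As an aside, this works because {ggplot2} draws layers in the order in which they are added to the plot, so whatever is added first ends up at the bottom. One way to check the order is to look at the <code>layers</code> element of the plot object. The sketch below uses a small made-up dataset (<code>toy</code>, not the step counts) so that it runs on its own; the segment layer should be listed first:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(ggplot2)

# Hypothetical toy data, only used to illustrate layer ordering
toy &lt;- data.frame(g = c("a", "b"), before = c(1, 2), after = c(3, 5))

p &lt;- ggplot(toy, aes(y = g)) +
  geom_segment(aes(yend = g, x = before, xend = after)) +
  geom_point(aes(x = before, color = "before")) +
  geom_point(aes(x = after, color = "after"))

# Geometry of each layer, in drawing order (the first one is drawn at the bottom)
layer_geoms &lt;- vapply(p$layers, function(l) class(l$geom)[1], character(1))
layer_geoms</code></pre></div>
</div>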
<p>Already better! Then, let’s make the data points larger and change the colour of the <em>bar</em> to grey:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"grey50"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2022, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p6-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>Let’s add a better scale for the horizontal axis; for this, we use the <code>comma()</code> function from the {scales} package:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(scales)</span>
<span id="cb9-2"></span>
<span id="cb9-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"grey50"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2022, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> comma)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p7-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
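<p>To see what <code>comma()</code> does on its own, it can be applied directly to a couple of the step counts from the table above. This is just a small sketch (the <code>steps</code> vector is made up for illustration); <code>label_comma()</code>, also from {scales}, builds a labelling function that formats numbers in the same way:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(scales)

# Two of the monthly step counts shown earlier
steps &lt;- c(114171, 194624)

# Adds a thousands separator and returns a character vector
comma(steps)

# label_comma() returns a function that applies the same formatting
label_comma()(steps)</code></pre></div>
</div>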
<p>Let’s give the axes and the legend proper labels:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"grey50"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2022, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> comma) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb10-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Steps"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Year"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p8-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>Now, the final touches: let’s change the theme of the plot and tidy things up a little:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"grey50"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2022, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> comma) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_bw</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bottom"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot.margin =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">units =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lines"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Steps"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Year"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Monthly Steps Walked, 2022 vs 2021"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/p9-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>We can also simplify the above by turning our data into long format:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyr)</span>
<span id="cb12-2"></span>
<span id="cb12-3">dt_long <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pivot_longer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dt, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">cols =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">starts_with</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"X"</span>))</span>
<span id="cb12-4">dt_long<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>name <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(dt_long<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>name, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">levels =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"X2021"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"X2022"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022"</span>))</span>
<span id="cb12-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(dt_long)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 6 × 3
  Month    name   value
  &lt;fct&gt;    &lt;fct&gt;  &lt;dbl&gt;
1 January  2021  114171
2 January  2022  194624
3 February 2021  118548
4 February 2022  223310
5 March    2021  105853
6 March    2022  224946</code></pre>
</div>
</div>
<p>The required code is very similar to what was used above, but we can now easily modify the colour palette too, and improve the title using {ggtext}:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggtext)</span>
<span id="cb14-2"></span>
<span id="cb14-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(dt, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> Month)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_segment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> Month, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> X2021, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> X2022), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"grey50"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dt_long, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> value, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> name), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> comma) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_manual</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">values =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#F5DF4D"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#6667AB"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_bw</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(</span>
<span id="cb14-10">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"none"</span>,</span>
<span id="cb14-11">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot.title =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">element_markdown</span>(),</span>
<span id="cb14-12">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot.margin =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">units =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lines"</span>)</span>
<span id="cb14-13">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(</span>
<span id="cb14-15">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Steps"</span>,</span>
<span id="cb14-16">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>,</span>
<span id="cb14-17">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Year"</span>,</span>
<span id="cb14-18">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Monthly Steps Walked,</span></span>
<span id="cb14-19"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;span style='color:#6667AB;'&gt;2022&lt;/span&gt;</span></span>
<span id="cb14-20"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    vs</span></span>
<span id="cb14-21"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;span style='color:#F5DF4D;'&gt;2021&lt;/span&gt;"</span></span>
<span id="cb14-22">  )</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2023/01/10/index_files/figure-html/plong-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<blockquote class="blockquote">
<p>And yes, for all you colour nerds out there: the two hex codes are Pantone’s colours of the year for 2021 and 2022, “Illuminating” and “Very Peri”. Fitting, right?</p>
</blockquote>
<p>And yes, <em>I did walk more in 2022 compared to 2021</em>, it turns out! Well, except for October, but to be fair, I did run 120 km that month in 2021 compared to a shameful 0 km in 2022, so it could have been worse…</p>
<p>So there you have it: a short tutorial on building a dumbbell plot from scratch using {ggplot2} and other freely available tools, and making it <em>nice</em> (subjectively, of course). Other options for making dumbbell plots in R <em>do exist</em>, such as the <a href="https://CRAN.R-project.org/package=ggalt">{ggalt}</a> package - make sure you check that out too. And until next time, take care!</p>



 ]]></description>
  <category>rstats</category>
  <category>ggplot2</category>
  <guid>https://www.ellessenne.xyz/blog/2023/01/10/</guid>
  <pubDate>Mon, 09 Jan 2023 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Monte Carlo errors for relative bias</title>
  <link>https://www.ellessenne.xyz/blog/2022/11/13/</link>
  <description><![CDATA[ 




<p>Hi everyone,</p>
<p>Today we will be talking about simulation studies and Monte Carlo errors. Yeah, I know, you heard it all before, but trust me… there will be something interesting in here, I swear!</p>
<p>Let’s start with the basics: Monte Carlo simulation is a stochastic procedure, thus for each run of a simulation study, we are likely to get <em>slightly</em> different results. And that’s fine! However, we want to be able to reproduce simulation results. Sure, we could <em>set the seed of the random number generator</em>, but what if we want the results to be similar regardless of what seed we set?</p>
<blockquote class="blockquote">
<p>This concept is often referred to as <em>statistical reproducibility</em>: in other words, we want to minimise simulation error for results to be similar across replications.</p>
</blockquote>
<p>This simulation error is often referred to as <em>Monte Carlo error</em>. And as you might already know, the <a href="https://CRAN.R-project.org/package=rsimsum">{rsimsum}</a> package can calculate Monte Carlo error for a variety of performance measures, such as bias.</p>
<p>However, what if you want to calculate Monte Carlo error for <em>any</em> potential performance measure out there? Well, today is your lucky day: as suggested by <a href="https://onlinelibrary.wiley.com/doi/10.1002/sim.4067">White, Royston, and Wood (2011)</a> in the setting of multiple imputation, we can calculate the Monte Carlo error by applying a jackknife procedure to our simulation results. That is, the Monte Carlo error for any performance measure will be the standard error of the mean of the pseudo values for that statistic, computed by omitting one simulation repetition at a time.</p>
<p>Throughout this post, I will introduce all the above concepts, and I will show you how to use the jackknife to compute the Monte Carlo standard error for relative bias. We will also validate our calculations, which is always a good thing to do!</p>
<section id="jackknife-resampling" class="level2">
<h2 class="anchored" data-anchor-id="jackknife-resampling">Jackknife resampling</h2>
<p>The jackknife technique is, <em>technically speaking</em>, a cross-validation technique and thus it can be considered to be a form of resampling. Specifically, given a sample of size <em>n</em>, a jackknife estimator can be built by aggregating the parameter estimates from each subsample of size <em>(n - 1)</em>, where the <em>i</em><sup>th</sup> observation is omitted in turn. Loosely speaking, this is not far from the concept of <a href="https://link.springer.com/referenceworkentry/10.1007/978-0-387-30164-8_469">leave-one-out</a> cross-validation.</p>
<p>If we define a given statistic θ, for a sample of size <em>n</em>, we can calculate <em>n</em> jackknife replicates, denoted with θ<sub><em>i</em></sub> for <em>i</em> = 1, …, <em>n</em>, each computed by omitting the <em>i</em><sup>th</sup> observation. Then, we can calculate the mean and the variance (and thus standard error) of such jackknife replicates, and there you go.</p>
<p>I know it is a short and <em>not comprehensive at all</em> description: if you want to read more about this, check the <a href="https://en.wikipedia.org/wiki/Jackknife_resampling">Wikipedia page on jackknife resampling</a> and <a href="https://www.jstor.org/stable/2335441">this paper by Bradley Efron</a>.</p>
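<p>To make this a bit more concrete, here is a minimal sketch using toy data (the vector <code>y</code> and all object names below are made up for illustration, they are not part of the analysis in this post). It computes the leave-one-out replicates of the sample mean and then the usual jackknife standard error, that is, the square root of (<em>n</em> - 1) / <em>n</em> times the sum of squared deviations of the replicates from their mean. For the sample mean, this should coincide with the familiar <code>sd(y) / sqrt(n)</code>, which makes for a handy sanity check:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Toy data, assumed for illustration only
set.seed(1)
y &lt;- rnorm(50)
n &lt;- length(y)

# Leave-one-out (jackknife) replicates of the sample mean
theta_loo &lt;- vapply(
  X = seq_len(n),
  FUN = function(i) mean(y[-i]),
  FUN.VALUE = numeric(1)
)

# Jackknife standard error
se_jk &lt;- sqrt(((n - 1) / n) * sum((theta_loo - mean(theta_loo))^2))

# For the sample mean, this matches the analytical standard error
all.equal(se_jk, sd(y) / sqrt(n))</code></pre></div>
</div>
<p>The appeal of the jackknife is that the same recipe applies when no simple analytical formula is available: only the function passed to <code>vapply()</code> needs to change.</p>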
</section>
<section id="relative-bias" class="level2">
<h2 class="anchored" data-anchor-id="relative-bias">Relative bias</h2>
<p>We now move on to relative bias. First, let’s define bias: according to <a href="https://doi.org/10.1002/sim.8086">Morris, White, and Crowther</a>, that is defined (for a certain estimand θ) as: <img src="https://latex.codecogs.com/png.latex?%0AE%5B%5Chat%7B%5Ctheta%7D%5D%20-%20%5Ctheta%0A"> and estimated by <img src="https://latex.codecogs.com/png.latex?%0A%5Cfrac%7B1%7D%7Bn%7D%20%5Csum_i%20%5Chat%7B%5Ctheta%7D_i%20-%20%5Ctheta,%0A"> assuming <em>n</em> repetitions.</p>
<p>Anecdotally, bias is computed (and reported) in most simulation studies in the literature. However, its magnitude depends on the magnitude of θ, so it is not necessarily easy to compare across studies with different data-generating mechanisms. That’s why it is sometimes interesting to compute relative bias, usually defined as <img src="https://latex.codecogs.com/png.latex?%0A%5Cfrac%7BE%5B%5Chat%7B%5Ctheta%7D%5D%20-%20%5Ctheta%7D%7B%5Ctheta%7D%0A"> and estimated by <img src="https://latex.codecogs.com/png.latex?%0A%5Cfrac%7B1%7D%7Bn%7D%20%5Csum_i%20%5Cfrac%7B%5Chat%7B%5Ctheta%7D_i%20-%20%5Ctheta%7D%7B%5Ctheta%7D%0A">. The obvious downside is that relative bias, as defined here, cannot be calculated when the true θ = 0, but that’s beyond the scope of this blog post.</p>
<p>Let’s now illustrate how to use the jackknife to calculate the Monte Carlo standard error of relative bias. To do so, we will use a dataset that comes bundled with {rsimsum} for illustration purposes:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsimsum)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MIsim"</span>)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str</span>(MIsim)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>tibble [3,000 × 4] (S3: tbl_df/tbl/data.frame)
 $ dataset: num [1:3000] 1 1 1 2 2 2 3 3 3 4 ...
 $ method : chr [1:3000] "CC" "MI_T" "MI_LOGT" "CC" ...
 $ b      : num [1:3000] 0.707 0.684 0.712 0.349 0.406 ...
 $ se     : num [1:3000] 0.147 0.126 0.141 0.16 0.141 ...
 - attr(*, "label")= chr "simsum example: data from a simulation study comparing 3 ways to handle missing"</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">MIsim <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">subset</span>(MIsim, MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>method <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CC"</span>)</span></code></pre></div></div>
</div>
<p>Remember that, for this specific dataset, the true value of the estimand is θ = 0.5. We will also be using a single method (in this case, <code>CC</code>) for simplicity, but this can obviously be generalised. Bias can be easily computed using the <code>simsum()</code> function:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">sb <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> MIsim, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>)</span>
<span id="cb4-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tidy</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(sb, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>  stat        est        mcse       lower      upper
1 bias 0.01676616 0.004778676 0.007400129 0.02613219</code></pre>
</div>
</div>
<p>Let’s now calculate relative bias, by hand, together with bias (to compare with the output of <code>simsum()</code>):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(</span>
<span id="cb6-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bias =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,</span>
<span id="cb6-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">relative_bias =</span> (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb6-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>        bias relative_bias
1 0.01676616    0.03353232</code></pre>
</div>
</div>
<p>We get the same results: good! Then, let’s use the jackknife to calculate the Monte Carlo standard error.</p>
<p>First, let’s calculate all jackknife (leave-one-out) replicates and plot their distribution:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">jk.estimate <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(</span>
<span id="cb8-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">X =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq_along</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b),</span>
<span id="cb8-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b[<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>x] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>), <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Relative bias</span></span>
<span id="cb8-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN.VALUE =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">numeric</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-5">)</span>
<span id="cb8-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hist</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> jk.estimate, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xlab =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Jackknife Replicates"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">main =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/11/13/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Then, let’s calculate the jackknife standard error, using the following formula:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Ctext%7BSE%7D_%7B%5Ctext%7Bjackknife%7D%7D%20=%20%5Csqrt%7B%5Cfrac%7Bn%20-%201%7D%7Bn%7D%20%5Csum_i%5En%20%5Cleft%5B%20%5Chat%7B%5Ctheta%7D_i%20-%20%5Cbar%7B%5Chat%7B%5Ctheta%7D%7D%20%5Cright%5D%5E2%7D%0A"></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(jk.estimate)</span>
<span id="cb9-2">relative_bias_mcse <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(((n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> n) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>((jk.estimate <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(jk.estimate))<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-3">relative_bias_mcse</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 0.009557351</code></pre>
</div>
</div>
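<p>As a quick cross-check (a sketch of mine, not something taken from the {rsimsum} documentation): relative bias is just bias divided by the constant θ, so its Monte Carlo standard error should equal the standard error of the mean of the estimates, divided by θ. The environment <code>e</code> and the object <code>cc</code> are illustrative names; the data are loaded into a separate environment so that the <code>MIsim</code> object used throughout this post is left untouched.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(rsimsum)

# Load a fresh copy of the data into its own environment
e &lt;- new.env()
data("MIsim", package = "rsimsum", envir = e)
cc &lt;- subset(e$MIsim, method == "CC")

theta &lt;- 0.5 # True value of the estimand, as stated above

# Standard error of the mean of the estimates, rescaled by theta
sd(cc$b) / sqrt(nrow(cc)) / theta</code></pre></div>
</div>
<p>If the reasoning holds, this should agree with the jackknife value obtained above, up to numerical precision.</p>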
<p>…and there you have it! We can also validate this procedure by calculating the jackknife Monte Carlo standard errors for bias:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">jk.estimate <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(</span>
<span id="cb11-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">X =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq_along</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b),</span>
<span id="cb11-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b[<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>x] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>), <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Bias</span></span>
<span id="cb11-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN.VALUE =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">numeric</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb11-5">)</span>
<span id="cb11-6">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(jk.estimate)</span>
<span id="cb11-7">bias_mcse <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(((n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> n) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>((jk.estimate <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(jk.estimate))<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb11-8">bias_mcse</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 0.004778676</code></pre>
</div>
</div>
<p>Recall the results from <code>simsum()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tidy</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(sb, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>  stat        est        mcse       lower      upper
1 bias 0.01676616 0.004778676 0.007400129 0.02613219</code></pre>
</div>
</div>
<p>Once again, this is exactly the same, which is good.</p>
</section>
<section id="actually" class="level2">
<h2 class="anchored" data-anchor-id="actually">Actually…</h2>
<p>It turns out, we <em>don’t</em> actually need the jackknife to obtain Monte Carlo standard errors (MCSE) for relative bias (RB). It can be shown (as usual, <em>exercise left to the reader</em>) that we can obtain a closed-form formula:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Ctext%7BMCSE%7D_%7B%5Ctext%7BRB%7D%7D%20=%20%20%5Csqrt%7B%5Cfrac%7B1%7D%7Bn%20(n%20-%201)%7D%20%5Csum_i%5En%20%5Cleft%5B%20%5Cfrac%7B%5Chat%7B%5Ctheta%7D_i%20-%20%5Ctheta%7D%7B%5Ctheta%7D%20-%20%5Cwidehat%7B%5Ctext%7BRB%7D%7D%20%5Cright%5D%5E2%7D,%0A"></p>
<p>where RB hat is the estimated relative bias computed using the estimator above. Let’s compare this with the jackknife estimate. First, recall the estimated MCSE using the jackknife:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1">relative_bias_mcse</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 0.009557351</code></pre>
</div>
</div>
<p>Using the formula above:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b)</span>
<span id="cb17-2">rb <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb17-3"></span>
<span id="cb17-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(((MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> rb)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 0.009557351</code></pre>
</div>
</div>
<p>…and they’re the same. Cool!</p>
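<p>If you are wondering why the match is exact rather than approximate: relative bias is just a sample mean of the rescaled estimates, and for a sample mean the jackknife standard error reduces algebraically to the usual standard error of the mean. Here is a minimal sketch of that equivalence. Note that <code>est</code> and <code>theta</code> are hypothetical stand-ins that I simulate on the spot (they are <em>not</em> the <code>MIsim</code> data used above), so the numbers will differ from those in this post:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical stand-in for a vector of simulation estimates (not MIsim)
set.seed(1)
theta &lt;- 0.5
est &lt;- rnorm(200, mean = 0.52, sd = 0.1)
n &lt;- length(est)

# Jackknife: leave-one-out relative bias, then the usual jackknife SE
jk &lt;- vapply(
  X = seq_len(n),
  FUN = function(i) mean((est[-i] - theta) / theta),
  FUN.VALUE = numeric(1)
)
mcse_jk &lt;- sqrt(((n - 1) / n) * sum((jk - mean(jk))^2))

# Closed-form formula for the MCSE of relative bias
rb &lt;- mean((est - theta) / theta)
mcse_cf &lt;- sqrt(sum(((est - theta) / theta - rb)^2) / (n * (n - 1)))

# Equivalent shortcut: standard deviation of the estimates, rescaled
mcse_sd &lt;- sd(est) / (abs(theta) * sqrt(n))

all.equal(mcse_jk, mcse_cf)
all.equal(mcse_cf, mcse_sd)</code></pre></div>
</div>
<p>Both comparisons should return <code>TRUE</code>. The last line also suggests that the MCSE for relative bias is simply the MCSE for bias divided by the absolute true value, which is consistent with the two jackknife results above (0.004778676 / 0.5 ≈ 0.00956).</p>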
<p>If you are still not convinced, let me show you something else. What is another method that one could use to estimate standard errors <em>for pretty much any statistic you can think of</em>? Well, of course, it’s our best friend the bootstrap!</p>
<p>Let’s use non-parametric bootstrap to calculate the standard error of our relative bias estimator. For that, we use the {boot} package in R:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(boot)</span>
<span id="cb19-2"></span>
<span id="cb19-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">387456</span>)</span>
<span id="cb19-4">rbfun <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(data, i, .true) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>((data[i] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> .true) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> .true)</span>
<span id="cb19-5">bse <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">boot</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> MIsim<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>b, rbfun, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.true =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">R =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50000</span>)</span></code></pre></div></div>
</div>
<p>Note that <code>rbfun()</code> is a function we define to calculate relative bias, that we use R = 50000 bootstrap samples to ensure convergence of the procedure, and that we set a seed (for reproducibility). The results of the bootstrap are printed below:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1">bse</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = MIsim$b, statistic = rbfun, R = 50000, .true = 0.5)


Bootstrap Statistics :
      original        bias    std. error
t1* 0.03353232 -4.320035e-05 0.009571821</code></pre>
</div>
</div>
<p>Once again: pretty close, not exactly the same (as the bootstrap is still a stochastic procedure), but close enough that we can be confident that they converge to the same value.</p>
<p>Finally, let me reassure you that the bootstrap has converged (and, <em>side note</em>, you should remember to do that too when you use it):</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb22" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb22-1">cmsds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(</span>
<span id="cb22-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">X =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq_along</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(bse<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>t)),</span>
<span id="cb22-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(i) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sd</span>(bse<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>t[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>i]),</span>
<span id="cb22-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN.VALUE =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">numeric</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb22-5">)</span>
<span id="cb22-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(</span>
<span id="cb22-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> cmsds,</span>
<span id="cb22-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"l"</span>,</span>
<span id="cb22-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ylab =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bootstrap SE"</span>,</span>
<span id="cb22-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xlab =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bootstrap Iteration"</span></span>
<span id="cb22-11">)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/11/13/index_files/figure-html/unnamed-chunk-12-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>…looks like the bootstrap converged, and that, probably, we did not need so many bootstrap samples after all. Luckily computations were so inexpensive that we could afford it (the whole thing took just a few seconds on my laptop), so that’s ultimately fine.</p>
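<p>As a side note, the convergence check above calls <code>sd()</code> on an ever-growing vector, which is fine here but scales quadratically with the number of bootstrap samples. A possible alternative, sketched below under the assumption that all you need is the running standard deviation, is to compute it in a single pass using cumulative sums. The helper <code>running_sd()</code> is my own hypothetical name (it is not part of {boot}), and the test vector is simulated:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Running standard deviation of x[1:k] for every k, in a single pass
running_sd &lt;- function(x) {
  k &lt;- seq_along(x)
  xc &lt;- x - mean(x) # centring does not change the SD, but helps numerically
  ss &lt;- cumsum(xc^2) - cumsum(xc)^2 / k # running sum of squared deviations
  out &lt;- sqrt(pmax(ss, 0) / (k - 1))
  out[1] &lt;- NA_real_ # sd() of a single value is NA
  out
}

# Check against the brute-force definition on a small hypothetical vector
set.seed(2)
x &lt;- rnorm(500)
brute &lt;- vapply(seq_along(x), function(i) sd(x[1:i]), numeric(1))
all.equal(running_sd(x), brute)</code></pre></div>
</div>
<p>With the bootstrap object from above, <code>running_sd(as.numeric(bse$t))</code> could then be passed to <code>plot()</code> in place of <code>cmsds</code>.</p>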
</section>
<section id="wrap-up" class="level2">
<h2 class="anchored" data-anchor-id="wrap-up">Wrap-up</h2>
<p>And there you have it, examples of how to use the jackknife (and the bootstrap!) to estimate standard errors for a given metric of interest. Time to go wild and apply this to your settings!</p>
<p>In conclusion, I think these are very powerful tools that every statistician should be at least familiar with; there is plenty of literature on the topic, let me know if you’d like some references. Hope you learned something from this, I sure did by writing up all of this – and if you are still reading, here’s some breaking news:</p>
<blockquote class="blockquote">
<p>Relative bias with Monte Carlo errors will be available in the next release of {rsimsum}! Coming soon to your nearest CRAN server…</p>
</blockquote>
<p>Thanks for reading, and until next time, take care!</p>


</section>

 ]]></description>
  <category>rstats</category>
  <category>simulation</category>
  <category>jackknife</category>
  <category>rsimsum</category>
  <guid>https://www.ellessenne.xyz/blog/2022/11/13/</guid>
  <pubDate>Sat, 12 Nov 2022 23:00:00 GMT</pubDate>
</item>
<item>
  <title>R on a Raspberry Pi Zero W</title>
  <link>https://www.ellessenne.xyz/blog/2022/07/26/</link>
  <description><![CDATA[ 




<p>If you know me, you should know that I am obsessed with tiny computers and that some time ago I got myself a <a href="../../../../2020/06/weekly-blog-news/">Raspberry Pi Zero W</a> on which I run <a href="https://pi-hole.net">Pi-hole</a> to block ads and tracking within our home network. How tiny is a Pi Zero W, you might wonder? Well, here it is:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/07/26/DSCF8281-pp.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:70.0%" alt="A Pi Zero W."></p>
</figure>
</div>
<p>And yes, that is a 13-inch MacBook Pro in the back and that’s a microSD card for storage:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/07/26/DSCF8284-pp.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:70.0%" alt="A Pi Zero W."></p>
</figure>
</div>
<p>I mean, look how tiny and cute it looks in its red-and-white case:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/07/26/DSCF8285-pp.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:70.0%" alt="A Pi Zero W."></p>
</figure>
</div>
<p>The <em>whole</em> case is barely thicker than the MacBook Pro itself, while the board is exactly 65 by 30 mm. It is, <em>of course</em>, not a powerhouse, being powered by a single-core, 1 GHz, 32-bit CPU with 512 MB of RAM, but that is more than enough for Pi-hole and some tinkering. And did I mention that it costs only $10? No? Well, isn’t that great?</p>
<p>Anyway, what you might not know is that, since the Pi runs a full Debian-based distribution, you can install a variety of software using the <code>apt</code> package manager. That includes R, although the packaged version often lags behind the latest CRAN release (I could only install R 3.5.2 using <code>apt</code>).</p>
<p>Interestingly, a couple of days ago while I was browsing the schedule for <a href="https://www.rstudio.com/conference/2022/schedule/">rstudio::conf(2022)</a>, I came across the <a href="https://r4pi.org">R for the Raspberry Pi</a> project. It aims to provide up-to-date builds of R for Raspberry Pi computers, which can be installed in a <a href="https://r4pi.org/docs/installation/">few simple steps</a>. I mean, I obviously had to check it out!</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/07/26/afml.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:50.0%" alt="A few moments later..."></p>
</figure>
</div>
<p>…here we are, connecting to the headless Pi via SSH on my macOS terminal:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/07/26/pi-scr-01.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%" alt="Screenshot from the Pi Zero W running R 4.1.2."></p>
</figure>
</div>
<p>Nice, even though the latest available version (for my Pi Zero W, at least) is only R 4.1.2.</p>
<blockquote class="blockquote">
<p>The next obvious question is: can you actually do something with R on such a low-powered computer?</p>
</blockquote>
<p>To answer this question, I set up a short benchmark script that simulates data from a logistic regression model, for 1,000 subjects, and then fits the corresponding, true model 100 times (using the {microbenchmark} package). The script is available in the following Gist if you want to try running it on your machine for comparison:</p>
<script src="https://gist.github.com/ellessenne/5e0c35a7a0625d9dd5bbe8c284d52155.js"></script>
<p>Here are the results:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/07/26/pi-scr-02.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%" alt="A Pi Zero W."></p>
</figure>
</div>
<p>The median time was 106.4 milliseconds, with an interquartile interval of 105.9 to 118.4 milliseconds. Not bad! For comparison, let’s run the benchmark script on my 2019, 13-inch MacBook Pro (2.4 GHz, quad-core i5 processor, an i5-8279U, with 16 GB of RAM). That can be done in just a couple of lines, directly from your R session:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">gist_url <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"https://gist.githubusercontent.com/ellessenne/5e0c35a7a0625d9dd5bbe8c284d52155/raw/9392820a5f53d8f438a19f1518e30eeca1df30f3/pizw-bench.R"</span></span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">source</span>(gist_url)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Unit: milliseconds
                expr      min       lq     mean  median       uq      max neval
 logistic regression 2.364326 2.817422 3.720309 3.29978 3.823631 12.73771   100</code></pre>
</div>
</div>
<p>Okay, I guess the Pi Zero W is <em>slow</em> compared to my relatively modern laptop, approximately 30 to 40 times slower… but hey! It’s machine learning on a $10 computer, isn’t that awesome?</p>
<p>It’s clearly not enough for large projects, but a fantastic option for democratising data science, thanks to open-source software and a tiny $10 computer. If you want to read more about the concept, don’t forget to check out <a href="https://peerj.com/preprints/3195/">this early draft</a> by the team of <a href="https://jtleek.com/">Jeff Leek</a> on the importance of <em>democratising data science education</em>. The future is bright!</p>



 ]]></description>
  <category>rstats</category>
  <category>raspberrypi</category>
  <category>machinelearning</category>
  <guid>https://www.ellessenne.xyz/blog/2022/07/26/</guid>
  <pubDate>Mon, 25 Jul 2022 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Numerical integration in generalised linear mixed models</title>
  <link>https://www.ellessenne.xyz/blog/2022/06/18/</link>
  <description><![CDATA[ 




<p>Hi everyone!</p>
<p>I was in class a few weeks ago helping with a course on longitudinal data analysis, and towards the end of the course, we introduced generalised linear mixed models (GLMMs).</p>
<p>Loosely speaking, GLMMs generalise linear mixed models (LMMs) in the same way that generalised linear models (GLMs) extend the standard linear regression model: by allowing us to model outcomes that follow distributions other than the Gaussian, such as Poisson (e.g., for count data) or binomial (e.g., for binary outcomes – looking at you, <em>logistic regression</em>). Using proper notation, in GLMMs the response for the i<sup>th</sup> subject at the j<sup>th</sup> measurement occasion, <img src="https://latex.codecogs.com/png.latex?Y_%7Bij%7D">, is assumed to follow a distribution from the exponential family such that <img src="https://latex.codecogs.com/png.latex?%0Ag%5BE(Y_%7Bij%7D%20%7C%20b_i)%5D%20=%20X_%7Bij%7D%20%5Cbeta%20+%20Z_%7Bij%7D%20b_i,%0A"> where <img src="https://latex.codecogs.com/png.latex?g(%5Ccdot)"> is a known link function and <img src="https://latex.codecogs.com/png.latex?b_i"> are the subject-specific random effects, assumed to follow a multivariate normal distribution with zero mean and variance-covariance matrix <img src="https://latex.codecogs.com/png.latex?%5CSigma">, <img src="https://latex.codecogs.com/png.latex?b_i%20%5Csim%20N(0,%20%5CSigma)">. Given that random effects are subject-specific, responses within each individual will be correlated.</p>
<p>Let’s go one step further by defining the GLMM-equivalent of a logistic regression model, where the link function is the logit function, and assuming a random intercept only (for simplicity): <img src="https://latex.codecogs.com/png.latex?%0A%5Ctext%7Blogit%7D%5BP(Y_%7Bij%7D%20=%201%20%7C%20b_i)%5D%20=%20%5Cbeta%20X_%7Bij%7D%20+%20b_i,%20%5C%20b_i%20%5Csim%20N(0,%20%5Cnu%5E2)%0A"> Here <img src="https://latex.codecogs.com/png.latex?X_%7Bij%7D"> can represent any set of <em>fixed effects</em> covariates, such as a treatment assignment, time, or something like that, <img src="https://latex.codecogs.com/png.latex?%5Cbeta"> is a vector of regression coefficients, and <img src="https://latex.codecogs.com/png.latex?b_i"> follows a univariate normal distribution. Note that interpretation of the fixed effects is similar to that of regression coefficients from a logistic regression model, as they can be interpreted as log odds ratios. The main difference, however, is that this interpretation is <em>conditional on random effects being set to zero</em> – loosely speaking, this interpretation holds for an <em>average</em> subject (in terms of random effects). It is fundamentally important to keep this in mind when interpreting the results of LMMs and GLMMs!</p>
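<p>To make this model more concrete, here is a minimal sketch of how one could simulate data from it in base R. All the values below (number of subjects, number of measurement occasions, β, ν) are hypothetical choices for illustration, and I assume a single binary treatment covariate that is constant within each subject:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">set.seed(3)
n_subj &lt;- 200 # number of subjects (hypothetical)
n_obs &lt;- 5    # measurement occasions per subject (hypothetical)
beta &lt;- 1     # log odds ratio for the treatment (hypothetical)
nu &lt;- 0.8     # standard deviation of the random intercept (hypothetical)

# One row per subject and measurement occasion
dat &lt;- data.frame(
  id = rep(seq_len(n_subj), each = n_obs),
  trt = rep(rbinom(n_subj, size = 1, prob = 0.5), each = n_obs)
)

b &lt;- rnorm(n_subj, mean = 0, sd = nu) # one random intercept b_i per subject
eta &lt;- beta * dat$trt + b[dat$id]     # linear predictor, on the logit scale
dat$y &lt;- rbinom(nrow(dat), size = 1, prob = plogis(eta))
head(dat)</code></pre></div>
</div>
<p>Since every row for a given subject shares the same <code>b[dat$id]</code>, the simulated responses are correlated within subject, as described above.</p>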
<p>To estimate a GLMM, we can use the maximum likelihood approach. The joint probability density function of <img src="https://latex.codecogs.com/png.latex?(Y_i,%20b_i)"> is given by <img src="https://latex.codecogs.com/png.latex?%0Af(Y_i,%20b_i%20%7C%20X_i)%20=%20f(Y_i%20%7C%20b_i,%20X_i)%20f(b_i),%0A"> where the random effects are, however, latent (i.e., unobserved and unobservable). In order to calculate the individual contributions to the likelihood <img src="https://latex.codecogs.com/png.latex?L_i">, we thus need to integrate out the random effects: <img src="https://latex.codecogs.com/png.latex?%0AL_i%20=%20%5Cint_B%20f(Y_i%20%7C%20b_i,%20X_i)%20f(b_i)%20%5C%20db_i%0A"> …which unfortunately does not have a closed, analytical form for GLMMs.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://media.giphy.com/media/WpaVhEcp3Qo2TjwyI1/giphy.gif" class="img-fluid quarto-figure quarto-figure-center figure-img" alt="Oh no!"></p>
</figure>
</div>
<p>Luckily, several methods have been proposed throughout the years to approximate the value of integrals that do not have a closed-form; this is generally referred to as <em>numerical integration</em>.</p>
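<p>To get a feeling for what approximating this integral involves, here is a small sketch that computes a single likelihood contribution for a random-intercept logistic model using <code>integrate()</code> from base R (which implements adaptive quadrature). The data for the subject and the parameter values are made up for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical data for one subject: binary outcomes and a single covariate
y_i &lt;- c(1, 0, 1, 1)
x_i &lt;- c(0, 0, 1, 1)
beta &lt;- 0.5 # assumed regression coefficient
nu &lt;- 1     # assumed standard deviation of the random intercept

# Integrand: f(Y_i | b_i, X_i) * f(b_i), vectorised over b as integrate() requires
integrand &lt;- function(b) {
  vapply(b, function(bb) {
    p &lt;- plogis(beta * x_i + bb)
    prod(dbinom(y_i, size = 1, prob = p)) * dnorm(bb, mean = 0, sd = nu)
  }, numeric(1))
}

# Integrate the random intercept out over the whole real line
L_i &lt;- integrate(integrand, lower = -Inf, upper = Inf)
L_i$value</code></pre></div>
</div>
<p>In an actual model fit, this calculation would be repeated for every subject at every iteration of the optimiser, which is why fast and accurate approximations matter.</p>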
<p>Now, this is where we go off on a tangent; buckle up, boys.</p>
<section id="numerical-integration" class="level2">
<h2 class="anchored" data-anchor-id="numerical-integration">Numerical integration</h2>
<p><a href="https://en.wikipedia.org/wiki/Numerical_integration">According to Wikipedia</a>,</p>
<blockquote class="blockquote">
<p>“[…] numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral.”</p>
</blockquote>
<p>As I mentioned above, the basic problem that numerical integration aims to solve is to approximate the value of a definite integral such as <img src="https://latex.codecogs.com/png.latex?%0A%5Cint_a%5Eb%20f(x)%20%5C%20dx%0A"> to a given degree of accuracy.</p>
<p>The term <em>numerical integration</em> first appeared in 1915 (who knew?!) and, as you can imagine, a plethora of approaches have been proposed throughout the years to solve this problem.</p>
<p>Several approaches are based on deriving interpolating functions that are easy to integrate, such as polynomials of low degree (e.g., linear or quadratic). The simplest approach of this kind is the so-called <em>rectangle rule</em> (or <em>midpoint rule</em>), where the interpolating function is a constant function (i.e.&nbsp;a polynomial of degree zero) that passes through the midpoint: <img src="https://latex.codecogs.com/png.latex?%0A%5Cint_a%5Eb%20f(x)%20%5C%20dx%20%5Capprox%20(b%20-%20a)%20f%5Cleft(%5Cfrac%7Ba%20+%20b%7D%7B2%7D%5Cright)%0A"> Of course, the smart thing about this approach is that we can divide the integration interval into a large number of sub-intervals, thus increasing the accuracy of the approximation. Let’s illustrate this by assuming that we are trying to integrate the density of a standard normal distribution (the <code>dnorm()</code> function in R), i.e.&nbsp;<img src="https://latex.codecogs.com/png.latex?x%20%5Csim%20N(0,%201)">. With five sub-intervals, the approximation would look like:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>And the approximated integral would be 1.59684 (the true value is one, given that we are integrating a distribution). With more sub-intervals, e.g.&nbsp;15, the accuracy of this approximation improves:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>Now, the approximated integral is 1.00003, pretty much spot on. We can test how many sub-intervals are required to get a good approximation:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>We can see that by using approximately 8-10 sub-intervals or more we can get a (very!) good approximation. Now, intuitively this works so well because we are trying to integrate a simple function; what if we had a more complex function, such as <img src="https://latex.codecogs.com/png.latex?%0Af(x)%20=%20%5Cfrac%7B1%7D%7B(x%20+%201)%20%5Csqrt%7Bx%7D%7D,%0A"> that we want to integrate between zero and 20?</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>As you can see from the plot above, the performance here is much worse than before.</p>
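<p>If you want to play with this yourself, the composite midpoint rule takes just a few lines of R. This is a sketch rather than the code used for the plots above: <code>midpoint()</code> is a hypothetical helper, and the integration range of [-10, 10] for the normal density is an assumption on my part (chosen because it appears to reproduce the values of 1.59684 and 1.00003 quoted earlier):</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Composite midpoint (rectangle) rule with n equal-width sub-intervals
midpoint &lt;- function(f, a, b, n) {
  h &lt;- (b - a) / n                      # width of each sub-interval
  mids &lt;- a + h * (seq_len(n) - 0.5)    # midpoint of each sub-interval
  h * sum(f(mids))
}

# Standard normal density; the range [-10, 10] is an assumption
midpoint(dnorm, a = -10, b = 10, n = 5)
midpoint(dnorm, a = -10, b = 10, n = 15)

# The harder function, integrated between 0 and 20
f &lt;- function(x) 1 / ((x + 1) * sqrt(x))
exact &lt;- 2 * atan(sqrt(20)) # the antiderivative is 2 * atan(sqrt(x))
midpoint(f, a = 0, b = 20, n = 15) - exact</code></pre></div>
</div>
<p>For the second function, the exact value is available since the antiderivative is 2 arctan(√x); the midpoint rule with 15 sub-intervals falls well short of it, mostly because of the first sub-interval, where the function shoots up towards infinity as x approaches zero.</p>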
<p>The simple rectangle rule can now be improved by using more complex interpolating functions (e.g., using a polynomial of degree one, leading to the <em>trapezoidal rule</em>, or a polynomial of degree two, leading to <em>Simpson’s rule</em>). We will however jump a million steps ahead (…sorry! The <a href="https://en.wikipedia.org/wiki/Numerical_integration">Wikipedia article on numerical integration</a> includes a bunch more details if you want to read more about this), and talk about more elaborate numerical integration methods that are actually used when estimating the likelihood function of a GLMM.</p>
<p>Specifically, two approaches are routinely used, the first of which is <a href="https://en.wikipedia.org/wiki/Laplace%27s_method">Laplace approximation</a>. The Laplace approximation uses a second-order Taylor series expansion, based on re-writing the intractable integrand from the likelihood contributions, which allows deriving closed-form expressions for the approximation. This is generally the fastest approach, and it can be shown that the approximation is asymptotically exact for an increasing number of observations per random effect. See, e.g., <a href="http://www.imm.dtu.dk/~hmad/GLM/Slides_2012/week11/lect11.pdf">here</a> for more details on the Laplace approximation in the settings of GLMMs.</p>
<p>Nevertheless, in practical applications, the accuracy of the Laplace approximation may still be of concern. Thus, a second approach for numerical integration is often used to obtain more accurate approximations (at the price of being more computationally demanding): <a href="https://en.wikipedia.org/wiki/Gaussian_quadrature">Gaussian quadrature</a>.</p>
<p>A quadrature rule is an approximation of a definite integral of a function that is stated as a weighted sum of function values at specified points within the domain of integration. For instance, when integrating a function over the domain <img src="https://latex.codecogs.com/png.latex?%5B-1,%201%5D">, such a rule (using <img src="https://latex.codecogs.com/png.latex?n"> points) is defined as: <img src="https://latex.codecogs.com/png.latex?%0A%5Cint_%7B-1%7D%5E%7B1%7D%20f(x)%20%5C%20dx%20%5Capprox%20%5Csum_%7Bi%20=%201%7D%5E%7Bn%7D%20w_i%20f(x_i)%0A"> This rule is <em>exact</em> for polynomials of degree <img src="https://latex.codecogs.com/png.latex?2n%20-%201"> or less. For our specific problem, the domain of integration is <img src="https://latex.codecogs.com/png.latex?(-%5Cinfty,%20+%5Cinfty)"> (the domain of the normal distribution of the random effects); this leads to a so-called <a href="https://en.wikipedia.org/wiki/Gauss–Hermite_quadrature">Gauss-Hermite quadrature rule</a>, which, given the normal distribution that we are trying to approximate, has optimal properties. Specifically, Gauss-Hermite integration with a function kernel of <img src="https://latex.codecogs.com/png.latex?e%5E%7B-x%5E2%7D"> for a normal distribution with mean <img src="https://latex.codecogs.com/png.latex?%5Cmu"> and variance <img src="https://latex.codecogs.com/png.latex?%5Csigma%5E2"> leads to the following rule: <img src="https://latex.codecogs.com/png.latex?%0A%5Cint_%7B-%5Cinfty%7D%5E%7B+%5Cinfty%7D%20f(x)%20%5C%20dx%20%5Capprox%20%5Csum_%7Bi%20=%201%7D%20%5E%20n%20%5Cfrac%7Bw_i%7D%7B%5Csqrt%7B%5Cpi%7D%7D%20%5Cphi(%5Csqrt%7B2%7D%20%5Csigma%20z_i%20+%20%5Cmu),%0A"> with <img src="https://latex.codecogs.com/png.latex?z_i"> and <img src="https://latex.codecogs.com/png.latex?w_i"> the quadrature nodes and weights, and <img src="https://latex.codecogs.com/png.latex?%5Cphi(%5Ccdot)"> the density of a standard normal distribution (such as <code>dnorm()</code> in R).</p>
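<p>If you want to play with this yourself, the following is a small base-R sketch of Gauss-Hermite quadrature. The nodes and weights are computed with the Golub-Welsch algorithm (eigenvalues of a tridiagonal matrix), and the rule is then used in its standard form for expectations, i.e. for <code>X</code> normal with mean <code>mu</code> and standard deviation <code>sigma</code>, the expectation of <code>h(X)</code> is approximated by the weighted sum of <code>h(sqrt(2) * sigma * z_i + mu)</code> with weights <code>w_i / sqrt(pi)</code>. The function names <code>gauss_hermite()</code> and <code>gh_expectation()</code> are hypothetical helpers written for this illustration, not part of any package.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># Nodes and weights for the weight function exp(-x^2), via Golub-Welsch:
# the nodes are the eigenvalues of the symmetric tridiagonal Jacobi matrix,
# the weights are sqrt(pi) times the squared first eigenvector components
gauss_hermite &lt;- function(n) {
  i &lt;- seq_len(n - 1)
  J &lt;- matrix(0, n, n)
  off &lt;- sqrt(i / 2)
  J[cbind(i, i + 1)] &lt;- off
  J[cbind(i + 1, i)] &lt;- off
  e &lt;- eigen(J, symmetric = TRUE)
  ord &lt;- order(e$values)
  list(
    nodes = e$values[ord],
    weights = sqrt(pi) * e$vectors[1, ord]^2
  )
}

# Approximate E[h(X)] for X ~ N(mu, sigma^2) with n quadrature points
gh_expectation &lt;- function(h, mu = 0, sigma = 1, n = 5) {
  gh &lt;- gauss_hermite(n)
  sum(gh$weights / sqrt(pi) * h(sqrt(2) * sigma * gh$nodes + mu))
}

# Sanity check on a polynomial, where the rule is exact:
# E[X^2] = mu^2 + sigma^2 = 5 for mu = 1, sigma = 2
gh_expectation(function(x) x^2, mu = 1, sigma = 2, n = 5)</code></pre>
</div>
<p>For a polynomial such as <code>x^2</code> the result matches the analytical value up to floating-point error; for other integrands (e.g., <code>plogis()</code>) the accuracy depends on the number of points.</p>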
<p>Let’s get back to the example that we used above, integrating a standard normal distribution, to visualise Gauss-Hermite quadrature. For 5-point quadrature, this looks as follows:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>For 11-point quadrature:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>Testing convergence of this procedure:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>As before, already with 7 quadrature points we get a good approximation of the density. Anecdotally, a larger number of quadrature points might be required to get a good approximation in practice, which would require more function evaluations and therefore a larger computational burden. Furthermore, this problem becomes exponentially more complex in higher dimensions, e.g., with more than one random effect: a univariate quadrature rule with 5 points requires 5 function evaluations, while a bivariate one with the same number of points requires 5<sup>2</sup>=25 evaluations every single time the intractable integral is evaluated. Interestingly, <a href="https://www.tandfonline.com/doi/abs/10.1080/10618600.1995.10474663">Pinheiro and Bates</a> showed that it is possible to <em>adapt</em> the integration procedure to be more computationally efficient in higher dimensions, by centring and scaling the quadrature procedure using conditional moments of the random effects distribution.</p>
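<p>The growth in the number of function evaluations is easy to check. Here is a tiny sketch, where the full grid of nodes is built with <code>expand.grid()</code> purely for illustration (the variable names are my own):</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">n_points &lt;- 5

# Number of nodes in a full (tensor-product) grid with q random effects
n_evals &lt;- sapply(1:4, function(q) {
  nrow(expand.grid(rep(list(seq_len(n_points)), q)))
})
n_evals # 5, 25, 125, 625</code></pre>
</div>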
<p>We have now arrived at the end of this detour into the world of numerical integration; there are many more details that I skipped here, but the references linked above should provide a good starting point for the <em>interested reader</em>. Let’s get back to business.</p>
</section>
<section id="numerical-integration-in-glmms" class="level2">
<h2 class="anchored" data-anchor-id="numerical-integration-in-glmms">Numerical integration in GLMMs</h2>
<p>Remember that the individual contributions to the likelihood <img src="https://latex.codecogs.com/png.latex?L_i"> for a GLMM are intractable, thus requiring numerical integration: <img src="https://latex.codecogs.com/png.latex?%0AL_i%20=%20%5Cint_B%20f(Y_i%20%7C%20b_i,%20X_i)%20f(b_i)%20%5C%20db_i%0A"></p>
<blockquote class="blockquote">
<p>By now you can probably appreciate that the accuracy of this approximation is key, but in practical terms, what does this mean? In other words, <em>what happens if we fail to accurately approximate that integral?</em></p>
</blockquote>
<p>Yes, my friends, it is finally time to get to the meat of this post, <em>a mere 2,000 words later</em>!</p>
</section>
<section id="simulation-simulation" class="level2">
<h2 class="anchored" data-anchor-id="simulation-simulation">Simulation? Simulation!</h2>
<p>We can try to answer that question using statistical simulation. If you know me, you might have noticed that I think that <a href="https://www.ellessenne.xyz/2021/02/a-short-ode-to-simulation">simulations are a fantastic tool for learning about complex statistical concepts</a>. What better than this to further prove the point, then?</p>
<p>Let’s start by describing the protocol of this simulation study. We use the <a href="https://onlinelibrary.wiley.com/doi/10.1002/sim.8086">ADEMP</a> framework to structure the protocol (and if you’re not on the ADEMP bandwagon, jump on, there’s still some space left).</p>
<section id="aim" class="level4">
<h4 class="anchored" data-anchor-id="aim"><strong>A</strong>im</h4>
<p>The aim of this simulation study is to test the accuracy of different numerical integration methods in terms of estimation of fixed and random effects in GLMMs.</p>
</section>
<section id="data-generating-mechanism" class="level4">
<h4 class="anchored" data-anchor-id="data-generating-mechanism"><strong>D</strong>ata-generating mechanism</h4>
<p>We use a single data generating mechanism for simplicity. We simulate binary outcomes for the i<sup>th</sup> subject, j<sup>th</sup> occasion from the following data-generating model:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Ctext%7Blogit%7D%5BP(Y_%7Bij%7D%20=%201)%5D%20=%20(%5Cbeta_0%20+%20b_%7B0i%7D)%20+%20%5Cbeta_1%20%5Ctext%7BTreatment%7D_i%20+%20%5Cbeta_2%20%5Ctext%7BTime%7D_%7Bij%7D%20+%20%5Cbeta_3%20%5Ctext%7BTreatment%7D_i%20%5Ctimes%20%5Ctext%7BTime%7D_%7Bij%7D"></p>
<p>Note that:</p>
<ol type="1">
<li><p>We include a binary treatment which is assigned at random at baseline by drawing from a Bernoulli random variable with a success probability of 0.5;</p></li>
<li><p>Time between each measurement, in years, for each subject, is simulated by drawing from a Uniform(0, 1) random variable;</p></li>
<li><p>I allow for a maximum of 100 measurements per subject, and truncate follow-up after 10 years;</p></li>
<li><p><img src="https://latex.codecogs.com/png.latex?b_%7B0i%7D"> is a random intercept, simulated by drawing a subject-specific value from a normal distribution with mean zero and standard deviation <img src="https://latex.codecogs.com/png.latex?%5Csigma_%7Bb_0%7D"> = 4;</p></li>
<li><p>The regression parameters are assigned the values -2, -1, 0.5, and -0.1 for <img src="https://latex.codecogs.com/png.latex?%5Cbeta_0">, <img src="https://latex.codecogs.com/png.latex?%5Cbeta_1">, <img src="https://latex.codecogs.com/png.latex?%5Cbeta_2">, and <img src="https://latex.codecogs.com/png.latex?%5Cbeta_3">, respectively.</p></li>
</ol>
<p>Finally, every simulated dataset includes 500 subjects.</p>
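<p>A data-generating mechanism like this one can be sketched in a few lines of base R. Note that this is a minimal sketch and not the actual code used for the simulation study: the function name <code>simulate_one()</code>, the seed, and the column names are all hypothetical, and I am assuming here that measurement times are obtained by cumulating the Uniform(0, 1) gaps (so that the first measurement happens after the first gap).</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">set.seed(1) # arbitrary seed, only to make this sketch reproducible

simulate_one &lt;- function(n = 500, beta = c(-2, -1, 0.5, -0.1), sigma_b0 = 4,
                         max_obs = 100, max_time = 10) {
  per_subject &lt;- lapply(seq_len(n), function(i) {
    # Binary treatment, assigned at random at baseline
    trt &lt;- rbinom(1, size = 1, prob = 0.5)
    # Subject-specific random intercept
    b0 &lt;- rnorm(1, mean = 0, sd = sigma_b0)
    # Assumption: times are cumulative sums of Uniform(0, 1) gaps,
    # with at most max_obs measurements, truncated at max_time years
    time &lt;- cumsum(runif(max_obs, min = 0, max = 1))
    time &lt;- time[time &lt;= max_time]
    # Linear predictor on the logit scale
    eta &lt;- (beta[1] + b0) + beta[2] * trt + beta[3] * time +
      beta[4] * trt * time
    data.frame(
      id = i, trt = trt, time = time,
      y = rbinom(length(time), size = 1, prob = plogis(eta))
    )
  })
  do.call(rbind, per_subject)
}

dat &lt;- simulate_one()
head(dat)</code></pre>
</div>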
</section>
<section id="estimands" class="level4">
<h4 class="anchored" data-anchor-id="estimands"><strong>E</strong>stimands</h4>
<p>The estimands of interest are the regression parameters <img src="https://latex.codecogs.com/png.latex?B%20=%20%5C%7B%5Cbeta_0,%20%5Cbeta_1,%20%5Cbeta_2,%20%5Cbeta_3%5C%7D"> and the standard deviation of the random intercept <img src="https://latex.codecogs.com/png.latex?%5Csigma_%7Bb_0%7D">.</p>
</section>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods"><strong>M</strong>ethods</h4>
<p>We fit the true, data-generating GLMM using the <code>glmer()</code> function from the {lme4} package in R, varying only the numerical integration method. In <code>glmer()</code>, this is defined by the <code>nAGQ</code> argument. We use the following methods for numerical integration:</p>
<ol type="1">
<li><p><code>nAGQ = 0</code>, corresponding to a faster but less exact form of parameter estimation for GLMMs by optimizing the random effects and the fixed-effects coefficients in the penalised iteratively reweighted least-squares step;</p></li>
<li><p><code>nAGQ = 1</code>, corresponding to the Laplace approximation. This is the default for <code>glmer()</code>;</p></li>
<li><p><code>nAGQ = k</code>, with <code>k</code> the number of points used for the adaptive Gauss-Hermite quadrature method. I test values of <code>k</code> from 2 to 10, and then from 15 to 50 at steps of 5, for a total of 17 possible values of <code>k</code>. Remember that, as the <code>glmer()</code> documentation points out, <em>larger values of <code>k</code> produce greater accuracy in the evaluation of the log-likelihood at the expense of speed</em>.</p></li>
</ol>
<p>In total, 19 models are fit to each simulated dataset and compared in this simulation study.</p>
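<p>The grid of integration methods is easy to write down in R. The sketch below builds the vector of <code>nAGQ</code> values and then, as a hedged illustration, loops over a small subset of them using the <code>cbpp</code> dataset that ships with {lme4} (not the simulated data from this study); the object names are my own.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(lme4)

# All integration methods compared: 0, 1 (Laplace), then 17 values of k
nAGQ_values &lt;- c(0, 1, 2:10, seq(15, 50, by = 5))
length(nAGQ_values) # 19

# Illustration only: fit the same model with a few different nAGQ values,
# using the cbpp example data from {lme4}
k_subset &lt;- c(0, 1, 10)
fits &lt;- lapply(k_subset, function(k) {
  glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
        data = cbpp, family = binomial, nAGQ = k)
})
names(fits) &lt;- paste0("nAGQ_", k_subset)

# Fixed effects side by side, one column per integration method
sapply(fits, fixef)</code></pre>
</div>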
</section>
<section id="performance-measure" class="level4">
<h4 class="anchored" data-anchor-id="performance-measure"><strong>P</strong>erformance measure</h4>
<p>The main performance measure of interest is bias in the estimands of interest, to test the accuracy of the different numerical integration methods.</p>
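<p>For reference, bias is simply the average of the estimates across repetitions minus the true value, and its Monte Carlo standard error is the empirical standard deviation of the estimates divided by the square root of the number of repetitions. A minimal sketch follows, where <code>bias_mcse()</code> is a hypothetical helper and the estimates are made-up random draws used only to show the call (they are not results from this study):</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">bias_mcse &lt;- function(estimates, truth) {
  n_sim &lt;- length(estimates)
  c(
    bias = mean(estimates) - truth,
    mcse = sd(estimates) / sqrt(n_sim)
  )
}

# Made-up estimates, purely for illustration
set.seed(2)
fake_estimates &lt;- rnorm(1700, mean = -1.05, sd = 0.4)
bias_mcse(fake_estimates, truth = -1)</code></pre>
</div>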
</section>
<section id="number-of-repetitions" class="level4">
<h4 class="anchored" data-anchor-id="number-of-repetitions">Number of repetitions</h4>
<p>An important issue to keep in mind when running simulation studies is that of Monte Carlo error. Loosely speaking, there is a certain amount of randomness in the simulation and therefore uncertainty in the estimation of the performance measures of interest, thus we have to make sure that we can estimate them accurately (i.e., with low Monte Carlo error). This can be done iteratively, e.g., by running repetitions until the Monte Carlo error drops below a certain threshold (see <a href="https://www.ellessenne.xyz/2021/12/assessing-convergence-of-a-simulation-study">here</a> for more details).</p>
<p>For simplicity, we take a different approach here. What we do is the following:</p>
<ol type="1">
<li><p>Running 20 repetitions of this simulation study;</p></li>
<li><p>Using these 20 repetitions to estimate empirical standard errors and average model-based standard errors, for each estimand, and taking the largest value (denoted with <img src="https://latex.codecogs.com/png.latex?%5Cxi">);</p></li>
<li><p>Using the value that was just estimated to estimate how many repetitions would be needed to constrain Monte Carlo error to be less than 0.01, using the formula n<sub>sim</sub> = <img src="https://latex.codecogs.com/png.latex?%5Cxi%20/%20(0.01%5E2)">;</p></li>
<li><p>Rounding up the value of n<sub>sim</sub> to the nearest 100, to be conservative.</p></li>
</ol>
<p>n<sub>sim</sub> was finally estimated to be 1700, in this case, given <img src="https://latex.codecogs.com/png.latex?%5Cxi%20%5Capprox%200.1607">.</p>
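<p>The arithmetic behind this number can be reproduced as follows, applying the formula exactly as written above (the variable names are my own):</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">xi &lt;- 0.1607 # largest value estimated from the 20 pilot repetitions
target_mcse &lt;- 0.01 # desired upper bound for the Monte Carlo error

n_raw &lt;- xi / target_mcse^2 # approximately 1607
n_sim &lt;- ceiling(n_raw / 100) * 100 # round up to the nearest 100
n_sim # 1700</code></pre>
</div>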
</section>
<section id="results" class="level4">
<h4 class="anchored" data-anchor-id="results">Results</h4>
<p>At last, here are the results of this simulation study.</p>
<p>First, we assess whether all Monte Carlo errors for bias are below the threshold of 0.01. The plot below shows that the number of repetitions of this simulation study was enough to constrain Monte Carlo error within what we deemed acceptable.</p>
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/pconv.png" class="img-fluid" style="width:80.0%" alt="Monte Carlo standard errors for all estimands, to assess convergence."></p>
<p>Note that <code>(Intercept)</code> corresponds to <img src="https://latex.codecogs.com/png.latex?%5Cbeta_0">, <code>trt</code> corresponds to <img src="https://latex.codecogs.com/png.latex?%5Cbeta_1">, <code>time</code> corresponds to <img src="https://latex.codecogs.com/png.latex?%5Cbeta_2">, and <code>trt:time</code> corresponds to <img src="https://latex.codecogs.com/png.latex?%5Cbeta_3">. Unsurprisingly, <code>sd__(Intercept)</code> represents the standard deviation of the random intercept, <img src="https://latex.codecogs.com/png.latex?%5Csigma_%7Bb_0%7D">.</p>
<p>Second, we study bias for the fixed effects (i.e., the regression coefficients of the GLMM). The following plot depicts bias with 95% confidence intervals (based on Monte Carlo standard errors) for all fixed effects and across all integration methods:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/pbias_fixed.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%" alt="Bias, fixed effects."></p>
</figure>
</div>
<p>This shows that the <em>fast-but-approximate</em> method yields biased results, and that so does the Laplace approximation (in these settings, and except for the treatment by time interaction term). Overall, this shows that a larger number of points for the adaptive quadrature (e.g., 10) is required to obtain unbiased results for all regression coefficients.</p>
<p>Finally, the next plot shows bias for the standard deviation of the random intercept:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/pbias_var.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%" alt="Bias, variance components."></p>
</figure>
</div>
<p>Again, we can see that a larger number of quadrature points is required to get a good approximation with no bias.</p>
<p>Another thing that is interesting to study here is estimation time. As mentioned many times before, the larger the number of quadrature points the greater the accuracy, at the cost of additional computational complexity. But how much overhead do we have in these settings?</p>
<p>I’m glad you asked: the following plot depicts the distribution of estimation times for each GLMM under each integration approach:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/ptime.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%" alt="Time it took to fit the models."></p>
</figure>
</div>
<p>The first thing to note is that the <em>fast-but-approximate</em> method is actually <em>really fast</em>! Second, and as expected, with more quadrature points the median estimation time was also larger. No surprises so far. Overall, though, <code>glmer()</code> seemed to be really fast in these settings, with estimation times rarely exceeding 25 seconds.</p>
</section>
</section>
<section id="an-actual-example" class="level2">
<h2 class="anchored" data-anchor-id="an-actual-example">An actual example</h2>
<p>Before we wrap up, let’s compare the different methods in practice by analysing a real dataset. For this, we use data from the 1989 Bangladesh fertility survey, which can be obtained directly from Stata:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(haven)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(dplyr)</span>
<span id="cb1-3"></span>
<span id="cb1-4">bangladesh <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read_dta</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"http://www.stata-press.com/data/r17/bangladesh.dta"</span>)</span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glimpse</span>(bangladesh)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Rows: 1,934
Columns: 8
$ district &lt;dbl&gt; 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ c_use    &lt;dbl+lbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, …
$ urban    &lt;dbl+lbl&gt; 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ age      &lt;dbl&gt; 18.440001, -5.559990, 1.440001, 8.440001, -13.559900, -11.559…
$ child1   &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0…
$ child2   &lt;dbl&gt; 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0…
$ child3   &lt;dbl&gt; 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0…
$ children &lt;dbl+lbl&gt; 3, 0, 2, 3, 0, 0, 3, 3, 1, 3, 0, 0, 1, 3, 3, 3, 0, 3, 1, …</code></pre>
</div>
</div>
<p>We will fit a GLMM for a binary outcome to study contraceptive use by urban residence, age, and number of children; we also include a random intercept by district.</p>
<p>For illustration purposes, we fit three models:</p>
<ol type="1">
<li><p>GLMM estimated with the <em>fast-but-approximate</em> (<code>nAGQ = 0</code>) approach;</p></li>
<li><p>GLMM estimated with the Laplace approximation (<code>nAGQ = 1</code>);</p></li>
<li><p>GLMM estimated with adaptive Gauss-Hermite quadrature using 50 points (<code>nAGQ = 50</code>).</p></li>
</ol>
<p>The models can be fit with the following code:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(lme4)</span>
<span id="cb3-2"></span>
<span id="cb3-3">f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.0</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glmer</span>(c_use <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> urban <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> age <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> district), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">family =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">binomial</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> bangladesh, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nAGQ =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb3-4">f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.1</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glmer</span>(c_use <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> urban <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> age <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> district), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">family =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">binomial</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> bangladesh, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nAGQ =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-5">f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.50</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">glmer</span>(c_use <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> urban <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> age <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> child3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> district), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">family =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">binomial</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> bangladesh, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nAGQ =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>)</span></code></pre></div></div>
</div>
<p>…and we use the {broom.mixed} package to tidy, summarise, and compare the results from the three models:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(broom.mixed)</span>
<span id="cb4-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb4-3"></span>
<span id="cb4-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>(</span>
<span id="cb4-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tidy</span>(f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">conf.int =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nAGQ =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f.0"</span>),</span>
<span id="cb4-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tidy</span>(f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">conf.int =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nAGQ =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f.1"</span>),</span>
<span id="cb4-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tidy</span>(f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">conf.int =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nAGQ =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"f.50"</span>)</span>
<span id="cb4-8">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> nAGQ, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> estimate)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_errorbar</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymin =</span> conf.low, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymax =</span> conf.high), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n.breaks =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-13">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_bw</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">facet_wrap</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span>term, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">scales =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"free_y"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method (nAGQ)"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Point Estimate (95% C.I.)"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/models_comp-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
<p>Model coefficients from the three integration approaches are very similar in this case, so we can be reasonably confident that the results of this analysis are not materially affected by the choice of integration method. Let’s therefore print the summary of the model using adaptive quadrature:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.50</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Generalized linear mixed model fit by maximum likelihood (Adaptive
  Gauss-Hermite Quadrature, nAGQ = 50) [glmerMod]
 Family: binomial  ( logit )
Formula: c_use ~ urban + age + child1 + child2 + child3 + (1 | district)
   Data: bangladesh

     AIC      BIC   logLik deviance df.resid 
  2427.7   2466.6  -1206.8   2413.7     1927 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.8140 -0.7662 -0.5087  0.9973  2.7239 

Random effects:
 Groups   Name        Variance Std.Dev.
 district (Intercept) 0.2156   0.4643  
Number of obs: 1934, groups:  district, 60

Fixed effects:
             Estimate Std. Error z value Pr(&gt;|z|)    
(Intercept) -1.689295   0.147759 -11.433  &lt; 2e-16 ***
urban        0.732285   0.119486   6.129 8.86e-10 ***
age         -0.026498   0.007892  -3.358 0.000786 ***
child1       1.116005   0.158092   7.059 1.67e-12 ***
child2       1.365893   0.174669   7.820 5.29e-15 ***
child3       1.344030   0.179655   7.481 7.37e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
       (Intr) urban  age    child1 child2
urban  -0.288                            
age     0.449 -0.044                     
child1 -0.588  0.054 -0.210              
child2 -0.634  0.090 -0.382  0.489       
child3 -0.751  0.093 -0.675  0.538  0.623</code></pre>
</div>
</div>
<p>Furthermore, we can interpret exponentiated fixed effect coefficients as odds ratios:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fixef</span>(f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.50</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>(Intercept)       urban         age      child1      child2      child3 
  0.1846497   2.0798277   0.9738498   3.0526345   3.9192232   3.8344665 </code></pre>
</div>
</div>
<p>This shows that the odds of contraceptive use in urban areas are about double those in rural areas, that the odds decrease with increasing age, and that the odds are much higher in women with one or more children (odds ratios between 3 and 4 for women with 1, 2, or 3 children versus women with no children), all else being equal. Nonetheless, remember that these results are conditional on a random intercept of zero, i.e., loosely speaking, for <em>an average district</em>.</p>
<p>Finally, there is also significant heterogeneity in contraceptive use between districts, with a variance for the random intercept of 0.2156. The distribution of the (log-) baseline odds of contraceptive use across districts can be visualised as:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coef</span>(f<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">.50</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>district, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">(Intercept)</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_density</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_bw</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Baseline log-odds of contraceptive use"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Density"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/06/18/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:80.0%"></p>
</figure>
</div>
</div>
</div>
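<p>To get a rough feel for what a variance of 0.2156 means, here is a minimal back-of-the-envelope sketch. It uses only the numbers printed in the summary above and assumes normally distributed random intercepts, as the model does. It computes the approximate central 95% range of district-specific baseline log-odds and maps it onto the probability scale. Note that this describes the distribution implied by the fitted model, which is not the same thing as the density of the predicted district-level intercepts plotted above.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Estimates copied by hand from the summary(f.50) output above
intercept &lt;- -1.689295 # Fixed intercept, on the log-odds scale
sd_district &lt;- 0.4643  # Standard deviation of the district random intercept

# Approximate central 95% range of district-specific baseline log-odds
logodds_range &lt;- intercept + c(-1, 1) * qnorm(0.975) * sd_district
logodds_range

# The same range on the probability scale
plogis(logodds_range)

# Baseline probability for a district with a random intercept of zero
plogis(intercept)</code></pre></div>
</div>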
</section>
<section id="closing-thoughts" class="level2">
<h2 class="anchored" data-anchor-id="closing-thoughts">Closing thoughts</h2>
<p>In conclusion, we saw how easy it is to get biased fixed effect estimates in GLMMs if we use a numerical integration method that is not accurate enough. Of course, this is just a single scenario and not a comprehensive study of estimation methods for GLMMs, but it should nonetheless show the importance of considering the estimation procedure when using (somewhat) more advanced statistical methods. This is, unfortunately, often overlooked in practice. My <em>friendly</em> suggestion would be to fit multiple models with different integration techniques, and if no significant differences are observed, then great: you got yourself some results that are robust to the numerical integration method being used.</p>
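<p>As a minimal sketch of that suggestion, the snippet below refits the same model with a few values of <code>nAGQ</code> and puts the estimates side by side. It uses the <code>cbpp</code> data that ships with {lme4} rather than the data analysed in this post, and the values of <code>nAGQ</code> are arbitrary choices for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(lme4)

# Integration settings to compare (arbitrary, for illustration only)
nagq_values &lt;- c(0, 1, 10, 25)

# Fit the same random-intercept logistic model under each setting
fits &lt;- lapply(nagq_values, function(k) {
  glmer(
    cbind(incidence, size - incidence) ~ period + (1 | herd),
    data = cbpp, family = binomial, nAGQ = k
  )
})
names(fits) &lt;- paste0("nAGQ = ", nagq_values)

# Fixed effects side by side: one column per integration setting
round(sapply(fits, fixef), 4)

# Variance of the random intercept under each setting
sapply(fits, function(f) as.numeric(VarCorr(f)$herd))</code></pre></div>
</div>
<p>If the columns agree to the precision you care about, the results are unlikely to be driven by the integration method.</p>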
<p>This wraps up the longest blog post I’ve ever written (at ~3,500 words): if you made it this far, thanks for reading, and I hope you found this educational. And as always, please do get in touch if I got something wrong here.</p>
<p>Finally, R code for the simulation study, if you want to replicate the analysis or adapt it to other settings, is of course freely available on a GitHub repository <a href="https://github.com/ellessenne/glmer-nagq-simulation">here</a>.</p>
<p>Cheers!</p>


</section>

 ]]></description>
  <category>rstats</category>
  <category>simulation</category>
  <category>glmm</category>
  <guid>https://www.ellessenne.xyz/blog/2022/06/18/</guid>
  <pubDate>Fri, 17 Jun 2022 22:00:00 GMT</pubDate>
</item>
<item>
  <title>How to lose your talent in a simple step</title>
  <link>https://www.ellessenne.xyz/blog/2022/05/10/</link>
  <description><![CDATA[ 




<p><a href="https://twitter.com/ZoeSchiffer">Zoë Schiffer</a> from <em>The Verge</em> reports that the director of machine learning at Apple is leaving the company due to its return to work policy:</p>
<div class="container d-flex align-items-center justify-content-center">
  <blockquote class="twitter-tweet blockquote">
    <p lang="en" dir="ltr">Ian Goodfellow, Apple’s director of machine learning, is leaving the company due to its return to work policy. In a note to staff, he said “I believe strongly that more flexibility would have been the best policy for my team.” He was likely the company’s most cited ML expert.</p>— Zoë Schiffer (@ZoeSchiffer) <a href="https://twitter.com/ZoeSchiffer/status/1523017143939309568?ref_src=twsrc%5Etfw">May 7, 2022</a>
  </blockquote>
  <script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</div>
<p>I have had this conversation with peers and co-workers over and over again in the past few months: there is no going back to pre-pandemic life. And while I value <em>in-person</em> activities (looking at you, in-person research seminars), I am totally sure that organisations failing to accommodate what people want will lose talent to more flexible workplaces.</p>
<p><em>Update: as of May 18, 2022, it looks like <a href="https://www.bloomberg.com/news/articles/2022-05-17/ian-goodfellow-former-apple-director-of-machine-learning-to-join-deepmind">he is joining Alphabet’s DeepMind</a>.</em> <em>What a loss for Apple.</em></p>



 ]]></description>
  <category>news</category>
  <category>apple</category>
  <category>remotework</category>
  <guid>https://www.ellessenne.xyz/blog/2022/05/10/</guid>
  <pubDate>Mon, 09 May 2022 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Maximum likelihood estimation with {torch}</title>
  <link>https://www.ellessenne.xyz/blog/2022/04/28/</link>
  <description><![CDATA[ 




<p>Hi everyone!</p>
<p>Today’s blog post is a long time in the making, as I have been playing around with what we’re going to see today for quite a while now.</p>
<p>Let’s start with {torch}: what is that? Well, <a href="https://cran.r-project.org/package=torch">{torch}</a> is an R package wrapping the <code>libtorch</code> C++ library underlying the PyTorch open-source machine learning framework. It provides a variety of tools for developing machine learning methods, but there’s more: what we will focus on here is automatic differentiation and general-purpose optimisers.</p>
<p>Having these at our disposal lets us implement maximum likelihood estimation with state-of-the-art tools. I will illustrate this using a simple parametric survival model, but as you can imagine, this generalises to more complex methods.</p>
<section id="parametric-survival-model" class="level2">
<h2 class="anchored" data-anchor-id="parametric-survival-model">Parametric survival model</h2>
<p>We will optimise an exponential survival model, for simplicity, whose log-likelihood function can be written as</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Al(%5Ctheta)%20=%20d%20%5BX%20%5Cbeta%5D%20-%20%5Cexp(X%20%5Cbeta)%20t%20+%20d%20%5Clog(t)%0A"></p>
<p>Here <img src="https://latex.codecogs.com/png.latex?d"> is the event indicator variable, <img src="https://latex.codecogs.com/png.latex?X"> is the model design matrix, <img src="https://latex.codecogs.com/png.latex?%5Cbeta"> are regression coefficients, and <img src="https://latex.codecogs.com/png.latex?t"> is the observed time. Note that <img src="https://latex.codecogs.com/png.latex?X"> includes an intercept, which corresponds to the rate parameter <img src="https://latex.codecogs.com/png.latex?%5Clambda"> on the log scale.</p>
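<p>Before moving to tensors, it may help to see the same formula in plain R. This is only a sketch: the function name <code>exp_loglik</code> and the toy data are made up for illustration, and the function returns the log-likelihood summed over all subjects.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Log-likelihood of the exponential survival model, summed over subjects
# beta: coefficients, X: design matrix, d: event indicator, t: observed time
exp_loglik &lt;- function(beta, X, d, t) {
  xb &lt;- as.numeric(X %*% beta) # Linear predictor, i.e. the log-rate
  sum(d * xb - exp(xb) * t + d * log(t))
}

# Toy data: four subjects, an intercept and one binary covariate
X &lt;- cbind(1, c(0, 1, 0, 1))
d &lt;- c(1, 0, 1, 1)
t &lt;- c(2, 5, 1.5, 3)

# Evaluate at a log-rate of log(0.2) and a log hazard ratio of -0.5
exp_loglik(beta = c(log(0.2), -0.5), X = X, d = d, t = t)</code></pre></div>
</div>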
<p>Other parametric distributions (such as Weibull, Gompertz, etc.) are equally easy to implement; let me know <em>if you fancy trying this out for yourself</em>!</p>
</section>
<section id="simulating-data" class="level2">
<h2 class="anchored" data-anchor-id="simulating-data">Simulating data</h2>
<p>Let’s start by simulating some data. To simulate survival data from a given parametric distribution, I use the inversion method as described in <a href="https://doi.org/10.1002/sim.2059">Bender <em>et al</em>.</a>, assuming a single binary covariate (e.g., a binary treatment):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">183475683</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># For reproducibility</span></span>
<span id="cb1-2"></span>
<span id="cb1-3">N <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100000</span></span>
<span id="cb1-4">lambda <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span></span>
<span id="cb1-5">beta <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb1-6">covs <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">id =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(N), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">trt =</span> stats<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rbinom</span>(N, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>L, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>))</span>
<span id="cb1-7"></span>
<span id="cb1-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Inversion method for survival times:</span></span>
<span id="cb1-9">u <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(N)</span>
<span id="cb1-10">T <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(u) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (lambda <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(covs<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>trt <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> beta))</span></code></pre></div></div>
</div>
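<p>The inversion step works because, for an exponential model, the survival function is S(t) = exp(-rate × t) with rate = lambda × exp(trt × beta). Setting S(t) equal to a uniform draw u and solving for t gives the expression used above. A quick, self-contained sanity check (a sketch with its own seed and object names, separate from the data simulated for this post) is to compare the empirical mean survival time in each arm with the theoretical value of 1 / rate:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">set.seed(1) # Arbitrary seed, for this check only

n &lt;- 1e5
lambda &lt;- 0.2
beta &lt;- -0.5
trt &lt;- rbinom(n, 1, 0.5)

# Inversion method: solve S(t) = u for t
u &lt;- runif(n)
time &lt;- -log(u) / (lambda * exp(trt * beta))

# The mean of an exponential distribution is 1 / rate
check &lt;- rbind(
  empirical = tapply(time, trt, mean),
  theoretical = 1 / (lambda * exp(c(0, 1) * beta))
)
check</code></pre></div>
</div>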
<p>We also apply administrative censoring at time 5:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">d <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.numeric</span>(T <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb2-2">T <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pmin</span>(T, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb2-3">s1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">id =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(N), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">eventtime =</span> T, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">status =</span> d)</span>
<span id="cb2-4">dd <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">merge</span>(covs, s1)</span></code></pre></div></div>
</div>
<p>We simulate data for 10<sup>5</sup> subjects (which should be plenty to get pretty close to the truth if our implementation is correct), with a rate parameter <img src="https://latex.codecogs.com/png.latex?%5Clambda"> of 0.2 and a regression coefficient <img src="https://latex.codecogs.com/png.latex?%5Cbeta"> of -0.5. If interested, a more general implementation of this method can be found in the <a href="https://CRAN.R-project.org/package=simsurv">{simsurv} package</a>.</p>
<p>As a test, let’s fit and plot a Kaplan-Meier survival curve:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(survival)</span>
<span id="cb3-2"></span>
<span id="cb3-3">KM <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">survfit</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Surv</span>(eventtime, status) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dd)</span>
<span id="cb3-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(KM, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xlab =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ylab =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Survival"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2022/04/28/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="480"></p>
</figure>
</div>
</div>
</div>
<p>Looks alright!</p>
</section>
<section id="likelihood-implementation" class="level2">
<h2 class="anchored" data-anchor-id="likelihood-implementation">Likelihood implementation</h2>
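<p>Now, we implement the (log-) likelihood function using {torch}. The important thing to remember here is that {torch} uses tensors, on which we need to operate with dedicated functions, e.g.&nbsp;<code>torch_mm</code> for matrix multiplication and <code>torch_multiply</code> for element-wise multiplication. Note that the function returns the <em>negative</em> log-likelihood, since we will be minimising it:</p>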
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(torch)</span>
<span id="cb4-2"></span>
<span id="cb4-3">log_likelihood <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(par, data, status, time) {</span>
<span id="cb4-4">  ll <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_multiply</span>(status, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_mm</span>(data, par)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span></span>
<span id="cb4-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_multiply</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_exp</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_mm</span>(data, par)), time) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_multiply</span>(status, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_log</span>(time))</span>
<span id="cb4-7">  ll <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_sum</span>(ll)</span>
<span id="cb4-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(ll)</span>
<span id="cb4-9">}</span></code></pre></div></div>
</div>
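<p>If the difference between the two kinds of multiplication is not obvious, here is a tiny sketch with made-up tensors (the values are arbitrary, only the shapes matter). <code>torch_mm</code> computes the matrix product of the design matrix and the coefficients, while <code>torch_multiply</code> multiplies two tensors element by element:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(torch)

# Made-up inputs, for illustration only
X &lt;- torch_tensor(matrix(c(1, 1, 1, 0, 1, 0), ncol = 2)) # 3 x 2 design matrix
b &lt;- torch_tensor(matrix(c(0.5, -1), ncol = 1))          # 2 x 1 coefficients
d &lt;- torch_tensor(matrix(c(1, 0, 1), ncol = 1))          # 3 x 1 event indicator

# Matrix product: (3 x 2) times (2 x 1) gives a 3 x 1 linear predictor
xb &lt;- torch_mm(X, b)
xb$shape

# Element-wise product: both inputs are 3 x 1, and so is the result
dxb &lt;- torch_multiply(d, xb)
dxb$shape</code></pre></div>
</div>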
<p>As a test, let’s define starting values for the model parameters (e.g., fixing their values at 1) and calculate the value of the (negative) log-likelihood function:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">xx <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb5-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log_likelihood</span>(</span>
<span id="cb5-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> xx,</span>
<span id="cb5-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">model.matrix</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span>trt, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dd)),</span>
<span id="cb5-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">status =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(dd<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>status, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)),</span>
<span id="cb5-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(dd<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>eventtime, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb5-7">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>torch_tensor
1.71723e+06
[ CPUFloatType{} ]</code></pre>
</div>
</div>
<p>Looking good so far.</p>
</section>
<section id="likelihood-optimisation" class="level2">
<h2 class="anchored" data-anchor-id="likelihood-optimisation">Likelihood optimisation</h2>
<p>The final step consists of implementing the algorithm to optimise the likelihood. We start by re-defining starting values:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">x_star <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">requires_grad =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
</div>
<p>Here we need to use the argument <code>requires_grad = TRUE</code> to use automatic differentiation and get gradients <em>for free</em>.</p>
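<p>To see what this buys us, here is a minimal sketch of automatic differentiation on a one-parameter version of the log-likelihood, for a single made-up subject with <code>d = 1</code> and <code>t = 2</code>. After calling <code>$backward()</code>, the gradient is stored in <code>$grad</code> and can be compared with the analytical derivative, <code>d - exp(b) * t</code>:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(torch)

# Made-up values for a single subject, for illustration only
d &lt;- 1
t &lt;- 2

# Parameter (the log-rate) for which we want the gradient
b &lt;- torch_tensor(0.5, requires_grad = TRUE)

# Log-likelihood contribution, dropping the constant d * log(t) term
ll &lt;- d * b - torch_exp(b) * t

# Backpropagate: this fills in b$grad
ll$backward()

# Gradient from automatic differentiation and analytical derivative
as.numeric(b$grad)
d - exp(0.5) * t</code></pre></div>
</div>
<p>The two numbers should agree up to single-precision floating point error, as {torch} uses 32-bit floats by default.</p>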
<p>Next, we pick a general-purpose optimiser:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">optimizer <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optim_lbfgs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">params =</span> x_star, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">line_search_fn =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"strong_wolfe"</span>)</span></code></pre></div></div>
</div>
<p>We pick the L-BFGS algorithm with strong Wolfe conditions for the line search, but other optimisers would work too. Note that a comparable algorithm is implemented in base R as <code>optim()</code>’s <code>L-BFGS-B</code> method.</p>
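<p>For reference, this is roughly what the base R route looks like. The snippet is a self-contained sketch under its own assumptions: it simulates a smaller dataset with a different seed, starts from zero, and lets <code>optim()</code> approximate gradients numerically, so it is not the same procedure as the {torch} implementation in this post.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">set.seed(42) # Arbitrary seed, for this sketch only

# Simulate exponential survival times with administrative censoring at 5
n &lt;- 5000
trt &lt;- rbinom(n, 1, 0.5)
time &lt;- -log(runif(n)) / (0.2 * exp(-0.5 * trt))
status &lt;- as.numeric(time &lt;= 5)
time &lt;- pmin(time, 5)
X &lt;- cbind(1, trt)

# Negative log-likelihood of the exponential model
negll &lt;- function(par) {
  xb &lt;- as.numeric(X %*% par)
  -sum(status * xb - exp(xb) * time + status * log(time))
}

# Minimise with base R's L-BFGS-B, using numerical gradients
fit &lt;- optim(par = c(0, 0), fn = negll, method = "L-BFGS-B")

# Estimates, to be compared with the true values c(log(0.2), -0.5)
fit$par</code></pre></div>
</div>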
<p>We also need to define one extra function that will be used in the optimisation loop to make each step towards the optimum:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">one_step <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>() {</span>
<span id="cb9-2">  optimizer<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zero_grad</span>()</span>
<span id="cb9-3">  value <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log_likelihood</span>(</span>
<span id="cb9-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> x_star,</span>
<span id="cb9-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">model.matrix</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span>trt, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dd)),</span>
<span id="cb9-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">status =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(dd<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>status, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)),</span>
<span id="cb9-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(dd<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>eventtime, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb9-8">  )</span>
<span id="cb9-9">  value<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">backward</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">retain_graph =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb9-10">  value</span>
<span id="cb9-11">}</span></code></pre></div></div>
</div>
<p>We finally have all the bits to actually optimise the likelihood.</p>
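<p>We define the required precision as <code>eps = 1e-6</code>, and we loop until the absolute change in the value of the objective function between two consecutive iterations is less than (or equal to) <code>eps</code>; as a safety net, we also stop with an error if this does not happen within 10000 iterations:</p>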
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">eps <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Precision</span></span>
<span id="cb10-2">converged <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Used to stop the loop</span></span>
<span id="cb10-3">last_val <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">Inf</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Need a value to compare to for the first iteration</span></span>
<span id="cb10-4">i <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Iterations counter</span></span>
<span id="cb10-5"></span>
<span id="cb10-6"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>converged) {</span>
<span id="cb10-7">  i <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> i <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb10-8">  obj_val <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> optimizer<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">step</span>(one_step)</span>
<span id="cb10-9">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.logical</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_less_equal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_abs</span>(obj_val <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> last_val), eps))) {</span>
<span id="cb10-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>(i) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># This will print how many iterations were required before stopping</span></span>
<span id="cb10-11">    converged <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb10-12">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">break</span></span>
<span id="cb10-13">  }</span>
<span id="cb10-14">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (i <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>) {</span>
<span id="cb10-15">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># For safety</span></span>
<span id="cb10-16">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stop</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Did not converge after 10000 iterations"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">call. =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb10-17">  }</span>
<span id="cb10-18">  last_val <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> obj_val</span>
<span id="cb10-19">}</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 3</code></pre>
</div>
</div>
<p>That’s it! The results of the optimisation are contained in the <code>x_star</code> object:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">x_star</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>torch_tensor
-1.6160
-0.4951
[ CPUFloatType{2,1} ][ requires_grad = TRUE ]</code></pre>
</div>
</div>
<p>…remember that the true values that we simulated data from were:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(lambda)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] -1.609438</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># and</span></span>
<span id="cb16-2">beta</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] -0.5</code></pre>
</div>
</div>
<p>These are pretty close to what we estimated. Of course, this is a single replication only, and we might want to test this with smaller sample sizes as well. Nevertheless, the test sample size (100,000 subjects) is large enough that I feel comfortable with this implementation.</p>
</section>
<section id="conclusions" class="level2">
<h2 class="anchored" data-anchor-id="conclusions">Conclusions</h2>
<p>One thing that is missing from the implementation above is the estimation of standard errors (and therefore confidence intervals) for the model parameters.</p>
<p>We get the gradients <em>for free</em>, so standard errors should be straightforward to obtain by inverting the Hessian matrix at the optimum. However, the R interface does not (yet) implement direct calculation of the Hessian (as PyTorch does via the <code>torch.autograd.functional.hessian</code> function), so we need to work a little harder for that.</p>
<p>Specifically, we have to differentiate the gradients again to obtain the Hessian matrix:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1">ll <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log_likelihood</span>(</span>
<span id="cb18-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> x_star,</span>
<span id="cb18-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">model.matrix</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span>trt, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dd)),</span>
<span id="cb18-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">status =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(dd<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>status, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)),</span>
<span id="cb18-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">torch_tensor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(dd<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>eventtime, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb18-6">)</span>
<span id="cb18-7">grad <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autograd_grad</span>(ll, x_star, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">retain_graph =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">create_graph =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]</span>
<span id="cb18-8"></span>
<span id="cb18-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Using base R matrix here for simplicity</span></span>
<span id="cb18-10">hess <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NA</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nrow =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(x_star), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ncol =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(x_star))</span>
<span id="cb18-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (d <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(grad)) {</span>
<span id="cb18-12">  hess[d, ] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_array</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autograd_grad</span>(grad[d], x_star, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">retain_graph =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]])</span>
<span id="cb18-13">}</span></code></pre></div></div>
</div>
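<p>The variance-covariance matrix for the model coefficients will now be the inverse of the Hessian (the objective that we minimised plays the role of a negative log-likelihood, so its Hessian at the optimum is the observed information matrix):</p>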
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1">vcv <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">solve</span>(hess)</span></code></pre></div></div>
</div>
<p>To wrap up, the fitted model coefficients (with standard errors) will be:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1">results <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">beta =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_array</span>(x_star), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">diag</span>(vcv)))</span>
<span id="cb20-2">results</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>        beta          se
1 -1.6159527 0.005639768
2 -0.4950813 0.008709507</code></pre>
</div>
</div>
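<p>With standard errors in hand, Wald-type confidence intervals are one line away. Here is a minimal sketch, assuming the usual normal approximation for the maximum likelihood estimator; the estimates and standard errors are copied by hand from the output above so that the snippet runs on its own, and the <code>lower</code> and <code>upper</code> column names are just my choice here:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Estimates and standard errors copied from the output above
results &lt;- data.frame(
  beta = c(-1.6159527, -0.4950813),
  se = c(0.005639768, 0.008709507)
)
# 95% Wald confidence intervals, assuming asymptotic normality of the MLE
z &lt;- qnorm(0.975)
results$lower &lt;- results$beta - z * results$se
results$upper &lt;- results$beta + z * results$se
results</code></pre></div>
</div>
<p>Both intervals should cover the true values that we simulated data from (<code>log(lambda)</code> and <code>beta</code>, respectively), which is reassuring.</p>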
<p>Hopefully, the {torch} package in R will soon port the automatic Hessian calculation, which will simplify things further.</p>
<p>Finally, for comparison, we fit the same model using the equivalent R implementation from the (experimental) <a href="https://github.com/ellessenne/streg">{streg} package</a>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb22" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb22-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(streg)</span>
<span id="cb22-2">expfit <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">streg</span>(</span>
<span id="cb22-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formula =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Surv</span>(eventtime, status) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> trt,</span>
<span id="cb22-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> dd,</span>
<span id="cb22-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">distribution =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"exp"</span></span>
<span id="cb22-6">)</span>
<span id="cb22-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(expfit)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Exponential regression -- log-relative hazard form

N. of subjects  = 100000 
N. of failures  = 54174 
Time at risk    = 345727.5 

Log likelihood  = -131655.9 

             Estimate Std. Error z value Pr(&gt;|z|)    
(Intercept) -1.615311   0.005637 -286.54   &lt;2e-16 ***
trt         -0.495611   0.008707  -56.92   &lt;2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1</code></pre>
</div>
</div>
<p>This also uses automatic differentiation (via {TMB}), but nevertheless… pretty close, isn’t it?</p>
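<p>As one more sanity check, this time without any optimiser at all: with a single binary covariate, the exponential model has closed-form maximum likelihood estimates, as the estimated rate in each arm is simply the number of events divided by the total time at risk. The sketch below illustrates this on freshly simulated data; note that the sample size, the seed, and the censoring mechanism are assumptions that I make here for illustration, and they are not necessarily those used for <code>dd</code> above.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">set.seed(42)
n &lt;- 10000 # Assumed sample size, for illustration only
lambda &lt;- 0.2
beta &lt;- -0.5
trt &lt;- rbinom(n, size = 1, prob = 0.5)
t_event &lt;- rexp(n, rate = lambda * exp(beta * trt))
t_cens &lt;- rexp(n, rate = 0.1) # Assumed (independent) censoring times
eventtime &lt;- pmin(t_event, t_cens)
status &lt;- as.numeric(t_event &lt;= t_cens)

# Number of events and total time at risk, by treatment arm
d0 &lt;- sum(status[trt == 0])
d1 &lt;- sum(status[trt == 1])
tar0 &lt;- sum(eventtime[trt == 0])
tar1 &lt;- sum(eventtime[trt == 1])

# Closed-form estimates on the log-hazard scale, with standard errors
closed_form &lt;- data.frame(
  term = c("(Intercept)", "trt"),
  estimate = c(log(d0 / tar0), log(d1 / tar1) - log(d0 / tar0)),
  se = c(sqrt(1 / d0), sqrt(1 / d0 + 1 / d1))
)
closed_form</code></pre></div>
</div>
<p>Applying the same calculation to the actual <code>dd</code> data should reproduce the estimates above, up to numerical tolerance.</p>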
<p>That’s it for today, and as always, thank you for reading and feel free to get in touch if I got something terribly wrong or if you just want to have a chat about it. Cheers!</p>


</section>

 ]]></description>
  <category>rstats</category>
  <category>likelihood</category>
  <category>torch</category>
  <category>optimisation</category>
  <guid>https://www.ellessenne.xyz/blog/2022/04/28/</guid>
  <pubDate>Wed, 27 Apr 2022 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Assessing convergence of a simulation study</title>
  <link>https://www.ellessenne.xyz/blog/2021/12/31/</link>
  <description><![CDATA[ 




<p>Hey everyone,</p>
<p>The blog is back! And just in time to wrap up 2021.</p>
<p>Let’s get straight to business, shall we? Today we’re going to talk about ways of assessing whether your simulation study has <em>converged</em>. This is something that’s been on my mind for a while, and I finally got some focused time to write about it.</p>
<p>To do so, let’s first define what we mean by <em>converged</em>:</p>
<blockquote class="blockquote">
<p>We define a simulation study as <em>converged</em> if:</p>
<ol type="1">
<li>Estimation of our key performance measures is stable, and</li>
<li>Monte Carlo error is small enough.</li>
</ol>
</blockquote>
<p>We are going to use the <code>MIsim</code> dataset, which you should be familiar with by now if you’ve been here before, and which comes bundled with the {rsimsum} package:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsimsum)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MIsim"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsimsum"</span>)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(MIsim)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 6 × 4
  dataset method      b    se
    &lt;dbl&gt; &lt;chr&gt;   &lt;dbl&gt; &lt;dbl&gt;
1       1 CC      0.707 0.147
2       1 MI_T    0.684 0.126
3       1 MI_LOGT 0.712 0.141
4       2 CC      0.349 0.160
5       2 MI_T    0.406 0.141
6       2 MI_LOGT 0.429 0.136</code></pre>
</div>
</div>
<p>To find out more about this data, type <code>?MIsim</code> in your R console after loading the {rsimsum} package.</p>
<p>Conveniently, this dataset already includes a column named <code>dataset</code> indexing each repetition of the simulation study. If that were not the case, we would need to create such a column at this stage.</p>
<p>Let’s also load the {tidyverse} package for data wrangling and visualisation with {ggplot2}:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span></code></pre></div></div>
</div>
<p>The first step consists of computing performance measures <em>cumulatively</em>, i.e.&nbsp;using only the first <code>i</code> repetitions (where <code>i</code> goes from 10 to the total number of repetitions that we ran for our study, in this case 1000). We use 10 as the starting point, as we assume that we should run at least 10 repetitions to get any useful result.</p>
<p>This can be easily done using our good ol’ friend the <code>simsum()</code> function and the <code>map_*()</code> family of functions from the {purrr} package; specifically, we use <code>map_dfr()</code> to get a single dataset, obtained by stacking rows, instead of a list object. Notice also that we only focus on <em>bias</em> as the performance measure of interest here; in principle, the same could be done for any other performance measure.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">full_results <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map_dfr</span>(</span>
<span id="cb4-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.x =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,</span>
<span id="cb4-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.f =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(i) {</span>
<span id="cb4-4">    s <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb4-5">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(MIsim, dataset <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;=</span> i),</span>
<span id="cb4-6">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>,</span>
<span id="cb4-7">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb4-8">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,</span>
<span id="cb4-9">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"method"</span>,</span>
<span id="cb4-10">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ref =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CC"</span></span>
<span id="cb4-11">    )</span>
<span id="cb4-12">    s <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(s)</span>
<span id="cb4-13">    results <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rsimsum<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tidy.summary.simsum</span>(s)</span>
<span id="cb4-14">    results <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(results, stat <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb4-15">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">i =</span> i)</span>
<span id="cb4-16">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(results)</span>
<span id="cb4-17">  }</span>
<span id="cb4-18">)</span>
<span id="cb4-19"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(full_results)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>  stat         est       mcse  method       lower      upper  i
1 bias 0.006621292 0.04267397      CC -0.07701815 0.09026074 10
2 bias 0.016931728 0.03830584 MI_LOGT -0.05814635 0.09200980 10
3 bias 0.008187965 0.03047714    MI_T -0.05154612 0.06792205 10
4 bias 0.009806267 0.03873124      CC -0.06610556 0.08571810 11
5 bias 0.021526182 0.03495223 MI_LOGT -0.04697892 0.09003129 11
6 bias 0.016194059 0.02870663    MI_T -0.04006990 0.07245802 11</code></pre>
</div>
</div>
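<p>To make explicit what is being computed at each step: using the first <code>i</code> repetitions, bias is estimated as the mean of the point estimates minus the true value, and its Monte Carlo standard error as the standard deviation of the point estimates divided by the square root of <code>i</code>. Here is a minimal sketch in base R, using made-up estimates for a single method (the <code>b</code> vector below is simulated for illustration, it is <em>not</em> the <code>MIsim</code> data, and <code>by_hand</code> is a hypothetical name):</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">set.seed(20211231)
true &lt;- 0.5
b &lt;- rnorm(1000, mean = true, sd = 0.15) # Hypothetical point estimates

by_hand &lt;- data.frame(
  i = 10:1000,
  # Bias: mean of the first i estimates, minus the true value
  est = sapply(10:1000, function(i) mean(b[1:i]) - true),
  # Monte Carlo standard error of bias, using the first i estimates
  mcse = sapply(10:1000, function(i) sd(b[1:i]) / sqrt(i))
)
head(by_hand)</code></pre></div>
</div>
<p>This is only meant to build some intuition; the <code>simsum()</code> results stored in <code>full_results</code> are what we use from here on.</p>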
<p>We can use these results to plot estimated bias (and corresponding Monte Carlo standard errors) over repetition numbers:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(hrbrthemes) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Provide some nicer themes and colour scales</span></span>
<span id="cb6-2"></span>
<span id="cb6-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(full_results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> est, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> method)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum_rc</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.justification =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Repetition #"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Estimated Bias"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bias over (cumulative) repetition number"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(full_results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> mcse, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> method)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum_rc</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.justification =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Repetition #"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Estimated Monte Carlo S.E."</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Monte Carlo S.E. over (cumulative) repetition number"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<blockquote class="blockquote">
<p>What should we be looking for in these plots?</p>
</blockquote>
<p>Basically, if (and when) the curves <em>flatten</em>. Remember <em>flatten the curve</em>? That applies here as well.</p>
<p>For Monte Carlo standard errors, we also want to check when they fall below a threshold of uncertainty (e.g.&nbsp;0.01) that we are willing to accept. We can add this threshold to the plot to help our intuition:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(full_results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> mcse, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> method)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_hline</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yintercept =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linetype =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum_rc</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.justification =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb8-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Repetition #"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Estimated Monte Carlo S.E."</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Monte Carlo S.E. over (cumulative) repetition number"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>The curves above seem to be stable enough after ~500 repetitions, while the Monte Carlo standard errors drop below our threshold after ~250 repetitions. If I had to interpret this, I would be satisfied with the convergence of the study.</p>
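<p>If you would rather not eyeball the plot, the repetition from which the Monte Carlo standard error stays below the threshold can be computed directly. What follows is a minimal sketch on made-up data: <code>toy_results</code> is a hypothetical stand-in for <code>full_results</code> (same <code>method</code>, <code>i</code>, and <code>mcse</code> columns), with standard errors that simply shrink like <code>1 / sqrt(i)</code>.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical toy data standing in for `full_results`: one row per method
# and cumulative repetition number, with an MCSE shrinking like 1 / sqrt(i)
toy_results &lt;- data.frame(
  method = rep(c("A", "B"), each = 1000),
  i = rep(1:1000, times = 2),
  mcse = c(0.155 / sqrt(1:1000), 0.205 / sqrt(1:1000))
)

threshold &lt;- 0.01

# For each method, find the first repetition from which the MCSE
# stays below the threshold for all remaining repetitions
first_below &lt;- sapply(split(toy_results, toy_results$method), function(d) {
  d &lt;- d[order(d$i), ]
  above &lt;- which(d$mcse &gt;= threshold)
  if (length(above) == 0) return(d$i[1]) # always below the threshold
  if (max(above) == nrow(d)) return(NA_integer_) # never settles below it
  d$i[max(above) + 1]
})
first_below</code></pre></div></div>
</div>
<p>With these toy values, the answer is repetition 241 for method A and 421 for method B. Looking for the <em>last</em> repetition above the threshold (rather than the first one below it) guards against curves that dip below the line and then bounce back up.</p>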
<p>The dataset with results (<code>full_results</code>) includes confidence intervals for estimated bias, at each (cumulative) repetition number, based on Monte Carlo standard errors. We can, therefore, further enhance the first plot introduced above:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(full_results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> est)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_ribbon</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymin =</span> lower, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ymax =</span> upper, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> method), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> method)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum_rc</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.justification =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Repetition #"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Estimated Bias"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Adding confidence intervals surely enhances our perception of <em>stability</em>.</p>
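<p>For intuition on where such a band comes from, here is a minimal sketch that rebuilds cumulative bias, its Monte Carlo standard error, and a normal-based 95% interval from raw point estimates. Everything in it is an assumption made for illustration: the estimates are simulated with <code>rnorm()</code>, the true value is set to 0.5, and the intervals are taken to be of the form estimate &plusmn; 1.96 &times; Monte Carlo S.E., with the Monte Carlo S.E. of bias computed as the standard deviation of the estimates divided by the square root of the number of repetitions.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">set.seed(1)

# Hypothetical point estimates of a parameter whose true value is 0.5
true_value &lt;- 0.5
theta_hat &lt;- rnorm(1000, mean = true_value, sd = 0.2)
n &lt;- seq_along(theta_hat)

# Bias after 1, 2, ..., 1000 (cumulative) repetitions
bias &lt;- cumsum(theta_hat) / n - true_value

# Monte Carlo S.E. of bias: SD of the estimates so far, divided by sqrt(n);
# it is NA for the first repetition, as the SD needs at least two values
mcse &lt;- sapply(n, function(k) sd(theta_hat[seq_len(k)]) / sqrt(k))

# Normal-based 95% confidence interval around the estimated bias
lower &lt;- bias - qnorm(0.975) * mcse
upper &lt;- bias + qnorm(0.975) * mcse

toy_results &lt;- data.frame(i = n, est = bias, mcse = mcse, lower = lower, upper = upper)
tail(toy_results, 3)</code></pre></div></div>
</div>
<p>The band narrows at a rate of one over the square root of the repetition number, which is why the ribbon tightens quickly at first and much more slowly later on.</p>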
<p>Now, we could stop here and call it a day, but where’s the fun in that? Let’s take it to the next level.</p>
<p>I think that an even better way of assessing convergence is to check whether <em>the incremental value of an additional repetition</em> affects the results of the study (or not). When running additional repetitions stops adding value (i.e.&nbsp;it no longer changes the results), then we can (safely?) assume we have converged to stable results.</p>
<p>Let’s, therefore, calculate this difference (i.e.&nbsp;bias at the i<sup>th</sup> iteration versus bias at the (i-1)<sup>th</sup> iteration), and let’s do it for both estimated bias and Monte Carlo standard error:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">full_results <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>(full_results, method, i) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group_by</span>(method) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%&gt;%</span></span>
<span id="cb10-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(</span>
<span id="cb10-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lag_est =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lag</span>(est),</span>
<span id="cb10-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lag_mcse =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lag</span>(mcse),</span>
<span id="cb10-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">diff_est =</span> est <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> lag_est,</span>
<span id="cb10-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">diff_mcse =</span> mcse <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> lag_mcse</span>
<span id="cb10-8">  )</span></code></pre></div></div>
</div>
<p>This <em>difference</em> is what we now decide to plot:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">p1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(full_results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> diff_est, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> method)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum_rc</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.justification =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb11-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Repetition #"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Difference"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expression</span>(Bias <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> difference<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="er" style="color: #AD0000;
background-color: null;
font-style: inherit;">~</span> i<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>th <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> (i <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>th))</span>
<span id="cb11-7">p1</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">p2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(full_results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> diff_mcse, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> method)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb12-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb12-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_color_ipsum</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb12-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum_rc</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb12-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.justification =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb12-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Repetition #"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Difference"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Method"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">title =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expression</span>(Monte <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> Carlo <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> S.E. <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> difference<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="er" style="color: #AD0000;
background-color: null;
font-style: inherit;">~</span> i<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>th <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> (i <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>th))</span>
<span id="cb12-7">p2</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-11-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We save both plot objects <code>p1</code> and <code>p2</code>, as we might be using them again later on. What we want here is for the curves to flatten around zero, which seems to be the case for this example.</p>
<p>The y-scale of the plot is highly influenced by larger differences early on, so we could decide to focus on a narrower range around zero, e.g.&nbsp;for point estimates:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_cartesian</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ylim =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.003</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.003</span>))</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/12/31/index_files/figure-html/unnamed-chunk-12-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>With this, we can confirm what we suspected a few plots ago: that <em>yes, the incremental value of running more than ~500 repetitions is somewhat marginal here</em>.</p>
<p>To wrap up, there is one final thing I would like to mention here: all we did is <em>post-hoc</em>. We have already run the study, hence all we get to do is check whether the results are stable and precise enough (in our loose terms defined above).</p>
<p>The implication is an interesting one, in my opinion: you <em>do not</em> need to run more repetitions than necessary, especially if each repetition is expensive (computationally speaking).</p>
<p>What you could do, if you really wanted to, is to define a <em>stopping rule</em> such as:</p>
<ul>
<li><p>After N (e.g.&nbsp;10) consecutive iterations with a difference below a certain threshold (for the key performance measure), stop the study;</p></li>
<li><p>After you reach a given precision for bias (in terms of Monte Carlo error), stop the study;</p></li>
<li><p>You name it.</p></li>
</ul>
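<p>As a rough illustration of the second rule, here is a minimal sketch of a simulation loop that stops as soon as the Monte Carlo standard error of bias drops below a target. All names and values are hypothetical: <code>one_repetition()</code> is a placeholder for "simulate a dataset, fit a model, return the point estimate", and <code>min_reps</code> and <code>max_reps</code> are safeguards I added so that the loop neither stops on a handful of repetitions nor runs forever.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">set.seed(42)

true_value &lt;- 0.5
target_mcse &lt;- 0.01 # precision we are willing to accept
min_reps &lt;- 10 # do not stop on very few repetitions
max_reps &lt;- 5000 # hard cap on the number of repetitions

# Hypothetical placeholder for a single repetition of the study
one_repetition &lt;- function() rnorm(1, mean = true_value, sd = 0.2)

estimates &lt;- numeric(0)
for (i in seq_len(max_reps)) {
  estimates &lt;- c(estimates, one_repetition())
  if (i &gt;= min_reps) {
    # Monte Carlo S.E. of bias, based on all repetitions so far
    mcse &lt;- sd(estimates) / sqrt(i)
    if (mcse &lt; target_mcse) break
  }
}

n_used &lt;- length(estimates)
bias &lt;- mean(estimates) - true_value
c(n_used = n_used, bias = bias, mcse = mcse)</code></pre></div></div>
</div>
<p>The first rule could be sketched in much the same way, by keeping a counter of consecutive repetitions for which the change in estimated bias stays below a given tolerance and breaking out of the loop once that counter reaches N.</p>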
<p>Of course, there are other things to consider such as the sequentiality of repetitions, especially if there are missing values (e.g.&nbsp;due to non-convergence of some iterations) – this is not meant to be, in any way, a comprehensive take on the topic. Feel free to reach out, as always, if you have any comments. Nevertheless, I think I will get back to this topic eventually, so stay tuned for that.</p>
<p>That’s all from me for today, then it must be closing time: talk to you soon and, in the meantime, take care!</p>



 ]]></description>
  <category>rsimsum</category>
  <category>simulation</category>
  <category>convergence</category>
  <guid>https://www.ellessenne.xyz/blog/2021/12/31/</guid>
  <pubDate>Thu, 30 Dec 2021 23:00:00 GMT</pubDate>
</item>
<item>
  <title>rsimsum 0.11.0</title>
  <link>https://www.ellessenne.xyz/blog/2021/10/26/</link>
  <description><![CDATA[ 




<p>Hello!</p>
<p>It’s been a while since the last post on this website… don’t worry (I am sure you didn’t), I’m still here, just been busy with a bunch of life- and work-related things.</p>
<p>This post is to introduce the latest release of the {rsimsum} R package, version 0.11.0, which landed on <a href="https://CRAN.R-project.org/package=rsimsum">CRAN</a> last week, on October 20<sup>th</sup>.</p>
<p>This is a minor release, with some bug fixes and (more interestingly) the introduction of a new feature that was suggested in <a href="https://github.com/ellessenne/rsimsum/issues/22">#22</a> on GitHub by <a href="https://github.com/ge-li">Li Ge</a>: <code>print()</code> methods for summary objects now invisibly return the formatted tables that are printed to the console.</p>
<blockquote class="blockquote">
<p>Ok, but why should I care about that?</p>
</blockquote>
<p>It’s simple: you can print a subset of results (e.g.&nbsp;for a presentation) much more easily, as the internals of {rsimsum} will deal with some of the formatting for you.</p>
<p>Here’s an example, using the <code>MIsim</code> dataset that comes bundled with the package:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsimsum)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MIsim"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsimsum"</span>)</span></code></pre></div></div>
</div>
<p>We summarise this simulation study as shown in the documentation <a href="https://ellessenne.github.io/rsimsum/reference/simsum.html">here</a>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">s <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simsum</span>(</span>
<span id="cb2-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> MIsim,</span>
<span id="cb2-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">estvarname =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"b"</span>,</span>
<span id="cb2-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">se =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"se"</span>,</span>
<span id="cb2-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">true =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,</span>
<span id="cb2-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">methodvar =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"method"</span>,</span>
<span id="cb2-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ref =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"CC"</span></span>
<span id="cb2-8">)</span>
<span id="cb2-9">sums <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(s, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stats =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bias"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"cover"</span>))</span>
<span id="cb2-10">sums</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Values are:
    Point Estimate (Monte Carlo Standard Error)

Bias in point estimate:
              CC         MI_LOGT             MI_T
 0.0168 (0.0048) 0.0009 (0.0042) -0.0012 (0.0043)

Coverage of nominal 95% confidence interval:
              CC         MI_LOGT            MI_T
 0.9430 (0.0073) 0.9490 (0.0070) 0.9430 (0.0073)</code></pre>
</div>
</div>
<p>This is the standard workflow, and we focus on bias and coverage probability for simplicity. At this point, we could copy-paste a subset of the above results in our slides-making tool of choice.</p>
<p>However, that’s not particularly user-friendly, nor easily reproducible (e.g.&nbsp;if you run more iterations and need to update your results). Here’s where invisibly-returned formatted tables come in handy:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">output <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>(sums)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Values are:
    Point Estimate (Monte Carlo Standard Error)

Bias in point estimate:
              CC         MI_LOGT             MI_T
 0.0168 (0.0048) 0.0009 (0.0042) -0.0012 (0.0043)

Coverage of nominal 95% confidence interval:
              CC         MI_LOGT            MI_T
 0.9430 (0.0073) 0.9490 (0.0070) 0.9430 (0.0073)</code></pre>
</div>
</div>
<p>Note here that the output of <code>sums</code> is printed once again, but it is also stored in a variable named <code>output</code>, which contains the formatted tables for each summary statistic of interest:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str</span>(output)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>List of 2
 $ Bias in point estimate                     :'data.frame':    1 obs. of  3 variables:
  ..$ CC     : chr "0.0168 (0.0048)"
  ..$ MI_LOGT: chr "0.0009 (0.0042)"
  ..$ MI_T   : chr "-0.0012 (0.0043)"
 $ Coverage of nominal 95% confidence interval:'data.frame':    1 obs. of  3 variables:
  ..$ CC     : chr "0.9430 (0.0073)"
  ..$ MI_LOGT: chr "0.9490 (0.0070)"
  ..$ MI_T   : chr "0.9430 (0.0073)"</code></pre>
</div>
</div>
<p>With this data at our disposal, we can finally print a better table using the general-purpose <code>kable()</code> function from the {knitr} package:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">w <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> output<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Bias in point estimate</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span></span>
<span id="cb8-2">knitr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kable</span>(</span>
<span id="cb8-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> w,</span>
<span id="cb8-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">align =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"c"</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ncol</span>(w))</span>
<span id="cb8-5">)</span></code></pre></div></div>
<div class="cell-output-display">
<table class="caption-top table table-sm table-striped small">
<thead>
<tr class="header">
<th style="text-align: center;">CC</th>
<th style="text-align: center;">MI_LOGT</th>
<th style="text-align: center;">MI_T</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: center;">0.0168 (0.0048)</td>
<td style="text-align: center;">0.0009 (0.0042)</td>
<td style="text-align: center;">-0.0012 (0.0043)</td>
</tr>
</tbody>
</table>
</div>
</div>
<p>Of course, we would need to further improve on this for <em>production</em> (e.g.&nbsp;it now spans the whole width of the enclosing <code>&lt;div&gt;</code>, so we might want to style it and resize it accordingly using CSS), but it is already a clear improvement. Most interestingly, consider including all of the above in a dynamically updated slide deck (e.g.&nbsp;created using the {xaringan} package): much better than before, with very little extra work, and plenty of room for further adjustments if you wish. Not bad!</p>
<p>Finally, the example above is for the <code>summary.simsum()</code> method: this is implemented for multiple estimands as well (in the <code>summary.multisimsum()</code> function), with an additional layer of nesting for the formatted output. I’m sure it’ll be straightforward to figure that out if you want to try it out; let me know if it isn’t.</p>
<p>That’s all for now, hope you find this useful and if you have further suggestions for features you’d like to see in {rsimsum}, don’t hesitate to get in touch or to <a href="https://github.com/ellessenne/rsimsum/issues">open an issue on GitHub</a>.</p>



 ]]></description>
  <category>rstats</category>
  <category>rsimsum</category>
  <category>release</category>
  <guid>https://www.ellessenne.xyz/blog/2021/10/26/</guid>
  <pubDate>Mon, 25 Oct 2021 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Simulation and Twitter riddles</title>
  <link>https://www.ellessenne.xyz/blog/2021/04/23/</link>
  <description><![CDATA[ 




<p>Hi everyone!</p>
<p>Earlier today I stumbled upon this tweet by <a href="https://twitter.com/seanjtaylor">Sean J. Taylor</a>:</p>
<div class="container d-flex align-items-center justify-content-center">
  <blockquote class="twitter-tweet blockquote">
    <p lang="en" dir="ltr">Without doing the math or looking it up, approximately how many coin flips do you need to observe in order to be 95% confident you know Prob(Heads) to within +/- 1%?</p>— Sean J. Taylor (@seanjtaylor) <a href="https://twitter.com/seanjtaylor/status/1385321273979379712?ref_src=twsrc%5Etfw">April 22, 2021</a>
  </blockquote>
  <script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</div>
<p>…and I asked myself:</p>
<blockquote class="blockquote">
<p>Can we answer this using simulation?</p>
</blockquote>
<p>The answer is that <em>yes, yes we can</em>! Let’s see how we can do that using R (of course). Looks like this is now primarily a statistical simulation blog, but hey…</p>
<p>Let’s start by loading some packages and setting an RNG seed for reproducibility:</p>
<div class="cell" data-hash="index_cache/html/unnamed-chunk-2_45c9fde6f0cd90aafac36fae919dce97">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidyverse)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(hrbrthemes)</span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3756</span>)</span></code></pre></div></div>
</div>
<p>Then, we define a function that we will use to replicate this experiment:</p>
<div class="cell" data-hash="index_cache/html/unnamed-chunk-3_f8fba42c89a7933c807654b7afc19502">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">simfun <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(i, .prob, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.diff =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.02</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.N =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100000</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.alpha =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>) {</span>
<span id="cb2-2">  draw <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rbinom</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> .N, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prob =</span> .prob)</span>
<span id="cb2-3">  df <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tibble</span>(</span>
<span id="cb2-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(.N),</span>
<span id="cb2-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cumsum</span>(draw),</span>
<span id="cb2-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mean =</span> x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> n,</span>
<span id="cb2-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lower =</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">qf</span>(.alpha <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))))<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb2-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">upper =</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> x) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> ((x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">qf</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> .alpha <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (n <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> x))))<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb2-9">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> upper <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> lower</span>
<span id="cb2-10">  )</span>
<span id="cb2-11">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tibble</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">i =</span> i, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prob =</span> .prob, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">diff =</span> .diff, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">min</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">which</span>(df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>width <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;=</span> .diff)))</span>
<span id="cb2-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(out)</span>
<span id="cb2-13">}</span></code></pre></div></div>
</div>
<p>This function:</p>
<ol type="1">
<li><p>Repeats the experiment for 1 to 100,000 (<code>.N</code>) draws from a binomial distribution (i.e.&nbsp;our coin toss), with success probability <code>.prob</code>;</p></li>
<li><p>Estimates the cumulative proportion of heads (our ones) for each number of draws;</p></li>
<li><p>Estimates confidence intervals using the <a href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Clopper–Pearson_interval">exact formula</a>;</p></li>
<li><p>Calculates the width of the confidence interval. For a +/- 1% precision, that corresponds to a width of 2% (or 0.02, depending on the scale being used);</p></li>
<li><p>Finally, returns the first number of draws where the width is 0.02 (or less). This will tell us how many draws we needed to get a confidence interval that is narrow enough for our purpose.</p></li>
</ol>
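<p>As a quick sanity check of step 3, the F-quantile expressions used inside <code>simfun()</code> can be compared with the exact interval returned by <code>binom.test()</code> from base R. This is only a sketch: the values of <code>x</code> and <code>n</code> below (40 heads out of 100 tosses) are made up for illustration.</p>

```r
# Hypothetical example values: 40 heads out of 100 tosses
x <- 40
n <- 100
alpha <- 0.05

# Clopper-Pearson limits via F quantiles, written as in simfun()
lower <- (1 + (n - x + 1) / (x * qf(alpha / 2, 2 * x, 2 * (n - x + 1))))^(-1)
upper <- (1 + (n - x) / ((x + 1) * qf(1 - alpha / 2, 2 * (x + 1), 2 * (n - x))))^(-1)

# Exact (Clopper-Pearson) interval as computed by base R
exact <- as.numeric(binom.test(x = x, n = n, conf.level = 1 - alpha)$conf.int)

# Compare the two sets of limits, up to numerical tolerance
all.equal(c(lower, upper), exact)
```

<p>The two agree because <code>binom.test()</code> computes the same limits using beta quantiles, and the F-based formula is an equivalent way of writing them.</p>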
<p>Then, we run the experiment <code>B = 200</code> times, with different values of <code>.prob</code> (as we want to show the required sample sizes over different success probabilities):</p>
<div class="cell" data-hash="index_cache/html/unnamed-chunk-4_27ef0af01630b660a068ff4ac72c9e3e">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">B <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span></span>
<span id="cb3-2"></span>
<span id="cb3-3">results <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map_dfr</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(B), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.f =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(j) {</span>
<span id="cb3-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simfun</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">i =</span> j, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.prob =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb3-5">})</span></code></pre></div></div>
</div>
<p>Let’s plot the results (using a LOESS smoother) versus the success probability:</p>
<div class="cell" data-layout-align="center" data-hash="index_cache/html/unnamed-chunk-5_6d25b59587b9ba3c3385f2bb5fd4d530">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(results, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> prob, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_smooth</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"loess"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#575FCF"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> scales<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>percent) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> scales<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>comma) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_cartesian</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xlim =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_ipsum</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_family =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Inconsolata"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot.margin =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unit</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lines"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Success probability"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Required sample size"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/04/23/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Looks like — as one would expect — the largest sample size required is for a success probability of 50%. What is the solution to the riddle then? As any statistician would confirm, <em>it depends</em>: if the coin is <em>fair</em>, then we would need approximately 10,000 tosses. Otherwise, the required sample size can be much smaller, even a tenth. Cool!</p>
<p>Finally, let’s repeat the experiment for a precision of +/- 3%:</p>
<div class="cell" data-hash="index_cache/html/unnamed-chunk-6_e9784e0b6b163421bc9df0fc29133beb">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">results_3p <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map_dfr</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(B), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.f =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(j) {</span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">simfun</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">i =</span> j, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.diff =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.06</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.prob =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb5-3">})</span></code></pre></div></div>
</div>
<p>Comparing with the previous results (code to produce the plot omitted for simplicity):</p>
<div class="cell" data-layout-align="center" data-hash="index_cache/html/unnamed-chunk-7_659c82aecbe63bea78d46bad2abb7677">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/04/23/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>For a required precision of +/- 3% the sample size that we would need is much smaller, but the shape of the curve (i.e.&nbsp;versus the success probability) remains the same.</p>
<p>Finally, the elephant in the room: of course we could answer this using our old friend <em>mathematics</em>, or with a bit of Bayesian thinking. But where’s the fun in that? This simulation approach lets us play around with parameters and assumptions (e.g.&nbsp;what would happen if we assume a different confidence level <code>.alpha</code>?), and it’s quite intuitive too. And yeah, let’s be honest: programming and running the experiment is just <em>much</em> more fun!</p>
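<p>For the curious, this is roughly what the <em>mathematics</em> route could look like. It is only a minimal sketch, assuming a Wald-type confidence interval (which is not necessarily the interval used in the simulation); <code>n_wald()</code> is a hypothetical helper introduced here for illustration, and <code>.width</code> is the full width of the interval (e.g.&nbsp;0.06 for +/- 3%):</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper: smallest n such that a Wald interval for a
# proportion has a total width of at most .width (Wald interval assumed)
n_wald &lt;- function(.prob, .width, .alpha = 0.05) {
  z &lt;- qnorm(1 - .alpha / 2) # standard normal quantile
  ceiling(z^2 * .prob * (1 - .prob) / (.width / 2)^2)
}

# Fair coin, precision of +/- 3%
n_fair &lt;- n_wald(.prob = 0.5, .width = 0.06)
n_fair

# Required sample size across a few success probabilities
n_wald(.prob = c(0.1, 0.3, 0.5, 0.7, 0.9), .width = 0.06)</code></pre></div>
</div>
<p>Under these assumptions, a fair coin and a precision of +/- 3% would require 1,068 tosses, and the required sample size peaks at a success probability of 50%.</p>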
<p><em>As always, please do point out all my errors on <a href="https://twitter.com/ellessenne">Twitter</a>. Cheers!</em></p>



 ]]></description>
  <category>rstats</category>
  <category>binomial</category>
  <category>simulation</category>
  <category>statistics</category>
  <guid>https://www.ellessenne.xyz/blog/2021/04/23/</guid>
  <pubDate>Thu, 22 Apr 2021 22:00:00 GMT</pubDate>
</item>
<item>
  <title>A short ode to simulation</title>
  <link>https://www.ellessenne.xyz/blog/2021/02/26/</link>
  <description><![CDATA[ 




<p>I have been slowly catching up with talks from <a href="https://rstudio.com/conference/">rstudio::global</a>, and I was so impressed with a couple of presentations on statistical simulation that I had to write something about it!</p>
<blockquote class="blockquote">
<p>Yes, you read it right, you’re not dreaming: statistical simulation is now cool!</p>
</blockquote>
<p>As you might know, I am also a strong supporter of statistical simulation. Traditionally you might use it to test a new statistical method you are developing, or to compare different methods in unusual settings. You might also want to try to see where, when, and if a method breaks in edge cases.</p>
<p>However, two other (very important) use-cases are highlighted in the talks that I mentioned above:</p>
<ol type="1">
<li><p>Simulation for learning and teaching,</p></li>
<li><p>Simulation to drive agile infrastructures.</p></li>
</ol>
<p>The first scenario is described very well by <a href="https://twitter.com/ChelseaParlett">Chelsea Parlett-Pelleriti</a>:</p>
<div class="container d-flex align-items-center justify-content-center">
  <iframe width="560" height="315" src="https://www.youtube.com/embed/qaU2jXW2xcE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="">
  </iframe>
</div>
<p>There are three main take-home messages:</p>
<ol type="1">
<li><p>Statistical simulation encourages exploration,</p></li>
<li><p>Tests intuition,</p></li>
<li><p>And empowers a deeper understanding of complex statistical methods.</p></li>
</ol>
<p>I stand by all of these points.</p>
<p>In fact, that’s a great and accessible way to learn: by trying, and trying, and trying again until you finally get it. And if you follow a simulation exercise, it’s even easier: you can modify the parameters at will (starting from a solid foundation), explore, and follow your intuition. I mean, isn’t this the <em>scientific method</em> all along?</p>
<p>The second talk is by <a href="https://twitter.com/richard_vogg">Richard Vogg</a>:</p>
<div class="container d-flex align-items-center justify-content-center">
  <iframe width="560" height="315" src="https://www.youtube.com/embed/_ahUlw0HzSc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="">
  </iframe>
</div>
<p>During the talk, he gives a couple of examples that highlight the importance of being able to <em>compose data</em> at will:</p>
<ol type="1">
<li><p>When you cannot share the real dataset (e.g.&nbsp;for privacy reasons), it is useful to have <em>real-like</em> data that can be shared more freely. There’s a decent amount of research (and growing interest) on the topic, see e.g.&nbsp;<a href="https://elifesciences.org/articles/53275">this paper</a> by <a href="https://twitter.com/dsquintana">Dan Quintana</a>;</p></li>
<li><p>When you don’t have the data you’re supposed to be building pipelines for (yet), it is useful to have data that is similar to what you expect to receive. That gives you a head start on e.g.&nbsp;prototyping;</p></li>
<li><p>When you’re running internal training courses for staff, it is useful to use datasets that resemble what you actually work with (e.g.&nbsp;transactions, clients, etc.).</p></li>
</ol>
<p>…the more you think about it, the more you realise how useful that can be!</p>
<p>In conclusion, both talks are excellent, straight to the point, and engaging. I recommend you give them a look: it will (literally) take just 10 minutes of your time.</p>
<p>See you soon!</p>



 ]]></description>
  <category>simulation</category>
  <category>statistics</category>
  <category>learning</category>
  <guid>https://www.ellessenne.xyz/blog/2021/02/26/</guid>
  <pubDate>Thu, 25 Feb 2021 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Simulating survival times from a mixture cure model</title>
  <link>https://www.ellessenne.xyz/blog/2021/01/17/</link>
  <description><![CDATA[ 




<p>Last Friday I was talking with a friend about simulating survival times from an experiment where most (if not all) events are happening within a short amount of time from inception. In other words, an experiment where the survival probability drops rapidly during the first days, staying then approximately flat until the end of the study.</p>
<p>This made me think about mixture cure fraction models, that is, models of the kind:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AS(t)%20=%20%5Cpi%20+%20(1%20-%20%5Cpi)%20S_u(t)%0A"></p>
<p>with <img src="https://latex.codecogs.com/png.latex?%5Cpi"> being the proportion cured and <img src="https://latex.codecogs.com/png.latex?S_u(t)"> the survival function for the uncured subjects. Assuming <img src="https://latex.codecogs.com/png.latex?S_u(t)"> follows an exponential distribution (for simplicity and from now onward), this corresponds to a survival curve like the following:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/01/17/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>where <img src="https://latex.codecogs.com/png.latex?%5Cpi%20=%200.1">. With this model, the cure fraction creates an asymptote for the survival function, as <img src="https://latex.codecogs.com/png.latex?100%20%5Ctimes%20%5Cpi">% of subjects are deemed to be <em>cured</em>; of course, <img src="https://latex.codecogs.com/png.latex?%5Cpi"> can be estimated from data. Interpretation of mixture cure fraction models is awkward, as one is assuming that subjects are split into <em>cured</em> and <em>uncured</em> at the beginning of the follow-up (which might not be the most realistic assumption), but discussing cure models is outside the scope of this blog post (and there are many papers you could find on the topic).</p>
<blockquote class="blockquote">
<p>Back to the question, how to simulate from this survival function?</p>
</blockquote>
<p>Well, as with most things these days, we can use the inversion method! First, we actually have to define which individuals are cured and which are not. To do so, we draw from a Bernoulli distribution with success probability <img src="https://latex.codecogs.com/png.latex?%5Cpi">:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4875</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># for reproducibility</span></span>
<span id="cb1-2">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100000</span></span>
<span id="cb1-3">pi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span></span>
<span id="cb1-4">cured <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rbinom</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prob =</span> pi)</span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">prop.table</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">table</span>(cured))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>cured
      0       1 
0.80074 0.19926 </code></pre>
</div>
</div>
<p>Then, we simulate survival times e.g.&nbsp;from an exponential distribution (with <img src="https://latex.codecogs.com/png.latex?%5Clambda%20=%200.5">). We use the formulae described in <a href="https://onlinelibrary.wiley.com/doi/10.1002/sim.2059">Bender <em>et al.</em></a> (which is in fact the inversion method applied to survival functions):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">lambda <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb3-2">u <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n)</span>
<span id="cb3-3">S <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(u) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> lambda</span></code></pre></div></div>
</div>
<p>Finally, we assign to cured individuals an infinite survival time (as they will not experience the event ever):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">S[cured <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">Inf</span></span></code></pre></div></div>
</div>
<p>Of course, now we are required to censor individuals, say after 20 years of follow-up:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">status <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> S <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span></span>
<span id="cb5-2">S <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pmin</span>(S, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>)</span>
<span id="cb5-3">df <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(S, status)</span></code></pre></div></div>
</div>
<p>…and there you have it, it’s done.</p>
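<p>If you need to repeat this many times (say, within a simulation study), the steps above could be wrapped into a single function. This is just a sketch: <code>sim_cure()</code> and its argument names are hypothetical, introduced here for illustration only:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical wrapper around the steps above (names are illustrative)
sim_cure &lt;- function(n, pi, lambda, maxt) {
  cured &lt;- rbinom(n = n, size = 1, prob = pi) # 1 = cured, 0 = uncured
  S &lt;- -log(runif(n = n)) / lambda # exponential times via inversion
  S[cured == 1] &lt;- Inf # cured subjects never experience the event
  status &lt;- S &lt;= maxt # TRUE if the event is observed before maxt
  data.frame(S = pmin(S, maxt), status = status)
}

set.seed(4875) # for reproducibility
df2 &lt;- sim_cure(n = 1000, pi = 0.2, lambda = 0.5, maxt = 20)
mean(!df2$status) # proportion censored, expected to be close to pi here</code></pre></div>
</div>
<p>With a long follow-up relative to the event rate, almost all censored subjects are cured ones, so the proportion censored should be close to the cure fraction.</p>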
<p>What we want to do next is to check that we are actually simulating from the model we think we are simulating from. Let’s first check by fitting a Kaplan-Meier curve:</p>
<div class="cell" data-layout-align="center">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(survival)</span>
<span id="cb6-2">KM <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">survfit</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Surv</span>(S, status) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> df)</span>
<span id="cb6-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(KM)</span>
<span id="cb6-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">abline</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">h =</span> pi, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2021/01/17/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>…which looks similar to the theoretical curve depicted above. Good!</p>
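<p>To go beyond eyeballing, we could also compare the Kaplan-Meier estimates with the theoretical survival function at a few time points. The following is a self-contained sketch that repeats the simulation steps; the time points in <code>tt</code> are arbitrary choices made for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(survival)

# Repeat the simulation steps from above
set.seed(4875)
n &lt;- 100000
pi &lt;- 0.2
lambda &lt;- 0.5
cured &lt;- rbinom(n = n, size = 1, prob = pi)
S &lt;- -log(runif(n = n)) / lambda
S[cured == 1] &lt;- Inf
df &lt;- data.frame(S = pmin(S, 20), status = S &lt;= 20)

# Kaplan-Meier estimates versus the theoretical curve
KM &lt;- survfit(Surv(S, status) ~ 1, data = df)
tt &lt;- c(1, 2, 5, 10) # arbitrary time points, for illustration
check &lt;- cbind(
  time = tt,
  km = summary(KM, times = tt)$surv,
  theoretical = pi + (1 - pi) * exp(-lambda * tt)
)
check</code></pre></div>
</div>
<p>The two columns should agree closely, up to Monte Carlo error.</p>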
<p>Then, we could try to fit the data-generating model using maximum likelihood. The density function of the cure model is defined as</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Af(t)%20=%20%5Cfrac%7B%5Cpartial%20F(t)%7D%7B%5Cpartial%20t%7D%20=%20(1%20-%20%5Cpi)%20%5Ctimes%20f_u(t)%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?F(t)%20=%201%20-%20S(t)"> and <img src="https://latex.codecogs.com/png.latex?f_u(t)"> is the density function for the non-cured subjects (e.g.&nbsp;from the exponential distribution, from our example above). Therefore, the likelihood contribution for the i<sup>th</sup> subject can be defined as:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AL_i(%5Ctheta%20%7C%20t_i,%20d_i)%20=%20f(t_i)%5E%7Bd_i%7D%20%5Ctimes%20S(t_i)%5E%7B1%20-%20d_i%7D%0A"></p>
<p>In R, that (actually, the log-likelihood) can be defined as:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">log_likelihood <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(par, t, d) {</span>
<span id="cb7-2">  lambda <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(par[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb7-3">  pi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> boot<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inv.logit</span>(par[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])</span>
<span id="cb7-4">  f <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> pi) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> lambda <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>lambda <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> t)</span>
<span id="cb7-5">  S <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> pi <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> pi) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>lambda <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> t)</span>
<span id="cb7-6">  L <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> (f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>d) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (S<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> d))</span>
<span id="cb7-7">  l <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(L)</span>
<span id="cb7-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">return</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(l))</span>
<span id="cb7-9">}</span></code></pre></div></div>
</div>
<p>Some comments on the above function:</p>
<ul>
<li><p>I chose to model the parameter <img src="https://latex.codecogs.com/png.latex?%5Clambda"> on the log-scale, and <img src="https://latex.codecogs.com/png.latex?%5Cpi"> on the logit scale. By doing so, we don’t have to constrain the optimisation routine to respect the boundaries of the parameters, as <img src="https://latex.codecogs.com/png.latex?%5Clambda%20%3E%200"> and <img src="https://latex.codecogs.com/png.latex?0%20%5Cle%20%5Cpi%20%5Cle%201">;</p></li>
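<li><p><code>f</code> contains the density function of the cure model, that is <img src="https://latex.codecogs.com/png.latex?1%20-%20%5Cpi"> times the density function of an exponential distribution (which is <img src="https://latex.codecogs.com/png.latex?h(t)%20%5Ctimes%20S(t)">);</p></li>
<li><p><code>S</code> contains the survival function of the cure model, with an exponential survival function for the uncured subjects;</p></li>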
<li><p>I take the log of the likelihood function (which generally behaves better when optimising numerically) and return the negative log-likelihood value (as R’s <code>optim</code> minimises the objective function by default).</p></li>
</ul>
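<p>As a side note, one could also sum the log-contributions directly rather than taking the log of the product, which might behave better numerically when the density gets very small. This is an alternative sketch, not the function used in this post: <code>log_likelihood_alt()</code> is a hypothetical name, and it uses <code>plogis()</code> from base R as the inverse logit:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Alternative sketch: work with log(f) and log(S) directly, avoiding the
# product f^d * S^(1 - d), which could underflow for very small densities
log_likelihood_alt &lt;- function(par, t, d) {
  lambda &lt;- exp(par[1])
  pi &lt;- plogis(par[2]) # inverse logit, from base R
  log_f &lt;- log1p(-pi) + log(lambda) - lambda * t # log-density, cure model
  log_S &lt;- log(pi + (1 - pi) * exp(-lambda * t)) # log-survival, cure model
  # Events contribute log(f), censored observations contribute log(S)
  -sum(ifelse(d, log_f, log_S))
}</code></pre></div>
</div>
<p>It takes the same arguments as the function above, so it could be passed to <code>optim</code> in the same way.</p>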
<p>Now, all we have to do is define some starting values <code>stpar</code> and then run <code>optim</code>.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">stpar <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ln_lambda =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">logit_pi =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb8-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optim</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> stpar, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fn =</span> log_likelihood, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">t =</span> df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>S, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">d =</span> df<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>status, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"L-BFGS-B"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>$par
 ln_lambda   logit_pi 
-0.6945243 -1.3909736 

$value
[1] 185584.3

$counts
function gradient 
       8        8 

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F &lt;= FACTR*EPSMCH"</code></pre>
</div>
</div>
<p>The true values were:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(lambda)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] -0.6931472</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">boot<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logit</span>(pi)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] -1.386294</code></pre>
</div>
</div>
<p>…which (once again) are close enough to the fitted values. It does seem like we are actually simulating from the mixture cure model described above.</p>
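<p>The comparison is perhaps easier to read on the natural scale. A quick sketch, copying the fitted values from the <code>optim</code> output by hand and using <code>plogis()</code> from base R as the inverse logit:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Fitted values, copied from the optim() output above
est &lt;- c(ln_lambda = -0.6945243, logit_pi = -1.3909736)

# Back-transform to the natural scale
bt &lt;- c(lambda = exp(est[["ln_lambda"]]), pi = plogis(est[["logit_pi"]]))
bt</code></pre></div>
</div>
<p>That is approximately 0.499 for the rate and 0.199 for the cure fraction, against true values of 0.5 and 0.2, respectively.</p>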
<p>In conclusion, obviously this might not fit the actual experiment too well and therefore we might have to try other approaches before considering our work done. For instance, we could simulate from a hazard function that goes to zero after a given amount of time <img src="https://latex.codecogs.com/png.latex?t%5E*">: this approach can easily be implemented using the <a href="https://CRAN.R-project.org/package=simsurv">{simsurv}</a> package. <em>The proof is left as an exercise for the interested reader</em>.</p>
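<p>As a small hint, the simplest version of that idea (a constant hazard that drops to zero at a change-point) can also be sketched in base R with the inversion method, without {simsurv}. The change-point <code>tstar</code> and its value are hypothetical choices for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">set.seed(4875) # for reproducibility
n &lt;- 100000
lambda &lt;- 0.5
tstar &lt;- 3 # hypothetical change-point, chosen for illustration

# Hazard: lambda before tstar, zero afterwards, so that the cumulative
# hazard is H(t) = lambda * min(t, tstar)
u &lt;- runif(n = n)
S &lt;- -log(u) / lambda # invert the cumulative hazard before tstar...
S[S &gt; tstar] &lt;- Inf # ...and no events can happen after tstar

mean(is.infinite(S)) # proportion never experiencing the event
exp(-lambda * tstar) # its theoretical value, S(tstar)</code></pre></div>
</div>
<p>In this simple setting the survival function is flat after the change-point, so the proportion of subjects never experiencing the event plays the role of the cure fraction.</p>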
<p>Cheers!</p>



 ]]></description>
  <category>rstats</category>
  <category>survival-analysis</category>
  <category>simulation</category>
  <category>cure</category>
  <guid>https://www.ellessenne.xyz/blog/2021/01/17/</guid>
  <pubDate>Sat, 16 Jan 2021 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Will R work on Apple Silicon?</title>
  <link>https://www.ellessenne.xyz/blog/2020/11/18/</link>
  <description><![CDATA[ 




<p>Apple recently released new <em>entry level</em> devices using Apple-designed chips based on the ARM64 architecture, the so-called <em>Apple Silicon</em> that was <a href="https://en.wikipedia.org/wiki/Mac_transition_to_Apple_Silicon">introduced at WWDC earlier this year</a>.</p>
<p>Reviews just started coming out, and everyone seems to be praising the performance of both optimised and translated software running on these new low-powered devices. Synthetic benchmarks and real-life tests are matching the performance of e.g.&nbsp;a MacBook Air to a 2019 16” MacBook Pro or even a 2017 iMac Pro. Which is mad!</p>
<section id="should-i-just-run-out-and-buy-a-new-mac" class="level2">
<h2 class="anchored" data-anchor-id="should-i-just-run-out-and-buy-a-new-mac">Should I just run out and buy a new Mac?</h2>
<p>As an R user, should you just run out and buy a new macOS device with Apple Silicon? Well, not so fast, tiger! A <a href="https://developer.r-project.org/Blog/public/2020/11/02/will-r-work-on-apple-silicon/index.html">post on the R-developer blog</a> dives deeper into the current status of R on Apple Silicon:</p>
<ul>
<li><p>R seems to be running fine through the translation layer, but of course that is not optimised and performance <em>should</em> be worse than running natively;</p></li>
<li><p>R can already run on ARM32/ARM64 devices. Heck, it can even run on a Raspberry Pi Zero with a single-core 1 GHz CPU and 512 MB of RAM! However, before it can be compiled for Apple Silicon, the whole stack of compilers will need to be updated/ported to the new architecture; work seems to be underway, so I believe it won’t take long;</p></li>
<li><p>There are inconsistencies with how <code>NA</code>/<code>NaN</code> are handled and propagated, but that is platform-specific and it is already <em>complicated</em> in the x86/x64 world. Writing code that can reliably preserve <code>NA</code> and <code>NaN</code> values will require ad-hoc checks.</p></li>
</ul>
<p>This is just my quick executive summary, read through the <a href="https://developer.r-project.org/Blog/public/2020/11/02/will-r-work-on-apple-silicon/index.html">blog post above</a> for more details.</p>
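<p>To give an idea of what such ad-hoc checks might look like, here is a small sketch using base R only; the vector <code>x</code> is made up for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">x &lt;- c(1, NA, NaN) # made-up example vector

is.na(x) # TRUE for both NA and NaN
is.nan(x) # TRUE for NaN only

# Ad-hoc check that tells the two apart
only_na &lt;- is.na(x) &amp; !is.nan(x)
only_na</code></pre></div>
</div>
<p>Checking explicitly, rather than relying on how <code>NA</code> and <code>NaN</code> propagate through arithmetic, keeps the result independent of the platform.</p>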
</section>
<section id="some-personal-considerations" class="level2">
<h2 class="anchored" data-anchor-id="some-personal-considerations">Some personal considerations</h2>
<p>If you know me, you know how much I like tiny, silent computers. I mean, the <a href="https://www.caseyliss.com/2017/6/25/macbook-adorable">MacBook Adorable</a> is my favourite macOS laptop I never got to own and I got <em>so</em> close to buying a Surface Go 2 when it was announced earlier this year. If I had to get a desktop computer, I would probably get an Intel NUC or build into a small Mini-ITX case. And don’t get me started on Raspberry Pi boards! You bet I am very excited about the idea of a next-generation, fanless MacBook Air with great performance and excellent battery life!</p>
<p>Another potentially big thing is the <em>Neural Engine</em> that is embedded into Apple Silicon chips. Will R be able to link to that dedicated chip to accelerate machine learning (and potentially other) computations? Time will tell…</p>
<p>Finally, my understanding is that tools like <code>brew</code> and <code>git</code> are currently broken(-ish) on Apple Silicon. I am sure they will be updated for the new architecture soon, though.</p>
<p>Nevertheless, I am happy that the Mac is now an exciting platform once again after years of stagnation. I look forward to seeing how this evolves in the coming months, and I am definitely not upgrading my current laptop any time soon.</p>
</section>
<section id="update-1-2020-11-23" class="level2">
<h2 class="anchored" data-anchor-id="update-1-2020-11-23">Update #1: 2020-11-23</h2>
<p>There is now <a href="https://www.mail-archive.com/r-sig-mac@r-project.org/msg05763.html">a thread</a> on the R-SIG-Mac mailing list where Prof.&nbsp;Brian Ripley gives first impressions and benchmarks on building CRAN’s R 4.0.3 on a M1 MacBook Air with 8 GB of RAM.</p>
<p>First results seem very promising: building on top of Rosetta is actually faster than building on a 2016 2.0 GHz i5 MacBook Pro, but lots of work remains to be done to have a build ready for general use.</p>
<p>Furthermore, did you see those <a href="https://machinelearning.apple.com/updates/ml-compute-training-on-mac">performance benchmarks for Mac-optimised TensorFlow training on Apple Silicon</a>? I guess <em>yes</em>, machine learning computations will be greatly accelerated on new Apple hardware once fully optimised!</p>


</section>

 ]]></description>
  <category>rstats</category>
  <category>apple</category>
  <guid>https://www.ellessenne.xyz/blog/2020/11/18/</guid>
  <pubDate>Tue, 17 Nov 2020 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Multiple linear regression with {TMB}</title>
  <link>https://www.ellessenne.xyz/blog/2020/10/</link>
  <description><![CDATA[ 




<p>I have recently been reading about the <a href="https://CRAN.R-project.org/package=TMB">TMB framework</a> to implement statistical models using automatic differentiation (and with a bunch of nice built-in features).</p>
<p><a href="https://en.wikipedia.org/wiki/Automatic_differentiation">Automatic differentiation</a> (in brief) is an algorithmic method that <em>automagically</em> and efficiently yields accurate derivatives of a given function. This is very interesting, as by using automatic differentiation we can easily get gradients and the Hessian matrix, which are extremely useful when fitting models using the maximum likelihood method! On top of that, automatic differentiation is generally more efficient and accurate than symbolic or numerical differentiation.</p>
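<p>To get a feeling for the idea, here is a toy sketch of forward-mode automatic differentiation with <em>dual numbers</em> in plain R. It is an illustration only and not how {TMB} works internally; all the helper functions (<code>dual()</code>, <code>d_add()</code>, <code>d_mul()</code>, <code>d_exp()</code>) are hypothetical names made up for this example:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Toy forward-mode automatic differentiation using dual numbers:
# each quantity carries its value and its derivative along
dual &lt;- function(val, der = 0) list(val = val, der = der)
d_add &lt;- function(a, b) dual(a$val + b$val, a$der + b$der) # sum rule
d_mul &lt;- function(a, b) dual(a$val * b$val, a$der * b$val + a$val * b$der) # product rule
d_exp &lt;- function(a) dual(exp(a$val), exp(a$val) * a$der) # chain rule

# f(x) = x * exp(x) + x, evaluated at x = 2 (seed derivative dx/dx = 1)
x &lt;- dual(val = 2, der = 1)
fx &lt;- d_add(d_mul(x, d_exp(x)), x)

fx$der # derivative obtained via automatic differentiation
exp(2) + 2 * exp(2) + 1 # analytical derivative, for comparison</code></pre></div>
</div>
<p>The derivative comes out of the same sequence of elementary operations used to evaluate the function, with no symbolic manipulation or finite differences involved.</p>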
<p>There are plenty of examples of using {TMB} in practice: some are very simple and straightforward to follow, others are much more complex. I am just getting started, so I went straight to the <a href="https://kaskr.github.io/adcomp/linreg_8cpp-example.html">linear regression example</a> on the TMB documentation webpage. The C++ template for that simple model is:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode cpp code-with-copy"><code class="sourceCode cpp"><span id="cb1-1"><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">#include </span><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">&lt;TMB.hpp&gt;</span></span>
<span id="cb1-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">template</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> Type<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span></span>
<span id="cb1-3">Type objective_function<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span>Type<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;::</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">operator</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">()</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">()</span></span>
<span id="cb1-4"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb1-5">  DATA_VECTOR<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>Y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb1-6">  DATA_VECTOR<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb1-7">  PARAMETER<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>a<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb1-8">  PARAMETER<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>b<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb1-9">  PARAMETER<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>logSigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb1-10">  ADREPORT<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>exp<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>logSigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">));</span></span>
<span id="cb1-11">  Type nll <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>sum<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>dnorm<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>Y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">,</span> a<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>b<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">,</span> exp<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>logSigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">),</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">true</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">));</span></span>
<span id="cb1-12">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> nll<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span>
<span id="cb1-13"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span></span></code></pre></div></div>
<p>…which can be compiled from within R and passed to any general-purpose optimiser (such as <code>optim</code>) to obtain maximum likelihood estimates for a linear regression model.</p>
<p>It is, however, interesting to generalise this to any number of covariates, using matrix-by-array multiplication to efficiently scale up any problem. This is relevant when implementing general-purpose statistical modelling packages. Surprisingly, I had to fiddle around way more than I expected to do that: therefore, I thought about writing a blog post (which hopefully could be useful to others trying to get started with TMB)!</p>
<p><em>Disclaimer:</em> I am neither a {TMB} nor a C++ expert, so forgive me if I am missing something and <a href="https://twitter.com/ellessenne">let me know</a> if there’s anything that needs fixing here.</p>
<p>So, let’s get started. We first need to write our C++ template for the negative log-likelihood function, which is a simple adaptation of the template we saw before for a single-covariate model:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode cpp code-with-copy"><code class="sourceCode cpp"><span id="cb2-1"><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">#include </span><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">&lt;TMB.hpp&gt;</span></span>
<span id="cb2-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">template</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> Type<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span></span>
<span id="cb2-3">Type objective_function<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span>Type<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;::</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">operator</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">()</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">()</span></span>
<span id="cb2-4"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb2-5">  DATA_VECTOR<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>Y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb2-6">  DATA_MATRIX<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>X<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb2-7">  PARAMETER_VECTOR<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>b<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb2-8">  PARAMETER<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>logSigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">);</span></span>
<span id="cb2-9">  Type nll <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sum<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>dnorm<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>Y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">,</span> X<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>b <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">,</span> exp<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">(</span>logSigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">),</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">true</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">));</span></span>
<span id="cb2-10">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>nll<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span>
<span id="cb2-11"><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span></span></code></pre></div></div>
<p>Here we needed to change <code>X</code> to a <code>DATA_MATRIX()</code>, and <code>b</code> to a <code>PARAMETER_VECTOR()</code>. Note that <code>X</code> here should be the design matrix of the model, e.g.&nbsp;obtained using the <code>model.matrix()</code> function in R, and that the matrix-array multiplication in C++ <code>X*b</code> is <em>not</em> the element-wise operation, and is equivalent to <code>X %*% b</code> in R. Note also that, despite its name, the variable <code>nll</code> in this version holds the log-likelihood: the sign is flipped only when returning the value. This is it: easy, right?</p>
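<p>As a quick illustration of that distinction on the R side, here is a small sketch with made-up toy objects (<code>X_toy</code> and <code>b_toy</code> are hypothetical names, not part of the model above):</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Toy 3 x 2 design matrix: an intercept column and one covariate
X_toy &lt;- cbind(1, c(2, 3, 4))
b_toy &lt;- c(10, 0.5)

# Matrix product: a 3 x 1 linear predictor, 10 + 0.5 * covariate
lp &lt;- X_toy %*% b_toy
lp

# Element-wise product with recycling: a 3 x 2 matrix, not a linear predictor
ew &lt;- X_toy * b_toy
dim(ew)</code></pre></div></div>
<p>The first result is what <code>X*b</code> computes in the C++ template; the second is what we would get in R if we used <code>*</code> by mistake.</p>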
<p>Let’s now verify that this works as expected. We can compile the C++ function and dynamically link it using the tools from the {TMB} package (assuming <code>model.cpp</code> contains the function defined above):</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(TMB)</span>
<span id="cb3-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">compile</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"model.cpp"</span>)</span>
<span id="cb3-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">dyn.load</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">dynlib</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"model"</span>))</span></code></pre></div></div>
<p>We need to simulate some data now, e.g.&nbsp;with a single covariate for simplicity:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span>
<span id="cb4-2">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span></span>
<span id="cb4-3">x <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mean =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sd =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb4-4">e <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n)</span>
<span id="cb4-5">b0 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span></span>
<span id="cb4-6">b1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb4-7">y <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> b0 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> e</span></code></pre></div></div>
</div>
<p>We create a dataset and a model matrix too:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">df <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(y, x)</span>
<span id="cb5-2">X <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">model.matrix</span>(y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> df)</span>
<span id="cb5-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(X)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>  (Intercept)        x
1           1 63.70958
2           1 44.35302
3           1 53.63128
4           1 56.32863
5           1 54.04268
6           1 48.93875</code></pre>
</div>
</div>
<p>Now we can use the <code>MakeADFun()</code> function from {TMB} to construct the R object that brings all of this together:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">f <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MakeADFun</span>(</span>
<span id="cb7-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">X =</span> X, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Y =</span> y),</span>
<span id="cb7-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">b =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ncol</span>(X)), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">logSigma =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>),</span>
<span id="cb7-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">silent =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb7-5">)</span></code></pre></div></div>
</div>
<p>This passes the data <code>X</code> and <code>y</code> and sets the starting values of the model parameters to zero for all regression coefficients (<code>b0</code> and <code>b1</code> in this case) and for the log of the standard deviation of the residual errors.</p>
<p>We can now easily get the value of the (negative) log-likelihood function at the starting values, together with the gradients and the Hessian matrix:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fn</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 51351.44</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gr</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>          [,1]      [,2]      [,3]
[1,] -9994.682 -497251.6 -99865.01</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">he</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>         [,1]       [,2]      [,3]
[1,]  1000.00   49741.76  19989.36
[2,] 49741.76 2574646.64 994503.22
[3,] 19989.36  994503.22 201730.02</code></pre>
</div>
</div>
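<p>For comparison, here is a rough sketch of what computing the first two of these quantities by hand would involve in plain R. The functions <code>nll_manual()</code> and <code>gr_manual()</code> are hypothetical helpers written for this post, not part of {TMB}, and the data are re-simulated with the same seed so that the snippet runs on its own. If everything is set up as above, the results should agree with the values returned by <code>f$fn()</code> and <code>f$gr()</code>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Recreate the single-covariate data from above so that the snippet runs on its own
set.seed(42)
n &lt;- 1000
x &lt;- rnorm(n = n, mean = 50, sd = 10)
y &lt;- 10 + 0 * x + rnorm(n = n)
X &lt;- model.matrix(~ x)

# Negative log-likelihood written by hand; theta = c(b, logSigma)
nll_manual &lt;- function(theta, X, y) {
  p &lt;- ncol(X)
  mu &lt;- as.vector(X %*% theta[seq_len(p)])
  -sum(dnorm(y, mean = mu, sd = exp(theta[p + 1]), log = TRUE))
}

# Analytical gradient with respect to b and logSigma
gr_manual &lt;- function(theta, X, y) {
  p &lt;- ncol(X)
  sigma &lt;- exp(theta[p + 1])
  r &lt;- as.vector(y - X %*% theta[seq_len(p)])
  c(-as.vector(crossprod(X, r)) / sigma^2, length(y) - sum(r^2) / sigma^2)
}

# Evaluate both at the starting values (all zeros)
theta0 &lt;- rep(0, ncol(X) + 1)
nll_manual(theta0, X, y)
gr_manual(theta0, X, y)</code></pre></div></div>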
<p>The cool part is, we didn’t even have to define analytical formulae for the gradients and the Hessian: we got them <em>for free</em> with automatic differentiation! Now all we have to do is to pass the negative log-likelihood function and the gradients to e.g.&nbsp;<code>optim()</code> and that’s all:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">fit <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optim</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fn =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>fn, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">gr =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"L-BFGS-B"</span>)</span>
<span id="cb14-2">fit</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>$par
           b            b     logSigma 
 9.945838718  0.000981937 -0.014587405 

$value
[1] 1404.351

$counts
function gradient 
      35       35 

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F &lt;= FACTR*EPSMCH"</code></pre>
</div>
</div>
<p>If we exponentiate the value of <code>logSigma</code>, we obtain the standard deviation of the residuals on its natural scale: 0.9855. Compare this with the true values (<code>b0</code> = 10, <code>b1</code> = 0, <code>logSigma</code> = 0) and with the results of the least squares estimator:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lm</span>(y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> df))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
Call:
lm(formula = y ~ x, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9225 -0.6588 -0.0083  0.6628  3.5877 

Coefficients:
             Estimate Std. Error t value Pr(&gt;|t|)    
(Intercept) 9.9458454  0.1579727  62.959   &lt;2e-16 ***
x           0.0009818  0.0031133   0.315    0.753    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9865 on 998 degrees of freedom
Multiple R-squared:  9.964e-05, Adjusted R-squared:  -0.0009023 
F-statistic: 0.09945 on 1 and 998 DF,  p-value: 0.7526</code></pre>
</div>
</div>
<p>We are getting really close here!</p>
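<p>The small gap between 0.9855 and the residual standard error of 0.9865 reported by <code>lm()</code> is expected: maximum likelihood divides the residual sum of squares by the number of observations, while <code>lm()</code> divides by the residual degrees of freedom. A quick back-of-the-envelope check, using only the numbers printed above (the variable names are just for illustration):</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">rse_lm &lt;- 0.9865 # residual standard error reported by summary(lm()) above
n_obs &lt;- 1000    # number of observations
n_coef &lt;- 2      # intercept and slope

# Rescale the lm() estimate from the (n - p) divisor to the n divisor
sd_ml_from_lm &lt;- rse_lm * sqrt((n_obs - n_coef) / n_obs)

# exp(logSigma) from the optim() fit above
sd_ml_tmb &lt;- exp(-0.014587405)

c(from_lm = sd_ml_from_lm, from_tmb = sd_ml_tmb)</code></pre></div></div>
<p>Once the divisor is accounted for, the two estimates should agree up to rounding of the printed values.</p>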
<p>Let’s now generalise this to multiple covariates, say 4 normally-distributed covariates:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set.seed</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span>
<span id="cb18-2">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span></span>
<span id="cb18-3">x1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mean =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sd =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb18-4">x2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mean =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sd =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb18-5">x3 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mean =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sd =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb18-6">x4 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mean =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">40</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">sd =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb18-7">e <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> n)</span>
<span id="cb18-8">b0 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span></span>
<span id="cb18-9">b1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb18-10">b2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb18-11">b3 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span></span>
<span id="cb18-12">b4 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span></span>
<span id="cb18-13">y <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> b0 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> x1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> x2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> x3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b4 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> x4 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> e</span>
<span id="cb18-14">df <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(y, x1, x2, x3, x4)</span>
<span id="cb18-15">X <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">model.matrix</span>(y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> x2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> x3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> x4, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> df)</span>
<span id="cb18-16"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(X)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>  (Intercept)        x1       x2        x3       x4
1           1 23.709584 43.25058 32.505781 33.14338
2           1  4.353018 25.24122 27.220760 32.07286
3           1 13.631284 29.70733 12.752643 35.92996
4           1 16.328626 23.76973  9.932951 28.51329
5           1 14.042683 10.04067 17.081917 51.15760
6           1  8.938755 14.02517 33.658382 31.20543</code></pre>
</div>
</div>
<p>All we have to do is re-create the <code>f</code> object using the <code>MakeADFun()</code> function and the new data:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1">f <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MakeADFun</span>(</span>
<span id="cb20-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">X =</span> X, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">Y =</span> y),</span>
<span id="cb20-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">parameters =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">b =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ncol</span>(X)), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">logSigma =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>),</span>
<span id="cb20-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">silent =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb20-5">)</span></code></pre></div></div>
</div>
<p>Let’s now fit this with <code>optim()</code> and compare with the least squares estimator:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optim</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fn =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>fn, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">gr =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"L-BFGS-B"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>$par
         b          b          b          b          b   logSigma 
9.90325169 0.99992371 2.00125295 2.99795882 4.00296201 0.01661648 

$value
[1] 1435.576

$counts
function gradient 
     103      103 

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F &lt;= FACTR*EPSMCH"</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">summary</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lm</span>(y <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> x1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> x2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> x3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> x4, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> df))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
Call:
lm(formula = y ~ x1 + x2 + x3 + x4, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3354 -0.7382  0.0161  0.6893  3.0315 

Coefficients:
            Estimate Std. Error t value Pr(&gt;|t|)    
(Intercept) 9.903788   0.177893   55.67   &lt;2e-16 ***
x1          0.999929   0.003218  310.72   &lt;2e-16 ***
x2          2.001249   0.003277  610.69   &lt;2e-16 ***
x3          2.997959   0.003134  956.53   &lt;2e-16 ***
x4          4.002949   0.003266 1225.83   &lt;2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.019 on 995 degrees of freedom
Multiple R-squared:  0.9997,    Adjusted R-squared:  0.9997 
F-statistic: 7.294e+05 on 4 and 995 DF,  p-value: &lt; 2.2e-16</code></pre>
</div>
</div>
<p>Very close once again, and if you noticed, <em>we didn’t have to change a single bit of code from the first example with a single covariate</em>. Isn’t that cool?</p>
<p><em>I know, I know, maybe it’s cool for a specific type of person only, but hey…</em></p>
<p>Another nice thing is that, since we have gradients (in compiled code!), maximising the likelihood will be faster and generally more accurate when using a gradient-based method, such as the limited-memory variant of the BFGS quasi-Newton method (L-BFGS-B) that we chose here. See the following benchmark as a comparison:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb25-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(bench)</span>
<span id="cb25-2">bench<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mark</span>(</span>
<span id="cb25-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"with gradients"</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optim</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fn =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>fn, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">gr =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"L-BFGS-B"</span>),</span>
<span id="cb25-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"without gradients"</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optim</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">par =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>par, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fn =</span> f<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>fn, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">method =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"L-BFGS-B"</span>),</span>
<span id="cb25-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">iterations =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>,</span>
<span id="cb25-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">relative =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>,</span>
<span id="cb25-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">check =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span></span>
<span id="cb25-8">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 2 × 6
  expression          min median `itr/sec` mem_alloc `gc/sec`
  &lt;bch:expr&gt;        &lt;dbl&gt;  &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;
1 with gradients     1      1         3.85         1      NaN
2 without gradients  3.89   3.83      1            1      Inf</code></pre>
</div>
</div>
<p>In this case (a small and simple toy problem) the optimisation using gradients is ~4 times faster (on my quad-core MacBook Pro with 16 GB of RAM). And all of that for free, just using automatic differentiation!</p>
<p>Remember that, if not provided with a gradient function, <code>optim()</code> will use finite differences to approximate the gradients numerically. The benchmark above therefore also gives us a direct comparison between automatic and numerical differentiation.</p>
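<p>To see the same contrast without {TMB}, here is a minimal base R sketch. It assumes a simple linear regression with simulated data; the names <code>nll</code> and <code>nll_gr</code> and the hand-derived gradient are illustrative and stand in for the <code>f$fn</code> and <code>f$gr</code> functions created by <code>MakeADFun()</code>:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Toy data (illustrative values only)
set.seed(42)
n &lt;- 200
x &lt;- rnorm(n)
y &lt;- 1 + 2 * x + rnorm(n)
X &lt;- cbind(1, x)

# Negative log-likelihood, with theta = c(b0, b1, logSigma)
nll &lt;- function(theta) {
  b &lt;- theta[1:2]
  sigma &lt;- exp(theta[3])
  -sum(dnorm(y, mean = drop(X %*% b), sd = sigma, log = TRUE))
}

# Gradient of the negative log-likelihood, derived by hand
nll_gr &lt;- function(theta) {
  b &lt;- theta[1:2]
  sigma &lt;- exp(theta[3])
  r &lt;- y - drop(X %*% b)
  c(-drop(crossprod(X, r)) / sigma^2, n - sum(r^2) / sigma^2)
}

start &lt;- c(0, 0, 0)
# With a gradient function supplied
fit_gr &lt;- optim(par = start, fn = nll, gr = nll_gr, method = "L-BFGS-B")
# Without one: optim() falls back to finite differences
fit_fd &lt;- optim(par = start, fn = nll, method = "L-BFGS-B")

# Both fits should agree closely with each other and with least squares
rbind(with_gr = fit_gr$par, without_gr = fit_fd$par)
coef(lm(y ~ x))</code></pre></div>
</div>
<p>Writing <code>nll_gr</code> by hand is exactly the step that automatic differentiation spares us, and for anything more complex than this toy model that step is tedious and error-prone.</p>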
<p>To wrap up, hats off to the creators of the {TMB} package for providing such an easy and powerful framework, and I didn’t even scratch the surface here: this is just a quick and dirty toy example with multiple linear regression, which doesn’t even need maximum likelihood.</p>
<p>Anyway, I’m definitely going to come back to automatic differentiation and {TMB} in future blog posts, stay tuned for that. Cheers!</p>
<p><em>Update:</em> <a href="https://twitter.com/Benjami41993499/status/1312364776329273351?s=20">Benjamin Christoffersen shared on Twitter</a> a link to a thread on Cross Validated where the differences between maximum likelihood and least squares are discussed in more detail. It’s a very interesting read, remember to <a href="https://stats.stackexchange.com/questions/222233/residual-standard-error-difference-between-optim-and-glm/368510#368510">check it out</a>!</p>



 ]]></description>
  <category>rstats</category>
  <category>TMB</category>
  <guid>https://www.ellessenne.xyz/blog/2020/10/</guid>
  <pubDate>Wed, 30 Sep 2020 22:00:00 GMT</pubDate>
</item>
<item>
  <title>September packages updates</title>
  <link>https://www.ellessenne.xyz/blog/2020/09/03/</link>
  <description><![CDATA[ 




<p>Hey! I hope y’all had a good summer, it sure has been <em>something</em>…</p>
<p>Let’s get straight to business: this is a short post to announce that new releases of {rsimsum} and {KMunicate} just landed on CRAN! {rsimsum} is now at version 0.9.1, while {KMunicate} is now at version 0.1.0.</p>
<p>Both are mostly maintenance releases: some small bugs have been squashed, and new (hopefully useful) customisation options have been added to {KMunicate}; some typos in the documentation have been fixed too. If you want to read more, all details can be found in the NEWS files of <a href="https://cran.rstudio.com/web/packages/rsimsum/news/news.html">{rsimsum}</a> and <a href="https://cran.rstudio.com/web/packages/KMunicate/news/news.html">{KMunicate}</a>.</p>
<p>If you have either package already installed, just use <code>update.packages()</code> to obtain the new version; otherwise, <code>install.packages("rsimsum")</code> and <code>install.packages("KMunicate")</code> will do the trick. As always, feedback on the new releases is very much appreciated.</p>
<p>That’s all: I promised this was going to be short, and less than 150 words later I believe I delivered. But don’t worry, I’ll be back soon (<em>-ish</em>) with a follow-up on my post on academic conferences in the year 2020 after a whole season of remote conferences. In the meantime, take care and be safe!</p>



 ]]></description>
  <category>rstats</category>
  <category>updates</category>
  <category>rsimsum</category>
  <category>KMunicate</category>
  <guid>https://www.ellessenne.xyz/blog/2020/09/03/</guid>
  <pubDate>Wed, 02 Sep 2020 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Introducing the {KMunicate} package</title>
  <link>https://www.ellessenne.xyz/blog/2020/08/07/</link>
  <description><![CDATA[ 




<p>A few weeks ago, <a href="https://twitter.com/tmorris_mrc/status/1281330077217824769">Tim Morris launched a competition on Twitter</a> challenging his followers to write some code to create <a href="https://bmjopen.bmj.com/content/9/9/e030215">KMunicate-style</a> Kaplan-Meier plots. I (obviously) took on the challenge, and I must admit: I got <em>slightly</em> carried away… hence now introducing the {KMunicate} R package.</p>
<p>{KMunicate} is now on <a href="https://CRAN.R-project.org/package=KMunicate">CRAN</a>, and the development version lives on my <a href="https://github.com/ellessenne/KMunicate-package">GitHub</a> profile. You can install the CRAN version as usual:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"KMunicate"</span>)</span></code></pre></div></div>
<p>Alternatively, you can install the dev version of {KMunicate} from GitHub with:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># install.packages("devtools")</span></span>
<span id="cb2-2">devtools<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install_github</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ellessenne/KMunicate-package"</span>)</span></code></pre></div></div>
<section id="whats-the-kmunicate-style" class="level2">
<h2 class="anchored" data-anchor-id="whats-the-kmunicate-style">What’s the KMunicate style?</h2>
<p>KMunicate-style Kaplan-Meier plots include confidence intervals for each fitted curve and an extended table beneath the main plot including the number of individuals at risk at each time and the cumulative number of events and censoring events. Here’s an example from the KMunicate study itself:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://bmjopen.bmj.com/content/bmjopen/9/9/e030215/F5.large.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" alt="Example of KMunicate-style plot from the KMunicate study"></p>
</figure>
</div>
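<p>The table itself boils down to three counts at each break on the x-axis. Here is a minimal base R sketch with made-up follow-up times; the data, the object names, and the way observations falling exactly on a break are counted are all illustrative assumptions, not necessarily what {KMunicate} does internally:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical follow-up times (days) and event indicators (1 = event, 0 = censored)
time &lt;- c(100, 200, 400, 500, 800, 900)
status &lt;- c(1, 0, 1, 1, 0, 1)
breaks &lt;- c(0, 365, 730)

risk_table &lt;- data.frame(
  time = breaks,
  # Still under observation at each break
  at_risk = sapply(breaks, function(t) sum(time &gt;= t)),
  # Cumulative number of events up to each break
  events = sapply(breaks, function(t) sum(time &lt;= t &amp; status == 1)),
  # Cumulative number of censored observations up to each break
  censored = sapply(breaks, function(t) sum(time &lt;= t &amp; status == 0))
)
risk_table</code></pre></div>
</div>
<p>With no observation falling exactly on a break, the three counts add up to the sample size at every break, which is a handy sanity check. Doing this per arm and aligning the result under the plot is where the real effort goes.</p>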
<p>As you might imagine, it’s quite a lot of work to produce a plot like that.</p>
</section>
<section id="well-not-anymore" class="level2">
<h2 class="anchored" data-anchor-id="well-not-anymore">Well, not anymore!</h2>
<p>The {KMunicate} package lets you create such a plot with a <em>single line of code</em>. Isn’t that great?</p>
<p>Let’s illustrate the basic functionality of {KMunicate} with an example. We’ll be using once again data from the German breast cancer study, which is conveniently bundled with {KMunicate}:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data</span>(brcancer, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"KMunicate"</span>)</span>
<span id="cb3-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(brcancer)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>  id hormon x1 x2 x3 x4 x5 x6  x7 rectime censrec x4a x4b        x5e
1  1      0 70  2 21  2  3 48  66    1814       1   1   0 0.69767630
2  2      1 56  2 12  2  7 61  77    2018       1   1   0 0.43171051
3  3      1 58  2 35  2  9 52 271     712       1   1   0 0.33959553
4  4      1 59  2 17  2  4 60  29    1807       1   1   0 0.61878341
5  5      0 73  2 35  2  1 26  65     772       1   1   0 0.88692045
6  6      0 32  1 57  3 24  0  13     448       1   1   1 0.05613476</code></pre>
</div>
</div>
<p>The survival time is in <code>rectime</code>, and the event indicator variable is <code>censrec</code>; the treatment variable is <code>hormon</code>, a binary covariate.</p>
<p>First, we fit the survival curve by treatment arm using the Kaplan-Meier estimator:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(survival)</span>
<span id="cb5-2">fit <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">survfit</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Surv</span>(rectime, censrec) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> hormon, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> brcancer)</span>
<span id="cb5-3">fit</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Call: survfit(formula = Surv(rectime, censrec) ~ hormon, data = brcancer)

           n events median 0.95LCL 0.95UCL
hormon=0 440    205   1528    1296    1814
hormon=1 246     94   2018    1918      NA</code></pre>
</div>
</div>
<p>The plot that can be obtained via the <code>plot</code> method is ok but needs a bit of work to be good enough for a publication. For instance, this is the default:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(fit)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p><em>No bueno</em>, right? Let’s improve it a bit:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(fit, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lty =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">conf.int =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb8-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">legend</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"bottomleft"</span>,</span>
<span id="cb8-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">col =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lty =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb8-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Control"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Treatment"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bty =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"n"</span></span>
<span id="cb8-5">)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p>This is better, but still not great: the area defined by the confidence intervals is not shaded, and there is still no risk table.</p>
<p>Here’s where the {KMunicate} package comes to the rescue. First, we need to define the breaks for the x-axis; the risk table will be computed at those breaks. Say we want breaks every year:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">time_breaks <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">max</span>(brcancer<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>rectime), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">by =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">365</span>)</span>
<span id="cb9-2">time_breaks</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1]    0  365  730 1095 1460 1825 2190 2555</code></pre>
</div>
</div>
<p>Then, all we have to do is call the <code>KMunicate()</code> function, passing the fitted curves and the breaks:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb11-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(KMunicate)</span>
<span id="cb11-3"></span>
<span id="cb11-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">KMunicate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fit =</span> fit, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time_scale =</span> time_breaks)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p><em>Easy peasy!</em></p>
<p>We might want to get proper arm labels too:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">brcancer<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>hormon <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(brcancer<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>hormon, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">levels =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Control"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Treatment"</span>))</span>
<span id="cb12-2">fit <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">survfit</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Surv</span>(rectime, censrec) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> hormon, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> brcancer)</span>
<span id="cb12-3"></span>
<span id="cb12-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">KMunicate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fit =</span> fit, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time_scale =</span> time_breaks)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p>Nice. Next, we’ll show how to customise the plot.</p>
</section>
<section id="customising-kmunicate-style-plots" class="level2">
<h2 class="anchored" data-anchor-id="customising-kmunicate-style-plots">Customising KMunicate-style plots</h2>
<p>First, we might want to customise colours to use a colour-blind friendly palette via the <code>.color_scale</code> and <code>.fill_scale</code> arguments:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">KMunicate</span>(</span>
<span id="cb13-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fit =</span> fit,</span>
<span id="cb13-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time_scale =</span> time_breaks,</span>
<span id="cb13-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.color_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_colour_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb13-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.fill_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>)</span>
<span id="cb13-6">)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p>Then, we might want to use a custom font, such as my latest obsession <a href="https://rubjo.github.io/victor-mono">Victor Mono</a>, via the <code>.ff</code> argument:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">KMunicate</span>(</span>
<span id="cb14-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fit =</span> fit,</span>
<span id="cb14-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time_scale =</span> time_breaks,</span>
<span id="cb14-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.color_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_colour_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb14-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.fill_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb14-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.ff =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Victor Mono"</span></span>
<span id="cb14-7">)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p>Finally, we customise the overall theme using e.g.&nbsp;<code>theme_minimal</code> from the {ggplot2} package:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">KMunicate</span>(</span>
<span id="cb15-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fit =</span> fit,</span>
<span id="cb15-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time_scale =</span> time_breaks,</span>
<span id="cb15-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.color_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_colour_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb15-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.fill_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb15-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.ff =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Victor Mono"</span>,</span>
<span id="cb15-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.theme =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_family =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Victor Mono"</span>)</span>
<span id="cb15-8">)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="576"></p>
</figure>
</div>
</div>
</div>
<p>When overriding the default theme, we need to re-define the font for the main plot using the <code>base_family</code> argument of the <code>theme_*</code> function that we pass to <code>.theme</code>. Overall, I think this is a much better plot!</p>
</section>
<section id="exporting-plots" class="level2">
<h2 class="anchored" data-anchor-id="exporting-plots">Exporting plots</h2>
<p>The final step consists of exporting a plot for later use, e.g.&nbsp;in manuscripts or presentations. That’s straightforward, as the output of <code>KMunicate()</code> is a <code>ggplot2</code>-type object: all we have to do is use the <code>ggplot2::ggsave</code> function, as in the next block of code.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1">p <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">KMunicate</span>(</span>
<span id="cb16-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fit =</span> fit,</span>
<span id="cb16-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">time_scale =</span> time_breaks,</span>
<span id="cb16-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.color_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_colour_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb16-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.fill_scale =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_brewer</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">type =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"qual"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Dark2"</span>),</span>
<span id="cb16-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.ff =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Victor Mono"</span>,</span>
<span id="cb16-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">.theme =</span> ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">base_family =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Victor Mono"</span>)</span>
<span id="cb16-8">)</span>
<span id="cb16-9"></span>
<span id="cb16-10">ggplot2<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggsave</span>(p, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filename =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"export.png"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">height =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">dpi =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">300</span>)</span></code></pre></div></div>
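300</span>)</span></code>">
<p>If a vector format is preferred, e.g.&nbsp;for a manuscript, the same approach should work with a PDF file. What follows is a minimal sketch rather than part of the original workflow: it uses a stand-in {ggplot2} plot in place of the output of <code>KMunicate()</code>, writes to a hypothetical file in a temporary directory, and assumes that the Cairo-based <code>grDevices::cairo_pdf</code> device (when available in your R build) is a sensible choice for embedding custom fonts such as Victor Mono.</p>
<div class="sourceCode" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Stand-in plot for illustration only (assumption): any ggplot2-type object,
# such as the output of KMunicate(), could be used here instead
p &lt;- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Hypothetical output path, written to a temporary directory
out &lt;- file.path(tempdir(), "export.pdf")

# Use the Cairo PDF device if this R build supports it,
# otherwise fall back to the default PDF device
dev &lt;- if (capabilities("cairo")) grDevices::cairo_pdf else "pdf"

# Height and width are in inches; dpi is not needed for vector output
ggsave(filename = out, plot = p, device = dev, height = 6, width = 6)</code></pre></div>
<p>Whether a given font ends up embedded correctly depends on the fonts installed on your system, so it is worth opening the exported file to check.</p>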
</section>
<section id="closing-remarks" class="level2">
<h2 class="anchored" data-anchor-id="closing-remarks">Closing remarks</h2>
<p>Further details on {KMunicate} can be found on <a href="https://ellessenne.github.io/KMunicate-package">its website</a>, with more examples and a better explanation of the different arguments and customisation options. Let me know if you find the package useful, and if you find any bugs (I’m sure there’ll be some) please <a href="https://github.com/ellessenne/KMunicate-package/issues">file an issue on GitHub</a>.</p>
<p>And what about Tim’s challenge that led to the inception of {KMunicate}, you might ask? Well, I got myself a beautiful hand-crafted wooden spoon:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/08/07/spoon.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" alt="Hand-crafted wooden spoon by Tim."></p>
</figure>
</div>
<p>Isn’t that great!?</p>


</section>

 ]]></description>
  <category>rstats</category>
  <category>KMunicate</category>
  <category>release</category>
  <category>survival-analysis</category>
  <guid>https://www.ellessenne.xyz/blog/2020/08/07/</guid>
  <pubDate>Thu, 06 Aug 2020 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Weekly blog news</title>
  <link>https://www.ellessenne.xyz/blog/2020/06/04/</link>
  <description><![CDATA[ 




<p>…here we go again with the weekly update on blog maintenance and housekeeping. I promise the fun content will be back soon!</p>
<section id="talks-section-is-live" class="level3">
<h3 class="anchored" data-anchor-id="talks-section-is-live">Talks section is live</h3>
<p>As I mentioned in the previous blog post, I have been working on a new section with a list of talks I have given in the past few years, including slides (whenever possible).</p>
<p>Well, that section is now live! The cool thing is that it builds dynamically using Hugo content, and it can easily be updated by simply adding a new markdown file with the appropriate YAML header. I think the overly popular <a href="https://sourcethemes.com/academic/">Academic theme</a> uses a similar approach for the various kinds of content it supports, but I am not sure.</p>
</section>
<section id="analytics-are-gone" class="level3">
<h3 class="anchored" data-anchor-id="analytics-are-gone">Analytics are gone</h3>
<p>I had been thinking about this for a while, and I finally decided to remove all analytics from all the websites I run: basically, this website and the <code>pkgdown</code> websites for the <a href="https://ellessenne.github.io/rsimsum/">{rsimsum}</a> and <a href="https://ellessenne.github.io/comorbidity/">{comorbidity}</a> packages.</p>
<p>I mean, it’s cool to see how many people visit each of them every month, and I was flattered to see people from all over the world accessing my website… but I don’t really need any of that data (funny thing, coming from a <em>data all the things</em> kind of person). You don’t need further tracking while you browse the web, that’s for sure.</p>
<p>Incidentally, I recently bought a <a href="https://www.raspberrypi.org/products/raspberry-pi-zero-w/">Raspberry Pi Zero W</a> on which I am now running <a href="https://pi-hole.net/">Pi-hole</a>, a DNS sinkhole that blocks an incredible amount of junk at a network level. I mean, look at all that crap:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.ellessenne.xyz/blog/2020/06/04/pi-hole-screenshot.png" class="img-fluid quarto-figure quarto-figure-center figure-img" alt="Screenshot of junk blocked by Pi-hole over the past 24 hours"></p>
</figure>
</div>
<p>… and that’s just over the past 24 hours!</p>
<p>An additional bonus is <em>less</em> JavaScript to load, which means that… the website is even faster! Yeah, I know, it must be a running joke by now.</p>
</section>
<section id="anyway" class="level3">
<h3 class="anchored" data-anchor-id="anyway">Anyway</h3>
<p>I want to play around with the Raspberry Pi more, so I will probably be writing about it more. It’s a fun, tiny little computer that can do all sorts of computer stuff, and if you didn’t know, I have an irrational obsession with tiny and cute little computers!</p>
<p>I will keep it short: this is all for now, see you next week for a new episode of <em>blog updates weekly</em>, a new series on Netflix. Just kidding, Netflix is totally not paying for this… right? Happy to be proved wrong here though 😂 Cheers!</p>


</section>

 ]]></description>
  <category>blogdown</category>
  <category>hugo</category>
  <category>website</category>
  <category>housekeeping</category>
  <guid>https://www.ellessenne.xyz/blog/2020/06/04/</guid>
  <pubDate>Wed, 03 Jun 2020 22:00:00 GMT</pubDate>
</item>
</channel>
</rss>
