Moderna has announced encouraging preliminary results for their COVID-19 vaccine. In this post, I assess the available data and explain what the vaccine’s effectiveness really means. I also look at Moderna’s experimental design and examine how it incorporates statistical procedures and concepts that I discuss throughout my blog posts and books.

These concepts include experimental designs, random assignment, power and sample size considerations, controlling the false positive error rate, and the directional nature of the hypotheses being tested, amongst others.

My goal is to provide a scientist’s view of how researchers designed this experiment, analyzed the data, and obtained the results.

Importantly, this COVID-19 vaccine is showing promising results earlier than anticipated. I’ll cover the reasons why the study was able to obtain these results early and *safely*.

Pfizer reportedly has a vaccine that uses similar technology and has a similar effectiveness. In this post, I primarily look at Moderna’s COVID-19 vaccine. However, much of the discussion also applies to Pfizer’s COVID-19 vaccine.

## COVID-19 mRNA Vaccines

Vaccines teach your immune system to recognize an infection more quickly than it would naturally, allowing you to start fighting the virus before it replicates widely. Most vaccines work by introducing either a dead or weakened virus that your immune system detects and creates antibodies for. These antibodies render viruses harmless.

Moderna and Pfizer used new technology to create an mRNA (messenger RNA) COVID-19 vaccine. This type of vaccine inserts mRNA that contains the code for your body to create the distinctive spike protein on the surface of the SARS-CoV-2 virus. Your immune system detects this protein and makes the antibody for it. If the real virus enters your system, it will have this spike protein, and the antibodies that your immune system is already producing will know to attack it immediately.

Because the mRNA creates only the spike and not the rest of the virus, there is no possibility that this vaccine can infect you.

While the creation of this new type of vaccine appears rapid, there have already been many years of development put into the mRNA vaccine platform that laid the groundwork for creating these COVID-19 vaccines.

## Moderna’s COVE Phase 3 Study

The technology behind Moderna’s COVID-19 vaccine (aka candidate vaccine mRNA-1273) is new, but its experimental design uses the tried and true approach for experiments involving medications. This is a large-scale randomized study that uses blinding, a placebo, and stratification. Let’s take a look at these aspects of the experiment.

## Sampling Considerations and Characteristics

In inferential statistics, the goal is to use a sample to learn about populations. Scientific studies always use inferential statistics. Why? Scientists don’t want to understand how well the vaccine (in this case) works in just their relatively small group of participants. They need to generalize the results to the larger population. Using inferential statistics, scientists can learn whether the vaccine will work in the population at large and estimate its effectiveness.

To learn more about inferential statistics, read my post about inferential vs. descriptive statistics, or get my Introduction to Statistics book.

The sample must represent the target populations for which the scientists want to generalize the results. With this in mind, Moderna intentionally recruited a representative sample of racial and ethnic minority participants in the study. The sample includes a mix of younger low risk, younger higher risk, and older at-risk participants.

The study uses a stratified random sample. Stratification ensures that each stratum, or subgroup, has sufficient representation within the sample so analysts can assess each of these subgroups. For this study, the scientists want to determine both the vaccine’s effectiveness and safety for these strata.

In this study, there are three strata:

- ≥ 65 years.
- < 65 years and at increased risk for COVID-19 complications (“at risk”).
- < 65 years “not at risk” for COVID-19 complications.

Moderna calculates risk using the subjects’ relevant medical history. Their experimental plan calls for at least 25%, and up to 40%, of participants to be either ≥ 65 years of age or < 65 years of age and at risk.

This approach allows scientists to evaluate vaccine effectiveness and safety using a racially, ethnically, medically, and age-diverse sample. Furthermore, analysts can determine whether efficacy and safety change across these groups.

## COVID-19 Vaccine Experimental Design

The experimental design has two arms, the treatment and control groups.

**Treatment**: Receives one intramuscular (IM) injection of 100 micrograms (µg) of mRNA-1273 on each of Day 1 and Day 29.

**Control**: Receives one IM injection of saline solution (placebo) on each of Day 1 and Day 29.

The experimental design calls for approximately 30,000 participants and randomly assigns them to either the treatment or control group, split equally between the two groups. The study uses “blinding”—neither the researchers nor participants know their group membership. Blinding helps prevent conscious or unconscious biases in an experiment. These biases can affect care, attitudes, assessments, and ultimately the final results.

The random assignment is crucial because it allows the study to determine whether the vaccine causes a reduction in the number of COVID-19 infections. Randomization helps avoid the problems associated with correlation not implying causation. If the scientists used a non-random process for assigning subjects to the experimental groups, that process might explain the results at the end rather than the vaccine itself!
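To make the idea concrete, here is a minimal Python sketch of random assignment carried out within strata. The stratum labels mirror the three strata listed earlier; the function name `assign_arms`, the participant IDs, and the choice to balance the two arms inside each stratum are illustrative assumptions, not details taken from Moderna's protocol.

```python
import random
from collections import defaultdict

def assign_arms(participants, seed=None):
    """Randomly split participants 1:1 into two arms within each stratum.

    participants: list of (participant_id, stratum) tuples.
    Returns a dict mapping participant_id -> "vaccine" or "placebo".
    """
    rng = random.Random(seed)

    # Group participant IDs by stratum.
    by_stratum = defaultdict(list)
    for pid, stratum in participants:
        by_stratum[stratum].append(pid)

    assignment = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)          # chance alone decides who lands in which arm
        half = len(ids) // 2
        for pid in ids[:half]:
            assignment[pid] = "vaccine"
        for pid in ids[half:]:
            assignment[pid] = "placebo"
    return assignment

# Hypothetical toy cohort: 12 participants, 4 in each stratum.
strata = [">=65", "<65 at risk", "<65 not at risk"]
cohort = [(f"P{i:03d}", strata[i % 3]) for i in range(12)]
arms = assign_arms(cohort, seed=42)
```

Because the shuffle ignores everything about a participant except the stratum, no characteristic such as health habits or risk tolerance can systematically steer people toward one arm.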

For more information about randomization, read my post about using random assignment in experiments.

## Experimental Definition of COVID-19 Infections

Because this experiment assesses the vaccine’s effectiveness for reducing the number of COVID-19 infections, it has a rigorous, operational definition for what counts as an infection. This study assesses only symptomatic COVID-19 cases.

To count as a COVID-19 infection in this experiment, the participant must experience:

- At least TWO of the following systemic symptoms: fever (≥ 38ºC), chills, myalgia, headache, sore throat, or new olfactory and taste disorder(s).
- OR at least ONE of the following respiratory symptoms: cough, shortness of breath, or difficulty breathing.
- OR have clinical or radiographical evidence of pneumonia.

AND the participant must have at least one nasopharyngeal (NP) swab, nasal swab, or saliva sample (or respiratory sample, if hospitalized) positive for SARS-CoV-2 by RT-PCR.

In short, the subject must have symptoms plus a positive test result.
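The case definition above is a small piece of boolean logic, and writing it out can make the AND/OR structure easier to see. The following Python sketch is only an illustration: the function name `is_covid_case`, the symptom strings, and the argument names are assumptions made for this example.

```python
# Symptom labels are illustrative strings, not codes from the study.
SYSTEMIC = {
    "fever", "chills", "myalgia", "headache",
    "sore throat", "new olfactory or taste disorder",
}
RESPIRATORY = {"cough", "shortness of breath", "difficulty breathing"}

def is_covid_case(symptoms, pneumonia_evidence, pcr_positive):
    """Apply the study's operational definition of a COVID-19 case.

    symptoms: iterable of symptom strings reported by the participant.
    pneumonia_evidence: True if there is clinical or radiographical evidence.
    pcr_positive: True if at least one sample tested positive by RT-PCR.
    """
    symptoms = set(symptoms)
    has_systemic = len(symptoms & SYSTEMIC) >= 2        # at least TWO
    has_respiratory = len(symptoms & RESPIRATORY) >= 1  # at least ONE
    clinical = has_systemic or has_respiratory or pneumonia_evidence
    return bool(clinical and pcr_positive)              # symptoms AND a positive test

# A single systemic symptom is not enough, even with a positive test.
print(is_covid_case({"fever"}, False, True))
# Two systemic symptoms plus a positive test count as a case.
print(is_covid_case({"fever", "headache"}, False, True))
```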

## Power and Sample Size Analysis for the COVID-19 Vaccine’s Effectiveness

In statistics, we talk about the statistical power of the design. Power is the probability that a hypothesis test will detect an effect in a sample when that effect exists in the population. Bear in mind that, because we’re working with samples, an effect in the population might not be evident in a random sample drawn from that population. Obviously, you want to avoid a false negative like that! Hence, it’s crucial to perform a power and sample size analysis before the study.

For more information, read my post about power and sample size analysis.

The primary metric in this experiment is vaccine effectiveness. The researchers want to determine whether this vaccine is effective in reducing COVID-19 infections. I’ll discuss vaccine effectiveness in more detail in later sections.

During the initial power and sample size analysis, the scientists estimated a sample size and devised an experimental design allowing the study to have a good chance of detecting vaccine effectiveness.

Specifically, the researchers used a planning value of 60% effectiveness for their power analysis. They estimated that 151 COVID-19 cases would yield a 90% chance of rejecting the null hypothesis if the vaccine is at least 60% effective in the population.
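I have not seen the study's actual power calculations, but a rough approximation shows why about 151 cases is a sensible target. The Python sketch below makes several simplifying assumptions that are mine, not Moderna's: the two arms are equal in size and follow-up, there is a single final analysis with no interim looks, and the test is an exact binomial test on the share of cases that occur in the vaccine arm rather than the regression model the study uses. Under those assumptions, a vaccine with effectiveness VE leaves a share (1 - VE) / (2 - VE) of all cases in the vaccine arm.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def vaccine_share(ve):
    """Expected share of all cases that fall in the vaccine arm,
    assuming equal-sized arms with equal follow-up."""
    return (1 - ve) / (2 - ve)

def approx_power(total_cases, ve_null, ve_planned, alpha):
    """Power of a one-sided exact binomial test of H0: VE <= ve_null."""
    p_null = vaccine_share(ve_null)
    p_alt = vaccine_share(ve_planned)
    # Largest vaccine-arm case count that still rejects the null at level alpha.
    critical = -1
    while binom_cdf(critical + 1, total_cases, p_null) <= alpha:
        critical += 1
    # Probability of landing in the rejection region if VE equals the planning value.
    return critical, binom_cdf(critical, total_cases, p_alt)

critical, power = approx_power(151, ve_null=0.30, ve_planned=0.60, alpha=0.025)
print(f"Reject the null if the vaccine arm has <= {critical} of 151 cases")
print(f"Approximate power: {power:.2f}")
```

Even this simplified version lands in the same neighborhood as the 90% figure the researchers report. It will not match exactly because it ignores the interim analyses and the regression model.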

## Statistical Analysis of the COVID-19 Vaccine Data

Let’s look at their plans for the statistical analyses and then we’ll move on to a discussion about the preliminary results and vaccine effectiveness.

The statistical hypotheses for this study are the following:

**Null**: Vaccine effectiveness is ≤ 30%.

**Alternative**: Vaccine effectiveness is > 30%.

The test is a one-sided test, hence the inequality signs in the hypotheses. This test determines whether vaccine effectiveness in the population is greater than 30%. You’ll remember that the analysts used a planning value of 60% effectiveness in the power analysis. However, that is the effect’s point estimate. The lower bound of the confidence interval incorporates a margin of error below the point estimate. The analysts estimate that a point estimate of 60% equates to a lower bound of 30%.

The study will produce significant results if the lower CI bound is greater than 30%.

### Type of Regression Analysis for the Vaccine Study

To analyze the COVID-19 vaccine data, statisticians will use a stratified Cox proportional hazards regression model to assess the magnitude of the difference between treatment and control groups using a one-sided 0.025 significance level.

Analysts frequently use this type of model in medical settings to evaluate the reduction in risk associated with a treatment while incorporating patient characteristics and risk factors. By including these other relevant factors, the model can control for the other variables and avoid confounding, which occurs when another factor actually causes the observed treatment effect. Additionally, analysts can include interaction terms to determine whether vaccine effectiveness varies by these factors. For example, is vaccine effectiveness lower in older people?

Vaccine effectiveness is a hazard ratio, making this a great type of regression model to assess it. More on vaccine effectiveness in the results sections!

**Related posts**: Understanding one- and two-tailed tests, Confounders and Omitted Variable Bias, How Confidence Intervals Work, and Understanding Interaction Effects

## Interim Assessments and Mitigating the Risk of False Positives (Type I Errors)

In this experiment, a Type I error occurs when the statistical analysis produces statistically significant results but the vaccine is not effective in the population. This type of false positive is dangerous because it indicates the vaccine is effective against COVID-19 when it is not. Fortunately, the experimental design incorporates several statistical methods that mitigate this risk.

In clinical studies like this one, clinicians like to assess interim data. If the test produces significant results before the experiment ends, they can use the treatment earlier than otherwise. In this study, analysts evaluate the data at several points while accumulating COVID-19 cases over time. However, when you assess the data multiple times, it increases the Type I error rate (false positives).

Fortunately, this experiment uses the O’Brien-Fleming approach to control the Type I error rate so that it does not increase due to the multiple assessments. This method controls the false positive rate so that it equals the significance level. It is similar in concept to using post hoc tests for ANOVA that control the error rate for multiple comparisons between groups. In this scenario, the O’Brien-Fleming method controls the error rate to account for multiple data assessments.
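A small simulation can show both the problem and the fix. The Python sketch below assumes three equally spaced looks at the data, which is an assumption for illustration only, since I don't know the study's exact schedule. It simulates trials where the null hypothesis is true and compares two rules: reusing the usual critical value of 1.96 at every look, and using O'Brien-Fleming-shaped boundaries of the form c × sqrt(K / k). The constant 2.004 is the commonly tabulated value for three looks at a two-sided 0.05 level (one-sided 0.025).

```python
import math
import random

def simulate_type1(n_trials, looks, boundaries, seed=0):
    """Estimate the one-sided false positive rate under the null when a
    z-statistic is checked at several equally spaced looks."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_trials):
        running_sum = 0.0
        for k in range(1, looks + 1):
            running_sum += rng.gauss(0.0, 1.0)  # new data since the last look
            z = running_sum / math.sqrt(k)      # cumulative z-statistic
            if z > boundaries[k - 1]:
                false_positives += 1
                break                           # the trial stops at the first crossing
    return false_positives / n_trials

LOOKS = 3  # assumed number of looks, for illustration

# Naive rule: reuse the single-look critical value at every look.
naive = [1.96] * LOOKS

# O'Brien-Fleming shape: very strict early, close to 1.96 at the final look.
obf = [2.004 * math.sqrt(LOOKS / k) for k in range(1, LOOKS + 1)]

naive_rate = simulate_type1(100_000, LOOKS, naive)
obf_rate = simulate_type1(100_000, LOOKS, obf)
print(f"Naive repeated testing: {naive_rate:.3f}")
print(f"O'Brien-Fleming bounds: {obf_rate:.3f}")
```

The naive rule should produce a false positive rate well above the intended 0.025, while the O'Brien-Fleming boundaries keep it near 0.025. The price is that the early looks require very strong evidence before the study can stop.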

The study also uses a significance level of 0.025 for its one-sided test. This value is lower than the standard significance level of 0.05, which reduces the Type I (false positive) error rate: a one-sided test at 0.025 places the same probability in the tail of interest as a two-sided test at 0.05. Presumably, they’ve lowered the significance level to counteract the problem of increased false positives in the direction of interest with one-tailed tests.

Moderna’s COVID-19 vaccine (and Pfizer’s) produced statistically significant results during an interim analysis. Fortunately, their design allows them to assess the results early without increasing the false positive risk. Consequently, their vaccine will likely receive emergency use authorization in the United States and elsewhere earlier than if they had to wait until the study’s end.

**Related posts**: Type I and Type II Errors, Understanding Significance Levels, and When You can Use One-Tailed Tests

## Statistical Results: COVID-19 Vaccine Effectiveness and Safety Interim Analysis

The full data and analyses are currently unavailable, but we can evaluate their interim analysis report. Moderna (and Pfizer) are still assessing the data and will present their analyses to Federal agencies in December 2020.

Moderna reports that in the approximately 30,000 subjects, there were the following counts of COVID-19 cases:

**Vaccine**: 5 (0 severe)

**Control**: 90 (11 severe)

Both groups should have approximately the same number of participants. There are about 30,000 subjects, and the researchers split them evenly between the two groups. While attrition rates were likely a bit unequal, the group sizes should be roughly equal with about 15,000 in each. Using a simple visual assessment, you can see that the vaccine group had many fewer cases and no severe COVID-19 cases.

### COVID-19 Effectiveness in Detail

Let’s see how analysts use these counts to calculate vaccine effectiveness! As I mentioned earlier, vaccine effectiveness is based on a hazard ratio. Typically, hazard ratios in medical studies are the probability of an event occurring in the treatment group divided by the probability of the event in the control group. Vaccine effectiveness calculations subtract the hazard ratio from one:

$$
\text{Vaccine effectiveness} = 1 - \frac{P(\text{COVID-19 case} \mid \text{vaccine})}{P(\text{COVID-19 case} \mid \text{control})}
$$

As the numerator of the ratio decreases relative to the denominator, the value of that ratio decreases. Because we’re subtracting the ratio from 1, a smaller ratio produces a value close to 1 or 100%. For this study, a smaller probability in the numerator relative to the denominator indicates that the vaccine reduces COVID-19 infections. Just by looking at the numbers, we know this is true.

Now, let’s use the reported number of cases and an estimated sample size of 15,000 per group to calculate vaccine effectiveness!
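The arithmetic is short enough to write out. This Python sketch uses the reported case counts and assumes exactly 15,000 participants per group, which is my estimate rather than the actual enrollment; the helper name `vaccine_effectiveness` is just for this example.

```python
def vaccine_effectiveness(cases_vaccine, n_vaccine, cases_control, n_control):
    """Return 1 minus the ratio of case proportions (vaccine / control)."""
    risk_vaccine = cases_vaccine / n_vaccine
    risk_control = cases_control / n_control
    return 1 - risk_vaccine / risk_control

# Reported interim counts; group sizes are an assumed 15,000 each.
ve = vaccine_effectiveness(5, 15_000, 90, 15_000)
print(f"Estimated vaccine effectiveness: {ve:.1%}")  # 94.4%
```

With equal group sizes the calculation reduces to 1 − 5/90, or about 94.4%. That is very close to the 94.5% that Moderna reports. The small gap is expected because the true group sizes and follow-up times are not exactly equal, and the full analysis accounts for them.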

The Data Safety and Monitoring Board (DSMB) also reported that these findings are statistically significant. Statistical significance indicates the data favor the theory that this effect exists in the population. In other words, we have reason to believe that using the vaccine on people outside the experimental sample will reduce the prevalence of COVID-19 infections in the overall population.
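To get a feel for how strong this evidence is, consider a simplified check of the null hypothesis that vaccine effectiveness is ≤ 30%. This is not the DSMB's analysis. It is a sketch that assumes equal-sized groups with equal follow-up, ignores the adjustment for interim looks, and swaps the study's regression model for an exact binomial test on the share of cases in the vaccine group. If effectiveness were only 30%, about 0.7 / 1.7 ≈ 41% of all cases would be expected in the vaccine group.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

ve_null = 0.30
# Expected share of cases in the vaccine group if VE were exactly 30%
# (assumes equal-sized groups with equal follow-up).
share_null = (1 - ve_null) / (2 - ve_null)

vaccine_cases, control_cases = 5, 90
total_cases = vaccine_cases + control_cases

# One-sided p-value: chance of seeing 5 or fewer of the 95 cases in the
# vaccine group if the null hypothesis were true.
p_value = binom_cdf(vaccine_cases, total_cases, share_null)
print(f"Expected vaccine-group share under the null: {share_null:.3f}")
print(f"Approximate one-sided p-value: {p_value:.1e}")
```

The resulting p-value is many orders of magnitude below the 0.025 significance level, so even under strict interim boundaries these counts are far more extreme than the null hypothesis can plausibly explain.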

Moderna is still collecting safety data. Before the FDA can approve the vaccine, it must be demonstrably safe and effective. As of now, the DSMB reports that participants tolerate the vaccine well, and it did not identify any significant safety concerns. The most common side effects are typical for vaccinations: pain at the injection site, fatigue, and aching muscles and joints.

## How this COVID-19 Vaccine Study Obtained Early Results

I showed how using the O’Brien-Fleming method allowed the experimenters to assess the data early without increasing the false positive rate. That was instrumental for being able to produce these significant results at this interim point.

The higher than expected vaccine effectiveness also plays a role. The study’s analysts used a planning value of 60% effectiveness for the power and sample size calculations. However, the actual effectiveness is much higher at 94.5%. This higher value is easier for the hypothesis test to detect.

Additionally, the study planned to assess the data at intervals defined by the number of cases in their sample. Because COVID-19 is spreading faster in communities than expected, the study obtained these prespecified case counts earlier than anticipated. While this faster spread is unfortunate, it has a silver lining. It allowed these vaccine studies to prove their effectiveness more quickly.

In short, the analytical procedures, higher effectiveness, and the rapid community spread of COVID-19 allowed this study to demonstrate that the vaccine is effective earlier than initially anticipated.

For an interesting comparison to flu vaccinations, read my post about the effectiveness of flu shots!

KS Lam says

FDA issued a guidance regarding Statistical Considerations stated below.

“To ensure that a widely deployed COVID-19 vaccine is effective, the primary efficacy endpoint point estimate for a placebo-controlled efficacy trial should be at least 50%, and the statistical success criterion should be that the lower bound of the appropriately alpha-adjusted confidence interval around the primary efficacy endpoint point estimate is >30%”. See the reference.

To satisfy these two criteria (50% efficacy and 30% CI lower limit), let me use an example of 50 cases arising in those vaccinated and 100 cases arising in those given placebo to show

1. Vaccine Efficacy (VE) = [(attack rate in the unvaccinated group ARU – attack rate in the vaccinated group ARV) / attack rate in the unvaccinated group ARU] * 100%, i.e., VE = (100-50)/100 * 100% = 50%. VE 50% is a point estimate which is not dependent on the sample size. If we have 5 cases in the vaccination group and 10 cases in the placebo group, the VE will still be 50%. Commercial software can be used to calculate the sample size.

2. Statistical power calculation is based on Bayesian Beta Binomial distribution. Use Excel function BETA.DIST(0.41176,50.700102,101,TRUE) to compute. The calculated posterior probability is 97.6% with prior beta (0.700102, 1) provided by Pfizer.

Reference

https://www.fda.gov/regulatory-information/search-fda-guidance-documents/development-and-licensure-vaccines-prevent-covid-19

Scott Shannon says

Thanks for the feedback. I have been trying to make sense of all the statistics flying around in the media and attain a better understanding of how reliable they are. Yes, I have been thinking these are early days also. Your discussion on this has helped. The sample sizes being utilized to make statistical inferences about the population just seem to be so small for making reliable estimates. I am wondering how a study with 90-180 symptomatic COVID cases from a group of 30,000 can be big enough to make a reliable estimate about the effectiveness of the vaccine. The initial percentage of symptomatic COVID cases by Moderna for the placebo group was 90/15000=6/1000, which is in the range of the current death rate for the entire US population and probably much lower than the symptomatic rate. It just looks like they are taking a sample of a sample and basing a vaccine effectiveness on this data. But what is too small, statistically speaking? As you explain in your post about the Moderna trial, the statistical power and sample size analysis estimated 151 COVID cases should give the study a good chance of detecting vaccine effectiveness. I probably need to attain an understanding of the statistical power and sample size analysis to get a better feel for this.

KS Lam says

A simple formula for calculating confidence intervals by means of a Taylor series variance approximation has been recommended by WHO for gauging the precision of estimates of vaccine efficacy.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2491112/

Based on the given formula, I developed an Excel template to compute the Pfizer-BioNTech and Moderna mRNA vaccine clinical data submitted for FDA EUA recently.

Scott Shannon says

In the article you wrote about the flu vaccine, you presented both the relative and absolute risks associated with the vaccine. Is it possible to assess the absolute risk from the early COVID test results you presented in this article? I am wondering about this since the effectiveness of the COVID vaccine is computed from early or incomplete results, which makes the resultant percentages for the placebo and vaccinated proportions very low.

Jim Frost says

Hi Scott,

I agree, I wouldn’t take the proportions of infections within each group as the true probability of being infected by COVID. For the flu studies, they’re the infection rates over the entire flu season and so they make sense as your absolute risk. As you point out, we don’t have that for the COVID studies. You could calculate an annualized rate but I wouldn’t recommend that given the short observation time. And, at least for the Moderna study, they’re just looking at symptomatic cases. We should be able to know the absolute symptomatic risk but that won’t account for the asymptomatic cases. From a disease standpoint, that’s ok: what matters is your risk of being a symptomatic case. However, from a virus spread standpoint, we don’t know if it prevents people from spreading the disease. They might be asymptomatic but still spread it, which affects modeling of COVID’s spread and prevalence.

It’s early days and we’ll need to wait for more data and over a longer time to really get a good handle on these questions!

Andreas Lehmann says

Thanks for the great article! In times of headlines that vaccine A is better than B, I would be highly interested in the question, how to calculate the confidence interval for the hazard ratio. How can this be done, do I need to use the clopper-pearson confidence interval? I would love to see an article on this ;).

Sean says

Hi Jim

As always the articles are good, thanks

Most of this is new to me so a question please

Were the participants specifically exposed to the CV or were they supposed to get exposed to it in the open per chance?

How do we know there was even a chance of infection or is this where the control vs non control group in the same physical area/region are compared side by side to know there was or was not a CV prevalence? Or even any correction for mask vs non mask usage/other hygiene practices etc. Is it assumed to be of little relevance based on sample size?

Thanks

Sean

Jim Frost says

Hi Sean,

This type of trial tries to incorporate real world aspects. So, the participants are given the vaccine or placebo and then they’re allowed to live their lives normally. The only change in their routine is the requirement to visit the lab several times. They are not intentionally infected. Instead, the idea is to see the real-world impact of these vaccines.

In some cases, there are trials that restrict subjects and/or intentionally infect them. I’m not sure if they did that in the earlier phases for this vaccine or not. If they did, that would’ve been an entirely different set of volunteers.

The subjects were randomly assigned. Consequently, the treatment and control groups should be roughly equal in terms of behaviors, opinions about the pandemic, and how seriously they take precautions.

The study also gathered subjects from all over the United States to cover regional differences. There might have been other trials elsewhere (I’m not sure) but this trial covered most of the United States.

sandesh says

Hi Jim,

Thanks for sharing such a great article. It really helped to understand the entire process.

Couple of queries, on basis of whatever I have read / heard about COVID ..

1. COVID affects individuals with lower immunity / higher stress. While aligning individuals to treatment / control group, was this managed and if yes, how?

2. As per the procedure, to be labeled as “COVID”, the subject must have symptoms plus a positive test result. If it had been only “positive test result”, would the results have been different, as many individuals are asymptomatic carriers?

3. Are the subjects kept in any centralized location? Are there any restrictions on them or they can continue with their life as usual?

Regards

Sandesh

Jim Frost says

Hi Sandesh,

As I indicate in the article, they intentionally gathered a sample that had various categories of at-risk participants. These would include the elderly and other people with weaker immune systems. These people were randomly assigned to the vaccine and placebo groups. Consequently, they’ll be able to determine how well the vaccine protects people with weakened systems.

It’s hard to predict whether the results would’ve been different if they considered asymptomatic carriers. I don’t see the vaccine being less effective in the asymptomatic people, but it could actually be more effective. I would not hazard a guess. However, most importantly, it does reduce the number of cases with symptoms.

Subjects continue to live their lives normally. No restrictions. One of the ideas for this study is to understand the real-world effect of this vaccine. The only change in their routine is the requirement to visit the lab several times, record how they’re feeling in a diary, and several calls from the lab.

sandip says

Very interesting article to read how statistical inference gets used to make such great decision ….thanks Jim for such wonderful article…just one small doubt while selecting participant in experimentation were they all healthy participant means not already infected with covid ? or mix of healthy and infected?

hope you understand my subject selection problem

thanks

Jim Frost says

Hi Sandip,

Thanks so much, I appreciate the kind words!

For this study, they excluded anyone who had COVID previously. They only accepted people who never had COVID before. However, they could have other health conditions besides COVID.

Great question!

Shahram says

Thank you so much for this amazing post. Would you please describe more about the 60% effectiveness that researchers used for power analysis? How did they find this 60% effectiveness? How did they get to the 90% chance of rejecting the null hypothesis for 151 cases? This part was confusing for me!

Thanks again

Jim Frost says

Hi Shahram,

When studies are in the planning stages, they need to estimate the size of an effect that they want to detect and then perform what is known as a power and sample size analysis to estimate a good sample size. Click the link to better understand that process. If they don’t do that power analysis, there’s a chance that they might use a sample size that has very little chance of detecting a meaningful effect. For this study, they used the planning value of 60% vaccine effectiveness as an effect that they wanted to detect with a high probability. Using that information, they estimate the number of cases they’d need, which also impacts the sample size. Basically, they designed the experiment around the goal of having a 90% probability of having significant results in their sample data if the vaccine effectiveness is 60% in the population.

That’s the chain of logic they used for getting to the 151 cases. Unfortunately, I have not seen the actual math and assumptions beyond what I write about. Additionally, the software I have used cannot perform power analysis for the type of regression model they use. So, I can’t replicate it. But, if you read the link in this comment, you can see how that process works for other examples.

Stan Alekman says

You use the term effectiveness. There is also the term efficacy. Please distinguish between the two terms.

Efficacy is calculated by the difference in infection proportions of the two groups. When efficacy is reported in the media, it is without confidence intervals. The CIs can be very wide.

Jim Frost says

Hi Stan,

Effectiveness vs. efficacy is an interesting topic with some grey areas. As I understand it, the main differences are the conditions under which the vaccines are administered and to whom they are administered.

From the CDC website:

Vaccine efficacy refers to vaccine protection measured in RCTs usually under optimal conditions where vaccine storage and delivery are monitored and participants are usually healthy.

Vaccine effectiveness refers to vaccine protection measured in observational studies that include people with underlying medical conditions who have been administered vaccines by different health care providers under real-world conditions.

This study has characteristics of both. It is an RCT but vaccines were administered to people with underlying medical conditions in a variety of locations—a mix of both those definitions.

Moderna’s literature about the experimental design and statistical analyses refers to “vaccine effectiveness” and I used their terminology.

I too would like to know the CIs for their results. From their power analysis, we know they expect a 60% effectiveness point estimate to correspond with a 97.5% lower bound of 30%.

However, considering the very large sample size, much higher than expected vaccine effectiveness, and statistically significant results, I take it all as promising news!

Scott Stevens says

Excellent job. Thank you. I’ll be using this with my stat students!

Jim Frost says

Hi Scott, thanks! And thanks for sharing with your students!

Glenn Gee says

It was a very informative analysis of the methodology used for the vaccine. I would like your permission to use the article in my experimental design class. Thank you.

Jim Frost says

Hi Glenn,

Thanks for writing and I’m glad you found it informative. I just sent you an email.

Dr Virendra Mishra says

Sir, you have done a wonderful task.

Jim Frost says

Thank you!