The coronavirus cluster that sprang up after the Boston Biogen conference back in February was one of the first moments when the pandemic really hit home locally. It turns out the event may have had a bigger impact than anyone had realized. A new study suggests that thousands more cases of COVID-19 can be connected to the Biogen conference. One of the study's authors, Massachusetts General Hospital infectious disease physician Dr. Jacob Lemieux, spoke with WGBH All Things Considered host Arun Rath. This transcript has been edited for clarity.

Arun Rath: So casting our minds back, nearly 100 people who attended the Biogen conference at the Long Wharf Marriott Hotel came down with COVID-19. Could you explain how we got from 100 cases of the virus to the estimate of 20,000 cases?

Dr. Jacob Lemieux: Sure, and I think casting our minds back is really the right way to frame it, because it was a very different time and we had a very different and much more limited understanding of the coronavirus at that time.

We have a group that conducts what's called genomic epidemiology, which means we sequence the genetic code of the virus, connect that information up with epidemiological information, and use the genetic fingerprints of the virus to make inferences about ancestral processes in space and time that we weren't able to observe. So we conducted a study that is released in preprint, and I emphasize it hasn't been through the normal peer review process yet, in which we studied several clusters of COVID-19 in the Boston area.

One of those was a set of infections linked to a professional conference, and we found that all the infections that we looked at where patients had direct links to the conference shared a common genetic signature. And we found that that genetic signature was the dominant one in the Boston area. This likely occurred because that introduction and spreading event occurred early in the pandemic.

Pandemics are exponential amplification processes, so events that occur early have a greater probability of having a major impact. It also occurred at a time when there was much less understanding of the way that rapid transmission can occur, and of the risks of having groups of people indoors in close quarters. And it occurred at a time when we simply didn't realize that COVID-19 was circulating in Boston.

It's important to emphasize that the goal of the study wasn't to identify the total number of individuals affected by any one event, and that figure is really a ballpark estimate, nothing more, nothing less. The point is that super-spreading events, major clusters, can have far-reaching implications. They can influence a regional epidemic and they can measurably impact national and international spread, and that is the case here.

Rath: Reading about your research, one of the most fascinating aspects is the tracing of the genetic signature that you just talked about. In terms of different genetic varieties of COVID-19, you found that there were about 80 distinct types.

Lemieux: Well, we found that there were at least 80 introductions into the Boston area. And that is types that were circulating. We found at least 80 occurrences in which new viral genetic sequences were introduced into the area. Most of those came through Europe, and many of the ones that came through Europe came to other places in the United States and then were imported into Boston through regional spread and other domestic introductions.

So I think the message there is that there was an overwhelming force of COVID-19 entering into Boston, and it was coming here with or without any particular introduction, including the one which was amplified by the conference.

Rath: In doing this research, how do you get the virus samples? That's not a standard part of the coronavirus test that people get.

Lemieux: We were involved in early efforts to develop a diagnostic assay at Mass. General just to be able to diagnose the virus. Once we had successfully worked with the team to create a diagnostic assay that worked, we realized that we could use residual material from the nasopharyngeal swabs to attempt whole genome sequencing. When we used our standard off-the-shelf approaches to whole genome sequencing from these nasopharyngeal swabs, it actually worked surprisingly well. And that was the genesis of this study.

Rath: In tracing this, do you use this genetic information alongside contact tracing information? Or what other data are you pulling into this?

Lemieux: Well, that's it exactly. These are complementary data types. So we pull in conventional epidemiology, or what we sometimes call shoe-leather epidemiology, things like when exposures occurred and what contacts somebody had, as well as genetic sequence. They're complementary and synergistic, because when we put them together, we're able to glean insights that we wouldn't be able to ascertain with only one type of information. For example, using the genetic sequence, we can connect clusters of infections that did not at first glance appear to be connected. We can also ask whether infections that may be connected are linked by chains of transmission, and we have examples of both in the preprint.

Rath: You cautioned when we started talking that this study hasn't been peer reviewed. For people who don't read scientific publications, could you put that in context: what that means for what we're talking about, and why you're bringing this forward now, before it has been peer reviewed?

Lemieux: Well, the normal process of sharing science, which has been the process for hundreds of years, really, has been that scientists submit articles to journals and they're reviewed by expert peers, who recommend modifications or, in some cases, that the article not be published at all. The authors respond to those critiques to the satisfaction of the reviewers, and it's through that process of expert peer review that scientific articles enter the literature.

There has been a shift in recent years for several reasons, one of which is that that process can take time. Sometimes it can take months or even years, and that delays the pace of scientific research. If I have an important scientific finding to share that may influence the research you do, we don't necessarily want to wait a year and a half for that particular finding to be available for you to assess.

So there's a movement to share findings prior to publication in scientific journals using preprints. These preprints need to be viewed with a little more skepticism. I think there's an important role for asking outside experts to weigh in on the findings, and some of the reporting that's been done on the preprint acknowledges that and has looked to outside experts.

We are a group of over 50 researchers, so it isn't like we just threw this up. You know, we checked and double-checked and discussed the findings. But I do think that there is a difference between a preprint and a scientific finding that's gone through the process of peer review. And so it's a new time, we're all learning. And it doesn't mean that work that has gone through peer review is perfect, but it's an added layer of scrutiny.