Dr. Leithen M’Gonigle Lab, SFU

   Research Assistant & Data Analyst

   May 2020 - Present


  • Master's Degree: Project yet to be determined

  • Undergraduate Thesis: Analysis of continent-wide bumble bee presence/absence data with occupancy modelling

  • Undergraduate Researcher: Inferring bumble bees' preferred habitat characteristics with occupancy modelling


  • Managed, reshaped, and analysed bumble bee habitat data in R

  • Ran occupancy models and interpreted output biologically

  • Scientific reporting

Image by Ilana Grostern


This is the project that taught me how much I love data. I always had a suspicion that the joy I felt while working on statistics and data visualization for my classes was a sign of a data-based career path that might be open to me, but the M'Gonigle Lab made it clear to me that this was true.

It started out as a field job. We were going to go to Tweedsmuir Provincial Park in northern British Columbia to collect data on bumble bees and flowers. It was possibly the most excitement I've felt since I was a small child version of Hanna who learned she was going to Disneyland.

However, just as all data have variation and unexpected problems, this plan would turn out to have high variation and many unexpected problems, with a global pandemic as the main source...

So, with field work cancelled and hopes of a summer in the mountains squashed, we set our sights on making the best of our summer research grants anyway. This is where I have to thank Dr. Leithen M'Gonigle for being such a generous supervisor. We spent the summer fully engaged in learning to be fluent with R, and settled on a project based on what we had learned. With the previous summer's field data in hand, we decided to do an analysis via a very interesting tool called an occupancy model...

What is an occupancy model?

For details see the notes on occupancy models section (coming soon!)

We visit each site multiple times and use these repeat visits to understand detection probability. At sites where the species was seen on at least one visit, we look at the percentage of visits to those sites on which the species, which we KNOW is there, was seen. This is detection probability in a nutshell.

We then use this measure of our ability to detect species that are present to estimate how many sites did have the species even though we did not detect it there (this estimate is based on occupancy probability). We can also put predictors on detection, such as sampling effort, to help the model parse variation in detection probability between visits and sites.
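The arithmetic behind these two paragraphs can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not the likelihood-based estimate that a real occupancy model produces. The detection histories are made up, the function names are hypothetical, and the sketch assumes every site was visited the same number of times with a single detection probability shared by all sites and visits.

```python
# Hypothetical detection histories: one row per site, one entry per visit
# (1 = species seen, 0 = not seen). These numbers are invented for illustration.
histories = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]


def naive_detection(histories):
    """Share of visits with a detection, among sites with at least one detection."""
    detected_sites = [h for h in histories if any(h)]
    total_visits = sum(len(h) for h in detected_sites)
    total_hits = sum(sum(h) for h in detected_sites)
    return total_hits / total_visits


def corrected_occupied_sites(histories, p):
    """Scale up the number of sites with detections to allow for missed sites.

    An occupied site is missed on every one of K visits with probability
    (1 - p) ** K, so it is detected at least once with probability
    1 - (1 - p) ** K. Assumes every site has the same number of visits.
    """
    n_visits = len(histories[0])
    p_detected_at_least_once = 1 - (1 - p) ** n_visits
    n_sites_with_detection = sum(1 for h in histories if any(h))
    return n_sites_with_detection / p_detected_at_least_once


p_hat = naive_detection(histories)
n_occupied = corrected_occupied_sites(histories, p_hat)
print(p_hat, n_occupied)
```

With these invented numbers, the species was seen on 6 of the 12 visits to sites where it was detected, so the rough detection probability is 0.5, and the 3 sites with detections scale up to an estimated 3.2 occupied sites out of 5.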

Detection probability is used to adjust occupancy probability, because our observations of which sites were occupied and which were not are biased by imperfect detection. Just like detection probability, occupancy probability can have predictors too. These are usually site-level characteristics like food, water, shelter, and space resources that are thought to influence where an organism may or may not live (occupy).
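For readers who like notation, the ideas above are commonly written as the standard single-season occupancy model shown here. This is the textbook form, not necessarily the exact model from our analysis, and the symbols are chosen for illustration: z_i is the true (unobserved) occupancy state of site i, y_ij is the observation at site i on visit j, psi_i is occupancy probability, p_ij is detection probability, x_i is a site-level predictor, and w_ij is a visit-level predictor.

```latex
\begin{aligned}
z_i &\sim \mathrm{Bernoulli}(\psi_i),
  & \mathrm{logit}(\psi_i) &= \alpha_0 + \alpha_1 x_i \\
y_{ij} \mid z_i &\sim \mathrm{Bernoulli}(z_i \, p_{ij}),
  & \mathrm{logit}(p_{ij}) &= \beta_0 + \beta_1 w_{ij}
\end{aligned}
```

The product z_i p_ij is what ties the two halves together: if a site is unoccupied (z_i = 0), the species cannot be detected there, and if it is occupied (z_i = 1), it is detected on each visit with probability p_ij.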

How did we run the occupancy model?

A common way to run an occupancy model is in R, using JAGS, a separate program that R calls through an interface package.

The main inputs are all of the species observations (1s and 0s corresponding to presence and absence) in the form of an array. This means there is a site-by-visit matrix for each species, and these matrices are "stacked" on top of each other to form the array.
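As a rough picture of that array, here is a small Python sketch (our actual data preparation was done in R). The species names, site names, and detection records are all hypothetical placeholders.

```python
# Hypothetical labels and records, invented for illustration only.
species = ["species_a", "species_b"]
sites = ["site_1", "site_2", "site_3"]
n_visits = 2

# Each record says: this species was seen at this site on this visit (0-indexed).
detections = [
    ("species_a", "site_1", 0),
    ("species_a", "site_3", 1),
    ("species_b", "site_2", 0),
    ("species_b", "site_2", 1),
]

# Start with all zeros: one site-by-visit matrix per species, stacked in a list.
y = [[[0] * n_visits for _ in sites] for _ in species]

# Flip the matching cell to 1 for every recorded detection.
for sp, site, visit in detections:
    y[species.index(sp)][sites.index(site)][visit] = 1

for name, matrix in zip(species, y):
    print(name, matrix)
```

Indexing works as y[species][site][visit], so y[1][1] is the row of visits to site_2 for species_b.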

Other inputs are the site-level and visit-level predictors, supplied as vectors and matrices, which the model then uses as predictors of occupancy and detection probability.

The model is then written out, using mathematical statements of the theory from the previous section to tell JAGS how to estimate the parameters from the data given.
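To give a flavour of what "mathematical statements of the theory" means, here is a minimal Python sketch of the probability of one site's detection history under the simplest version of the model. It is an illustration under stated assumptions, not our JAGS code: the function name is hypothetical, and it uses one occupancy probability (psi) and one detection probability (p) with no predictors.

```python
def site_likelihood(history, psi, p):
    """Probability of one site's detection history (a list of 0/1 per visit).

    psi is the occupancy probability and p the per-visit detection
    probability; both are assumed constant here for simplicity.
    """
    # Probability of seeing exactly this history if the site is occupied.
    prob_if_occupied = 1.0
    for seen in history:
        prob_if_occupied *= p if seen else (1 - p)

    if any(history):
        # At least one detection: the site must be occupied.
        return psi * prob_if_occupied

    # No detections: either occupied but missed on every visit,
    # or truly unoccupied.
    return psi * prob_if_occupied + (1 - psi)


print(site_likelihood([1, 0], psi=0.5, p=0.5))  # 0.125
print(site_likelihood([0, 0], psi=0.5, p=0.5))  # 0.625
```

The second case is the important one: a row of zeros does not count as a plain absence, because part of its probability comes from occupied sites where the species was simply missed.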

The model output then told us which site- and visit-level predictors had a significant effect on either occupancy or detection!

What did we find?

As much as I would love to tell you, we're in the process of publishing our results!

Stay tuned for the full paper :)