Community to its Core
NCEAS’ popular five-day course “Reproducible Research Techniques for Synthesis” was recently given a refresh by our two new data training program managers, Halina Do-Linh and Camila Vargas Poulsen. The course, now called “coreR,” was updated to stress the importance of diverse participation and community building in data science alongside its core data science curriculum. We spoke with Halina and Camila about their vision for the course and their experience teaching it for the first time in April.
You can learn more about the Learning Hub on our website, including information on registering for our upcoming courses. coreR will be offered in the first week of October and April each year.
Can you tell me a little about coreR and who it is meant for?
Halina: This course brings people together across environmental fields to take their work to the next level – and data science is the way to get to that new level. We teach data science tools and reproducible workflows to improve people’s science and help them reach their own goals.
Camila: coreR is meant for environmental scientists and, more broadly, environmental professionals. You don’t need to be doing really hardcore data science to use the tools we introduce. I think the materials are beneficial for anyone using environmental data in any capacity.
What’s your favorite topic that you teach in the course?
Camila: I just love data visualization! The beauty of {ggplot} is that plotting isn’t overly complicated – but it still took me a while to understand the mechanics behind it. It’s satisfying to make that understanding so explicit. What I love about our {ggplot} lesson is that there is a lot of encouragement for individuals to explore – and that’s such an important skill in data science.
Halina: This answer might change over time, but I think teaching Git and GitHub. It was such a challenging tool to learn, but now I feel like I understand it well enough to teach it, demystify it, and even debug it! I feel like I’ve come full circle with Git and GitHub.
How did you approach redesigning the structure of the five-day course?
Halina: We really wanted the course to be a mixture of lessons and hands-on collaborative practice. So we structured it with lessons in the morning, followed by group and individual exercises in the afternoon. Anyone can learn these technical skills – that’s why so many people in data science are self-taught – but a course like this offers you practice in problem solving with a built-in community of support when you use the tools for the first time.
Camila: It was also important to us to integrate both technical and non-technical aspects of environmental data science. So for example, we talked about technical best practices with “tidy data,” but we also had a module on important data principles such as FAIR and CARE. We talk about the importance of documenting your data, and the process of archiving data. Whenever we are talking about data, it’s important to make space for the conversation about why we have this data and how we are using it – and that’s the data ethics side. We need to have these conversations sooner rather than later in our data science journeys.
With so many data science programs online, why do you think it’s important to do a course like coreR?
Camila: There is something about the energy of a classroom community, all learning coding together for five days. It feels special – and it builds a lot of support and trust. And the course format means we can move beyond just showcasing tools. So it’s not just teaching the {tidyverse} or {dplyr}, it’s all the building blocks for people to implement reproducible workflows.
Halina: Learning data science is hard! When you teach yourself, it can be really discouraging when you hit a wall and can’t debug an error because you don’t understand the concepts fully. Knowing you are in a room where it is safe to struggle and ask for support – from each other, the instructors, and even the greater NCEAS community – makes the course a really valuable experience.
How has your own journey with data science and finding your community impacted teaching coreR?
Halina: I primarily learned data science through the Master of Environmental Data Science program here at NCEAS and the Bren School. It was a small cohort and that allowed me to build a close community of people that I still reach out to today. I also joined other, larger communities like Minorities in R and the R-Ladies Santa Barbara chapter. Since I’ve learned so much environmental data science amongst communities, I see the value of teaching within a community as well. And I think because I just learned coding within the last year or so, I can relate to the struggles of participants more than someone who has been practicing data science for years. In teaching, I want to make data science as approachable as possible.
Camila: My data science journey also started with a very supportive team – the Ocean Health Index. When I started learning, I was amazed at how many people were putting time and effort into teaching each other. Everyone was just so willing to help. For coreR, that experience translated into the idea that everyone who wants to learn can do it. I encourage new learners to really search for these communities, because they do exist, and I try to replicate that supportive and welcoming environment in the class, empowering participants in their journey with data science tools.
How do you support diversity and inclusion in the coreR course? Can you talk a bit about the Director’s scholarship?
Halina: I never felt like my identity or experiences were reflected in my data science instructors. Communities like Minorities in R and R-Ladies have made me feel more included in what it means to be a data scientist. It’s really important to us in the Learning Hub to minimize barriers to learning environmental data science as much as possible. The Director’s scholarship is one way, offering a full ride each time we host coreR. That includes tuition, transportation, accommodations, and per diem. It’s awesome we can provide support in this holistic way so participants can focus on learning and not be distracted by financial hardships.
Camila: For me, it was really eye-opening to realize that there were more people who felt excluded in data science – and were working to make it more inclusive. It made me want to contribute. NCEAS gives me the space to think about what this inclusion can look like and incorporate more diversity. The Director’s scholarship is our main funding opportunity to make the course more accessible to a wider audience. We have already seen its positive effects and I hope we can expand these types of offerings.
Written by NCEAS Science Communication and Policy Officer Alexandra A. Phillips after an interview with the Learning Hub team.