Open Science for Synthesis
This hands-on data science course was designed for both early career and established researchers to gain skills in data science, including scientific synthesis, reproducible science, and data management.
Participants came to NCEAS for three weeks of intensive training in scientific computing and scientific software for reproducible science. We held this course three times between 2013 and 2017.
This page archives the descriptions and materials for those courses.
Course Format
The training revolves around scientific computing and software for reproducible science. Our instructors emphasize integrating statistical analysis into well-documented workflows using open-source, community-supported programming languages. Participants learn skills for rapid and robust implementation of open-source scientific software. These approaches are explored and applied to ecological, environmental, evolutionary, Earth, and marine science synthesis.
The course focuses on techniques for data management, scientific programming, synthetic analysis, and collaboration through the use of open-source, community-supported tools.
The course weaves together several core themes which are reinforced through daily work on group synthesis projects. Core training themes address:
- Collaboration modes and technologies, virtual collaboration
- Data management, preservation, and sharing
- Data manipulation, integration, and exploration
- Scientific workflows and reproducible research
- Agile and sustainable software practices
- Data analysis and modeling
- Communicating results to broad communities
Throughout the course, participants receive a solid foundation in the computing fundamentals needed for synthetic research in today’s computation- and data-intensive era. This includes:
- Instruction on languages like R and Python for data manipulation, analysis, and visualization
- Analytical techniques for synthesis research, including meta-analysis and systematic reviews
- Survey of general programming constructs, paradigms, and best practices
- Exposure to the Linux/UNIX command line environment and useful tools
- Demystification of the aspects of modern computers that have a bearing on effective science
- Discussion of cyberinfrastructure trends supporting open, networked, reproducible science
Group Synthesis Projects
Participants form small synthesis teams that apply the software skills they learn each day to cross-cutting scientific research projects. Using an open community engagement process, participants maximize their success in collaborative research, which can lead to publishable results.
Previous Courses
This course, held July 10-28, 2017, focused on synthesis skills for understanding the complex environmental, human, and energy systems of the Gulf of Mexico, especially following large disturbance events like the 2010 Deepwater Horizon oil spill. Participants applied what they learned in scientific synthesis projects on these systems, increasing the capability and efficiency of synthesis research among Gulf researchers. Funding for this course was provided by the Gulf Research Program, which is dedicated to improving understanding of the Gulf of Mexico’s human, environmental, and energy systems in response to the Deepwater Horizon oil spill.
Go to course curriculum on GitHub >>
Instructors
- Matt Jones, NCEAS
- Amber Budden, DataONE
- Tracy Teal, Data Carpentry
- Mark Schildhauer, NCEAS
- Bryce Mecum, NCEAS
- Chris Lortie, NCEAS and York University
- Leah Wasser, EarthLab, University of Colorado, Boulder
- Julien Brun, NCEAS
NCEAS co-led this course with the University of North Carolina’s Renaissance Computing Institute (RENCI) from July 21 to August 8, 2014, with participants in both Santa Barbara, CA and Chapel Hill, NC. The training was sponsored by the Institute for Sustainable Earth and Environmental Software (ISEES) and the Water Science Software Institute (WSSI), both of which are conceptualizing an institute for sustainable scientific software. Participants received hands-on guided experience using best practices in the technical aspects that underlie successful open science and synthesis, from data discovery and integration to analysis and visualization, along with special techniques for collaborative scientific research, including virtual collaboration over the Internet.
Go to course materials on GitHub >>
Instructors
- Stanley C. Ahalt, Renaissance Computing Institute (RENCI), UNC Chapel Hill
- Nancy Baron, COMPASS
- Ben Bolker, McMaster University
- Stephanie Hampton, Washington State University
- Jeff Heard, RENCI and TerraHub LLC
- Matt Jones, NCEAS
- Chris Lenhardt, RENCI
- Karthik Ram, Berkeley Initiative for Global Change Biology, UC Berkeley
- Stacy Rebich Hespanha, NCEAS
- Mark Schildhauer, NCEAS
- Michael Stealey, RENCI
- Greg Wilson, Software Carpentry
This three-week intensive training in ecological analysis and synthesis was offered from June 19 through July 10, 2013. Participants received hands-on guided experience using best practices in the technical aspects that underlie successful synthesis, from data discovery and integration to analysis and visualization, along with special techniques for collaborative scientific research. The Packard Foundation provided generous support for the institute.
Go to course materials on GitHub >>
Instructors
- Ben Bolker, McMaster University
- Stephanie Hampton, NCEAS
- Matt Jones, NCEAS
- Jim Regetz, NCEAS
- Mark Schildhauer, NCEAS