Latin American R/BioConductor Developers Workshop 2018
General information
Conference webpage: http://www.comunidadbioinfo.org/r-bioconductor-developers-workshop-2018/
Level: intermediate – advanced
Language: English
When: July 30 – August 3, 2018
Where: Classroom #4 of the Undergraduate Program on Genomic Sciences at the Center for Genomic Sciences, Cuernavaca, Mexico
Twitter: @CDSBMexico
GitHub: https://github.com/ComunidadBioInfo/R-BioConductor-Developers-Workshop-2018
Prerequisites
Prior knowledge requirements
- Participants should have basic knowledge of the R programming language: variable assignment; reading files with read.csv, read.delim, and read.table; data structures (matrix, data frame, list); data types (character, numeric, factor, logical, etc.); and installing and using packages.
- Know how to install R packages.
- Know how to use RStudio.
Technical requirements
- Personal computer
A minimum of 8 GB of RAM, a mouse, and enough disk space for text and image files. Administrator privileges to install and run RStudio utilities.
Introduction
In recent years, biology has seen a rise in the use of technologies that enable high-throughput, quantitative, data-rich profiling of cellular states. As a result, the field now faces computational challenges to analyse such data. The R/Bioconductor project is an open source, open development software platform that provides tools to translate complex data sets into biological knowledge.
This workshop is aimed at students and researchers interested in the analysis of biological data. We encourage applications from experts in diverse disciplines, including but not limited to biologists, bioinformaticians, data scientists, software engineers, programmers, and R users at large. The main goals of the workshop are:
- Teach participants the principles of reproducible data science through the development of R/Bioconductor packages.
- Turn bioinformatic software users into bioinformatic software developers.
- Foster the exchange of expertise and establish multidisciplinary collaborations.
- Create a community of Latin American scientists committed to the development of software and computational pipelines for biological data analysis.
- Help train bioinformatics instructors who can continue to grow in their local communities.
This workshop is part of a long-term project to create a community of developers from Latin America. We hope to hold regular meetings in the future (similar to BioC, EuroBioc and BioCAsia) where attendees present their own software contributions. To provide a welcoming environment, please follow our code of conduct.
Program
Day 1: July 30, 2018 | ||
---|---|---|
09:00 – 10:00 | Inauguration in the main auditorium | |
10:00 – 10:30 | Keynote Lecture I: From learning to using to teaching to developing R | Leonardo Collado-Torres |
10:30 – 11:00 | Talk I: Example of Bioinformatics in Mexico | Daniel Piñero |
11:00 – 11:20 | Coffee break | |
11:20 – 12:20 | Creating a package | Alejandro Reyes |
12:20 – 12:40 | Break | |
12:40 – 14:00 | Version control with git and GitHub | Selene Fernandez-Valverde |
14:00 – 15:30 | Lunch break | |
15:30 – 16:15 | Open source software projects and collaborative development | Selene Fernandez-Valverde |
16:45 – 17:30 | Package documentation | Alejandro Reyes |
17:30 – | Welcome cocktail | |
Day 2: July 31, 2018 | ||
9:00 – 10:00 | Keynote Lecture II | Martin Morgan |
10:00 – 10:30 | Talk II: Example of Bioinformatics in Mexico: Using R-Shiny in Agrobiodiversity | Alejandro Ponce-Mendoza |
10:30 – 11:00 | Coffee break | |
11:00 – 12:00 | Best practices for writing efficient functions | Martin Morgan |
12:00 – 12:30 | Break | |
12:30 – 14:00 | Bioconductor: core package, common objects and extending classes | Benilton de Sá Carvalho |
14:00 – 15:30 | Lunch break | |
15:30 – 17:30 | S4 – system for object oriented programming | Martin Morgan |
17:30 – 18:30 | Poster session | |
Day 3: August 1, 2018 | ||
9:00 – 10:00 | Collaborative project organization and introduction | Daniela Ledezma-Tejeida |
10:00 – 10:30 | Vignette writing with markdown/BiocStyle | Benilton de Sá Carvalho |
10:30 – 10:50 | Coffee break/Event Photo (to be confirmed) | |
10:50 – 11:30 | Unit testing and R CMD check | Martin Morgan |
11:30 – 12:10 | Rcpp (Adding C/C++ code to R packages) | Benilton de Sá Carvalho |
12:10 – 12:30 | Break | |
12:30 – 14:00 | Debugging and Parallelization | Martin Morgan |
14:00 – 15:30 | Lunch break | |
15:30 – 17:30 | Working on a Collaborative Project | |
Day 4: August 2, 2018 | ||
9:00 – 10:00 | Keynote Lecture IV | Benilton Carvalho |
10:00 – 10:30 | Talk III: Example of Bioinformatics in Mexico: RLadies Community Experience | Teresa Ortíz |
10:30 – 11:00 | Coffee break | |
11:00 – 14:00 | Working on a Collaborative Project | |
14:00 – 15:30 | Lunch break | |
15:30 – 17:30 | Working on a Collaborative Project | |
Day 5: August 3, 2018 | ||
9:00 – 10:00 | Working on a Collaborative Project | |
10:00 – 10:30 | Teams: Concluding remarks about the experience | |
10:30 – 11:00 | Coffee break | |
11:00 – 12:00 | Presentation of the packages developed and the lessons learned during collaborative project development | |
12:00 – 13:00 | Evaluate projects and award ceremony | |
12:00 – 13:00 | Closing remarks and community building | |
Instructors
- Martin Morgan (Roswell Park Comprehensive Cancer Center, Buffalo, USA)
- Benilton Carvalho (University of Campinas, Campinas, Brazil)
- Selene Fernandez-Valverde (National Laboratory of Genomics for Biodiversity, Irapuato, Mexico)
Organizing Committee
- Heladia Salgado (Center for Genomic Sciences, Cuernavaca, Mexico)
- Leonardo Collado-Torres (Lieber Institute for Brain Development, Baltimore, USA)
- Alejandra Medina-Rivera (International Laboratory for Human Genome Research, Juriquilla, Mexico)
- Alejandro Reyes (Dana-Farber Cancer Institute, Boston, USA)
- Delfino García (Center for Genomic Sciences, Cuernavaca, Mexico)
- Daniela Ledezma-Tejeida (Center for Genomic Sciences, Cuernavaca, Mexico)
- Laura Gómez (Center for Genomic Sciences, Cuernavaca, Mexico)
Instructors
Martin Morgan, PhD
Dr. Morgan spent 10 years as an Assistant and then Associate Professor at Washington State University, before joining the Fred Hutchinson Cancer Research Center in 2005. At the Hutch, Dr. Morgan worked on the Bioconductor project for the analysis and comprehension of high-throughput genomic data; he has led Bioconductor since 2008. Dr. Morgan recently moved to Roswell Park Comprehensive Cancer Center in Buffalo, NY, where the Bioconductor project is now based.
Benilton S Carvalho, PhD
Statistician (B.S., M.Sc.), Biostatistician (Ph.D.)
Instructor: Statistics – Regression Models;
Database designer and developer: MySQL, PHP;
Teaching Assistant: Biocomputing, Statistical Computing and Statistical Methods in Public Health;
Developer: BioConductor – oligo, makePlatformDesign, crlmm, pdInfoBuilder.
Specialties: High-throughput genotyping, microarray preprocessing, statistical modelling, gene-expression analyses, genetic epidemiology, statistical computing, programming (R, C/C++, Matlab).
Selene Fernández V., PhD
Selene Fernandez is a bioinformatician/genomic data scientist studying the evolution of gene regulatory mechanisms underlying the phenotypic diversity and cell differentiation in multicellular eukaryotes. She is particularly interested in the evolution of regulatory roles of non-coding RNAs.
In addition to her academic roles, she participates in initiatives that link scientists and the general public (such as Mas Ciencia por Mexico and Clubes de Ciencia Mexico), programs that introduce researchers to computing (Software/Data Carpentry), and mentoring programs (Ekpapalek).
Expertise: Analysis of high-throughput (next generation) sequencing data, transcriptomics, genome annotation, non-coding RNAs.
Alicia Mastretta Yanes, PhD
Alicia's research broadly addresses how Mexican biodiversity has evolved from a genetic perspective. This includes changes in species distributions due to historical climate fluctuations (e.g. the Pleistocene glacial ages) as well as the effects of human management and domestication.
María Teresa Ortiz, PhD
María is a data analyst with experience in ecology and marketing. Her interests include statistical modeling (hierarchical models, spatial statistics, Bayesian networks), survey design and survey data analysis, machine learning, and data visualization.
Alejandro Ponce-Mendoza
Hi! I studied Food Technology at UIA and completed my master's and PhD in Biotechnology and Bioengineering at CINVESTAV. I have spent several years in postdocs and jobs at many institutions (ECOSUR, UNSIJ, INIFAP, CONABIO, UAM-X). I'm interested in numerical ecology, data visualization and agrobiodiversity. Finally, I enjoy bicycle touring and admire Thomas Bernhard, W.G. Sebald and Glenn Gould.
Alejandra Medina Rivera, PhD
Alejandra obtained her Ph.D. in 2012 from the Biomedical Sciences Program at the National University of Mexico (UNAM). Since her Ph.D., she has focused on developing bioinformatic tools and strategies to study gene regulatory mechanisms; most of these tools are now part of the Regulatory Sequence Analysis Tools suite (RSAT, http://rsat.eu/). Her current research uses computational approaches to incorporate functional genomics data into Genome Wide Association Studies (GWAS), aiming to identify variants that lead to misregulation of gene expression.
Alejandro Reyes, PhD
Alejandro Reyes is a postdoctoral research fellow in Rafael Irizarry’s laboratory at Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health. He is interested in (1) understanding how transcript isoforms contribute to cellular phenotypes and disease conditions and (2) integrating multi-omic data to unravel molecular cancer phenotypes. To ensure reproducibility of results, he implements analyses in documented workflows, software packages and graphical interfaces. He contributes to the Bioconductor project.
Leonardo Collado-Torres, PhD
Leonardo is a data scientist working with Andrew Jaffe at the Lieber Institute for Brain Development. He uses R packages daily and contributes to the Bioconductor project. Leonardo is interested in high-throughput genomics assays such as RNA-seq, developments in R and helping others get started in their R journey. He has been learning & teaching about R since 2008 and is a co-founder of the LIBD rstats club.