Join our gene pool.
Research & Science
Sunnyvale, California, United States
With the world’s largest database of more than ten million genotyped customers, 23andMe is at the forefront of using human genetics to advance biomedical research and transform healthcare. Join our growing research team in translating genetic discoveries into insights into the molecular biology of human traits and diseases.
As a Scientist I/II, Bioinformatics (Genome Annotation) at 23andMe, Inc., the candidate will work closely with our statistical genetics and computational biology teams to source, curate, and implement databases for the analysis of genomic data in support of 23andMe’s drug target discovery and other research efforts.
DUTIES AND RESPONSIBILITIES:
- Identify, ingest, and provide access to a comprehensive set of genomic and functional datasets.
- Develop and implement a strategy for using and deploying genomic and functional annotation data at 23andMe to support the interpretation of human genetic variation.
- Develop tools and APIs for serving diverse genomics data types in a high-throughput and flexible manner.
- Follow the genetics, functional genomics, and bioinformatics literature and attend appropriate conferences to maintain awareness of desirable and state-of-the-art annotations and methods.
- Install, test, and deploy external tools to support curation, processing, and high-performance presentation of annotations to other researchers.
- Design and implement in-house tools and APIs when third-party tools are insufficient.
- Develop, evaluate, and apply methods and test plans to assure data integrity and consistency.
QUALIFICATIONS:
- Significant bioinformatics experience (at least three years) in a research setting, with a Ph.D. or M.S. in Bioinformatics, Computational Biology, Computer Science, Human Genetics, Molecular Biology, or a related field.
- Expertise with tools and methods for predicting functional consequences of genetic variants (VEP, snpEff, etc.), and with their underlying models and data sources.
- Proficiency with major bioinformatics resources (Ensembl suite, UCSC Genome Browser suite, TCGA, gnomAD/ExAC, GTEx, etc.) and in-depth understanding of at least one of these ecosystems.
- Hands-on experience with relational databases (MySQL, Postgres, etc.) and fluency with at least one advanced data storage and retrieval technology (such as Amazon Redshift, Hadoop, HDFS, etc.).
- Fluency in R/Python, and the ability to work in other languages as needed for data processing and API development.
- Experience with workflow development using shell scripting and standard GNU tools.
- Experience working in cross-functional teams.
- Understanding of gene expression and gene regulation datasets, such as e/p/mQTLs, chromatin marks, etc.
- Experience with cloud computing on AWS.
- Ability to engineer and tune large-scale databases for performant read and write access to large datasets.