Red Sea as a source for bioprospecting
December 12th, 2015
The 4th International Conference on Biotechnology and Bioengineering (ICBB2015), Singapore
Dr. Rimantas Kodzius http://www.kaust.edu.sa
Overview
• KAUST university
• Bioprospecting from extreme environments
• 16S rRNA library to assess taxonomic diversity
• (Metagenomic DNA sequencing)
• Instrumentation for the single cell experiments
• Acknowledgements
KAUST in the Middle East
King Abdullah University of Science and Technology (KAUST)
• Established on September 5th, 2009, with an endowment of US$20 billion
• First mixed-gender university campus in Saudi Arabia
• Current president Jean-Lou Chameau, the former president of the California Institute of Technology
• Graduate education and research (840 students as of September 2014), mainly Chinese, Indian, and Saudi, plus 60 other nationalities
• Three academic divisions
• Physical Science and Engineering (PSE)
• Computer, Electrical, and Mathematical Science and Engineering (CEMSE)
• Biological and Environmental Science and Engineering (BESE)
The beauty of KAUST
KAUST student housing Laboratory buildings and town mosque
KAUST’s core campus, located on the Red Sea at Thuwal, Kingdom of Saudi Arabia
KAUST Core Labs and Major Facilities
• Coastal and Marine Resources, including research vessels for marine exploration
• Biosciences and Bioengineering – including genomic and proteomic instrumentation
• HiSeq, MiSeq, Ion Torrent, PacBio RSII, Fluidigm single cell
• Mass spectrometers LTQ-Orbitrap Velos, TripleTOF 5600, Maxis Q-TOF and Ultraflex III MALDI-TOF
• Analytical Core, with the focus on spectroscopy, chromatography and mass spectrometry, trace metals analysis, wet chemistry, and surface analysis
• Nanofabrication, Imaging, and Characterization
• 2,000 square meters of Class-100 and Class-1000 cleanroom space
• 11 nuclear magnetic resonance (NMR) spectrometers; TEM, SEM, confocal, and Raman Spectroscopy
• Visualization: CORNEA is a fully immersive, six-sided virtual reality facility
• Supercomputer: Shaheen II (2015), plus the CBRC high-performance computing cluster
KAUST/ CBRC News:
Installation of High-speed camera
From CBRC news 2015, http://cbrc.kaust.edu.sa/cbrcweb/home/news.php
Computing cluster (mlj USD)
Single droplet setup:
• Nikon fluorescent inverted light microscope
• Photron SA-Z High-Speed Camera
KAUST King Abdullah University of Science and Technology 6
Computational Bioscience Research Center (CBRC)
Research Centers at KAUST
• Advanced Membranes and Porous Materials
• Catalysis
• Desert Agriculture
• Clean Combustion
• Computational Bioscience
• Geometric Modeling and Scientific Visualization
• Red Sea
• Solar and Photovoltaic Engineering
• Water Desalination and Reuse
CBRC: a combination of both experimental research and biocomputational analysis
Comparative Genomics and Genetics lab (CGG) is a part of CBRC
600 m² CBRC center laboratory
• Common benches for molecular biology and microbiology work
• Clean room for single cell work
Environment as a source for bioprospecting/ I
Design, Modeling, and Development of Microbial Cell Factories (MCF)
1. Collect microbial samples on a large scale from extreme environments, including marine, soil, and oil
• Environmental samples are collected from coastal and offshore areas of the Red Sea, and the nucleic acid is extracted
2. Identify desired genes and metabolic pathways in microbe samples using metagenomic analysis and computational modeling
• Extracted DNA is subjected to Next Generation Sequencing
• Computational analysis of whole microbial genomes and 3D modeling of their proteins and metabolic networks are performed
• Desired genes and metabolic pathways are determined
Environment as a source for bioprospecting/ II
Design, Modeling, and Development of Microbial Cell Factories (MCF)
3. Utilize the microbial genes and pathways for the improvement of the producer
• Utilize genome editing technology such as CRISPR
4. Express desired genes and pathways in a suitable host for mass production of food, fine chemicals, and energy
• The host (such as microalgae or B. subtilis), transformed with bacterial genes, is cultured to produce mass quantities of desired products
Metabolic model
Single cell sorting
Experimental platform for Gene Editing in the cell
Target gene DB
Pipeline: from discovery to production
[Flowchart, steps as labeled: Sea water samples → DNA extraction → NGS → Database search → Functional prediction / Structural prediction → Judgement of usefulness (Yes/No) → Organism identification → Single cell manipulation → Gene or genomic identification → Genome editing → Judgement of usefulness (Yes/No) → Outputs: novel drugs, food, animal feed, oil]
[Inset, "Triangle value": quantity (large to small) versus value per unit (low to high)]
[Stage labels: Samples / Extraction / Sequencing; In silico analysis; Cell functions; Information integration; Biological network]
The research strategy
Motivated by the industry needs and new horizons
Enzyme X, product Y, etc. Preferably from extreme environments (soil, water, air)
• Phylogenetic marker genes – microorganism taxonomic diversity
• Metagenomics – the composition of genes/ pathways, metabolic potential
Current work, based on Thuwal cold seep brine pool sample
The need & new discoveries → Choose environment → Sampling → Brine sample → Check if products of interest are present → Isolation
The analysis of the environmental sample
Environmental site → Sample collection → Analysis → Bioprospecting products
Objectives:
1. To better understand the site & sample
2. The ultimate goal: bioprospecting products!
The discovery process starts with environmental sampling
The discovery process involves marine sampling, DNA sequencing and contig generation. Previously unknown genes, pathways and even whole genomes are being discovered
Improvement of a producer strain using genetic engineering tools to modify genomic DNA by mutagenesis and pathway perturbation, followed by high-throughput screening. The best producer is used to scale up the production process.
“Wild” producer improvement
Challenges posed by the application of metagenomic methods & suggested solutions
Nr. | Expected challenge | Offered solution
1. | Selection of sampling site | Go to extreme environments to make new discoveries
2. | Huge data generation | Use computer power to construct analysis pipelines
3. | Host species selection | Select from available pathways based on what should be expressed
4. | Host improvement | Select CRISPR as easiest and fastest method
5. | Selection of best performing producer cells | Use high-throughput methods such as single-droplet microfluidics
From R. Kodzius and T. Gojobori, "Marine metagenomics as a source for bioprospecting," Mar Genomics, Aug 11 2015.
Extreme environment – brine sample
• Sample collected in 2012 from Red Sea Thuwal cold brine pool
• gDNA extraction, QC passed
• 16S rRNA library sequencing and CLCbio analysis of bacterial/archaeal diversity
• Library size ~600 bp
• ~1 million paired end reads obtained by MiSeq
• Merged paired reads ~600k
• After quality & adapter trimming ~591k
• Fixed length & high coverage trimming ~543k
• 7512 OTUs, using GreenGenes DB
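The read-count funnel above can be summarized as retention fractions per processing step. The following is a minimal sketch that uses only the rounded counts quoted on this slide; the step names and the function `retention` are illustrative and are not part of the CLCbio workflow.

```python
# Approximate read counts quoted on the slide (rounded values).
steps = [
    ("paired-end reads from MiSeq", 1_000_000),
    ("merged paired reads", 600_000),
    ("after quality and adapter trimming", 591_000),
    ("after fixed-length and high-coverage trimming", 543_000),
]


def retention(steps):
    """Return (name, count, fraction vs. previous step, fraction vs. first step)."""
    first = steps[0][1]
    prev = first
    rows = []
    for name, count in steps:
        rows.append((name, count, count / prev, count / first))
        prev = count
    return rows


for name, count, vs_prev, vs_first in retention(steps):
    print(f"{name:48s} {count:>9,d}  {vs_prev:6.1%} of previous  {vs_first:6.1%} of input")
```

Because the input counts are rounded, the resulting percentages are approximate as well.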
About the cold seep
• From Wikipedia: “A cold seep (sometimes called a cold vent) is an area of the ocean floor where hydrogen sulfide, methane and other hydrocarbon-rich fluid seepage occurs, often in the form of a brine pool”
• Exchange of energy through organic-rich fluids
• Carbon and sulfur cycling
• Anaerobic oxidation of methane (AOM)
• Sulfate reduction
• Anaerobic ammonia oxidizing bacteria
• Thuwal cold seep 850 m deep, discovered in 2010
• Sampling was done in 2011, 2012, and 2013
Thuwal Seep location
Sampling strategy
From H. L. Cao, W. P. Zhang, Y. Wang, and P. Y. Qian, "Microbial community changes along the active seepage site of one cold seep in the Red Sea," Frontiers in Microbiology, vol. 6, Jul 21 2015.
Operational taxonomic unit
Taxonomy | Combined | %
Bacteria | 346,673 | 89%
Archaea | 34,889 | 9%
N/A | 9,033 | 2%
Phylogram of the top 100 most abundant OTU
Alpha diversity
Number of species in a single sample
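As a minimal sketch of how alpha diversity can be computed from an OTU table: observed richness matches the definition on this slide (number of species, here OTUs, in a single sample). The Shannon index is added as a common complementary measure and is an assumption, since the slides do not say which index was used. The counts below are hypothetical, not data from the brine pool.

```python
import math


def observed_richness(otu_counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in otu_counts.values() if c > 0)


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(otu_counts.values())
    h = 0.0
    for c in otu_counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per OTU for one sample.
sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20, "OTU_4": 0}
print("observed richness:", observed_richness(sample))
print("Shannon index:", round(shannon_index(sample), 3))
```

Richness counts every detected OTU equally, while the Shannon index also reflects how evenly reads are spread across OTUs.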
Remaining work on Brine DNA sample
Next steps, for the better understanding of the extreme environment community:
• Process sediment samples collected next to the brine pool
• Chemical & physical parameters of the sample
• Other organism diversity (such as 18S rRNA library for eukaryotes)
• Metagenomics library sequencing (provides information also about viral diversity)
• Short read Illumina sequencing
• Long read PacBio RSII sequencing
• RNA extraction & Metatranscriptome sequencing
Objective – to see if products of interest are present in the sample
If not – plan a trip to extreme environments suitable for specific sample collection
Newly discovered Red Sea brine pools
• As published in H. L. Cao, W. P. Zhang, Y. Wang, and P. Y. Qian, "Microbial community changes along the active seepage site of one cold seep in the Red Sea," Frontiers in Microbiology, vol. 6, Jul 21 2015.
• Conclusion – brine pools are very dynamic, appearing and disappearing within a short geological time
• Look for new hot & cold brine pools, analyze the thriving organism communities
• Perturbation is possible (submerge surfaces for interactions), as published in W. Zhang, Y. Wang, S. Bougouffa, R. Tian, H. Cao, Y. Li, et al., "Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool," Environ Microbiol, vol. 17, pp. 4089-4104, Oct 2015.
• The expected outcome – potential new producers, or genes/pathways for insertion into existing producers
Enzymes used in various industrial segments and their applications
From O. Kirk, T. Damhus, T. V. Borchert, C. C. Fuglsang, H. S. Olsen, T. T. Hansen, et al., "Enzyme Applications, Industrial," in Kirk-Othmer Encyclopedia of Chemical Technology, ed: John Wiley & Sons, Inc., 2000.
Industrial enzymes and their use
From S. C. Charnock and B. V. McCleary, "Enzymes: Industrial and Analytical Applications," Revue des Enology, vol. 116, pp. 1-5, 2005.
Single droplet/ single-cell instrumentation
Why do we need single-cell research?
• The complexity of large numbers of various microorganisms in an environment
• Many attempts to enrich culture of a specific organism of interest (targeted-metagenomics approach):
• Processing a small number of cells that are relevant to a specific function in the environment
• FACS sorting – based on cell shape, size, density
• Single-cell genomics helps to reduce the heterogeneity of complex organism populations
Differences between metaomic and single-cell genomic methods
From R. Kodzius and T. Gojobori, "Single-cell technologies in environmental omics," Gene, Oct 16 2015.
Single-cell isolation and processing
• The cell needs to be captured and lysed; the released genetic material is then amplified, and the genomic library can be prepared and sequenced
• Random encapsulation of cells, the simplest method – cells are randomly selected by serial dilution until one cell is contained in a small volume of medium
• Microwells
• Cell encapsulation by microdroplets
• The Fluidigm C1 Single-Cell Auto Prep System – single-phase system using multiple valves
• FACS is known for its high throughput, high sorting speed, and ability to sort live cells
• FACS drawbacks: the equipment is expensive, specialized technical expertise is required, individual cells cannot be visualized during the process, cell throughput is limited, and the processing volume is high (both sample and reagent). The cells are also less viable after sorting (high electric charge)
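When cells are distributed at random into small volumes (serial dilution, microwells, or droplets), the number of cells per compartment is commonly modeled with a Poisson distribution. That model is an assumption here and is not stated on the slides. The sketch below shows the trade-off it implies: a low mean occupancy gives few multiply-occupied compartments but many empty ones. The function names are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a compartment with mean occupancy lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def occupancy(lam):
    """Fractions of empty, singly occupied, and multiply occupied compartments."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return {"empty": p0, "single": p1, "multiple": 1.0 - p0 - p1}


# Mean cells per compartment; these values are chosen only for illustration.
for lam in (0.1, 0.3, 1.0):
    occ = occupancy(lam)
    print(f"lambda={lam:.1f}  empty={occ['empty']:.3f}  "
          f"single={occ['single']:.3f}  multiple={occ['multiple']:.3f}")
```

Under this model, diluting further lowers the share of compartments holding more than one cell, at the cost of processing more empty compartments.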
Various techniques to analyze the organisms
Single-celled
Technology | Virus | Archaea | Bacteria | Eukaryote | Multicellular | Note
Cultivation | Yes (need host) | Yes | Yes | Yes | Yes | Media requirement; low throughput
Dilution | Possible | Yes | Yes | Yes | Yes | Low throughput
FACS | Yes | Yes | Yes | Yes | Possible | Slow growth after FACS; one-phase system
Droplet | Possible | Yes | Yes | Yes | Yes | High throughput; incubation possible
Fluidigm | No | No | No | Yes | No | Chips handle single cells 3–25 μm
• Eukaryotic algae can be used as cultivable virus-host system
• Archaea and bacteria have a similar cell structure; only cell composition and organization set these two domains apart. Bacterial cells are ordinarily 0.5–5.0 μm in length, while archaea can be larger than 15 μm in length
• Fluidigm chips are currently only suitable for organisms ranging in size from 3–5 μm up to 25 μm
• Multicellular organisms can theoretically be sorted and processed by various techniques
Microdroplet technology
• Emulsification occurs when water and oil phases are vigorously mixed – the cells are encapsulated in droplets
• PCR performed in an emulsion is much more specific than in the water phase alone, yielding greater amounts of reaction products and forming fewer chimeric byproducts
• Microfluidics allows control of droplet size down to the nanoliter or even picoliter range, providing uniform (monodisperse) aqueous droplets in oil
• Cells, and even molecules such as DNA, can be encapsulated in the droplets. Long-term incubation (hours, days) for cell proliferation or enzymatic reactions is possible
• Droplets can be sorted by FACS or on chip, based on cell specific fluorescence measurements
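To relate the nanoliter and picoliter volumes mentioned above to droplet size, the following sketch converts between the diameter and the volume of a spherical droplet, V = π·d³/6. It assumes perfectly spherical droplets; the example diameters are illustrative and are not values from the slides.

```python
import math

UM3_PER_PL = 1_000.0  # 1 picoliter equals 1000 cubic micrometers


def droplet_volume_pl(diameter_um):
    """Volume in picoliters of a spherical droplet with the given diameter in micrometers."""
    return math.pi * diameter_um ** 3 / 6.0 / UM3_PER_PL


def diameter_for_volume_um(volume_pl):
    """Diameter in micrometers of a spherical droplet with the given volume in picoliters."""
    return (6.0 * volume_pl * UM3_PER_PL / math.pi) ** (1.0 / 3.0)


# Illustrative diameters only.
for d in (10, 50, 124):
    print(f"diameter {d:>3d} um -> {droplet_volume_pl(d):10.2f} pL")

print(f"1 nL droplet diameter: {diameter_for_volume_um(1000.0):.1f} um")
```

Volume scales with the cube of the diameter, so halving the diameter reduces the reaction volume eightfold.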
Advantages of microfluidics technology
• High potential for parallelization and automation (can be controlled by a computer program written in, for example, C++ or LabVIEW)
• Gentle cell handling
• Low potential for contamination
• Low sample and reagent consumption
• Physical phenomena such as diffusion and heat transfer take place quickly in small volumes, speeding up the whole process
• A small droplet volume facilitates a high sample concentration in the droplet and more efficient PCR and MDA reactions
• It is possible to reach high concentrations of cell-secreted molecules
• Reverse transcription converts small amounts of mRNA to cDNA more efficiently (12% in tube versus 54% in droplet)
• The droplet acts as an independent well or reaction chamber where cells can be grown and later screened for expressed products
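The concentration advantage of small volumes can be illustrated with a short calculation: the molar concentration of n molecules in volume V is n / (N_A · V). The sketch compares a single template molecule in a droplet and in a conventional tube. Both volumes (1 pL and 20 µL) are assumptions chosen for illustration and are not taken from the slides.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molar_concentration(n_molecules, volume_liters):
    """Molar concentration (mol/L) of n molecules dissolved in the given volume."""
    return n_molecules / (AVOGADRO * volume_liters)


DROPLET_L = 1e-12  # assumed 1 pL droplet
TUBE_L = 20e-6     # assumed 20 uL tube reaction

c_droplet = molar_concentration(1, DROPLET_L)
c_tube = molar_concentration(1, TUBE_L)

print(f"single molecule in droplet: {c_droplet:.3e} M")
print(f"single molecule in tube:    {c_tube:.3e} M")
print(f"enrichment factor:          {c_droplet / c_tube:.1e}")
```

The enrichment factor is simply the ratio of the two volumes, which is why shrinking the compartment raises the effective concentration of a single template or of molecules secreted by a single cell.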
Variations of microdroplet technologies
• Bacterial, yeast, plant, insect, and mammalian cells, and even multicellular organisms, can be encapsulated and analyzed using microdroplet technology, also in agarose or other microgels, optionally followed by FACS sorting
• Directed enzyme evolution – using fluorescent substrates, cells can be regrown and enzymes with improved function and stability can be expressed
• There are already commercial solutions using microdroplets for quantitative absolute real-time PCR
• A customized system can be built – ours is based on L. Mazutis, J. Gilbert, W. L. Ung, D. A. Weitz, A. D. Griffiths, and J. A. Heyman, "Single-cell analysis and sorting using droplet-based microfluidics," Nature Protocols, vol. 8, pp. 870-891, May 2013.
• Microfluidics companies provide suitable chips, and even pump/microscope setups
Differences between FACS and the two-phase microdroplet system
• FACS can detect the presence of cells in a droplet, reject empty droplets, and direct the cells with desired properties into multiwell plates. From here, cells can be lysed and genomes can be amplified in single wells
• In microdroplets, genotype and phenotype are connected through compartmentalization such that the cells and their secreted molecules are contained within droplets, even during sorting and analyzing. The emulsion can then be broken to isolate the genetic material
Genome sequencing from individual cells
• SCG provides direct access to fine-scale heterogeneity of complex microorganism populations
• Process flow: FACS on 96-well plates, cell lysis, and WGA. An amplified genome from a single cell can then be sequenced
• Data from sequencing provides quantitative information on genomic variability in microorganism populations
• Gene insertions, losses, duplications, and genome rearrangements can be analyzed on a single-cell level
• Complex metabolic pathways can be analyzed for an individual cell
• Data from metagenomics and SCG are complementary – together they provide information about the metabolic potential and evolution of specific microorganisms
Single cell in a single droplet
High-throughput studies: tens of thousands of single cells can be studied
• Genome
• Transcriptome
• Proteome
• Metabolome
• Other “Omes”
Barcodes (such as beads with coded oligonucleotides) enable pooling; each read-out can later be assigned to a specific cell
[1] E. Z. Macosko, A. Basu, R. Satija, J. Nemesh, K. Shekhar, M. Goldman, et al., "Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets," Cell, vol. 161, pp. 1202-1214, May 21 2015.
[2] A. M. Klein, L. Mazutis, I. Akartuna, N. Tallapragada, A. Veres, V. Li, et al., "Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells," Cell, vol. 161, pp. 1187-1201, May 21 2015.
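The barcoding idea on this slide, pooling reads and later assigning each one to its cell of origin, can be sketched as a simple demultiplexing step. This is a deliberately simplified illustration and not the method of the cited papers: it assumes the cell barcode is a fixed-length prefix of each read, and it ignores molecular identifiers and barcode error correction. The barcode length, the sequences, and the function `demultiplex` are all hypothetical.

```python
from collections import defaultdict

BARCODE_LEN = 8  # hypothetical barcode length, chosen for illustration only


def demultiplex(reads):
    """Group pooled reads by their leading cell barcode.

    Returns a dict mapping each barcode to the list of insert sequences
    (the read with the barcode prefix removed), in input order.
    """
    cells = defaultdict(list)
    for read in reads:
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        cells[barcode].append(insert)
    return dict(cells)


# Hypothetical pooled reads from two barcoded cells.
pooled = [
    "AAAACCCC" + "GATTACA",
    "GGGGTTTT" + "CATCATC",
    "AAAACCCC" + "TTGACCA",
]

by_cell = demultiplex(pooled)
for barcode, inserts in sorted(by_cell.items()):
    print(barcode, len(inserts), inserts)
```

Grouping by barcode is what lets tens of thousands of cells share one sequencing run while keeping each cell's read-out separate.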
In summary
• Our objective is related to discovery of industrially related genes, pathways and producers
• An extreme environment sample from the Red Sea Thuwal cold brine pool is available and has been analyzed
• Can the producers, genes, and pathways be used for production in a “usual” environment?
• There are databases to extract bioprospecting features from various sequencing projects, even from the depths of 4000 m
• The availability of a real sample makes it possible to obtain producers growing on specific media, without the need to synthesize the genes de novo
References
Recent publications about bioprospecting and the single-cell technologies for marine research
R. Kodzius and T. Gojobori, "Marine metagenomics as a source for bioprospecting," Mar Genomics, vol. 24, Part 1, pp. 21-30, Aug 11 2015.
R. Kodzius and T. Gojobori, "Single-cell technologies in environmental omics," Gene, vol. 576, pp. 701-707, Feb 1 2016.
Acknowledgements
• KAUST/ CBRC/ CEMSE; Distinguished Prof. Takashi Gojobori
• ICBB and ICAMR for the invitation