[Supplementary Figure 3.1, panel (a): pie chart of COG subcategory shares in the TMC3115 genome. D 1%, M 4%, N 0%, O 3%, T 3%, U 1%, V 2%, W 0%, Y 0%, Z 0%, A 0%, B 0%, J 7%, K 5%, L 7%, C 3%, E 7%, F 3%, G 7%, H 3%, I 2%, P 3%, Q 1%, R 8%, S 6%, not assigned 23%.]
Legend: [D] Cell cycle control, cell division, chromosome partitioning; [M] Cell wall/membrane/envelope biogenesis; [N] Cell motility; [O] Post-translational modification, protein turnover, and chaperones; [T] Signal transduction mechanisms; [U] Intracellular trafficking, secretion, and vesicular transport; [V] Defense mechanisms; [W] Extracellular structures; [Y] Nuclear structure; [Z] Cytoskeleton; [A] RNA processing and modification; [B] Chromatin structure and dynamics; [J] Translation, ribosomal structure and biogenesis; [K] Transcription; [L] Replication, recombination and repair; [C] Energy production and conversion; [E] Amino acid transport and metabolism; [F] Nucleotide transport and metabolism; [G] Carbohydrate transport and metabolism; [H] Coenzyme transport and metabolism; [I] Lipid transport and metabolism; [P] Inorganic ion transport and metabolism; [Q] Secondary metabolites biosynthesis, transport, and catabolism; [R] General function prediction only; [S] Function unknown; [-] Not assigned.
[Supplementary Figure 3.1, panel (b): bar chart, axis from 0 to 600, with bars for Metabolism, Not assigned, Information storage and processing, Poorly characterized, and Cellular processing and signaling.]
Supplementary Figure 3.1. Distribution of Clusters of Orthologous Groups (COG) functional categories in the TMC3115 genome. (a) Distribution of COG subcategories. (b) Distribution of the top four COG categories.
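The relationship between the two panels can be checked with a small sketch that sums the subcategory percentages from panel (a) into the top-level categories named in panel (b). The letter-to-category mapping below is the standard COG grouping and is assumed, not confirmed, to be the one used for the figure. The percentages are the rounded labels from panel (a), so the totals are approximate, and panel (b) itself may be plotted in gene counts rather than percentages.

```python
# Rounded percentages read from panel (a) of Supplementary Figure 3.1 (TMC3115).
subcategory_pct = {
    "D": 1, "M": 4, "N": 0, "O": 3, "T": 3, "U": 1, "V": 2, "W": 0, "Y": 0, "Z": 0,
    "A": 0, "B": 0, "J": 7, "K": 5, "L": 7,
    "C": 3, "E": 7, "F": 3, "G": 7, "H": 3, "I": 2, "P": 3, "Q": 1,
    "R": 8, "S": 6,
    "-": 23,
}

# Standard COG top-level grouping (assumption: the figure uses the same one).
# Category names follow the labels printed in panel (b).
top_level = {
    "Cellular processing and signaling": "DMNOTUVWYZ",
    "Information storage and processing": "ABJKL",
    "Metabolism": "CEFGHIPQ",
    "Poorly characterized": "RS",
    "Not assigned": "-",
}


def aggregate(pct, groups):
    """Sum subcategory percentages into their top-level category."""
    return {name: sum(pct[letter] for letter in letters)
            for name, letters in groups.items()}


totals = aggregate(subcategory_pct, top_level)
for name, value in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{name}: {value}%")
```

With these rounded inputs, Metabolism is the largest assigned category (29%), followed by Information storage and processing (19%); the totals sum to 99% because of rounding in the figure labels.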
[Supplementary Figure 3.2, panel (b): number of unique genes per strain. BF3 40; PRI 1 147; S6 0; TMC3115 105; NCTC13001 10; MGYG-HGUT-02396 0; JCM1255 4; BGN4 49; PRL2010 110; S17 93.]
Supplementary Figure 3.2. Comparative genomics of B. bifidum TMC3115. (a) Distribution of COG categories among the strains. The numbers in black show the average percentage of genes in each category, while the numbers in red show the percentage for the TMC3115 strain. COG classification: [D] Cell cycle control, cell division, chromosome partitioning; [M] Cell wall/membrane/envelope biogenesis; [N] Cell motility; [O] Post-translational modification, protein turnover, and chaperones; [T] Signal transduction mechanisms; [U] Intracellular trafficking, secretion, and vesicular transport; [V] Defense mechanisms; [A] RNA processing and modification; [J] Translation, ribosomal structure and biogenesis; [K] Transcription; [L] Replication, recombination and repair; [C] Energy production and conversion; [E] Amino acid transport and metabolism; [F] Nucleotide transport and metabolism; [G] Carbohydrate transport and metabolism; [H] Coenzyme transport and metabolism; [I] Lipid transport and metabolism; [P] Inorganic ion transport and metabolism; [Q] Secondary metabolites biosynthesis, transport, and catabolism; [R] General function prediction only; [S] Function unknown. (b) The number of unique genes present in each strain.
[Supplementary Figure 3.2, panel (a): ring chart of COG category percentages. Category labels: D, M, N, O, T, U, V, A, J, K, L, C, E, F, G, H, I, P, Q, R, S. Strain labels: TMC3115, BF3, PRI 1, S6, NCTC13001, MGYG-HGUT-02396, JCM 1255, BGN4, PRL2010, S17. Legend entries: Inner ring, Outer ring.]
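One common way to obtain per-strain counts of unique genes like those reported in panel (b) is to treat each genome as a set of gene-family identifiers and count the families found in no other strain. The sketch below illustrates that idea only; the source does not state how the counts were derived, and the function name, strain names and gene identifiers are hypothetical.

```python
def unique_gene_counts(presence):
    """Count, for each strain, the gene families absent from every other strain.

    `presence` maps a strain name to the set of gene-family IDs in its genome.
    """
    counts = {}
    for strain, genes in presence.items():
        # Union of the gene families carried by all the other strains.
        others = set().union(
            *(g for name, g in presence.items() if name != strain)
        )
        counts[strain] = len(genes - others)
    return counts


# Toy input (hypothetical identifiers, not data from the figure).
toy_presence = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g4"},
    "strain_C": {"g1", "g2"},
}

counts = unique_gene_counts(toy_presence)
print(counts)
```

In the toy input, `strain_C` shares every gene family with at least one other strain, so its count is zero, which is the same situation as the strains reported with 0 unique genes in panel (b).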
Supplementary Table 3.1. Sortase-dependent pili clusters in B. bifidum strains. The strains are divided into four groups based on the number of pili and their pilin motifs.
Major pilins, in column order: fimA, fimA, fimP. Each has three columns: CWSS, pilin motif, E box.

| Strain | No. of pili | fimA CWSS | fimA pilin motif | fimA E box | fimA CWSS | fimA pilin motif | fimA E box | fimP CWSS | fimP pilin motif | fimP E box |
|---|---|---|---|---|---|---|---|---|---|---|
| PRL2010 | 3 | LPGTG | GNATLTVSTK, KGALPTVVKK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| NCIMB 41171 | 3 | LPGTG | KGALPTVVKK, GNATLTVSTK, GKTLLTVTMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| BGN4 | 3 | LPGTG | KGALPTVVKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | KGYQFTVSDK, GTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| MJR8628B | 3 | LPGTG | KGALPTVVKK, DNTLLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VGTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| A8 | 3 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| 324B | 3 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| BF3 | 3 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| Bbif1887B | 3 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| LMG 11582 | 3 | LPGTG | KGDLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, IGAGVTVGVK, VGKTVTVEYK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| LMG 13195 | 3 | LPGTG | KGDLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, IGAGVTVGVK, VGKTVTVEYK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| Calf96 | 3 | LPGTG | KGDLPTVDKK, DNTLLTVAMK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| S6 | 3 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETKAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| HGUT02396 | 3 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETKAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| S17 | 3 | LPGTG | KGDLPTVDKK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| ATCC 29521 | 3 | LPGTG | KGDLPTVDKK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| LMG 11041 | 3 | LPGTG | KGDLPTVDKK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, IGAGVTVGVK, VGKTVTVEYK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| DSM 20456 | 3 | LPGTG | KGDLPTVDKK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| JCM 1255 | 3 | LPGTG | KGDLPTVDKK | YTLTETEAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, IGAGVTVGVK, VGKTVTVEYK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| NCTC13001 | 3 | LPGTG | KGDLPTVDKK | YTLTETKAPAGY | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| TMC3115 | 3 | LPGTG | KGALPTVVKK, NNNTLTVAMK | YTLTETEAPAGY | LPLTG | IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| JCM 1254 | 3 | LPGTG | NNNTLTVAMK | YTLTETEAPAGY | LKYTG | NGYQFTVSDK, DTLKVTVDNK | YTIEEIAAPNGY | LPKTG | GGAAATVYAK | YTVTETAVADGY |
|  | 3 | LPGTG | KGDLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY | LKYTG | NGYQFTVSDK, DTLKVTVDNK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| 156B | 2 | LPGTG | KGALPTVVKK, NNNTLTVAMK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| ICIS-310 | 2 | LPGTG | KGDLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| 2789STDY5608877 | 2 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| 791 | 2 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| BI-14 | 2 | LPGTG | KGNLPTVDKK, NNNTLTVAMK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| IPLA 20015 | 2 | LPGTG | KGDLPTVDKK, GNATLTVSTK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| 85B | 2 | LPGTG | GNATLTVSTK, KGDLPTVDKK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| IPLA 20017 | 2 | LPGTG | KGDLPTVDKK, GKTLLTVAMK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| LMG 11583 | 2 | LPGTG | DNATLTVSTK, KGALPTVVKK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| G1971 | 2 | LPGTG | KGDLPTVDKK | YTLTETEAPAGY |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| 62-13 | 2 |  |  |  | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| ASM157686v1 | 2 |  |  |  | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| CAG234 | 2 |  |  |  | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| PRI1 | 2 |  |  |  | LPLTG | NGYQFTVSDK, DTLKVTVDNK, VGKNVTVEYK, IGAGVTVGVK | YTIEEIAAPNGY | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |
| ASM157689v1 | 1 |  |  |  |  |  |  | LPKTG | VDTAATVTFK, GGAAATVYAK | YTVTETAVADGY |

Group labels used in the table: G1, G2, G3, G4.