ContentslistsavailableatScienceDirect
Data in Brief
journalhomepage:www.elsevier.com/locate/dib
Data Article
A high-resolution mass spectrometry based proteomic dataset of human regulatory T cells
Harshi Weerakoon
a,b,c, John J. Miles
d,e, Ailin Lepletier
d,f, Michelle M. Hill
a,g,∗aPrecision and Systems Biomedicine Laboratory, QIMR Berghofer Medical Research Institute, Herston, QLD, Australia
bSchool of Biomedical Sciences, The University of Queensland, Brisbane, QLD, Australia
cDepartment of Biochemistry, Faculty of Medicine and Allied Sciences, Rajarata University of Sri Lanka, Saliyapura, Sri Lanka
dHuman Immunity Laboratory, QIMR Berghofer Medical Research Institute, Herston, QLD, Australia
eAustralian Institute of Tropical Health and Medicine, James Cook University, Cairns, QLD, Australia
fLaboratory of Vaccines for the Developing World, Institute for Glycomics, Southport, QLD, Australia
gCentre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane, QLD, Australia
a rt i c l e i n f o
Article history:
Received 18 September 2021 Revised 21 November 2021 Accepted 3 December 2021 Available online 6 December 2021 Keywords:
Regulatory T cell Conventional T cell Proteomics
Tandem mass spectrometry LC-MS/MS
a b s t r a c t
Regulatory T cells (Tregs) play a core role in maintain- ingimmunetolerance,homeostasis, and hosthealth. High- resolutionanalysisoftheTregproteomeisrequiredtoiden- tify enrichedbiological processesand pathways distinct to thisimportantimmune celllineage. Wepresent acompre- hensiveproteomicdatasetofTregspairedwithconventional CD4+ (ConvCD4+)Tcells inhealthyindividuals.Tregs and Conv CD4+ T cells were sorted to high purity using dual magneticbead-basedandflowcytometry-basedmethodolo- gies.Proteinsweretrypsin-digestedandanalysedusinglabel- free data-dependent acquisition mass spectrometry (DDA- MS) followed by label free quantitation (LFQ) proteomics analysis using MaxQuant software. Approximately 4,000 T cellproteinswereidentifiedwitha1%falsediscoveryrate,of whichapproximately2,800proteinswereconsistentlyiden- tified and quantified in all the samples. Finally, flow cy- tometry with amonoclonal antibodywas used to validate the elevated abundance ofthe protein phosphatase CD148 inTregs.Thisproteomicdatasetservesasareference point for future mechanistic and clinical T cell immunology and
∗ Corresponding author.
E-mail address: [email protected] (M.M. Hill).
https://doi.org/10.1016/j.dib.2021.107687
2352-3409/© 2021 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
identifies receptors, processes, and pathways distinct to Tregs. Collectively, these datawill lead to a better under- standing of Tregimmunophysiology and potentially reveal novelleadsfortherapeuticsseekingTregregulation.
© 2021TheAuthor(s).PublishedbyElsevierInc.
ThisisanopenaccessarticleundertheCCBY-NC-ND license(http://creativecommons.org/licenses/by-nc-nd/4.0/)
SpecificationsTable
Subject Immunology
Specific subject area Regulatory T cells (Tregs) play a key role in directing adaptive immunity, particularly enforcing immune tolerance. However, few studies have examined the ex vivo Treg proteome. High-resolution and fully quantitative MS was used to profile the Treg and conventional CD4 +(Conv CD4 +) T cell proteomes from healthy human blood.
Bioinformatics yielded reliable, reproducible, and quantitative data.
Differentially abundant proteins were identified between subsets, and flow cytometry was performed to validate elevated cell surface levels of CD148 in Tregs.
Type of data Tables and Figures.
How data were acquired Orbitrap Fusion TMTribrid TM(Thermo Fisher Scientific, USA) inline coupled to nano ACQUITY UPLC (Waters, USA) was used to acquire label-free proteomic data using data-dependent acquisition (DDA-MS).
Data format Raw and analysed data.
Parameters for data collection CD3 +, CD4 +, CD25 high, CD127 low, FOXP3 +and CD3 +, CD4 +, CD25 −, FOXP3 −cells were identified as Tregs and Conv CD4 +, respectively.
Peptides were separated using a 160 mins chromatographic gradient at 0.3 μl/min flow rate while ionised at 1900 V and 285 °C. MS1 ranged, and resolutions were 380 – 1500 m/z and 120,000 FWHM,
respectively. Injection time for MS1 and MS2 were 50 ms and 70 ms, respectively. Fifteen ions were selected to trigger MS2 at 90 s dynamic exclusion. The total cycle time was two seconds.
Description of data collection Label-free proteomics data were acquired on peptide samples prepared from Treg and Conv CD4 +from peripheral blood mononuclear cells (PBMC) from three healthy volunteers (age 30-35 years) at the QIMR Berghofer Medical Research Institute (QIMRB), Brisbane, Australia.
Data source location Raw proteomic data are available via ProteomeXchange [1] . Data accessibility Repository name: ProteomeXchange
Data identification number: PXD022095 Direct URL to data:
http://www.ebi.ac.uk/pride/archive/projects/PXD022095 FTP Download:
ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2021/08/PXD022095 Related research article H. Weerakoon, J. Straube, K. Lineburg, L. Cooper, S. Lane, C. Smith, S.
Alabbas, J. Begun, J.J. Miles, M.M. Hill, A. Lepletier, Expression of CD49f defines subsets of human regulatory T cells with divergent
transcriptional landscape and function that correlate with ulcerative colitis disease activity., Clin. Transl. Immunol. 10 (2021) e1334.
https://doi.org/10.1002/cti2.1334 .
ValueoftheData
• This label-free quantitative dataset provides a high-resolution landscape of Treg andConv CD4+TcellproteomesfromprimaryTcellsubsetsincirculatinghumanblood.
• Tcell immunologists canusethisdataset toexplore TregandConv CD4+ Tcellproteomic landscapesandprovidenovelinsightsintothefundamentalworkingsofTcellsfromdifferent lineages.
• ClinicalresearcherscanusethisdatasettoexploreTregproteindysfunctionindifferentclini- calsettings.Thesestudieswillhelpcorrectdysfunctionthroughnewinterventionsandresult inbetterpatientmanagementandoutcomesacrossTregassociateddiseases.
• Researchers in T cell biology andimmunology can use the described methods to achieve furtherknowledgegainintheimportantfieldofimmunoproteomics.Themethodologieswill alsoaidresearchersworkingwithsmallamountsofvolunteerorpatientmaterial.
1. DataDescription
TheproteomicdatasetdescribedherewasgeneratedfromregulatoryTcells(Tregs),andcon- ventionalCD4+ Tcells(Conv CD4+) purifiedfromperipheral blood mononuclearcells(PBMC) of three healthyvolunteers (age 30-35 years). The workflow across samplecollection,protein processing, anddata-dependentacquisitionmassspectrometry(DDA-MS)analysisaredepicted in Fig. 1 and detailedin the experimental design, materials, and methods section. The DDA- MSdatawereanalysedusingMaxQuantsoftware(Release1.6.0.16)[2]againstUniProt/SwissProt humanrevieweddatabase[3]using1% falsediscoveryrate(FDR)cut-off tofilterpeptidespec- tralmatchesandtheproteins.The.rawfilesobtainedfromDDA-MSanalysisandtheMaxQuant search resultsare available throughthe ProteomeXchangedatarepository (PXD022095)asde- tailedinTable1.Thesepubliclyavailablerawdatacanbere-analysedusingdifferentparameters anddatabasestoretrievemorespecificinformationdependingontheresearchinterest.Thecon- sistencyofproteomicdataacrossdonorsareshowninFig.2,includingMS/MSspectralcountper sample(Fig.2A)andperprotein(Fig.2B),thenumberofpeptidespersample(Fig.2C)andper protein(Fig.2D),percentageofaminoacidsequencecoverageinpeptides(Fig.2E)andpersam- ple(Fig.2F).Oftheidentified totalproteins(n=4,177),92%wereidentified withsingleUniProt IDsinUniProt/SwissProtdatabase(Fig.2G).Asimilarnumberofproteinswasidentifiedforeach donorinboththeTregandConvCD4+ Tcells(Fig.2H).
NormalisedproteinintensityvaluesobtainedfromMaxLFQ[4]inMaxQuantwerethenused indifferentialabundanceanalysistoidentifyproteinsenrichedinTregsrelativeto pairedConv CD4+. First, data cleaning retrieved proteins with (i) at least two unique razor peptides, (ii) an m-scoreof five and(iii) lessthan 50% missing intensity valuesacross samples. The miss- ing intensityvalueswere imputedusingthe maximumlikelihood estimate algorithm(Rpack- age) [5].Next, differentially abundantproteins were identified using multiplet-test withFDR correctionfromapublishedtwo-stagelinearstep-upprocedure [6](Table S1,Fig.1C). Analysis
Table 1
Description of the data files deposited in ProteomeXchange data repository under the data identification number of PXD022095.
Title of the file/folder Description
1 Rep1_Treg.raw .raw file of Treg cells - Replicate 1
2 Rep2_Treg.raw .raw file of Treg cells - Replicate 2
3 Rep3_Treg.raw .raw file of Treg cells - Replicate 3
4 Rep1_nonTreg.raw .raw file of Conv CD4 +T cells - Replicate 1 5 Rep2_nonTreg.raw .raw file of Conv CD4 +T cells - Replicate 2 6 Rep3_nonTreg.raw .raw file of Conv CD4 +T cells - Replicate 3
7 search.zip MaxQuant ouput files resulted from the
analysis of the above .raw files against UniProt/SwissProt human reviewed proteome
8 parameters.txt Parameters used in the data analysis through
MaxQuant search engine
9 human_proteome_reviewed_25102017.fasta UniProt/SwissProt proteome database used in the analysis
10 Treg_nonTreg_protein_quantification.txt MaxQuant ouput files giving the protein quantification data and LFQ normalised protein intensities
Fig. 1. Study workflow. A. Workflow for sorting Tregs and Conventional (Conv) CD4 +. PBMC obtained from three volun- teers underwent two rounds of magnetic-activated cell sorting (MACS) followed by flow cytometry based cell sorting (FACS), yielding Tregs and Conv CD4 + at high purity. B. Proteomic sample preparation and mass spectrometry (LC- MS/MS) analysis. Peptide samples were prepared from whole cell lysate using protein co-precipitation with trypsin in methanol. The resulting tryptic peptides were desalted before LC-MS/MS analysis on Obitrap Fusion Tribrid inline coupled to nano ACQUITY UPLC. C. DDA-MS data were deconvoluted using the MaxQuant search engine against the UniProt/SwissProt human proteome. Differential expression analysis was performed on only high-quality label-free pro- tein intensity data D. Orthogonal validation of CD148 ( PTPRJ ) enrichment in Treg cells.
returned 227 differentially abundantproteins betweenTreg andConv CD4+ Tcells, with157 (69%oftotal)proteinssignificantlyenrichedinTregs.Theenrichedbiologicalprocessesineach celltypeweredeterminedusinggeneontology(GO)andpathwayanalysiscomprisingKyotoEn- cyclopaediaofGenesandGenomes(KEGG)analysis[7],andReactomepathways[8].Functional enrichmentanalysisusingtheSTRINGnetwork[9]revealednoenrichedbiologicalprocessesfor ConvCD4+Tcells,whileTregcellswereenrichedwithnegativeregulationofinterferon-gamma
Fig. 2. Summary of DDA-MS proteomic dataset in human Tregs. Label-free DDA-MS data were analyzed using MaxQuant search engine against UniProt/SwissProt human reviewed proteome database. MS/MS spectral data determined at 1% FDR were selected for further analysis across Tregs (blue) and Conv CD4 +(orange). A . The number of MS/MS spectra detected in each sample. B. Mean MS/MS spectra per protein per sample. C. Total number of peptides and unique + razor peptides detected in each sample. D . Mean unique + razor peptides per protein per sample. E. Percentage of amino acid sequence coverage from the peptide dataset (unique + razor). F. Mean percentage of amino acid sequence coverage per peptide per sample (unique + razor). G. Total number of proteins quantified with single or multiple UniProt entries. Here, 92% of proteins had single entries (light grey), and 8% of proteins had multiple peptide entries (dark grey). H. The number of proteins quantified per sample. Error bars with standard deviation are shown.
production and T cell activation and leukocyte cell-cell adhesion and nuclear protein export (Fig.3A).Further, extracellularmatrix organisation, mitoticprophase andnuclearenvelopere- assembly were the most enriched pathways in Tregs. Conv CD4+ showed enrichment of vi- ralmRNAtranslation,selenocysteinesynthesisandpeptidechainelongationpathways.Detailed
Fig. 3. Tregs and Conv CD4 +T cell exhibit divergent proteomes. A. Word clouds of biological processes and reactome pathways enriched in Treg versus Conv CD4 +. B. Volcano plot showing differential protein abundance of proteins be- tween Treg and Conv CD4 +, using q < 0.05 and log 2FC > 1 or -1 as cut-off (dotted lines). When comparing Conv CD4 +, proteins significantly upregulated in Treg (red) and downregulated (blue) are shown, along with CD49f ( ITGA6 ) and CD148 (PTPRJ) which are shown in green. Gene lists of the top 10 most upregulated and top 10 most downregulated proteins are shown. C. Proteomic data quantifying CD148 levels in Treg and Conv CD4 +. Pairwise comparison showed higher abundance of CD148 in Treg cells in all donors (q < 0.0 0 01, multiple t-test with false discovery rate correction).
D. Flow cytometry data for cell surface CD148 in Treg and Conv CD4 +T cells in six healthy donors denoted by different colors. Note some of the data points overlap ( ∗p < 0.05, Mann Whitney U test). E. Representative histogram from one donor displaying the fluorescence intensity difference of CD148 between two T cell populations.
results,including enrichmentscore, thenumber ofproteins detected, andgenenomenclature, areprovidedinTableS2.
To validate the proteomicsdataset by orthogonal methods, we selected two Treg-enriched cellsurfaceproteinswithavailablemonoclonalantibodiesforflowcytometry,specificallyCD49f (ITGA6) andCD148 (PTPRJ). The relative abundance of these andother proteins are shownin volcanoplot(Fig.3B). Wehaverecentlyreportedtheflow cytometryandfunctionalvalidation of CD49finTregs [10].CD148 isa protein tyrosinephosphatase that regulates T cellreceptor (TCR) signalling through Srcfamily kinases(SFK), andhasdual inhibitory/activatory functions andimmunecell-specificpatterning[11].InagreementwiththemarkedlyenrichedbyDDA-MS (Log2FC=3.02, p < 0.001, Fig. 3C), flow cytometry validation confirmed CD148enrichment in Tregs(Fig.3Dand3E).
2. ExperimentalDesign,Materials,andMethods 2.1. Experimentaldesign
TheexperimentalphasesaredepictedinFig.1,includingTcellisolation(Fig.1A),proteomic samplepreparationanddata acquisition (Fig. 1B), proteomicsdatadeconvolutionand analysis (Fig.1C),andtheorthogonalvalidation(Fig.1D).
2.2. HumanTcellisolation
Sequential magnetic-activated cell isolation (MACS) and flow cytometry-based cell sorting (FACS)wereusedtoisolateTregsandConvCD4+TcellsfromPBMCtohighpurity(Fig.1A).
2.2.1. IsolationofhumanPBMCfromvenousblood
PBMCwere isolatedfromfreshvenousbloodfromvolunteersatQIMRB,Brisbane,Australia.
Ficoll-PaqueTM densitygradientmedium(GE-Healthcare,USA)wasusedtoseparatePBMCfrom peripheralblood.Sampleswere centrifugedat434xgfor20minswithano-brakedeceleration.
PBMCwerethenwashedthreetimeswithRoswellParkMemorialInstituteMedium(RPMI-1640) medium.
2.2.2. MACSpurificationofTcellsubsets
CD3+Tcellsnext purifiedfromPBMC(n=3)usingMACS.Here,ahumanpanTcellisola- tionkit(MiltenyiBiotec,Germany)wasusedfornegativecellselectionasperthemanufacturer’s instructions.TheunlabeledCD3+Tcellflow-throughwascollectedandthenmovedtoanother round ofMACS cell isolation. Here, purifiedCD3+ T cellswere labelledwithCD25-PE (BioLe- gend,USA)for20mins at4°C, washedwithcoldMACS buffer(Miltenyi Biotec,Germany)and thenlabelledwithanti-PEmagneticbeads(MiltenyiBiotec,Germany)andpurifiedpertheman- ufacturer’sinstructionsusingLScolumns(MiltenyiBiotec,Germany).CD25high (columnbound) andCD25low(flow-through)Tcellswerecollected,washedinR10medium(RPMI-1640contain- ing 10% FCS) andtransferred into5ml tubesfor FACS-based sorting.Thisdual MACSmethod reducesFACStimeandincreasessortyields.
2.2.3. FACSpurificationofTregsandConvCD4+Tcells
MACSsortedCD25high,andCD25lowTcellswerenextstainedwithLIVE/DEAD®FixableAqua (LifeTechnologies,USA),CD3-APCe780(eBioscience, ThermoFisher, Scientific,USA),CD4-VB711 andCD127-BV786(BDBiosciences,USA)for20minsat4°C,washedwithcoldFACSbufferthree timesforsorting.SinglecellsweresortedonaBDFACSAriaIII(BDBiosciences,USA)gatedusing FSC-WandFSC-H,andnon-viablecellsweredumped.Insorting,firsttheviablesinglecellswere gated toselectCD3+ andCD4+ Tcell population.OfthemTregCD25high,CD127low cellswere
sortedasTreg cellswhileCD25− cellswere sortedas Conv CD4+ Tcells[10].Lowflow rates were usedto increase yield througha reduction in streamabort events.Approximately 106 T cellsweresortedforTregandConvCD4+intandemforeachsample.EachsortedTcellsample waswashed three times withphosphate bufferedsaline (pH – 7.2)and then lysedin 100 μl lysisbuffercontaining1%sodiumdodecylsulphate(Biorad,USA)in100mMTriethylammonium bicarbonate (Sigma-Aldrich,USA) and 1 x Roche complete protease inhibitor cocktail (Sigma- Aldrich,USA)andstoredat-80°Cforproteomicsamplepreparation.
2.2.4. Proteinextractionandtrypsindigestion
Eachcelllysatewasnextthawedonice,and200ngofovalbumin(Sigma-Aldrich,USA)was addedas an internal standard. The amount of protein ineach cell lysatewas quantified at a wavelengthof562nmusingPiercebicinchoninicacid(BCA)proteinassay(ThermoFisherScien- tific,USA),followingthe manufacturer’sinstructions.Fortrypsindigestion, 20μg proteinfrom eachsamplewasreducedwith10mMofTris(2-carboxyethyl)phosphinehydrochloride(Thermo FisherScientific,USA)at60°Cfor30minsfollowedbyalkylationwithfreshlyprepared40mM chloroacetamide (Sigma, USA) at37°C inthe dark for45 mins in a volume of 100 μl. Deter- gentsinthesampleswerethenremovedusingproteinco-precipitationwithtrypsininmethanol [12,13]using1μlof1μg/μlsequencinggrademodifiedporcinetrypsin(Promega,USA)and1ml of100%cold(-20°C)chromARgrademethanol(HoneywellResearchChemicals,USA)toobtain a final lysate: methanolratio of1:10. Samples were mixedthoroughly by vortexing andkept overnightat-20°Cforproteinprecipitation.Thenextday,sampleswerecentrifugedfor15mins at16,100xg at4°C,andthesupernatantwasaspirated carefullywithout disturbingtheprotein pellet.The protein pelletswere washed (15 minsat 16,100xg at4°C)two timeswith1 mlof 90%and100%cold methanol,respectively.Methanolwasremovedafterthesecond centrifuga- tionthroughcarefulpipettingandairdryingfor2-3mins.Proteinpelletswerethenresuspended in50mMTEABcontaining5%acetonitrile(ACN,Honeywellresearchchemicals,USA)andmixed thoroughlybyrepeatedpipettingandvortexing.Sampleswerenextincubatedat37°Cfor2hrs withshaking. One μl of 1 μg/μl (1:100) trypsin wasthen added, vortexed andincubated for 12 hrs at37°C. The enzymaticreaction wasnext inhibited by adding25 μl of5% formicacid (FA,Sigma,USA)anddesaltedusingstrata-xpolymericreversed-phase10mg/mlC18cartridges (Phenomenex,USA). Conditioning,equilibrating, and washingsteps in thedesalting procedure wereperformedasfollows.First,theC18cartridgeswereconditionedthrough2washesof100%
ACNandequilibratedvia 2washeswith0.1% FA.Sampleswere then incubatedinconditioned cartridgesfor1mintoallowadsorptionofionisedpeptides.Cartridgeswerewashedtwicewith 0.1%FA,andadsorbedpeptideswereelutedintonew1.5mlmicrocentrifugetubesusing500μl 80%ACNin0.1%FA.Ineachofthesesteps1mloftherelevantsolutionwasused.Peptidesam- pleswerefinallydriedinaspeedvacuumat35°Candstoredat-80°Cuntilmassspectrometry (LC-MS/MS)analysis.
2.2.5. PeptidequantificationandpreparationforLC-MS/MSanalysis
ForLC-MS/MSanalysis,eachsamplewasresuspendedin20μlof0.1%FA,andpeptidequan- tificationwasperformedusingPiercemicroBCAproteinestimationkit(ThermoFisherScientific, USA)asperthemanufacturer’sinstructions.Afteradjustingthepeptideconcentration0.2μg/μl, 5μlofsolution(1μgpeptide)fromeachsamplewasinjectedintotheLC-MS/MSsystem.
2.2.6. LC-MS/MSproteomicsampleanalysis
HybridLC-MS/MS system; Orbitrap FusionTM TribridTM (Thermo Fisher Scientific,USA) was usedtoanalysethepeptidesamplesusingDDA-MS.TheLC-MS/MSparametersusedinsample acquisitionaredetailedinTable2.
2.2.7. DDA-MSdatadeconvolution,proteinidentificationandintensitynormalisation
MaxQuant(Release1.6.0.16)softwarewasusedintheanalysisof.rawfilesfromDDA-MS,by searchingagainst theUniProt/SwissProthuman reviewedproteome databasecontaining20,242 entries(October,2017). Parametersusedinpeptideandproteinidentificationinthisstudyare
Table 2
Chromatographic and mass spectrometry parameters used in sample analysis.
Parameter Description/Settings
Mass Spectrometer LC system
Orbitrap Fusion TMTribrid TM(Thermo Fisher Scientific, USA) nanoACQUITY UPLC (Waters, USA)
Total LC gradient 175 minutes
Buffers A 0.1% FA
B 100% ACN + 0.1% FA
LC gradient
(buffer B concentration) 5% at 3 minutes, 9% at 10 minutes, 26% at 120 minutes, 40% at 145 minutes, 80% at 152 minutes, 80% at 157 minutes and 1% at 160 minutes
Trap column Symmetry C18 trap, 2G VM trap (Waters, USA),
100 ˚A, 5 μm particle size, 180 μm ×20 mm
Column BEH C18 (Waters, USA), 130 ˚A, 1.7 μm particle size,
75 μm ×200 mm
Flow rate 0.3 μl/ minutes
Ion source EASY-Max NG TMion source (Thermo Fisher Scientific, USA
Ion spray voltage 1900 V
Heating temperature 285 °C
Data acquisition method DDA-MS
MS1 mass range 380 – 1500 m/z
Injection time 50 milliseconds
Resolution 120,0 0 0 FWHM
Charge state Rejecting + 1, Selecting + 2 to + 7
No. of ions selected to trigger MS2
Fragmentation 15
Higher Energy C-trap Dissociation (HCD) Injection time
Resolution
70 milliseconds 30,0 0 0 FWHM
Dynamic exclusion 90 seconds
Cycle time 2 seconds
Table 3
Parameters used in DDA-MS data analysis and label-free quantification using MaxQuant software and MaxLFQ.
Parameter Settings
Digestive enzyme Trypsin
Maximum number of miscleavages 2
Fixed modification Carbamidomethylation (Cystiene)
Variable modifications Oxidation (Methionine), Acetylation (Protein N-terminal)
Precursor mass tolerance ±20 ppm
Product mass tolerance ±40 ppm
Maximum peptide charge + 7
Match between runs Yes (0.7 minutes match time window and 20 minutes alignment time window)
Protein FDR 0.01
Peptide spectral match (PSM) FDR 0.01
Minimum peptides 1
Peptide for protein quantification Unique + Razor
Peptide selection for quantification Unmodified peptides and only the peptides modified with oxidation and N terminal acetylation
No. of peptides for quantification ≥2
summarisedinTable3.MaxLFQtool inMaxQuantsoftwarewasusedtonormalisepeptideand proteinintensitiesforlabel-freequantification.Forrobustness,theinternalcontrol(chickenoval- bumin,UniProtIDP01012)andthecommonMScontaminantdatabaseinbuiltinMaxQuantsoft- warewereusedforbenchmarking.
2.2.8. Datafilteringandmissingvalueimputation
Common contaminants and false-positive proteins detected duringMaxQuant search were removed manually from further analysis. Proteins with more than one UniProt/SwissProt
accessionsidentified with≤ 1 peptideand/or a m_score of≤ 5 were removed. The resulting protein lists were further examined forthe percentage of proteins with unrecorded intensity values,and onlythe proteins with < 50% missing protein intensityvalues were selected and imputedusingmaximumlikelihoodestimation(Rpackage)andusedfordownstreamanalysis.
2.2.9. Differentiallyabundantproteinidentificationandpathwaysanalyses
LFQnormalised expressiondataoftheselectedproteinswere usedtoidentifythedifferen- tiallyabundantproteins.Followingimputingthemissingvaluesintheseselectedproteins,dif- ferentiallyabundantproteinsbetweenTregandConvCD4+wereanalysed(Tregvs.ConvCD4+).
Theresults were obtainedasLog2FC values,wherethe positive valuesrepresenthighlyabun- dant proteins in Treg cells. Multiple t-testwith false discovery determination by a two-stage linearstep-upprocedure ofBenjamini,KriegerandYekutieli wasusedtocalculateq valuesof differentiallyabundant proteins betweenTregs andConv CD4+ Tcells. Functional enrichment ofthe proteomicdataset wasanalysed usingLog2FC valuesusing thebioinformatics software, STRING:Functionalproteinassociationnetwork,version11.5todetermineGOfunctionalenrich- ment,KEGGandReactomepathwayswithacut-off FDRof0.05%.TheR packageandGraphPad Prism(version9.2.0 forWindows,GraphPadSoftware,SanDiego,CaliforniaUSA)were usedin bioinformaticsanalysisandgraphgeneration.Wordcloudsweregeneratedusingtheonlinetool (https://www.wordclouds.com,September8th,2021).
2.2.10. ValidationofMSresultsbyflowcytometry
DDA-MSdatawasvalidatedusingFACS.Here,106PBMCfromsixhealthyindividuals,includ- ingdonorsusedintheproteomicsstudy,werelabelledwiththeforkheadboxP3(FOXP3)stain- ingkitasperthemanufacturer’sinstructions(Biolegend,USA),followedbysurfacestainingwith LIVE/DEAD® Fixable Aqua (Life Technologies, USA),CD3-APCe780 (eBioscience, Thermo Fisher, Scientific, USA), CD4-BV711 (BD Biosciences, USA) CD25-PEcy7 (BD Biosciences, USA), CD127- BV786(BDBiosciences, USA)andCD148-PE (BioLegend, USA)for20mins at4°C,andwashed withcold FACS buffer three times. Single cells were analysed on a BD LSRFortessa (BD Bio- sciences,USA)usingBDFACSDiva8.0software(BDBiosciences,USA).GatingcomprisedFSC-W andFSC-H withnon-viable cellsdumped.Flow data wasanalysed usingFlowJo v10(TreeStar, USA).
EthicsStatement
EthicalclearanceforthisstudywasobtainedfromtheQIMRBhumanresearchethicscommit- tee(HREC, #P2058).Informedconsentwasobtainedfromallvolunteersandthestudyadhered totheDeclarationofHelsinkiof1975.
CRediTAuthorStatement
HarshiWeerakoon:Data curation;Formal analysis;Validation;Investigation;Methodology;
Writing-originaldraft;John J.Miles: Conceptualization;Methodology; Resources;Supervision;
Writing-Review & Editing; Project administration; Funding acquisition; Ailin Lepletier: Con- ceptualization; Methodology; Supervision; Writing-Review & Editing; Project administration;
Michelle M. Hill: Conceptualization; Methodology; Resources; Supervision; Writing-original draft;Projectadministration.
DeclarationofCompetingInterest
Theauthorsdeclarethattheyhavenoknowncompetingfinancialinterestsorpersonalrela- tionshipswhichhaveorcouldbeperceivedtohaveinfluencedtheworkreportedinthisarticle.
Acknowledgments
We thank the volunteers for blood samples for this study. We sincerely thank Dr Renee RichardsandDrJasonMulvenna(QIMRB)foradministrativeandprojectsupport.HWwassup- ported by an International Postgraduate Research Scholarship, The University of Queensland, Brisbane, Australia, and PhD Top-up Scholarship (QIMRB). JJM was supported by a National HealthandMedicalResearchCouncilCareerDevelopmentLevel2Fellowship(1131732).Theau- thorsthankthestaff fromtheflowcytometryfacilityandtheanalyticalfacilityatQIMRB.
SupplementaryMaterials
Supplementary material associatedwith this articlecan be found in theonline version at doi:10.1016/j.dib.2021.107687.
References
[1] Y. Perez-Riverol, A. Csordas, J. Bai, M. Bernal-Llinares, S. Hewapathirana, D.J. Kundu, A. Inuganti, J. Griss, G. Mayer, M. Eisenacher, E. Pérez, J. Uszkoreit, J. Pfeuffer, T. Sachsenberg, S. Yilmaz, S. Tiwary, J. Cox, E. Audain, M. Walzer, A.F. Jarnuczak, T. Ternent, A . Brazma, J.A . Vizcaíno, The PRIDE database and related tools and resources in 2019:
improving support for quantification data, Nucleic Acids Res. 47 (2019) D442–D450, doi: 10.1093/nar/gky1106 . [2] J. Cox, M. Mann, MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies
and proteome-wide protein quantification, Nat. Biotechnol. 26 (2008) 1367–1372, doi: 10.1038/nbt.1511 .
[3] T.U. Consortium, UniProt: a worldwide hub of protein knowledge, Nucleic Acids Res. 47 (2019) D506–D515, doi: 10.
1093/nar/gky1049 .
[4] J. Cox, M.Y. Hein, C.A. Luber, I. Paron, N. Nagaraj, M. Mann, Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ, Mol. Cell. Proteom. 13 (2014) 2513–
2526, doi: 10.1074/mcp.M113.031591 .
[5] RStudio Team (2020). RStudio: Integrated Development for R. RStudio, PBC, Boston, MA URL http://www.rstudio.
com/ .
[6] Y. Benjamini, A.M. Krieger, D. Yekutieli, Adaptive linear step-up procedures that control the false discovery rate, Biometrika 93 (2006) 491–507, doi: 10.1093/biomet/93.3.491 .
[7] M. Kanehisa, S. Goto, KEGG: kyoto encyclopedia of genes and genomes, Nucleic Acids Res. 28 (20 0 0) 27–30, doi: 10.
1093/nar/28.1.27 .
[8] B. Jassal, L. Matthews, G. Viteri, C. Gong, P. Lorente, A. Fabregat, K. Sidiropoulos, J. Cook, M. Gillespie, R. Haw, F. Loney, B. May, M. Milacic, K. Rothfels, C. Sevilla, V. Shamovsky, S. Shorser, T. Varusai, J. Weiser, G. Wu, L. Stein, H. Hermjakob, P. D’Eustachio, The reactome pathway knowledgebase, Nucleic Acids Res. 48 (2020) D498–D503, doi: 10.1093/nar/gkz1031 .
[9] D. Szklarczyk, A.L. Gable, D. Lyon, A. Junge, S. Wyder, J. Huerta-Cepas, M. Simonovic, N.T. Doncheva, J.H. Mor- ris, P. Bork, L.J. Jensen, C. von Mering, STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets, Nucleic Acids Res. 47 (2019) D607–D613, doi: 10.1093/nar/gky1131 .
[10] H. Weerakoon, J. Straube, K. Lineburg, L. Cooper, S. Lane, C. Smith, S. Alabbas, J. Begun, J.J. Miles, M.M. Hill, A. Lepletier, Expression of CD49f defines subsets of human regulatory T cells with divergent transcriptional land- scape and function that correlate with ulcerative colitis disease activity, Clin. Transl. Immunol. 10 (2021) e1334, doi: 10.1002/cti2.1334 .
[11] O. Stepanek, T. Kalina, P. Draber, T. Skopcova, K. Svojgr, P. Angelisova, V. Horejsi, A. Weiss, T. Brdicka, Regulation of Src family kinases involved in T cell receptor signaling by protein-tyrosine phosphatase CD148, J. Biol. Chem. 286 (2011) 22101–22112, doi: 10.1074/jbc.M110.196733 .
[12] K.A. Dave, M.J. Headlam, T.P. Wallis, J.J. Gorman, Preparation and analysis of proteins and peptides using MALDI TOF/TOF mass spectrometry, Curr. Protoc. Protein Sci. Chapter 16 (2011) Unit 16.13, doi: 10.1002/0471140864.
ps1613s63 .
[13] H. Weerakoon, J. Potriquet, A.K. Shah, S. Reed, B. Jayakody, C. Kapil, M.K. Midha, R.L. Moritz, A. Lepletier, J. Mulvenna, J.J. Miles, M.M. Hill, A primary human T-cell spectral library to facilitate large scale quantitative T-cell proteomics, Sci. Data 7 (2020) 412, doi: 10.1038/s41597- 020- 00744- 3 .