R Script for Paper: Substantial Downregulation of Mitochondrial and Peroxisomal Proteins during Acute Kidney Injury revealed by Data-Independent Acquisition Proteomics

Spectronaut v16 was used for spectral library generation and for data-independent acquisition (DIA) data processing, quantification, and statistical analysis. The Mus musculus reference proteome with 58,430 entries (UniProtKB-TrEMBL), accessed on 01/31/2018, was used for spectral library search at both the peptide and protein levels. Dynamic data extraction parameters and precision iRT calibration with local non-linear regression were used for data processing. Trypsin/P was used as the digestion enzyme, with specific cleavage and up to two missed cleavages allowed. Methionine oxidation and protein N-terminal acetylation were set as dynamic modifications, and carbamidomethylation of cysteine was set as a static modification. Identification at the protein group level required at least two unique peptide identifications and was performed with a 1% q-value cutoff at both the precursor ion and protein levels. Protein-level quantification was based on the peak areas of extracted ion chromatograms (XICs) of 3–6 MS2 fragment ions, specifically b- and y-ions, with an automatic normalization strategy and 1% q-value data filtering applied.
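The identification criteria above were applied inside Spectronaut, not in R. As an illustration only, the same logic (a 1% q-value cutoff at the precursor and protein levels, then at least two peptides per protein group) can be sketched in base R on an exported precursor-level table. The table, its column names, and the values are hypothetical. The sketch counts distinct peptide sequences and assumes each one is unique to its protein group.

```r
# Hypothetical precursor-level report; the column names and values are
# made up for illustration and are not Spectronaut's export headers.
report <- data.frame(
  protein_group  = c("P1", "P1", "P1", "P2", "P3", "P3"),
  peptide        = c("AAK", "CCR", "CCR", "DDK", "EEK", "FFR"),
  precursor_qval = c(0.001, 0.004, 0.002, 0.003, 0.020, 0.001),
  protein_qval   = c(0.002, 0.002, 0.002, 0.001, 0.005, 0.005),
  stringsAsFactors = FALSE
)

q_cutoff <- 0.01            # 1% q-value cutoff (strict "<" is an assumption)
min_unique_peptides <- 2    # at least two peptides per protein group

# Keep precursors that pass both the precursor- and protein-level cutoffs.
passing <- report[report$precursor_qval < q_cutoff &
                    report$protein_qval < q_cutoff, ]

# Count distinct peptide sequences per protein group among passing rows.
n_peptides <- tapply(passing$peptide, passing$protein_group,
                     function(x) length(unique(x)))

# Protein groups that meet the minimum peptide requirement.
identified <- names(n_peptides)[n_peptides >= min_unique_peptides]
print(identified)
```

In this toy table only protein group P1 is retained: P2 has a single peptide, and one of the two P3 peptides fails the precursor-level cutoff.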

ConsensusPathDB-mouse (Release MM11, 14.10.2021) was used for over-representation analysis (ORA) of the significantly altered quantifiable proteins to identify which Gene Ontology (GO) terms were significantly enriched in these samples. GO terms (biological processes, molecular functions, and cellular components) were filtered to retain only biological processes (term category = b) with a q-value < 0.01 and term level ≥ 5. Dot plots of the significantly enriched biological processes from each comparison were generated with the ggplot2 package in R.
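The filtering and plotting step can be sketched as follows. This is a minimal illustration, not the script used for the paper: the `ora` table, its column names (`term_category`, `term_level`, `q_value`, `n_overlap`), and the helper `filter_bp_terms()` are hypothetical stand-ins for an exported ORA result, and the choice of plot aesthetics is an assumption.

```r
# Hypothetical ORA result table; the terms and values are placeholders.
ora <- data.frame(
  term_name     = paste("GO term", LETTERS[1:6]),
  term_category = c("b", "b", "m", "c", "b", "b"),
  term_level    = c(5, 6, 5, 5, 3, 5),
  q_value       = c(0.001, 0.0005, 0.001, 0.002, 0.0001, 0.03),
  n_overlap     = c(12, 20, 8, 15, 30, 5),
  stringsAsFactors = FALSE
)

# Keep biological processes (category "b") with q < 0.01 and level >= 5.
filter_bp_terms <- function(ora, q_cutoff = 0.01, min_level = 5) {
  keep <- ora$term_category == "b" &
    ora$q_value < q_cutoff &
    ora$term_level >= min_level
  ora[keep, , drop = FALSE]
}

bp <- filter_bp_terms(ora)
bp$neg_log10_q <- -log10(bp$q_value)

# Build the dot plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  dot_plot <- ggplot2::ggplot(
    bp,
    ggplot2::aes(
      x = neg_log10_q,
      y = stats::reorder(term_name, neg_log10_q),  # most significant on top
      size = n_overlap,
      colour = neg_log10_q
    )
  ) +
    ggplot2::geom_point() +
    ggplot2::labs(x = "-log10(q-value)", y = NULL,
                  size = "Proteins in term", colour = "-log10(q-value)")
}
```

Calling `print(dot_plot)` or `ggplot2::ggsave()` renders the figure. Running the same filter on the ORA output of each comparison gives one dot plot per comparison.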

A Localization of Organellar Proteins by Isotope Tagging (LOPIT) map was also generated in RStudio, using the pRoloc and pRolocdata packages to load the mouse pluripotent stem cell (hyperLOPIT2015) dataset. The hyperLOPIT2015 dataset contains organelle density fraction enrichment patterns, quantitative multiplexed MS data for each fraction, and protein localization assignments based on similarity in distribution to well-annotated organelle marker proteins. The LOPIT map was created by applying the t-SNE machine learning algorithm to reduce the dimensionality of the hyperLOPIT2015 dataset, so that proteins cluster by similarity across the experimental factors described above, and then overlaying the points with our own fold-change data comparing the injured and healthy kidney.
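The overlay step can be sketched as below, under stated assumptions. The helper `overlay_fold_change()`, the toy inputs, and the column names `accession` and `log2_fc` are hypothetical. The guarded section assumes that pRoloc, pRolocdata, and Rtsne are installed, that `plot2D()` returns the 2D coordinates when called with `plot = FALSE`, and that the row names of those coordinates are protein accessions matching the fold-change table.

```r
# Join 2D map coordinates (rows named by protein accession) with a
# fold-change table; proteins without a fold change get NA.
overlay_fold_change <- function(coords, fold_changes) {
  map <- data.frame(
    accession = rownames(coords),
    x = coords[, 1],
    y = coords[, 2],
    stringsAsFactors = FALSE
  )
  merge(map, fold_changes, by = "accession", all.x = TRUE)
}

# Toy inputs (made up) to show the shape of the result.
toy_coords <- matrix(
  c(-1.0, 0.5, 2.0, 0.3, -0.7, 1.1), ncol = 2,
  dimnames = list(c("P001", "P002", "P003"), c("Dim1", "Dim2"))
)
toy_fc <- data.frame(accession = c("P001", "P003"),
                     log2_fc = c(-1.8, 0.4),
                     stringsAsFactors = FALSE)
toy_map <- overlay_fold_change(toy_coords, toy_fc)

# With the Bioconductor packages available, the coordinates would come
# from a t-SNE projection of hyperLOPIT2015 instead of the toy matrix.
if (requireNamespace("pRoloc", quietly = TRUE) &&
    requireNamespace("pRolocdata", quietly = TRUE) &&
    requireNamespace("Rtsne", quietly = TRUE)) {
  library(pRoloc)
  data(hyperLOPIT2015, package = "pRolocdata")
  set.seed(1)  # t-SNE is stochastic; fix the seed for a repeatable layout
  tsne_coords <- plot2D(hyperLOPIT2015, method = "t-SNE", plot = FALSE)
  # my_fold_changes: hypothetical data frame with accession and log2_fc
  # lopit_map <- overlay_fold_change(tsne_coords, my_fold_changes)
}
```

The merged table can then be plotted with each point colored by `log2_fc`, so that organelle clusters with coordinated down- or upregulation stand out. Using a left join keeps every protein in the hyperLOPIT2015 map visible, even when it was not quantified in the kidney data.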

Raw data and complete MS data sets have been uploaded to the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository, developed by the Center for Computational Mass Spectrometry at the University of California, San Diego, and can be downloaded using the following link: https://massive.ucsd.edu/ProteoSAFe/dataset.jsp?accession=MSV000090777 (MassIVE ID number: MSV000090777; ProteomeXchange ID: PXD038339).