Aird is a new format for mass spectrometry data storage. It is an open-source,
computation-oriented format with controllable precision, flexible indexing strategies, and a high
compression ratio for m/z, intensity and ion mobility data. Aird provides a novel compressor called
ComboComp for m/z data, which achieves a high compression ratio: compared with Zlib, m/z data
stored in Aird is about 65% smaller on average. Aird is also computation-friendly: through SIMD
optimization, its decoding speed is much higher than that of Zlib.
Aird SDK is a developer toolkit available in Java, C# and Python. It lets developers read
the spectrum data in an Aird file quickly. With its high read performance and excellent
compression ratio, developers can build many applications on top of Aird for data visualization
and analysis.
Aird Index File Suffix: .json
Aird Data File Suffix: .aird
The Aird index file and the Aird data file should be stored in the same directory with the same file name.
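Because the two files share a directory and a base name, the data file can be located from the index file path alone. The following is a minimal sketch of that rule using only the JDK; the class `AirdPaths` and the method `dataFileFor` are hypothetical names for illustration and are not part of the Aird SDK.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Aird SDK.
final class AirdPaths {

    // Maps ".../sample.json" to ".../sample.aird" in the same directory.
    static Path dataFileFor(Path indexFile) {
        String name = indexFile.getFileName().toString();
        if (!name.endsWith(".json")) {
            throw new IllegalArgumentException("Not an Aird index file: " + indexFile);
        }
        // Strip the index suffix and attach the data suffix.
        String base = name.substring(0, name.length() - ".json".length());
        return indexFile.resolveSibling(base + ".aird");
    }

    public static void main(String[] args) {
        System.out.println(dataFileFor(Paths.get("data", "sample.json")));
    }
}
```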
You should use the AirdPro client to convert vendor files into the Aird format.
You can download AirdPro from GitHub:
https://github.com/CSi-Studio/AirdPro/releases/
After downloading, unzip the file and click AirdPro.exe to start the AirdPro application. AirdPro is
written in C# and is also an open-source project. It provides a simple UI for converting
vendor files to Aird files quickly.
- DIA/SWATH
- DDA
- PRM
- DIA_PASEF
- DDA_PASEF
Demo code: see SampleCode.java in the project or in the "How to use" chapter
- Lu, M., An, S., Wang, R. et al. Aird: a computation-oriented mass spectrometry data format enables a higher compression ratio and less decoding time. BMC Bioinformatics 23, 35 (2022).
- Wang, J. et al. StackZDPD: a novel encoding scheme for mass spectrometry data optimized for speed and compression ratio. Scientific Reports 12, 5384 (2022).
<dependency>
<groupId>net.csibio.aird</groupId>
<artifactId>aird-sdk</artifactId>
<version>2.4.1.2</version>
</dependency>
Search for "AirdSDK" in the NuGet Package Manager
pip install AirdSDK
Name | Type | Required | Description |
---|---|---|---|
version | String | True | Aird format version |
versionCode | Integer | True | Aird format version code |
compressors | List | True | The compression strategies for m/z, intensity and mobility array |
instruments | List | True | General information about the MS instrument |
dataProcessings | List | False | Description of any manipulation (from the first conversion to Aird format until the creation of the current Aird instance document) applied to the data |
softwares | List | False | Software used to convert the data. If data has been processed (e.g. profile > centroid) by any additional progs these should be added too |
parentFiles | List | False | Path to all the ancestor files (up to the native acquisition file) used to generate the current Aird document |
rangeList | List | False | The precursor m/z window ranges which have been adjusted with experiment overlap. This field is targeted for DIA and PRM type format |
indexList | List | True | The index for mass spectrometry data |
type | String | True | Aird type. There are six types now: DIA, DDA, PRM, DIA_PASEF, DDA_PASEF, COMMON |
fileSize | Long | True | The file size for Aird file and JSON file |
totalCount | Long | True | Total spectrum count |
airdPath | String | False | The .aird file path |
activator | String | False | Activator Method, CID,HCD,ETD,ECD |
energy | Float | False | Collision Energy |
msType | String | True | Mass Spectrum Type, PROFILE, CENTROIDED |
rtUnit | String | True | rt unit, always second |
polarity | String | True | Polarity type, POSITIVE, NEGATIVE, NEUTRAL |
ignoreZeroIntensityPoint | Boolean | True | Whether to ignore points whose intensity is 0 |
mobiInfo | MobiInfo | False | ion mobility information |
creator | String | False | The file creator, this field can be set up in the AirdPro |
createDate | String | False | The create date for the aird file |
features | String | False | Some other features stored with “key:value;key:value” format |
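
The `features` field packs extra attributes into one string in the "key:value;key:value" format. The sketch below shows one way such a string could be split into a map with plain JDK code. The class `FeatureStrings` is a hypothetical name, not an SDK class, and the handling of edge cases (splitting on the first colon, skipping entries without a colon) is an assumption, since the format description does not cover them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Aird SDK.
final class FeatureStrings {

    // Parses "key:value;key:value" into an insertion-ordered map.
    static Map<String, String> parse(String features) {
        Map<String, String> result = new LinkedHashMap<>();
        if (features == null || features.isEmpty()) {
            return result;
        }
        for (String entry : features.split(";")) {
            // Assumption: split on the first colon so values may contain colons.
            int sep = entry.indexOf(':');
            if (sep < 0) {
                continue; // Assumption: entries without a colon are skipped.
            }
            result.put(entry.substring(0, sep), entry.substring(sep + 1));
        }
        return result;
    }
}
```
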
Name | Type | Required | Description |
---|---|---|---|
target | String | True | mz, intensity, ion mobility |
methods | List | True | Combination Compressors like ["VB","Zstd"] |
precision | Integer | False | Stored as 10^N, where N is the number of decimal places kept in the final data |
digit | Integer | False | Used by the StackZDPD algorithm; 2^digit = number of layers |
byteOrder | String | True | LITTLE_ENDIAN(default), BIG_ENDIAN |
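
To make the `precision` and `digit` fields concrete, here is a small sketch of the arithmetic they imply. It assumes the usual fixed-point reading of `precision`: a value is multiplied by 10^N and kept as an integer, then divided by the same factor when read back. The table only states that N is the number of decimal places, so treat this as an illustration rather than the SDK's actual codec. `PrecisionDemo` and its methods are hypothetical names.

```java
// Hypothetical illustration, not part of the Aird SDK.
final class PrecisionDemo {

    // Scales a value by precision (10^N) and rounds it to an integer.
    static long encode(double value, int precision) {
        return Math.round(value * precision);
    }

    // Restores the value, now limited to N decimal places.
    static double decode(long stored, int precision) {
        return (double) stored / precision;
    }

    // Number of layers implied by the digit field: 2^digit.
    static int layers(int digit) {
        return 1 << digit;
    }
}
```

With `precision = 100000` (N = 5), an m/z of 500.12345 is kept as the integer 50012345; with `digit = 3`, the layer count is 8.
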
Name | Type | Required | Description |
---|---|---|---|
start | Double | True | Precursor m/z start |
end | Double | True | Precursor m/z end |
mz | Double | True | Precursor m/z |
charge | Integer | False | Precursor charge, 0 when empty |
features | String | False | Some other features stored with “key:value;key:value” format |
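
A typical use of the `start` and `end` fields is to find which window covers a given precursor m/z. The sketch below does this with a linear scan. `Window` is a hypothetical stand-in that models only `start` and `end`, not the SDK's own type, and treating both bounds as inclusive is an assumption.

```java
import java.util.List;

// Hypothetical illustration, not part of the Aird SDK.
final class WindowLookup {

    // Stand-in for a window range; only start and end are modeled.
    static final class Window {
        final double start;
        final double end;

        Window(double start, double end) {
            this.start = start;
            this.end = end;
        }
    }

    // Returns the index of the first window with start <= mz <= end, or -1 if none matches.
    static int indexOf(List<Window> windows, double mz) {
        for (int i = 0; i < windows.size(); i++) {
            Window w = windows.get(i);
            if (mz >= w.start && mz <= w.end) {
                return i;
            }
        }
        return -1;
    }
}
```
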
Name | Type | Required | Description |
---|---|---|---|
level | Integer | True | 1:MS1, 2:MS2 |
startPtr | Long | True | The start point for the block |
endPtr | Long | True | The endpoint for the block |
num | Integer | False | The scan number in the vendor file. If a block has a list of MS2, this field is the related MS1’s number |
rangeList | List | False | The precursor m/z window ranges which have been adjusted with experiment overlap. This field is targeted for DIA and PRM type format |
nums | List | False | Scan numbers in the block |
rts | List | True | All the retention times in the block |
basePeakIntensities | List | True | The base peak intensity of every spectrum in the block |
basePeakMzs | List | True | The base peak m/z of every spectrum in the block |
tags | List | False | Used in StackZDPD, the original layers of every mz point |
tics | List | False | The total intensity of every spectrum in the block |
mzs | List | True | Byte size of every m/z array in the block |
ints | List | True | Byte size of every intensity array in the block |
mobilities | List | False | Byte size of every ion mobility array in the block |
cvList | List<List> | False | PSI Controlled Vocabulary |
features | String | False | Some other features stored with “key:value;key:value” format |
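
The `startPtr`, `mzs` and `ints` fields together are enough to compute where one spectrum's bytes sit inside the data file. The sketch below shows that bookkeeping under a loudly stated assumption: arrays are laid out spectrum by spectrum, with each spectrum's m/z bytes followed by its intensity bytes, and no ion mobility arrays. The table does not spell out the layout, so check it against the SDK before relying on it. `BlockOffsets` and `locate` are hypothetical names.

```java
// Hypothetical illustration, not part of the Aird SDK.
final class BlockOffsets {

    // Returns {mzStart, intStart, end} as absolute byte positions for spectrum i.
    // Assumed layout: [mz_0][int_0][mz_1][int_1]... starting at startPtr.
    static long[] locate(long startPtr, int[] mzs, int[] ints, int i) {
        long pos = startPtr;
        // Skip the bytes of all spectra that precede spectrum i.
        for (int j = 0; j < i; j++) {
            pos += (long) mzs[j] + ints[j];
        }
        long mzStart = pos;
        long intStart = mzStart + mzs[i];
        long end = intStart + ints[i];
        return new long[] {mzStart, intStart, end};
    }
}
```

Reading a single spectrum then only needs a seek to `mzStart` and a read of `end - mzStart` bytes, which is what makes a per-block size list useful as an index.
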
Name | Type | Required | Description |
---|---|---|---|
manufacturer | String | False | Instrument manufacturer: "ABSciex", "Thermo Fisher" |
ionization | String | False | Ionization |
resolution | String | False | Resolution |
model | String | False | Instrument model |
source | List | False | Source: "electrospray ionization", "electrospray inlet" |
analyzer | List | False | Analyzer: "quadrupole", "orbitrap" |
detector | List | False | Detector: "inductive detector" |
Name | Type | Required | Description |
---|---|---|---|
processingOperations | List | False | Any additional manipulation not included elsewhere in the dataProcessing element |
Name | Type | Required | Description |
---|---|---|---|
name | String | True | The software name |
version | String | False | The software version |
type | String | False | The software function type, like "acquisition" |
Name | Type | Required | Description |
---|---|---|---|
name | String | True | The filename |
location | String | False | The file location |
type | String | False | The file type |
Name | Type | Required | Description |
---|---|---|---|
dictStart | Long | True | Start position of the mobility array in the .aird file |
dictEnd | Long | True | End position of the mobility array in the .aird file |
unit | String | False | Ion mobility unit |
type | String | False | Ion mobility type, see MobilityType |
// Scan a directory for Aird index files and load each of them
List<File> files = AirdScanUtil.scanIndexFiles("E:\\data\\SGS");
files.forEach(file -> {
    AirdManager.getInstance().load(file.getPath());
});
DIAParser diaParser = new DIAParser("\\FilePath\\file.json");
DDAParser ddaParser = new DDAParser("\\FilePath\\file.json");
DDAPasefParser ddaPasefParser = new DDAPasefParser("\\FilePath\\file.json");
DIAPasefParser diaPasefParser = new DIAPasefParser("\\FilePath\\file.json");
PRMParser prmParser = new PRMParser("\\FilePath\\file.json");
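// Open a parser on the index file and read the file-level metadata
DDAParser parser = new DDAParser(YOUR_AIRD_INDEX_FILE_PATH);
AirdInfo airdInfo = parser.getAirdInfo();
// Read one spectrum by its number
int num = 12;
Spectrum pairs = parser.getSpectrum(num);
// Read one spectrum by retention time. The call is kept as given in the source,
// which passes num; see SampleCode for the rt-based lookup.
double rt = 12.3456;
Spectrum pairs = parser.getSpectrum(num);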
DIAParser diaParser = new DIAParser("\\FilePath\\file.json");
AirdInfo airdInfo = diaParser.getAirdInfo();
airdInfo.getIndexList().forEach(blockIndex -> {
    TreeMap<Double, Spectrum> map = diaParser.getSpectrums(blockIndex); // key is retention time
});
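Because the spectra of a block come back in a `TreeMap` keyed by retention time, the standard `TreeMap` navigation methods can pick the spectrum closest to a target retention time. The following is a minimal sketch using only the JDK; `NearestRt` is a hypothetical helper, not an SDK class, and it is generic so it works with any value type, including `Spectrum`.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the Aird SDK.
final class NearestRt {

    // Returns the value whose key is closest to rt, or null if the map is empty.
    // On a tie, the entry with the smaller retention time wins.
    static <V> V nearest(TreeMap<Double, V> byRt, double rt) {
        Map.Entry<Double, V> lo = byRt.floorEntry(rt);   // greatest key <= rt
        Map.Entry<Double, V> hi = byRt.ceilingEntry(rt); // smallest key >= rt
        if (lo == null && hi == null) {
            return null;
        }
        if (lo == null) {
            return hi.getValue();
        }
        if (hi == null) {
            return lo.getValue();
        }
        return (rt - lo.getKey() <= hi.getKey() - rt) ? lo.getValue() : hi.getValue();
    }
}
```

Both lookups run in logarithmic time, so this stays cheap even for blocks with many spectra.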
This is only available for DDAParser and is advised only for small DDA data files (< 200MB). It reads all spectra into memory.
DDAParser ddaParser = new DDAParser("\\FilePath\\file.json");
List<DDAMs> cycleList = ddaParser.readAllToMemory();
For detailed sample code, see net.csibio.aird.sample.SampleCode.