barista

Usage:

crux barista [options] <database> <fragmentation spectra> <search results>
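
As a rough illustration of the usage line above, the sketch below assembles the argument list for one Barista run without executing it. The helper name barista_command and all file and folder names are hypothetical placeholders; the only option used is --output-dir, which this page names, and it is assumed here to take the folder name as its value.

```python
import shlex


def barista_command(database, spectra, search_results, output_dir=None):
    """Assemble the argument list for one `crux barista` run (nothing is executed)."""
    cmd = ["crux", "barista"]
    if output_dir is not None:
        # Assumption: --output-dir takes the output folder name as its value.
        cmd += ["--output-dir", output_dir]
    # Positional arguments follow the order given in the usage line.
    cmd += [database, spectra, search_results]
    return cmd


# Hypothetical input names, for illustration only.
cmd = barista_command(
    "targets-decoys.fasta",
    "sample.ms2",
    "search-results.txt",
    output_dir="barista-out",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run once the placeholder names are replaced with real files.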

Description:

Barista is a protein identification algorithm that combines two tasks, peptide-spectrum match (PSM) verification and protein inference, into a single learning algorithm. The program requires three inputs: a set of MS2 spectra, a protein database, and the results of searching the spectra against the database. Barista produces three ranked lists (proteins, peptides and PSMs), ordered by how likely the proteins and peptides are to be present in the sample and how likely the PSMs are to be correct. Barista can jointly analyze the results of multiple shotgun proteomics runs, whether they come from different experiments or from replicates of the same experiment.

Barista uses a machine learning strategy that requires the database search to be carried out on both target and decoy proteins. The searches may be carried out on a concatenated database or, with the --separate-searches option, on separate target and decoy databases. The crux tide-index command can be used to generate a decoy database.

Barista assigns two types of statistical confidence estimates, q-values and posterior error probabilities, to identified PSMs, peptides and proteins. For more information about these values, see the documentation for assign-confidence.
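
To make the notion of a q-value concrete, here is a minimal sketch of the common target-decoy estimate: the false discovery rate at a score threshold is approximated by the ratio of decoys to targets scoring at or above it, and an item's q-value is the smallest such rate over all thresholds that still accept it. This is a generic illustration under those assumptions, not necessarily the estimator that Barista or assign-confidence implements; the function name is hypothetical and tied scores are not treated specially.

```python
def target_decoy_qvalues(scores, is_decoy):
    """Return one q-value per item, in input order, from a decoys/targets FDR estimate."""
    # Rank items from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR when the threshold is placed just below each rank.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this rank or at any lower-scoring rank.
    qvalues = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvalues[order[rank]] = running_min
    return qvalues


# Toy data: five scored matches, two of which are decoys.
example_q = target_decoy_qvalues(
    [10.0, 9.0, 8.0, 7.0, 6.0],
    [False, False, True, False, True],
)
```

Taking the running minimum from the bottom of the ranking is what makes q-values non-decreasing as scores get worse, so a q-value cutoff always selects a prefix of the ranked list.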

More details on the Barista algorithm are provided in

Marina Spivak, Jason Weston, Michael J. MacCoss and William Stafford Noble. "Direct maximization of protein identifications from tandem mass spectra." Molecular and Cellular Proteomics. 11(2):M111.012161, 2012.

Input:

Output:

The program writes files to the folder crux-output by default. The name of the output folder can be changed with the --output-dir option. The following files will be created:

Options: