pipeline

Usage:

crux pipeline [options] <mass spectra>+ <peptide source>

Description:

Given one or more sets of tandem mass spectra as well as a protein database, this command runs a series of Crux tools and reports all of the results in a single output directory. There are three steps in the pipeline:

Bullseye to assign high-resolution precursor m/z values to MS/MS data. This step is optional.
Database searching using either Tide-search or Comet. The database can be provided as a file in FASTA format or, alternatively, as an index produced by tide-index.
Post-processing using either assign-confidence or Percolator.

All of the command line options associated with the individual tools in the pipeline can be used with the pipeline command.
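As a rough illustration of the usage line above, the following Python sketch assembles an argument list for `crux pipeline`. The helper name `build_pipeline_command` and the file names are hypothetical; the flags are the ones listed under Options below. It only builds the command and does not run Crux.

```python
import shlex

SEARCH_ENGINES = {"comet", "tide-search"}
POST_PROCESSORS = {"percolator", "assign-confidence", "none"}


def build_pipeline_command(spectra, peptide_source, *, bullseye=False,
                           search_engine="tide-search",
                           post_processor="percolator",
                           output_dir="crux-output"):
    """Return the argv list for one `crux pipeline` invocation (hypothetical helper)."""
    if not spectra:
        raise ValueError("at least one mass spectra file is required")
    if search_engine not in SEARCH_ENGINES:
        raise ValueError(f"unknown search engine: {search_engine}")
    if post_processor not in POST_PROCESSORS:
        raise ValueError(f"unknown post-processor: {post_processor}")
    args = [
        "crux", "pipeline",
        "--bullseye", "T" if bullseye else "F",
        "--search-engine", search_engine,
        "--post-processor", post_processor,
        "--output-dir", output_dir,
    ]
    # Positional arguments: one or more spectra files, then the peptide source.
    return args + list(spectra) + [peptide_source]


# Hypothetical file names, for illustration only.
command = build_pipeline_command(["sample1.mzML", "sample2.mzML"], "proteins.fasta")
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run` on a machine where Crux is installed.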

Input:

mass spectra+ – The name of the file(s) from which to parse the fragmentation spectra, in any of the file formats supported by ProteoWizard. Alternatively, with Tide-search, these files may be binary spectrum files produced by a previous run of crux tide-search using the store-spectra parameter.
peptide source – Either the name of a file in FASTA format from which to retrieve proteins and peptides or an index created by a previous run of crux tide-index (for Tide searching).

Output:

The program writes files to the folder crux-output by default. The name of the output folder can be set by the user using the --output-dir option. The following files will be created:

bullseye.pid. – a file containing the fragmentation spectra for which accurate masses were successfully inferred. Unless otherwise specified (with the --spectrum-format option), the output file format is ".ms2". Note that if the output format is ".ms2," then a single spectrum may have multiple "Z" lines, each indicating a charge state and accurate mass. In addition, Bullseye inserts an "I" line (for charge-dependent analysis) corresponding to each "Z" line. The "I" line contains "EZ" in the second column, the charge and mass from the associated "Z" line in the third and fourth columns, followed by the chromatographic apex and the intensity at the chromatographic apex.
bullseye.no-pid. – a file containing the fragmentation spectra for which accurate masses were not inferred.
hardklor.mono.txt – a tab-delimited text file containing one line for each isotope distribution, as described here.
bullseye.params.txt – a file containing the name and value of all parameters/options for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the --parameter-file option for other crux programs.
bullseye.log.txt – a log file containing a copy of all messages that were printed to standard error.
tide-search.target.txt – a tab-delimited text file containing the target PSMs. See txt file format for a list of the fields.
tide-search.decoy.txt – a tab-delimited text file containing the decoy PSMs. This file will only be created if the index was created with decoys.
tide-search.params.txt – a file containing the name and value of all parameters/options for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the --parameter-file option for other Crux programs.
tide-search.log.txt – a log file containing a copy of all messages that were printed to the screen during execution.
comet.target.txt – a tab-delimited text file containing the target PSMs. See txt file format for a list of the fields.
comet.params.txt – a file containing the name and value of all parameters/options for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the --parameter-file option for other crux programs.
comet.log.txt – a log file containing a copy of all messages that were printed to standard error.
percolator.target.proteins.txt – a tab-delimited file containing the target protein matches. See here for a list of the fields.
percolator.decoy.proteins.txt – a tab-delimited file containing the decoy protein matches. See here for a list of the fields.
percolator.target.peptides.txt – a tab-delimited file containing the target peptide matches. See here for a list of the fields.
percolator.decoy.peptides.txt – a tab-delimited file containing the decoy peptide matches. See here for a list of the fields.
percolator.target.psms.txt – a tab-delimited file containing the target PSMs. See here for a list of the fields.
percolator.decoy.psms.txt – a tab-delimited file containing the decoy PSMs. See here for a list of the fields.
percolator.params.txt – a file containing the name and value of all parameters for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the --parameter-file option for other crux programs.
percolator.pep.xml – a file containing the PSMs in pepXML format. This file can be used as input to some of the tools in the Transproteomic Pipeline.
percolator.mzid – a file containing the protein, peptide, and spectrum matches in mzIdentML format.
percolator.log.txt – a log file containing a copy of all messages that were printed to standard error.
assign-confidence.target.txt – a tab-delimited text file that contains the targets, sorted by score. The file will contain one new column, named "<method> q-value", where <method> is either "tdc" or "mix-max".
assign-confidence.log.txt – a log file containing a copy of all messages that were printed to stderr.
assign-confidence.params.txt – a file containing the name and value of all parameters/options for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the --parameter-file option for other crux programs.
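The "I" line layout described above for the Bullseye .ms2 output can be read with a few lines of Python. This is a minimal sketch that assumes the columns are separated by whitespace; the `EZLine` type, the `parse_ez_line` helper, and the sample values are hypothetical.

```python
from typing import NamedTuple, Optional


class EZLine(NamedTuple):
    charge: int            # third column
    mass: float            # fourth column
    apex: float            # chromatographic apex
    apex_intensity: float  # intensity at the chromatographic apex


def parse_ez_line(line: str) -> Optional[EZLine]:
    """Parse an 'I' line whose second column is 'EZ'; return None for any other line."""
    fields = line.split()
    if len(fields) < 6 or fields[0] != "I" or fields[1] != "EZ":
        return None
    return EZLine(int(fields[2]), float(fields[3]),
                  float(fields[4]), float(fields[5]))


# Made-up values, for illustration only.
print(parse_ez_line("I\tEZ\t2\t1234.5678\t35.2\t98765.4"))
```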

Options:

pipeline options
- --bullseye T|F – Run the Bullseye algorithm on the given MS data, using it to assign high-resolution precursor values to the MS/MS data. If a spectrum file ends with .ms2 or .cms2, matching .ms1/.cms1 files will be used as the MS1 file. Otherwise, it is assumed that the spectrum file contains both MS1 and MS2 scans. Default = false.
- --search-engine comet|tide-search – Specify which search engine to use. Default = tide-search.
- --post-processor percolator|assign-confidence|none – Specify which post-processor to apply to the search results. Default = percolator.
- --file-column T|F – Include the file column in tab-delimited output. Default = true.
- --c-pos <float> – Penalty for mistakes made on positive examples. If this value is not specified, then it is set via cross validation over the values {0.1, 1, 10}, selecting the value that yields the largest number of PSMs identified at the q-value threshold set via the --test-fdr parameter. Default = 0.01.
- --c-neg <float> – Penalty for mistakes made on negative examples. This parameter requires that --c-pos be set explicitly; otherwise, --c-neg will have no effect. If not specified, then this value is set by cross validation over {0.1, 1, 10}. Default = 0.
- --train-fdr <float> – False discovery rate threshold to define positive examples in training. Default = 0.01.
- --test-fdr <float> – False discovery rate threshold used in selecting hyperparameters during internal cross-validation and for reporting the final results. Default = 0.01.
- --maxiter <integer> – Maximum number of iterations for training. Default = 10.
- --quick-validation T|F – Quicker execution by reduced internal cross-validation. Default = false.
- --default-direction <string> – In its initial round of training, Percolator uses one feature to induce a ranking of PSMs. By default, Percolator will select the feature that produces the largest set of target PSMs at a specified FDR threshold (cf. --train-fdr). This option allows the user to specify which feature is used for the initial ranking, using the name as a string from this table. The name can be preceded by a hyphen (e.g. "-XCorr") to indicate that a lower value is better. Default = <empty>.
- --unitnorm T|F – Use unit normalization (i.e., linearly rescale each PSM's feature vector to have a Euclidean length of 1), instead of standard deviation normalization. Default = false.
- --test-each-iteration T|F – Measure performance on test set each iteration. Default = false.
- --override T|F – By default, Percolator will examine the learned weights for each feature, and if the weight appears to be problematic, then Percolator will discard the learned weights and instead employ a previously trained, static score vector. This switch allows this error checking to be overridden. Default = false.
- --percolator-seed <string> – When given an unsigned integer value, seeds the random number generator with that value. When given the string "time", seeds the random number generator with the system time. Default = 1.
- --klammer T|F – Use retention time features calculated as in "Improving tandem mass spectrum identification using peptide retention time prediction across diverse chromatography conditions" by Klammer AA, Yi X, MacCoss MJ and Noble WS. (Analytical Chemistry. 2007 Aug 15;79(16):6111-8.). Default = false.
- --only-psms T|F – Do not remove redundant peptides; keep all PSMs and exclude peptide level probability. Default = false.
- --post-processing-tdc T|F – Use target-decoy competition to assign q-values and PEPs. Default = false.
- --post-processing-qvality T|F – Replace the target-decoy competition with the method qvality to assign q-values and PEPs. Note that this option only has an effect if the input PSMs are from separate target and decoy searches. Default = false.
- --estimation-method mix-max|tdc|peptide-level – Specify the method used to estimate q-values: the mix-max procedure or target-decoy competition. The peptide-level method applies to spectrum-centric search and eliminates any PSMs for which there exists a better scoring PSM involving the same peptide. Default = tdc.
- --score <string> – Specify the column (for tab-delimited input) or tag (for XML input) used as input to the q-value estimation procedure. If this parameter is unspecified, then assign-confidence searches for the "xcorr score", "evalue" (comet), and "exact p-value" score fields, in this order, in the input file. Default = <empty>.
- --sidak T|F – Adjust the scores using the Sidak adjustment and report them in a new column in the output file. Note that this adjustment only makes sense if the given scores are p-values, and that it requires the presence of the "distinct matches/spectrum" feature for each PSM. Default = false.
- --combine-charge-states T|F – Set this parameter to T in order to combine charge states with peptide sequences in peptide-centric search. Works only if peptide-level=T. Default = false.
- --combine-modified-peptides T|F – Set this parameter to T in order to treat peptides carrying different or no modifications as being the same. Works only if peptide-level=T. Default = false.
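Two of the options above describe simple numeric transformations, sketched here in Python. `unit_normalize` follows the --unitnorm description (rescale a feature vector to Euclidean length 1). `sidak_adjust` assumes the standard Sidak formula, 1 - (1 - p)^n, with n taken from the "distinct matches/spectrum" feature; the text above does not spell out the formula, so treat that as an assumption. Both function names are hypothetical.

```python
import math


def unit_normalize(features):
    """Linearly rescale a feature vector to Euclidean length 1."""
    length = math.sqrt(sum(v * v for v in features))
    if length == 0.0:
        # Assumption: an all-zero vector is returned unchanged.
        return list(features)
    return [v / length for v in features]


def sidak_adjust(p_value, num_tests):
    """Standard Sidak adjustment of one p-value for num_tests candidates (assumed formula)."""
    return 1.0 - (1.0 - p_value) ** num_tests


print(unit_normalize([3.0, 4.0]))
print(sidak_adjust(0.01, 100))
```

With a single candidate the adjusted p-value equals the original one; it grows toward 1 as the number of distinct matches per spectrum increases.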
Identifying PPIDs in MS1 spectra
- --max-persist <float> – Ignore PPIDs that persist for longer than this length of time in the MS1 spectra. The unit of time is whatever unit is used in your data file (usually minutes). These PPIDs are considered contaminants. Default = 2.
- --persist-tolerance <float> – Set the mass tolerance (+/-ppm) for finding PPIDs in consecutive MS1 scans. Default = 10.
- --gap-tolerance <integer> – Allowed gap size when checking for PPIDs across consecutive MS1 scans. Default = 1.
- --scan-tolerance <integer> – Total number of MS1 scans over which a PPID must be observed to be considered real. Gaps in persistence are allowed by setting --gap-tolerance. Default = 3.
- --bullseye-max-mass <float> – Only consider PPIDs below this maximum mass in daltons. Default = 8000.
- --bullseye-min-mass <float> – Only consider PPIDs above this minimum mass in daltons. Default = 600.
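The interplay of --scan-tolerance and --gap-tolerance can be pictured with a small Python sketch. It encodes one plausible reading of the descriptions above: a PPID counts as real if it is observed in at least `scan_tolerance` MS1 scans, where consecutive observations may skip at most `gap_tolerance` scans. Bullseye's actual logic may differ, and the function names are hypothetical.

```python
def longest_run(scan_indices, gap_tolerance=1):
    """Number of observations in the longest run whose gaps never exceed gap_tolerance."""
    best = current = 0
    previous = None
    for index in sorted(set(scan_indices)):
        # Scans missed between two observations: index - previous - 1.
        if previous is not None and index - previous - 1 <= gap_tolerance:
            current += 1
        else:
            current = 1
        best = max(best, current)
        previous = index
    return best


def is_persistent(scan_indices, scan_tolerance=3, gap_tolerance=1):
    """True if the PPID is seen in enough MS1 scans under the assumed reading."""
    return longest_run(scan_indices, gap_tolerance) >= scan_tolerance


print(is_persistent([10, 11, 13]))  # one missed scan between 11 and 13
print(is_persistent([10, 13, 16]))  # two missed scans between observations
```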
Matching PPIDs to MS2 spectra
- --exact-match T|F – When true, require an exact match (as defined by --exact-tolerance) between the center of the precursor isolation window in the MS2 scan and the base isotopic peak of the PPID. If this option is set to false and no exact match is observed, then attempt to match using a wider m/z tolerance. This wider tolerance is calculated using the PPID's monoisotopic mass and charge (the higher the charge, the smaller the window). Default = false.
- --exact-tolerance <float> – Set the tolerance (+/-ppm) for --exact-match. Default = 10.
- --retention-tolerance <float> – Set the tolerance (+/-units) around the retention time over which a PPID can be matched to the MS2 spectrum. The unit of time is whatever unit is used in your data file (usually minutes). Default = 0.5.
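The two tolerance checks described above reduce to simple comparisons. The sketch below assumes the ppm error is measured relative to the PPID m/z and that the retention tolerance is a symmetric window around the PPID retention time; both are assumptions, and the function names are hypothetical.

```python
def within_ppm(observed_mz, reference_mz, tolerance_ppm=10.0):
    """True if observed_mz lies within +/- tolerance_ppm of reference_mz."""
    error_ppm = abs(observed_mz - reference_mz) / reference_mz * 1e6
    return error_ppm <= tolerance_ppm


def within_retention(ms2_time, ppid_time, tolerance=0.5):
    """True if the MS2 scan time lies within +/- tolerance of the PPID time."""
    return abs(ms2_time - ppid_time) <= tolerance


# Made-up values: an 8 ppm difference passes the default 10 ppm tolerance.
print(within_ppm(500.004, 500.0))
print(within_retention(20.4, 20.0))
```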
Search parameters
- --precursor-window <float> – Tolerance used for matching peptides to spectra. Peptides must be within +/- 'precursor-window' of the spectrum value. The precursor window units depend upon precursor-window-type. Default = 3.
- --precursor-window-type mass|mz|ppm – Specify the units for the window that is used to select peptides around the precursor mass location (mass, mz, ppm). The magnitude of the window is defined by the precursor-window option, and candidate peptides must fall within this window. For the mass window-type, the spectrum precursor m+h value is converted to mass, and the window is defined as that mass +/- precursor-window. If the m+h value is not available, then the mass is calculated from the precursor m/z and provided charge. The peptide mass is computed as the sum of the average amino acid masses plus 18 Da for the terminal OH group. The mz window-type calculates the window as spectrum precursor m/z +/- precursor-window and then converts the resulting m/z range to the peptide mass range using the precursor charge. For the parts-per-million (ppm) window-type, the spectrum mass is calculated as in the mass type. The lower bound of the mass window is then defined as the spectrum mass / (1.0 + (precursor-window / 1000000)) and the upper bound is defined as spectrum mass / (1.0 - (precursor-window / 1000000)). Default = mass.
- --spectrum-min-mz <float> – The lowest spectrum m/z to search in the ms2 file. Default = 0.
- --spectrum-max-mz <float> – The highest spectrum m/z to search in the ms2 file. Default = 1e+09.
- --min-peaks <integer> – The minimum number of peaks a spectrum must have for it to be searched. Default = 20.
- --spectrum-charge 1|2|3|all – The spectrum charges to search. With 'all' every spectrum will be searched and spectra with multiple charge states will be searched once at each charge state. With 1, 2, or 3 only spectra with that charge state will be searched. Default = all.
- --scan-number <string> – A single scan number or a range of numbers to be searched. Range should be specified as 'first-last' which will include scans 'first' and 'last'. Default = <empty>.
- --compute-sp T|F – Compute the preliminary score Sp for all candidate peptides. Report this score in the output, along with the corresponding rank, the number of matched ions and the total number of ions. This option is recommended if results are to be analyzed by Percolator or Barista. If sqt-output is enabled, then compute-sp is automatically enabled and cannot be overridden. Note that the Sp computation requires re-processing each observed spectrum, so turning on this switch involves significant computational overhead. Default = false.
- --remove-precursor-peak T|F – If true, all peaks around the precursor m/z will be removed, within a range specified by the --remove-precursor-tolerance option. Default = false.
- --remove-precursor-tolerance <float> – This parameter specifies the tolerance (in Th) around each precursor m/z that is removed when the --remove-precursor-peak option is invoked. Default = 1.5.
- --exact-p-value T|F – Enable the calculation of exact p-values for the XCorr score as described in this article. Calculation of p-values increases the running time but increases the number of identifications at a fixed confidence threshold. The p-values will be reported in a new column with header "exact p-value", and the "xcorr score" column will be replaced with a "refactored xcorr" column. Note that, currently, p-values can only be computed when the mz-bin-width parameter is set to its default value. Variable and static mods are allowed on non-terminal residues in conjunction with p-value computation, but currently only static mods are allowed on the N-terminus, and no mods on the C-terminus. Default = false.
- --use-neutral-loss-peaks T|F – Controls whether neutral loss ions are considered in the search. Two types of neutral losses are included and are applied only to singly charged b- and y-ions: loss of ammonia (NH3, 17.0086343 Da) and H2O (18.0091422). Each neutral loss peak has intensity 1/5 of the primary peak. Default = true.
- --use-flanking-peaks T|F – Include flanking peaks around singly charged b and y theoretical ions. Each flanking peak occurs in the adjacent m/z bin and has half the intensity of the primary peak. Default = false.
- --mz-bin-width <float> – Before calculation of the XCorr score, the m/z axes of the observed and theoretical spectra are discretized. This parameter specifies the size of each bin. The exact formula for computing the discretized m/z value is floor((x/mz-bin-width) + 1.0 - mz-bin-offset), where x is the observed m/z value. For low resolution ion trap ms/ms data 1.0005079 and for high resolution ms/ms 0.02 is recommended. Default = 1.0005079.
- --mz-bin-offset <float> – In the discretization of the m/z axes of the observed and theoretical spectra, this parameter specifies the location of the left edge of the first bin, relative to mass = 0 (i.e., mz-bin-offset = 0.xx means the left edge of the first bin will be located at +0.xx Da). Default = 0.4.
- --max-precursor-charge <integer> – The maximum charge state of a spectrum to consider in the search. Default = 5.
- --peptide-centric-search T|F – Carries out a peptide-centric search. For each peptide the top-scoring spectra are reported, in contrast to the standard spectrum-centric search where the top-scoring peptides are reported. Note that in this case the "xcorr rank" column will contain the rank of the given spectrum with respect to the given candidate peptide, rather than vice versa (which is the default). Default = false.
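The window bounds given for --precursor-window-type and the binning formula given for --mz-bin-width translate directly into code. In this Python sketch the mass and ppm branches follow the formulas stated above; the mz branch converts m/z to neutral mass as (m/z - proton mass) * charge, which is an assumption because the text does not give that conversion. Function names are hypothetical.

```python
import math

PROTON_MASS = 1.00727646688  # Da; used only by the assumed m/z-to-mass conversion


def precursor_mass_window(window_type, precursor_window, *,
                          mass=None, mz=None, charge=None):
    """Return (lower, upper) peptide-mass bounds for one spectrum."""
    if window_type == "mass":
        return mass - precursor_window, mass + precursor_window
    if window_type == "ppm":
        fraction = precursor_window / 1_000_000
        return mass / (1.0 + fraction), mass / (1.0 - fraction)
    if window_type == "mz":
        def to_mass(value):
            # Assumed conversion from m/z to neutral mass.
            return (value - PROTON_MASS) * charge
        return to_mass(mz - precursor_window), to_mass(mz + precursor_window)
    raise ValueError(f"unknown window type: {window_type}")


def mz_bin(x, mz_bin_width=1.0005079, mz_bin_offset=0.4):
    """Discretized m/z value: floor((x / mz-bin-width) + 1.0 - mz-bin-offset)."""
    return math.floor(x / mz_bin_width + 1.0 - mz_bin_offset)


print(precursor_mass_window("mass", 3.0, mass=1000.0))
print(precursor_mass_window("ppm", 10.0, mass=1000.0))
print(mz_bin(500.0))
```

Note that the ppm window is slightly asymmetric around the spectrum mass, because the lower bound divides by (1 + w) and the upper bound by (1 - w).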
Fido options
- --protein T|F – Use the Fido algorithm to infer protein probabilities. Must be true to use any of the Fido options. Default = false.
- --fido-alpha <float> – Specify the probability with which a present protein emits an associated peptide. Set by grid search (see --fido-gridsearch-depth parameter) if not specified. Default = 0.
- --fido-beta <float> – Specify the probability of the creation of a peptide from noise. Set by grid search (see --fido-gridsearch-depth parameter) if not specified. Default = 0.
- --fido-gamma <float> – Specify the prior probability that a protein is present in the sample. Set by grid search (see --fido-gridsearch-depth parameter) if not specified. Default = 0.
- --fido-protein-level-pi0 T|F – Use pi_0 value when calculating empirical q-values. Default = false.
- --fido-empirical-protein-q T|F – Estimate empirical p-values and q-values for proteins using target-decoy analysis. Default = false.
- --fido-gridsearch-depth <integer> – Set depth of the grid search for alpha, beta and gamma estimation. The values considered, for each possible value of the --fido-gridsearch-depth parameter, are as follows:
  - 0: alpha = {0.01, 0.04, 0.09, 0.16, 0.25, 0.36, 0.5}; beta = {0.0, 0.01, 0.15, 0.025, 0.035, 0.05, 0.1}; gamma = {0.1, 0.25, 0.5, 0.75}.
  - 1: alpha = {0.01, 0.04, 0.09, 0.16, 0.25, 0.36}; beta = {0.0, 0.01, 0.15, 0.025, 0.035, 0.05}; gamma = {0.1, 0.25, 0.5}.
  - 2: alpha = {0.01, 0.04, 0.16, 0.25, 0.36}; beta = {0.0, 0.01, 0.15, 0.030, 0.05}; gamma = {0.1, 0.5}.
  - 3: alpha = {0.01, 0.04, 0.16, 0.25, 0.36}; beta = {0.0, 0.01, 0.15, 0.030, 0.05}; gamma = {0.5}.
  Default = 0.
- --fido-gridsearch-mse-threshold <float> – Q-value threshold that will be used in the computation of the MSE and ROC AUC score in the grid search. Default = 0.05.
- --fido-fast-gridsearch <float> – Apply the specified threshold to PSM, peptide and protein probabilities to obtain a faster estimate of the alpha, beta and gamma parameters. Default = 0.
- --fido-protein-truncation-threshold <float> – To speed up inference, proteins for which none of the associated peptides has a probability exceeding the specified threshold will be assigned probability = 0. Default = 0.01.
- --fido-split-large-components T|F – Approximate the posterior distribution by allowing large graph components to be split into subgraphs. The splitting is done by duplicating peptides with low probabilities. Splitting continues until the number of possible configurations of each subgraph is below 2^18. Default = false.
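To give a sense of how --fido-gridsearch-depth trades coverage for speed, the following Python sketch enumerates the (alpha, beta, gamma) combinations for each depth. The values are copied verbatim from the lists above; the names `FIDO_GRIDS` and `grid_points` are hypothetical, and the scoring of each grid point is not shown.

```python
from itertools import product

# Values copied as listed under --fido-gridsearch-depth.
FIDO_GRIDS = {
    0: {"alpha": [0.01, 0.04, 0.09, 0.16, 0.25, 0.36, 0.5],
        "beta": [0.0, 0.01, 0.15, 0.025, 0.035, 0.05, 0.1],
        "gamma": [0.1, 0.25, 0.5, 0.75]},
    1: {"alpha": [0.01, 0.04, 0.09, 0.16, 0.25, 0.36],
        "beta": [0.0, 0.01, 0.15, 0.025, 0.035, 0.05],
        "gamma": [0.1, 0.25, 0.5]},
    2: {"alpha": [0.01, 0.04, 0.16, 0.25, 0.36],
        "beta": [0.0, 0.01, 0.15, 0.030, 0.05],
        "gamma": [0.1, 0.5]},
    3: {"alpha": [0.01, 0.04, 0.16, 0.25, 0.36],
        "beta": [0.0, 0.01, 0.15, 0.030, 0.05],
        "gamma": [0.5]},
}


def grid_points(depth):
    """All (alpha, beta, gamma) triples considered at the given depth."""
    grid = FIDO_GRIDS[depth]
    return list(product(grid["alpha"], grid["beta"], grid["gamma"]))


for depth in sorted(FIDO_GRIDS):
    print(depth, len(grid_points(depth)))
```

Under these lists the number of parameter combinations shrinks as the depth value increases, so higher values give a coarser but faster search.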
Database
- --decoy_search <integer> – 0=no, 1=concatenated search, 2=separate search. Default = 0.
CPU threads
- --num-threads <integer> – 0=poll CPU to set num threads; else specify num threads directly. Default = 0.
- --num_threads <integer> – 0=poll CPU to set num threads; else specify num threads directly. Default = 0.
Masses
- --peptide_mass_tolerance <float> – Controls the mass tolerance value. The mass tolerance is set at +/- the specified number, i.e., an entered value of "1.0" applies a -1.0 to +1.0 tolerance. The units of the mass tolerance are controlled by the parameter "peptide_mass_units". Default = 3.
- --peptide_mass_units <integer> – 0=amu, 1=mmu, 2=ppm. Default = 0.
- --mass_type_parent <integer> – 0=average masses, 1=monoisotopic masses. Default = 1.
- --mass_type_fragment <integer> – 0=average masses, 1=monoisotopic masses. Default = 1.
- --precursor_tolerance_type <integer> – 0=singly charged peptide mass, 1=precursor m/z. Default = 0.
- --isotope_error <integer> – 0=off, 1=on -1/0/1/2/3 (standard C13 error), 2=-8/-4/0/4/8 (for +4/+8 labeling). Default = 0.
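The relationship between peptide_mass_tolerance and peptide_mass_units can be sketched by converting a tolerance to daltons. This assumes that 1 mmu is 0.001 amu and that a ppm tolerance scales with the peptide mass; how Comet applies the tolerance internally is not described above, so this is an illustration only. The function name is hypothetical.

```python
def tolerance_in_da(tolerance, units, peptide_mass):
    """Convert a +/- tolerance to daltons for units 0=amu, 1=mmu, 2=ppm."""
    if units == 0:
        return tolerance
    if units == 1:
        return tolerance / 1000.0
    if units == 2:
        return peptide_mass * tolerance / 1e6
    raise ValueError(f"unknown peptide_mass_units value: {units}")


# Default settings (tolerance 3, units 0) give a +/- 3 Da window.
print(tolerance_in_da(3.0, 0, 1500.0))
print(tolerance_in_da(10.0, 2, 2000.0))
```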
Search enzyme
- --search_enzyme_number <integer> – Specify a search enzyme from the end of the parameter file. Default = 1.
- --num_enzyme_termini <integer> – valid values are 1 (semi-digested), 2 (fully digested), 8 N-term, 9 C-term. Default = 2.
- --allowed_missed_cleavage <integer> – Maximum value is 5; for enzyme search. Default = 2.
Fragment ions
- --fragment_bin_tol <float> – Binning to use on fragment ions. Default = 1.000507.
- --fragment_bin_offset <float> – Offset position to start the binning (0.0 to 1.0). Default = 0.4.
- --theoretical_fragment_ions <integer> – 0=default peak shape, 1=M peak only. Default = 1.
- --use_A_ions <integer> – Controls whether or not A-ions are considered in the search (0 - no, 1 - yes). Default = 0.
- --use_B_ions <integer> – Controls whether or not B-ions are considered in the search (0 - no, 1 - yes). Default = 1.
- --use_C_ions <integer> – Controls whether or not C-ions are considered in the search (0 - no, 1 - yes). Default = 0.
- --use_X_ions <integer> – Controls whether or not X-ions are considered in the search (0 - no, 1 - yes). Default = 0.
- --use_Y_ions <integer> – Controls whether or not Y-ions are considered in the search (0 - no, 1 - yes). Default = 1.
- --use_Z_ions <integer> – Controls whether or not Z-ions are considered in the search (0 - no, 1 - yes). Default = 0.
- --use_NL_ions <integer> – 0=no, 1= yes to consider NH3/H2O neutral loss peak. Default = 1.
mzXML/mzML parameters
- --scan_range <string> – Start and end scan range to search; 0 as first entry ignores parameter. Default = 0 0.
- --precursor_charge <string> – Precursor charge range to analyze; does not override mzXML charge; 0 as first entry ignores parameter. Default = 0 0.
- --override_charge <integer> – Specifies whether to override existing precursor charge state information, when present in the files, with the charge range specified by the "precursor_charge" parameter. Default = 0.
- --ms_level <integer> – MS level to analyze, valid are levels 2 or 3. Default = 2.
- --activation_method ALL|CID|ECD|ETD|PQD|HCD|IRMPD – Specifies which scan types are searched. Default = ALL.
Miscellaneous parameters
- --digest_mass_range <string> – MH+ peptide mass range to analyze. Default = 600.0 5000.0.
- --num_results <integer> – Number of search hits to store internally. Default = 50.
- --skip_researching <integer> – For '.out' file output only, 0=search everything again, 1=don't search if .out exists. Default = 1.
- --max_fragment_charge <integer> – Set maximum fragment charge state to analyze (allowed max 5). Default = 3.
- --max_precursor_charge <integer> – Set maximum precursor charge state to analyze (allowed max 9). Default = 6.
- --nucleotide_reading_frame <integer> – 0=proteinDB, 1-6, 7=forward three, 8=reverse three, 9=all six. Default = 0.
- --clip_nterm_methionine <integer> – 0=leave sequences as-is; 1=also consider sequence w/o N-term methionine. Default = 0.
- --spectrum_batch_size <integer> – Maximum number of spectra to search at a time; 0 to search the entire scan range in one loop. Default = 0.
- --decoy_prefix <string> – Specifies the prefix of the protein names that indicates a decoy. Default = decoy_.
- --output_suffix <string> – Specifies the suffix string that is appended to the base output name for the pep.xml, pin.xml, txt and sqt output files. Default = <empty>.
- --mass_offsets <string> – Specifies one or more mass offsets to apply. These values are effectively subtracted from each precursor mass such that peptides that are smaller than the precursor mass by the offset value can still be matched to the respective spectrum. Default = <empty>.
Spectral processing
- --minimum_peaks <integer> – Minimum number of peaks in spectrum to search. Default = 10.
- --minimum_intensity <float> – Minimum intensity value to read in. Default = 0.
- --remove_precursor_peak <integer> – 0=no, 1=yes, 2=all charge reduced precursor peaks (for ETD). Default = 0.
- --remove_precursor_tolerance <float> – +- Da tolerance for precursor removal. Default = 1.5.
- --clear_mz_range <string> – For iTRAQ/TMT type data; will clear out all peaks in the specified m/z range. Default = 0.0 0.0.
Variable modifications
- --variable_mod01 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod02 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod03 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod04 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod05 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod06 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod07 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod08 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --variable_mod09 <string> – Up to 9 variable modifications are supported; format: " <0=variable/1=binary> " e.g. 79.966331 STY 0 3. Default = 0.0 null 0 4 -1 0 0.
- --max_variable_mods_in_peptide <integer> – Specifies the total/maximum number of residues that can be modified in a peptide. Default = 5.
- --require_variable_mod <integer> – Controls whether the analyzed peptides must contain at least one variable modification. Default = 0.
Static modifications
- --add_Cterm_peptide <float> – Specify a static modification to the c-terminus of all peptides. Default = 0.
- --add_Nterm_peptide <float> – Specify a static modification to the n-terminus of all peptides. Default = 0.
- --add_Cterm_protein <float> – Specify a static modification to the c-terminal peptide of each protein. Default = 0.
- --add_Nterm_protein <float> – Specify a static modification to the n-terminal peptide of each protein. Default = 0.
- --add_A_alanine <float> – Specify a static modification to the residue A. Default = 0.
- --add_B_user_amino_acid <float> – Specify a static modification to the residue B. Default = 0.
- --add_C_cysteine <float> – Specify a static modification to the residue C. Default = 57.021464.
- --add_D_aspartic_acid <float> – Specify a static modification to the residue D. Default = 0.
- --add_E_glutamic_acid <float> – Specify a static modification to the residue E. Default = 0.
- --add_F_phenylalanine <float> – Specify a static modification to the residue F. Default = 0.
- --add_G_glycine <float> – Specify a static modification to the residue G. Default = 0.
- --add_H_histidine <float> – Specify a static modification to the residue H. Default = 0.
- --add_I_isoleucine <float> – Specify a static modification to the residue I. Default = 0.
- --add_J_user_amino_acid <float> – Specify a static modification to the residue J. Default = 0.
- --add_K_lysine <float> – Specify a static modification to the residue K. Default = 0.
- --add_L_leucine <float> – Specify a static modification to the residue L. Default = 0.
- --add_M_methionine <float> – Specify a static modification to the residue M. Default = 0.
- --add_N_asparagine <float> – Specify a static modification to the residue N. Default = 0.
- --add_O_ornithine <float> – Specify a static modification to the residue O. Default = 0.
- --add_P_proline <float> – Specify a static modification to the residue P. Default = 0.
- --add_Q_glutamine <float> – Specify a static modification to the residue Q. Default = 0.
- --add_R_arginine <float> – Specify a static modification to the residue R. Default = 0.
- --add_S_serine <float> – Specify a static modification to the residue S. Default = 0.
- --add_T_threonine <float> – Specify a static modification to the residue T. Default = 0.
- --add_U_selenocysteine <float> – Specify a static modification to the residue U. Default = 0.
- --add_V_valine <float> – Specify a static modification to the residue V. Default = 0.
- --add_W_tryptophan <float> – Specify a static modification to the residue W. Default = 0.
- --add_X_user_amino_acid <float> – Specify a static modification to the residue X. Default = 0.
- --add_Y_tyrosine <float> – Specify a static modification to the residue Y. Default = 0.
- --add_Z_user_amino_acid <float> – Specify a static modification to the residue Z. Default = 0.
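The effect of the static modification options on a peptide's mass can be illustrated by summing the per-residue additions. This Python sketch models only the residue-level options (add_X_...) and the two peptide-terminus options; the function name and the example peptide are hypothetical.

```python
def static_mod_delta(peptide, residue_mods, nterm_peptide=0.0, cterm_peptide=0.0):
    """Total mass added to a peptide by static modifications.

    residue_mods maps a one-letter residue code to its add_X_... value.
    """
    delta = sum(residue_mods.get(residue, 0.0) for residue in peptide)
    return delta + nterm_peptide + cterm_peptide


# The default add_C_cysteine value is applied once per cysteine.
defaults = {"C": 57.021464}
print(static_mod_delta("ACDC", defaults))
```

Because a static modification applies to every occurrence of the residue, a peptide with two cysteines gains twice the add_C_cysteine value.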
Input and output
- --fileroot <string> – The fileroot string will be added as a prefix to all output file names. Default = <empty>.
- --output-dir <string> – The name of the directory where output files will be created. Default = crux-output.
- --overwrite T|F – Replace existing files if true or fail when trying to overwrite a file if false. Default = false.
- --spectrum-format |ms2|bms2|cms2|mgf – The format to write the output spectra to. If empty, the spectra will be output in the same format as the MS2 input. Default = <empty>.
- --parameter-file <string> – A file containing parameters. See the parameter documentation page for details. Default = <empty>.
- --verbosity <integer> – Specify the verbosity of the current processes. Each level prints the following messages, including all those at lower verbosity levels: 0-fatal errors, 10-non-fatal errors, 20-warnings, 30-information on the progress of execution, 40-more progress information, 50-debug info, 60-detailed debug info. Default = 30.
- --top-match <integer> – Specify the number of matches to report for each spectrum. Default = 5.
- --store-spectra <string> – Specify the name of the file where the binarized fragmentation spectra will be stored. Subsequent runs of crux tide-search will execute more quickly if provided with the spectra in binary format. The filename is specified relative to the current working directory, not the Crux output directory (as specified by --output-dir). This option is not valid if multiple input spectrum files are given. Default = <empty>.
- --store-index <string> – When providing a FASTA file as the index, the generated binary index will be stored at the given path. This option has no effect if a binary index is provided as the index. Default = <empty>.
- --concat T|F – When set to T, target and decoy search results are reported in a single file, and only the top-scoring N matches (as specified via --top-match) are reported for each spectrum, irrespective of whether the matches involve target or decoy peptides. Default = false.
- --print-search-progress <integer> – Show search progress by printing every n spectra searched. Set to 0 to show no search progress. Default = 1000.
- --spectrum-parser pwiz|mstoolkit – Specify the parser to use for reading in MS/MS spectra. The default ProteoWizard parser can read any of the MS/MS file formats supported by ProteoWizard. The alternative is the MSToolkit parser. If the ProteoWizard parser fails to read your files properly, you may want to try the MSToolkit parser instead. Default = pwiz.
- --use-z-line T|F – Specify whether, when parsing an MS2 spectrum file, Crux obtains the precursor mass information from the "Z" line (true) or the "S" line (false). Default = true.
- --txt-output T|F – Output a tab-delimited results file to the output directory. Default = true.
- --sqt-output T|F – Output an SQT results file to the output directory. Note that if sqt-output is enabled, then compute-sp is automatically enabled and cannot be overridden. Default = false.
- --pepxml-output T|F – Output a pepXML results file to the output directory. Default = false.
- --mzid-output T|F – Output an mzIdentML results file to the output directory. Default = false.
- --pin-output T|F – Output a Percolator input (PIN) file to the output directory. Default = false.
- --output_sqtfile <integer> – Write an SQT file (0=no, 1=yes). Default = 0.
- --output_txtfile <integer> – Write a tab-delimited text file (0=no, 1=yes). Default = 1.
- --output_pepxmlfile <integer> – Write a pep.xml file (0=no, 1=yes). Default = 1.
- --output_percolatorfile <integer> – Write a Percolator file (0=no, 1=yes). Default = 0.
- --output_outfiles <integer> – Write .out files (0=no, 1=yes). Default = 0.
- --print_expect_score <integer> – Replace Sp with the expect score in .out and SQT files (0=no, 1=yes). Default = 1.
- --num_output_lines <integer> – The number of peptide results to show. Default = 5.
- --show_fragment_ions <integer> – Show fragment ions, for .out files only (0=no, 1=yes). Default = 0.
- --sample_enzyme_number <integer> – The sample enzyme, which may differ from the one applied to the search. Used to calculate NTT and NMC in pepXML output. Default = 1.
- --pout-output T|F – Output a Percolator pout.xml format results file to the output directory. Default = false.
- --feature-file-out T|F – Output the computed features in tab-delimited text format. Default = false.
- --list-of-files T|F – Specify that the search results are provided as lists of files, rather than as individual files. Default = false.
- --feature-file-in T|F – When set to T, interpret the input file as a PIN file. Default = false.
- --decoy-xml-output T|F – Include decoys (PSMs, peptides, and/or proteins) in the XML output. Default = false.
- --decoy-prefix <string> – Specify the prefix of protein names that indicates a decoy. Default = decoy_.
- --output-weights T|F – Output final weights to a file named "percolator.weights.txt". Default = false.
- --init-weights <string> – Read initial weights from the given file (one per line). Default = <empty>.
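To show how the input and output options combine with the usage line `crux pipeline [options] <mass spectra>+ <peptide source>`, here is a minimal sketch that assembles a command line without running it. The function `pipeline_command` and the file names `demo.ms2`, `proteins.fasta`, and `results` are hypothetical. The sketch assumes each option is passed as `--name value`, with T|F options written as `T` or `F`, and it enforces only one documented rule: `--store-spectra` is not valid when multiple spectrum files are given.

```python
import shlex


def pipeline_command(spectra, peptide_source, options=None):
    """Assemble the argument list for `crux pipeline` without running it.

    `options` maps option names exactly as documented (hyphens and underscores
    are both in use, so no renaming is attempted) to their values.
    """
    spectra = list(spectra)
    options = dict(options or {})
    if not spectra:
        raise ValueError("at least one spectrum file is required")
    # Documented restriction: --store-spectra is not valid with multiple inputs.
    if options.get("store-spectra") and len(spectra) > 1:
        raise ValueError("--store-spectra cannot be used with multiple spectrum files")

    argv = ["crux", "pipeline"]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "T" if value else "F"  # T|F options
        argv += [f"--{name}", str(value)]
    # Positional arguments come last: spectrum files, then the peptide source.
    return argv + spectra + [peptide_source]


if __name__ == "__main__":
    argv = pipeline_command(
        ["demo.ms2"],       # hypothetical spectrum file
        "proteins.fasta",   # hypothetical FASTA file
        {"output-dir": "results", "overwrite": True, "top-match": 1, "pin-output": True},
    )
    print(shlex.join(argv))
```

Option names are passed through verbatim because this list mixes hyphenated names such as `output-dir` with underscored names such as `output_sqtfile`. On a machine where Crux is installed, the returned list could be handed to `subprocess.run(argv, check=True)`.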
