SanXoTSieve
SanXoTSieve v0.14 is a program made in the Jesus Vazquez Cardiovascular Proteomics Lab at Centro Nacional de Investigaciones Cardiovasculares. It automatically removes lower-level outliers from an integration performed using the SanXoT integrator.
SanXoTSieve needs
- the two input files of a SanXoT integration (see SanXoT's help): the data file and the relations file, given with options -d and -r, respectively.
- the variance resulting from that integration, given either with option -V (read from the info file of the integration) or with option -v (entered directly).
... and delivers two output files:
- a new relations file (by default suffixed "_tagged"), identical to the original relations file except that the relations marked as outliers are tagged in the third column.
- the log file.
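As an illustration of how the tagged relations file might be consumed downstream, the following minimal Python sketch separates outlier relations from the rest by looking at the third column. It is not part of SanXoTSieve: the function name split_relations, the tab-separated layout, the comma-separated handling of multiple tags in one cell, and the sample rows are all assumptions made for the example. Only the default tag "out" and the use of the third column come from this help text.

```python
import csv
import io

def split_relations(lines, outlier_tag="out"):
    """Split relation rows into (kept, outliers) using the tag in column 3."""
    kept, outliers = [], []
    # Assumption: the relations file is tab-separated.
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        # The third column may be missing or empty for untagged relations.
        tags = row[2] if len(row) > 2 else ""
        # Assumption: several tags in one cell are comma-separated.
        tag_set = {t.strip() for t in tags.split(",") if t.strip()}
        target = outliers if outlier_tag in tag_set else kept
        target.append(row[:2])
    return kept, outliers

# Hypothetical rows: higher-level id, lower-level id, tag.
sample = io.StringIO(
    "protA\tpep1\t\n"
    "protA\tpep2\tout\n"
    "protB\tpep3\tmod\n"
)
kept, outliers = split_relations(sample)
```

With a real file, the same function could be fed an open file handle instead of the in-memory sample.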
Usage:
sanxotsieve.py -d[data file] -r[relations file] -V[info file] [OPTIONS]
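The -V option in the usage line takes the variance from a text file that contains a line of the form "Variance = [double]", appearing no more than once. The sketch below shows one way such a lookup could be written. It is an illustration only, not the program's actual parser: the function name read_variance, the regular expression, and the sample text are assumptions.

```python
import re

# Assumption: the number is written in plain or scientific notation.
VARIANCE_RE = re.compile(
    r"Variance\s*=\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
)

def read_variance(text):
    """Return the single variance value found in the info-file text."""
    found = []
    for line in text.splitlines():
        match = VARIANCE_RE.search(line)
        if match:
            found.append(match.group(1))
    # The help text requires the line to appear no more than once;
    # this sketch also rejects files where it is missing.
    if len(found) != 1:
        raise ValueError(
            "expected exactly one 'Variance = ...' line, found %d" % len(found)
        )
    return float(found[0])

# Hypothetical info-file content.
info_text = "Integration finished\nVariance = 0.0123\nDone\n"
variance = read_variance(info_text)
```

Rejecting repeated lines, rather than silently taking the first or last one, keeps an ambiguous info file from producing a wrong variance.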
Arguments:
-h, --help Display basic help and exit.
-H, --advanced-help Display this help and exit.
-a, --analysis=string
Use a prefix for the output files. If this is not
provided, then the prefix will be garnered from the data
file.
-b, --no-verbose Do not print result summary after executing.
-d, --datafile=filename
Data file with identifiers of the lower level in the
first column, measured values (x) in the second column,
and weights (v) in the third column.
-D, --removeduplicateupper
When merging data with relations table, remove duplicate
higher level elements (not removed by default).
-f, --fdrlimit=float
Use an FDR limit other than the default, 0.01 (1%).
-L, --infofile=filename
To use a non-default name for the log file.
-n, --newrelfile=filename
To use a non-default name for the relations file
containing the tagged outliers.
-o, --outlierrelfile=filename
To use a non-default name for the file of relations
responsible for outliers (note that outlier relations
are only saved when the --oldway option is active).
-p, --place, --folder=foldername
To use a different common folder for the output files.
If this is not provided, the folder used will be the
same as the input folder.
-r, --relfile, --relationsfile=filename
Relations file, with identifiers of the higher level
in the first column, and identifiers of the lower
level in the second column.
-u, --one-to-one Remove only one outlier per cycle. This is slightly more
accurate than the default mode (where the outermost
outlier of each category with outliers is removed in
each cycle), but usually exasperatingly slow.
-v, --var, --varianceseed=double
Variance used in the integration concerned.
Default is 0.001.
-V, --varfile=filename
Get the variance value from a text file. It must contain
exactly one line with the text
"Variance = [double]". This suits the info file from a
previous integration (see -L in SanXoT).
--oldway Do it the old way: instead of tagging, create two
separate relations files, with and without outliers.
--outliertag=string To select a non-default tag for outliers (default: out)
--tags=string To define a tag that selects the groups on which to
perform the integration. The tag can be used by
inclusion, such as
--tags="mod"
or by exclusion, by prefixing it with the "!" symbol, such as
--tags="!out"
Tags should be included in a third column of the
relations file. Note that the tag "!out" for outliers is
implicit.
Different tags can be combined using logical operators
"and" (&), "or" (|), and "not" (!), and parentheses.
Some examples:
--tags="!out&mod"
--tags="!out&(dig0|dig1)"
--tags="(!dig0&!dig1)|mod1"
--tags="mod1|mod2|mod3"
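To make the --tags syntax concrete, here is a minimal Python sketch that evaluates such an expression against the set of tags attached to one relation. It is not SanXoTSieve's own implementation. The function names tokenize and tags_match are hypothetical, and the operator precedence used ("!" binds tightest, then "&", then "|") is an assumption, since this help text does not state one. The sketch also does not add the implicit "!out" condition.

```python
import re

OPERATORS = {"(", ")", "&", "|", "!"}
TOKEN_RE = re.compile(r"\s*([()&|!]|[^()&|!\s]+)")

def tokenize(expr):
    """Split a tag expression into operators and tag names."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN_RE.match(expr, pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def tags_match(expr, row_tags):
    """Return True if the tags of one relation satisfy the expression."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "|":
            advance()
            rhs = parse_and()  # parse first, so tokens are always consumed
            value = value or rhs
        return value

    def parse_and():
        value = parse_unary()
        while peek() == "&":
            advance()
            rhs = parse_unary()
            value = value and rhs
        return value

    def parse_unary():
        if peek() == "!":
            advance()
            return not parse_unary()
        if peek() == "(":
            advance()
            value = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        token = advance()
        if token is None or token in OPERATORS:
            raise ValueError("expected a tag name")
        return token in row_tags

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % tokens[pos])
    return result

# Hypothetical relation carrying the single tag "dig1".
selected = tags_match("!out&(dig0|dig1)", {"dig1"})
```

Each right-hand operand is parsed before the boolean result is combined, so Python's short-circuiting cannot skip part of the expression and leave tokens unread.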




