Vectools increases the development speed and reproducibility of bioinformatics analyses by offering a high-quality
alternative to writing custom scripts.
cat f1.tsv
A 1
B 2
cat f2.tsv
A 3
B 4
cat f3.tsv
A 5
B 6
cat i1.tsv
C 10 11 12
D 13 14 15
vectools join -s 1 f*.tsv | \
vectools add -r - i1.tsv
A 11 14 17
B 15 18 21
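To make the arithmetic behind the pipeline above explicit, the following is a minimal plain-Python sketch of the two steps: joining the files on their first column, then adding a second matrix value by value. It does not use or reproduce the Vectools implementation. The helper names join_on_key and add_rowwise are hypothetical, and matching rows by position rather than by row title is an assumption inferred from the example output (A is paired with C, B with D).

```python
def join_on_key(tables):
    """Join row lists on their first column, keeping keys present in every table."""
    joined = {row[0]: list(row[1:]) for row in tables[0]}
    for table in tables[1:]:
        lookup = {row[0]: list(row[1:]) for row in table}
        # Inner join: keys missing from this table are dropped.
        joined = {k: v + lookup[k] for k, v in joined.items() if k in lookup}
    return [[k] + v for k, v in joined.items()]


def add_rowwise(base, other):
    """Add values position by position; the row titles of `base` are kept.

    Assumption: rows are paired by position, not by title.
    """
    return [
        [b[0]] + [x + y for x, y in zip(b[1:], o[1:])]
        for b, o in zip(base, other)
    ]


# Contents of f1.tsv, f2.tsv, f3.tsv and i1.tsv from the example.
f1 = [["A", 1], ["B", 2]]
f2 = [["A", 3], ["B", 4]]
f3 = [["A", 5], ["B", 6]]
i1 = [["C", 10, 11, 12], ["D", 13, 14, 15]]

joined = join_on_key([f1, f2, f3])
result = add_rowwise(joined, i1)
for row in result:
    print(*row)
```

The printed rows match the output shown in the example.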
# Calculate dipeptide composition of positive and negative sets.
vectools ncomp --kmer-len 2 pos.faa > pos.vec
vectools ncomp --kmer-len 2 neg.faa > neg.vec
# Find best parameters via grid search, k-fold testing,
# and independent set testing. Then build SVM model.
vectools svmtrain \
--folds 5 --kernel rbf \
--best-metrics ca.best_stats \
--model ca.model \
pos.vec neg.vec
# Get dipeptide composition from multi-fasta of unknowns.
vectools ncomp -r --kmer-len 2 unknowns.faa > unknowns.vec
# Predict classes of unknowns.
vectools svmclassify -r --model ca.model unknowns.vec > preds.tsv
#class class_ID score
0 pos.vec 0.42779
1 neg.vec -0.30745
0 pos.vec 0.16307
....
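The feature vectors used in the workflow above are dipeptide compositions, that is, k-mer compositions with k = 2. As an illustration of the idea only, here is a minimal plain-Python sketch. The function name kmer_composition is hypothetical, and two details are assumptions rather than documented ncomp behavior: the vector is normalized to relative frequencies, and windows containing non-standard residues are ignored.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in alphabetical order of their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k=2, alphabet=AMINO_ACIDS):
    """Return the relative frequency of every k-mer over `alphabet`.

    The vector has len(alphabet) ** k entries, ordered as AA, AC, AD, ...
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Assumption: only windows made of standard residues contribute.
    total = sum(counts[kmer] for kmer in kmers)
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


vector = kmer_composition("MKKLAKK")
print(len(vector))
```

For a protein alphabet of 20 residues and k = 2, each sequence is mapped to a vector of 400 values, which gives sequences of different lengths a fixed-size representation for the classifier.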
Extensive Documentation
usage: vectools slice [-h] [--keep-cols [KEEP_COLS]]
[--remove-cols [REMOVE_COLS]] [-c] [-r] [-d [DELIMITER]]
[--roundto ROUNDTO]
[matrices [matrices ...]]
positional arguments:
matrices Matrices to add to a base matrix.
optional arguments:
-h, --help show this help message and exit
--keep-cols [KEEP_COLS] The columns which should be kept (comma-separated).
--remove-cols [REMOVE_COLS] The columns to remove (comma-separated). Omitted if --keep-cols is present.
-c, --column-titles The matrix has column titles.
-r, --row-titles The matrix has row titles.
-d [DELIMITER], --delimiter [DELIMITER] The character separating columns. default: TAB
--roundto ROUNDTO Round to n decimal places.
This function slices a matrix by column.
Specify the columns to keep (--keep-cols) or to remove (--remove-cols)
as a comma-separated list such as 1,3,7 or 1,4:7,9:
See the chop function if you want to remove rows.
#Examples:
$ cat matrix.tsv
id,c,d,e
a,1,2,3
b,3,4,5
$ vectools slice --keep-cols 0,2 --delimiter , --row-titles --column-titles matrix.tsv
id,c,e
a,1,3
b,3,5
$ vectools slice --remove-cols 1:2 --delimiter , --column-titles matrix.tsv
id,e
a,3
b,5
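The column specification used by slice (single indices, ranges such as 4:7, and open-ended ranges such as 9:) can be illustrated with a short plain-Python sketch. This is not the Vectools source; parse_columns and slice_rows are hypothetical names. Zero-based indices and inclusive range ends are assumptions inferred from the examples above, where --remove-cols 1:2 removes both column 1 and column 2.

```python
def parse_columns(spec, n_cols):
    """Expand a spec such as '1,4:7,9:' into a sorted list of column indices."""
    selected = set()
    for part in spec.split(","):
        if ":" in part:
            start, end = part.split(":")
            first = int(start) if start else 0
            # An open end (e.g. '9:') runs to the last column.
            last = int(end) if end else n_cols - 1
            # Assumption: range ends are inclusive, as in the example.
            selected.update(range(first, last + 1))
        else:
            selected.add(int(part))
    return sorted(i for i in selected if 0 <= i < n_cols)


def slice_rows(rows, keep=None, remove=None):
    """Keep or remove columns; `remove` is ignored when `keep` is given."""
    n_cols = len(rows[0])
    if keep is not None:
        cols = parse_columns(keep, n_cols)
    else:
        dropped = set(parse_columns(remove, n_cols))
        cols = [i for i in range(n_cols) if i not in dropped]
    return [[row[i] for i in cols] for row in rows]


# Contents of matrix.tsv from the example, split on the ',' delimiter.
rows = [line.split(",") for line in ["id,c,d,e", "a,1,2,3", "b,3,4,5"]]
for row in slice_rows(rows, remove="1:2"):
    print(",".join(row))
```

With --row-titles, the first field of each line would be set aside before slicing, so that index 0 refers to the first data column, as in the --keep-cols 0,2 example.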
Extensive Unit and Integration Testing
Scenario: run params test for max r # features/analysis.feature:134
Given the file test containing # features/steps/matrix_helper.py:81 0.000s
"""
0,1,2,3,4,5
1,2,3,4,5,6
2,3,4,5,6,7
3,4,5,6,7,8
"""
Given file test as parameter # features/steps/matrix_helper.py:104 0.000s
Given parameter --delimiter = , # features/steps/matrix_helper.py:11 0.000s
Given last parameter --row-titles # features/steps/matrix_helper.py:73 0.000s
When we run minimum from analysis with tmpfile # features/steps/analysis.py:114 0.001s
Then we expect the matrix # features/steps/matrix_helper.py:53 0.000s
"""
1,2,3,4,5
"""