input: a data file and a target directory for the results
The data file contains the training and validation sample sets.
Each sample set is represented by:
a one-hot-encoded DNA sequence matrix of size N * 1000 * 4,
a 3-class label matrix of size N * 3, where N is the number of samples.
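The sample-set layout described above can be illustrated with a minimal sketch. It builds arrays of the stated shapes from random sequences. The A/C/G/T column order, the one-hot form of the class matrix, and the helper names `one_hot_encode` and `build_sample_set` are assumptions made for illustration; they are not taken from train.py. How the arrays are stored inside the HDF5 file is not covered here.

```python
import numpy as np

# Assumed column order; the original text does not state it.
BASES = "ACGT"
SEQ_LEN = 1000   # sequence length given in the description
N_CLASSES = 3    # number of classes given in the description


def one_hot_encode(seq, length=SEQ_LEN):
    """Return a (length, 4) one-hot matrix; unknown bases such as N stay all-zero."""
    if len(seq) != length:
        raise ValueError(f"expected a sequence of length {length}, got {len(seq)}")
    mat = np.zeros((length, len(BASES)), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat


def build_sample_set(seqs, class_ids):
    """Stack sequences into an N x 1000 x 4 array and labels into an N x 3 array."""
    x = np.stack([one_hot_encode(s) for s in seqs])
    # Assumption: one row per sample with a single 1 in the column of its class.
    y = np.eye(N_CLASSES, dtype=np.float32)[np.asarray(class_ids)]
    return x, y


# Toy data: five random sequences with arbitrary class ids.
rng = np.random.default_rng(0)
seqs = ["".join(rng.choice(list(BASES), size=SEQ_LEN)) for _ in range(5)]
class_ids = [0, 1, 2, 1, 0]

x, y = build_sample_set(seqs, class_ids)
print(x.shape, y.shape)  # (5, 1000, 4) (5, 3)
```

Checking these two shapes before writing a data file is a cheap way to catch sequences of the wrong length or a mismatched class count.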
example: python train.py ./examples/data_training.hdf5 ./examples/
output: written to the directory ./examples/
auc.txt
fpr_threshold_scores.txt
model_weights.hdf5
input: a data file and a model file
example: python train.py ./examples/data_prediction.hdf5 ./examples/model_weights.hdf5
output: ./examples/data_prediction.hdf5.predict.data
%pip install Bio
%pip install pybedtools
!python /content/SilencerEnhancerPredict/train.py /content/SilencerEnhancerPredict/examples/training_200seq_2class.hdf5 /content/SilencerEnhancerPredict/examples/
!python /content/SilencerEnhancerPredict/train.py /content/SilencerEnhancerPredict/examples/training_200seq_2class.hdf5 /content/SilencerEnhancerPredict/examples/model_weights.hdf5