CNN model for predicting enhancers and silencers

How to use

Generating data

prerequisite: a whole-genome sequence file in FASTA format (for example hg19.fa, downloaded from the UCSC Genome Browser; it is too large to be included here).

    generate_data.py enhancers.bed silencers.bed controls.bed outputdata genome.fasta

example:

    generate_data.py ./examples/tempEN.bed ./examples/tempSL.bed ./examples/tempBK.bed temp.data.hdf5 hg19.fa
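The internals of generate_data.py are not described here, so the following is only a minimal sketch of one plausible first step: reading a BED record and deriving a fixed-length window around the region midpoint. The function names, the midpoint centering, and the 1000 bp default (taken from the sequence length stated in the training section) are assumptions for illustration, not the script's confirmed behavior.

```python
def parse_bed_line(line):
    """Return (chrom, start, end) from one BED line (0-based, half-open)."""
    fields = line.split()  # BED columns are tab- or whitespace-separated
    return fields[0], int(fields[1]), int(fields[2])


def centered_window(start, end, length=1000):
    """Return a fixed-length window centered on the region midpoint.

    Hypothetical helper: the window is clamped at the chromosome start.
    """
    mid = (start + end) // 2
    win_start = max(0, mid - length // 2)
    return win_start, win_start + length


chrom, start, end = parse_bed_line("chr1\t10500\t10900\tpeak1\n")
print(chrom, centered_window(start, end))
```

The resulting coordinates would then be used to slice the matching chromosome sequence out of the genome FASTA file.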

Training a model

input: a data file and a target directory for the results

    A data file contains a training sample set and a validation sample set.
    Each sample set is represented by:
       a one-hot-encoded DNA sequence matrix of size N * 1000 * 4,
       a 3-class label matrix of size N * 3,
    where N is the number of samples.
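To make the two matrix shapes concrete, here is a small sketch that builds them with plain Python lists. The column order A, C, G, T, the class order enhancer, silencer, control (borrowed from the argument order of generate_data.py), and the all-zero encoding of N or padding are assumptions; the repository may use different conventions.

```python
BASES = "ACGT"  # assumed column order of the one-hot matrix
CLASSES = ("enhancer", "silencer", "control")  # assumed class order


def one_hot_sequence(seq, length=1000):
    """Encode a DNA string as a length x 4 list of 0/1 rows."""
    seq = seq.upper()[:length].ljust(length, "N")  # trim, or pad with N
    # Any base outside ACGT (such as N) becomes an all-zero row.
    return [[1 if base == b else 0 for b in BASES] for base in seq]


def one_hot_label(name):
    """Encode a class name as a 3-element 0/1 row."""
    return [1 if name == c else 0 for c in CLASSES]


samples = [("ACGT" * 250, "enhancer"), ("TTGACA" * 100, "control")]
X = [one_hot_sequence(seq) for seq, _ in samples]   # N x 1000 x 4
y = [one_hot_label(label) for _, label in samples]  # N x 3
print(len(X), len(X[0]), len(X[0][0]), y)
```

In practice these nested lists would be converted to arrays and stored in the HDF5 data file, one pair for training and one for validation.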

example: python train.py ./examples/data_training.hdf5 ./examples/

output: in the directory ./examples/

    auc.txt

    fpr_threshold_scores.txt
    
    model_weights.hdf5
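The layout of auc.txt and fpr_threshold_scores.txt is not documented here. Judging only by the file names, they probably hold an area under the ROC curve and score cutoffs at chosen false-positive rates. The sketch below shows how such quantities can be computed for one class against the rest; the function names and the exact definitions are assumptions, not the code used by train.py.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_at_fpr(labels, scores, max_fpr):
    """Lowest cutoff whose false-positive rate stays <= max_fpr.

    A sample is called positive when its score is strictly above the cutoff.
    """
    neg = sorted((s for y, s in zip(labels, scores) if y == 0), reverse=True)
    allowed = int(max_fpr * len(neg))  # negatives allowed above the cutoff
    return neg[allowed] if allowed < len(neg) else float("-inf")


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(labels, scores), threshold_at_fpr(labels, scores, 0.34))
```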

Making predictions with a built model

input: data file, model file

example: python predict.py ./examples/data_prediction.hdf5 ./examples/model_weights.hdf5

output: ./examples/data_prediction.hdf5.predict.data
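The format of the prediction file is not specified here. Assuming it holds one row of three class scores per sample, in the order enhancer, silencer, control (an assumption), each row could be turned into a class call as in this sketch; call_class is a hypothetical helper.

```python
CLASSES = ("enhancer", "silencer", "control")  # assumed class order


def call_class(probabilities):
    """Return the highest-scoring class name and its score."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASSES[best], probabilities[best]


predictions = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]  # made-up example rows
for row in predictions:
    print(call_class(row))
```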

