iEnhancer-5Step

A web server for identifying enhancers and their strength using word embedding representation


Introduction

In genetics, an enhancer is a short (50–1500 bp) region of DNA that plays an important role in gene expression, that is, in the production of RNA and proteins. Enhancers are cis-acting and can be bound by proteins, frequently referred to as transcription factors, that increase the likelihood that a particular gene is transcribed. They can be located up to 1 Mbp (1,000,000 bp) away from the gene, or even on a different chromosome, upstream or downstream of the transcription start site. Enhancers are found in both eukaryotes and prokaryotes, and the human genome contains a large number of them. Genetic variation in enhancers has been linked to many human diseases, such as cancer and inflammatory bowel disease.

Because of their importance in genomics, the identification and classification of enhancers is a well-known problem in biological research. Our idea is to transform enhancer sequences into vectors using word embedding and then classify them with an effective neural network. Our implementation used:

With our web server, users can identify enhancers and their strength easily, without any computational knowledge. They only need to provide a sequence and submit it to our server; we then process it efficiently and return the results. Our workflow for processing an input sequence is as follows:

Dataset

If you would like to build your own model or evaluate ours, please follow the links below to download the benchmark dataset.

Cross-validation dataset
Independent dataset

Submission

To avoid errors, please submit your sequences in FASTA format (example FASTA files are provided). There are two ways to submit: paste the sequences into the text area, or upload a sequence file. You can submit a single FASTA file or multiple FASTA files. The result page shows, for each sequence, whether it is predicted to be an enhancer, along with its strength.

Example DNA sequences
>CHR12_6645339_6645539
GTGGCATAGTGGGGTGGTGAATACCATGTACAAAGCTTGTGCCCAGACTGTGGGTGGCAG
TGCCCCACATGGCCGCTTCTCCTGGAAGGGCTTCGTATGACTGGGGGTGTTGGGCAGCCC
TGGAGCCTTCAGTTGCAGCCATGCCTTAAGCCAGGCCAGCCTGGCAGGGAAGCTCAAGGG
AGATAAAATTCAACCTCTTG
>CHRX_36223479_36223679
TACAAATTTGTTAAAGAGTGGTAATCTGATAAAGATAATAAGTGCACTTTGAGGAAGGTG
AGCTCTTGCTTTGAGAGTGAACTTGGTTATGGAAGTTGTGCTGAAGTTGAGGCTTAGGCT
GGCTCTAAAAGAAAGGAAAAATGTTGATGGCTAGAGGTTATCAGAGGAGAAGAATGTGTT
ATTGGTGAGTAACTTATCTT
>CHR12_46780933_46781133
TAAAAATTTTCCCATTTCATTTATACTCCAGTTTCAGTTCCGTTGATTCAAAGTTTTGGG
ACTTAGGACAAGTTTGCTAGCTTCTCTGAGCCTCAGTTTTTTAAACCGTTAAATAGGAAT
AACAGCATGCTGAGTGCCAAGAATTAAAGAAAAATTTATGTGAAAATATAGATTAGAAAG
AAAACTAATATAAATGCAGG
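
Before submitting, it can help to check locally that a file is well-formed FASTA. The following is a minimal Python sketch of such a check and not the server's own validation code. The function names `parse_fasta` and `check_dna` are hypothetical, and the rule that only the bases A, C, G and T are accepted is an assumption rather than a documented requirement of the server.

```python
from typing import Dict, List

# Assumption: only unambiguous DNA bases are accepted.
VALID_BASES = set("ACGT")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} mapping."""
    records: Dict[str, List[str]] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()  # drops trailing spaces and line endings
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError("empty FASTA header")
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = []
        else:
            if header is None:
                raise ValueError("sequence data before the first '>' header")
            records[header].append(line.upper())
    # Join the wrapped lines of each record into one sequence string.
    return {h: "".join(parts) for h, parts in records.items()}


def check_dna(records: Dict[str, str]) -> List[str]:
    """Return headers whose sequence is empty or contains non-ACGT characters."""
    return [h for h, s in records.items() if not s or set(s) - VALID_BASES]


if __name__ == "__main__":
    # Short made-up record, used only to show the calls.
    demo = ">demo_seq\nGTGGCATAGT\nGGGGTGGTGA\n"
    parsed = parse_fasta(demo)
    for name, seq in parsed.items():
        print(name, len(seq), "bp")
    print("problem records:", check_dna(parsed))
```

Pasting the example sequences above into a file and passing its contents to `parse_fasta` should yield one entry per `>` header, with the wrapped lines joined into a single sequence.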

Contact us


Department of Computer Science and Engineering
Graduate Program in Biomedical Informatics
Bioinformatics Laboratory (R1607B)
Address: No. 135, Yuandong Road, Chungli City, Taoyuan County, Taiwan R.O.C. 32003
Tel: (03) 463-8800

If you encounter any problems or have suggestions for our website, feel free to contact us via email: [email protected]