FastETC

A web server for identifying electron transport proteins
using efficient learning of word representations

Introduction

FastETC (Fast Electron Transport Classification) is a web server for exploring electron transport proteins, which are essential for life because electron transfer underpins bioenergetics and other cellular processes. You can submit your own sequences here to predict whether they are electron transport proteins.

Electrons captured from donor molecules are transferred through the complexes of the electron transport chain. These complexes are organized into Complex I, Complex II, Complex III, Complex IV, and ATP synthase (sometimes called Complex V). Each complex includes numerous specific electron carriers with different molecular functions. At the mitochondrial inner membrane, electrons from nicotinamide adenine dinucleotide (NADH) and succinate pass through the electron transport chain to oxygen. The best-known molecular functions of Complex I and Complex II are NADH dehydrogenase and succinate dehydrogenase, respectively. Electrons pass from Complex I to a carrier (coenzyme Q) that is embedded in the membrane. From coenzyme Q, electrons are handed to Complex III (the cytochrome bc1 complex). From Complex III, the pathway continues to cytochrome c and then to Complex IV (the cytochrome oxidase complex). Finally, the proton electrochemical gradient allows ATP synthase to use the flow of H+ to generate ATP. Identifying electron transport proteins is therefore vital for helping biologists understand the electron transport chain and energy production in cells.

The web interface is designed to be user-friendly, so that users can easily access and use its functions. We optimized the performance of the web server with several techniques to give a good experience for all users. With this web server, biologists can discover new electron transport proteins and better understand the operating mechanism of electron transport chains.

Method

We propose a novel approach that uses word representations to identify electron transport proteins among transport proteins with high performance. The flowchart of the study is shown below.

All training and evaluation of the model was done with the FastText library.
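As a rough illustration of how protein sequences can be fed to FastText as "words", the sketch below splits each sequence into overlapping k-mers and writes them in the `__label__` format that fastText's supervised mode expects. The k-mer length, the label names, the file paths, and the hyperparameters are assumptions made for this example and are not the settings used for FastETC.

```python
from typing import Iterable, List, Tuple


def to_words(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers ("words").

    k = 3 is an illustrative choice, not the value used by FastETC.
    """
    seq = sequence.strip().upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def to_fasttext_line(sequence: str, label: str, k: int = 3) -> str:
    """Format one labelled sequence as a fastText supervised training line."""
    return f"__label__{label} " + " ".join(to_words(sequence, k))


def write_training_file(records: Iterable[Tuple[str, str]], path: str, k: int = 3) -> None:
    """Write (sequence, label) pairs to a text file, one example per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for sequence, label in records:
            handle.write(to_fasttext_line(sequence, label, k) + "\n")


def train_and_evaluate(train_path: str, test_path: str):
    """Train a supervised fastText model and evaluate it on a held-out file.

    Requires the third-party `fasttext` package. The hyperparameters below
    are placeholders chosen for the example only.
    """
    import fasttext  # imported lazily so the helpers above work without it

    model = fasttext.train_supervised(input=train_path, epoch=25, lr=0.5, wordNgrams=2)
    n_samples, precision, recall = model.test(test_path)
    return model, n_samples, precision, recall
```

A new sequence would then be converted with the same `to_words` helper before being passed to `model.predict`, so that training and prediction see the same vocabulary.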

Dataset

All datasets used in this web server were retrieved from UniProt. The details of the dataset are listed in the table below.

                              Original   After removing overlaps   Cross-validation   Independent
Transport proteins              74,759                    53,686             44,739         8,947
Electron transport proteins     12,832                    12,571             10,476         2,095
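The counts in the table can be sanity-checked with a few lines of arithmetic. The sketch below confirms that the cross-validation and independent sets add up to the non-overlapping total for each class, and computes the share held out for independent testing. The dictionary keys are illustrative names chosen for this example.

```python
# Counts copied from the dataset table; the key names are hypothetical.
counts = {
    "transport": {
        "original": 74759,
        "non_overlapping": 53686,
        "cross_validation": 44739,
        "independent": 8947,
    },
    "electron_transport": {
        "original": 12832,
        "non_overlapping": 12571,
        "cross_validation": 10476,
        "independent": 2095,
    },
}


def split_is_consistent(row: dict) -> bool:
    """True if cross-validation + independent equals the non-overlapping total."""
    return row["cross_validation"] + row["independent"] == row["non_overlapping"]


def independent_fraction(row: dict) -> float:
    """Share of the non-overlapping sequences held out for independent testing."""
    return row["independent"] / row["non_overlapping"]


def removed_as_overlapping(row: dict) -> int:
    """Number of sequences dropped in the overlap-removal step."""
    return row["original"] - row["non_overlapping"]


if __name__ == "__main__":
    for name, row in counts.items():
        print(name, split_is_consistent(row),
              round(independent_fraction(row), 4), removed_as_overlapping(row))
```

For both classes the independent set is roughly one sixth of the non-overlapping sequences, so the two classes are split in the same proportion.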

If you would like to build your own model or evaluate ours, the dataset is available at the link below.

Download dataset.zip

Submission

To avoid errors, please submit sequences in FASTA format (FASTA example files are also provided). There are two ways to submit: paste the sequences into the text area, or upload a sequence file. You can submit a single FASTA file or multiple FASTA files. The result page shows, for each sequence, whether it is predicted to be an electron transport protein.

Sample FASTA sequence(s)

>sp|Q719N1|SPAST_PIG
MNSPGGRGKKKGSGGPSSPVPPRPPPPCQARSRPAPKPAPPPQSPHKRNLYYFSYPLFLG
FALLRLVAFHLGLLFVWLCQRFSRALMAAKRSSGAAPASASPPAPVPGGEAERVRAFHKQ
AFEYISVALRIDEDEKVGQKDQAVEWYKKGIEELEKGIAVVVTGQGEQCERARRLQAKMM
TNLVMAKDRLQLLEKLQPSLQFSKSQTDVYNDSTNLTCRNGHLQSESGAVPKRKDPLTHA
SNSLPRSKTVMKTGPTGLSGHHRAPSCSGLSMVSGVRQGPGSAAATHKSTPKTNRTNKPS
TPTTAARKKKDLKNFRNVDSNLANLIMNEIVDNGTAVKFDDIAGQELAKQALQEIVILPS
LRPELFTGLRAPARGLLLFGPPGNGKTMLAKAVAAESNATFFNISAASLTSKYVGEGEKL
VRALFAVARELQPSIIFIDEVDSLLCERREGEHDASRRLKTEFLIEFDGVQSAGDDRVLV
MGATNRPQELDEAVLRRFTKRVYVSLPNEETRLLLLKNLLCKQGSPLTQKELAQLARMTN
GYSGSDLTALAKDAALGPIRELKPEQVKNMSASEMRNIRLSDFTESLKKIKRSVSPQTLE
AYIRWNKDFGDTTV
>sp|P42115|THIO_NEUCR
MSDGVKHINSAQEFANLLNTTQYVVADFYADWCGPCKAIAPMYAQFAKTFSIPNFLAFAK
INVDSVQQVAQHYRVSAMPTFLFFKNGKQVAVNGSVMIQGADVNSLRAAAEKMGRLAKEK
AAAAGSS
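Before submitting, it can help to check locally that your input is well-formed FASTA like the sample above. The following is a minimal sketch of such a check and is not the validation performed by the FastETC server. It assumes that only the 20 standard amino acid letters are acceptable, and the function names are hypothetical.

```python
from typing import Dict, List

# Assumption: only the 20 standard amino acids are treated as valid residues.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError("empty FASTA header")
            if header in records:
                raise ValueError(f"duplicate FASTA header: {header}")
            records[header] = ""
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            records[header] += line.upper()
    return records


def invalid_residues(sequence: str) -> List[str]:
    """Return the sorted list of characters that are not standard amino acids."""
    return sorted(set(sequence) - VALID_RESIDUES)


if __name__ == "__main__":
    # Second record of the sample shown on this page.
    sample = """>sp|P42115|THIO_NEUCR
MSDGVKHINSAQEFANLLNTTQYVVADFYADWCGPCKAIAPMYAQFAKTFSIPNFLAFAK
INVDSVQQVAQHYRVSAMPTFLFFKNGKQVAVNGSVMIQGADVNSLRAAAEKMGRLAKEK
AAAAGSS
"""
    for name, seq in parse_fasta(sample).items():
        print(name, len(seq), invalid_residues(seq))
```

Sequences that contain ambiguity codes such as X or B would be flagged by `invalid_residues`. Whether the server accepts such codes is not stated here, so treat a flagged residue as a prompt to double-check the input rather than as a definite error.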

Members

Yu-Yen Ou Assistant Professor

Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.

Quang-Thai Ho Research Scholar

Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.

Nguyen-Quoc-Khanh Le Research Scholar

Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.

Trinh-Trung-Duong Nguyen Research Scholar

Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.

Contact us

Yuan Ze University
Department of Computer Science and Engineering
Graduate Program in Biomedical Informatics
Bioinformatics Laboratory (R1607B)
Address: No. 135, Yuandong Road, Chungli City, Taoyuan County, Taiwan, R.O.C. 32003
Tel: (03) 463-8800

If you have any problems or suggestions for our website, feel free to contact us by email: [email protected]