A web server for identifying electron transport proteins
using efficient learning of word representations
Introduction
FastETC (Fast Electron Transport Classification) is a database created to explore the universe of electron transport
proteins, which are essential for life because electron transfer is central to bioenergetics and other processes.
Here you can submit your own sequences to predict whether they are electron transport proteins.
In the electron transport chain, electrons captured from donor molecules are transferred through a series of protein
complexes. These complexes are organized into Complex I, Complex II, Complex III, Complex IV, and ATP synthase (sometimes
called Complex V). Each complex includes numerous specific electron carriers with different molecular functions. At the
mitochondrial inner membrane, electrons from nicotinamide adenine dinucleotide (NADH) and succinate pass through the
electron transport chain to oxygen. The best-known molecular functions of complex I and complex II are NADH dehydrogenase
and succinate dehydrogenase, respectively. Electrons pass from complex I to a carrier (coenzyme Q) that is embedded in
the membrane. From coenzyme Q, electrons are handed to complex III (the cytochrome bc1 complex). From complex III, the
pathway continues through cytochrome c to complex IV (the cytochrome oxidase complex). Finally, the proton
electrochemical gradient allows ATP synthase to use the flow of H+ to generate ATP. Identifying electron transport
proteins is therefore vital for helping biologists understand the electron transport chain and energy production in cells.
The web interface is designed to be user-friendly, so that users can easily access its functions and use them comfortably.
We optimized the performance of the web server with several techniques to give all users the best experience. With
this web server, biologists can discover new electron transport proteins and understand more clearly the operating
mechanism of electron transport chains.
Method
We propose a novel approach that uses word representations to identify electron transport proteins among transport
proteins with high performance. The flowchart of the study is shown below.
All model training and evaluation was done using the FastText library.
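As a rough illustration of how protein sequences might be fed to FastText, the sketch below turns each sequence into space-separated "words" and writes lines in the `__label__<name> word word ...` layout that fastText's supervised mode reads. The tokenization (overlapping k-mers, with `k=1` meaning single residues), the label names, the helper function names, and the hyperparameter values are all assumptions made for this example; the source does not state which settings FastETC actually uses.

```python
from typing import Iterable, List, Tuple


def sequence_to_words(sequence: str, k: int = 1) -> List[str]:
    """Split a protein sequence into overlapping k-mer "words".

    k=1 yields single residues. The choice of k is an assumption, not FastETC's setting.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    seq = "".join(sequence.split()).upper()  # drop whitespace, normalize case
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def to_fasttext_line(label: str, sequence: str, k: int = 1) -> str:
    """Format one example as '__label__<label> w1 w2 ...' for supervised fastText."""
    return f"__label__{label} " + " ".join(sequence_to_words(sequence, k))


def write_training_file(examples: Iterable[Tuple[str, str]], path: str, k: int = 1) -> None:
    """Write (label, sequence) pairs to a fastText training file, one example per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, sequence in examples:
            handle.write(to_fasttext_line(label, sequence, k) + "\n")


def train_and_evaluate(train_path: str, test_path: str):
    """Train a supervised model and score it on a held-out file.

    Requires the `fasttext` package. The hyperparameters are placeholders only.
    """
    import fasttext  # imported lazily so the helpers above work without it

    model = fasttext.train_supervised(input=train_path, epoch=25, lr=0.5, wordNgrams=2)
    n_examples, precision, recall = model.test(test_path)
    return model, n_examples, precision, recall
```

For example, `to_fasttext_line("electron", "MKV")` produces one training line for a sequence labeled with the hypothetical class name `electron`. Setting `wordNgrams` above 1 lets the model see short runs of neighbouring words, which is one plausible way to capture local sequence context.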
Dataset
All datasets used in this web server were retrieved from UniProt. The details of the dataset are listed in the table below.
                              Original   Overlap removed   Cross-validation   Independent
Transport proteins            74,759     53,686            44,739             8,947
Electron transport proteins   12,832     12,571            10,476             2,095
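The counts in the table can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only the numbers listed above; the dictionary keys and function names are made up for this example. The hold-out fraction of roughly one-sixth is an observation from these numbers, not a design choice stated in the source.

```python
# Counts copied from the dataset table; key names are illustrative only.
COUNTS = {
    "transport": {"original": 74759, "overlap_removed": 53686,
                  "cross_validation": 44739, "independent": 8947},
    "electron": {"original": 12832, "overlap_removed": 12571,
                 "cross_validation": 10476, "independent": 2095},
}


def split_is_consistent(row: dict) -> bool:
    """True if cross-validation + independent equals the overlap-removed total."""
    return row["cross_validation"] + row["independent"] == row["overlap_removed"]


def independent_fraction(row: dict) -> float:
    """Share of the overlap-removed sequences held out as the independent set."""
    return row["independent"] / row["overlap_removed"]


def removed_as_overlap(row: dict) -> int:
    """Number of sequences dropped when overlapping entries were removed."""
    return row["original"] - row["overlap_removed"]


for name, row in COUNTS.items():
    print(name, split_is_consistent(row),
          round(independent_fraction(row), 3), removed_as_overlap(row))
```

Both rows are consistent: the cross-validation and independent sets add up to the overlap-removed total, and in each class about one-sixth of the sequences are held out for independent testing.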
If you would like to build your own model or evaluate ours, the dataset is available at the link below.
To avoid errors, please submit sequences in FASTA format (we also provide example FASTA files). You can submit in two ways: paste the sequences into the text area or upload a sequence file. You can submit a single FASTA file or multiple FASTA files. The result page shows, for each submitted sequence, whether it is predicted to be an electron transport protein.
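Before submitting, it can help to check locally that the input really is well-formed FASTA. The following is a minimal sketch of such a check and is not the validation performed by the web server: the function name, the accepted residue alphabet, and the error messages are assumptions chosen for illustration.

```python
from typing import Dict

# 20 standard amino acids plus common ambiguity/rare codes (B, X, Z, U, O).
# Which letters the server accepts is not stated in the source; this set is an assumption.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, raising ValueError on malformed input."""
    records: Dict[str, str] = {}
    header = None
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError(f"line {line_number}: empty header")
            if header in records:
                raise ValueError(f"line {line_number}: duplicate header {header!r}")
            records[header] = ""
        else:
            if header is None:
                raise ValueError(f"line {line_number}: sequence data before first '>' header")
            residues = line.upper()
            unexpected = set(residues) - VALID_RESIDUES
            if unexpected:
                raise ValueError(f"line {line_number}: unexpected characters {sorted(unexpected)}")
            records[header] += residues  # sequences may span several lines
    empty = [name for name, seq in records.items() if not seq]
    if empty:
        raise ValueError(f"records without sequence: {empty}")
    return records
```

A call such as `parse_fasta(open("my_sequences.fasta").read())` (the file name is hypothetical) either returns the parsed records or reports the first problem it finds, which makes it easy to fix the file before pasting or uploading it.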
Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.
Quang-Thai Ho
Research Scholar
Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.
Nguyen-Quoc-Khanh Le
Research Scholar
Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.
Trinh-Trung-Duong Nguyen
Research Scholar
Department of Computer Science and Engineering
Yuan Ze University
135 Yuan-Tung Road, Chung-Li, Taiwan 32003, R.O.C.
Contact us
Yuan Ze University Department of Computer Science and Engineering
Graduate Program in Biomedical Informatics
Bioinformatics Laboratory (R1607B)
Address: No. 135, Yuandong Road, Chungli City, Taoyuan County, Taiwan, R.O.C. 32003
Tel: (03) 463-8800
If you have any problems with or suggestions for our website, feel free to contact us by email:
[email protected]