
Cluster Identification in Metagenomics – A Novel Technique of Dimensionality Reduction through Autoencoders

Authors:

Kalana Wijegunarathna
Department of Computer Science & Engineering, University of Moratuwa, LK

Uditha Maduranga
Department of Computer Science & Engineering, University of Moratuwa, LK

Sadeep Weerasinghe
Department of Computer Science & Engineering, University of Moratuwa, LK

Indika Perera
Department of Computer Science & Engineering, University of Moratuwa, LK

Anuradha Wickaramarachchi
Australian National University, AU

Abstract

Analysis of metagenomic data is challenging not only because the data are acquired from samples in their natural habitats but also because of their high volume and high dimensionality. Because no prior lab-based cultivation is carried out in metagenomics, inferring the presence of numerous microorganisms is all the more challenging, which accentuates the need for an informative visualization of the data. In a successful visualization, congruent reads should appear in clusters according to the diversity and taxonomy of the microorganisms in the sequenced sample. Metagenomic data represented by oligonucleotide frequency vectors are inherently high dimensional and therefore impossible to visualize as is. This raises the need for a dimensionality reduction technique that converts the high-dimensional sequence data into lower-dimensional data for visualization. In this process, preservation of the genomic characteristics must be given the highest priority. Currently, Principal Component Analysis (PCA), a linear technique, and t-distributed Stochastic Neighbor Embedding (t-SNE), a non-linear technique, are widely used for dimensionality reduction in metagenomics. Despite their wide use, these techniques have shortcomings and weaknesses that make them less than ideally suited to the domain of metagenomics. Our research explores the possibility of using autoencoders, a deep learning technique with the potential to overcome the prevailing impediments of existing dimensionality reduction techniques, eventually leading to richer visualizations.
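
To illustrate why oligonucleotide frequency vectors are high dimensional, the following is a minimal sketch of how such a vector might be computed for a single read. It is not the authors' preprocessing pipeline: the function name `kmer_frequency_vector`, the choice of k = 4 (tetranucleotides) and the example read are all assumptions made for illustration, and reverse-complement k-mers are not merged here.

```python
from itertools import product


def kmer_frequency_vector(read, k=4):
    """Return the normalized k-mer frequencies of a DNA read.

    Hypothetical helper for illustration only. The vector has 4**k
    entries, one per possible k-mer over the alphabet A, C, G, T,
    listed in lexicographic order.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    read = read.upper()
    # Slide a window of width k across the read and count each k-mer.
    for i in range(len(read) - k + 1):
        window = read[i:i + k]
        if window in index:  # skip windows with ambiguous bases such as N
            counts[index[window]] += 1
    total = sum(counts)
    # Normalize counts to frequencies; return zeros if nothing was counted.
    return [c / total for c in counts] if total else [0.0] * len(kmers)


# Assumed example read; with k = 4 the vector has 4**4 = 256 dimensions.
vector = kmer_frequency_vector("ACGTACGTTAGC", k=4)
print(len(vector))
```

Under these assumptions each read maps to a point in a 256-dimensional space, which is the kind of representation that a technique such as PCA, t-SNE or an autoencoder would then reduce to two or three dimensions for plotting.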
How to Cite: Wijegunarathna, K., Maduranga, U., Weerasinghe, S., Perera, I. and Wickaramarachchi, A., 2021. Cluster Identification in Metagenomics – A Novel Technique of Dimensionality Reduction through Autoencoders. International Journal on Advances in ICT for Emerging Regions (ICTer), 14(2), pp.9–18. DOI: http://doi.org/10.4038/icter.v14i2.7224
Published on 30 Mar 2021.
Peer Reviewed
