Publication
Histone-Net: a multi-paradigm computational framework for histone occupancy and modification prediction
Muhammad Nabeel Asim; Muhammad Ali Ibrahim; Muhammad Imran Malik; Imran Razzak; Andreas Dengel; Sheraz Ahmed
In: Complex & Intelligent Systems, Pages 1-21, Springer, 7/2022.
Abstract
Deep exploration of histone occupancy and covalent post-translational modifications (e.g., acetylation, methylation) is essential to decode gene expression regulation, chromosome packaging, DNA damage, and transcriptional activation. Existing
computational approaches cannot precisely predict histone occupancy and modifications, mainly because they rely on
sub-optimal statistical representations of histone sequences. To establish an improved histone occupancy and
modification landscape for multiple histone markers, this paper presents an end-to-end multi-paradigm computational
framework, “Histone-Net”. To learn sequence representations that are aware of local and global residue context, Histone-Net generates
unsupervised higher order residue embeddings (DNA2Vec) and presents a different application of language modelling,
where it encapsulates histone occupancy and modification information while generating higher order residue embeddings
(SuperDNA2Vec) in a supervised manner. We perform an intrinsic and extrinsic evaluation of both presented distributed
representation learning schemes. A comprehensive empirical evaluation of Histone-Net over ten benchmark histone marker data
sets for three different histone sequence analysis tasks indicates that the approach based on the SuperDNA2Vec sequence representation and a softmax
classifier outperforms the state-of-the-art approach by an average of 7% in accuracy. To eliminate the overhead of
training separate binary classifiers for all ten histone markers, Histone-Net is also evaluated in a multi-label classification paradigm,
where it produces decent performance for simultaneous prediction of histone occupancy, acetylation, and methylation.
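
Two ideas in the abstract can be illustrated with a minimal sketch. It assumes that "higher order residues" are overlapping k-mers of the DNA sequence, and that the multi-label paradigm encodes each sequence's markers as one indicator vector. The function names `kmers` and `multi_hot`, the value k = 3, and the marker subset are hypothetical choices for illustration, not details taken from the paper.

```python
def kmers(sequence, k):
    """Split a DNA sequence into overlapping k-mers (stride 1).

    Assumption: this is one plausible reading of "higher order residues";
    the paper's exact tokenization is not given in the abstract.
    """
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def multi_hot(active_markers, marker_order):
    """Encode the set of markers present for one sequence as a 0/1 vector,
    so one multi-label model can replace one binary classifier per marker."""
    return [1 if marker in active_markers else 0 for marker in marker_order]


# Illustrative subset of marker names (hypothetical, not the paper's list).
MARKERS = ["H3", "H3K9ac", "H3K4me3"]

tokens = kmers("ACGTAC", 3)                      # tokens fed to an embedding model
target = multi_hot({"H3", "H3K4me3"}, MARKERS)   # one target vector per sequence
```

In such a setup, the k-mer tokens would be the "words" given to an embedding model, and the indicator vector would be the training target for a single multi-label classifier.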