Project: Classifying High Occupancy of Targets (HOT) regions and Low Occupancy of Targets (LOT) regions from DNA sequences
Regulatory binding factors are not randomly distributed across the human genome. 50% of them are found clustered in distinct regions of the genome. These regions are called High Occupancy Target (HOT) regions; the remaining regions of the genome are termed Low Occupancy Target (LOT) regions. Approximately 90% of the constituents of human HOT regions show strong enrichment of promoters, while 10-20% show context-specific enrichment.
With advances in machine learning and the known positions of HOT regions in the genome, we can train a model to classify these regions from their DNA sequences. My goal is to find a model that is interpretable.
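As a rough illustration of what "interpretable" could mean here, one option is to turn each DNA sequence into normalized k-mer frequencies, so that every feature a classifier weighs corresponds to a readable short motif. This is only a sketch under my own assumptions: the function names `kmer_counts` and `kmer_vector` and the choice of k = 3 are hypothetical, and the project has not settled on this featurization.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def kmer_counts(seq, k=3):
    """Count overlapping k-mers, skipping windows with non-ACGT characters (e.g. N)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):
            counts[kmer] += 1
    return counts

def kmer_vector(seq, k=3):
    """Return k-mer frequencies in a fixed lexicographic order (length 4**k)."""
    vocab = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) or 1  # avoid division by zero on very short input
    return [counts[kmer] / total for kmer in vocab]
```

Each sequence then maps to a fixed-length vector, for example `kmer_vector("ACGTAC", k=2)` has 16 entries, one per dinucleotide. A linear classifier trained on such vectors would have one weight per k-mer, which is what makes the result easy to inspect.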
This webpage is dedicated to the project work for CS 234!
Oct 13