Chen, W.; Zeng, Y.C.; Achinger-Kawecka, J.; Campbell, E.; Jones, A.K.; Stewart, A.G.; Khoury, A.; Clark, S.J.
2025-04-09
2024
Nucleic Acids Research (NAR), 2024; 52(14):8086-8099
0305-1048
https://hdl.handle.net/2440/144169
CCCTC-binding factor (CTCF) is an insulator protein that binds to a highly conserved DNA motif and facilitates regulation of three-dimensional (3D) nuclear architecture and transcription. CTCF binding sites (CTCF-BSs) reside in non-coding DNA and are frequently mutated in cancer. Our previous study identified a small subclass of CTCF-BSs that are resistant to CTCF knockdown, termed persistent CTCF binding sites (P-CTCF-BSs). P-CTCF-BSs show high binding conservation and potentially regulate cell-type constitutive 3D chromatin architecture. Here, using ICGC sequencing data, we made the striking observation that P-CTCF-BSs display a highly elevated mutation rate in breast and prostate cancer when compared to all CTCF-BSs. To address whether P-CTCF-BS mutations are also enriched in other cell types, we developed CTCF-INSITE, a tool utilising machine learning to predict persistence based on genetic and epigenetic features of experimentally determined P-CTCF-BSs. Notably, predicted P-CTCF-BSs also show a significantly elevated mutational burden in all 12 cancer types tested. Enrichment was even stronger for P-CTCF-BS mutations with predicted functional impact on CTCF binding and chromatin looping. Using in vitro binding assays, we validated that P-CTCF-BS cancer mutations predicted to be disruptive indeed reduced CTCF binding. Together, this study reveals a new subclass of cancer-specific CTCF-BS DNA mutations and provides insights into their importance in genome organization in a pan-cancer setting.
en
© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site. For further information, please contact journals.permissions@oup.com.
Binding Sites; Breast Neoplasms; CCCTC-Binding Factor; Chromatin; Machine Learning; Mutation; Neoplasms; Prostatic Neoplasms; Protein Binding; Humans; Female; Male
Machine learning enables pan-cancer identification of mutational hotspots at persistent CTCF binding sites
Journal article
10.1093/nar/gkae530
725066
Achinger-Kawecka, J. [0000-0002-2902-9371]