Successes and Pitfalls in the Data Mining of HTS Data
Stephen Pickett
Cheminformatics, GlaxoSmithKline
stephen.d.pickett@gsk.com
High-throughput screening (HTS) is the mainstay of lead discovery efforts in the pharma industry, and the vast volumes of data it generates provide a large resource for data mining. However, HTS is a highly complex process involving many stages and departments. In this presentation we discuss the whole HTS process, from compound supply through screening to analysis at the chemist's desktop, examining the issues at each stage and some of the solutions put in place at GSK to improve the quality of both the screening process and the resulting leads. We discuss methodologies developed to help chemists visualise and interpret the results, and present a novel approach to determining which compounds should enter the screening collection. Examples from GSK screening campaigns are used to highlight the specific issues discussed.