Assistant Professor of Computer Science
PhD (Louisiana State University)
Research Interests: Big data, distributed computing, high performance computing (HPC), computational genomics
Professor Sayan Goswami received his doctorate in computer science in 2019 from Louisiana State University in Baton Rouge, USA. He earned his BTech in 2011 from the National Institute of Technology (NIT), Durgapur, followed by a stint in the IT industry at Sapient Global Markets, where he worked on backend processes in energy trading. Before joining Ahmedabad University, he was an Assistant Professor of Computer Science at LSU Shreveport, USA. He has taught courses at introductory to advanced levels in programming, object-oriented programming, object-oriented design, rapid GUI app development, and big data. His interdisciplinary research interests lie in the application of high-performance computing to process scientific big data, especially data produced in computational genomics. In the past, Professor Goswami has primarily concentrated on de novo whole genome assembly from a computational point of view. Specifically, his research has addressed the increase in memory and execution times required to analyse the ever-growing amount of genomic data. In addition to parallel and distributed computational techniques, his solutions employ big data algorithms such as sketching and streaming. His current research project addresses similar problems encountered in processing metagenomes, with solutions built on hardware accelerators that yield more performance per rupee while requiring less space and energy.
Professor Sayan Goswami is an Assistant Professor in the Mathematical and Physical Sciences division at the School of Arts and Sciences.
A part of genomics deals with the extraction of whole genome sequences of organisms for applications in personalised medicine, evolutionary biology, etc. This requires machines known as sequencers, which parse the nucleotides of the genome and output them as text. Due to limitations in sequencing technology, sequencers cannot read the entire genome in one go. Instead, they clone the genome and read short segments from quasi-random positions in each clone. These overlapping segments, called reads, are then merged in a process known as assembly. From a computational point of view, assembly can be modelled as the shortest common superstring problem. Common heuristic solutions use either overlap graphs or de Bruijn graphs. Building an overlap graph includes a compute-intensive step of finding overlaps between all pairs of reads in the dataset. In contrast, de Bruijn graphs are easier to build but require large amounts of memory for storage and subsequent processing. Professor Goswami's research addresses the increase in the memory requirements and execution times of assemblers caused by the genomic data explosion of the last decade.
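The contrast between the two graph constructions can be illustrated with a minimal Python sketch. It is a toy illustration only, not Professor Goswami's assembler: the function names, the three sample reads, and the parameters min_len and k are hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import permutations


def suffix_prefix_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the longest such overlap is shorter than `min_len`.
    """
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def overlap_graph(reads, min_len=3):
    """Compare every ordered pair of reads: the compute-intensive step."""
    graph = {}
    for a, b in permutations(reads, 2):
        length = suffix_prefix_overlap(a, b, min_len)
        if length:
            graph[(a, b)] = length
    return graph


def de_bruijn_graph(reads, k=4):
    """Link the (k-1)-mer prefix of each k-mer to its (k-1)-mer suffix.

    One pass over the reads, no pairwise comparison, but every k-mer
    contributes an edge that has to be held in memory.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Hypothetical toy reads; real datasets contain millions of much longer reads.
reads = ["ATGGCGT", "GCGTGCA", "TGCAATC"]

print(overlap_graph(reads))
print(de_bruijn_graph(reads))
```

The overlap graph needs a number of comparisons that grows quadratically with the number of reads, while the de Bruijn graph is built in a single pass but stores an edge for every k-mer occurrence, which mirrors the time versus memory trade-off described above.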