Sachin Konan

Contact: sachin at twosigma.com
CV / Google Scholar

Hi, I am an engineer at Two Sigma Investments, where I work on research and engineering problems related to graphs and automated code generation. I obtained my BS in Computer Science in 2022 from Georgia Tech.

I have been at:

Two Sigma - Graph Perturbation/Differential Privacy
Fundamental AI Research (FAIR) @ Meta - Open World Detection
Intel - Optimizing Neural Network Simulation
Georgia Tech - Single/Multi-Agent Reinforcement Learning (RL) ☆

☆ I was fortunate to be advised by Matthew Gombolay. I also worked under Constantinos Dovrolis as a Teaching Assistant for CS 3510.

I am interested in scientifically understanding Machine Learning and its applications to reasoning tasks. I want to build deployable ML that can be seamlessly used.

	Automating the Generation of Functional Semantic Types with Foundational Models. Sachin Konan, Larry Rudolph, Scott Affens. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 (In Preparation). TLDR;* The proliferation of data providers and the inherent dirtiness of data have increased the value of Semantic Types. We introduce Functional Semantic Types (FSTs), which are Python classes that encapsulate the informational and functional context of columnar data. FSTs normalize and validate data in a structured, readable manner, enabling automated cross-table joins and fast lookups. To scale the generation of FSTs, we leverage Foundational Models to transform serialized data-tables into FSTs. Across the Kaggle, Harvard, and FactSet Data-Verses, we show that our method, FSTO-Gen, can generate functionally and informationally correct FSTs.
	Merge-Split: Directed Graph Perturbations that Preserve Random Walk Structure. Sachin Konan, Larry Rudolph. Very Large Data Bases (VLDB), 2024 (In Review). TLDR;* Directed graphs contain important dependencies, and their structure may be sensitive. Previous work has shown that node anonymization alone isn't enough, so we protect against subgraph isomorphism and Sybil attacks through random perturbations (Merge-Split). For usability, the perturbed graph has to remain similar to the original, which we achieve by minimizing the change in the graph's eigenspace. Our experiments showed that Merge-Split locally disrupts random walks while maintaining overall structural properties, such as the graph's steady-state distribution.
	Contrastive Decision Transformer. Sachin Konan, Esmaeil Seraj, Matthew Gombolay. Conference on Robot Learning, 2023. Official: PMLR / Video / Poster. TLDR;* Decision Transformer (DT) is a return-conditioned system that generates the action that will achieve a desired return in a given state. Achieving a high return should theoretically require drastically different behavior than achieving a low return. In the same way that search is optimized through indexing, Contrastive Decision Transformer (ConDT) organizes state-action embeddings by return, allowing the transformer to more easily "recall" the necessary action for that return. ConDT improves performance in Gym, Atari, and hand-grip tasks.
	Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming. Sachin Konan, Esmaeil Seraj, Matthew Gombolay. International Conference on Learning Representations, 2022. Official: arXiv / Blog / Poster / Presentation / Video. TLDR; Collaborative teaming in multi-agent RL is challenging because agents need to consider the conditionality of their actions, which grows exponentially in complexity with the size of the action space and the number of agents. Humans excel at learning this conditionality by recursively reasoning about the actions of others, as in chess, where player A considers how player B considers how player A will act, and so on. We formulated a multi-agent policy gradient (InfoPG) that fosters this type of reasoning by maximizing inter-agent mutual information. InfoPG improves team performance in Gym and StarCraft cooperative games and is robust to adversarial teammates (Byzantine Generals Problem).
	Extending one-stage detection with open-world proposals. Sachin Konan, Kevin J Liang, Li Yin. arXiv pre-print, 2022. TLDR;* Object Detection consists of the localization and classification of objects, and two-stage networks made this process conditional. However, in practice, one might want to localize any object, regardless of whether it can be classified, a setting called Open World Detection (OWD). Two-stage networks have been studied previously, but we investigate a one-stage network called FCOS for its simplicity and its decoupling of classification from localization. We investigate various architectural and sampling improvements that allow FCOS to retain its classification ability while improving localization recall.

Also, I recently wrote a blog post about the benefits of migrating data-table schemas from primitive data-types to entity-driven data-types, called Semantic Types. We are investigating the use of LLMs to automatically generate Semantic Type definitions and data processing code.
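
As a rough illustration of what an entity-driven data-type can look like, here is a minimal Python sketch. It is not the code from the blog post or from the FST paper: the class name `ZipCode`, its `description` field, and the `normalize` and `validate` methods are all hypothetical, chosen only to show a column type that carries both meaning and behavior instead of a bare string or integer.

```python
import re
from typing import Optional


class ZipCode:
    """Hypothetical semantic type for a US ZIP code column (illustration only)."""

    # Informational context: what the column means.
    description = "5-digit US ZIP code"
    _pattern = re.compile(r"\d{5}")

    @classmethod
    def validate(cls, value: str) -> bool:
        """Return True if the value is already in canonical 5-digit form."""
        return cls._pattern.fullmatch(value) is not None

    @classmethod
    def normalize(cls, raw: object) -> Optional[str]:
        """Map a raw cell value to canonical form, or None if that is not possible."""
        if raw is None:
            return None
        text = str(raw).strip()
        # Drop a ZIP+4 suffix such as "30332-0001".
        text = text.split("-")[0]
        # Integer-typed columns lose leading zeros ("02139" becomes 2139).
        if text.isdigit():
            text = text.zfill(5)
        return text if cls.validate(text) else None


# Raw values as they might appear in differently typed source tables.
raw_column = [2139, "30332-0001", "n/a"]
normalized = [ZipCode.normalize(v) for v in raw_column]
```

Under these assumptions, two tables whose ZIP columns are normalized this way can be joined on the canonical string even when one of them stored the values as integers, and values that cannot be interpreted are surfaced as `None` rather than silently kept.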

Website template from Jon Barron