
Tanya Goyal

tanyagoyal@utexas.edu | 512.985.1383 | tanyagoyal.weebly.com

RESEARCH INTERESTS
I am interested in deep models for structured prediction in natural language processing and machine learning. I
also work on approaches that leverage unlabeled data for unsupervised knowledge induction.

EDUCATION
UNIVERSITY OF TEXAS AT AUSTIN Austin, Texas
Master's in Computer Science Aug 2017 – May 2019 (Expected)
Thesis Advisor: Dr. Greg Durrett
GPA: 3.945/4.0
INDIAN INSTITUTE OF TECHNOLOGY, GUWAHATI Assam, India
Bachelor of Technology in Mathematics and Computing July 2011 – June 2015
CPI: 9.10/10
SELECTED PUBLICATIONS
● Goyal, Tanya, et al. “An Empirical Analysis of Edit Importance between Document Versions.” Proceedings
of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).
● Goyal, Tanya, et al. “Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to
Ensure Quality Relevance Annotations.” Proceedings of the 6th AAAI Conference on Human Computation
and Crowdsourcing (HCOMP 2018).
● Goyal, Tanya, et al. “Preventing Inadvertent Information Disclosures via Automatic Security Policies.”
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017).

ACADEMIC RESEARCH
• Temporal Event Ordering (Master’s Thesis) Aug 2018 – Present
Advisor: Dr. Greg Durrett, University of Texas at Austin
This work focuses on inferring the temporal order of events in text. We frame the problem as pairwise
classification with structural constraints that ensure global consistency. A major limitation of the task
is the scarcity of labeled data. We are formulating unsupervised techniques that use a number of distant
learners to inform the model.

• Crowdsourcing quality control through worker reliability modeling Aug 2017 – May 2018
Advisor: Dr. Matt Lease, University of Texas at Austin
Explored a new direction for quality control of data collected from Mechanical Turk: estimating labeling
quality through workers’ behavioral signals.
Proposed three behavior-based models to predict worker accuracy and label correctness, and developed a
Markov decision process-based approach to dynamically design cost-optimized tasks.

PROJECTS
• Joint Modeling of Image Captioning and Visual Question Answering Fall 2017
Explored multi-task models and architectures for the visual question-answering problem. Experimented with
various architectures combining the main VQA task with related tasks such as image captioning or sub-tasks
such as question-type prediction.
• Named Entity Recognition in Low Resource Domains Fall 2017
Performed a comparative analysis of domain adaptation techniques for Named Entity Recognition in
low-resource domains, using deep models.

• Controlled Sentence Generation Spring 2018
Proposed VAE architectures capable of generating text conditioned on semantically interpretable attributes,
such as tag distributions or topic distributions. The architecture learns latent vectors that correspond to the
attribute-specific and attribute-agnostic components. Minimized the Wasserstein distance between the gold
attribute distribution and the sampled distribution to learn the attribute-specific component. Additional
constraints ensure learning of attribute-agnostic latent spaces.

INDUSTRIAL RESEARCH
• Edit Classification Research Engineer, Adobe Research, 2015 - 2017
Proposed a supervised approach to extract textual differences between multiple versions of a document, e.g.
Wikipedia edit history, and to label each edit as paraphrase or factual. Also proposed a ranking algorithm
that orders edits by their perceived importance to reviewers.
Published in EMNLP 2017.

• Extreme Multi-label Document Classification Research Engineer, Adobe Research, 2015 - 2017
Developed a hierarchical multi-label classification algorithm for the task of recipient prediction for
documents, e.g. emails, based on the text content and user social network.
Published in PAKDD 2017; demoed at Adobe Tech Summit 2017 (San Jose), a bi-annual conference
showcasing innovative projects across research and engineering teams.

COMPLETE LIST OF PUBLICATIONS
● Jaidka, Kokil, Tanya Goyal, and Niyati Chhaya. "Predicting Email and Article Clickthroughs with Domain-
adaptive Language Models." Proceedings of the 10th ACM Conference on Web Science. ACM, 2018.
● Chhaya, Niyati, Kushal Chawla, Tanya Goyal, Projjal Chanda, and Jaya Singh. "Frustrated, Polite, or
Formal: Quantifying Feelings and Tone in Email." Proceedings of the Second Workshop on Computational
Modeling of People’s Opinions, Personality, and Emotions in Social Media. 2018.
● Sancheti, Abhilasha, Paridhi Maheshwari, Rajat Chaturvedi, Anish V. Monsy, Tanya Goyal, and Balaji
Vasan Srinivasan. "Harvesting Knowledge from Cultural Heritage Artifacts in Museums of India."
In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2018.
● Srinivasan, Balaji Vasan, Tanya Goyal, Varun Syal, Shubhankar Suman Singh, and Vineet Sharma.
"Environment Specific Content Rendering & Transformation." In Companion Publication of the 21st
International Conference on Intelligent User Interfaces, 2016.
● Srinivasan, Balaji Vasan, Tanya Goyal, Nikhil Mohan Nainani, and Kartik Sreenivasan. "Smart filters for
social retrieval." In Proceedings of the 3rd IKDD Conference on Data Science, 2016.

INFORMATION DISCLOSURES (PATENTS)
Granted
● Generating Data Driven Geo-fences [US Patent 9,838,843]
● Method to expand seed keywords into a relevant social query [US Patent 9,965,766]
Published
● Classifying and ranking changes between document versions [US Patent App. 15/476,640]
● Method and apparatus for generating predictive insights for authoring short messages [US Patent App.
15/146,676]
● Notification Control based on Location, Activity, and Temporal Prediction [US Patent App. 15/374,561]
● Tagging documents with security policies [US Patent App. 15/424,527]
● Determination of Paywall Metrics [US Patent App. 15/277,136]
● Content to Layout Template Mapping and Transformation [US Patent App. 15/013,809]
● Method to modify existing query based on relevance feedback from social posts [US Patent App. 14/590,362]
● Bundling Online Content Fragments For Presentation Based on Content-Specific Metrics and Inter-Content
Constraints [US Patent App. 15/687,658]

REFERENCES
Dr. Greg Durrett              Dr. Matt Lease                Balaji Vasan Srinivasan
Assistant Professor           Associate Professor           Sr. Computer Scientist
University of Texas, Austin   University of Texas, Austin   Adobe Research
gdurrett@cs.utexas.edu        ml@utexas.edu                 balsrini@adobe.com