PALLAV ANAND

400 Nagle Street, College Station, TX
04pallav@gmail.com
+1-206-466-9707

OBJECTIVE
Seeking full-time opportunities in Data Analytics, Machine Learning, or Operations Research starting May 2017.
EDUCATION
Texas A&M University, College Station, Texas, USA Aug 2015-May 2017
Master of Science in Industrial Engineering (Focus on Data Science and Analytics), GPA: 3.88/4.00, with In-state Scholarship.
Courses: Applied Multivariate Statistical Analysis, Engineering Data Analysis, R and Big Data Applications, Theory of Statistics,
Linear Programming & Optimization, Time Series Analysis, Nonlinear and Dynamic Programming.
Indian Institute of Technology (IIT Roorkee) India May 2010-May 2014
Bachelor of Technology, First Division Merit Scholar
CORE COMPETENCIES
Programming: R (3 years), Python (NumPy, SciPy, Pandas, Scikit-learn, NLTK) (3 years), SQL (1 year), Java
Visualization: ggplot2, Matplotlib, Seaborn, Tableau
Big Data tools: Spark (SparkSQL, MLlib), Hadoop, Sqoop, Flume, Hive, Pig, MapReduce (1 year)
Statistics and Machine Learning: Linear Regression, Logistic Regression, Decision Tree, Random Forest, SVM, Boosting,
Neural Networks, K-Means, Hierarchical Clustering, kNN
Strong fundamentals in Computer Science, Algorithms & Data Structures
PROFESSIONAL EXPERIENCE
Data Scientist Co-op, Danaher Labs, Santa Clara, CA Aug 2016-Dec 2016
Part of a three-member data science team predicting customer retention rate for Nobel Biocare, a Zurich-based dental company.
Provided actionable insights for sales and marketing teams using customer characteristics such as market segmentation, geographical
location and buying behaviour.
Used NLP tools such as NLTK and the tm library in R to segment customer feedback into positive and negative signals and predicted
customer satisfaction.
Visualized data in matplotlib and ggplot2 and built models predicting the probability of customer retention using techniques such as
Random Forests, Boosted Trees and Logistic Regression.
Data Science Intern, Optimal Asset Management, Los Altos, CA June 2016-Aug 2016
Built predictive models to optimize financial returns and created financial portfolios using factor analysis, handling
highly multicollinear data with Principal Component Analysis (PCA) and other methods.
Used R scripts to generate reports containing portfolio visualizations and performance metrics.
Utilized data provided by financial firms such as BlackRock to build and maintain SQL databases.
Wrote, reviewed and managed production-level code in a Git workflow.
Data Analyst Engineer, Reliance Industries Limited July 2014-July 2015
Performed time series analysis using ARIMA and other forecasting methods to predict crude oil prices.
Handled large data sets, including data in unusual formats, transforming data into a usable form and aggregating it as
needed using a variety of tools including Python and R.
Wrote MySQL scripts and built indices to extract data efficiently from remote databases.
Data Analyst Intern, Reliance Industries Limited May 2013-July 2013
Developed a framework for Supplier Management in Strategic Sourcing of spare parts, considering multiple strategic and
operational factors.
PROJECTS
Semiconductor Fabrication Testing - Classification, Texas A&M University April 2016-May 2016
Worked on fault prediction in semiconductor manufacturing using feature selection techniques and random forests (with cross-validation),
considering causal relationships in order to identify the key features that would enable an increase in process throughput.
Fraud Detection in Credit Card Transactions, Capital One March 2016-April 2016
Predicted fraud in credit card transactions and built robust models using machine learning algorithms. Used Extreme Gradient
Boosting based on decision trees and subsequently performed cost-benefit analysis to determine the classification threshold using
domain knowledge and historical transaction data provided by Capital One.
Data Science Project Group Leader - Drug Repositioning using Microarray Data Analysis, TAMU Jan 2016-April 2016
Worked on data cleaning, statistical analysis, modeling and testing of gene expression microarray data using R in a
Linux environment, to identify potential new indications for existing drugs, based on drug-disease information obtained
from public databases.
Source code for projects: https://github.com/04pallav
CERTIFICATIONS
Big Data Basics: Hadoop, MapReduce, Hive, Pig & Spark Udemy Course Certificate No UC-M1FVUGW4
Advanced Databases and SQL Querying Udemy Course Certificate No UC-SXKMPL4V
Machine Learning in Python Udemy Course Certificate No UC-S22KPO0O
AWARDS
Merit scholarship with full fee waiver at IIT Roorkee (undergraduate university)
Test score of 333 out of 340 in the Graduate Record Examination (GRE)
Only student awarded the In-state Scholarship among 140 students in Industrial Engineering at Texas A&M University
Secured 99.66 percentile in All India Engineering Entrance Examination (1723 out of 500,000 students) in IIT-JEE 2010
