400 Nagle Street, College Station, TX
04pallav@gmail.com
+1-206-466-9707

OBJECTIVE
Seeking full-time opportunities in Data Analytics, Machine Learning, or Operations Research starting May 2017.

EDUCATION
Texas A&M University, College Station, Texas, USA    Aug 2015 - May 2017
Master of Science in Industrial Engineering (Focus on Data Science and Analytics)
GPA: 3.88/4.00 with In-state Scholarship
Courses: Applied Multivariate Statistical Analysis, Engineering Data Analysis, R and Big Data Applications, Theory of Statistics, Linear Programming & Optimization, Time Series Analysis, Nonlinear and Dynamic Programming

Indian Institute of Technology (IIT Roorkee), India    May 2010 - May 2014
Bachelor of Technology, First Division; Merit Scholar

CORE COMPETENCIES
Programming: R (3 years), Python (NumPy, SciPy, Pandas, Scikit-learn, NLTK) (3 years), SQL (1 year), Java
Visualization: ggplot2, Matplotlib, Seaborn, Tableau
Big Data tools: Spark (SparkSQL, MLlib), Hadoop, Sqoop, Flume, Hive, Pig, MapReduce (1 year)
Statistics and Machine Learning: Linear Regression, Logistic Regression, Decision Tree, Random Forest, SVM, Boosting, Neural Networks, K-Means, Hierarchical Clustering, kNN
Strong fundamentals in Computer Science, Algorithms & Data Structures

PROFESSIONAL EXPERIENCE
Data Scientist Co-op, Danaher Labs, Santa Clara, CA    Aug 2016 - Dec 2016
Part of a 3-member data science team predicting the customer retention rate for Nobel Biocare, a Zurich-based dental company.
Provided actionable insights for sales and marketing teams using customer characteristics such as market segmentation, geographical location, and buying behaviour.
Used NLP tools such as NLTK and the tm library in R to segment customer feedback into positive and negative signals and predict customer satisfaction.
Visualized data in Matplotlib and ggplot2 and built models predicting the probability of customer retention using Random Forests, boosted trees, and logistic regression.

Data Science Intern, Optimal Asset Management, Los Altos, CA    June 2016 - Aug 2016
Built predictive models to optimize financial returns and created financial portfolios using factor analysis, handling highly multicollinear data with Principal Component Analysis (PCA) and other methods.
Used R scripts to generate reports containing portfolio visualizations and performance metrics.
Used data provided by financial firms such as BlackRock to build and maintain SQL databases.
Wrote, reviewed, and managed production-level code in a Git workflow.

Data Analyst Engineer, Reliance Industries Limited    July 2014 - July 2015
Applied time series analysis and forecasting methods such as ARIMA to predict crude oil prices.
Handled large data sets, including data in unusual formats, transforming them into a usable form and aggregating them as needed with tools including Python and R.
Wrote MySQL scripts and built indices to extract data efficiently from remote databases.

Data Analyst Intern, Reliance Industries Limited    May 2013 - July 2013
Developed a framework for Supplier Management in Strategic Sourcing of spare parts, considering multiple strategic and operational factors.

PROJECTS
Semiconductor Fabrication Testing - Classification, Texas A&M University    April 2016 - May 2016
Worked on fault prediction in semiconductor manufacturing using feature selection techniques and random forests (with cross-validation), considering causal relationships in order to identify the key features that enable an increase in process throughput.

Fraud Detection in Credit Card Transactions, Capital One    March 2016 - April 2016
Predicted fraud in credit card transactions and built robust models using machine learning algorithms.
Used Extreme Gradient Boosting based on decision trees, then performed a cost-benefit analysis to determine the classification threshold using domain knowledge and historical transaction data provided by Capital One.

Data Science Project Group Leader - Drug Repositioning using Microarray Data Analysis, TAMU    Jan 2016 - April 2016
Worked on data cleaning, statistical analysis, modeling, and testing of gene microarray expression data using R in a Linux environment, to detect target drugs' potential indications for treating new diseases, based on drug-disease information obtained from public databases.

Source code for projects: https://github.com/04pallav

CERTIFICATIONS
Big Data Basics: Hadoop, MapReduce, Hive, Pig & Spark, Udemy Course, Certificate No. UC-M1FVUGW4
Advanced Databases and SQL Querying, Udemy Course, Certificate No. UC-SXKMPL4V
Machine Learning in Python, Udemy Course, Certificate No. UC-S22KPO0O

AWARDS
Merit scholarship with full fee waiver at IIT Roorkee (undergraduate university)
Test score of 333 out of 340 in the Graduate Record Examination (GRE)
Only student awarded the In-state Scholarship among 140 students in Industrial Engineering at Texas A&M University
Secured 99.66 percentile in All India Engineering Entrance Examination (1723 out of 500,000 students) in IIT-JEE 2010