Andres S Castaneda-Data Scientist

Andres S Castañeda
Email: andrescastaneda535@gmail.com
Phone: 773-598-4686
PROFESSIONAL SUMMARY
 Around 8 years of experience in IT as a Data Scientist, with the technical expertise, business experience, and
communication skills to drive high-impact business outcomes through data-driven innovations and decisions.
 Extensive experience in Retail Analytics for consumer goods and buyer behavior, developing Data Mining,
Analytics, and Machine Learning solutions to various business problems and generating data visualizations
using Python.
 Expertise in transforming business requirements into analytical models, designing algorithms, building models,
developing data mining and reporting solutions that scale across a massive volume of structured and
unstructured data.
 Experience designing visualizations in Python and publishing and presenting dashboards and storylines on
desktop platforms.
 Hands-on experience implementing LDA and Naïve Bayes; skilled in Random Forests, Decision Trees, Linear
and Logistic Regression, SVM, Clustering, and Principal Component Analysis, with good knowledge of Recommender
Systems.
 Proficient in Statistical Modeling and Machine Learning techniques (Linear, Logistic, Decision Trees, Random
Forest, SVM, K-Nearest Neighbors, Bayesian, XGBoost) for Forecasting/Predictive Analytics, Segmentation
methodologies, Regression-based models, Hypothesis testing, Factor analysis/PCA, and Ensembles.
 Extracted and worked with data from database sources such as Oracle and SQL Server.
 Well experienced in Normalization & De-Normalization techniques for optimum performance in relational and
dimensional database environments.
 Hands-on experience applying statistics to draw meaningful insights from data; strong at communication and
storytelling with data.
 Use analytical libraries such as Pandas, NumPy, Scikit-Learn, Seaborn, and Matplotlib to identify
trends and relationships between different pieces of data, draw appropriate conclusions, and translate analytical
findings into marketing strategies that drive value.
 Hands-on experience with Databricks and PySpark utilities for classification, regression, clustering, and
dimensionality reduction.
 Strong knowledge of data governance and experienced in input data analysis for deviations in raw data
processing.
 Solid team player, team builder, and an excellent communicator.
 Extensive hands-on experience and high proficiency with structured, semi-structured, and unstructured data,
using a broad range of data science programming languages and big data tools including Python, Spark, SQL,
Scikit-Learn, Hadoop, RDD, and RDBMS.
 Technical proficiency in designing, cleaning, and preparing data, modeling, and solutions for Data
Warehouse/Business Intelligence applications.
 Experience working on both Windows and Linux platforms.
 Flexible with Unix/Linux and Windows environments, working with operating systems such as Ubuntu 13/14.
AWARDS & HONORS

 Science award in sustainability, project winner with the best affordable design of a solar stove.
 Honors for the best GPA of my graduating class.
EDUCATION
 Bachelor of Science, UVM Laureate University, Mexico (GPA 9.28/10.0)
TECHNICAL SKILLS
Programming: Python
Databases: Oracle, MongoDB, MS Access
Visualization: Seaborn, Bokeh, Matplotlib, GGplot
Software: MS Office (MS Excel, MS PowerPoint, MS Access)
Data Mining: Data reduction, Clustering, Classification, Anomaly detection, Text mining
Big Data Ecosystem: Hadoop, Spark
Machine Learning: Linear/Logistic regression, RFC, KNN, K-Means, Dimensionality reduction algorithms
BI Tools: Power BI, Tableau
PROFESSIONAL EXPERIENCE
Company: The Nielsen Company, Chicago, IL Aug 2010 - Present

Role: Data Scientist
Clients: Nestle, P&G, Coca-Cola, Pfizer, Bayer, PepsiCo, Hershey’s, Kellogg’s, Bimbo, Mars, and General Mills.
Data Scientist
 Advised key accounts to drive business models for clients.

 Consulted on clients’ data to surface previously invisible insights that improve operations, optimize POS, and
increase market share, covering shopper analytics, consumer trends, modern and traditional trade, and segmentations.
 Scorecards and reporting.
 Analyzed competitors’ market data to find opportunities for new business.
 Used statistical analysis and tools such as R, SPSS, and Python to find patterns and build predictive models.
Global Data Standards
 Developed global data methods for Food, Beverages, Homecare, and Pharma for advanced analytics.
 Data coding models.
 Designed methods to process big data in production lines globally.
 Designed category standards, normalized product coding, and innovative market segmentations for the consumer
goods industry across the globe.
Reference Data Supervisor
 Oversaw coding teams, monitored data quality, and managed outliers.

 Functional development of processes for production lines to code data standards for the consumer goods industry.
 Deep knowledge of product segmentation based on packaging and branding.
 Statistical controls and KPIs. Six Sigma, Lean Manufacturing, and quality controls. Optimization and continuous
improvement.
Key Responsibilities:
 Analyzed data to optimize operations

 Designed scorecards and KPIs to identify deviations or opportunities in real time
 Performed logistics analytics to maximize inbound/outbound revenue
 Created visual controls to manage different phases of the supply chain
 Cleaned, transformed, and improved the data warehouse.
 Performed data parsing and data profiling on large volumes of varied data to learn about customer behavior from
features based on transactional data, call center history, and customer profiles.
 Processed the primary quantitative and qualitative market research and loaded the survey responses into the
database in preparation for data exploration
 Developed Python scripts to automate the data sampling process. Ensured data integrity by checking for
duplication, completeness, accuracy, and validity
 Worked on data cleaning and ensured data quality, consistency, and integrity using NumPy and SFrame in Python
 Developed solutions for market analysis with product association, Share of Market, Neuroscience, Consumer
Behaviour
 Applied Principal Component Analysis in feature engineering to analyze high-dimensional data
 Applied machine learning algorithms and statistical models (decision trees, lasso regression,
multivariate regression) to identify key features using the scikit-learn package in Python
 Evaluated models using k-fold cross-validation and the log loss function
 Ensured the model had a low false positive rate and validated it by interpreting the ROC plot
 Built repeatable processes in support of implementation of new features and other initiatives
 Communicated and presented results to the product development team to drive the best decisions
Environment: Python 3.6, PySpark, Tableau, NumPy, Scikit-Learn, Seaborn, Matplotlib, FuzzyWuzzy, Bokeh