Ashish Raj from REEF

Ashish Raj

Data Scientist at REEF

Miami, FL, USA

Last updated November 18, 2020

Who I Am  

A Little About Me

• Around 5 years of professional experience in Data Science and Analytics, including Machine Learning, Deep Learning, Natural Language Processing, Time Series analysis and forecasting, Predictive Modelling and Statistical Analysis.
• Expertise in manipulating large datasets from big data and cloud platforms such as Hadoop, AWS Redshift and Spark.
• Practical experience implementing advanced analytical concepts and algorithms (Linear Regression, Logistic Regression/Classification, Decision Tree/Random Forest, Monte Carlo simulation, Recommender Systems).
• Strong aptitude in Data Science methods, both supervised and unsupervised, such as k-means clustering, Principal Component Analysis and dimensionality reduction.
• Hands-on experience using Python libraries such as NumPy, SciPy, Pandas, Statsmodels, Matplotlib, Seaborn, Beautiful Soup, XGBoost, Bokeh, Dash and Plotly.
• Applied statistics and data mining techniques, e.g. hypothesis testing, chi-square testing and retrieval processes, to identify trends, patterns and other relevant information.
• Categorized content using unsupervised learning (k-means clustering) and classification algorithms (Support Vector Machine (SVM), K-nearest neighbours (KNN), logistic regression), implemented in Python with scikit-learn.
• Experience in databases, database modelling, data normalization and standardization with different RDBMS such as PostgreSQL, Oracle and MySQL.
• Developed advanced algorithms to solve problems of large dimensionality in a computationally efficient and statistically effective manner.
• Experience with different RDBMS such as Oracle 9i/10g/11g, SQL Server 2005/2008 and MySQL.
• Implemented the CRISP-DM methodology: Business Understanding, Data Understanding, Data Acquisition, Testing and Validation, Data Preparation, Modelling, Evaluation, Deployment.
• Extensive hands-on experience and high proficiency with structured, semi-structured and unstructured data, using a broad range of data science programming languages and big data tools.
• Experience designing and creating dashboards in Tableau and PowerBI that allow for automated reporting.
• Strong familiarity with the entire Data Science project life cycle, including Data Acquisition, Data Cleansing, Data Manipulation, Feature Engineering, Modelling, Evaluation, Optimization, Testing and Deployment.
• Used predictive modelling to optimize current processes and improve corrective action timelines. Co-ordinated with different functional teams to implement models and monitor outcomes.



Countries Visited

United Arab Emirates

What I Do  

Professional Experience


Data Scientist

Jan 2020 - Present
Responsibilities:
• Closely involved in setting up REEF's data infrastructure and data warehouse. This transformation initiative established automated management reports (dashboard reporting using Tableau and PowerBI), enhanced competitive advantage (predicting the effectiveness of new initiatives using Machine Learning and Statistical Analysis) and enabled stakeholders to monitor new product initiatives.
• Performed quantitative analysis of structured, semi-structured and unstructured data, working in small teams to develop, test and productionize advanced analytical models as required.
• Wrote code in Python, Pandas, SQL, Shell and Hadoop to extract data from different sources and perform ETL.
• Used Apache Airflow to schedule ETL tasks that extract data from the source, transform it using Python scripts (Pandas, NumPy and SQLAlchemy) and then load it into the database.
• Compiled data from multiple sources and used SQL and Python packages for data extraction, transformation and loading, and for uploading to a cloud storage service (Amazon Web Services Redshift). Performed data cleaning, feature scaling and feature engineering using pandas and scikit-learn.
• Participated in the implementation of the Hadoop platform and big data technologies (Spark, Hive and HBase) for Amazon Web Services S3, Redshift and Athena use cases.
• Implemented machine learning techniques such as GLM, Random Forest, K-means and Time Series models. Applied data science methods such as data wrangling, feature engineering, regularization, dimension reduction, hyperparameter optimization, model evaluation and validation.
• Collaborated with senior data scientists to prototype predictive models for converting data into insights.
• Participated in the implementation of a company-wide Data Lake on AWS Redshift and helped maintain data integrity by conducting schema checks.

Data Scientist

Mar 2019 - Jan 2020
• Performed data cleaning, feature scaling, feature engineering and feature prioritization using the pandas and NumPy packages in Python.
• Built predictive models using machine-learning algorithms.
• Handled anomalies in the data (removing duplicates, imputing missing values and treating null values) using Python, Pandas and scikit-learn.
• Performed text mining using NLP packages.
• Participated in the implementation of the Hadoop platform and big data technologies (Spark, Hive, Pig and HBase) on Amazon Web Services S3, Redshift and Redis.
• Extensively followed agile practices such as Continuous Integration, Pair Programming and Test-Driven Development.
• Built and deployed Docker containers to implement backend services.
• Maintained Git workflows for version control (source code management).
• Developed and maintained automated CI/CD pipelines for code deployment.

Professional Interests

Data Science
Data Analytics
Machine Learning
Statistical Analysis
Data Preparation



Bachelor of Engineering, Chemical Engineering