Swapna Kumar Panda @swapnakpanda
Day 1-20 Python Basics
Day 21-30 Data Types
Day 31-50 Statistics
Day 51-70 Machine Learning
Day 71-90 Deep Learning
Day 91-100 Data Visualization
Day 101-110 Data Cleaning
Day 111-130 Projects
Day 131-140 Communication Skills
Day 141-150 Revise
Automated Data Cleansing: Using AI-based platforms to outsource labor-intensive work
AutoML: Automating the iterative tasks of machine learning
Customer Personalization: Predict consumer behaviors using AI and ML
Data Science in the Blockchain: Generate insights from data on the blockchain
Machine Learning as a Service: Outsource machine learning work to an external service provider
Natural Language Processing: A rapidly growing branch of AI
TinyML: Implementing machine learning on small, low-powered devices
Synthetic Data: Generate synthetic data that mirrors the statistical properties of a dataset
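
To make the synthetic data item above concrete, here is a minimal Python sketch. It assumes a single numeric column that is roughly normally distributed; the `synthesize` helper and the sample numbers are made up for illustration and are not from the original thread. Real synthetic data tools also have to preserve relationships between columns, which this sketch does not attempt.

```python
import statistics
from statistics import NormalDist


def synthesize(values, n, seed=0):
    """Fit a normal distribution to `values` and draw `n` synthetic samples.

    Hypothetical helper: it only preserves the mean and standard deviation
    of one numeric column, under the assumption that the data is roughly normal.
    """
    dist = NormalDist.from_samples(values)  # estimates mean and sample stdev
    return dist.samples(n, seed=seed)       # seeded so the run is repeatable


# Made-up "real" measurements, used only to demonstrate the idea.
real = [52.1, 48.3, 50.7, 49.9, 51.4, 47.8, 50.2, 53.0]
fake = synthesize(real, n=5000)

# The synthetic column should have a mean and spread close to the original.
print("mean :", round(statistics.mean(real), 2), "vs", round(statistics.mean(fake), 2))
print("stdev:", round(statistics.stdev(real), 2), "vs", round(statistics.stdev(fake), 2))
```

The synthetic values are new numbers, not copies of the originals, but their summary statistics land close to those of the source column.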

Machine Learning: Classification, Regression, Reinforcement Learning, Deep Learning, Clustering, Dimensionality Reduction
Programming Language: Python, R, Java
Data Visualization: Tableau, Power BI, Matplotlib, ggplot2, Seaborn
Data Analysis: Feature Engineering, Data Wrangling, EDA
IDE: PyCharm, Jupyter, Colaboratory, Spyder, RStudio
Math: Statistics, Linear Algebra, Differential Calculus
Deploy: AWS, Azure
Web Scraping: Beautiful Soup, Scrapy, urllib
All data roles are identical
Transitioning to data science is impossible
Higher studies are essential
Data scientists only work on predictive modeling
Data science companies don't hire fresh graduates
Data scientists are expert coders
Data scientists have strong mathematical training
🔰 Python: https://t.co/PADFyTHYBJ
🔰 SQL: https://t.co/zEH4zSUsof
🔰 Statistics and R: https://t.co/Evuy8nWqmB
🔰 Data Science: R Basics: https://t.co/BJniGqeSPb
🔰 Excel and PowerBI: https://t.co/eukGiIcyVT
🔰 Data Science: Visualization: https://t.co/cF6Byygi0N
🔰 Data Science: Machine Learning: https://t.co/b7e16ciHJb
🔰 R: https://t.co/0vPpyDTktw
🔰 Tableau: https://t.co/49cK6pBD97
🔰 PowerBI: https://t.co/4zDqoGtNpp
🔰 Data Science: Productivity Tools: https://t.co/FfHikzj7YG
🔰 Data Science: Probability: https://t.co/pB6TKDRzQ1
🔰 Mathematics: https://t.co/veawz2h2mA
🔰 Statistics: https://t.co/LNWiUkS3pb
🔰 Data Visualization: https://t.co/MzbkixW4qE
🔰 Machine Learning: https://t.co/PpSeWBfMOA
🔰 Deep Learning: https://t.co/sijDAY1Ses
🔰 Data Science: Linear Regression: https://t.co/OlSb3uGTlc
🔰 Data Science: Wrangling: https://t.co/omMFGEFn0t
🔰 Linear Algebra: https://t.co/uHHcXyFUPa
🔰 Probability: https://t.co/o0bIqaxQ7G
🔰 Introduction to Linear Models and Matrix Algebra: https://t.co/mQTUHDhsuF
🔰 Data Science: Capstone: https://t.co/S8t1VLpe7D
🔰 Data Analysis: https://t.co/Sv4yrbDD6f
Follow @Kanojiyaaakash1 for such free resources.
Programming: Python, R, Java, SQL
Math Fundamentals: Statistics, Linear Algebra, Differential Calculus, Discrete Math
Data Analysis: Feature Engineering, Data Wrangling, EDA
Machine Learning: Classification, Regression, Reinforcement Learning, Deep Learning, Dimensionality Reduction, Clustering
Web Scraping: Beautiful Soup, Scrapy, urllib
Visualization: Tableau, D3.js, Scatter Plot, Power BI, ggplot2
| Data Scientist | Data Analyst |
|---|---|
| Scrub and retrieve information | Examine both historical and current patterns |
| Data collection and statistical analysis | Create operational and financial reports |
| Deep learning framework training and development | Perform forecasting in tools such as Excel (Help by Matt Dancho) |
| Create architecture that can manage large amounts of data | Design infographics |
| Develop automation that streamlines data gathering and processing | Interpret data and communicate clearly |
| Present insights to the executive team and assist with data-driven decision making | Perform data screening by analyzing documents and fixing data corruption |
| Plan | Skills |
|---|---|
| Machine Learning | Supervised Classification, Supervised Regression, Unsupervised Clustering, Dimensionality Reduction, Local Interpretable Model-agnostic Explanations (LIME), H2O Automatic Machine Learning, parsnip (XGBoost, SVM, Random Forest, GLM), K-Means, UMAP, recipes, lime |
| Data Visualization | Interactive and Static Visualizations, ggplot2 and plotly |
| Data Wrangling & Cleaning | Working with outliers, missing data, reshaping data, aggregation, filtering, selecting, calculating, and many more critical operations, dplyr and tidyr packages |
| Data Preprocessing & Feature Engineering | Preparing data for machine learning, Engineering Features (dates, text, aggregates), recipes package |
| Time Series | Working with date/datetime data, aggregating, transforming, visualizing time series, timetk package |
| Forecasting | ARIMA, Exponential Smoothing, Prophet, Machine Learning (XGBoost, Random Forest, glmnet, etc.), Deep Learning (GluonTS), Ensembles, Hyperparameter Tuning, Scaling to 1000s of forecasts, modeltime package |
| Text | Working with text data, stringr |
| NLP | Machine learning, Text Features |
| Functional Programming | Making reusable functions, sourcing code |
| Iteration | Loops and Mapping, using purrr package |
| Reporting | R Markdown: interactive HTML, static PDF |
| Applications | Building Shiny web applications, Flexdashboard, Bootstrap |
| Deployment | Cloud (AWS, Azure, GCP), Docker, Git |
| Databases | SQL (for data import), MongoDB (for apps) |