Data Science

150 Days of Data Science Challenge:

Day 1-20 Python Basics
Day 21-30 Data Types
Day 31-50 Statistics
Day 51-70 Machine Learning
Day 71-90 Deep Learning
Day 91-100 Data Visualization
Day 101-110 Data Cleaning
Day 111-130 Projects
Day 131-140 Communication Skills
Day 141-150 Revision

Data Science trends:

Automated Data Cleansing: Using AI-based platforms to outsource labor-intensive work
AutoML: Automating the iterative tasks of machine learning
Customer Personalization: Predict consumer behaviors using AI and ML
Data Science in the Blockchain: Generate insights from data on the blockchain
Machine Learning as a Service: Outsource machine learning work to an external service provider
Natural Language Processing: A rapidly growing branch of AI
TinyML: Implementing machine learning on small, low-powered devices
Synthetic Data: Generate synthetic data that mirrors the statistical properties of a dataset
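To make the Synthetic Data item concrete, here is a minimal sketch using only the Python standard library. It assumes a single numeric column that is roughly normally distributed; the `synthesize` helper and the sample values are hypothetical, and real synthetic-data tools model far more than a mean and a standard deviation.

```python
import random
import statistics


def synthesize(values, n, seed=0):
    """Draw n synthetic values from a normal distribution fitted to `values`."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical "real" column (assumption: roughly normal data).
real = [12.0, 15.5, 9.8, 14.2, 11.1, 13.7, 10.4, 16.0]
synthetic = synthesize(real, n=5000)

# The synthetic column should have a similar mean and spread to the real one.
print(round(statistics.mean(real), 2), round(statistics.mean(synthetic), 2))
print(round(statistics.stdev(real), 2), round(statistics.stdev(synthetic), 2))
```

None of the synthetic values is copied from the original column, yet the summary statistics stay close, which is the property the trend describes.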

Data Science skillset:

[Image: data science skillset]

Data science tools:

Machine Learning: Classification, Regression, Reinforcement Learning, Deep Learning, Clustering, Dimensionality Reduction
Programming Language: Python, R, Java
Data Visualization: Tableau, Power BI, Matplotlib, ggplot2, Seaborn
Data Analysis: Feature Engineering, Data Wrangling, EDA
IDE: PyCharm, Jupyter, Colaboratory, Spyder, RStudio
Math: Statistics, Linear Algebra, Differential Calculus
Deploy: AWS, Azure
Web Scraping: Beautiful Soup, Scrapy, urllib
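As a rough illustration of the Web Scraping entry, the sketch below extracts links from a page. It deliberately uses the standard-library `html.parser` module as a stand-in rather than Beautiful Soup or Scrapy, and it parses an inline, hypothetical HTML string instead of downloading a page; the `LinkCollector` class is an illustrative name, not part of any library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical page content; a real scraper would fetch this first.
page = """
<ul>
  <li><a href="https://example.com/python">Python</a></li>
  <li><a href="https://example.com/sql">SQL</a></li>
</ul>
"""

parser = LinkCollector()
parser.feed(page)
print(parser.links)
```

In practice the page text would come from an HTTP request (for example via `urllib.request.urlopen`), and a library such as Beautiful Soup would replace the hand-written parser.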

Top 7 Data Science myths:

All data roles are identical
Transitioning to data science is impossible
Higher studies are essential
Data scientists only work on predictive modeling
Data science companies don't hire fresh graduates
Data scientists are expert coders
Data scientists have strong mathematical training

Free Certification Courses to Learn Data Science:

🔰 Python: https://t.co/PADFyTHYBJ
🔰 SQL: https://t.co/zEH4zSUsof
🔰 Statistics and R: https://t.co/Evuy8nWqmB
🔰 Data Science: R Basics: https://t.co/BJniGqeSPb
🔰 Excel and PowerBI: https://t.co/eukGiIcyVT
🔰 Data Science: Visualization: https://t.co/cF6Byygi0N
🔰 Data Science: Machine Learning: https://t.co/b7e16ciHJb
🔰 R: https://t.co/0vPpyDTktw
🔰 Tableau: https://t.co/49cK6pBD97
🔰 PowerBI: https://t.co/4zDqoGtNpp
🔰 Data Science: Productivity Tools: https://t.co/FfHikzj7YG
🔰 Data Science: Probability: https://t.co/pB6TKDRzQ1
🔰 Mathematics: https://t.co/veawz2h2mA
🔰 Statistics: https://t.co/LNWiUkS3pb
🔰 Data Visualization: https://t.co/MzbkixW4qE
🔰 Machine Learning: https://t.co/PpSeWBfMOA
🔰 Deep Learning: https://t.co/sijDAY1Ses
🔰 Data Science: Linear Regression: https://t.co/OlSb3uGTlc
🔰 Data Science: Wrangling: https://t.co/omMFGEFn0t
🔰 Linear Algebra: https://t.co/uHHcXyFUPa
🔰 Probability: https://t.co/o0bIqaxQ7G
🔰 Introduction to Linear Models and Matrix Algebra: https://t.co/mQTUHDhsuF
🔰 Data Science: Capstone: https://t.co/S8t1VLpe7D
🔰 Data Analysis: https://t.co/Sv4yrbDD6f


Guide for Data Scientist:

Programming: Python, R, Java, SQL
Math Fundamentals: Statistics, Linear Algebra, Differential Calculus, Discrete Math
Data Analysis: Feature Engineering, Data wrangling, EDA
Machine Learning: Classification, Regression, Reinforcement Learning, Deep Learning, Dimensionality Reduction, Clustering
Web Scraping: Beautiful Soup, Scrapy, urllib
Visualization: Tableau, D3.js, Scatter Plot, Power BI, ggplot2
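The Regression item under Machine Learning can be illustrated with a small sketch: ordinary least squares for a single feature, written in plain Python. The `fit_line` helper and the hours/scores data are hypothetical; in real work a library such as scikit-learn would normally do this.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for an unseen input
print(f"slope={slope:.2f} intercept={intercept:.2f} predicted={predicted:.1f}")
```

The same fit-then-predict pattern carries over to the other supervised methods in the list; only the model behind it changes.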

Data Scientist	Data Analyst
Scrub and retrieve information	Examine both historical and current patterns
Data collection and statistical analysis	Create operational and financial reports
Deep learning framework training and development	Perform forecasting in tools such as Excel (Help by Matt Dancho)
Create architecture that can manage large amounts of data	Design infographics
Develop automation that streamlines data gathering and processing	Interpret data and communicate clearly
Present insights to the executive team and assist with data-driven decision making	Perform data screening by analyzing documents and fixing data corruption

Plan	Skills
Machine Learning	Supervised Classification, Supervised Regression, Unsupervised Clustering, Dimensionality Reduction, Local Interpretable Model Explanation - H2O Automatic Machine Learning, parsnip (XGBoost, SVM, Random Forest, GLM), K-Means, UMAP, recipes, lime
Data Visualization	Interactive and Static Visualizations, ggplot2 and plotly
Data Wrangling & Cleaning	Working with outliers, missing data, reshaping data, aggregation, filtering, selecting, calculating, and many more critical operations, dplyr and tidyr packages
Data Preprocessing & Feature Engineering	Preparing data for machine learning, Engineering Features (dates, text, aggregates), recipes package
Time Series	Working with date/datetime data, aggregating, transforming, visualizing time series, timetk package
Forecasting	ARIMA, Exponential Smoothing, Prophet, Machine Learning (XGBoost, Random Forest, GLMnet, etc.), Deep Learning (GluonTS), Ensembles, Hyperparameter Tuning, Scaling to 1000s of forecasts, modeltime package
Text	Working with text data, stringr
NLP	Machine learning, Text Features
Functional Programming	Making reusable functions, sourcing code
Iteration	Loops and Mapping, using purrr package
Reporting	R Markdown, Interactive HTML, Static PDF
Applications	Building Shiny web applications, Flexdashboard, Bootstrap
Deployment	Cloud (AWS, Azure, GCP), Docker, Git
Databases	SQL (for data import), MongoDB (for apps)
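
The Forecasting row mentions Exponential Smoothing. The table itself points to R packages, so the sketch below is only a Python illustration of the underlying idea (simple exponential smoothing), not the modeltime API. The `exponential_smoothing` function, the `alpha` value, and the sales figures are all hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    Returns the smoothed levels; the last level serves as the
    one-step-ahead forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # initialise with the first observation
    for value in series[1:]:
        # New level = weighted mix of the new observation and the old level.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly sales figures.
sales = [100, 110, 105, 120, 115]
levels = exponential_smoothing(sales, alpha=0.5)
forecast = levels[-1]
print(levels, forecast)
```

A larger `alpha` reacts faster to recent observations, while a smaller one gives a smoother, slower-moving forecast.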

operational-efficiency Data Science

Best YouTube channels for Data Science: