Complete Data Science Roadmap from noob to expert.

Welcome to the Data Science course! Over the next 100 days, you will learn a wide range of topics related to Python programming, data science, and machine learning. These topics will be covered in a variety of posts, so be sure to bookmark this page and follow me here and on GitHub for updates.

Throughout the course, you will have the opportunity to work with real-world data sets and apply the concepts you have learned to solve practical problems. You will also find exercises in each post that you can practice to further solidify your understanding of the material. All materials and exercises will be available on the GitHub repository linked below.

GitHub link: Complete-Data-Science-Bootcamp

By the end of the course, you will have a strong foundation in data science and be well-prepared to pursue further study or a career in the field. So let's get started!

Python
1. Python basics
  1. Input/Output
    - Printing to the console
    - Getting input from the user
  2. Operators
    - Arithmetic operators (e.g. +, -, *, /)
    - Comparison operators (e.g. ==, !=, >, <)
    - Logical operators (e.g. and, or, not)
  3. Operations
    - Working with variables
    - Data types (e.g. int, float, str)
    - Type conversion
    - Basic string manipulation (e.g. indexing, slicing, concatenation)
2. Python data structures
  1. list
    - Creating and accessing lists
    - Modifying lists (e.g. adding, removing, and sorting elements)
    - Looping through lists
  2. tuple
    - Creating and accessing tuples
    - Tuple immutability (e.g. why elements cannot be added or removed in place)
    - Looping through tuples
  3. set
    - Creating and accessing sets
    - Modifying sets (e.g. adding, removing, and intersecting elements)
    - Looping through sets
  4. dictionary
    - Creating and accessing dictionaries
    - Modifying dictionaries (e.g. adding, removing, and updating key-value pairs)
    - Looping through dictionaries
3. Python fundamentals
  1. loops
    - For loops
    - While loops
    - Break and continue statements
  2. functions
    - Defining and calling functions
    - Parameters and arguments
    - Return values
  3. objects and classes
    - Defining classes and objects
    - Constructors and destructors
    - Inheritance
    - Method overloading and overriding
4. Pandas
  - Introduction to Pandas library
  - Loading and saving data with Pandas
  - Working with DataFrames and Series
  - Manipulating and cleaning data with Pandas
5. NumPy
  - Introduction to NumPy library
  - Creating and accessing arrays
  - Array operations (e.g. reshaping, slicing, and element-wise operations)
  - Mathematical and statistical functions
6. Matplotlib
  - Introduction to Matplotlib library
  - Creating basic plots (e.g. line, scatter, and bar plots)
  - Customizing plots (e.g. labels, titles, and legends)
  - Saving and showing plots
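
As a small taste of the data structures topics above, here is a minimal sketch using only built-in Python. All variable names and values are invented for illustration. It shows that lists, sets, and dictionaries can be changed in place, while a tuple cannot, so "changing" a tuple really means building a new one.

```python
# Illustrative values only; every name below is made up for this sketch.
scores = [72, 88, 95]          # list: ordered and mutable
scores.append(60)              # add an element
scores.sort()                  # sort in place

point = (3, 4)                 # tuple: ordered but immutable
try:
    point[0] = 10              # tuples do not support item assignment
except TypeError:
    tuple_is_immutable = True
moved = point + (5,)           # concatenation builds a new tuple

seen = {"a", "b"}              # set: unordered, unique elements
seen.add("b")                  # adding a duplicate changes nothing
common = seen & {"b", "c"}     # intersection of two sets

ages = {"ana": 31}             # dictionary: key-value pairs
ages["ben"] = 27               # add a new pair
ages.update({"ana": 32})       # update an existing key

# Loop through the dictionary's key-value pairs.
for name, age in ages.items():
    print(name, age)
```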
SQL
- Introduction to Structured Query Language (SQL)
- Creating and modifying databases and tables
- Selecting, filtering, and sorting data
- Grouping and aggregating data
- Joining tables
- Subqueries and views
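
To give a feel for the SQL topics above, the sketch below runs a few queries from Python using the standard-library sqlite3 module and an in-memory database. The tables, columns, and rows are hypothetical and exist only for this example; the course itself may use a different database.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create two tables and insert a few made-up rows.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 1, 35.0), (3, 2, 10.0)],
)

# Selecting, filtering, and sorting.
large_orders = cur.execute(
    "SELECT id, amount FROM orders WHERE amount > 15 ORDER BY amount DESC"
).fetchall()

# Joining tables, then grouping and aggregating.
totals = cur.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(large_orders)
print(totals)
conn.close()
```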
Maths Refresher
1. Statistics
  - Mean, median, mode
  - Range, variance, standard deviation
  - Percentiles and quartiles
  - Z-scores
2. Probability
  - Basic probability concepts (e.g. events, sample space, and probability)
  - Conditional probability and independence
  - Bayes' theorem
3. Linear algebra
  - Vectors and matrices
  - Matrix operations (e.g. addition, multiplication, and transposition)
  - Determinants and inverses
4. Calculus
  - Limits and continuity
  - Derivatives
  - Integrals
  - Fundamental theorem of calculus
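
The statistics topics above can be tried out with Python's standard-library statistics module. This is a minimal sketch on a made-up sample. It assumes population (not sample) variance and standard deviation, and uses the module's default quartile method; other tools may compute quartiles slightly differently.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample for illustration

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
value_range = max(data) - min(data)

# Population variance and standard deviation (divide by n, not n - 1).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Quartiles: the three cut points that split the data into four groups.
quartiles = statistics.quantiles(data, n=4)

# Z-score: how many standard deviations each value lies from the mean.
z_scores = [(x - mean) / std_dev for x in data]

print(mean, median, mode, value_range, variance, std_dev)
print(quartiles)
print(z_scores)
```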
Python for data science
1. Jupyter Notebook and Google Colab walkthrough
  - Introduction to Jupyter notebooks and Google Colab
  - Creating and running cells
  - Importing and exporting notebooks
2. Python data science libraries
  - Introduction to popular data science libraries (e.g. Scikit-learn, TensorFlow, and Keras)
  - Installing and importing libraries
3. Exploratory data analysis
  1. Visualization
    - Introduction to Matplotlib and Seaborn
    - Plotting distributions, scatterplots, and boxplots
    - Customizing plots
  2. Summary statistics
    - Calculating basic statistics (e.g. mean, median, and standard deviation)
    - Generating descriptive statistics with Pandas
  3. Correlation analysis
    - Calculating and interpreting correlations
    - Visualizing correlations with scatterplots
  4. Data cleaning
    - Handling missing values
    - Removing outliers
    - Normalizing and standardizing data
  5. Dimension reduction
    - Introduction to dimension reduction techniques (e.g. PCA and t-SNE)
    - Implementing and interpreting dimension reduction in Python
  6. Anomaly detection
    - Introduction to anomaly detection techniques (e.g. isolation forests and local outlier factor)
    - Implementing and interpreting anomaly detection in Python
  7. Feature engineering
    - Introduction to feature engineering
    - Creating new features from existing data
    - Selecting relevant features for model building
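
The data cleaning steps listed above can be sketched in plain Python. In practice these steps are usually done with Pandas, but this version sticks to the standard library so that it runs anywhere. The readings are invented, and the specific choices (median fill, the 1.5 × IQR rule, min-max scaling) are common conventions assumed here, not the only options.

```python
import statistics

# Made-up readings with two missing values and one obvious outlier.
raw = [12.0, None, 15.0, 14.0, 13.0, 120.0, 16.0, None, 15.0]

# 1. Handle missing values: fill None with the median of the observed values.
observed = [x for x in raw if x is not None]
fill_value = statistics.median(observed)
filled = [fill_value if x is None else x for x in raw]

# 2. Remove outliers: keep values within 1.5 * IQR of the quartiles.
q1, _, q3 = statistics.quantiles(filled, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [x for x in filled if low <= x <= high]

# 3. Normalize: rescale the remaining values to the range [0, 1].
smallest, largest = min(cleaned), max(cleaned)
normalized = [(x - smallest) / (largest - smallest) for x in cleaned]

print(cleaned)
print(normalized)
```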
Machine learning
1. Introduction
  - Definition and types of machine learning
  - Differences between supervised, unsupervised, and reinforcement learning
2. Supervised learning
  - Regression and classification algorithms
  - Evaluation metrics for regression and classification models (e.g. mean squared error and accuracy)
3. Classification
  - K-nearest neighbors (KNN)
  - Logistic regression
  - Support vector machines (SVM)
4. Decision trees
  - Introduction to decision trees
  - Implementing decision trees in Python
  - Visualizing decision trees
5. Time series prediction
  - Introduction to time series data
  - Moving average and exponential smoothing models
  - Autoregressive integrated moving average (ARIMA) model
6. Unsupervised learning
  - Clustering algorithms (e.g. k-means and hierarchical clustering)
  - Evaluation metrics for clustering (e.g. silhouette score and Calinski-Harabasz index)
7. Some projects (5-8)
  - Suggested projects to apply machine learning concepts (e.g. building a spam detector or a customer segmentation model)
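
To make the classification topic above more concrete, here is a minimal from-scratch sketch of k-nearest neighbors using only the standard library. The function name `knn_predict`, the toy points, and the labels are all hypothetical. A real project would more likely use a library such as Scikit-learn, and would evaluate the model on held-out data.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and pick the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data, invented for illustration: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (2.0, 2.0)))
print(knn_predict(points, labels, (8.0, 9.0)))
```

The choice of k=3 is arbitrary here; an odd k avoids tied votes when there are two classes.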
Tableau
- Introduction to Tableau
- Connecting to and importing data
- Working with data in Tableau
- Creating and customizing visualizations
- Dashboarding and storytelling with Tableau
- Advanced techniques (e.g. calculated fields, parameters, and table calculations)
- Exporting and publishing dashboards

The same roadmap in tabular form:

Module	Topic	Sub-Topic	Content
Python	Python basics	Input/Output	Printing to the console
			Getting input from the user
		Operators	Arithmetic operators (e.g. +, -, *, /)
			Comparison operators (e.g. ==, !=, >, <)
			Logical operators (e.g. and, or, not)
		Operations	Working with variables
			Data types (e.g. int, float, str)
			Type conversion
			Basic string manipulation (e.g. indexing, slicing, concatenation)
	Python data structures	list	Creating and accessing lists
			Modifying lists (e.g. adding, removing, and sorting elements)
			Looping through lists
		tuple	Creating and accessing tuples
			Tuple immutability (e.g. why elements cannot be added or removed in place)
			Looping through tuples
		set	Creating and accessing sets
			Modifying sets (e.g. adding, removing, and intersecting elements)
			Looping through sets
		dictionary	Creating and accessing dictionaries
			Modifying dictionaries (e.g. adding, removing, and updating key-value pairs)
			Looping through dictionaries
	Python fundamentals	loops	For loops
			While loops
			Break and continue statements
		functions	Defining and calling functions
			Parameters and arguments
			Return values
		objects and classes	Defining classes and objects
			Constructors and destructors
			Inheritance
			Method overloading and overriding
	Pandas		Introduction to Pandas library
			Loading and saving data with Pandas
			Working with DataFrames and Series
			Manipulating and cleaning data with Pandas
	NumPy		Introduction to NumPy library
			Creating and accessing arrays
			Array operations (e.g. reshaping, slicing, and element-wise operations)
			Mathematical and statistical functions
	Matplotlib		Introduction to Matplotlib library
			Creating basic plots (e.g. line, scatter, and bar plots)
			Customizing plots (e.g. labels, titles, and legends)
			Saving and showing plots
SQL			Introduction to Structured Query Language (SQL)
			Creating and modifying databases and tables
			Selecting, filtering, and sorting data
			Grouping and aggregating data
			Joining tables
			Subqueries and views
Maths Refresher	Statistics		Mean, median, mode
			Range, variance, standard deviation
			Percentiles and quartiles
			Z-scores
	Probability		Basic probability concepts (e.g. events, sample space, and probability)
			Conditional probability and independence
			Bayes' theorem
	Linear algebra		Vectors and matrices
			Matrix operations (e.g. addition, multiplication, and transposition)
			Determinants and inverses
	Calculus		Limits and continuity
			Derivatives
			Integrals
			Fundamental theorem of calculus
Python for data science	Jupyter Notebook and Google Colab walkthrough		Introduction to Jupyter notebooks and Google Colab
			Creating and running cells
			Importing and exporting notebooks
	Python data science libraries		Introduction to popular data science libraries (e.g. Scikit-learn, TensorFlow, and Keras)
			Installing and importing libraries
	Exploratory data analysis	Visualization	Introduction to Matplotlib and Seaborn
			Plotting distributions, scatterplots, and boxplots
			Customizing plots
		Summary statistics	Calculating basic statistics (e.g. mean, median, and standard deviation)
			Generating descriptive statistics with Pandas
		Correlation analysis	Calculating and interpreting correlations
			Visualizing correlations with scatterplots
		Data cleaning	Handling missing values
			Removing outliers
			Normalizing and standardizing data
		Dimension reduction	Introduction to dimension reduction techniques (e.g. PCA and t-SNE)
			Implementing and interpreting dimension reduction in Python
		Anomaly detection	Introduction to anomaly detection techniques (e.g. isolation forests and local outlier factor)
			Implementing and interpreting anomaly detection in Python
		Feature engineering	Introduction to feature engineering
			Creating new features from existing data
			Selecting relevant features for model building
Machine learning	Introduction		Definition and types of machine learning
			Differences between supervised, unsupervised, and reinforcement learning
	Supervised learning		Regression and classification algorithms
			Evaluation metrics for regression and classification models (e.g. mean squared error and accuracy)
	Classification		K-nearest neighbors (KNN)
			Logistic regression
			Support vector machines (SVM)
	Decision trees		Introduction to decision trees
			Implementing decision trees in Python
			Visualizing decision trees
	Time series prediction		Introduction to time series data
			Moving average and exponential smoothing models
			Autoregressive integrated moving average (ARIMA) model
	Unsupervised learning		Clustering algorithms (e.g. k-means and hierarchical clustering)
			Evaluation metrics for clustering (e.g. silhouette score and Calinski-Harabasz index)
	Some projects (5-8)		Suggested projects to apply machine learning concepts (e.g. building a spam detector or a customer segmentation model)
Tableau			Introduction to Tableau
			Connecting to and importing data
			Working with data in Tableau
			Creating and customizing visualizations
			Dashboarding and storytelling with Tableau
			Advanced techniques (e.g. calculated fields, parameters, and table calculations)
			Exporting and publishing dashboards

I hope you enjoy learning data science with me! By the end of this course, you should have a strong foundation in Python programming, SQL, the underlying maths, data science with Python, machine learning, and Tableau. That should leave you well prepared to pursue further study or a career in the field, and I encourage you to keep learning and stay up to date with new developments in data science.

Thank you for joining me on this journey. I hope you will keep following along for future updates and learning opportunities. Don't forget to check out the GitHub repository linked below for all materials and exercises. I look forward to seeing what you accomplish with your new skills!

GitHub link: Complete-Data-Science-Bootcamp