Introduction to Data Analytics

CRISP-DM (Cross-Industry Standard Process for Data Mining)

Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment

Business Understanding

Objective: To understand the project objectives and requirements from a business perspective
Task:
- Define the business goal
- Convert the business goal into a data goal

Data Understanding

Objective: To collect and explore the data to understand its structure and quality
Task:
- Gather initial data
- Describe the data
- Explore the data
- Verify data quality

Data Preparation

Objective: To prepare the data for modeling by cleaning, transforming, and organizing it
Task:
- Select relevant data
- Clean the data
- Construct new features
- Integrate data
- Format and structure the data
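
The preparation tasks above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records, the field names (`customer_id`, `age`, `spend`, `visits`, `note`) and the derived `spend_per_visit` feature are all hypothetical, and a real project would more likely use a data-frame library.

```python
# Hypothetical raw records; every field name and value is made up for illustration.
raw_rows = [
    {"customer_id": 1, "age": "34", "spend": "120.50", "visits": "4", "note": "vip"},
    {"customer_id": 1, "age": "34", "spend": "120.50", "visits": "4", "note": "vip"},  # duplicate
    {"customer_id": 2, "age": "", "spend": "80.00", "visits": "2", "note": ""},        # missing age
    {"customer_id": 3, "age": "51", "spend": "300.00", "visits": "10", "note": ""},
]


def prepare(rows):
    """Select, clean, format, and enrich the raw rows."""
    seen_ids = set()
    prepared_rows = []
    for row in rows:
        # Clean: skip duplicate customers and rows with a missing age.
        if row["customer_id"] in seen_ids or row["age"] == "":
            continue
        seen_ids.add(row["customer_id"])

        # Format: convert the string fields to numeric types.
        age = int(row["age"])
        spend = float(row["spend"])
        visits = int(row["visits"])

        # Select + construct: keep only the relevant fields (dropping "note")
        # and derive a new feature, average spend per visit.
        prepared_rows.append({
            "customer_id": row["customer_id"],
            "age": age,
            "spend": spend,
            "visits": visits,
            "spend_per_visit": spend / visits,
        })
    return prepared_rows


prepared = prepare(raw_rows)
for row in prepared:
    print(row)
```

Dropping rows with missing values is only one possible cleaning choice; imputing a value is another, and which is appropriate depends on the data.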

Modeling

Objective: To build and evaluate predictive models based on the prepared data
Task:
- Select appropriate modeling techniques
- Train models using the prepared data
- Evaluate model performance (for example, with precision and recall)
- Tune model parameters
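
Precision and recall, mentioned in the tasks above, can be computed directly from true and predicted labels. The sketch below uses made-up labels and a hypothetical helper name, `precision_recall`; precision is the share of predicted positives that are correct, and recall is the share of actual positives that were found.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```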

Evaluation

Objective: To assess the model's performance and ensure it meets business objectives
Task:
- Evaluate the model's result against business objectives
- Validate the model's effectiveness and reliability
- Review the process and results

Deployment

Objective: To implement the model in a real-world setting and monitor its performance
Task:
- Deploy the model into production
- Monitor and maintain the model's performance
- Update and refine the model

What is predictive analytics?

Data mining
Statistical inference
Machine Learning
Business Sense

Machine Learning

Supervised Learning: The model is trained on a labeled dataset. This means that for each input in the training set, the corresponding output (or label) is known. The goal is to learn a mapping from inputs to outputs so that the model can accurately predict the output for new, unseen data. Supervised learning always requires a labeled training dataset. Examples: predictive modeling, uplift modeling, recommender systems, sentiment analysis.
Unsupervised Learning: The model is trained on a dataset that does not contain labeled outputs. Instead, the model tries to find hidden patterns, structures, or relationships within the data without any explicit instructions on what to predict. Examples: association rule mining, clustering.

Forms of Predictive Analytics

Predictive Modeling
- Regression: Estimates relationships between variables to predict a continuous numerical outcome.
- Classification: Predicts discrete categories or classes, for example whether an email is spam, whether cells are cancerous, or which word was spoken. The output is a label or class from a set of predefined options.
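
As a minimal sketch of regression, the snippet below fits a straight line to a handful of made-up points with ordinary least squares. The function name `fit_line` and the data are assumptions for illustration; practical work would normally rely on a statistics or machine learning library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # continuous prediction for an unseen input
print(slope, intercept, prediction)
```

A classifier differs only in its output: instead of a number such as `prediction`, it returns one label from a fixed set.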
Clustering
- This technique groups similar data points together based on their inherent characteristics without predefined labels.
- K-means, hierarchical clustering, and density-based clustering are prominent algorithms.
- Used for: Customer segmentation, market basket analysis, identifying anomalies
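
To make the grouping idea concrete, here is a small k-means sketch on one-dimensional points. It is a simplified illustration, not a production implementation: the data, the fixed starting centroids, and the function name `kmeans_1d` are assumptions, and real implementations handle multiple dimensions and choose starting centroids more carefully.

```python
def kmeans_1d(points, initial_centroids, iterations=10):
    """Cluster 1-D points by alternating assignment and centroid update."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up values forming two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, initial_centroids=[1.0, 12.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

No labels are supplied at any point; the groups emerge only from the distances between points.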
Association Rule Mining
- Identifies relationships between variables in large datasets
- For example, market basket analysis predicts customer purchasing behavior by finding associations between products.
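
Two common measures for an association rule are support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both for a hypothetical rule, bread -> milk, over made-up shopping baskets.

```python
# Made-up shopping baskets; each set is one transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions with the antecedent, the fraction that also have the consequent."""
    both = count_containing(antecedent | consequent, transactions)
    return both / count_containing(antecedent, transactions)


rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)  # 0.6 0.75
```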
Recommender Systems
- Recommender systems are a type of predictive modeling and data filtering technology that aims to suggest items or content to users based on their preferences, behavior, or similarities with other users.
- These systems predict the relevance of items (such as products, movies, or articles) to a particular user, helping to personalize their experience by recommending things they are likely to be interested in.
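
One way to use "similarities with other users" is user-based collaborative filtering: predict a user's rating for an unseen item as a similarity-weighted average of other users' ratings. The sketch below assumes made-up ratings and cosine similarity; the names `ratings`, `cosine`, and `recommend` are hypothetical, and this is only one of several recommender approaches.

```python
import math

# Made-up ratings on a 1-5 scale: user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "D": 4},
    "carol": {"A": 1, "C": 5, "D": 2},
}


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[item] * v[item] for item in shared)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings):
    """Rank items the user has not rated by predicted rating, highest first."""
    weighted_sums, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score items the user has not rated yet
            weighted_sums[item] = weighted_sums.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {
        item: weighted_sums[item] / weights[item]
        for item in weighted_sums
        if weights[item] > 0
    }
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)


recommendations = recommend("alice", ratings)
print(recommendations)
```

Here the prediction for item "D" leans toward bob's rating, because bob's past ratings resemble alice's more closely than carol's do.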
Sentiment Analysis
- Sentiment analysis, also known as opinion mining, is a natural language processing (NLP) technique used to determine the emotional tone or sentiment expressed in a piece of text.
- It involves classifying text into categories such as positive, negative, or neutral, based on the underlying emotions or opinions conveyed by the words and phrases.
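
The classification into positive, negative, or neutral can be illustrated with a very simple lexicon-based sketch. The word lists and the function name `classify_sentiment` are assumptions made for this example; real sentiment analysis usually relies on trained models and handles negation, sarcasm, and context, which this sketch does not.

```python
# Tiny hand-made word lists, assumed for illustration only.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "poor", "sad"}


def classify_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon words."""
    # Normalize: split on whitespace, strip surrounding punctuation, lowercase.
    words = [word.strip(".,!?;:").lower() for word in text.split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_sentiment("I love this great phone!"))       # positive
print(classify_sentiment("Terrible battery, bad screen."))   # negative
print(classify_sentiment("The box arrived on Tuesday."))     # neutral
```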
Uplift Modeling
- Uplift modeling, also known as incremental modeling, is a predictive modeling technique used to estimate the causal impact of a specific action or treatment on an individual's behavior.
- Uplift models predict the difference in outcomes caused by an intervention (e.g., how likely a customer is to buy a product as a result of receiving a targeted marketing campaign).
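
The "difference in outcomes" can be sketched by comparing purchase rates between customers who received a campaign and those who did not, separately for each customer segment. The records and segment names below are made up, and the causal reading assumes the campaign was assigned at random; actual uplift models estimate this difference per individual using trained models rather than simple group averages.

```python
# Made-up records: (segment, received_campaign, purchased).
records = [
    ("new", True, True), ("new", True, True), ("new", True, False), ("new", True, True),
    ("new", False, True), ("new", False, False), ("new", False, False), ("new", False, False),
    ("returning", True, True), ("returning", True, True),
    ("returning", True, False), ("returning", True, False),
    ("returning", False, True), ("returning", False, True),
    ("returning", False, False), ("returning", False, False),
]


def purchase_rate(records, segment, treated):
    """Share of customers in a segment and treatment group who purchased."""
    outcomes = [bought for seg, got, bought in records if seg == segment and got == treated]
    return sum(outcomes) / len(outcomes)


def uplift_by_segment(records):
    """Treated purchase rate minus control purchase rate, per segment."""
    segments = sorted({seg for seg, _, _ in records})
    return {
        seg: purchase_rate(records, seg, True) - purchase_rate(records, seg, False)
        for seg in segments
    }


uplift = uplift_by_segment(records)
print(uplift)  # {'new': 0.5, 'returning': 0.0}
```

In this made-up data the campaign appears to change behavior only for new customers, which is the kind of distinction uplift modeling is meant to surface.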
