Data Filtering
Filtering involves removing irrelevant or unnecessary data from a dataset to reduce noise and focus on the most relevant information.
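A minimal sketch of filtering, assuming hypothetical sensor records where a sentinel value marks failed reads and anything outside an assumed valid range of 0-100 counts as noise:

```python
# Hypothetical sensor readings; -999.0 is an assumed sentinel for a failed read.
readings = [
    {"sensor": "a", "value": 21.5},
    {"sensor": "b", "value": -999.0},
    {"sensor": "c", "value": 19.8},
]

# Keep only plausible values (assumed valid range: 0-100).
filtered = [r for r in readings if 0 <= r["value"] <= 100]
```

The same idea scales up directly to boolean masks in pandas or NumPy.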
Data Validation
Data validation aims to check that data adheres to defined rules and constraints, identifying and correcting inconsistencies.
Data Deduplication
Data deduplication involves eliminating duplicate records from a dataset, ensuring that each record is unique.
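A simple deduplication sketch that keeps the first occurrence of each key (the key field name is an assumption; real pipelines often deduplicate on a composite of fields):

```python
def dedupe(records, key):
    """Keep the first occurrence of each key value; drop later duplicates."""
    seen = set()
    unique = []
    for r in records:
        k = r[key]
        if k not in seen:
            seen.add(k)
            unique.append(r)
    return unique
```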
Data Encoding
Data encoding involves converting categorical data into a numerical format to make it compatible with machine learning algorithms.
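One common encoding scheme is one-hot encoding; a minimal hand-rolled version (libraries such as scikit-learn provide production implementations):

```python
def one_hot(values):
    """Map each category to a binary indicator vector over the sorted categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]
```

For high-cardinality features, alternatives such as ordinal or target encoding avoid the column blow-up that one-hot encoding causes.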
Data Imputation
Data imputation entails replacing missing or null values with estimated values to maintain data integrity.
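A sketch of the simplest strategy, mean imputation, with `None` standing in for missing values (more sophisticated options include median, mode, or model-based imputation):

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```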
Data Aggregation
Data aggregation entails grouping data by category, time period, or another criterion to obtain summarized statistics.
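A small grouped-sum example over dict records (the field names are placeholders; a real pipeline would typically use `pandas.DataFrame.groupby`):

```python
from collections import defaultdict

def aggregate_sum(records, group_key, value_key):
    """Sum value_key per distinct group_key value."""
    totals = defaultdict(float)
    for r in records:
        totals[r[group_key]] += r[value_key]
    return dict(totals)
```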
Data Standardization
Standardizing data involves putting all data into a common format to facilitate comparison and analysis.
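For numeric data, a common standardization is the z-score, which rescales values to zero mean and unit variance; a minimal sketch:

```python
def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores)."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]
```

Standardization for text data (e.g. normalizing date formats or units) follows the same principle of mapping everything onto one agreed representation.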
Data Sampling
Data sampling is the process of selecting a representative subset of data to expedite analysis while preserving the overall characteristics of the full dataset.
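A simple random sample with a fixed seed, so the subset is reproducible (seeding is an assumption about the workflow; stratified sampling would better preserve class proportions):

```python
import random

def sample(data, k, seed=0):
    """Draw a simple random sample of k items, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(data, k)
```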
Data Transformation
Data transformation involves modifying existing data to make it more suitable for analysis or modeling.
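One frequent transformation is a log transform to compress a right-skewed numeric feature; a minimal sketch using `log1p` so zero values remain valid:

```python
import math

def log_transform(values):
    """Apply log(1 + v) to compress a right-skewed distribution."""
    return [math.log1p(v) for v in values]
```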
Outlier Detection
Outlier detection is the process of identifying and managing values that significantly deviate from the rest of the data, often by treating or removing them.
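A classic detection rule is Tukey's fences: flag anything more than 1.5 interquartile ranges beyond the quartiles. A self-contained sketch (quantiles computed by linear interpolation):

```python
def iqr_outliers(values):
    """Flag values outside 1.5 * IQR of the quartiles (Tukey's fences)."""
    s = sorted(values)

    def quantile(p):
        # Linear interpolation between the two nearest order statistics.
        i = p * (len(s) - 1)
        lo = int(i)
        hi = min(lo + 1, len(s) - 1)
        frac = i - lo
        return s[lo] * (1 - frac) + s[hi] * frac

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]
```

Whether flagged values are then removed, capped, or kept depends on whether they are data errors or genuine extreme observations.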
Data Cleansing
Data cleansing is a process that encompasses the application of multiple techniques to ensure data accuracy, completeness, and compliance with standards.
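Because cleansing combines several of the techniques above, it is often written as a small pipeline; a sketch for a hypothetical list of names that trims whitespace, normalizes case, and drops empty and duplicate entries in one pass:

```python
def cleanse(names):
    """Trim whitespace, normalize case, and drop empty or duplicate entries."""
    seen = set()
    out = []
    for n in names:
        n = n.strip().title()   # standardize formatting
        if n and n not in seen: # filter empties, deduplicate
            seen.add(n)
            out.append(n)
    return out
```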
Data Profiling
Data profiling involves in-depth analysis of data to understand its structure, characteristics, and quality.
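A tiny profiler over dict records that reports, per field, how many values are present, missing, and distinct (real tools such as pandas' `describe` go much further, covering types, ranges, and distributions):

```python
def profile(records):
    """Report count, missing values, and distinct values per field."""
    fields = {k for r in records for k in r}
    report = {}
    for f in sorted(fields):
        values = [r.get(f) for r in records]
        present = [v for v in values if v is not None]
        report[f] = {
            "count": len(present),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report
```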