Category Data Science

Dimensionality Reduction

In machine learning, dimensionality is the number of input variables in a dataset. The more input variables there are, the harder predictive analysis becomes for the model. This phenomenon is called the curse of dimensionality. Hence, the number of…

Fact Extensions

In MicroStrategy, facts are the objects that store actual data values in the data warehouse at a specified business level. Whenever a fact definition needs to be extended or changed beyond its warehouse level, that is the role of Fact…

Basic Data Cleaning techniques

Data cleaning can be a monotonous part of a machine learning project. The steps and techniques for data cleaning differ from one dataset to the next. Without proper data, it is time-consuming to see the actually important parts in…

Linear Regression

What is Linear Regression? Linear regression is a regression technique that determines the linear relationship between a dependent variable and one or more independent variables. It can be classified into simple (one independent variable) and multiple linear regression (two…
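As a rough illustration of the simple (one independent variable) case, the sketch below fits a line by ordinary least squares using only the Python standard library. The function name `fit_simple_linear_regression` and the sample data are hypothetical and chosen for this example; the full post may use a different method or library.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)
```

With more than one independent variable, the same idea generalizes to multiple linear regression, which is usually solved with a linear algebra library rather than by hand.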

Cumulative Accuracy Profile

Introduction: The Cumulative Accuracy Profile (CAP) curve is used to evaluate the performance of classification algorithms. It plots the cumulative number of observations with the required property (on the y-axis) against the corresponding cumulative number of the total population (on the x-axis). Characteristics: Area between the curve and line…
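One way to compute the points of such a curve is sketched below, assuming the classifier produces a score for each observation and that observations are ranked from highest to lowest score. The function name `cap_curve_points` and the sample labels and scores are hypothetical.

```python
def cap_curve_points(y_true, scores):
    """Return (population_count, positives_captured) pairs for a CAP curve.

    y_true holds 1 for observations with the required property and 0 otherwise;
    scores holds the model's score for each observation.
    """
    # Rank observations from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0, 0)]
    captured = 0
    for rank, i in enumerate(order, start=1):
        captured += y_true[i]
        points.append((rank, captured))
    return points


# Hypothetical labels and scores for four observations.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
points = cap_curve_points(y_true, scores)
print(points)
```

Plotting the second element of each pair against the first gives the model's curve; a random model would follow the straight line from the origin to the final point.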

Understanding Plotly

Plotly: Plotly is an organization that makes visualization tools, including a Python API library. The Plotly Python library can be used to create interactive graphs easily, quickly, and efficiently. The Plotly library is declarative and the best alternative…

Understand the Box Plot

Box Plot: A box plot is a graph that presents data through a five-number summary. It is used to show the distribution and the notable observations in a dataset. In this post, we take a brief look at…
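The five values a box plot draws are conventionally the minimum, first quartile, median, third quartile, and maximum. The sketch below computes them with the standard library's `statistics.quantiles`; the helper name `five_number_summary`, the sample data, and the choice of the "inclusive" quartile method are assumptions for this example, and plotting libraries may use a slightly different quartile convention.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers."""
    # n=4 splits the data into quartiles; "inclusive" treats the data
    # as the whole population rather than a sample.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


# Hypothetical sample data.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(data)
print(summary)
```

The box spans Q1 to Q3, the line inside it marks the median, and the whiskers extend toward the minimum and maximum (often capped at 1.5 times the interquartile range, with points beyond that drawn as outliers).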