If you want to plot trend lines in the RadHtmlChart for ASP.NET AJAX control and get a basic understanding of statistics terms such as linear and logarithmic regression lines, the ordinary least squares method and the r-squared measure, read this blog post to the end.

What is a Linear Regression?

The term “linear regression” (Figure 1) refers to an approach in statistics for modelling the relationship between a dependent variable (usually denoted by Y) and one or more independent variables (usually denoted by X).

Regression lines are also known as trend lines, and there are different types such as linear, logarithmic, exponential and so on. Because linear regression with a single independent variable is the basic case of regression modelling, this is the one we will examine.

Figure 1: A sample linear regression plot in MS Excel that shows the relationship between the quantity and yield of bonds.

You can compare the chart from MS Excel above (Figure 1) with our final result from the Telerik ASP.NET Chart at the end of the blog post (Figure 3).

Use Cases of the Linear Regression

Linear regressions can be used to forecast/predict values by developing a model over a data set of observations. Say you have the results from a student survey about the number of cups of coffee drunk before attending an exam and the marks received for the exam. Now you can look for a potential relationship between both variables, so you can predict the exam mark of a student who drinks "n" cups of coffee.

In the case of a multiple regression (one with more than one independent variable), you can also determine which independent variables have an effect on the dependent one and which do not. For example, you can develop a bank score card that illustrates which characteristics (for example, age, salary, marital status and so on) are related to the solvency of a potential creditor, and how.

Before proceeding further, we will also cover another statistics term that is tightly related to the linear regression subject: the ordinary least squares method.

The Ordinary Least Squares Method (OLS)

The purpose of the OLS method is to estimate the parameters (α, the slope, and β, the intercept) of the linear regression model (y = α*x + β) so that the sum of squared residuals (i.e., the sum of the squared differences between the observed and predicted responses in a data set) is minimal.

Imagine we have a set of scattered points and we draw a straight line through them. Our goal is to find the line that minimizes the sum of the squared vertical distances between the points and the line. The α and β estimates that achieve this minimum are given by the OLS formulas.
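The figure with the formulas is not reproduced in this text. For reference, the standard closed-form OLS estimates for a single independent variable, written in this post's notation (α is the slope, β is the intercept, n is the number of points), are:

```latex
\[
S(\alpha, \beta) = \sum_{i=1}^{n} \bigl( y_i - (\alpha x_i + \beta) \bigr)^2
\]
\[
\alpha = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
              {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2},
\qquad
\beta = \bar{y} - \alpha \, \bar{x}
\]
```

Here S is the sum of squared residuals that OLS minimizes, and the bars denote the averages Avg(X) and Avg(Y).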

To get a better understanding of the above formulae, we will create a sample data set of (X,Y) points in MS Excel and calculate the corresponding parameters as illustrated in Table 1.

Table 1: Ordinary least squares method formulas in MS Excel with a sample data set.

For the moment, we will pay attention to the steps that create only the left side of the table:

Create these columns: X*Y; X^2; Y^2

Calculate these values: Sum(X); Sum(Y); Sum(X*Y); Sum(X^2); Sum(Y^2); Avg(X); Avg(Y)
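The steps above, together with the slope and intercept they feed into, can be sketched in C# roughly as follows. This is a minimal illustration and not the listing from the demo's RegressionModels.cs file: the OlsSketch class and its Fit method are hypothetical names, and plain arrays stand in for the DataTable columns.

```csharp
using System;

// Hypothetical helper for illustration; not part of the demo's code library.
public static class OlsSketch
{
    // Fits y = alpha * x + beta (alpha is the slope, beta is the intercept,
    // following the notation used in this post).
    public static void Fit(double[] x, double[] y, out double alpha, out double beta)
    {
        if (x == null || y == null || x.Length != y.Length || x.Length < 2)
            throw new ArgumentException("x and y must have the same length (at least 2 points).");

        int n = x.Length;
        double sumX = 0, sumY = 0, sumXY = 0, sumX2 = 0;

        // Same sums as in the left part of Table 1.
        for (int i = 0; i < n; i++)
        {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumX2 += x[i] * x[i];
        }

        double denominator = n * sumX2 - sumX * sumX;
        if (denominator == 0)
            throw new ArgumentException("All x-values are identical; the slope is undefined.");

        alpha = (n * sumXY - sumX * sumY) / denominator;
        beta = sumY / n - alpha * (sumX / n); // Avg(Y) - alpha * Avg(X)
    }
}
```

Sum(Y^2) is not needed for the slope and intercept, so the sketch leaves it out. The method uses out parameters rather than tuples so that it also compiles with older C# compilers.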

Ok, we have created the regression model and can proceed further by determining how good it is, thanks to the r-squared measure.

R-Squared

R-squared, a.k.a. the coefficient of determination, is a statistical measure that illustrates how well the regression model fits the data. The coefficient ranges from 0 to 1, where values close to 1 indicate a good model (the variability of X explains the variability of Y well) while values close to 0 indicate the contrary.

You can see how the coefficient is calculated in Figure 2.

Figure 2: Formulas of the r-squared measure.
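The figure itself is not reproduced in this text. Written out with the quantities used in Table 1 (SST, SSE and the estimated value Yest, denoted here by ŷ), the standard definition is:

```latex
\[
\hat{y}_i = \alpha x_i + \beta, \qquad
SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
\[
R^2 = 1 - \frac{SSE}{SST}
\]
```

SST measures the total variability of Y around its average, and SSE measures what the regression line leaves unexplained.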

Now we can go back to Table 1, this time looking at its right part, which is responsible for the r-squared calculation.

Create these columns: (Y-Avg(Y))^2; Yest: α*X+β; (Y-Yest)^2

Calculate these values: SST: Sum[(Y-Avg(Y))^2]; SSE: Sum[(Y-Yest)^2]

Calculate: R^2 = 1-(SSE/SST)

And the C# code analogue:

Example 2: A part of the OrdinaryLeastSquares method that shows how to calculate the coefficient of determination.
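The original listing is not reproduced in this text. As a stand-in, here is a minimal sketch of the same three steps (SST, SSE, then R^2), assuming the slope α and intercept β have already been estimated. RSquaredSketch and Compute are hypothetical names and are not the author's OrdinaryLeastSquares method.

```csharp
using System;

// Hypothetical helper for illustration; not part of the demo's code library.
public static class RSquaredSketch
{
    // Returns R^2 = 1 - SSE / SST for the model y = alpha * x + beta.
    public static double Compute(double[] x, double[] y, double alpha, double beta)
    {
        if (x == null || y == null || x.Length != y.Length || x.Length == 0)
            throw new ArgumentException("x and y must be non-empty and have the same length.");

        int n = x.Length;
        double sumY = 0;
        for (int i = 0; i < n; i++)
            sumY += y[i];
        double avgY = sumY / n;

        double sst = 0; // Sum[(Y - Avg(Y))^2]
        double sse = 0; // Sum[(Y - Yest)^2]
        for (int i = 0; i < n; i++)
        {
            double yEst = alpha * x[i] + beta;
            sst += (y[i] - avgY) * (y[i] - avgY);
            sse += (y[i] - yEst) * (y[i] - yEst);
        }

        if (sst == 0)
            throw new ArgumentException("All y-values are identical; R^2 is undefined.");

        return 1 - sse / sst;
    }
}
```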

Place the RegressionModels.cs file in the App_Code folder of your web app/site

Include the RegressionModels namespace in the code behind logic of your page

Call the CreateRegressionModel.Plot method and pass the required parameters:

HtmlChart: The RadHtmlChart instance

DataSource: The DataTable data source

DataFieldX: The name of the column in the data source that stores the x-values

DataFieldY: The name of the column in the data source that stores the y-values

RegressionModelType: The type of the regression model

Make Your Own Customizations

If you wonder what the RegressionModelType parameter is for, it is the entry point for the custom modifications you can make to this example. You can easily add support for a logarithmic regression by adding a field that calculates the Ln of X and using it instead of the original X field.
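A minimal sketch of that idea, assuming plain arrays rather than a DataTable field: the x-values are replaced by their natural logarithms and the same least squares formulas are applied, which yields the model y = α*ln(x) + β. LogRegressionSketch and Fit are hypothetical names and not part of the demo's code library.

```csharp
using System;

// Hypothetical helper for illustration; not part of the demo's code library.
public static class LogRegressionSketch
{
    // Fits y = alpha * ln(x) + beta by running ordinary least squares on (ln(x), y).
    public static void Fit(double[] x, double[] y, out double alpha, out double beta)
    {
        if (x == null || y == null || x.Length != y.Length || x.Length < 2)
            throw new ArgumentException("x and y must have the same length (at least 2 points).");

        int n = x.Length;
        double sumLnX = 0, sumY = 0, sumLnXY = 0, sumLnX2 = 0;

        for (int i = 0; i < n; i++)
        {
            if (x[i] <= 0)
                throw new ArgumentException("The natural logarithm requires positive x-values.");

            double lnX = Math.Log(x[i]); // the "Ln of X" field
            sumLnX += lnX;
            sumY += y[i];
            sumLnXY += lnX * y[i];
            sumLnX2 += lnX * lnX;
        }

        double denominator = n * sumLnX2 - sumLnX * sumLnX;
        if (denominator == 0)
            throw new ArgumentException("All x-values are identical; the slope is undefined.");

        alpha = (n * sumLnXY - sumLnX * sumY) / denominator;
        beta = sumY / n - alpha * (sumLnX / n);
    }
}
```

Because the model is still linear in ln(x), nothing else in the least squares or r-squared calculation needs to change; only the input column differs.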

I think it is high time we saw our final result in Figure 3:

Figure 3: A RadHtmlChart whose second series fits the data of the first series and displays the regression model in the legend.

Found the Example Useful?

We went through the basics of a popular approach in statistics, the simple linear regression, and illustrated how to integrate it in the Telerik chart for ASP.NET AJAX control. Please feel free to share your thoughts and feedback. The source code of the demo application is also available in the Plot Regression Models with RadHtmlChart code library.

Danail Vasilev is a Tech Support Engineer at Telerik’s ASP.NET AJAX Division, where he is mainly responsible for the RadHtmlChart, RadGauge and RadButton controls. He joined the company in 2012 and ever since has been responsible for providing help to customers of the Telerik UI for ASP.NET AJAX suite and improving the online resources. Apart from work, he likes swimming and reading books.