Categories
data science FAQ's

What is a foreign key?

A foreign key is a column (or set of columns) in one table that refers to the primary key (or another unique key) of another table. To create a relationship between the two tables, we declare the foreign key as a reference to the other table's primary key, so every foreign-key value must match an existing row in the referenced table or be NULL.
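As a rough illustration, the sketch below uses Python's built-in sqlite3 module. The `customers` and `orders` tables and their columns are made-up names for this example, and SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

def demo_foreign_key():
    """Show a foreign key rejecting a row that points at a missing parent."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical tables: orders.customer_id references customers.id.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER REFERENCES customers(id),"
        " item TEXT)"
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
    conn.execute("INSERT INTO orders VALUES (10, 1, 'book')")  # parent row exists
    try:
        # Customer 99 does not exist, so this insert should be refused.
        conn.execute("INSERT INTO orders VALUES (11, 99, 'pen')")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return rejected, order_count
```

Calling `demo_foreign_key()` reports whether the orphan order was rejected and how many orders were actually stored.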

Difference between delete and truncate commands

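The DELETE command removes rows from a table, usually together with a WHERE clause that selects which rows to delete; without a WHERE clause it removes every row. Inside a transaction, this action can be rolled back.

TRUNCATE, by contrast, removes all the rows of a table at once and does not accept a WHERE clause. It is typically faster than DELETE, and in many database systems (for example MySQL and Oracle) it cannot be rolled back. Some systems, such as PostgreSQL and SQL Server, do allow it to be rolled back inside a transaction.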
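The rollback behaviour of DELETE can be sketched with Python's built-in sqlite3 module. The table `t` and its contents are invented for the example. SQLite has no TRUNCATE statement, so only the DELETE half is shown here.

```python
import sqlite3

def demo_delete_rollback():
    """Delete some rows inside a transaction, then roll the deletion back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    conn.commit()  # make the three rows permanent

    # DELETE with a WHERE clause removes only the matching rows.
    conn.execute("DELETE FROM t WHERE id > 1")
    after_delete = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

    # The DELETE has not been committed yet, so it can still be undone.
    conn.rollback()
    after_rollback = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return after_delete, after_rollback
```

The function returns the row count right after the DELETE and again after the rollback, so the two numbers show the rows coming back.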

What are the different forms of join between tables?

Some of the different types of join are:

  • Inner Join
  • Left (Outer) Join
  • Right (Outer) Join
  • Full (Outer) Join
  • Self Join
  • Cartesian (Cross) Join
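
To make a few of these concrete, here is a small sketch using Python's built-in sqlite3 module. The `employees` and `departments` tables are hypothetical. Only the inner, left and Cartesian (cross) joins are shown, since older SQLite versions do not support every join type.

```python
import sqlite3

def demo_joins():
    """Compare inner, left and cross joins on two tiny example tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
    )
    conn.executemany("INSERT INTO departments VALUES (?, ?)", [(1, "Sales"), (2, "Ops")])
    # Bob has no department, so he only appears in the left join.
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)", [(1, "Ann", 1), (2, "Bob", None)]
    )

    # Inner join: only rows with a match in both tables.
    inner = conn.execute(
        "SELECT e.name, d.name FROM employees e "
        "INNER JOIN departments d ON e.dept_id = d.id ORDER BY e.id"
    ).fetchall()

    # Left join: every employee, with NULL where no department matches.
    left = conn.execute(
        "SELECT e.name, d.name FROM employees e "
        "LEFT JOIN departments d ON e.dept_id = d.id ORDER BY e.id"
    ).fetchall()

    # Cross (Cartesian) join: every employee paired with every department.
    cross_count = conn.execute(
        "SELECT COUNT(*) FROM employees CROSS JOIN departments"
    ).fetchone()[0]
    conn.close()
    return inner, left, cross_count
```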

Difference between recall and precision?

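Recall is the fraction of actual positive instances that the model correctly identifies: TP / (TP + FN). Precision is the fraction of instances predicted as positive that are actually positive: TP / (TP + FP). Recall therefore measures how many of the real positives are found, while precision measures how trustworthy the positive predictions are. Improving one often lowers the other.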
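
As a minimal sketch, both metrics can be computed from counts of true positives (TP), false positives (FP) and false negatives (FN). The function name and the 0/1 label encoding are assumptions made for this example.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when there are no positive predictions
    # or no actual positives; returning 0.0 in that case is a convention.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For example, with `y_true = [1, 1, 1, 1, 0, 0]` and `y_pred = [1, 1, 0, 0, 1, 0]` there are 2 true positives, 1 false positive and 2 false negatives, giving a precision of 2/3 and a recall of 1/2.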

What is meant by the curse of dimensionality? How can we address it?

When analyzing a dataset, the number of variables (columns) is sometimes very large. As the number of dimensions grows, the data becomes sparse, distances between points become less informative, and models need far more samples to generalize, even though only a handful of the features may be significant. For example, a dataset may have a thousand features of which only a few carry useful signal. The problems caused by this excess of dimensions are known as the ‘curse of dimensionality’.

It can be addressed with feature selection or with dimensionality reduction algorithms such as PCA (Principal Component Analysis).
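
To give a feel for what PCA does, here is a small pure-Python sketch that finds the first principal component by power iteration on the covariance matrix. This is one simple way to compute it, chosen to avoid external libraries; real projects would normally use a library implementation. The function names are made up for this example.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (means, direction) of the first principal component.

    Uses power iteration on the sample covariance matrix. The all-ones
    starting vector is an arbitrary choice and would fail in the rare
    case that it is exactly orthogonal to the component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # data has no variance along v; stop early
            break
        v = [x / norm for x in w]
    return means, v

def project(rows, means, direction):
    """Reduce each row to a single coordinate along the given direction."""
    return [
        sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
        for r in rows
    ]
```

For points that lie on the line y = 2x, such as `[(0, 0), (1, 2), (2, 4), (3, 6)]`, the direction found is proportional to (1, 2), and `project` replaces the two columns with one.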

What is the Box-Cox transformation?

The Box-Cox transformation is a power transformation applied to the response (dependent) variable so that the data better meets the assumptions of a model, in particular normality. With the help of this technique, we can transform a non-normal, strictly positive dependent variable into one that is approximately normally distributed. This lets us apply a broader range of statistical tests that assume normality.
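
The transformation itself is short enough to sketch directly. For a parameter lambda it maps y to (y^lambda - 1) / lambda, and to log(y) when lambda is 0. The function name below is an assumption for this example, and choosing a good lambda (usually by maximum likelihood) is left out.

```python
import math

def box_cox(values, lam):
    """Apply the Box-Cox transform with parameter lam to positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limit of (y**lam - 1) / lam as lam approaches 0 is log(y).
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]
```

For instance, `box_cox([4.0], 0.5)` gives `[2.0]`, since (sqrt(4) - 1) / 0.5 = 2.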

Explain A/B testing

A/B testing is hypothesis testing applied to a randomized experiment with two variants, A and B. It is commonly used to optimize web pages based on user preferences: a modified version of a page with a small change is shown to a random sample of users, while the rest of the audience sees the original page. Comparing how the two groups react (for example, their conversion rates) with a statistical test tells us whether the change made a significant difference.
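
One common way to analyze such an experiment is a two-proportion z-test on the conversion rates. The sketch below assumes that is the test being used and that samples are large enough for the normal approximation. The function and parameter names are invented for this example.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) comparing conversion rates of A and B.

    conv_a / conv_b are the numbers of converting users, n_a / n_b the
    group sizes. Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

If page A converts 100 of 1000 users and page B converts 150 of 1000, the z statistic is a little above 3 and the p-value is well below 0.05, which would suggest the change had a real effect.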

Why don’t gradient descent methods always converge to the same point?

This is because the loss surface may be non-convex, so gradient descent can settle in a local minimum (or stall at a saddle point) instead of reaching the global minimum. Which point it reaches depends on the data, the learning rate (step size) and the starting point of the descent.
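
A tiny experiment makes this visible. The sketch below runs plain gradient descent on f(x) = x^4 - 3x^2 + x, a function picked for this example because it has two separate minima. The learning rate and step count are arbitrary choices.

```python
def gradient(x):
    """Derivative of the example function f(x) = x**4 - 3*x**2 + x."""
    return 4 * x ** 3 - 6 * x + 1

def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Run fixed-step gradient descent from the given starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

# Same function, same learning rate, different starting points.
left_minimum = gradient_descent(-2.0)
right_minimum = gradient_descent(2.0)
```

Starting at -2 the descent ends near x = -1.30, while starting at 2 it ends near x = 1.13. Only the first of these is the global minimum.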

How can you compute significance using p-value?

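After a hypothesis test is conducted, we use the p-value to judge the significance of the results. The p-value lies between 0 and 1 and is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. If the p-value is less than the chosen significance level (commonly 0.05), we reject the null hypothesis. If it is greater than that level, we fail to reject the null hypothesis.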
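
As a worked sketch, the code below computes an exact two-sided p-value for a coin-flip experiment (null hypothesis: the coin is fair) and then applies the usual decision rule, rejecting the null hypothesis when the p-value falls below the significance level. The function names and the 0.05 default are assumptions for this example.

```python
from math import comb

def binomial_two_sided_p_value(k, n, p0=0.5):
    """Exact two-sided p-value for observing k successes in n trials.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null hypothesis (success rate p0).
    """
    probs = [comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to rounding.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

def decide(p_value, alpha=0.05):
    """Apply the significance-level decision rule."""
    return "reject H0" if p_value < alpha else "fail to reject H0"
```

With 9 heads in 10 flips the p-value is 22/1024 (about 0.021), so the fair-coin hypothesis is rejected at the 0.05 level. With 6 heads in 10 flips the p-value is about 0.75, so it is not.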

What is SVM? Can you name some kernels used in SVM?

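SVM stands for support vector machine. SVMs are supervised learning models used for classification and regression tasks. An SVM finds a separating boundary between two classes of data points that maximizes the margin to the nearest points of each class. This separating boundary is known as a hyperplane. Kernels let an SVM learn non-linear boundaries by implicitly mapping the data into a higher-dimensional space. Some of the kernels used in SVM are: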

  • Polynomial Kernel
  • Gaussian Kernel
  • Laplace RBF Kernel
  • Sigmoid Kernel
  • Hyperbolic Kernel
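
A kernel is just a function that scores the similarity of two input vectors. The sketch below writes out common textbook forms of four of the kernels listed above in plain Python. The default parameter values are arbitrary, and the Laplace kernel is shown in its L1-distance form, which is only one of the variants in use.

```python
import math

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree"""
    return (dot(x, y) + coef0) ** degree

def gaussian_kernel(x, y, gamma=0.5):
    """exp(-gamma * squared Euclidean distance), also called the RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def laplace_kernel(x, y, gamma=0.5):
    """exp(-gamma * L1 distance); other definitions use the Euclidean norm."""
    l1_distance = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-gamma * l1_distance)

def sigmoid_kernel(x, y, alpha=0.01, coef0=0.0):
    """tanh(alpha * x . y + coef0)"""
    return math.tanh(alpha * dot(x, y) + coef0)
```

For example, `polynomial_kernel([1, 2], [3, 4])` evaluates (11 + 1)^2 = 144, and the Gaussian kernel of any vector with itself is 1.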