
Apriori

The Apriori algorithm was proposed by R. Agrawal and R. Srikant in 1994. It finds frequent itemsets (sets of items that occur together in many transactions) for mining Boolean association rules. The algorithm is called Apriori because it uses ‘prior’ knowledge of the properties of frequent itemsets.

The algorithm takes an iterative, level-wise approach: the frequent itemsets of size k found in one pass are used to generate the candidate itemsets of size k+1 for the next pass.

Apriori relies on the following property of itemsets, often called the Apriori (or downward-closure) property:

  • Every subset of a frequent itemset must also be frequent.
  • Every superset of an infrequent itemset must also be infrequent.
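The level-wise search and the subset/superset property can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `apriori`, the `min_support` threshold, and the toy grocery transactions are assumptions made for the example, and support is taken to be the fraction of transactions that contain an itemset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: a candidate with any infrequent (k-1)-subset cannot be
        # frequent, so it is dropped without counting its support.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


# Hypothetical toy data: four shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(transactions, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this data, the candidate {bread, butter, milk} is pruned without being counted, because its subset {butter, milk} is not frequent.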

Three measures are central to mining association rules with Apriori:

  • Support
  • Confidence
  • Lift

Support measures how popular an itemset X is, that is, how frequently it occurs in the data. It is calculated by dividing the number of transactions in which X appears by the total number of transactions.
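As a small sketch of this calculation, assuming each transaction is represented as a Python set of item names (the `support` helper and the sample baskets are hypothetical):

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


# Hypothetical toy data: four shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support_bread = support({"bread"}, transactions)               # 3 of 4 baskets
support_bread_milk = support({"bread", "milk"}, transactions)  # 2 of 4 baskets
print(support_bread, support_bread_milk)
```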

The confidence of a rule X → Y is the number of transactions containing both X and Y divided by the number of transactions containing X. It estimates how likely Y is to be purchased when X is purchased.
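A minimal sketch of the confidence calculation, again assuming transactions are Python sets (the `confidence` helper and the sample baskets are hypothetical):

```python
def confidence(x, y, transactions):
    """Confidence of the rule x -> y: count(x and y) / count(x)."""
    x, y = set(x), set(y)
    count_x = sum(1 for t in transactions if x <= set(t))
    count_xy = sum(1 for t in transactions if (x | y) <= set(t))
    if count_x == 0:
        raise ValueError("x does not appear in any transaction")
    return count_xy / count_x


# Hypothetical toy data: four shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
# Bread appears in 3 baskets, and 2 of those also contain milk.
conf_bread_milk = confidence({"bread"}, {"milk"}, transactions)
print(conf_bread_milk)
```

Confidence is not symmetric: the rule butter → bread has a different value from bread → butter, because the denominator changes.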

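Lift is the ratio of the confidence of the rule X → Y to the support of Y. It measures how much more likely Y is to be purchased when X is already purchased, taking into account the overall popularity of Y. A lift greater than 1 suggests that X and Y occur together more often than would be expected if they were independent, a lift of 1 suggests no association, and a lift below 1 suggests they occur together less often than expected.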
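The sketch below computes lift as confidence(X → Y) divided by support(Y), which is the standard definition. The `lift` helper and the sample baskets are hypothetical and only illustrate the arithmetic.

```python
def lift(x, y, transactions):
    """Lift of the rule x -> y: confidence(x -> y) / support(y)."""
    x, y = set(x), set(y)
    n = len(transactions)
    count_x = sum(1 for t in transactions if x <= set(t))
    count_y = sum(1 for t in transactions if y <= set(t))
    count_xy = sum(1 for t in transactions if (x | y) <= set(t))
    if count_x == 0 or count_y == 0:
        raise ValueError("x and y must each appear in at least one transaction")
    confidence_xy = count_xy / count_x
    support_y = count_y / n
    return confidence_xy / support_y


# Hypothetical toy data: four shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
# confidence(bread -> milk) = 2/3 and support(milk) = 3/4, so lift = 8/9.
lift_bread_milk = lift({"bread"}, {"milk"}, transactions)
print(round(lift_bread_milk, 3))
```

Here the lift is slightly below 1: in this toy data, customers who buy bread are a little less likely to buy milk than customers overall.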
