Apriori Algorithm

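The Apriori algorithm is a classic data mining algorithm for finding frequent itemsets, from which association rules can then be derived. It relies on the fact that every subset of a frequent itemset must itself be frequent, so candidates can be built up one item at a time. The steps are as follows: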

  1. Input Data:

    Read the transactional data containing sets of items.

  2. Determine Minimum Support Threshold:

    Set a minimum support threshold: the minimum number of transactions in which a set of items must appear to be considered frequent.

  3. Identify Frequent 1-Itemsets:

    Count the occurrences of each individual item in the dataset. The items whose count meets the minimum support threshold are the frequent 1-itemsets.

  4. Generate Candidate Itemsets:

    Join pairs of frequent (k-1)-itemsets that share k-2 items to form candidate k-itemsets. Discard duplicate candidates, and prune any candidate that has an infrequent (k-1)-subset, since such a candidate cannot be frequent.

  5. Check Candidate Itemsets Against Dataset:

    Scan the dataset to count the transactions that contain each candidate itemset. Discard candidates that do not meet the minimum support threshold.

  6. Repeat the Process:

    If step 5 yields any frequent k-itemsets, repeat steps 4 and 5 to generate and check candidates of length k+1. Stop when a pass finds no new frequent itemsets.

  7. Output Frequent Itemsets:

    Once all frequent itemsets are identified, output them for further analysis.
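
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: the function name `apriori`, the use of an absolute transaction count for `min_support`, and the sample basket data are all assumptions made for the example. When generating candidates it also prunes any candidate with an infrequent (k-1)-subset, which is the standard Apriori optimization.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset (frozenset): support count}.

    min_support is an absolute number of transactions (an assumption
    of this sketch; a fraction of the dataset is also common).
    """
    # Steps 1-2: normalise the input into sets of items.
    transactions = [frozenset(t) for t in transactions]

    # Step 3: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}

    frequent = dict(current)
    k = 2
    while current:
        # Step 4: join frequent (k-1)-itemsets whose union has k items.
        # Using a set keeps candidates unique; the all(...) check prunes
        # candidates that have an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Step 5: scan the dataset and count each candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:  # candidate is a subset of the transaction
                    counts[c] += 1
        current = {s: n for s, n in counts.items() if n >= min_support}

        # Step 6: record the new frequent itemsets and move to length k+1.
        frequent.update(current)
        k += 1

    # Step 7: all frequent itemsets with their support counts.
    return frequent


# Hypothetical sample data for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

for itemset, count in apriori(baskets, min_support=3).items():
    print(sorted(itemset), count)
```

With a threshold of 3 on this sample data, the sketch reports four frequent single items and four frequent pairs. The triple {bread, milk, diapers} passes the pruning check but appears in only two transactions, so it is discarded in step 5 and the loop stops.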

Code