Apriori Algorithm
Apriori is a classic data mining algorithm for finding frequent itemsets, from which association rules can then be derived. Below is a step-by-step description of the algorithm:
- Input Data:
Read the transactional data containing sets of items.
- Determine Minimum Support Threshold:
Set a minimum support threshold: the minimum number of transactions (or, equivalently, the minimum fraction of all transactions) that must contain an itemset for it to be considered frequent.
- Identify Frequent 1-Itemsets:
Count the occurrences of each individual item in the dataset. Identify frequent 1-itemsets that meet the minimum support threshold.
- Generate Candidate Itemsets:
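Join pairs of frequent (k-1)-itemsets that share k-2 items to form candidate k-itemsets, keeping each candidate only once. Then prune any candidate that has an infrequent (k-1)-subset: by the Apriori property, every subset of a frequent itemset must itself be frequent, so such a candidate cannot be frequent and need not be counted.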
- Check Candidate Itemsets Against Dataset:
Scan the dataset to count the occurrences of each candidate itemset. Discard candidate itemsets that do not meet the minimum support threshold.
- Repeat the Process:
If any frequent k-itemsets were found in the previous step, increase k by one and repeat candidate generation and counting. Stop when a pass yields no frequent itemsets.
- Output Frequent Itemsets:
Once all frequent itemsets are identified, output them for further analysis.
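
The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `apriori`, its parameters, and the sample transactions are assumptions made for this example, and the minimum support is taken as an absolute transaction count. It favors readability over speed and rescans every transaction on each pass.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset contained in at least
    min_support transactions (min_support is an absolute count here)."""
    transactions = [frozenset(t) for t in transactions]

    # Identify frequent 1-itemsets by counting each individual item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Generate candidates: join pairs of frequent (k-1)-itemsets whose
        # union has exactly k items. Using a set keeps candidates unique.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Scan the dataset and count the transactions containing each candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}

        # Keep only candidates that meet the threshold, then move to k + 1.
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical sample data: five market-basket transactions.
sample_transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(sample_transactions, min_support=3)
```

With a threshold of 3, this sample yields four frequent single items and four frequent pairs. The only triple that survives pruning, {bread, milk, diapers}, occurs in just two transactions, so the loop ends after the third pass.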