MS-GSP from-scratch implementation

Context:

Class: Data Mining and Text Mining, University of Illinois at Chicago (Fall 2023)

1st year of Master's Degree in Computer Science coursework

Abstract:

The goal of the assignment was to implement MS-GSP, a modified version of the GSP (Generalized Sequential Pattern) algorithm that lets each item have its own minimum support (minsup) value. This generalization of GSP takes two inputs: a collection of sequences, where each sequence is an ordered list of itemsets, and a parameter file specifying the minsup value of each item. The output is the collection of frequent sequential patterns, organized by length.
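To make the idea of per-item minimum supports concrete, here is a minimal sketch of the frequency test, not the code in main.py. It assumes the common multiple-minimum-support rule in which a pattern is frequent when its support reaches the lowest minsup among its items. The names `contains`, `support`, `is_frequent`, and `mis` are hypothetical and chosen for illustration only.

```python
from typing import Dict, FrozenSet, List

# A sequence is an ordered list of itemsets (assumed representation).
Sequence = List[FrozenSet[str]]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each itemset of the pattern must be a subset of a distinct itemset
    of the sequence, and the matches must appear in the same order.
    """
    pos = 0
    for itemset in pattern:
        # Greedily advance to the earliest itemset that covers this one.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match strictly later
    return True


def support(database: List[Sequence], pattern: Sequence) -> float:
    """Fraction of database sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in database) / len(database)


def is_frequent(database: List[Sequence], pattern: Sequence,
                mis: Dict[str, float]) -> bool:
    """Assumed rule: compare support with the smallest per-item minsup."""
    threshold = min(mis[item] for itemset in pattern for item in itemset)
    return support(database, pattern) >= threshold


if __name__ == "__main__":
    db = [
        [frozenset({"a", "b"}), frozenset({"c"})],
        [frozenset({"a"}), frozenset({"c"})],
        [frozenset({"b"})],
        [frozenset({"a", "c"})],
    ]
    pattern = [frozenset({"a"}), frozenset({"c"})]
    print(support(db, pattern))
    print(is_frequent(db, pattern, {"a": 0.5, "b": 0.2, "c": 0.6}))
```

This sketch only checks a single candidate pattern. The full algorithm also needs level-wise candidate generation and pruning, which are omitted here.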

Files:

  • main.py contains the full code to run the algorithm;
  • data.txt contains sample input data;
  • para.txt contains the minsup value settings for each item;
  • results.txt contains sample output data.
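As an illustration of how a parameter file like para.txt could be read, the sketch below parses per-item minsup values. The line format `MIS(item) = value` is an assumption made for this example, since the README does not specify the actual layout of para.txt. The function name `parse_mis` is likewise hypothetical.

```python
import re
from typing import Dict

# Assumed line format (not confirmed by this README): "MIS(item) = value".
MIS_LINE = re.compile(
    r"^\s*MIS\((?P<item>[^)]+)\)\s*=\s*(?P<value>[0-9.]+)\s*$"
)


def parse_mis(text: str) -> Dict[str, float]:
    """Map each item to its minsup value; lines that do not match are skipped."""
    mis: Dict[str, float] = {}
    for line in text.splitlines():
        match = MIS_LINE.match(line)
        if match:
            mis[match.group("item").strip()] = float(match.group("value"))
    return mis


if __name__ == "__main__":
    sample = "MIS(10) = 0.45\nMIS(20) = 0.3\n"
    print(parse_mis(sample))
```

Keeping items as strings avoids assuming that item identifiers are numeric.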