Skip to content

florian-ariasu/suffix-trees

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 

Repository files navigation

🌲 Suffix Tree Implementation

A sophisticated implementation of suffix trees with both standard and compressed variants, designed for efficient pattern matching and text analysis.

🎯 Project Overview

This implementation provides a comprehensive suffix tree solution featuring both standard and compressed variants. The project includes pattern matching capabilities, statistical analysis, and efficient tree traversal algorithms.

🏗️ Core Data Structures

Standard Suffix Tree Node

typedef struct node {
    char c;                    // Character stored in node
    struct node *children[27]; // Array for a-z + '$'
} node, *tree;

Compressed Suffix Tree Node

typedef struct nodeComp {
    char *label;              // Edge label string
    struct nodeComp *children[27];
} nodeC, *treeC;

🛠️ Implemented Operations

Tree Construction Operations

Operation Description
initTree() Initializes empty tree with root node
createTree() Builds complete suffix tree from input
insertWord() Inserts single suffix into tree
decomposeWord() Generates all suffixes of input string

Analysis Operations

Operation Description
calculateNumOfLeaves() Counts terminal nodes
countNodesOnLevel() Analyzes nodes at specific depth
maxNumOfDescendants() Finds maximum branching factor
findSuffixes() Pattern matching in tree

Compression Operations

Operation Description
stToCst() Converts to compressed format
initTreeC() Initializes compressed tree
isSingleChild() Checks for path compression opportunity

📊 Task Implementations

Task 1: Basic Construction

  • BFS traversal output
  • Complete suffix tree building
  • Dynamic memory management

Task 2: Statistical Analysis

Output format:
<leaf_count>
<k_length_suffixes>
<max_descendants>

Task 3: Pattern Matching

Input: "banana", ["na", "ana"]
Output: 
1
1

Task 4: Compressed Tree

  • Path compression
  • Space optimization
  • Equivalent functionality

⚙️ Usage

./stree [-c1 | -c2 <K> | -c3 | -c4] [input_file] [output_file]

🔍 Implementation Details

Memory Management

  • Dynamic allocation for all nodes
  • Proper initialization of pointers
  • Fixed-size arrays (27) for alphabet
  • Cleanup routines for all allocations

Complexity Analysis

Construction: O(n²)
Pattern Match: O(m)
Space (Standard): O(n²)
Space (Compressed): O(n)

Error Handling

  • Null pointer validation
  • Input file verification
  • Memory allocation checks
  • Index bounds verification

🚀 Build Instructions

make build  # Compiles the project
make clean  # Removes artifacts

⚠️ Important Notes

  1. All suffixes are terminated with '$'
  2. Input strings use lowercase English alphabet
  3. Pattern matching returns binary results
  4. Compressed tree maintains original functionality

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published