Liuguanli/LBMC

LBMC

Main Implementation

The main implementation of the LBMC (Learning-Based Multi-dimensional Cost model) can be found in the following files:

  1. Cost Calculation: The cost calculation code is in python/global_cost.py and python/local_cost.py. These files contain implementations for the global and local cost calculations as outlined in the paper.
  2. Verification and Utilities: Additional verification and utility functions, including those for cost verification, drop patterns, and rise patterns, are located in python/verify_cost.py and python/utils.py.

Verifying the correctness of our cost modelling

./python/verify_cost.py

The key idea of the verification is to prove that our proposed cost algorithms produce exactly the same results as the naive cost calculation.

In verify_cost.py, the following code snippet checks the results of the cost algorithms against the naive results.

    # Each message is formatted with % so that a failure prints the two values.
    assert my_gc == naive_gc, "wrong global cost calculation my_gc:%d, naive_gc:%d" % (my_gc, naive_gc)
    assert my_gc_all == naive_gc, "wrong global cost calculation my_gc_all:%d, naive_gc:%d" % (my_gc_all, naive_gc)
    assert my_lc == naive_lc, "wrong local cost calculation my_lc:%d, naive_lc:%d" % (my_lc, naive_lc)
    assert my_lc_all == naive_lc, "wrong local cost calculation my_lc_all:%d, naive_lc:%d" % (my_lc_all, naive_lc)
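
The same fast-versus-naive comparison can be illustrated with a small self-contained sketch. It does not reproduce verify_cost.py. It assumes that BMC denotes a bit-merging curve, and it uses a stand-in cost (the span of curve values covered by a rectangular query), which is not necessarily the global or local cost defined in the paper. The names bmc_value, naive_span, fast_span, and verify are hypothetical and exist only for this illustration.

```python
import random

DIMS = "XY"  # hypothetical 2-D setting: 'X' is coordinate 0, 'Y' is coordinate 1


def bmc_value(point, pattern, bits):
    """Merge the bits of `point` into one curve value.

    `pattern` names the source dimension of each output bit, most
    significant first; "XYXY" is the Z-order curve for bits=2.
    """
    remaining = {d: bits for d in DIMS}  # unread bits per dimension
    value = 0
    for d in pattern:
        remaining[d] -= 1
        bit = (point[DIMS.index(d)] >> remaining[d]) & 1
        value = (value << 1) | bit
    return value


def naive_span(query, pattern, bits):
    """Slow reference: enumerate every cell of the query rectangle."""
    x_lo, y_lo, x_hi, y_hi = query
    values = [bmc_value((x, y), pattern, bits)
              for x in range(x_lo, x_hi + 1)
              for y in range(y_lo, y_hi + 1)]
    return max(values) - min(values) + 1


def fast_span(query, pattern, bits):
    """Fast version: only the two corners are needed, because the
    curve value is monotone in each coordinate."""
    x_lo, y_lo, x_hi, y_hi = query
    return (bmc_value((x_hi, y_hi), pattern, bits)
            - bmc_value((x_lo, y_lo), pattern, bits) + 1)


def verify(trials=200, bits=3, seed=0):
    """Compare fast and naive results on random patterns and queries."""
    rng = random.Random(seed)
    side = 1 << bits
    for _ in range(trials):
        letters = list(DIMS * bits)  # each dimension contributes `bits` bits
        rng.shuffle(letters)
        pattern = "".join(letters)
        x_lo, x_hi = sorted(rng.randrange(side) for _ in range(2))
        y_lo, y_hi = sorted(rng.randrange(side) for _ in range(2))
        query = (x_lo, y_lo, x_hi, y_hi)
        fast = fast_span(query, pattern, bits)
        naive = naive_span(query, pattern, bits)
        assert fast == naive, (
            "span mismatch for %s on %s: fast:%d, naive:%d"
            % (query, pattern, fast, naive))
    return trials


if __name__ == "__main__":
    verify()
```

Calling verify() raises an AssertionError with both values in the message as soon as the two computations disagree, which is the same failure behaviour the assertions in verify_cost.py rely on.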

How to calculate drop patterns and rise patterns:

Please refer to calculate_drop_pattern and calculate_rise_pattern.

Experimental Sections

The experiments in the paper are divided into three main sections, each with corresponding code and dataset information as follows:


E1:

Cost for n queries and m BMCs:

Global Cost: Please refer to global_cost and formula (5) in the paper.

Local Cost: Please refer to local_cost and Algorithm 1 in the paper.

E2:

Datasets

All datasets used are listed here.

Integrating cost estimation into BMTree

Please refer to Learned-BMTree

Comparison via PostgreSQL

For BMTree, please refer to Learned-BMTree pg_test.py

For others, please refer to LearnSFC pg_test.py

E3:

Datasets

TPC-H:

https://www.tpc.org/tpc_documents_current_versions/current_specifications5.asp

Generate the data with dbgen at scale factor 1:

    ./dbgen -s 1 -T o
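
To inspect the generated data before loading it, a minimal Python sketch such as the following may help. It assumes that dbgen has written a pipe-delimited orders.tbl file whose columns follow the ORDERS table of the TPC-H specification. The helper name read_orders is hypothetical and is not part of this repository.

```python
import csv

# Column order of the ORDERS table in the TPC-H specification.
ORDERS_COLUMNS = [
    "o_orderkey", "o_custkey", "o_orderstatus", "o_totalprice",
    "o_orderdate", "o_orderpriority", "o_clerk", "o_shippriority",
    "o_comment",
]


def read_orders(path, limit=None):
    """Read up to `limit` rows of a dbgen orders file as dictionaries.

    All values are kept as strings; convert them as needed.
    """
    rows = []
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for i, fields in enumerate(reader):
            if limit is not None and i >= limit:
                break
            # dbgen ends each line with '|', which yields an extra empty
            # field; zip() stops at the last named column and drops it.
            rows.append(dict(zip(ORDERS_COLUMNS, fields)))
    return rows
```

For example, read_orders("orders.tbl", limit=5) returns the first five rows, which is enough to check the column layout.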

NYC:

https://data.cityofnewyork.us/Transportation/2017-Yellow-Taxi-Trip-Data/biws-g3hs/about_data

Run queries on Hudi

For NYC dataset, please refer to nyc.scala

For TPC-H dataset, please refer to tpc.scala
