Skip to content
Joel Natividad edited this page Dec 21, 2024 · 1 revision

The luau command provides powerful data transformation capabilities using Luau scripting (version 0.653). It serves as qsv's Domain-Specific Language (DSL) for data wrangling, allowing you to create new columns, filter rows, and perform complex calculations.

Basic Usage

qsv luau <subcommand> [options] <args>... [<input>]

Subcommands:
  map     - Create new columns by mapping Luau script results
  filter  - Filter rows based on Luau script conditions

Key Features

Column Access Methods

There are 3 ways to reference columns in your Luau scripts:

  • Direct column name: Amount (can be disabled with -g)
  • Using col table: col.Amount or col["Total Balance"]
  • Column index (1-based): col[1] (requires --colindex or --no-headers)

Special Variables

  • _IDX: Current row number (read-only)
  • _INDEX: Enable random access mode for indexed files
  • _ROWCOUNT: Total number of rows
  • _LASTROW: Last row number in the file

Script Structure

Scripts can have three sections:

  • BEGIN { ... }!: Initialization code
  • Main script: Row processing logic after BEGIN
  • END { ... }!: Final calculations/cleanup

Helper Functions

Cumulative Calculations

-- Running sum
qsv_cumsum("name", value)  -- Returns cumulative sum

-- Running product
qsv_cumprod("name", value)  -- Returns cumulative product

-- Running min/max
qsv_cummin("name", value)  -- Returns running minimum
qsv_cummax("name", value)  -- Returns running maximum

-- Boolean operations
qsv_cumany("name", condition)  -- Returns true if any value was true
qsv_cumall("name", condition)  -- Returns true if all values were true

Window Functions

-- Access previous values
qsv_lag("name", value, lag_periods, default)

-- Calculate differences
qsv_diff("name", value, periods)

Data Loading & Lookup Tables

-- Load CSV as lookup table
qsv_loadcsv("table_name", "path/to/file.csv", "key_column")

-- Load JSON data
qsv_loadjson("table_name", "path/to/file.json")

-- Register remote lookup table
qsv_register_lookup("table_name", "lookup_uri", cache_age_secs)

Utility Functions

-- Logging
qsv_log(level, message, ...)  -- Levels: info, warn, error, debug, trace

-- Flow control
qsv_break(message)  -- Stop processing with message
qsv_skip()          -- Skip current row in output

-- File operations
qsv_writefile(filename, data)
qsv_fileexists(filepath)

-- Environment variables
qsv_setenv(name, value)
qsv_getenv(name)

-- Shell commands
qsv_cmd(qsv_args)       -- Execute qsv command
qsv_shellcmd(cmd, args) -- Execute allowed shell commands

Examples

  1. Basic Column Creation
# Add a new column that doubles the Amount
echo 'Amount\n10\n20\n30' | qsv luau map 'doubled' 'Amount * 2'
  1. Running Totals with BEGIN/END Blocks
# Create a script file named totals.lua
cat > totals.lua << 'EOF'
BEGIN {
    total = 0
    grand_total = 0
}!

total = total + Amount
grand_total = grand_total + total
return total

END {
    return ("Grand total: " .. grand_total)
}!
EOF

# Run the script
qsv luau map 'Running_Total' 'file:totals.lua' input.csv
  1. Using Lookup Tables
# Create a script file named lookup.lua
cat > lookup.lua << 'EOF'
BEGIN {
    -- Load lookup table from CSV
    states = qsv_register_lookup("states", "states.csv", 3600)
    if not states then
        qsv_break("Failed to load states lookup table")
    end
}!

-- Look up state name from abbreviation
return states[StateCode]["Name"]
EOF

# Run the script
qsv luau map 'State_Name' 'file:lookup.lua' input.csv
  1. Random Access Processing
# Create a script file named percent_of_total.lua
cat > percent_of_total.lua << 'EOF'
BEGIN {
    qsv_autoindex()  -- Create index for random access
    total_amount = 0
    
    -- First pass: calculate total
    for idx = 1, _LASTROW do
        _INDEX = idx
        total_amount = total_amount + tonumber(Amount)
    end
    
    -- Reset to start processing rows
    _INDEX = 1
}!

-- Calculate percentage of total for each row
local pct = (tonumber(Amount) / total_amount) * 100
_INDEX = _INDEX + 1
return string.format("%.1f%%", pct)

END {
    return ("Processed " .. _ROWCOUNT .. " rows, Total Amount: " .. total_amount)
}!
EOF

# Run the script
qsv luau map 'Percent_Of_Total' 'file:percent_of_total.lua' input.csv
  1. Multiple Column Output
# Create multiple columns in one pass
qsv luau map 'col1,col2,col3' '{value + 1, value * 2, value ^ 2}'

Performance Tips

  1. Use --no-globals when possible for better performance
  2. Create indexes for random access operations
  3. Use --colindex when working with numeric column positions
  4. Consider using the progress bar (-p) for long-running operations

Error Handling

  1. Invalid scripts return <ERROR> in map mode
  2. While developing/debugging Luau scripts, set the QSV_LOG_LEVEL environment variable for detailed error messages
  3. Use --max-errors to control error tolerance