Description:
There is a significant performance difference between 32-bit and 64-bit builds when using libspng. After detailed analysis and discussion in the zlib-ng repository, it was observed that the 64-bit build performs considerably worse compared to the 32-bit build.
Observed Behavior:
In our tests, encoding a PNG takes ~131 ms with the 32-bit build, while the same image takes ~400 ms with the 64-bit build.
Both tests were conducted on the same machine with the same configuration, using libspng and zlib-ng.
Detailed Analysis:
Filter Decision as the Cause:
Based on profiling, it appears the 64-bit build takes significantly longer because of differences in the filter decision logic.
The 32-bit build uses a more optimized path (e.g., SIMD vectorized loops), while the 64-bit build appears to rely on scalar operations.
Forcing the filter choice to SPNG_FILTER_CHOICE_NONE resolves the performance issue and brings the 64-bit performance in line with the 32-bit results. However, this is a manual workaround.
Filter Logic Differences:
libspng dynamically selects filters during the encoding process.
It seems the heuristic for choosing filters differs between 32-bit and 64-bit builds, potentially due to underlying differences in how zlib-ng operates in these environments.
zlib-ng Findings:
The analysis in the zlib-ng repository revealed that the 64-bit build might have suboptimal behavior in encode_scanline.
Scalar operations and loops dominate the profiling data in the 64-bit build, while the 32-bit build uses SIMD vectorized loops effectively.
Steps to Reproduce:
Use the provided C++ example to encode a raw image into a PNG with libspng.
Compare the encoding times between 32-bit and 64-bit builds.
Optionally, set the filter choice manually to SPNG_FILTER_CHOICE_NONE to observe how it impacts the 64-bit performance.
Request:
Could you investigate the filter decision logic in libspng? Specifically:
Why the 64-bit build seems to perform worse in selecting filters.
Whether this is related to differences in how zlib-ng interacts with libspng in 32-bit vs. 64-bit environments.
How the default filter heuristic could be improved for 64-bit builds to align with 32-bit behavior.
Full Example:
#include <iostream>
#include <fstream>
#include <vector>
#include <stdexcept>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <string>
#include <cstring>
#include <chrono>

extern "C" {
#include "/home/adam/spng-install/include/spng.h"
}

// Write callback: forwards encoded PNG bytes to the std::ofstream passed as `user`.
int WritePNGCallback(spng_ctx *ctx, void *user, void *src, size_t length)
{
    std::ofstream* out = reinterpret_cast<std::ofstream*>(user);
    if(!out->write(reinterpret_cast<const char*>(src), length))
    {
        return SPNG_IO_ERROR;
    }
    return SPNG_OK;
}

void EncodeRawImageToPNG(const std::string& RawFileName,
                         const std::string& PngFileName,
                         uint32_t Width,
                         uint32_t Height,
                         int DPI)
{
    spng_ctx* ctx = nullptr;
    spng_ihdr ihdr;

    // Read the whole raw RGBA file into memory.
    std::ifstream rawFile(RawFileName, std::ios::binary);
    if(!rawFile.is_open()) throw std::runtime_error("Failed to open raw file.");
    rawFile.seekg(0, std::ios::end);
    std::streampos fileSize = rawFile.tellg();
    rawFile.seekg(0, std::ios::beg);
    std::vector<unsigned char> rawBuffer(static_cast<size_t>(fileSize));
    if(!rawFile.read(reinterpret_cast<char*>(rawBuffer.data()), fileSize))
        throw std::runtime_error("Failed to read raw file into memory.");
    rawFile.close();

    std::ofstream pngFile(PngFileName, std::ios::binary);
    if(!pngFile.is_open()) throw std::runtime_error("Failed to create/open PNG output file.");

    ctx = spng_ctx_new(SPNG_CTX_ENCODER);
    if(ctx == nullptr) throw std::runtime_error("Failed to create spng context.");

    try
    {
        // 8-bit RGBA, non-interlaced.
        std::memset(&ihdr, 0, sizeof(ihdr));
        ihdr.width = Width;
        ihdr.height = Height;
        ihdr.bit_depth = 8;
        ihdr.color_type = SPNG_COLOR_TYPE_TRUECOLOR_ALPHA;
        ihdr.compression_method = 0;
        ihdr.filter_method = 0;
        ihdr.interlace_method = SPNG_INTERLACE_NONE;

        int resultCode = spng_set_ihdr(ctx, &ihdr);
        if(resultCode != SPNG_OK)
            throw std::runtime_error(std::string("Failed to set IHDR: ") + spng_strerror(resultCode));

        resultCode = spng_set_option(ctx, SPNG_IMG_COMPRESSION_LEVEL, 1);
        // Uncomment the next line to test with filtering disabled.
        //resultCode = spng_set_option(ctx, SPNG_FILTER_CHOICE, SPNG_FILTER_CHOICE_NONE);
        if(resultCode != SPNG_OK)
            throw std::runtime_error("Failed to set compression level to 1.");

        resultCode = spng_set_png_stream(ctx, WritePNGCallback, &pngFile);
        if(resultCode != SPNG_OK)
            throw std::runtime_error("Failed to set PNG stream callback.");

        // Convert DPI to pixels per meter for the pHYs chunk.
        double ppm = static_cast<double>(DPI) * 39.37;
        int ippm = static_cast<int>(std::round(ppm));
        spng_phys phys;
        std::memset(&phys, 0, sizeof(phys));
        phys.ppu_x = ippm;
        phys.ppu_y = ippm;
        phys.unit_specifier = 1;

        resultCode = spng_set_phys(ctx, &phys);
        if(resultCode != SPNG_OK)
            throw std::runtime_error("Failed to set pHYs chunk.");

        size_t imageSize = static_cast<size_t>(Width) * static_cast<size_t>(Height) * 4;
        if(rawBuffer.size() < imageSize)
            throw std::runtime_error("RAW buffer is smaller than the expected image size.");

        resultCode = spng_encode_image(ctx, rawBuffer.data(), imageSize, SPNG_FMT_RAW, SPNG_ENCODE_FINALIZE);
        if(resultCode != SPNG_OK)
            throw std::runtime_error(std::string("Failed to encode image: ") + spng_strerror(resultCode));
    }
    catch(...)
    {
        spng_ctx_free(ctx);
        pngFile.close();
        throw;
    }

    spng_ctx_free(ctx);
    pngFile.close();
}

int main(int argc, char *argv[])
{
    if(argc < 3)
    {
        fprintf(stderr, "usage: %s <raw-rgba-file> <output-png>\n", argv[0]);
        return 1;
    }

    /* Dimensions are hard-coded: 2480 x 3508 is an A4 page at 300 DPI */
    uint32_t w = 2480;
    uint32_t h = 3508;
    const std::string raw_fname(argv[1]);
    const std::string out_fname(argv[2]);

    auto t0 = std::chrono::steady_clock::now();
    EncodeRawImageToPNG(raw_fname, out_fname, w, h, 300);
    auto t1 = std::chrono::steady_clock::now();
    auto diff = t1 - t0;
    double total = std::chrono::duration<double>(diff).count();
    printf("img encode took %lf ms\n", total * 1e3);
    return 0;
}
Why the 64-bit build seems to perform worse in selecting filters.
Filtering performance for encode depends entirely on compiler optimizations at this point; there is no SIMD code used there. Filtering behavior should be identical on 32-bit and 64-bit, i.e. zlib-ng gets the same data in both cases.
You could try something different with the example code: set SPNG_IMG_COMPRESSION_LEVEL to 0 to minimize potential zlib-ng weirdness, and set SPNG_FILTER_CHOICE to SPNG_FILTER_CHOICE_ALL (otherwise filtering is disabled automatically when the compression level is 0). If the slowdown is similar on 64-bit, then it's probably not zlib-ng or some complex interaction between the two.
I think it comes down to the compiler not vectorizing code for the 64-bit build, which is a known issue.
@randy408 The suggested changes were implemented. SPNG_IMG_COMPRESSION_LEVEL was set to 0, and SPNG_FILTER_CHOICE was set to SPNG_FILTER_CHOICE_ALL. With these adjustments, the performance on 64-bit improved significantly, now ranging between 58-60ms.
This seems to indicate an issue with libspng itself. Are there any plans to address this or make improvements to handle such cases better in 64-bit builds?
SIMD optimizations are the obvious choice; then it won't matter if the compiler isn't optimizing the code as it does on 32-bit. Issue #37 is the one to subscribe to.
Expected Behavior:
Both 32-bit and 64-bit builds should perform similarly, with comparable encoding times and efficient use of filters.
Links to Related Issues:
zlib-ng Performance Analysis
Raw RGBA image
raw-rgba.zip