Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

corrupted and/or malformed document or cells with write_rich_string #289

Open
monolied opened this issue Sep 15, 2023 · 3 comments
Open

Comments

@monolied
Copy link

Hi John, I am using Excel::Writer::XLSX and I have encountered a problem. I
want it to do SOMETHING but the module appears to do SOMETHING_ELSE.

Just kidding. :) The following code should be mostly self explanatory.
Essentially, certain issues with using formats or empty fields result in several complications.
(The "corrupt" aspect is relative. Excel 2016 complains and repairs, LibreOffice 7.3 does not care.)
I tried to find open issues related to this, but didn't. I may have failed there. In that case, I apologize.

Great module, btw.

#!/usr/bin/perl -w

use strict;
use warnings;

use Excel::Writer::XLSX;

my $excel = '/tmp/garbled.xlsx';

my $workbook  = Excel::Writer::XLSX->new( $excel );
$workbook->set_properties(
    title    => 'garbled and corrupted xlsx',
    created => [ gmtime(time) ],
);
my $fmt_bold = $workbook->add_format( bold => 1 );
my $fmt_norm = $workbook->add_format( bold => 0 );
my $fmt_wrap = $workbook->add_format(text_wrap => 1);
my $fmt_ital = $workbook->add_format( italic => 1 );
my $fmt_undl = $workbook->add_format( underline => 1 );
my $worksheet = $workbook->add_worksheet('Examples');
$worksheet->set_column( 0, 0, 45 );
$worksheet->set_column( 1, 1, 15 );
$worksheet->write_rich_string('A1', 'This', '', $fmt_bold, ' will corrupt the xlx'); # NOK
$worksheet->write_rich_string('A2', '', $fmt_bold, ' This will not corrupt the xlx'); # OK
$worksheet->write_rich_string('A3', $fmt_bold, $fmt_norm, 'This will be invisible'); # NOK
$worksheet->write_rich_string('B3', $fmt_bold, '<-- see? invisible'); # OK
$worksheet->write_rich_string('A4', 'This will be garbled with xml ', $fmt_ital, $fmt_bold); # NOK
$workbook->close();
print "File $excel created.\n";

This is with latest libexcel-writer-xlsx-perl package in Debian.

    Perl version   : 5.032001
    OS name        : linux
    Module versions: (not all are required)
                     Excel::Writer::XLSX        1.07
                     Spreadsheet::WriteExcel    (not installed)
                     Archive::Zip               1.68
                     XML::Writer                (not installed)
                     IO::File                   1.41
                     File::Temp                 0.2309
@jmcnamara
Copy link
Owner

jmcnamara commented Sep 16, 2023

I want it to do SOMETHING but the module appears to do SOMETHING_ELSE.

Just kidding. :)

:-)

Thanks for the report. There probably should be more validation checks on the rich string input. I think some of these are caught in the Python version for comparison.

I presume you don't expect any of these to be valid and are highlighting that they should be rejected by the library. Is that right?

@monolied
Copy link
Author

Hm. Yes, I guess that's what I'd expect. Or the rules of engagement explained in the docs when it comes to using write_rich_string().
Ideally, I imagine, consecutive formats would just contain the defined values and merge. Don't know if that's madness, haven't thought it entirely through. I was starting to implement something like that on top, that's how it came to that. (And empty strings might well be silently dropped.) I stumbled on all of this converting external input, of course.

@jmcnamara
Copy link
Owner

Hm. Yes, I guess that's what I'd expect. Or the rules of engagement explained in the docs when it comes to using write_rich_string().

The rules of engagement are described here: Excel::Writer::XLSX worksheet.write_rich_string() although the Python docs do a slightly better job of explaining: XlsxWriter worksheet.write_rich_string(). In particular it highlights some of the things to avoid:

Note
Excel doesn’t allow the use of two consecutive formats in a rich string or an empty string fragment. For either of these conditions a warning is raised and the input to write_rich_string() is ignored.

In hindsight I made the Perl/Python write_rich_string() API too permissive and therefore error prone. The Rust version of write_rich_string() requires the formats and fragments to be tuples which are slightly harder to generate but less error prone.

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

2 participants