Html Entities in Excel 2003 File #2157
Comments
The Xls Reader will not read the file, because it isn't an xls file (just giving a file an extension of .xls does not make it one). The Xml Reader will not load it either, because it isn't valid Xml: it contains Html entities, which a strict Xml parser rejects.
Hello Mark, thank you for the detailed explanation. I was wondering about the CDATA tags, but it makes sense that they should be used in this case.
Let me know if there is anything I can do to help.
The following appears to work:

    public static function unentity(string $contents): string
    {
        // Hide the five predefined Xml entities behind a marker so they survive decoding.
        $contents = preg_replace('/&(amp|lt|gt|quot|apos);/', "\u{fffe}\u{feff}\$1;", trim($contents)) ?? $contents;
        // Decode every remaining (Html) entity to its UTF-8 character.
        $contents = html_entity_decode($contents, ENT_NOQUOTES | ENT_SUBSTITUTE | ENT_HTML401, 'UTF-8');
        // Restore the predefined Xml entities.
        $contents = str_replace("\u{fffe}\u{feff}", '&', $contents);

        return $contents;
    }

    public static function copyAsXlsx()
    {
        $infile = 'issue.2157.xml';
        $contents = file_get_contents($infile);
        $contents = self::unentity($contents);
        $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xml();
        $spreadsheet = $reader->loadSpreadsheetFromString($contents);
        // You should be able to do whatever you want to
        // the spreadsheet object here.
        $writer = new \PhpOffice\PhpSpreadsheet\Writer\Xlsx($spreadsheet);
        $outfile = 'issue.2157.unentity.xlsx';
        $writer->save($outfile);
        echo "saved $outfile\n";
    }

A case could be made for adding the
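To see what the marker trick does on a small input, here is a minimal standalone sketch. The function name decodeHtmlEntitiesKeepXml and the sample string are hypothetical, chosen only for illustration. It assumes the two-character marker never occurs in real input.

```php
<?php
// Hypothetical standalone copy of the marker approach, for illustration only.
function decodeHtmlEntitiesKeepXml(string $xml): string
{
    // Marker assumed never to appear in real input (U+FFFE followed by U+FEFF).
    $marker = "\u{fffe}\u{feff}";
    // Step 1: hide the five predefined Xml entities behind the marker.
    $xml = preg_replace('/&(amp|lt|gt|quot|apos);/', $marker . '$1;', $xml) ?? $xml;
    // Step 2: decode the remaining Html entities (&eacute;, &nbsp;, ...) to UTF-8.
    $xml = html_entity_decode($xml, ENT_NOQUOTES | ENT_SUBSTITUTE | ENT_HTML401, 'UTF-8');
    // Step 3: turn the marker back into '&', restoring the Xml entities.
    return str_replace($marker, '&', $xml);
}

$in = '<Data>Caf&eacute; &amp; Bar &lt;b&gt;&nbsp;</Data>';
$out = decodeHtmlEntitiesKeepXml($in);

// &eacute; and &nbsp; become real characters; &amp;, &lt; and &gt; stay escaped.
var_dump($out === "<Data>Caf\u{e9} &amp; Bar &lt;b&gt;\u{a0}</Data>"); // bool(true)
```

Decoding without the marker step would also turn &amp; and &lt; into bare & and <, which would break the markup, so the protection step is what keeps the result parseable.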
For some reason, Security Scanner has a callback function. This makes it just too easy to implement 'unentity' as described above. I will open a PR to do so.
Fix PHPOffice#2157. Excel 2003 format allows some things which a real Xml parser would reject. In particular, it permits Html entities, and leading whitespace. Change Xml Reader to do likewise.
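The two problems named here can be reproduced without PhpSpreadsheet. The following is a rough sketch using PHP's SimpleXML extension (assumed to be enabled). The helper parsesAsXml and the sample strings are hypothetical, and this is not the code from the PR.

```php
<?php
// Collect libxml parse errors quietly instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: true when the string parses as well-formed Xml.
function parsesAsXml(string $xml): bool
{
    $ok = simplexml_load_string($xml) !== false;
    libxml_clear_errors();

    return $ok;
}

$strict = "<?xml version=\"1.0\"?>\n<Row><Cell>Caf&#233;</Cell></Row>";
$withLeadingSpace = "\n  " . $strict;                            // whitespace before the Xml declaration
$withHtmlEntity = str_replace('&#233;', '&eacute;', $strict);    // Html entity instead of a numeric reference

$results = [
    'strict' => parsesAsXml($strict),
    'leading whitespace' => parsesAsXml($withLeadingSpace),
    'leading whitespace, trimmed' => parsesAsXml(trim($withLeadingSpace)),
    'html entity' => parsesAsXml($withHtmlEntity),
    // A plain decode is safe here only because the sample has no &amp; or &lt;.
    'html entity, decoded' => parsesAsXml(
        html_entity_decode($withHtmlEntity, ENT_NOQUOTES | ENT_HTML401, 'UTF-8')
    ),
];

foreach ($results as $label => $ok) {
    echo $label, ': ', $ok ? 'parses' : 'rejected', "\n";
}
```

Under these assumptions the untrimmed and undecoded variants are rejected, while the trimmed and decoded ones parse, which is the behaviour the change asks the Xml Reader to tolerate.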
This is:
Not even sure if this is a bug report or a feature request as I could not find if this type of XLS file is supported.
What is the expected behavior?
PhpSpreadsheet to read the file as I can open it just fine in OpenOffice.
What is the current behavior?
Trying to read the file as Xls, I get the error:
The filename products.xls is not recognised as an OLE file
Trying to read it as Xml, because the content is just XML, generates a lot of PHP warnings.
What are the steps to reproduce?
The file used in this example:
ProductsAPI-14-06-2021.xls
Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:
This is trying to read it as XLS:
This is trying to read it as XML:
I have also tried it with this code:
That results in the same output as trying to read it as an XML file above.
Which versions of PhpSpreadsheet and PHP are affected?
1.18.0