Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

ancient regression in qrdectxt.c #237

Open
ldoolitt opened this issue Nov 21, 2022 · 8 comments
Open

ancient regression in qrdectxt.c #237

ldoolitt opened this issue Nov 21, 2022 · 8 comments

Comments

@ldoolitt
Copy link
Contributor

In the olden times, when men were men and giants roamed the earth,
zbar could decode the attached image as "LBNL DIGITIZER V1.1 SN 342".

2022-11-18-231711

Starting with merge commit 9222475 (Feb 4, 2019), it fails. I localized the actual origin to commit 6d02875, "handle multi-byte sequence split apart", pushed to a branch in Feb, 2014. Props to git for letting me find this! I was actually able to build a zbarimg binary on stock Debian Bullseye with commit dbb82d7 that works.

After isolating the problem, I found chatter here #64 suggesting that add the -Sqr.binary flag would work around the problem.
It does. :-)
I call it a workaround because "LBNL DIGITIZER V1.1 SN 342" is not actually binary. Also, code from the Debian Stretch era handled this case without needing a binary flag.
I'm convinced that patch from 2014 is buggy. It's pretty complicated, and it would take me a while to try to improve it without breaking a bunch of other use cases. @benck, do you remember how that commit was intended to work, and maybe help figure out how this case trips it up?

@ldoolitt
Copy link
Contributor Author

I have to say that qr_code_data_list_extract_text() is a freakish example of what LT calls the "function-growth-hormone-imbalance syndrome."
I added debug statements and can kind-of follow along its "Step 1" in my case:

Step 1 iteration j = 0 of 1
Step 1 entry k = 0 of 2, mode 4
  added 3 << 2 to sa_ctext: ef bb bf
Step 1 entry k = 1 of 2, mode 2
  added 26 << 0 to sa_ctext: 4c 42 4e 4c 20 44 49 47 49 54 49 5a 45 52 20 56 31 2e 31 20 53 4e 20 33 34 32
allocating sa_text for 38 bytes

That's an over-complicated way of saying there's one QR_MODE_BYTE entry with a 3-byte UTF-8 BOM, followed by a QR_MODE_ALNUM entry with the actual useful text.
I can't yet understand where things go wrong in "Step 2". It seems suspicious that nothing gets copied into bytebuf_text except in the QR_MODE_BYTE | QR_MODE_KANJI case.

@jameshilliard
Copy link
Contributor

Maybe this helps?

@ldoolitt
Copy link
Contributor Author

Maybe this helps?

I noticed that too, and tried it. Nope, it makes no difference in my case.
It also looks wrong -- wouldn't that mark a string containing mixed big5 and ascii as ascii?

@jameshilliard
Copy link
Contributor

It also looks wrong -- wouldn't that mark a string containing mixed big5 and ascii as ascii?

Not really sure there, that function is weird, kinda hard to tell in what cases will fall through to the return 1 and what cases will continue on and maybe trigger a return 0.

I mean if _len is 0 then it falls through to return 1 I think, which doesn't seem correct either, I'm not sure that's even a condition we can hit though. That qr_code_data_list_extract_text function could really use some serious refactoring.

@ldoolitt
Copy link
Contributor Author

That qr_code_data_list_extract_text function could really use some serious refactoring.

+1, see above. Also some unit tests.
I hope I can diagnose and fix my use case first.

@ldoolitt
Copy link
Contributor Author

Ow! Ow! Ow! This code is making my head hurt!
I attach a one-line patch that makes my use case work.
qrdectxt.patch.txt
I have low confidence that it will still handle all the cases that @benck had in mind, notably the big5-code-broken-across-entries case that motivated the whole thing. @benck, if you're listening, do you happen to have a test case (image) available?
Everyone else, please try it out on your favorite images. All I can say at the moment is It Works For Me, and doesn't actually break the (three simple) QR test cases in "make check-images". If even one person says it helps them, I'll create a pull request.

@ldoolitt
Copy link
Contributor Author

I posted my development tree at https://github.com/ldoolitt/zbar/tree/qrdectxt_work
It includes formatting changes (mostly whitespace in comments) to qrdectxt.c, as well as the one-line patch above.
I can submit a formal pull request if I get some positive feedback on it.

@ldoolitt ldoolitt mentioned this issue Jan 30, 2023
@ldoolitt
Copy link
Contributor Author

No comments positive or negative in two months.
<tap> <tap> is anyone maintaining this code base?
I went ahead and submitted pull request #241.

mchehab pushed a commit that referenced this issue Apr 16, 2023
Seems to fix issues #106 and #237
Could use more testing

Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants