Index/Scan: Duplicate Content field. #442

Open
dpieski opened this issue Dec 6, 2023 · 2 comments
Labels
bug Something isn't working scan Scan module

Comments

@dpieski
Contributor

dpieski commented Dec 6, 2023

Device Information (please complete the following information):

  • OS: not specified
  • Deployment: Docker
  • SIST2 Version: 3.4.1
  • Elasticsearch Version (if relevant): not specified

Describe the bug

2023-12-06 16:37:33 [ERROR elastic.c] {
	"index":	{
		"_index":	"dcdocs",
		"_type":	"_doc",
		"_id":	"656fa65e.0000148e",
		"status":	400,
		"error":	{
			"type":	"parse_exception",
			"reason":	"Failed to parse content to map",
			"caused_by":	{
				"type":	"json_parse_exception",
				"reason":	"Duplicate field 'content'\n at [Source: (ByteArrayInputStream); line: 1, column: 156]"
			}
		}
	}
}
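
For anyone trying to narrow this down: the error above says the JSON body sent to Elasticsearch contained the key `content` twice. A minimal sketch of how such a payload could be checked for repeated keys is below. The sample document and the helper name `find_duplicate_keys` are hypothetical and not taken from sist2; the real payload was not posted in this issue.

```python
import json


def find_duplicate_keys(raw: str) -> list:
    """Return every key that appears more than once within the same JSON object."""
    duplicates = []

    def check_pairs(pairs):
        # json.loads calls this hook once per JSON object, passing the
        # (key, value) pairs in document order, before any key is overwritten.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(raw, object_pairs_hook=check_pairs)
    return duplicates


# Hypothetical payload shaped like the failure described in the log.
sample = '{"name": "menu.vob", "content": "embedded text", "content": "OCR text"}'
print(find_duplicate_keys(sample))  # ['content']
```

Python's `json` module silently keeps the last value for a repeated key, so a plain `json.loads` would not flag this document; the hook is what exposes the duplicate.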

Steps To Reproduce
I have no exact steps yet. I suspect this happens when a file already contains text and OCR is also run on it, which creates more text and possibly a second `content` field.

Expected behavior

Actual Behavior

Screenshots

Additional context

@dpieski dpieski added the bug Something isn't working label Dec 6, 2023
@simon987 simon987 added the scan Scan module label Dec 10, 2023
@simon987
Collaborator

Would you be able to find this 656fa65e.0000148e document and send it to me?

sqlite3 sist2-scan-xxx.sist2
SELECT path FROM document WHERE id=5262;
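
The same lookup can be scripted. In this thread the part of the `_id` after the dot, `0000148e`, is hexadecimal for 5262, which is the row id used in the query above. The sketch below relies on that mapping, which is inferred from this single example and is an assumption rather than documented sist2 behaviour. The function names are hypothetical; the `document` table and its `id` and `path` columns come from the query above.

```python
import sqlite3


def row_id_from_es_id(es_id: str) -> int:
    """Convert an id such as '656fa65e.0000148e' to a row id (assumed: hex suffix)."""
    _, _, suffix = es_id.partition(".")
    return int(suffix, 16)


def find_path(conn: sqlite3.Connection, es_id: str):
    """Return the path stored for the given Elasticsearch _id, or None if absent."""
    row = conn.execute(
        "SELECT path FROM document WHERE id = ?",
        (row_id_from_es_id(es_id),),
    ).fetchone()
    return row[0] if row else None


print(row_id_from_es_id("656fa65e.0000148e"))  # 5262
```

To use it against a real scan, open the index file with `sqlite3.connect(...)` and pass the connection to `find_path` together with the `_id` from the error log.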

@dpieski
Contributor Author

dpieski commented Dec 11, 2023

That file is actually a VOB file. I think it is the menu-movie from a DVD.

Following up about this specific file on Discord.
