Feature request: Support multiple matches within single messages in search results #1438

Open
jhwheeler opened this issue Jan 20, 2025 · 0 comments

Support multiple matches within single messages in search results

Problem

Currently, Stream Chat's search returns each matching message only once, even when the search term appears multiple times within that message. Additionally, the search response doesn't provide any information about where the individual matches occur within the message text.

For example, if a message contains:
"This text is an example text"

And we search for "text", we only get one search result, even though "text" appears twice in the message.

Current Workaround

We currently have to implement a client-side solution that:

  1. Takes the search results from Stream Chat
  2. Manually finds all occurrences of the search term in each message
  3. Creates duplicate results for messages with multiple matches
  4. Tracks match positions for proper highlighting
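
For reference, the steps above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the `SearchHit`, `Match`, and `ExpandedHit` shapes and the function names are hypothetical and are not Stream Chat SDK types, and matching is case-sensitive and non-overlapping for simplicity.

```typescript
// Hypothetical shapes for illustration only; not Stream Chat SDK types.
interface Match {
  start: number;  // zero-based offset of the match in message.text
  length: number; // length of the matched term
}

interface SearchHit {
  message: { id: string; text: string };
}

interface ExpandedHit extends SearchHit {
  match: Match;
}

// Step 2: find every non-overlapping, case-sensitive occurrence of `term`.
function findMatches(text: string, term: string): Match[] {
  const matches: Match[] = [];
  if (term.length === 0) return matches; // avoid looping forever on ""
  let from = 0;
  while (true) {
    const start = text.indexOf(term, from);
    if (start === -1) break;
    matches.push({ start, length: term.length });
    from = start + term.length;
  }
  return matches;
}

// Steps 3 and 4: emit one result per occurrence, carrying its position
// so the UI can highlight the right span.
function expandHits(hits: SearchHit[], term: string): ExpandedHit[] {
  const expanded: ExpandedHit[] = [];
  for (const hit of hits) {
    for (const match of findMatches(hit.message.text, term)) {
      expanded.push({ ...hit, match });
    }
  }
  return expanded;
}
```

Because the expansion happens after the server has already paginated, the number of expanded results per page no longer corresponds to the page size requested from the API.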

This has several drawbacks:

  • Increased client-side complexity
  • Inaccurate result counts in pagination
  • Potential performance issues with large messages
  • Inconsistent behavior with search highlighting

Proposal

Add support for multiple matches within messages in the search API, similar to Algolia's search capabilities. This could be implemented by either:

  1. Including match information in the response (this is how Algolia and Typesense handle such situations):
{
  "results": [
    {
      "message": {
        "id": "123",
        "text": "This text is an example text",
        // ... other message fields
      },
      "matches": [
        { "start": 5, "length": 4 },  // First "text"
        { "start": 24, "length": 4 }  // Second "text"
      ]
    }
  ]
}
  2. Treating each match as a separate result while maintaining a reference to the original message:
{
  "results": [
    {
      "message": { ... },
      "matchIndex": 5
    },
    {
      "message": { ... },
      "matchIndex": 24
    }
  ]
}
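
To illustrate how the two shapes relate, here is a rough sketch of turning a response in the first shape into the per-match rows of the second. The type and function names (`MatchedResult`, `PerMatchResult`, `toPerMatchResults`) are made up for this example and only mirror the JSON above; nothing here is an existing Stream Chat API.

```typescript
// Hypothetical types mirroring the two proposed response shapes.
interface MessageLike {
  id: string;
  text: string;
}

// Option 1: one result per message, with all match positions attached.
interface MatchedResult {
  message: MessageLike;
  matches: { start: number; length: number }[];
}

// Option 2: one result per match, each pointing back at its message.
interface PerMatchResult {
  message: MessageLike;
  matchIndex: number;
}

// Derive option 2 rows from an option 1 response on the client.
function toPerMatchResults(results: MatchedResult[]): PerMatchResult[] {
  const rows: PerMatchResult[] = [];
  for (const result of results) {
    for (const match of result.matches) {
      rows.push({ message: result.message, matchIndex: match.start });
    }
  }
  return rows;
}

// Usage with the example message from this issue.
const rows = toPerMatchResults([
  {
    message: { id: "123", text: "This text is an example text" },
    matches: [
      { start: 5, length: 4 },
      { start: 24, length: 4 },
    ],
  },
]);
console.log(rows.map((row) => row.matchIndex));
```

A client that wants per-match rows can derive them from the first shape in a few lines, whereas going the other way requires regrouping rows by message id.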

I think the first option is preferable, since it would not break existing implementations that already handle this on the client. However, the second option could be easier to consume on the client, so if that approach is chosen, the new behavior could be made opt-in via a filter flag.

This would provide a better search experience, more accurate result counting, and simpler client implementations.

Thank you for your consideration.
