stop storing raw data in the solr index #205

Open
hancush opened this issue May 25, 2018 · 1 comment

hancush commented May 25, 2018

via @evz's fancy training!

we default to storing the raw value for every field indexed by solr (`stored="true"`). that is useful if you want to display results straight from the index without round-tripping to the database. haystack already does that round trip for us, though, so the stored values aren't doing us any favors: they make the index bigger and take up memory and disk space for no reason. let's update our solr configs to stop storing the raw data.

example from la metro schema.xml:

    <field name="sponsorships_exact" type="string" indexed="true" stored="true" multiValued="true" />

    <field name="inferred_status_exact" type="string" indexed="true" stored="true" multiValued="false" />

    <field name="attachment_text" type="text_en" indexed="true" stored="true" multiValued="false" />

    <field name="source_url" type="string" indexed="false" stored="true" multiValued="false" />

    <field name="identifier" type="text_en" indexed="true" stored="true" multiValued="false" />
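a rough sketch of what the same fields could look like after the change. this assumes nothing reads these values straight from the index (e.g. solr highlighting on `attachment_text`, or a template that uses a stored value on the search result rather than the model instance), so check for that before flipping a field. the fields haystack itself needs in order to look up the model instance (`id`, `django_ct`, `django_id`) should stay stored.

```xml
<!-- indexed fields: still searchable and facetable, raw value no longer kept -->
<field name="sponsorships_exact" type="string" indexed="true" stored="false" multiValued="true" />

<field name="inferred_status_exact" type="string" indexed="true" stored="false" multiValued="false" />

<field name="attachment_text" type="text_en" indexed="true" stored="false" multiValued="false" />

<field name="identifier" type="text_en" indexed="true" stored="false" multiValued="false" />

<!-- source_url is not indexed, so with stored="false" it would serve no purpose
     in the index; it is probably a candidate for removal rather than a flag flip -->
```

changing `stored` only takes effect for documents indexed after the schema change, so a full reindex would be needed to actually reclaim the space.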

hancush commented Feb 18, 2020

Relevant documentation in Haystack.
