"Direct installation" of Lucene indexes into Solrini #1010

Closed
lintool opened this issue Mar 3, 2020 · 3 comments

Comments

@lintool
Member

lintool commented Mar 3, 2020

Solr is basically just a webapp around Lucene. It is possible to bypass the Solr indexing REST APIs and just directly copy the Lucene indexes into the right path in the Solr directory structure. We should provide documentation on how to do this, to save users an extra indexing step.

@edwinzhng
Member

Looked into it a bit. According to the steps here from 2016, we should:

  1. Explicitly create the Solr schema via the JSON API for all fields (we will need to add more on top of the existing schema scripts). This is also good because we can control types in one place instead of letting Solr infer the field type as it does right now. It probably makes sense to do "Refactor Solr schema scripts for ACL and CORE collections" (#1009) first.

  2. Change the "dataDir" in solrconfig.xml to point to the Lucene index
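
A minimal sketch of step 1, assuming the Solr Schema API's `add-field` command is what "the JSON API" refers to. The helper names, the example field names (`contents`, `raw`), and the field types below are illustrative assumptions, not taken from the existing schema scripts; the real list has to match whatever fields the Lucene index actually contains.

```python
import json
import urllib.request


def add_field_command(name, field_type, stored=True, indexed=True, multi_valued=False):
    """Build one Schema API 'add-field' command as a plain dict."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
            "multiValued": multi_valued,
        }
    }


def post_schema_command(solr_url, collection, command):
    """POST a single schema command to Solr and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{solr_url}/solr/{collection}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical field list; replace with the fields present in the Lucene index.
EXAMPLE_FIELDS = [
    ("contents", "text_general"),
    ("raw", "string"),
]
```

With a running Solr instance, something like `post_schema_command("http://localhost:8983", "my_collection", add_field_command("contents", "text_general"))` would send one command; the URL and collection name here are placeholders.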

@lintool
Member Author

lintool commented Mar 4, 2020

hey @r-clancy you did this before, right? can you comment on whether @edwinzhng 's on the right track?

@edwinzhng for step 2 above we have pre-built indexes here: https://git.uwaterloo.ca/jimmylin/anserini-indexes

We can just wget it and uncompress it into the location that Lucene expects... so we won't even need to change the dataDir?
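
A rough sketch of the "uncompress into the expected location" idea, assuming the pre-built index has already been downloaded as a tar archive and that the target is the core's `data/index` directory (Solr's default when `dataDir` is not overridden, as far as I know). The function name and all paths are hypothetical, and Solr would presumably need to be stopped, or the core reloaded, around the copy.

```python
import shutil
import tarfile
from pathlib import Path


def install_index(archive_path, index_dir):
    """Extract the regular files of a tar archive directly into index_dir.

    Directory components inside the archive are dropped so that the Lucene
    segment files end up at the top level of index_dir.
    """
    index_dir = Path(index_dir)
    index_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            target = index_dir / Path(member.name).name
            with archive.extractfile(member) as source, open(target, "wb") as sink:
                shutil.copyfileobj(source, sink)
            installed.append(target.name)
    return sorted(installed)
```

Flattening the archive is a design choice for the case where the tarball wraps the index in a single top-level folder; if an archive ever contained two files with the same base name, the later one would overwrite the earlier.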

@ryan-clancy
Contributor

@lintool Yeah, I've done this before. @edwinzhng is on the right track - the process is just making sure the Solr schema aligns with the Lucene schema, and then either copying the Lucene index into the correct place or changing the Solr data dir to point to the Lucene index.
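
One way to sanity-check the "schema aligns" part, sketched under assumptions: the Solr field names come from the parsed JSON of a `GET /solr/<collection>/schema/fields` response, and the Lucene field names are supplied by hand or from some other tool. Both helper names are made up for illustration.

```python
def solr_field_names(schema_fields_response):
    """Pull field names out of a parsed /schema/fields JSON response."""
    return [field["name"] for field in schema_fields_response.get("fields", [])]


def schema_mismatches(lucene_fields, solr_fields):
    """Report fields that appear on only one side of the comparison."""
    lucene, solr = set(lucene_fields), set(solr_fields)
    return {
        "missing_in_solr": sorted(lucene - solr),
        "not_in_lucene": sorted(solr - lucene),
    }
```

Solr-internal fields such as `_version_` will naturally show up under `not_in_lucene`; the list that matters for this procedure is `missing_in_solr`. This only compares names, not field types or analyzers, which would also need to match.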
