Add native textmodel_lda #30
Comments
I managed to make GibbsLDA++ work, and we now have both seeded and regular LDA.
# seeded LDA (replicates https://github.com/koheiw/quanteda.seededlda)
> result10 <- textmodel_lda(dfmt_spnik, verbose = FALSE, seeds = tfmt_spnik)
> terms(result10)
economy politics society diplomacy military nature other
[1,] "company" "parliament" "police" "diplomatic" "army" "human" "going"
[2,] "money" "congress" "school" "embassy" "navy" "sand" "really"
[3,] "market" "politicians" "hospital" "ambassador" "soldiers" "water" "come"
[4,] "bank" "parliamentary" "prison" "treaty" "marine" "syria" "see"
[5,] "industry" "lawmakers" "women" "diplomat" "korea" "syrian" "american"
[6,] "banks" "voters" "man" "diplomats" "korean" "terrorist" "know"
[7,] "markets" "lawmaker" "investigation" "sanctions" "missile" "daesh" "facebook"
[8,] "banking" "politician" "found" "iran" "air" "turkish" "much"
[9,] "china" "uk" "court" "deal" "nuclear" "turkey" "good"
[10,] "chinese" "eu" "children" "meeting" "force" "weapons" "team"
# regular (unseeded) LDA
> result11 <- textmodel_lda(dfmt_spnik, k = 7, verbose = FALSE)
> terms(result11)
topic1 topic2 topic3 topic4 topic5 topic6 topic7
[1,] "korea" "china" "syria" "eu" "going" "uk" "police"
[2,] "korean" "chinese" "syrian" "sanctions" "really" "house" "video"
[3,] "nuclear" "economic" "israel" "iran" "much" "british" "women"
[4,] "missile" "india" "terrorist" "deal" "know" "department" "court"
[5,] "air" "oil" "daesh" "union" "see" "white" "man"
[6,] "nato" "billion" "turkish" "agreement" "come" "campaign" "found"
[7,] "force" "trade" "turkey" "germany" "good" "ukrainian" "children"
[8,] "japan" "project" "weapons" "elections" "something" "secretary" "service"
[9,] "kim" "indian" "saudi" "parliament" "facebook" "ukraine" "swedish"
[10,] "aircraft" "companies" "iraq" "german" "problem" "intelligence" "rights"
My question is should I separate the function to |
Just my very subjective two cents: I think a dedicated Which doesn't mean though that |
@JBGruber thanks for the input. I added |
Sorry to be a downer here - and I was offline for 2 weeks - but seeded LDA is already available through |
topicmodels::LDA is implemented using this library, which I can call directly via Rcpp: https://sourceforge.net/projects/gibbslda/files/
We can call the library in this way
https://github.com/cran/topicmodels/blob/ade6dc5698f385ad222fd28aa8e90c1a4bd33cf5/R/lda.R#L134-L155
There are a lot of things going on but it shouldn't be too complex for minimal functions that users usually need:
If we implement our own quanteda-native LDA, I will move quanteda.seededlda into this package.
https://github.com/koheiw/quanteda.seededlda
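For readers who want to see how the seeded and regular variants compared in this thread can differ, here is a rough base-R sketch of a collapsed Gibbs sampler in which seed words are simply added as pseudo-counts to the topic-word prior. It is an illustration under my own assumptions, not the GibbsLDA++ code and not what textmodel_lda does internally: the function name `toy_lda`, its arguments, and the toy data are all hypothetical.

```r
# Toy collapsed Gibbs sampler for LDA (illustrative only).
# x:     documents-by-terms count matrix
# k:     number of topics
# seeds: optional k-by-terms matrix of pseudo-counts (hypothetical argument)
#        that biases selected words towards selected topics
toy_lda <- function(x, k, seeds = NULL, alpha = 0.5, beta = 0.1,
                    iter = 200, rng_seed = 1) {
  set.seed(rng_seed)
  n_doc <- nrow(x)
  n_term <- ncol(x)

  # Topic-word prior: symmetric beta, plus seed pseudo-counts if supplied
  prior <- matrix(beta, nrow = k, ncol = n_term)
  if (!is.null(seeds)) {
    stopifnot(nrow(seeds) == k, ncol(seeds) == n_term)
    prior <- prior + seeds
  }

  # Expand the count matrix into one (document, term) pair per token
  idx <- which(x > 0, arr.ind = TRUE)
  doc <- rep(idx[, "row"], times = x[idx])
  term <- rep(idx[, "col"], times = x[idx])

  # Random initial topic for every token, then tabulate the counts
  z <- sample.int(k, length(doc), replace = TRUE)
  n_dk <- matrix(0, n_doc, k)   # tokens per document and topic
  n_kw <- matrix(0, k, n_term)  # tokens per topic and term
  for (i in seq_along(z)) {
    n_dk[doc[i], z[i]] <- n_dk[doc[i], z[i]] + 1
    n_kw[z[i], term[i]] <- n_kw[z[i], term[i]] + 1
  }
  n_k <- rowSums(n_kw)
  prior_k <- rowSums(prior)

  for (it in seq_len(iter)) {
    for (i in seq_along(z)) {
      d <- doc[i]
      w <- term[i]
      old <- z[i]
      # Take the current token out of the counts
      n_dk[d, old] <- n_dk[d, old] - 1
      n_kw[old, w] <- n_kw[old, w] - 1
      n_k[old] <- n_k[old] - 1
      # Unnormalised conditional probability of each topic for this token
      p <- (n_dk[d, ] + alpha) * (n_kw[, w] + prior[, w]) / (n_k + prior_k)
      new <- sample.int(k, 1, prob = p)
      # Put the token back under its newly sampled topic
      n_dk[d, new] <- n_dk[d, new] + 1
      n_kw[new, w] <- n_kw[new, w] + 1
      n_k[new] <- n_k[new] + 1
      z[i] <- new
    }
  }

  # Smoothed topic-word distributions; each row sums to 1
  phi <- (n_kw + prior) / (n_k + prior_k)
  topic_names <- if (!is.null(seeds) && !is.null(rownames(seeds))) {
    rownames(seeds)
  } else {
    paste0("topic", seq_len(k))
  }
  dimnames(phi) <- list(topic_names, colnames(x))
  phi
}

# Made-up toy data: four documents, four terms
x <- matrix(c(4, 3, 0, 0,
              3, 4, 0, 1,
              0, 0, 4, 3,
              1, 0, 3, 4),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4),
                            c("bank", "market", "army", "navy")))

# Seed "bank" into an economy topic and "army" into a military topic
seeds <- matrix(0, nrow = 2, ncol = 4,
                dimnames = list(c("economy", "military"), colnames(x)))
seeds["economy", "bank"] <- 10
seeds["military", "army"] <- 10

phi_seeded <- toy_lda(x, k = 2, seeds = seeds)  # topics named by the seeds
phi_plain <- toy_lda(x, k = 2)                  # topics named topic1, topic2
```

In this sketch the only difference between the two modes is the prior matrix, which is one argument for keeping a single function with an optional seeds input. Whether the real implementation should follow that design is exactly the open question in this thread.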