AttributeError: 'WikiCorpus' object has no attribute 'input' #2744

Open
joanPlepi opened this issue Feb 4, 2020 · 2 comments

Comments

@joanPlepi

joanPlepi commented Feb 4, 2020

Problem description

I am using WikiCorpus to read and process a Wikipedia dump. However, when I try to iterate over the output of getstream(), I get an AttributeError saying that the object has no attribute input. Indeed, WikiCorpus has no input attribute, but it does have an fname attribute.
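
A minimal sketch of how this kind of error can arise, assuming the base class method reads self.input while the subclass constructor stores the path only as self.fname. BaseCorpus and FileCorpus are hypothetical stand-ins written for illustration; they are not gensim's actual source.

```python
class BaseCorpus:
    """Hypothetical stand-in for a base corpus class (not gensim code)."""

    def __init__(self, input=None):
        self.input = input

    def getstream(self):
        # Reads self.input, which exists only if BaseCorpus.__init__ has run.
        yield self.input


class FileCorpus(BaseCorpus):
    """Hypothetical stand-in for a subclass that stores its path as fname."""

    def __init__(self, fname):
        # Assumption: the base constructor is never called, so self.input is never set.
        self.fname = fname


corpus = FileCorpus("dump.xml.bz2")
try:
    # The generator body runs on the first next() call, where self.input is read.
    next(corpus.getstream())
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

If this assumption matches the real class layout, either setting input in the subclass constructor or calling the base constructor would avoid the error.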

Steps/code/corpus to reproduce

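from gensim.corpora import WikiCorpus

path = "data/dewiki-latest-pages-articles.xml.bz2"
wiki = WikiCorpus(path)
# Raises AttributeError: 'WikiCorpus' object has no attribute 'input'
for stream in wiki.getstream():
    print(stream)
    break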

The following workaround fixes the problem, but I think it would be better to fix this in the source code if the missing attribute is indeed the cause.

wiki.input = wiki.fname
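
The same workaround can be wrapped in a small guard so that it does nothing once the attribute exists. This is only a sketch: ensure_input is a hypothetical helper name, and a SimpleNamespace stands in for the corpus object so the snippet runs without gensim or a dump file.

```python
from types import SimpleNamespace


def ensure_input(corpus):
    """Copy corpus.fname to corpus.input when input is missing (workaround sketch)."""
    if not hasattr(corpus, "input"):
        corpus.input = corpus.fname
    return corpus


# Stand-in object with only an fname attribute, mimicking the situation described above.
stand_in = SimpleNamespace(fname="data/dewiki-latest-pages-articles.xml.bz2")
ensure_input(stand_in)
print(stand_in.input)
```

With a real corpus, one would call ensure_input(wiki) before iterating over wiki.getstream().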

Versions

Linux-x86_64-with-debian-9.11
Python 3.7.0 (default, Oct  9 2018, 10:31:47) 
[GCC 7.3.0]
NumPy 1.17.3
SciPy 1.3.2
gensim 3.8.0
FAST_VERSION 1
@piskvorky
Owner

What is "pathToWikipedia"?

@joanPlepi
Author

joanPlepi commented Feb 4, 2020

It is my local path to the Wikipedia dump.
I have updated my post with a more meaningful path.
