DeprecationWarning in Gensim `most_similar`?







While implementing Word2Vec in Python 3.7, I ran into unexpected warnings related to deprecation. What exactly does the deprecation warning about 'most_similar' in Gensim's Word2Vec mean?



Currently, I am getting the following warnings.



DeprecationWarning: Call to deprecated most_similar (Method will be removed in 4.0.0, use self.wv.most_similar() instead).
model.most_similar('hamlet')
FutureWarning: Conversion of the second argument of issubdtype from int to np.signedinteger is deprecated. In future, it will be treated as np.int32 == np.dtype(int).type.
if np.issubdtype(vec.dtype, np.int):





How can I get rid of these warnings? Any help is appreciated. I am new to Python.



The code I have tried is as follows.


import re
from gensim.models import Word2Vec
from nltk.corpus import gutenberg

sentences = list(gutenberg.sents('shakespeare-hamlet.txt'))
print('Type of corpus: ', type(sentences))
print('Length of corpus: ', len(sentences))

for i in range(len(sentences)):
    sentences[i] = [word.lower() for word in sentences[i] if re.match('^[a-zA-Z]+', word)]
print(sentences[0]) # title, author, and year
print(sentences[1])
print(sentences[10])
model = Word2Vec(sentences=sentences, size = 100, sg = 1, window = 3, min_count = 1, iter = 10, workers = 4)
model.init_sims(replace = True)
model.save('word2vec_model')
model = Word2Vec.load('word2vec_model')
model.most_similar('hamlet')




3 Answers



It's a warning that the method is about to become obsolete and will stop working.



Usually things are deprecated for a few versions giving anyone using them enough time to move to the new method before they are removed.



They've moved most_similar to wv





So the most_similar() call should look something like:


model.wv.most_similar('hamlet')






Hope this helps



Edit: using wv.most_similar()


import re
from gensim.models import Word2Vec
from nltk.corpus import gutenberg

sentences = list(gutenberg.sents('shakespeare-hamlet.txt'))
print('Type of corpus: ', type(sentences))
print('Length of corpus: ', len(sentences))

for i in range(len(sentences)):
    sentences[i] = [word.lower() for word in sentences[i] if re.match('^[a-zA-Z]+', word)]
print(sentences[0]) # title, author, and year
print(sentences[1])
print(sentences[10])
model = Word2Vec(sentences=sentences, size = 100, sg = 1, window = 3, min_count = 1, iter = 10, workers = 4)
model.init_sims(replace = True)
model.save('word2vec_model')
model = Word2Vec.load('word2vec_model')
similarities = model.wv.most_similar('hamlet')
for word, score in similarities:
    print(word, score)





Thanks Madhan. I have tried model.wv.most_similar('hamlet'), but it displays the following warning: "FutureWarning: Conversion of the second argument of issubdtype from int to np.signedinteger is deprecated. In future, it will be treated as np.int32 == np.dtype(int).type. if np.issubdtype(vec.dtype, np.int):"
– Mishra Siba
Aug 10 at 18:30







Yeah, it's coming from numpy, a dependency used by Gensim. Gensim is fixing it in a newer version; ref: github.com/brian-team/brian2/issues/918. Meanwhile, you can try downgrading numpy to 1.13; ref: github.com/brian-team/brian2/issues/918#issuecomment-364865331
– Madhan Varadhodiyil
Aug 10 at 18:33








Thanks for your prompt response. It means a lot. How do I downgrade numpy to 1.13?
– Mishra Siba
Aug 10 at 18:42





No worries :) Run pip uninstall numpy, and then pip install numpy==1.13
– Madhan Varadhodiyil
Aug 10 at 18:43








pip3, right? I ask because I am using Python 3.7.
– Mishra Siba
Aug 10 at 18:46



So, Gensim here is telling you that eventually you will not be able to use the most_similar method directly on the Word2Vec model. Instead, you will need to call it on the model.wv object, which holds the keyed vectors that are stored when you train a model.
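
To illustrate the mechanism, here is a minimal pure-Python sketch of how a method can be kept around as a deprecated forwarder. The Model and KeyedVectors classes below are hypothetical stand-ins written for this example, not Gensim's actual source, and the neighbour scores are made-up placeholders.

```python
import warnings


class KeyedVectors:
    """Hypothetical stand-in for the object stored at model.wv."""

    def __init__(self, neighbours):
        # Maps a word to a precomputed list of (word, score) pairs.
        self._neighbours = neighbours

    def most_similar(self, word):
        return self._neighbours[word]


class Model:
    """Hypothetical stand-in for a trained Word2Vec model."""

    def __init__(self, neighbours):
        self.wv = KeyedVectors(neighbours)

    def most_similar(self, word):
        # Old entry point: still works, but warns and forwards to self.wv.
        warnings.warn(
            "Call to deprecated most_similar, use self.wv.most_similar() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.wv.most_similar(word)


# Made-up placeholder scores, for illustration only.
model = Model({"hamlet": [("horatio", 0.9), ("laertes", 0.8)]})

# Record warnings so the effect of the old call is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = model.most_similar("hamlet")   # emits a DeprecationWarning

new_result = model.wv.most_similar("hamlet")    # same data, no warning

print(old_result == new_result)
print([w.category.__name__ for w in caught])
```

Both calls return the same data; only the old entry point warns, which is why switching to model.wv.most_similar('hamlet') makes the DeprecationWarning go away.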







Thanks for the suggestion. I tried model.wv.most_similar too, but it shows the following message: "Conversion of the second argument of issubdtype from int to np.signedinteger is deprecated".
– Mishra Siba
Aug 10 at 18:25







This sounds like a warning from numpy or some other dependency used by Gensim. They would need to change that.
– Steven
Aug 10 at 18:29







Shall I upgrade numpy?
– Mishra Siba
Aug 10 at 18:32





No, because it is code within Gensim or some dependency that is using that deprecated conversion method.
– Steven
Aug 10 at 18:37






A deprecation warning indicates that you are using something that may not exist in future versions of Python or of a library, often because it has been replaced by something else. The warning usually tells you what the replacement is.



It appears that the warnings originate inside Word2Vec, not in your code. Removing them at the source would entail going into that library and changing its code.



Try doing what it tells you to do.



Change your model.most_similar('hamlet') to model.wv.most_similar('hamlet')





I am unfamiliar with this package, so adjust to how it would work for your use.
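
If a warning comes from inside a library and cannot be fixed from your own code, one option is to silence that category locally with Python's standard warnings module. The sketch below is only an illustration: noisy_dependency_call is a hypothetical stand-in for the library call, and the example assumes you have decided the warning is safe to ignore.

```python
import warnings


def noisy_dependency_call():
    """Hypothetical stand-in for a library call that emits a FutureWarning."""
    warnings.warn(
        "Conversion of the second argument of issubdtype is deprecated",
        FutureWarning,
    )
    return "result"


# Ignore FutureWarning only inside this block; record=True lets us
# confirm that nothing got through the filter.
with warnings.catch_warnings(record=True) as suppressed:
    warnings.simplefilter("always")
    warnings.simplefilter("ignore", category=FutureWarning)
    quiet_result = noisy_dependency_call()

# For comparison, the same call with no "ignore" filter in place.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy_dependency_call()

print(quiet_result, len(suppressed), len(caught))
```

In the question's script, the model.wv.most_similar('hamlet') call would sit where noisy_dependency_call() does. The filter is restored when the with block exits, so warnings elsewhere in the program are unaffected.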





Thanks for the response. How to do that?
– Mishra Siba
Aug 10 at 18:23





I edited it. I copy-pasted it wrong. It SHOULD work as model.wv.most_similar('hamlet')
– hunter463785
Aug 10 at 18:27







Now it shows the following warning: "FutureWarning: Conversion of the second argument of issubdtype from int to np.signedinteger is deprecated. In future, it will be treated as np.int32 == np.dtype(int).type. if np.issubdtype(vec.dtype, np.int):"
– Mishra Siba
Aug 10 at 18:31







According to this signature, it doesn't look like you can simply pass it 'hamlet': radimrehurek.com/gensim/models/… Try passing positive='hamlet'
– hunter463785
Aug 10 at 18:33








