Import GoogleNews-vectors-negative300.bin

I am working on code that uses gensim and am having a tough time troubleshooting a ValueError. I was finally able to unzip the GoogleNews-vectors-negative300.bin.gz file so that I could use it in my model. I also tried gzip, but that was unsuccessful. The error occurs on the last line of the code. I would like to know what can be done to fix it. Are there any workarounds? Finally, is there a website I could reference?

Thank you respectfully for your assistance!

import gensim
from keras import backend
from keras.layers import Dense, Input, Lambda, LSTM, TimeDistributed
from keras.layers.merge import concatenate
from keras.layers.embeddings import Embedding
from keras.models import Model

pretrained_embeddings_path = "GoogleNews-vectors-negative300.bin"
word2vec = gensim.models.KeyedVectors.load_word2vec_format(pretrained_embeddings_path, binary=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-23bd96c1d6ab> in <module>()
      1 pretrained_embeddings_path = "GoogleNews-vectors-negative300.bin"
----> 2 word2vec = gensim.models.KeyedVectors.load_word2vec_format(pretrained_embeddings_path, binary=True)

C:\Users\green\Anaconda3\envs\py35\lib\site-packages\gensim\models\keyedvectors.py in load_word2vec_format(cls, fname, fvocab, binary, encoding, unicode_errors, limit, datatype)
    244                         word.append(ch)
    245                     word = utils.to_unicode(b''.join(word), encoding=encoding, errors=unicode_errors)
--> 246                     weights = fromstring(fin.read(binary_len), dtype=REAL)
    247                     add_word(word, weights)
    248             else:

ValueError: string size must be a multiple of element size

4 Answers

The commands below work.

brew install wget
wget -c ""

This downloads the GZIP compressed file that you can uncompress using:

gzip -d GoogleNews-vectors-negative300.bin.gz
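One possible cause of "string size must be a multiple of element size" is a file that was cut short during download or decompression, so it can help to inspect the file before loading it. The sketch below assumes the usual word2vec binary layout (a text header line "vocab_size vector_size" followed by float32 vectors). The names inspect_word2vec_bin and looks_truncated are made up for this illustration and are not part of gensim.

```python
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
BYTES_PER_FLOAT32 = 4


def inspect_word2vec_bin(path):
    """Return (vocab_size, vector_size, min_bytes) read from the file header.

    Hypothetical helper: raises ValueError if the file still looks
    gzip-compressed or the header is not two integers.
    """
    with open(path, "rb") as fin:
        if fin.read(2) == GZIP_MAGIC:
            raise ValueError("file is still gzip-compressed; decompress it first")
        fin.seek(0)
        header = fin.readline()  # assumed form: b"<vocab_size> <vector_size>\n"
    try:
        vocab_size, vector_size = (int(x) for x in header.split())
    except ValueError:
        raise ValueError("unexpected header: %r" % header[:40])
    # Every entry stores at least vector_size float32 values, so the file
    # cannot be smaller than this (the words themselves add more bytes).
    min_bytes = len(header) + vocab_size * vector_size * BYTES_PER_FLOAT32
    return vocab_size, vector_size, min_bytes


def looks_truncated(path):
    """True if the file is smaller than its own header says it must be."""
    _, _, min_bytes = inspect_word2vec_bin(path)
    return os.path.getsize(path) < min_bytes
```

Calling looks_truncated('GoogleNews-vectors-negative300.bin') before load_word2vec_format gives a quick hint whether the download or the decompression step needs to be repeated. It is only a lower-bound check, so a False result does not prove the file is intact.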

You can then load the word vectors with:

from gensim import models
w = models.KeyedVectors.load_word2vec_format('../GoogleNews-vectors-negative300.bin', binary=True)

You have to write the complete path.

Use this path:
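As an illustration of the complete-path advice, the sketch below turns a filename into an absolute path and fails early if the file is not there. The helper name resolve_model_path and its base_dir parameter are hypothetical, and the directory you pass is a placeholder for wherever you saved the file.

```python
from pathlib import Path


def resolve_model_path(filename, base_dir="."):
    """Return an absolute path to filename, or raise if it does not exist.

    Hypothetical helper for illustration; base_dir is a placeholder.
    """
    path = (Path(base_dir).expanduser() / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError("no such file: %s" % path)
    return str(path)
```

The returned string can be passed as the first argument to KeyedVectors.load_word2vec_format, which makes the load independent of the notebook's current working directory.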

Try this:

import gensim.downloader as api
wv = api.load('word2vec-google-news-300')
vec_king = wv['king']

Also, visit this link:

Here is what worked for me. I loaded a part of the model and not the entire model as it's huge.

!pip install wget
import wget
url = '
filename = wget.download(url)
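import gzip
import shutil
# Decompress the archive; the with-block closes both files so the output is fully flushed.
with gzip.open('GoogleNews-vectors-negative300.bin.gz', 'rb') as f_in, \
        open('GoogleNews-vectors-negative300.bin', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)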
import gensim
from gensim.models import Word2Vec, KeyedVectors
from sklearn.decomposition import PCA
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True, limit=100000)
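To put "huge" in numbers, here is a rough back-of-the-envelope sketch of the memory needed for the raw vectors alone. It assumes float32 storage (4 bytes per value) and the published size of this model, 3 million entries with 300 dimensions each; the function name vector_memory_bytes is made up for the example, and the real footprint in gensim is somewhat larger because the vocabulary is stored too.

```python
BYTES_PER_FLOAT32 = 4


def vector_memory_bytes(num_words, vector_size=300):
    """Approximate bytes needed for num_words float32 vectors."""
    return num_words * vector_size * BYTES_PER_FLOAT32


full = vector_memory_bytes(3000000)    # the whole GoogleNews model
partial = vector_memory_bytes(100000)  # what limit=100000 keeps

print("full model:   %.2f GB" % (full / 1e9))
print("limit=100000: %.2f GB" % (partial / 1e9))
```

That works out to roughly 3.6 GB for the full model versus about 0.12 GB with limit=100000, which is why loading only part of the file is much easier on a machine with limited RAM.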
