python - How to parse Twitter feed data using json.loads(line)


I've downloaded a big stream of Twitter data in JSON format and saved it to a text file. I want to read it in, line by line, and decode each line into a dictionary using json.loads().

My problem is that it throws an error on the first line, which I assume means the function doesn't think the data is JSON? I have added the line I want to decode at the bottom of the post. When I just print the lines the code works fine, but the json.loads() function throws an error.

Here is the code:

    import json

    def decodejson(tweet_data):
        # tweet_data is an iterable of lines read from the saved file
        for line in tweet_data:
            parsedjson = json.loads(line)
            print(parsedjson)  # want to print to confirm it works

Here is the error:

      File "/users/cc756/dropbox/pythonprojects/twitteranalysisassignment/tweet_sentiment.py", line 17, in analysesentiment
        parsedjson = json.loads(line)
      File "/users/cc756/anaconda/envs/tensorflow/lib/python3.5/json/__init__.py", line 319, in loads
        return _default_decoder.decode(s)
      File "/users/cc756/anaconda/envs/tensorflow/lib/python3.5/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/users/cc756/anaconda/envs/tensorflow/lib/python3.5/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Here is the first string:

'b\\'{"delete":{"status":{"id":805444624881811457,"id_str":"805444624881811457","user_id":196129140,"user_id_str":"196129140"},"timestamp_ms":"1500994305560"}}\\'' 

I feel like it should work, but I've been staring at it for an hour with no improvement!

Your strings are in the wrong format. I'm not sure whether you just need to get rid of the 'b\\' (which doesn't make sense) at the beginning, but manually typing the JSON into a shell gives me this:

    In [119]: json.loads(b'{"delete":{"status":{"id":805444624881811457,"id_str":"80
         ...: 5444624881811457","user_id":196129140,"user_id_str":"196129140"},"time
         ...: stamp_ms":"1500994305560"}}')
    Out[119]:
    {u'delete': {u'status': {u'id': 805444624881811457,
       u'id_str': u'805444624881811457',
       u'user_id': 196129140,
       u'user_id_str': u'196129140'},
      u'timestamp_ms': u'1500994305560'}}

Sorry, I'd make this a comment, but imagine trying to post that in a comment... :)

I'm not sure what happened when the string was pasted into the question, but as shown it is not in a valid format for Python, so you may want to correct that.
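One possible reading of the pasted string is that each line of the text file holds the printed form of a bytes object, so the text literally begins with b' and json.loads() fails at character 0. The sketch below is a guess at a workaround under that assumption, not something the question confirms. The helper name parse_repr_line and the shortened sample line are made up for illustration.

```python
import ast
import json

def parse_repr_line(line):
    """Parse a line that may hold the repr of a bytes object, e.g. b'{...}'."""
    text = line.strip()
    # Assumption: the file was written with str(bytes_obj), so the line
    # looks like a Python bytes literal rather than plain JSON.
    if text.startswith(("b'", 'b"')):
        raw = ast.literal_eval(text)   # turn the literal back into bytes
        text = raw.decode("utf-8")     # bytes -> str
    return json.loads(text)

# Shortened, hypothetical sample in the same shape as the first line.
saved_line = 'b\'{"delete":{"timestamp_ms":"1500994305560"}}\''
print(parse_repr_line(saved_line)["delete"]["timestamp_ms"])
```

ast.literal_eval only evaluates literals, so it is a safer choice here than eval. If the file already contains plain JSON text, the helper falls through to json.loads() unchanged.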

Update: the issue was that the data was in binary format and needed to be decoded with data.decode('utf-8').
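A minimal sketch of that fix, assuming the lines arrive as bytes objects (for example from a file opened in binary mode). The function name decode_tweets and the in-memory sample are illustrative only and are not from the original code.

```python
import json

def decode_tweets(lines):
    """Yield one dictionary per non-empty line; lines may be bytes or str."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")  # the fix from the update
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        yield json.loads(line)

# The first record from the question, as a bytes line.
sample = [
    b'{"delete":{"status":{"id":805444624881811457,'
    b'"id_str":"805444624881811457","user_id":196129140,'
    b'"user_id_str":"196129140"},"timestamp_ms":"1500994305560"}}\n'
]
tweets = list(decode_tweets(sample))
print(tweets[0]["delete"]["status"]["user_id"])
```

With a real file, passing the open file object (opened with mode "rb") in place of sample should behave the same way, since iterating over it yields one bytes line at a time.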

