python - How to parse twitter feed data using json.loads(line)
I've downloaded a big stream of Twitter data in JSON format and saved it to a text file. I want to read it in, line by line, and decode each line into a dictionary using json.loads().
My problem is that it throws an error on the first line, which I assume means the function doesn't think the data is JSON? I have added the line I want to decode at the bottom of this post. When I just print the lines the code works fine, but the json.loads() function throws an error.
Here is the code:

    import json

    def decodejson(tweet_data):
        for line in tweet_data:
            parsedjson = json.loads(line)
            print(parsedjson)  # I want to print this to confirm it works.
Here is the error:
  File "/users/cc756/dropbox/pythonprojects/twitteranalysisassignment/tweet_sentiment.py", line 17, in analysesentiment
    parsedjson = json.loads(line)
  File "/users/cc756/anaconda/envs/tensorflow/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/users/cc756/anaconda/envs/tensorflow/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/users/cc756/anaconda/envs/tensorflow/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Here is the first string:
'b\\'{"delete":{"status":{"id":805444624881811457,"id_str":"805444624881811457","user_id":196129140,"user_id_str":"196129140"},"timestamp_ms":"1500994305560"}}\\''
I feel this should work, but I've been staring at it for an hour with no improvement!
Your strings are in the wrong format. I'm not sure how you should get rid of the 'b\\' (which doesn't make sense) at the beginning, but manually typing the string into a shell gives me this:
    In [119]: json.loads(b'{"delete":{"status":{"id":805444624881811457,"id_str":"805444624881811457","user_id":196129140,"user_id_str":"196129140"},"timestamp_ms":"1500994305560"}}')
    Out[119]:
    {u'delete': {u'status': {u'id': 805444624881811457,
       u'id_str': u'805444624881811457',
       u'user_id': 196129140,
       u'user_id_str': u'196129140'},
      u'timestamp_ms': u'1500994305560'}}
Sorry, I'd make this a comment, but imagine trying to post this in a comment... :)
I'm not sure what's up with the pasting of the string in your question, but as written it is in an invalid format for Python, so you may want to correct that.
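For anyone trying to reproduce the error, here is a minimal sketch of one way a line can end up with a leading b' in the file. It assumes (this is a guess, the question does not show the download code) that the bytes object was converted with str() before being written. The names payload and saved_line are made up for illustration, and the payload is a shortened version of the tweet above.

```python
import json

# A shortened stand-in for one raw tweet as it arrives from the stream (bytes).
payload = b'{"delete":{"status":{"id":805444624881811457}},"timestamp_ms":"1500994305560"}'

# Assumption: the bytes were stringified with str() before saving.
# In Python 3 that stores the *repr* of the bytes, so the text starts with b'.
saved_line = str(payload)

try:
    json.loads(saved_line)
    error_message = None
except json.JSONDecodeError as exc:
    # The very first character is "b", which cannot start a JSON value.
    error_message = str(exc)

print(saved_line[:2])
print(error_message)
```

If that assumption holds, the decoder fails at character 0 because it sees the letter b instead of the opening brace, which matches the position reported in the traceback.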
Update: the issue was that the data was in binary format and needed to be decoded with data.decode('utf-8').