Unescape JSON string in Python


I am getting the following string from a log file, and I want to remove the backslashes from the string.

String from the file:

This is the exact string from the log file, except that sensitive info has been replaced with dummy values.

2017-08-17 17:29:49.249  error org.foo.bar.logging.applicationlogger - apierror={"input":"{\"requestbody\":\"{\\\"request\\\":{\\\"drequests\\\":{\\\"items\\\":[{\\\"description\\\":\\\"i add additional card.\\\",\\\"fields\\\":{\\\"field\\\":[{\\\"fieldname\\\":\\\"severity\\\",\\\"fieldvalue\\\":\\\"4\\\"},{\\\"fieldname\\\":\\\"contact\\\",\\\"fieldvalue\\\":\\\"phone\\\"},{\\\"fieldname\\\":\\\"callbacknumber\\\",\\\"fieldvalue\\\":\\\"1 (123) 123456\\\"},{\\\"fieldname\\\":\\\"version\\\",\\\"fieldvalue\\\":\\\"11.1\\\"},{\\\"fieldname\\\":\\\"language\\\",\\\"fieldvalue\\\":\\\"english\\\"}]},\\\"product\\\":\\\"visa\\\",\\\"subject\\\":\\\"adding addition card\\\",\\\"serial_number\\\":\\\"123456789\\\"}]},\\\"email\\\":\\\"someone@gmail.com\\\",\\\"first_name\\\":\\\"foo\\\",\\\"last_name\\\":\\\"bar\\\"}}\"}"} 

Python code:

str = read_from_file()
print str.replace('\\"', '"')

I tried this line of code, but it has no effect. How can I get rid of the backslashes in the JSON string?

Edit

I tried the solution of recursively calling json.loads, but it didn't remove the backslashes.

Just to give better context: I am not processing the JSON string as JSON; instead I am writing it to a file in a form that is more readable for a human. Below is the complete code.

import re
import json
from tailf import tailf


def parserecursive(obj):
    if isinstance(obj, str):
        try:  # see whether it is JSON: if so, parse it
            obj = json.loads(obj)
        except json.JSONDecodeError:
            pass  # if not, leave it as it is
        if isinstance(obj, dict):  # perform recursion
            for prop, val in obj.items():
                obj[prop] = parserecursive(val)
    return obj


for line in tailf("/var/log/test.log"):
    m = re.search('([\d\-:\s]+).*error.*apierror=(.*)', line)
    if m is None:
        print "no match"
    else:
        print m.group(1)
        print parserecursive(m.group(2))

When I run the script, it prints the string, but the backslashes are not removed at all. Also notice the two u' prefixes at the beginning of the second line.

Output:

2017-08-17 17:29:49
{u'input': u'{"requestbody":"{\\"request\\":{\\"drequests\\":{\\"items\\":[{\\"description\\":\\"i add additional card.\\",\\"fields\\":{\\"field\\":[{\\"fieldname\\":\\"severity\\",\\"fieldvalue\\":\\"4\\"},{\\"fieldname\\":\\"contact\\",\\"fieldvalue\\":\\"phone\\"},{\\"fieldname\\":\\"callbacknumber\\",\\"fieldvalue\\":\\"1 (123) 123456\\"},{\\"fieldname\\":\\"version\\",\\"fieldvalue\\":\\"11.1\\"},{\\"fieldname\\":\\"language\\",\\"fieldvalue\\":\\"english\\"}]},\\"product\\":\\"visa\\",\\"subject\\":\\"adding addition card\\",\\"serial_number\\":\\"123456789\\"}]},\\"email\\":\\"someone@gmail.com\\",\\"first_name\\":\\"foo\\",\\"last_name\\":\\"bar\\"}}"}'}

Update

It was a silly mistake! I had to typecast the object to a string and then call the replace function. The code below worked.

import re
from tailf import tailf

for line in tailf("/var/log/test.log"):
    m = re.search('([\d\-:\s]+).*error.*apierror=(.*)', line)
    if m is None:
        print "no match"
    else:
        print m.group(1)
        encodedstring = m.group(2) + ''
        print str(encodedstring).replace('\\', '')

The JSON represented in your string needs those backslashes: apparently the represented object has property values that are themselves JSON-encoded strings. These embedded strings need to have their double quotes escaped. If you remove the backslashes, you make the overall JSON invalid.
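To illustrate why stripping the backslashes breaks the document, here is a minimal sketch. It is written for Python 3 (the code in this thread is Python 2), and the small payload and the helper name is_valid_json are made up for the example; they are not part of the original log or code.

```python
import json

# Hypothetical payload: an object whose "input" value is itself a JSON-encoded string.
inner = json.dumps({"email": "someone@gmail.com"})
outer = json.dumps({"input": inner})

# Removing every backslash also removes the escaping of the embedded quotes.
stripped = outer.replace("\\", "")


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(is_valid_json(outer))     # True
print(is_valid_json(stripped))  # False
```

The stripped text is still readable for a human, but the unescaped quotes end the outer string value early, so it can no longer be parsed.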

What you may need is to resolve the embedded JSON strings into the objects they represent. It is best to use a recursive function that uses the json.loads method to parse each of the nested JSONs.

NB: It is not a good idea to use the name str for your data, as it is the name of a Python data type.
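A small sketch of what can go wrong when str is reused as a variable name. It assumes Python 3, and the function name shadowing_demo is only illustrative.

```python
def shadowing_demo():
    # The local name "str" now refers to a string, hiding the built-in type.
    str = "text read from the log file"
    try:
        str(42)  # tries to call the string object, not the built-in str type
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


print(shadowing_demo())  # prints: TypeError
```

Any later call such as str(value) in the same scope would fail in the same way.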

Here is the suggested solution:

import json
import re


def parserecursive(obj):
    if isinstance(obj, basestring):
        try:  # see whether it is JSON: if so, parse it
            obj = json.loads(obj)
        except ValueError:  # raised by Python 2; also covers json.JSONDecodeError on Python 3
            pass  # if not, leave it as it is
        if isinstance(obj, dict):  # perform recursion
            for prop, val in obj.items():
                obj[prop] = parserecursive(val)
    return obj


# sample input
msg = r'2017-08-17 17:29:49.249  error org.foo.bar.logging.applicationlogger - apierror={"input":"{\"requestbody\":\"{\\\"request\\\":{\\\"drequests\\\":{\\\"items\\\":[{\\\"description\\\":\\\"i add additional card.\\\",\\\"fields\\\":{\\\"field\\\":[{\\\"fieldname\\\":\\\"severity\\\",\\\"fieldvalue\\\":\\\"4\\\"},{\\\"fieldname\\\":\\\"contact\\\",\\\"fieldvalue\\\":\\\"phone\\\"},{\\\"fieldname\\\":\\\"callbacknumber\\\",\\\"fieldvalue\\\":\\\"1 (123) 123456\\\"},{\\\"fieldname\\\":\\\"version\\\",\\\"fieldvalue\\\":\\\"11.1\\\"},{\\\"fieldname\\\":\\\"language\\\",\\\"fieldvalue\\\":\\\"english\\\"}]},\\\"product\\\":\\\"visa\\\",\\\"subject\\\":\\\"adding addition card\\\",\\\"serial_number\\\":\\\"123456789\\\"}]},\\\"email\\\":\\\"someone@gmail.com\\\",\\\"first_name\\\":\\\"foo\\\",\\\"last_name\\\":\\\"bar\\\"}}\"}"}'

# extract the JSON part: take the "{ ... }" part from the message string
match = re.search(r"\{.*\}", msg)
if match:
    # parse the nested JSON
    obj = parserecursive(match.group(0))
    print obj

See it run on repl.it. Note that the output obj is a dict. The u prefix in the output means that the keys are unicode strings. You can access a nested value in it like this:

print obj['input']['requestbody']['request']['email'] 

Output:

someone@gmail.com
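Since the stated goal is to write the entry to a file in a more readable form, one option is to dump the parsed object again with indentation. The following is only a sketch under a few assumptions: it targets Python 3, the function name parse_nested and the shortened payload are invented for the example, and unlike the function above it also walks lists and keeps scalar strings such as "4" as strings instead of converting them to numbers.

```python
import json


def parse_nested(value):
    """Recursively decode strings that contain JSON objects or arrays."""
    if isinstance(value, str):
        try:
            decoded = json.loads(value)
        except ValueError:
            return value  # not JSON: keep the plain string
        # Only descend into containers, so "4" or "11.1" stay strings.
        if isinstance(decoded, (dict, list)):
            return parse_nested(decoded)
        return value
    if isinstance(value, dict):
        return {key: parse_nested(item) for key, item in value.items()}
    if isinstance(value, list):
        return [parse_nested(item) for item in value]
    return value


# Hypothetical, shortened payload that is nested the same way as the log entry.
inner = json.dumps(
    {"request": {"email": "someone@gmail.com", "fields": [{"fieldvalue": "4"}]}}
)
payload = json.dumps({"input": json.dumps({"requestbody": inner})})

obj = parse_nested(payload)
readable = json.dumps(obj, indent=2, sort_keys=True)
print(readable)
```

The resulting text contains no escaping backslashes and is still valid JSON, so it can be written to the output file as is.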

