Unicode from Variable

Hello Folks,

I’m trying to get some data from a web server and show it in Perspective. During testing, some of the data came back with Unicode characters written as \u escape sequences. I tried several ways to display the “right” text, without success. Here is an example:

value = 'Folder Name: 8. Manuten\u00e7\u00e3o'
type(value)
print value
value2 = unicode(value,'utf-8')
type(value2)
print value2  # Does not work

print u'Folder Name: 8. Manuten\u00e7\u00e3o'  # Works OK

If I type the string directly in the print statement with the u prefix, it works normally, but if I try the same with a variable, it does not work.

Does anyone have an idea how to fix this issue?

Thanks in Advance!

A simple import from __future__ will fix your problem.

from __future__ import unicode_literals

value = 'Folder Name: 8. Manuten\u00e7\u00e3o'
type(value)
print value
value2 = unicode(value,'utf-8')
type(value2)
print value2 

Will produce the following output:

>>> 
<type 'unicode'>
Folder Name: 8. Manutenção
<type 'unicode'>
Folder Name: 8. Manutenção

Your first line should be:

value = u'Folder Name: 8. Manuten\u00e7\u00e3o'

Or use the __future__ import to always get unicode. The problem is that you must start with unicode. Once the text is in a classic str, the \u escapes are just literal characters, and you’ve already broken your chance to interpret the unicode correctly. You can’t do it after the fact.
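To see why the type matters, here is a minimal sketch comparing the two kinds of string. The first literal is written with a doubled backslash so that it behaves the same on Python 2 (Jython) and Python 3; the names raw and text are only illustrative.

```python
# In a Python 2 byte string, '\u00e7' is NOT an escape: it stays as the six
# characters \ u 0 0 e 7. The doubled backslash reproduces that on any version.
raw = 'Manuten\\u00e7\\u00e3o'

# In a unicode literal, the same escape is a single character.
text = u'Manuten\u00e7\u00e3o'

print(len(raw))     # 20: each escape is six separate characters
print(len(text))    # 10: each escape collapsed to one character
print(raw == text)  # False
```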

So, if you are pulling from a web API, you must use methods that interpret UTF-8 as the data arrives. Please show the rest of your code, and explain the data flow more completely.
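As a rough illustration of decoding as the data arrives: the bytes below are a hypothetical stand-in for a UTF-8 response body, not output from the real service. Decoding once at the boundary gives a unicode object that the rest of the script can use directly.

```python
# Hypothetical stand-in for bytes read from an HTTP response (UTF-8 encoded).
body = b'Folder Name: 8. Manuten\xc3\xa7\xc3\xa3o'

# Decode once, right where the bytes enter the program.
text = body.decode('utf-8')

print(text == u'Folder Name: 8. Manuten\u00e7\u00e3o')  # True
print(len(body) - len(text))  # 2: each accented character took two bytes
```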

Thanks for the quick response guys.

I am getting the data from Microsoft Graph, requesting the DriveItems of a specific Drive. It returns JSON with information about the subfolders of a SharePoint Document Library and about the files in the root of the drive.

def getItensOnDriveItem(SiteID, DriveID, Path):
	import httplib, urllib, system, ast, shared.HTTP.SharePointV2
	# readBlocking returns a list of qualified values, so take the first one's value
	token_expired = system.tag.readBlocking(['[default]SharePointApp/Token/token_expired'])[0].value
	if token_expired:
		access_token = shared.HTTP.SharePointV2.GetToken()
	else:
		access_token = system.tag.readBlocking(['[default]SharePointApp/Token/access_token'])[0].value
		
	url = "/v1.0/sites/" + SiteID + "/drives/" + DriveID + "/root:/"+ Path + ":/children"
	conn = httplib.HTTPSConnection("graph.microsoft.com")
	payload = ''
	headers = {
	  'Authorization': 'Bearer ' + access_token
	}
	conn.request("GET", url, payload, headers)
	res = conn.getresponse()
	data = res.read()
	
	return (ast.literal_eval(data.decode('utf-8')))

After checking your replies, I found the issue: in the previous code I was using ast to convert the JSON to a dict.

I changed the code to use json.loads, and the Unicode text now works fine with this code:

def getItensOnDriveItem(SiteID, DriveID, Path):
	import httplib, urllib, system, json, shared.HTTP.SharePointV2
	# readBlocking returns a list of qualified values, so take the first one's value
	token_expired = system.tag.readBlocking(['[default]SharePointApp/Token/token_expired'])[0].value
	if token_expired:
		access_token = shared.HTTP.SharePointV2.GetToken()
	else:
		access_token = system.tag.readBlocking(['[default]SharePointApp/Token/access_token'])[0].value
		
	url = "/v1.0/sites/" + SiteID + "/drives/" + DriveID + "/root:/"+ Path + ":/children"
	conn = httplib.HTTPSConnection("graph.microsoft.com")
	payload = ''
	headers = {
	  'Authorization': 'Bearer ' + access_token
	}
	conn.request("GET", url, payload, headers)
	res = conn.getresponse()
	data = res.read()
	
	return (json.loads(data.decode('utf-8')))
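
For reference, a small sketch of why the switch helps. JSON is not Python literal syntax: \uXXXX escapes and values such as true and null are defined by JSON and handled by json.loads, whereas ast.literal_eval only accepts Python literals (and on Python 2 it leaves \uXXXX inside a plain string literal undecoded). The payload below is a made-up sample with a hypothetical isFolder field, not an actual Microsoft Graph response.

```python
import ast
import json

# Made-up sample payload; the doubled backslashes put a literal \u00e7 in the
# text, as it would appear on the wire.
payload = '{"value": [{"name": "8. Manuten\\u00e7\\u00e3o", "isFolder": true}]}'

items = json.loads(payload)["value"]
print(items[0]["name"] == u'8. Manuten\u00e7\u00e3o')  # True: escapes decoded
print(items[0]["isFolder"])                             # True: JSON true -> Python True

# ast.literal_eval does not know JSON's lowercase true, so it rejects the payload.
try:
    ast.literal_eval(payload)
except ValueError as exc:
    print("literal_eval failed: %s" % exc)
```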