python - Writing Unicode to .docx file


Is there a way to write Unicode characters to docx files? I tried python-docx, but it's giving me a TypeError.

Traceback

Traceback (most recent call last):
  File "<ipython-input-1-ba89c735995d>", line 1, in <module>
    runfile('h:/python/practice/new/download.py', wdir='h:/python/practice/new')
  File "c:\program files\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)
  File "c:\program files\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "h:/python/practice/new/download.py", line 37, in <module>
    document.add_paragraph(story.encode("utf-8"))
  File "c:\program files\anaconda3\lib\site-packages\docx\document.py", line 63, in add_paragraph
    return self._body.add_paragraph(text, style)
  File "c:\program files\anaconda3\lib\site-packages\docx\blkcntnr.py", line 36, in add_paragraph
    paragraph.add_run(text)
  File "c:\program files\anaconda3\lib\site-packages\docx\text\paragraph.py", line 37, in add_run
    run.text = text
  File "c:\program files\anaconda3\lib\site-packages\docx\text\run.py", line 163, in text
    self._r.text = text
  File "c:\program files\anaconda3\lib\site-packages\docx\oxml\text\run.py", line 104, in text
    _RunContentAppender.append_to_run_from_text(self, text)
  File "c:\program files\anaconda3\lib\site-packages\docx\oxml\text\run.py", line 134, in append_to_run_from_text
    appender.add_text(text)
  File "c:\program files\anaconda3\lib\site-packages\docx\oxml\text\run.py", line 142, in add_text
    self.add_char(char)
  File "c:\program files\anaconda3\lib\site-packages\docx\oxml\text\run.py", line 156, in add_char
    elif char in '\r\n':
TypeError: 'in <string>' requires string as left operand, not int

I am trying to scrape a website and write the text to an MS Word file. The text is in a local language (Bangla). When I print story to the console, it prints the whole text perfectly.

Code

import requests
from docx import Document
from bs4 import BeautifulSoup

url = "some url"
response = requests.get(url)
soup = BeautifulSoup(response.text, "lxml")

story = soup.find("div", {"dir": "ltr"}).get_text().replace("<br />", "\n", len(response.text))

title = "title"
document = Document()
document.add_heading(title, 0)
document.add_paragraph(story.encode("utf-8"))  # line 37 in the traceback
document.save(title + ".docx")

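In Python 3, strings (should) automatically support Unicode, so you don't need any special encoding to UTF-8. Calling .encode("utf-8") turns the str into a bytes object, and iterating over bytes yields integers rather than characters. That is why the `char in '\r\n'` check inside python-docx raises the TypeError.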

document.add_paragraph(story)
#                      ^ pass the str directly; no need to call `.encode('utf-8')`
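To see why the encoded value fails, here is a minimal sketch that mimics the character loop shown at the bottom of the traceback. It uses only the standard library; `scan_for_newlines` and the sample text are made up for illustration and are not part of python-docx.

```python
def scan_for_newlines(text):
    # Hypothetical helper: counts newline characters using the same
    # kind of membership test that appears in the traceback.
    count = 0
    for char in text:
        # For str, `char` is a 1-character str; for bytes, it is an int.
        if char in "\r\n":
            count += 1
    return count


# Sample text standing in for the scraped story (an assumption).
story = "আমার সোনার বাংলা\nsecond line"

# Works: iterating a str yields str characters.
print(scan_for_newlines(story))

try:
    # Fails: iterating bytes yields ints, so the `in` test raises.
    scan_for_newlines(story.encode("utf-8"))
except TypeError as exc:
    print("TypeError:", exc)
```

Passing the str through unchanged avoids the problem entirely, which is all the fix above does.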

While you can write Unicode characters to the document, the default text rendering engine used by the word processor may not support them. You may need to specify a font to ensure the characters won't become □□□□.

paragraph = document.add_paragraph()
run = paragraph.add_run(story)  # put the text in the run whose font is being set
font = run.font
font.name = 'Vrinda'
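Whether a font override is worth setting depends on what the text actually contains. As a rough sketch, assuming it is enough to look at Unicode character names, the standard library can tell you whether a string has any Bangla characters. `contains_bengali` is a hypothetical helper name, not something python-docx provides.

```python
import unicodedata


def contains_bengali(text):
    # Hypothetical helper: True if any character's Unicode name
    # starts with "BENGALI" (the Unicode name for the Bangla script).
    for char in text:
        # Pass "" as the default so unnamed characters (such as "\n")
        # do not raise ValueError.
        if unicodedata.name(char, "").startswith("BENGALI"):
            return True
    return False


print(contains_bengali("বাংলা"))
print(contains_bengali("plain ASCII text"))
```

A check like this could decide whether to set the font name on a run before saving the document.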
