Replace all newline characters using Python

content = content.replace("\\r\\n", "")

You need to double-escape the backslashes so that the pattern matches a literal backslash followed by r or n.
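A minimal sketch of why the escaping matters, assuming the text holds the two-character sequences backslash-r and backslash-n rather than real line breaks. The sample strings and variable names are made up for illustration.

```python
# Text containing the literal characters backslash-r backslash-n
# (hypothetical sample, e.g. from an already-escaped string).
literal = "first line\\r\\nsecond line"

# Text containing a real carriage return and line feed.
real = "first line\r\nsecond line"

# "\\r\\n" matches only the literal backslash sequences.
literal_cleaned = literal.replace("\\r\\n", "")
real_unchanged = real.replace("\\r\\n", "")

# "\r\n" matches only the real control characters.
real_cleaned = real.replace("\r\n", "")

print(literal_cleaned)
print(real_unchanged == real)
print(real_cleaned)
```

If neither pattern changes your text, printing repr(content) shows which form the line breaks actually take.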

I don't have access to your PDF file, so I processed one on my system. I also don't know whether you need to remove all newlines or just double newlines. The code below removes double newlines, which makes the output more readable.

Please let me know if this works for your current needs.

from tika import parser

filename = 'myfile.pdf'

# Parse the PDF
parsedPDF = parser.from_file(filename)

# Extract the text content from the parsed PDF
pdf = parsedPDF['content']

# Convert double newlines into single newlines
pdf = pdf.replace('\n\n', '\n')

#####################################
# Do something with the PDF
#####################################
print(pdf)
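A single replace of two newlines with one only halves each run, so three or more consecutive newlines still leave blank lines behind. Below is a rough sketch of one alternative using the standard re module. The helper name collapse_blank_lines and the sample text are hypothetical and not part of the answer above.

```python
import re


def collapse_blank_lines(text):
    """Collapse every run of two or more line breaks into one newline."""
    # (?:\r?\n) matches either a Unix (\n) or a Windows (\r\n) line break.
    return re.sub(r"(?:\r?\n){2,}", "\n", text)


# Made-up sample standing in for the extracted PDF text.
sample = "Title\n\n\n\nFirst paragraph\r\n\r\nSecond paragraph\n"

# One pass of str.replace still leaves a blank line after "Title"
# and does not touch the \r\n\r\n pair at all.
single_pass = sample.replace("\n\n", "\n")

collapsed = collapse_blank_lines(sample)
print(repr(single_pass))
print(repr(collapsed))
```

Whether the Windows-style breaks need handling depends on what the extraction step returns, so the \r? part can be dropped if the text only ever contains \n.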

print(open('myfile.txt').read().replace('\n', ''))
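The one-liner leaves the file handle for the interpreter to close and removes only \n characters. A possible variant, sketched under the assumption that the file is UTF-8 text, closes the file explicitly and strips every kind of line break. The function name read_without_line_breaks and the temporary demo file are illustrative only.

```python
import os
import tempfile


def read_without_line_breaks(path):
    """Return the file's text with every line break removed."""
    # newline="" stops Python from translating \r\n to \n while reading,
    # so the original line endings reach splitlines() unchanged.
    with open(path, encoding="utf-8", newline="") as handle:
        text = handle.read()
    # splitlines() splits on \n, \r\n and \r (among other line boundaries).
    return "".join(text.splitlines())


# Throwaway file with mixed line endings so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8", newline="") as tmp:
    tmp.write("one\r\ntwo\nthree\r")
    demo_path = tmp.name

result = read_without_line_breaks(demo_path)
os.remove(demo_path)
print(result)
```

Note that splitlines() also treats a few less common characters, such as form feed, as line boundaries, so plain replace calls may be preferable when those must be kept.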
