Simple CSV to XML Conversion – Python

A possible solution is to first load the CSV into pandas and then convert it row by row into XML, like so:

import pandas as pd
df = pd.read_csv('untitled.txt', sep='|')

With the sample data (assuming a | separator, among other details) loaded as:

          Title                   Type Format  Year Rating  Stars  
0  Enemy Behind           War,Thriller    DVD  2003     PG     10   
1  Transformers  Anime,Science Fiction    DVD  1989      R      9   

             Description  
0          Talk about...  
1  A Schientific fiction  

And then converting to xml with a custom function:

def convert_row(row):
    return """<movie title="%s">
    <type>%s</type>
    <format>%s</format>
    <year>%s</year>
    <rating>%s</rating>
    <stars>%s</stars>
    <description>%s</description>
</movie>""" % (
    row.Title, row.Type, row.Format, row.Year, row.Rating, row.Stars, row.Description)

print('\n'.join(df.apply(convert_row, axis=1)))

This way you get a string containing the XML:

<movie title="Enemy Behind">
    <type>War,Thriller</type>
    <format>DVD</format>
    <year>2003</year>
    <rating>PG</rating>
    <stars>10</stars>
    <description>Talk about...</description>
</movie>
<movie title="Transformers">
    <type>Anime,Science Fiction</type>
    <format>DVD</format>
    <year>1989</year>
    <rating>R</rating>
    <stars>9</stars>
    <description>A Schientific fiction</description>
</movie>

which you can then write to a file.

Inspired by this great answer.
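Plain string formatting does not escape characters such as & or <, so a title or description containing them would produce invalid XML. A minimal sketch of one way to handle that, using escape and quoteattr from the standard library's xml.sax.saxutils. The row dictionary and the name convert_row_escaped are hypothetical, chosen only for illustration:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Hypothetical row (not from the original data) with characters that
# must be escaped in XML.
row = {
    "Title": "Tom & Jerry",
    "Type": "Anime,Comedy",
    "Format": "DVD",
    "Year": 1989,
    "Rating": "R",
    "Stars": 9,
    "Description": "Cat < mouse",
}

def convert_row_escaped(row):
    """Format one row as a <movie> element, escaping every value."""
    return """<movie title=%s>
    <type>%s</type>
    <format>%s</format>
    <year>%s</year>
    <rating>%s</rating>
    <stars>%s</stars>
    <description>%s</description>
</movie>""" % (
        quoteattr(str(row["Title"])),  # quoteattr adds the surrounding quotes
        escape(str(row["Type"])),
        escape(str(row["Format"])),
        escape(str(row["Year"])),
        escape(str(row["Rating"])),
        escape(str(row["Stars"])),
        escape(str(row["Description"])),
    )

xml_text = convert_row_escaped(row)
print(xml_text)

# Parsing the result back confirms that it is well-formed.
movie = ET.fromstring(xml_text)
```

With a DataFrame, the same function should work with df.apply(convert_row_escaped, axis=1), since each row can be indexed by column name.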


Edit: Using the loading method you posted (or a version that actually loads the data to a variable):

import csv
f = open('movies2.csv')
csv_f = csv.reader(f)
data = []

for row in csv_f:
   data.append(row)
f.close()

print(data[1:])

We get:

[['Enemy Behind', 'War', 'Thriller', 'DVD', '2003', 'PG', '10', 'Talk about...'], ['Transformers', 'Anime', 'Science Fiction', 'DVD', '1989', 'R', '9', 'A Schientific fiction']]

And we can convert to XML with minor modifications:

def convert_row(row):
    return """<movie title="%s">
    <type>%s</type>
    <format>%s</format>
    <year>%s</year>
    <rating>%s</rating>
    <stars>%s</stars>
    <description>%s</description>
</movie>""" % (row[0], row[1], row[2], row[3], row[4], row[5], row[6])

print('\n'.join([convert_row(row) for row in data[1:]]))

This gives the same structure, but note that the values are shifted by one column: the Type values contain a comma, so csv.reader splits them into two fields and the description is dropped:

<movie title="Enemy Behind">
    <type>War</type>
    <format>Thriller</format>
    <year>DVD</year>
    <rating>2003</rating>
    <stars>PG</stars>
    <description>10</description>
</movie>
<movie title="Transformers">
    <type>Anime</type>
    <format>Science Fiction</format>
    <year>DVD</year>
    <rating>1989</rating>
    <stars>R</stars>
    <description>9</description>
</movie>
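One way to keep a comma-containing field such as Type in a single column is to quote it in the CSV file, which csv.reader handles by default. A small sketch, assuming the file could be written that way. The sample text below is hypothetical and stands in for movies2.csv:

```python
import csv
import io

# Hypothetical file contents: the Type field is wrapped in double quotes
# because it contains a comma.
quoted = io.StringIO(
    'Title,Type,Format,Year,Rating,Stars,Description\n'
    'Enemy Behind,"War, Thriller",DVD,2003,PG,10,Talk about...\n'
)
# The same row without the quotes, for comparison.
unquoted = io.StringIO(
    'Title,Type,Format,Year,Rating,Stars,Description\n'
    'Enemy Behind,War, Thriller,DVD,2003,PG,10,Talk about...\n'
)

quoted_rows = list(csv.reader(quoted))
unquoted_rows = list(csv.reader(unquoted))

# The quoted row keeps seven fields; the unquoted one is split into eight.
print(len(quoted_rows[1]), quoted_rows[1][1])
print(len(unquoted_rows[1]), unquoted_rows[1][1])
```

If the file cannot be changed, a different delimiter (as in the pandas example with sep='|') is another option.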

I tried to generalize robertoia's convert_row function so that it works with any header, instead of writing the tags by hand.

import csv
import pandas as pd

f = open('movies2.csv')
csv_f = csv.reader(f)
data = []

for row in csv_f:
   data.append(row)
f.close()

# pandas is only used here to read the column names
df = pd.read_csv('movies2.csv')
header = list(df.columns)

def convert_row(row):
     # one <tag>value</tag> line per column after the first
     str_row = '<%s>%s</%s>\n' * (len(header) - 1)
     # the first column opens the element and closes it at the end
     str_row = '<%s>%s' + '\n' + str_row + '</%s>'
     # interleave (tag, value, tag) for every column after the first
     var_values = [list_of_elements[k] for k in range(1, len(header)) for list_of_elements in [header, row, header]]
     var_values = [header[0], row[0]] + var_values + [header[0]]
     var_values = tuple(var_values)
     return str_row % var_values

text = '<collection shelf="New Arrivals">' + '\n' + '\n'.join([convert_row(row) for row in data[1:]]) + '\n' + '</collection>'
print(text)
with open('output.xml', 'w') as myfile:
  myfile.write(text)
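The same header-driven idea can also be sketched with csv.DictReader and xml.etree.ElementTree from the standard library, which takes care of escaping and closing tags. This is an alternative to the string-formatting approach above, not the original author's method. The function name rows_to_xml, the row tag "movie", and the sample text are assumptions for illustration, and the column names are assumed to be valid XML tag names:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for movies2.csv, with the Type field quoted.
sample = io.StringIO(
    'Title,Type,Format,Year,Rating,Stars,Description\n'
    'Enemy Behind,"War, Thriller",DVD,2003,PG,10,Talk about...\n'
    'Transformers,"Anime, Science Fiction",DVD,1989,R,9,A Schientific fiction\n'
)

def rows_to_xml(records, root_tag="collection", row_tag="movie"):
    """Build one child element per record, with one sub-element per column."""
    root = ET.Element(root_tag)
    for record in records:
        row_el = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Tag names come straight from the header, lowercased.
            ET.SubElement(row_el, column.lower()).text = value
    return root

root = rows_to_xml(csv.DictReader(sample))
root.set("shelf", "New Arrivals")

# ET.indent is only available in Python 3.9+, so it is applied conditionally.
if hasattr(ET, "indent"):
    ET.indent(root)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because the tags are taken from the header row, adding or renaming a column in the CSV changes the output without touching the code.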

Of course, with recent versions of pandas (1.3 and later) it is simpler to just use to_xml():

df = pd.read_csv('movies2.csv')
with open('outputf.xml', 'w') as myfile:
  myfile.write(df.to_xml())
