最近写了个读写表格的工具,在这里记录一下相关技术。
首先明确一下常见表格的格式,最常见的是xls和xlsx,前者是Excel2003版本的默认保存格式,后者是Excel2007以后版本的默认保存格式。
Excel2007向下兼容xls格式,但Excel2003不能兼容xlsx格式。
因此考虑到兼容性,最好采用xls格式作为默认格式进行处理。
除此以外还有一种xlsm格式,是一种特殊的xlsx文件,只是包含了宏命令,在数据处理中比较少见。
xlsx读取
import xlrd wb = xlrd.open_workbook(filename="test.xlsx")#打开文件 print(wb.sheet_names())#获取所有表格名字 sheet1 = wb.sheet_by_index(0)#通过索引获取表格 sheet2 = wb.sheet_by_name('年级')#通过名字获取表格 print(sheet1.name,sheet1.nrows,sheet1.ncols) rows = sheet1.row_values(2)#获取行内容 cols = sheet1.col_values(3)#获取列内容 print(rows) print(cols) print(sheet1.cell(1,0).value)#获取表格里的内容,三种方式 print(sheet1.cell_value(1,0)) print(sheet1.row(1)[0].value)
xlsx写入
import xlwt f = xlwt.Workbook() sheet1 = f.add_sheet('学生',cell_overwrite_ok=True) row0 = ["姓名","年龄","出生日期","爱好"] colum0 = ["张三","李四","恋习Python","小明","小红","无名"] #写第一行 for i in range(0,len(row0)): sheet1.write(0,i,row0[i],set_style('Times New Roman',220,True)) #写第一列 for i in range(0,len(colum0)): sheet1.write(i+1,0,colum0[i],set_style('Times New Roman',220,True)) sheet1.write(1,3,'2006/12/12') sheet1.write_merge(6,6,1,3,'未知')#合并行单元格 sheet1.write_merge(1,2,3,3,'打游戏')#合并列单元格 sheet1.write_merge(4,5,3,3,'打篮球') f.save('test.xls')
除了Excel以外,还有一种类表格csv格式,这是一种明文存储,以行和分隔符为界的数据存储格式, 第一行可以是列名 ,分隔符默认是逗号,也可以是分号或制表符。
许多程序特别是数据库都支持导入导出该格式,用来作为与其他软件进行数据交换的通用格式。
csv格式读写可以使用python自带的csv库,或者第三方的pandas库。
读csv
import csv with open('test.csv','r') as myFile: lines=csv.reader(myFile) for line in lines: print line
import pandas pandas.read_csv('test.csv',encoding='gbk') #其他常用参数 #sep=";" 指定逗号以外的分隔符 #delimiter=";" 备选分隔符,指定了这个则sep失效 #usecols=["name"] 读取部分列,可以指定列序号或者列名
写csv
import csv #由于默认写入一行后会留空一行会写入,因此newline=""的意思就是不留空行。 with open('frank.csv','w', newline='' ) as myFile: writer = csv.writer(myFile) writer.writerow(['frank', 'yangchao']) writer.writerows([('Bob','Tom'), ('liuwei', 'liuyong')])
import pandas Data = {'col2': [3, 4], 'col4': [1, 2]} DF = pd.DataFrame(Data) #可以避免dict的key无序性 DF.drop_duplicates() #删除重复数据 DF.to_csv(r'test.csv',encoding='gbk')
json格式读写csv
import csv with open('frank.csv','r') as myFile: reader = csv.DictReader(myFile) #第一行作为key,其他行为value
import csv with open('frank.csv','w',newline='') as myFile: writer = csv.DictWriter(myFile, fieldnames=['name', 'address']) # 这里注意,fieldname列表中的字段为需要存入的字典的key,存入之后就存入key所对应的value writer.writeheader() writer.writerows([{'name':'Tom','address':'American'}, {'name':'Bob', 'address':'China'}])
csv转换成xlsx格式
import csv import xlwt with open('1.csv', 'r', encoding='utf-8') as f: read = csv.reader(f) workbook = xlwt.Workbook() sheet = workbook.add_sheet('data') # 创建一个sheet表格 l = 0 for line in read: r = 0 for i in line: sheet.write(l, r, i) # 一个一个将单元格数据写入 r = r + 1 l = l + 1 workbook.save('1.xlsx') # 保存Excel
import pandas as pd csv = pd.read_csv('1.csv', encoding='utf-8') csv.to_excel('1.xlsx', sheet_name='data')
xlsx转换成csv格式
import xlrd import csv workbook = xlrd.open_workbook('1.xlsx') table = workbook.sheet_by_index(0) with codecs.open('1.csv', 'w', encoding='utf-8') as f: write = csv.writer(f) for row_num in range(table.nrows): row_value = table.row_values(row_num) write.writerow(row_value)
import pandas as pd data_xls = pd.read_excel('1.xlsx', index_col=0) data_xls.to_csv('1.csv', encoding='utf-8')
人生的旅程就是这样,用大把的时间迷茫在几个瞬间成长。
——瑞卡斯


评论
A cool post there mate ! Thank you for posting.
I was reading through some of your content on this internet site and I believe this web site is very informative ! Continue posting .
Thanks so much for this, keep up the good work 🙂
Thanks for sharing the information. I found the information very useful. That’s a awesome story you posted. I will come back to scan some more.
Enjoyed reading through this, very good material. Thanks!
I’ve been surfing online more than 3 hours today, yet I never found any interesting article like yours. It’s pretty worth enough for me. In my view, if all web owners and bloggers made good content as you did, the net will be much more useful than ever before.