1. The difference between iloc and loc:
    import pandas as pd
    import numpy as np

    a = np.arange(12).reshape(3, 4)
    print(a)
    # [[ 0  1  2  3]
    #  [ 4  5  6  7]
    #  [ 8  9 10 11]]

    df = pd.DataFrame(a)
    print(df)
    #    0  1   2   3
    # 0  0  1   2   3
    # 1  4  5   6   7
    # 2  8  9  10  11

    # With the default integer index, label 0 and position 0 are the same row.
    print(df.loc[0])
    # 0    0
    # 1    1
    # 2    2
    # 3    3
    # Name: 0, dtype: int32

    print(df.iloc[0])
    # 0    0
    # 1    1
    # 2    2
    # 3    3
    # Name: 0, dtype: int32

    # The same holds for the default integer column labels.
    print(df.loc[:, [0, 3]])
    #    0   3
    # 0  0   3
    # 1  4   7
    # 2  8  11

    print(df.iloc[:, [0, 3]])
    #    0   3
    # 0  0   3
    # 1  4   7
    # 2  8  11
Next, change the row labels from [0, 1, 2] to ['a', 'b', 'c']. The result looks like this:
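    df.index = ['a', 'b', 'c']
    print(df)
    #    0  1   2   3
    # a  0  1   2   3
    # b  4  5   6   7
    # c  8  9  10  11

    # loc looks rows up by label, and 0 is no longer a row label.
    print(df.loc[0])
    # TypeError: cannot do label indexing on ... with these indexers [0] of ...

    # iloc looks rows up by integer position, so position 0 still works.
    print(df.iloc[0])
    # 0    0
    # 1    1
    # 2    2
    # 3    3
    # Name: a, dtype: int32

    # iloc does not accept labels.
    print(df.iloc['a'])
    # TypeError: cannot do positional indexing on ... with these indexers [a] of ...

    print(df.loc['a'])  # correct
    # 0    0
    # 1    1
    # 2    2
    # 3    3
    # Name: a, dtype: int32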
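The behaviour above can be checked with a small sketch. It assumes Python 3 and a recent pandas, where the exception types and messages may differ from the ones quoted above. The helper name `try_lookup` is hypothetical and exists only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), index=['a', 'b', 'c'])


def try_lookup(indexer, key):
    """Hypothetical helper: return the selection, or the name of the exception raised."""
    try:
        return indexer[key]
    except (KeyError, TypeError, ValueError, IndexError) as exc:
        # The exact exception class depends on the pandas version.
        return type(exc).__name__


by_label = try_lookup(df.loc, 'a')      # label lookup succeeds
by_position = try_lookup(df.iloc, 0)    # positional lookup succeeds
wrong_loc = try_lookup(df.loc, 0)       # 0 is not a row label any more
wrong_iloc = try_lookup(df.iloc, 'a')   # 'a' is not an integer position

print(by_label.tolist(), by_position.tolist())
print(wrong_loc, wrong_iloc)
```

Catching a group of exception classes rather than a single one is a deliberate choice here, since the class raised for a bad key has changed between pandas releases.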
Similarly, change the column labels from [0, 1, 2, 3] to ['A', 'B', 'C', 'D']. The result looks like this:
    df.columns = ['A', 'B', 'C', 'D']
    print(df)
    #    A  B   C   D
    # a  0  1   2   3
    # b  4  5   6   7
    # c  8  9  10  11

    # loc selects the column by its label.
    print(df.loc[:, 'A'])
    # a    0
    # b    4
    # c    8
    # Name: A, dtype: int32

    # iloc rejects a column label.
    print(df.iloc[:, 'A'])
    # ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types
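The error message above notes that an integer slice includes its start and excludes its end. Label slices with loc behave differently: both endpoints are included. A minimal sketch of that contrast, assuming Python 3 and a recent pandas (the variable names are illustrative only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    index=['a', 'b', 'c'],
    columns=['A', 'B', 'C', 'D'],
)

label_rows = df.loc['a':'b']       # label slice: both 'a' and 'b' are included
position_rows = df.iloc[0:2]       # position slice: rows 0 and 1, row 2 is excluded
label_cols = df.loc[:, 'A':'C']    # columns A, B and C
position_cols = df.iloc[:, 0:2]    # columns at positions 0 and 1 only

print(label_rows.index.tolist(), position_rows.index.tolist())
print(label_cols.columns.tolist(), position_cols.columns.tolist())
```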
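    # .ix accepts both labels and integer positions.
    # Note: .ix was deprecated in pandas 0.20.0 and removed in pandas 1.0.0,
    # so the calls below only run on older versions.
    print(df.ix[0])
    # A    0
    # B    1
    # C    2
    # D    3
    # Name: a, dtype: int32

    print(df.ix['a'])
    # A    0
    # B    1
    # C    2
    # D    3
    # Name: a, dtype: int32

    print(df.ix[:, 0])
    # a    0
    # b    4
    # c    8
    # Name: A, dtype: int32

    print(df.ix[:, 'A'])
    # a    0
    # b    4
    # c    8
    # Name: A, dtype: int32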
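On pandas versions that no longer provide .ix, the same four selections can be written with iloc and loc. The following is a sketch under that assumption, not the original author's code. The mixed lookup at the end (row by label, column by position) is one possible way to combine the two and is added purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    index=['a', 'b', 'c'],
    columns=['A', 'B', 'C', 'D'],
)

row_by_position = df.iloc[0]       # in place of df.ix[0]
row_by_label = df.loc['a']         # in place of df.ix['a']
col_by_position = df.iloc[:, 0]    # in place of df.ix[:, 0]
col_by_label = df.loc[:, 'A']      # in place of df.ix[:, 'A']

# Mixed access: translate the column position to its label, then use loc.
mixed = df.loc['b', df.columns[2]]

print(row_by_position.equals(row_by_label))
print(col_by_position.equals(col_by_label))
print(mixed)
```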