Strip / trim all strings of a dataframe

While cleaning the values of a mixed-type DataFrame in Python/pandas, I want to trim the strings. I am currently doing it in two statements:

import pandas as pd

df = pd.DataFrame([['  a  ', 10], ['  c  ', 5]])

df.replace(r'^\s+', '', regex=True, inplace=True)  # leading whitespace
df.replace(r'\s+$', '', regex=True, inplace=True)  # trailing whitespace

df.values

This is quite slow. What could I improve?

Best answer 1

You can use DataFrame.select_dtypes to select the string columns and then apply str.strip to them.

Note: the values must not be of types such as dicts or lists, because columns holding them also have dtype object.

df_obj = df.select_dtypes('object')
# to also process categorical columns that hold strings:
# df_obj = df.select_dtypes(['object', 'category'])
print(df_obj)
       0
0    a  
1    c  

# strip every selected column and write the result back
df[df_obj.columns] = df_obj.apply(lambda x: x.str.strip())
print(df)

   0   1
0  a  10
1  c   5
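If an object column might also hold non-string values such as lists or dicts (the case the note above warns about), one option is to strip element by element and leave everything that is not a str alone. This is only a minimal sketch: the helper name strip_string_values and the sample frame are made up for illustration, and element-wise mapping is likely slower than the vectorized str.strip.

```python
import pandas as pd


def strip_string_values(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with whitespace stripped from str values only.

    Hypothetical helper, not part of pandas.
    """
    out = df.copy()
    obj_cols = out.select_dtypes(include=["object"]).columns
    for col in obj_cols:
        # Series.map calls the function on each element; anything that is
        # not a str (lists, dicts, None, numbers) is passed through as-is.
        out[col] = out[col].map(lambda v: v.strip() if isinstance(v, str) else v)
    return out


# Illustrative data: an object column mixing a string, a list and None.
mixed = pd.DataFrame({"name": ["  a  ", ["x", "y"], None], "n": [10, 5, 1]})
cleaned = strip_string_values(mixed)
print(cleaned)
```

The original frame is not modified, since the helper works on a copy.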

However, if there are only a few columns, use str.strip on them directly:

df[0] = df[0].str.strip()
