3-1. EDA-Titanic Practice 3


part 3

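import seaborn as sns  # df_train is assumed to be the DataFrame loaded in the earlier parts

# distplot is deprecated; histplot with kde=True and stat='density' draws the same kind of plot.
sns.histplot(df_train.loc[df_train['sex'] == 'male', 'age'], kde=True, stat='density')

sns.histplot(df_train.loc[df_train['sex'] == 'female', 'age'], kde=True, stat='density')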

sns.lmplot(x='age', y='survived', hue='sex', data=df_train)

# factorplot was renamed to catplot; x must be passed as a keyword argument.
sns.catplot(x='age', kind='count', hue='survived', data=df_train)

sns.catplot(x='age', kind='count', hue='survived', data=df_train[df_train['age'] < 6])

sns.catplot(x='age', kind='count', hue='survived', data=df_train[df_train['age'] > 70])

sns.catplot(x='embarked', kind='count', hue='survived', data=df_train)

sns.catplot(x='cabin', kind='count', hue='survived', data=df_train)
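Counting every distinct age value gives a very wide plot, so one option is to group ages into bands first. The following is only a rough sketch: the rows in `toy` are made up for illustration (they are not the Titanic data), and the bin edges and labels are hypothetical choices, not something taken from the original notebook.

```python
import pandas as pd

# Toy rows invented for illustration; not the real Titanic data.
toy = pd.DataFrame({
    'age': [2.0, 4.0, 25.0, 31.0, 47.0, 71.0, 80.0, None],
    'survived': [1, 1, 0, 1, 0, 0, 1, 0],
})

# Hypothetical bin edges and labels; adjust them to the question being asked.
bins = [0, 6, 18, 40, 70, 120]
labels = ['0-5', '6-17', '18-39', '40-69', '70+']

# right=False makes each bin [low, high), so age 6 falls in '6-17'.
toy['age_group'] = pd.cut(toy['age'], bins=bins, labels=labels, right=False)

# Rows with a missing age get no group and are left out of the summary.
# observed=False keeps empty bands in the result.
summary = toy.groupby('age_group', observed=False)['survived'].agg(['count', 'mean'])
print(summary)
```

The `mean` column is the survival rate per band, and `count` shows how many rows back it up, which matters for small groups such as the very young or very old passengers.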

sns.lmplot(x='sibsp', y='survived', hue='sex', data=df_train)

sns.lmplot(x='parch', y='survived', hue='sex', data=df_train)

df_train['family_size'] = df_train['sibsp'] + df_train['parch']

df_train.head()
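To see what a combined family-size column can tell us, a simple group-by on it is probably enough. This is a minimal sketch on made-up rows (the `toy` frame is hypothetical, not the Titanic data), using the same `sibsp + parch` definition, so a value of 0 means the passenger has no relatives aboard.

```python
import pandas as pd

# Toy rows invented for illustration; not the real Titanic data.
toy = pd.DataFrame({
    'sibsp': [0, 1, 1, 0, 3, 0],
    'parch': [0, 0, 2, 0, 2, 0],
    'survived': [0, 1, 1, 1, 0, 0],
})

# Same definition as above: relatives aboard, not counting the passenger.
toy['family_size'] = toy['sibsp'] + toy['parch']

# count = passengers in each group, mean = share of them that survived.
rate = toy.groupby('family_size')['survived'].agg(['count', 'mean'])
print(rate)
```

Reading `count` next to `mean` helps avoid over-interpreting groups that contain only one or two passengers.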

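# .copy() makes df_temp an independent DataFrame, so the assignment below does not raise SettingWithCopyWarning.
df_temp = df_train[['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare']].copy()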

df_temp.head()

# Convert the sex column from categorical to numeric.
df_temp['sex'] = df_temp['sex'].map({'female':0, 'male':1})

df_temp.head()

df_temp.info()
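One thing worth checking after a dictionary `map` is whether any value failed to match a key, because `Series.map` turns those into NaN without raising an error. Below is a small sketch on made-up rows; the `toy` frame, the misspelled `'Female'`, and the `n_unmapped` name are all hypothetical and only there to show the effect.

```python
import pandas as pd

# Toy rows invented for illustration; 'Female' and None are deliberate bad values.
toy = pd.DataFrame({
    'sex': ['male', 'female', 'Female', None],
    'age': [22.0, 38.0, 26.0, 35.0],
})

# Work on a copy so the original frame is left untouched.
subset = toy[['sex', 'age']].copy()
subset['sex'] = subset['sex'].map({'female': 0, 'male': 1})

# Values missing from the dict become NaN, which also turns the column into floats.
n_unmapped = subset['sex'].isna().sum()
print(subset['sex'].tolist())
print('unmapped rows:', n_unmapped)
```

If `n_unmapped` is not zero, the rows probably need cleaning (for example, normalizing case) before the column is used as a numeric feature.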
