Useful Pandas Functions for Titanic Kaggle Competition

Pandas is widely used for data science, data analysis, and machine learning. Here are some tips for using pandas to process the data set for the Titanic Kaggle competition. The pandas module has many useful functions and methods.

First, use pandas.read_csv to read a comma-separated values (CSV) file into a DataFrame.

    import pandas as pd

    train_data = pd.read_csv('/kaggle/input/titanic/train.csv')
    test_data = pd.read_csv('/kaggle/input/titanic/test.csv')

To check what the data contains, use pandas.DataFrame.head, which returns the first 5 rows of the data frame by default.

    test_data.head()

       PassengerId  Pclass  Name                              Sex     Age   SibSp  Parch  Ticket  Fare    Cabin  Embarked
    0  892          3       Kelly, Mr. James                  male    34.5  0      0      330911  7.8292  NaN    Q
    1  893          3       Wilkes, Mrs. James (Ellen Needs)  female  47.0  1      0      363272  7.0000  NaN    S
    2  894          2       Myles, Mr. Thomas Francis         male    62.0  0      0      240276  9.6875  NaN    Q
    3  895          3       Wirz, Mr. Albert                  male    27.0  0      0      315154  8.6625  NaN    S
    4  896          3       Hirv...
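If the Kaggle input files are not at hand, the same two calls can be tried on a small in-memory sample. The sketch below is only an illustration: the `sample_csv` and `sample` names are hypothetical, the sample keeps just five of the columns, and its three rows are copied from the output shown above.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for a Titanic CSV file (assumption:
# only five columns are kept so that the example stays short).
sample_csv = io.StringIO(
    "PassengerId,Pclass,Name,Sex,Age\n"
    '892,3,"Kelly, Mr. James",male,34.5\n'
    '893,3,"Wilkes, Mrs. James (Ellen Needs)",female,47.0\n'
    '894,2,"Myles, Mr. Thomas Francis",male,62.0\n'
)

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(sample_csv)

# head(n) returns the first n rows; without an argument n defaults to 5.
first_two = sample.head(2)
print(first_two)

# shape gives (number of rows, number of columns).
print(sample.shape)
```

Passing an explicit row count to `head` is handy when five rows are more than needed, and `shape` is a quick way to confirm that the whole file was read.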