~ 2 min read

Adding Date Columns to a Python Pandas Dataframe

By: Adam Richardson
Share:

Intro

To enable analysis over time, by month, by week, or by day of week, we need to have that data in our data set. Let’s make this happen.

Adding Day of Week Column to Python Pandas dataframe

The first column we’re going to add is a day of week column. Because our date column is already of type datetime, we can do this really easily. dt is a standard python library.

df['DayOfWeek'] = df['date'].dt.day_name()
df.head()

strftime()

strftime is a method of the datetime class in the Python standard library that allows you to convert a date object to a string representation. It takes a format string as an argument, which specifies the desired layout of the resulting string.

Here is a list of the most common ones

strftime format specifiers

Format SpecifierDescriptionExample
%Y4-digit year2022
%m2-digit month01 for January
%d2-digit day01
%HHour (24-hour)00, 01, …, 23
%MMinute00, 01, …, 59
%SSecond00, 01, …, 59
%BFull month nameJanuary
%bAbbreviated month nameJan
%AFull day nameSaturday
%aAbbreviated day nameSat
%UWeek number (Sunday as first day of week)01, 02, …, 53
%WWeek number (Monday as first day of week)01, 02, …, 53
%wWeekday as a decimal number, where 0 is Sunday and 6 is Saturday0, 1, …, 6

Using strftime

I pretty much always use strftime to create my date columns. Here is how I can create dayOfWeek, monthNumber, month and weekNumber columns.

df['dayOfWeek'] = df['date'].dt.strftime('%A')
df['monthNumber'] = df['date'].dt.strftime('%m')
df['month'] = df['date'].dt.strftime('%B')
df['weekNumber'] = df['date'].dt.strftime('%W')
df.head()

Output

dateestKeycapacityoccupancyroomsSoldavgRatesalesValuerevPARdayOfWeekmonthNumbermonthweekNumber
2022-12-2702890.7521735.977805.4927.008616Tuesday12December52
2022-12-2712030.357182.315844.0128.788227Tuesday12December52
2022-12-2722070.51106227.8324149.98116.666570Tuesday12December52
2022-12-273270.3710126.461264.6046.837037Tuesday12December52
2022-12-274200.8717191.573256.69162.834500Tuesday12December52

Week & Month codes

Let’s create a unique reference as an integer for the weeks and months. This will be used for ordering our analysis by financial week/month across many years. You will see how this works soon.

We’re going to use python f strings to concatenate year and month/week, then convert it to an integer. We’ve done this before, let’s take a look again.

df['monthCode'] = df.apply(lambda x: f"{x['year']}{x['monthNumber']}", axis=1).astype(int)
df['weekCode'] = df.apply(lambda x: f"{x['year']}{x['weekNumber']}", axis=1).astype(int)

Here is the final output

dateestKeycapacityoccupancyroomsSoldavgRatesalesValuerevPARdayOfWeekmonthNumbermonthweekNumberyearmonthCodeweekCode
2022-12-2702890.7521735.977805.4927.008616Tuesday12December522022202212202252
2022-12-2712030.357182.315844.0128.788227Tuesday12December522022202212202252
2022-12-2722070.51106227.8324149.98116.666570Tuesday12December522022202212202252
2022-12-273270.3710126.461264.6046.837037Tuesday12December522022202212202252

Note: We’re adding these date columns here, to teach you how to do this. A best practice would be to create a separate table or csv file which contains all the date columns you will need, and you can join this to any data set which has a date column. Pretty much every analysis you will need these, so this is a good practice.

Now that we have the data in the correct format, and with the columns we need, let’s start doing some fun stuff. Next up we’re going to write the current file to a csv.

Share:
Subscribe to our newsletter

Stay up to date with our latest content - No spam!

Related Posts