Python: Add a column quickly

Jyoti Kumar
4 min readAug 20, 2022

Adding a column is an integral part of any Analytics/ Data Science project. A new column can be added in any of the following cases:
- Adding a column using a single column
- Adding a column using multiple columns
- Adding a column after group by

There are many ways to achieve the same in python. I will discuss a few cases using the following:

  • comparisons
  • apply and lambda function (simple, if else and nested if else)
  • map with dictionary and map with lambda function
  • np.where
  • np.select
  • row function
  • group by and transform

Loading the libraries

import pandas as pdimport numpy as np

Creating a sample dataframe

creating sample df
df
sample df

1. Adding a column using one column

# Adding a flag column using one columndf['one_col_flag'] = df['type'] == 'AA'df
flag column

Using np.where

It is similar to if else:

  • first argument is condition,
  • second is value if condition is True and
  • third is value when condition is False
df['np_where'] = np.where(df.value > 10, df.value*2, df.value**2)df
Using np.where

Using Lambda Function to add a column

Learn more about the lambda function from the documentation

# Adding a new column using lambda functiondf['value_lambda'] = df['value'].apply(lambda x: x**2)df
lambda function 1
# Adding a new column using if else in lambda functiondf['value_lambda2'] = df['value'].apply(lambda x: x*2 if x >= 20 else x**2)df
lambda if else
# Adding a new column using multiple if else in lambda functiondf['value_lambda3'] = df['value'].apply(lambda x: x*10 if x >= 20 else (x**2 if x >10 else x))df
lambda nested if else

Using map function

# Using map to map all values in a column to another value using a dictionarydf['map_dict'] = df['value'].map({10:'ten', 20:'twenty'})df
map-dict
# Using map and lambda functiondf['map_lambda'] = df['value'].map(lambda x: x*10 if x >= 20 else (x**2 if x >10 else x))df
map-lambda

Dropping the previously created columns

# Dropping all the previously created columndf.drop(['one_col_flag', 'np_where', 'value_lambda', 'value_lambda2', 'value_lambda3', 'map_dict', 'map_lambda'], axis = 1, inplace  = True)

2. Adding a column using two or more columns

# Adding a flag column using two columndf['two_col_flag'] = (df['type'] == 'AA') & (df['value'] >=20)df
two-col-flag
# Adding a flag column using two column and OR conditiondf['two_col_flag2'] = ((df['type'] == 'AA') | (df['value'] <20))*1df
two-col-flag2

Using np.where on two columns

df['np_where2'] = np.where((df.value > 10)&(df.type=='AA'), 1, df.value**2)df
np.where

Using np.select

np.select takes two list as first two arguments

  • conditions list — A list with conditions where conditions using one or multiple columns can be added
  • choice list — A list with choices for the conditions mentioned in the conditions list
  • default — Default value for all other cases
conditions = [(df['type'] == 'AA') & (df['value'] > 10), (df['type'] == 'BB') & (df['value'] >= 20)]choice = [1,2]df['np_select'] = np.select(conditions, choice, default = -1)df
np.select

Using apply and row function

It can be used in all the cases and almost any complex column addition can be achieved using row function

Defining a sample row function

def type_event(row):      if (row['type'] == 'AA') & (row['value'] > 10):
val = 1
elif (row['type'] == 'BB') & (row['value'] >= 20):
val = 2
else:
val = 0
return val

Using apply to apply row function type_event

df['row_fun'] = df.apply(type_event, axis = 1)df
row_func

Additional: Adding a column after groupby

Transform is used with function sum to add a column with sum of values for each type group

# sum can be replaced with min, maxdf['value_sum'] = df.groupby(['type'])['value'].transform(sum)df.sort_values(by='type')
groupby-transform-sum

I have tried to cover all the cases that I have encountered during my projects.

Please share the blog if you like it. Please follow as it motivates me to write more.

--

--

Jyoti Kumar

I have experience in Predictive Modelling and Dashboards. I have rich working experience on various tools and software like Python, R, Tableau, Power BI and SQL