Python: Add a column quickly

4 min readAug 20, 2022

Adding a column is an integral part of any Analytics/ Data Science project. A new column can be added in any of the following cases:
- Adding a column using a single column
- Adding a column using multiple columns
- Adding a column after group by

There are many ways to achieve the same in python. I will discuss a few cases using the following:

comparisons
apply and lambda function (simple, if else and nested if else)
map with dictionary and map with lambda function
np.where
np.select
row function
group by and transform

Loading the libraries

import pandas as pdimport numpy as np

Creating a sample dataframe

df

1. Adding a column using one column

# Adding a flag column using one columndf['one_col_flag'] = df['type'] == 'AA'df

Using np.where

It is similar to if else:

first argument is condition,
second is value if condition is True and
third is value when condition is False

df['np_where'] = np.where(df.value > 10, df.value*2, df.value**2)df

Using Lambda Function to add a column

Learn more about the lambda function from the documentation

# Adding a new column using lambda functiondf['value_lambda'] = df['value'].apply(lambda x: x**2)df

# Adding a new column using if else in lambda functiondf['value_lambda2'] = df['value'].apply(lambda x: x*2 if x >= 20 else x**2)df

# Adding a new column using multiple if else in lambda functiondf['value_lambda3'] = df['value'].apply(lambda x: x*10 if x >= 20 else (x**2 if x >10 else x))df

Using map function

# Using map to map all values in a column to another value using a dictionarydf['map_dict'] = df['value'].map({10:'ten', 20:'twenty'})df

# Using map and lambda functiondf['map_lambda'] = df['value'].map(lambda x: x*10 if x >= 20 else (x**2 if x >10 else x))df

Dropping the previously created columns

# Dropping all the previously created columndf.drop(['one_col_flag', 'np_where', 'value_lambda', 'value_lambda2', 'value_lambda3', 'map_dict', 'map_lambda'], axis = 1, inplace  = True)

2. Adding a column using two or more columns

# Adding a flag column using two columndf['two_col_flag'] = (df['type'] == 'AA') & (df['value'] >=20)df

# Adding a flag column using two column and OR conditiondf['two_col_flag2'] = ((df['type'] == 'AA') | (df['value'] <20))*1df

Using np.where on two columns

df['np_where2'] = np.where((df.value > 10)&(df.type=='AA'), 1, df.value**2)df

Using np.select

np.select takes two list as first two arguments

conditions list — A list with conditions where conditions using one or multiple columns can be added
choice list — A list with choices for the conditions mentioned in the conditions list
default — Default value for all other cases

conditions = [(df['type'] == 'AA') & (df['value'] > 10), (df['type'] == 'BB') & (df['value'] >= 20)]choice = [1,2]df['np_select'] = np.select(conditions, choice, default = -1)df

Using apply and row function

It can be used in all the cases and almost any complex column addition can be achieved using row function

Defining a sample row function

def type_event(row):      if (row['type'] == 'AA') & (row['value'] > 10):
         val = 1
      elif (row['type'] == 'BB') & (row['value'] >= 20):
         val = 2
      else:
         val = 0      return val

Using apply to apply row function type_event

df['row_fun'] = df.apply(type_event, axis = 1)df

Additional: Adding a column after groupby

Transform is used with function sum to add a column with sum of values for each type group

# sum can be replaced with min, maxdf['value_sum'] = df.groupby(['type'])['value'].transform(sum)df.sort_values(by='type')

I have tried to cover all the cases that I have encountered during my projects.

Please share the blog if you like it. Please follow as it motivates me to write more.

Python: Add a column quickly

1. Adding a column using one column

2. Adding a column using two or more columns

Using apply and row function

Additional: Adding a column after groupby

Written by Jyoti Kumar