ADAPT-uiuc/dias

`Series.replace()` to `apply() + re.sub()`


Original

df['Name'].replace(to_replace='John', value='Mike', regex=True, inplace=True)

Rewritten

df['Name'] = df['Name'].apply(lambda x: re.sub(pattern='John', repl='Mike', string=x))
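
One caveat on equivalence: `re.sub()` raises a `TypeError` when given a non-string such as a missing value, while `Series.replace()` leaves missing values untouched. The sketch below shows one possible guard, assuming the column can contain missing entries. The helper name `sub_if_str` and the sample data are hypothetical and not part of the original rewrite.

```python
import re

import pandas as pd

# Hypothetical sample column with one missing entry.
s = pd.Series(["John Smith", None, "Johnny Doe"])


def sub_if_str(x):
    # Only call re.sub on real strings; pass anything else (NaN/None) through unchanged.
    if isinstance(x, str):
        return re.sub(pattern="John", repl="Mike", string=x)
    return x


guarded = s.apply(sub_if_str)
builtin = s.replace(to_replace="John", value="Mike", regex=True)

print(guarded.tolist())
print(builtin.tolist())
```

Without the `isinstance` check, the plain lambda form would fail on the missing entry, so the rewrite is presumably only safe when every value in the column is a string.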

This is another surprising case where the method pandas provides is slower than a hand-written equivalent, here `apply()` with `re.sub()`.
Full example:

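import pandas as pd
import re  # only needed by the rewritten version

df = pd.read_csv('https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv')
# Repeat the rows 2000 times so the running time is large enough to compare.
df = pd.concat([df]*2000, ignore_index=True)
# Original form; swap in the rewritten line to time the alternative.
df['Name'].replace(to_replace='John', value='Mike', regex=True, inplace=True)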
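
To compare the two forms without downloading the CSV, a rough timing sketch might look like the following. It uses a small synthetic `Name` column with no missing values as a stand-in for the Titanic data. The function names `with_replace` and `with_apply`, the sample names, and the repeat counts are all assumptions made for illustration. Timings depend on the pandas version and the machine, so no expected numbers are given.

```python
import re
import timeit

import pandas as pd

# Synthetic stand-in for the 'Name' column (assumption: strings only, no NaN).
names = [
    "Mr. John Smith",
    "Mrs. John Bradley",
    "Miss. Laina Heikkinen",
    "Mr. William Henry Allen",
]
df = pd.DataFrame({"Name": names * 5000})


def with_replace(s):
    # Original form, without inplace=True so that it returns a new Series.
    return s.replace(to_replace="John", value="Mike", regex=True)


def with_apply(s):
    # Rewritten form: one re.sub call per element.
    return s.apply(lambda x: re.sub(pattern="John", repl="Mike", string=x))


replaced = with_replace(df["Name"])
applied = with_apply(df["Name"])

# Compare plain Python lists so that dtype differences do not affect the check.
same = replaced.tolist() == applied.tolist()

t_replace = timeit.timeit(lambda: with_replace(df["Name"]), number=3)
t_apply = timeit.timeit(lambda: with_apply(df["Name"]), number=3)

print(f"same result: {same}")
print(f"Series.replace: {t_replace:.3f}s for 3 runs")
print(f"apply + re.sub: {t_apply:.3f}s for 3 runs")
```

Checking that both forms produce the same values before timing them keeps the comparison honest: a faster rewrite is only useful if it gives the same result.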