Python and Data Science Code Snippets
Source code of Python and data science snippets posted daily at Data Science Simplified.
Python Built-in Methods
Title |
Explanation |
Code |
Get Multiples of a Number Using Modulus |
link |
link |
fractions: Get Numerical Results in Fractions instead of Decimals |
link |
link |
How to Use Underscores to Format Large Numbers in Python |
link |
link |
Confirm whether a variable is a number |
link |
link |
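As a rough illustration of the fractions and underscore entries in the table above, the following sketch uses only the standard library. The variable names are hypothetical and do not come from the linked snippets.

```python
from fractions import Fraction

# Keep results as exact fractions instead of floating-point decimals.
third = Fraction(1, 3)
sixth = Fraction(1, 6)
total = third + sixth
print(total)  # 1/2

# Underscores in numeric literals are ignored by Python; they only aid reading.
large = 1_000_000
print(large == 1000000)  # True
```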
Title |
Explanation |
Code |
Boolean Operators: Connect Two Boolean Expressions into One Expression |
link |
link |
Title |
Explanation |
Code |
__str__ and __repr__ : Create a String Representation of a Python Object |
link |
link |
String find: Find the Index of a Substring in a Python String |
link |
link |
eval: Turn a Python String into a Variable or Function |
link |
link |
re.sub: Replace One String with Another String Using Regular Expression |
link |
link |
Title |
Explanation |
Code |
any: Check if Any Element of an Iterable is True |
link |
link |
Extended Iterable Unpacking: Ignore Multiple Values when Unpacking a Python Iterable |
link |
link |
How to Unpack Iterables in Python |
link |
link |
random.choice: Get a Randomly Selected Element from a Python List |
link |
link |
filter: Get the Elements of an Iterable for which a Function Returns True |
link |
link |
heapq: Find n Max Values of a Python List |
link |
link |
join method: Turn an Iterable into a Python String |
link |
link |
Zip: Associate Elements from Two Iterators based on the Order |
link |
link |
collections.Counter: Count the Occurrences of Items in a List |
link |
link |
Zip Function: Create Pairs of Elements from Two Lists in Python |
link |
link |
Stop using = operator to create a copy of a Python list. Use copy method instead |
link |
link |
itertools.combinations: A better way to iterate through a pair of values in a Python list |
link |
link |
Enumerate |
link |
link |
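The list-copy entry in the table above can be sketched as follows. This is a minimal example with hypothetical variable names, showing why `=` only creates a second name for the same list while `copy` creates a new (shallow) list.

```python
original = [1, 2, 3]

alias = original            # no copy: both names point to the same list object
shallow = original.copy()   # new list holding the same elements

alias.append(4)             # also changes `original`
shallow.append(5)           # leaves `original` untouched

print(original)             # [1, 2, 3, 4]
print(shallow)              # [1, 2, 3, 5]
print(alias is original)    # True
```

Note that `copy` is shallow: nested mutable elements would still be shared between the two lists.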
Title |
Explanation |
Code |
namedtuple: A Lightweight Python Structure to Manage your Data |
link |
link |
slice: Make your Indices more Readable by Naming your Slice |
link |
link |
Title |
Explanation |
Code |
Defaultdict: Return a default value when a key is not available |
link |
link |
Ordered dictionary in Python |
link |
link |
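A small sketch of the defaultdict entry above, assuming the goal is to group items without checking whether a key already exists. The word list and variable names are illustrative only.

```python
from collections import defaultdict

# Missing keys are created on first access with the factory's default (an empty list).
words_by_length = defaultdict(list)
for word in ["apple", "kiwi", "pear", "mango"]:
    words_by_length[len(word)].append(word)

print(words_by_length[5])   # ['apple', 'mango']
print(words_by_length[4])   # ['kiwi', 'pear']

# Accessing an unseen key returns the default and also inserts the key.
missing = words_by_length[10]
print(missing)              # []
```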
Title |
Explanation |
Code |
datetime + timedelta: Calculate End DateTime based on Start DateTime and Duration |
link |
link |
Use Dates in a Month as the Feature |
link |
link |
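The datetime + timedelta entry above can be illustrated with a short sketch. The start time and duration are made-up values chosen to show the date rolling over to the next day.

```python
from datetime import datetime, timedelta

start = datetime(2021, 1, 31, 22, 30)        # hypothetical start: Jan 31, 22:30
duration = timedelta(hours=3, minutes=45)    # hypothetical duration

# Adding a timedelta handles day and month boundaries automatically.
end = start + duration
print(end)  # 2021-02-01 02:15:00
```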
Title |
Explanation |
Code |
*iterator: Pass Values of an Iterator to a Function |
link |
link |
Use Python Built-in Functions to Speed Up your Code |
link |
link |
**kwargs: Pass multiple arguments to a function in Python |
link |
link |
Return Multiple Values from a Function Using Python Dictionary |
link |
link |
Decorator in Python |
link |
link |
Title |
Explanation |
Code |
Abstract Classes: Declare Methods without Implementation |
link |
link |
classmethod: What is it and When to Use it |
link |
link |
getattr: a Better Way to Get the Attribute of a Class |
link |
link |
__call__ : You can Call your Class Instance like a Function. Here is how |
link |
link |
Static method: use the function without adding the attributes required for a new instance |
link |
link |
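As a minimal sketch of the `__call__` entry in the table above, the class below (a hypothetical `Multiplier`, not taken from the linked snippet) shows how defining `__call__` lets an instance be used like a function.

```python
class Multiplier:
    """Callable object that multiplies its input by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Invoked when the instance itself is called: instance(value)
        return value * self.factor


double = Multiplier(2)
result = double(21)
print(result)            # 42
print(callable(double))  # True
```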
Title |
Explanation |
Code |
Shutil: Move Files in Python |
link |
link |
pathlib.Path |
link |
link |
pathlib: Create, Write, and Rename Files in One Line of Code |
link |
link |
Pathlib: Iterate Over All Files that End with ‘.csv’ in a Directory |
link |
link |
How to Improve the Readability of your JSON file using Indent |
link |
link |
__main__.py : Run a Directory like a Main Script |
link |
link |
Title |
Explanation |
Code |
Assert in Python: Output a Customized Message When the Assertion Fails |
link |
link |
warnings: Ignore Warnings when Running Python Code |
link |
link |
Title |
Explanation |
Code |
How to Execute Shell Commands in a Python Script |
link |
link |
argparse: Python Library to Parse Arguments from Command Line |
link |
link |
Title |
Explanation |
Code |
Stop Writing Code Comments. Use Meaningful Names Instead |
link |
|
Underscore(_): Ignore values that will not be used |
link |
link |
Underscore “_”: Ignore the index in Python for loops |
link |
link |
Save Immediate Output when an Error Occurs |
link |
|
Print error without stopping the for loop in Python |
link |
link |
Python Pass Statement |
link |
link |
Type hint in Python 3.9 |
link |
|
Title |
Explanation |
Code |
Concurrently execute tasks on separate CPUs |
link |
link |
Compare the execution time between 2 functions |
link |
link |
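One way to compare the execution time of two functions, as the last entry above suggests, is the standard-library `timeit` module. This is only a sketch: the two functions are hypothetical stand-ins, and the measured times will vary by machine, so no timings are shown.

```python
import timeit


def squares_with_loop(n):
    """Build a list of squares with an explicit loop."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_with_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


# Run each function 200 times and report the total elapsed seconds.
loop_seconds = timeit.timeit(lambda: squares_with_loop(1_000), number=200)
comprehension_seconds = timeit.timeit(
    lambda: squares_with_comprehension(1_000), number=200
)
print(f"loop: {loop_seconds:.4f}s, comprehension: {comprehension_seconds:.4f}s")
```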
Pandas
Title |
Explanation |
Code |
pd.DataFrame.agg: Aggregate over Columns or Rows Using Multiple Operations |
link |
link |
DataFrame.pipe: Increase the Readability of your Code when Applying Multiple Functions to a DataFrame |
link |
link |
pd.Series.map: Change Values of a Pandas Series Using a Dictionary |
link |
link |
pd.Series.str: Manipulate Text Data in a pandas Series |
link |
link |
set_categories in pandas: Sort Categorical Column by a Specific Ordering |
link |
link |
parse_dates: Convert Columns into Datetime When Using Pandas to Read CSV Files |
link |
link |
Filter Rows only if Column Contains Values from another List |
link |
link |
Specify suffixes when using df.merge() |
link |
link |
Specify the datatype to speed up your code and reduce memory |
link |
|
Highlight your pandas DataFrame |
link |
link |
Assign Values to Multiple New Columns |
link |
link |
Reduce pd.DataFrame’s Memory |
link |
link |
Title |
Explanation |
Code |
df.columns.str.startswith: Find DataFrame’s Columns that Start with a Pattern |
link |
link |
pandas.DataFrame.iterrows: Iterate over Rows of a DataFrame |
link |
link |
pandas.Series.dt: Access Datetime Properties of pandas Series |
link |
link |
pd.Series.between: Select Rows in a pandas Series Containing Values between 2 Numbers |
link |
link |
DataFrame rolling: Find the average of the previous n datapoints using Pandas |
link |
link |
select_dtypes: Return a subset of a DataFrame including/excluding columns based on their dtype |
link |
link |
pct_change: Find the percentage change between the current and a prior element in a pandas Series |
link |
link |
DataFrame.diff and DataFrame.shift: Take the Difference between Rows within a Column in Pandas |
link |
link |
Pandas DataFrame: How to select all columns that start with a word |
link |
link |
Exclude Outliers |
link |
link |
Pandas DataFrame Get Data in a Year Range |
link |
link |
Title |
Explanation |
Code |
assert_frame_equal: Test whether Two DataFrames are Similar |
link |
link |
Numpy
Title |
Explanation |
Code |
np.ravel: Flatten a Numpy Array |
link |
link |
Use List to Change the Positions of Rows or Columns in a Numpy Array |
link |
link |
Key Parameter in Max(): Find the Key with the Largest Value |
link |
link |
Difference between Numpy’s All and Any Methods |
link |
link |
Double np.argsort: Get Rank of Values in an Array |
link |
link |
Get the index of the max value in a Numpy array |
link |
link |
Data Science Tools
Title |
Explanation |
Code |
snoop : Smart Print to Debug your Python Function |
link |
link |
pytest benchmark: A Pytest Fixture to Benchmark your Code |
link |
link |
pytest.mark.parametrize: Test your Functions with Multiple Inputs |
link |
link |
Pytest: Shows only Failed Tests |
link |
|
Pytest Fixtures: Use the same data for different tests |
link |
link |
Pytest repeat |
link |
link |
Pandera: a Python Library to Validate Your Pandas DataFrame |
link |
link |
Title |
Explanation |
Code |
faker: Create Fake Data in One Line of Code |
link |
link |
DVC: A Data Version Control Tool for your Data Science Projects |
link |
link |
fetch_openml: Get OpenML’s Dataset in One Line of Code |
link |
link |
github-to-sqlite: Download the Data of your Starred GitHub Repositories in One Command Line |
link |
|
Autoscraper |
link |
link |
Extract series data from various Internet sources directly into a pandas DataFrame |
link |
link |
Compare the similar features between 2 different datasets |
link |
link |
Feature extraction
Title |
Explanation |
Code |
datefinder: Automatically Find Dates and Time in a Python String |
link |
link |
dill’s getname: Get Names of a Python Object |
link |
link |
pytrends: Get the Trend of a Keyword on Google Search Over Time |
link |
link |
add_datepart: Add Relevant DateTime Features in One Line of Code |
link |
link |
Geopy: Extract Location Based on Python String |
link |
link |
Maya: Convert the string to datetime automatically |
link |
link |
Select the features by their relevance |
link |
|
Extract holiday from date column |
link |
link |
Title |
Explanation |
Code |
D-Tale: A Python Library to Visualize and Analyze your Data Without Code |
link |
|
Graphviz: Create a Flowchart to Capture your Ideas in Python |
link |
link |
Create an interactive map in Python |
link |
link |
Title |
Explanation |
Code |
Datapane: Publish your Python Objects on the Web in 2 Lines of Code |
link |
link |
gdown: Download a File from Google Drive in Python |
link |
link |
Natural Language Processing
Title |
Explanation |
Code |
TextBlob: Processing Text in One Line of Code |
link |
link |
sumy: Summarize Text in One Line of Code |
link |
|
Spacy_streamlit: Create a Web App to Visualize your Text in 3 Lines of Code |
link |
link |
Extract a contiguous sequence of 2 words |
link |
link |
Detect the “almost similar” articles |
link |
link |
Convert number to words |
link |
link |
Tools for Best Python Practices
Title |
Explanation |
Code |
Don’t Hard-Code. Use Hydra Instead |
link |
link |
python-dotenv: How to Load the Secret Information from .env File |
link |
link |
kedro pipeline: Create Pipeline for your Data Science Projects in Python |
link |
link |
docopt: Create Beautiful Command-line Interfaces for Documentation in Python |
link |
link |
Title |
Explanation |
Code |
fastai’s df_shrink: Shrink DataFrame’s Memory Usage in One Line of Code |
link |
link |
Swifter: Add One Word to Make your Pandas Apply 23 Times Faster |
link |
link |
Title |
Explanation |
Code |
rich-dataframe: Create Animated and Colorful Pandas Dataframe |
link |
|
tqdm: Add Progress Bar to your Pandas Apply |
link |
link |
Title |
Explanation |
Code |
causalimpact: Find Causal Relation of an Event and a Variable in Python |
link |
link |
Pipeline + GridSearchCV: Prevent Data Leakage when Scaling the Data |
link |
link |
Decompose high dimensional data into two or three dimensions |
link |
link |
Cross Validation with Time Series |
link |
|
Terminal
Title |
Explanation |
Code |
tr Command: Translate Characters to Improve Readability In Unix/Linux |
link |
link |
Sed Command: Replace a string with another string on the command line |
link |
link |
Title |
Explanation |
Code |
fd: a Simple Tool to Search for Files or Directories Fast |
link |
|
ln -s: Create Symbolic Link Between 2 Files |
link |
link |
tee: Save Command Output to a File |
link |
link |
Make Important Files Impossible to be Deleted |
link |
link |
View tree structure of your file |
link |
|
Title |
Explanation |
Code |
timeit on the Command Line: Measure Execution Time of Small Code Snippets |
link |
link |
Time Command: Track the Time it Takes to Execute a File in Linux |
link |
|
htop |
link |
|
Title |
Explanation |
Code |
Python Shell as a Calculator: Grab the Last Output Using “_” |
link |
|
Find version of a Python library using pip list and grep |
link |
|
Conda rollback to the last revision |
link |
link |
How to Check Whether a Library is Installed |
link |
link |
Title |
Explanation |
Code |
colorls: Beautify your ls Command with Color and Icons |
link |
|
Colorama: Produce a colored terminal text in Python |
link |
|
Title |
Explanation |
Code |
terminalizer: Record and Share your Terminal Sessions |
link |
|
Title |
Explanation |
Code |
Bash For Loop: Stop Staring at your Screen. Write a Bash For Loop instead |
link |
link |
Environment Variables: Save Private Information in your Local Machine |
link |
link |
Pet: A Command-line Snippet Tool That Allows you to Store your Favorite Commands |
link |
|
Loop through a list of data on your terminal |
link |
|
Multi-run command |
link |
|
Run multiple commands in one line of code |
link |
|
Cool Tools
Title |
Explanation |
Code |
How to Strip Outputs and Execute Interactive Code in a Python Script |
link |
link |
rich.inspect: Produce a Beautiful Report on any Python Object |
link |
link |
Rich’s Console: Debug your Python Function in One Line of Code |
link |
link |
loguru: Print Readable Traceback in Python |
link |
link |
Icecream: Adding a Datetime Stamp to Python print |
link |
link |
Icecream: Never use print() to debug again |
link |
link |
Pyfiglet: Make large and unique letters out of ordinary text in Python |
link |
link |
Title |
Explanation |
Code |
Stacer: Visualize the History of your CPU and Memory Usage |
link |
|
Title |
Explanation |
Code |
sherlock: Search for a Username Across 298 Popular Websites |
link |
|
getme forecast: Get the Weather Forecast Through your Terminal |
link |
link |
Title |
Explanation |
Code |
notion-py: Access and Edit your Notion App Using Python |
link |
link |
organize: Automate Organizing Files with Command Line |
link |
|
Schedule: Schedule your Python Functions to Run At a Specific Time |
link |
link |
notify-send: Send a Desktop Notification after Finishing Executing a File |
link |
link |
isort: Automatically Sort your Python Imports in 1 Line of Code |
link |
link |
knockknock: Receive an email when your code finishes executing |
link |
link |
Title |
Explanation |
Code |
GitHub CLI: Bring GitHub to your Terminal |
link |
link |
Pull one file from another branch using git |
link |
|
Download a file on GitHub using wget |
link |
link |
github1s: Read GitHub Code with VS Code on your Browser in One Second |
link |
|
PyGithub: Manage your GitHub resources using Python |
link |
link |
Astral: Organize your GitHub stars with ease |
link |
|
Title |
Explanation |
Code |
Box: Using Dot Notation to Access Keys in a Python Dictionary |
link |
link |
virtualenv-clone: Create a Copy of a Virtual Environment |
link |
link |
Jupyter Notebook
Title |
Explanation |
Code |
nbdime: Better Version Control for Jupyter Notebook |
link |
|
display in IPython: Display math equations in Jupyter Notebook |
link |
link |
Reuse the notebook to run the same code across different data |
link |
|
ngrok: Create a Public Server for your Jupyter Notebook in 1 Line of Code |
link |
link |