This is the code repository for Data Labeling in Machine Learning with Python, published by Packt.
Explore modern ways to prepare labeled data for training and fine-tuning ML and generative AI models
Data labeling is the invisible hand that guides the power of artificial intelligence and machine learning. In today’s data-driven world, mastering data labeling is not just an advantage, it’s a necessity. Data Labeling in Machine Learning with Python empowers you to unearth value from raw data, create intelligent systems, and influence the course of technological evolution.
This book covers the following exciting features:
- Excel in exploratory data analysis (EDA) for tabular, text, audio, video, and image data
- Understand how to use Python libraries to apply rules to label raw data
- Discover data augmentation techniques for adding classification labels
- Leverage K-means clustering to classify unlabeled data
- Explore how hybrid supervised learning is applied to add labels for classification
- Master text data classification with generative AI
- Detect objects and classify images with OpenCV and YOLO
- Uncover a range of techniques and resources for data annotation
If you feel this book is for you, get your copy today!
All of the code is organized into folders.
The code will look like the following:
    storage:
      backend: MINIO
      minio:
        bucket: pachyderm
Following is what you need for this book: This book starts with an introduction to exploratory data analysis using Python libraries, then covers data labeling for tabular, text, image, and audio data using heuristics, semi-supervised learning, unsupervised learning, and data augmentation. Finally, it delves into industry best practices and tools for data labeling.
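As a rough illustration of the heuristics-based labeling mentioned above, the following minimal sketch applies a few keyword rules to raw text and combines their votes. It is not taken from the book: the rule functions, the label names, and the `label_text` helper are all hypothetical, and only the Python standard library is assumed.

```python
from collections import Counter

# Hypothetical marker meaning "this rule has no opinion about the text".
ABSTAIN = None

# Each rule inspects one raw text and returns a label or ABSTAIN.
# The keywords and label names are illustrative assumptions.
def rule_contains_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN

def rule_contains_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN

def rule_question_mark(text):
    return "question" if text.strip().endswith("?") else ABSTAIN

RULES = [rule_contains_refund, rule_contains_thanks, rule_question_mark]

def label_text(text, rules=RULES):
    """Return the majority label among the rules that fired, or ABSTAIN."""
    votes = [label for rule in rules if (label := rule(text)) is not ABSTAIN]
    if not votes:
        return ABSTAIN
    # Ties go to the label that was voted for first, i.e. by rule order.
    return Counter(votes).most_common(1)[0][0]

samples = [
    "I want a refund",
    "Thanks for the quick help",
    "Where is my order?",
    "Package arrived",
]
labels = [label_text(s) for s in samples]
```

Texts that no rule matches stay unlabeled (`None`), which leaves them available for manual annotation or for another technique such as clustering.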
With the following software and hardware list, you can run all code files present in the book (Chapters 1-7).
Chapter | Software required | OS required |
---|---|---|
1-7 | AWS CLI (aws) | Any OS |
1-7 | Red Hat OpenShift Client (oc) | Any OS |
Vijaya Kumar Suda is a seasoned data and AI professional boasting over two decades of expertise collaborating with global clients. Having resided and worked in diverse locations such as Switzerland, Belgium, Mexico, Bahrain, India, Canada, and the USA, Vijaya has successfully assisted customers spanning various industries. Currently serving as a senior data and AI consultant at Microsoft, he is instrumental in guiding industry partners through their digital transformation endeavors using cutting-edge cloud technologies and AI capabilities. His proficiency encompasses architecture, data engineering, machine learning, generative AI, and cloud solutions.