Zero-Shot Text Classification using HuggingFace Model - GeeksforGeeks (2024)

Zero-shot text classification categorizes text into labels that are supplied at inference time, without training the model on those specific labels. This makes it particularly useful when labeled data is scarce or unavailable. With the HuggingFace Transformers library, zero-shot classification takes only a few lines of code on top of a pre-trained model. In this article, we’ll explore how to use the HuggingFace pipeline for zero-shot classification and create an interactive web interface using Gradio.

Understanding Zero-Shot Classification

Zero-shot classification relies on pre-trained language models that can be applied to new tasks without task-specific training. For classification, the model receives the text together with a list of candidate labels, scores how well each label fits the text, and returns a probability for each label.

HuggingFace Transformers

The HuggingFace Transformers library provides an easy-to-use interface for various natural language processing tasks, including zero-shot classification. One of the most popular models for this task is facebook/bart-large-mnli, which is based on the BART model and fine-tuned on the Multi-Genre Natural Language Inference (MNLI) dataset.
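
Because the model is trained for natural language inference, the zero-shot pipeline treats the input text as a premise and turns each candidate label into a hypothesis sentence (by default, "This example is {}."). The entailment score of each hypothesis is then normalized across the labels. The sketch below illustrates only that bookkeeping in plain Python: the helper names are hypothetical, and the logits are made-up numbers standing in for real model output.

```python
import math

# Default hypothesis template of the zero-shot pipeline; the label fills the {} slot.
HYPOTHESIS_TEMPLATE = "This example is {}."


def build_hypotheses(labels, template=HYPOTHESIS_TEMPLATE):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def softmax(values):
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["finance", "sports", "politics", "technology"]
hypotheses = build_hypotheses(labels)

# ASSUMPTION: made-up entailment logits, one per hypothesis.
# A real model would compute these from (text, hypothesis) pairs.
entailment_logits = [2.1, -0.3, -0.9, 1.0]

# In single-label mode the entailment logits are normalized across labels.
scores = softmax(entailment_logits)
ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

Since the scores are normalized across the candidate labels, they always sum to 1, so a high score means "the best fit among the labels offered" rather than "certainly correct".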

Implementing Zero-Shot Classification

Step 1: Install HuggingFace Transformers

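First, ensure that the HuggingFace Transformers library is installed, together with a deep learning backend such as PyTorch, which the pipeline needs in order to run the model:

pip install transformers torch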

Step 2: Initialize the Zero-Shot Classification Pipeline

Next, we initialize the zero-shot classification pipeline using the facebook/bart-large-mnli model:

from transformers import pipeline

# Initialize the zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

Step 3: Perform Classification

We can now classify a sample text into predefined labels. Here’s an example:

text = "The company's quarterly earnings increased by 20%, exceeding market expectations."
candidate_labels = ["finance", "sports", "politics", "technology"]

result = classifier(text, candidate_labels)
print(result)

Code of Zero-Shot Classification

Python
from transformers import pipeline

# Initialize the zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

text = "The company's quarterly earnings increased by 20%, exceeding market expectations."
candidate_labels = ["finance", "sports", "politics", "technology"]

result = classifier(text, candidate_labels)
print(result)

Output:

{'sequence': "The company's quarterly earnings increased by 20%, exceeding market expectations.", 'labels': ['finance', 'technology', 'sports', 'politics'], 'scores': [0.6282334327697754, 0.22457945346832275, 0.08779555559158325, 0.05939162150025368]}
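
The result is a plain dictionary whose labels are listed from highest to lowest score, so the prediction can be read from the first entry. The following is a minimal sketch of how that might be done. The `top_prediction` helper and its `min_score` cutoff are hypothetical names introduced here for illustration, not part of the Transformers library.

```python
# Result dictionary copied from the output above.
result = {
    "sequence": "The company's quarterly earnings increased by 20%, exceeding market expectations.",
    "labels": ["finance", "technology", "sports", "politics"],
    "scores": [0.6282334327697754, 0.22457945346832275,
               0.08779555559158325, 0.05939162150025368],
}


def top_prediction(result, min_score=0.5):
    """Return (label, score) for the best label, or (None, score) if it is below min_score.

    ASSUMPTION: min_score is an illustrative cutoff, not a library default.
    """
    # Labels are ordered by descending score, so index 0 is the best match.
    label, score = result["labels"][0], result["scores"][0]
    if score < min_score:
        return None, score
    return label, score


label, score = top_prediction(result)
print(label, round(score, 3))
```

A cutoff like this is one way to route low-confidence texts to a fallback such as manual review. A suitable value depends on the data and would need to be tuned on labeled examples.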

Evaluating Zero-Shot Classification

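To evaluate performance, compare the predicted labels with the true labels using metrics such as precision, recall, and F1-score. Here’s an example that reuses the classifier from above on a small dataset of three sentences. With so few examples, the numbers are illustrative rather than a reliable estimate: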

Python
from sklearn.metrics import classification_report

texts = ["The stock market is up today.", "The new movie is a great thriller.", "The football match was exciting."]
true_labels = ["finance", "entertainment", "sports"]

predicted_labels = []
for text in texts:
    result = classifier(text, candidate_labels=["finance", "entertainment", "sports"])
    predicted_labels.append(result['labels'][0])

print(classification_report(true_labels, predicted_labels))

Output:

               precision    recall  f1-score   support

entertainment       0.50      1.00      0.67         1
      finance       1.00      1.00      1.00         1
       sports       0.00      0.00      0.00         1

     accuracy                           0.67         3
    macro avg       0.50      0.67      0.56         3
 weighted avg       0.50      0.67      0.56         3
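
To see where these numbers come from, the per-class metrics can be recomputed by hand. The sketch below assumes the predictions were finance, entertainment, entertainment, which is the only assignment consistent with the report (the sports sentence was labeled as entertainment). The `per_class_metrics` helper is a hypothetical pure-Python stand-in written for illustration, not the scikit-learn implementation.

```python
from collections import Counter

true_labels = ["finance", "entertainment", "sports"]
# ASSUMPTION: predictions inferred from the report above, not printed in the article.
predicted_labels = ["finance", "entertainment", "entertainment"]


def per_class_metrics(y_true, y_pred):
    """Compute precision, recall and F1 per class (0.0 when a value is undefined)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class received a wrong example
            fn[truth] += 1   # true class missed one of its examples
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


metrics = per_class_metrics(true_labels, predicted_labels)
macro_f1 = sum(m["f1"] for m in metrics.values()) / len(metrics)
accuracy = sum(t == p for t, p in zip(true_labels, predicted_labels)) / len(true_labels)

for label, m in metrics.items():
    print(label, {name: round(value, 2) for name, value in m.items()})
print("accuracy:", round(accuracy, 2), "macro F1:", round(macro_f1, 2))
```

The sports row is all zeros because no sentence was predicted as sports, and the extra entertainment prediction is what lowers that class's precision to 0.50.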

Creating an Interactive Interface with Gradio

Gradio provides an easy way to create web interfaces for machine learning models. We can use Gradio to build an interactive interface for zero-shot classification.

Step 1: Install Gradio

First, install Gradio:

pip install gradio

Step 2: Define the Classification Function

Create a function that takes text and labels as inputs and returns the classification results:

import gradio as gr
from transformers import pipeline

# Initialize the zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Define the classification function
def classify_text(text, labels):
    labels = labels.split(",")
    result = classifier(text, candidate_labels=labels)
    return {label: score for label, score in zip(result["labels"], result["scores"])}
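
Users typing labels into a text box will often add spaces after commas or leave a trailing comma, and a plain split keeps those spaces and empty entries. A small cleanup step may help. The `parse_labels` helper below is a hypothetical addition for illustration, not part of Gradio or Transformers.

```python
def parse_labels(raw):
    """Split a comma-separated string into clean, unique labels, keeping input order.

    Hypothetical helper for illustration only.
    """
    labels = []
    for piece in raw.split(","):
        label = piece.strip()                 # drop surrounding whitespace
        if label and label not in labels:     # skip empty entries and duplicates
            labels.append(label)
    return labels


print(parse_labels("finance, sports , ,politics,finance"))
```

Inside `classify_text`, `labels = parse_labels(labels)` could take the place of `labels.split(",")`. It may also be worth returning early when the resulting list is empty, before the classifier is called.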

Step 3: Create the Gradio Interface

Set up the Gradio interface with text inputs for the sentence and labels, and a label output:

# Create the Gradio interface
interface = gr.Interface(
    fn=classify_text,
    inputs=[
        gr.Textbox(lines=2, placeholder="Enter text here..."),
        gr.Textbox(lines=1, placeholder="Enter comma-separated labels here...")
    ],
    outputs=gr.Label(num_top_classes=3),
    title="Zero-Shot Text Classification",
    description="Classify text into labels without training data.",
)

# Launch the interface
interface.launch()

Complete Code for Creating an Interactive Interface with Gradio

Python
import gradio as gr
from transformers import pipeline

# Initialize the zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Define the classification function
def classify_text(text, labels):
    labels = labels.split(",")
    result = classifier(text, candidate_labels=labels)
    return {label: score for label, score in zip(result["labels"], result["scores"])}

# Create the Gradio interface
interface = gr.Interface(
    fn=classify_text,
    inputs=[
        gr.Textbox(lines=2, placeholder="Enter text here..."),
        gr.Textbox(lines=1, placeholder="Enter comma-separated labels here...")
    ],
    outputs=gr.Label(num_top_classes=3),
    title="Zero-Shot Text Classification",
    description="Classify text into labels without training data.",
)

# Launch the interface
interface.launch()

Output:

Gradio Interface for Interactive Zero Shot Classification

Conclusion

Zero-shot text classification using the HuggingFace Transformers library offers a flexible way to categorize text without the need for labeled training data. Models like facebook/bart-large-mnli can give useful results across a range of classification tasks, although, as the small evaluation above shows, predictions are not always correct and are worth checking against labeled examples where possible. Additionally, integrating this functionality with Gradio allows for easy deployment of interactive web interfaces, making it accessible to a wider audience. This approach opens up numerous possibilities for real-world applications where labeled data is not readily available.



