Sentiment Analysis with Python

Do you know the best thing about being an engineer? You can simply build things. It is like a superpower. One rainy evening, I had a random idea: visualize the sentiment of an input text with a smiley whose expression changes with how positive the text is. The more positive the text, the happier the smiley looks. There are some interesting concepts to learn here, so let me walk you through how this project works!

Prerequisites

You will need the following packages:

customtkinter
opencv-python
torch
transformers

With uv, you can add the dependencies using the following command:

uv add customtkinter opencv-python torch transformers

Note: when using uv with torch, you need to specify the package index. For example, if you want to use CUDA (here the CUDA 11.8 builds), you will need the following in your pyproject.toml:

[[tool.uv.index]]
name = "pytorch-cu118"
url = "https://download.pytorch.org/whl/cu118"
explicit = true

[tool.uv.sources]
torch = [{ index = "pytorch-cu118" }]
torchvision = [{ index = "pytorch-cu118" }]

UI Layout Structure

For this kind of project, I always like to start with a quick layout of the UI components. In this case the layout is very simple: a single-line text box at the top spans the full width, and below it a canvas fills the remaining space. This is where we will draw the smiley 🙂

With the customtkinter library, we can write the layout as follows:

import customtkinter

class App(customtkinter.CTk):
    def __init__(self) -> None:
        super().__init__()

        self.title("Sentiment Analysis")
        self.geometry("800x600")

        self.grid_columnconfigure(0, weight=1)
        self.grid_rowconfigure(0, weight=0)
        self.grid_rowconfigure(1, weight=1)

        self.sentiment_text_var = customtkinter.StringVar(master=self, value="Love")

        self.textbox = customtkinter.CTkEntry(
            master=self,
            corner_radius=10,
            font=("Consolas", 50),
            justify="center",
            placeholder_text="Enter text here...",
            placeholder_text_color="gray",
            textvariable=self.sentiment_text_var,
        )
        self.textbox.grid(row=0, column=0, padx=20, pady=20, sticky="nsew")
        self.textbox.focus()

        self.image_display = CTkImageDisplay(self)
        self.image_display.grid(row=1, column=0, padx=20, pady=20, sticky="nsew")

Unfortunately, there is no suitable ready-made solution for drawing OpenCV frames onto UI widgets, so I built my own CTkImageDisplay. In short, it uses a CTkLabel component and decouples the image update from the GUI thread by using a queue for synchronization.
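
The post does not show CTkImageDisplay, so the following is only a sketch of the queue hand-off idea described above, not the author's implementation. FrameMailbox and its methods are hypothetical names, and the widget part is left out so the snippet runs without a display.

```python
import queue
import threading
from typing import Any, Optional


class FrameMailbox:
    """Hypothetical helper: hands the newest frame from a worker thread to the GUI thread."""

    def __init__(self) -> None:
        # maxsize=1: only the most recent frame is worth drawing
        self._queue = queue.Queue(maxsize=1)
        # serializes producers so the "drop then put" step is not interleaved
        self._lock = threading.Lock()

    def put_latest(self, frame: Any) -> None:
        """Called from any thread: replace a pending frame, if any, with this one."""
        with self._lock:
            try:
                self._queue.get_nowait()  # drop the stale frame
            except queue.Empty:
                pass
            self._queue.put_nowait(frame)

    def poll(self) -> Optional[Any]:
        """Called from the GUI thread: returns a frame or None, never blocks."""
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            return None
```

In a widget, poll would typically be called from a callback scheduled with Tk's after method, so that only the GUI thread ever touches the label.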

Procedural Smiley

For the smiley, we could use separate images for different sentiment ranges, for example three images for negative, neutral, and positive. However, a finer-grained representation of the sentiment would need more images, which quickly becomes impractical, and we could not blend smoothly between them.

A better approach is to generate the smiley image procedurally at runtime. To keep things simple, we will only change the background color of the smiley and the curvature of its mouth.

First, we need to create the canvas image that we can draw the smiley on.

import cv2
import numpy as np


def create_sentiment_image(positivity: float, image_size: tuple[int, int]) -> np.ndarray:
    """
    Generates a sentiment image based on the positivity score.
    This draws a smiley with its expression based on the positivity score.

    Args:
        positivity: A float representing the positivity score in the range [-1, 1].
        image_size: A tuple representing the size of the image (width, height).

    Returns:
        A BGRA image of shape (height, width, 4) with dtype uint8.
    """
    width, height = image_size
    frame = np.zeros((height, width, 4), dtype=np.uint8)

    # TODO: draw smiley

    return frame

The image must be transparent outside the smiley, so we need 4 color channels, the last of which is the alpha channel. Since OpenCV images are represented as NumPy arrays of unsigned 8-bit integers, we create the image with dtype np.uint8. Remember that arrays are stored with the y axis first, so the height from image_size is passed first when creating the array.

We can define a few variables for the dimensions and colors of our smiley that will come in handy while drawing.

color_outline = (80,) * 3 + (255,)  # gray
thickness_outline = min(image_size) // 30
center = (width // 2, height // 2)
radius = min(image_size) // 2 - thickness_outline

The background of the smiley should be red for negative sentiment and green for positive sentiment. To get a uniform brightness across the transition, we can use the HSV color space and simply vary the hue between 0 (red) and one third (green).

color_bgr = color_hsv_to_bgr(
    hue=(positivity + 1) / 6, # positivity [-1,1] -> hue [0,1/3]
    saturation=0.5,
    value=1,
)
color_bgra = color_bgr + (255,)
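
The helper color_hsv_to_bgr is not shown in the post. One possible implementation, assuming all three inputs lie in [0, 1], uses the standard library's colorsys module and reorders the channels for OpenCV. The name positivity_to_hue is hypothetical and only makes the mapping from the comment above explicit.

```python
import colorsys


def color_hsv_to_bgr(hue: float, saturation: float, value: float) -> tuple[int, int, int]:
    """Convert HSV components in [0, 1] to an 8-bit (blue, green, red) tuple."""
    red, green, blue = colorsys.hsv_to_rgb(hue, saturation, value)
    # OpenCV orders channels as blue, green, red
    return (round(blue * 255), round(green * 255), round(red * 255))


def positivity_to_hue(positivity: float) -> float:
    """Map positivity in [-1, 1] linearly to a hue in [0, 1/3] (red to green)."""
    return (positivity + 1) / 6


# fully saturated endpoints, to make the two extremes easy to recognize
most_negative = color_hsv_to_bgr(positivity_to_hue(-1.0), 1.0, 1.0)  # pure red in BGR
most_positive = color_hsv_to_bgr(positivity_to_hue(1.0), 1.0, 1.0)  # pure green in BGR
```

A neutral score of 0 lands at a hue of 1/6, which is yellow, so the smiley passes through yellow on its way from red to green.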

We need to make sure the color is fully opaque by appending an alpha value of 100% (255) as the fourth channel. Now we can draw the smiley's circle with an outline.

cv2.circle(frame, center, radius, color_bgra, -1) # Fill
cv2.circle(frame, center, radius, color_outline, thickness_outline) # Border

So far so good; now we can add the eyes. We compute an offset from the center to the left and right to place the eyes symmetrically.

# calculate the position of the eyes
eye_radius = radius // 5
eye_offset_x = radius // 3
eye_offset_y = radius // 4
eye_left = (center[0] - eye_offset_x, center[1] - eye_offset_y)
eye_right = (center[0] + eye_offset_x, center[1] - eye_offset_y)

cv2.circle(frame, eye_left, eye_radius, color_outline, -1)
cv2.circle(frame, eye_right, eye_radius, color_outline, -1)

Now for the trickiest part, the mouth. Its shape is a suitably scaled parabola. We can simply multiply the standard parabola y=x² by the positivity value.

In the end, the line is drawn with cv2.polylines, which requires xy coordinate pairs. Using np.linspace, we generate 100 points along the x axis and then use np.polyval to compute the corresponding y values of the polynomial.

# mouth parameters
mouth_width = radius // 2
mouth_height = radius // 3
mouth_offset_y = radius // 3
# move the vertex down for a smile (up for a frown) by half the mouth height
mouth_center_y = center[1] + mouth_offset_y + positivity * mouth_height // 2
mouth_left = (center[0] - mouth_width, center[1] + mouth_offset_y)
mouth_right = (center[0] + mouth_width, center[1] + mouth_offset_y)

# calculate points of polynomial for the mouth
ply_points_t = np.linspace(-1, 1, 100)
ply_points_y = np.polyval([positivity, 0, 0], ply_points_t)  # y=positivity*x²

# map each sample to pixel coordinates; dividing by (n - 1) makes the last
# point land exactly at mouth_right's x coordinate
num_points = len(ply_points_y)
ply_points = np.array(
    [
        (
            mouth_left[0] + i * (mouth_right[0] - mouth_left[0]) / (num_points - 1),
            mouth_center_y - ply_points_y[i] * mouth_height,
        )
        for i in range(num_points)
    ],
    dtype=np.int32,
)

# draw the mouth
cv2.polylines(
    frame,
    [ply_points],
    isClosed=False,
    color=color_outline,
    thickness=int(thickness_outline * 1.5),
)
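
To check the mouth geometry by hand, the same mapping can be written without NumPy. This is an illustrative sketch rather than the code used in the project: mouth_points is a hypothetical name, and it spaces the samples with a divisor of count - 1 so that the first and last points sit exactly on the mouth corners.

```python
def mouth_points(
    center: tuple[int, int], radius: int, positivity: float, count: int = 100
) -> list[tuple[int, int]]:
    """Pixel coordinates of a parabolic mouth, y = positivity * t**2 for t in [-1, 1]."""
    mouth_width = radius // 2
    mouth_height = radius // 3
    mouth_offset_y = radius // 3
    # vertex of the parabola; moved down for a smile, up for a frown
    mouth_center_y = center[1] + mouth_offset_y + positivity * mouth_height / 2

    points = []
    for i in range(count):
        t = -1 + 2 * i / (count - 1)  # evenly spaced in [-1, 1], like np.linspace
        x = center[0] + t * mouth_width
        # image y grows downward, so subtracting lifts the corners of a smile
        y = mouth_center_y - positivity * t**2 * mouth_height
        points.append((int(x), int(y)))
    return points


smile = mouth_points((256, 256), 240, 1.0)
frown = mouth_points((256, 256), 240, -1.0)
print(smile[0], smile[-1])  # (136, 296) (376, 296)
print(frown[0], frown[-1])  # (136, 376) (376, 376)
```

For a 512x512 image with radius 240, the corners of a full smile sit 80 pixels above those of a full frown, while a positivity of 0 yields a flat line halfway between.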

And just like that, we have a procedural smiley!

To test this function, I wrote a quick pytest test case that saves smileys for different sentiment scores (the visual_output_path fixture it uses is not shown here):

from pathlib import Path

import cv2
import numpy as np
import pytest

from sentiment_analysis.utils import create_sentiment_image

IMAGE_SIZE = (512, 512)


@pytest.mark.parametrize(
    "positivity",
    np.linspace(-1, 1, 5),
)
def test_sentiments(visual_output_path: Path, positivity: float) -> None:
    """
    Test the smiley face generation.
    """
    image = create_sentiment_image(positivity, IMAGE_SIZE)

    assert image.shape == (IMAGE_SIZE[1], IMAGE_SIZE[0], 4)

    # assert center pixel is opaque
    assert image[IMAGE_SIZE[1] // 2, IMAGE_SIZE[0] // 2, 3] == 255

    # save the image for visual inspection
    positivity_num_0_100 = int((positivity + 1) * 50)
    image_fn = f"smiley_{positivity_num_0_100}.png"
    cv2.imwrite(str(visual_output_path / image_fn), image)

Sentiment Analysis

To decide how happy or sad our smiley should be, we first need to analyze the input text and compute its sentiment. This task is called sentiment analysis. We will use a pre-trained Transformer model to predict a classification score for the classes negative, neutral, and positive. We can then combine the confidence scores of these classes into a final sentiment score between -1 and +1.

Using the pipeline function from the Transformers library, we can define a pipeline based on a pre-trained model from Hugging Face. The top_k parameter sets how many classification results are returned. Since we want all three classes, we set it to 3.

from transformers import pipeline

model_name = "cardiffnlp/twitter-roberta-base-sentiment"

sentiment_pipeline = pipeline(
    task="sentiment-analysis",
    model=model_name,
    top_k=3,
)

To run the sentiment analysis, we call the pipeline with a text argument. This returns a list of results with a single element, so we need to extract the first element.

results = self.sentiment_pipeline(text)

# [
#     [
#         {"label": "LABEL_2", "score": 0.5925878286361694},
#         {"label": "LABEL_1", "score": 0.3553399443626404},
#         {"label": "LABEL_0", "score": 0.05207228660583496},
#     ]
# ]

for label_score_dict in results[0]:
    label: str = label_score_dict["label"]
    score: float = label_score_dict["score"]

We can define a label mapping that tells us how each class confidence contributes to the final sentiment. Then we sum up the positivity over all class confidences.

label_mapping = {"LABEL_0": -1, "LABEL_1": 0, "LABEL_2": 1}

positivity = 0.0
for label_score_dict in results[0]:
    label: str = label_score_dict["label"]
    score: float = label_score_dict["score"]

    if label in label_mapping:
        positivity += label_mapping[label] * score
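
As a worked example, the weighted sum can be applied to the sample scores shown earlier. The function name combine_scores is hypothetical. Because the three scores in the sample add up to 1 and the weights lie in [-1, 1], the result also stays within [-1, 1].

```python
def combine_scores(results: list[dict], label_mapping: dict[str, float]) -> float:
    """Weighted sum of class confidences; labels missing from the mapping are ignored."""
    return sum(
        label_mapping[item["label"]] * item["score"]
        for item in results
        if item["label"] in label_mapping
    )


# scores copied from the sample pipeline output above
example = [
    {"label": "LABEL_2", "score": 0.5925878286361694},
    {"label": "LABEL_1", "score": 0.3553399443626404},
    {"label": "LABEL_0", "score": 0.05207228660583496},
]
positivity = combine_scores(example, {"LABEL_0": -1, "LABEL_1": 0, "LABEL_2": 1})
print(round(positivity, 4))  # 0.5405
```

The neutral class contributes nothing directly, but a high neutral score leaves less probability for the other two classes and so pulls the result toward 0.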

To test our pipeline, we can wrap it in a class and run a few tests with pytest. We check that sentences with positive sentiment score above zero and, conversely, that sentences with negative sentiment score below zero.

import pytest

from sentiment_analysis.sentiment_pipeline import SentimentAnalysisPipeline


@pytest.fixture
def sentiment_pipeline() -> SentimentAnalysisPipeline:
    """
    Fixture to create a SentimentAnalysisPipeline instance.
    """
    return SentimentAnalysisPipeline(
        model_name="cardiffnlp/twitter-roberta-base-sentiment",
        label_mapping={"LABEL_0": -1.0, "LABEL_1": 0.0, "LABEL_2": 1.0},
    )


@pytest.mark.parametrize(
    "text_input",
    [
        "I love this!",
        "This is awesome!",
        "I am so happy!",
        "This is the best day ever!",
        "I am thrilled with the results!",
    ],
)
def test_sentiment_analysis_pipeline_positive(
    sentiment_pipeline: SentimentAnalysisPipeline, text_input: str
) -> None:
    """
    Test the sentiment analysis pipeline with a positive input.
    """
    assert (
        sentiment_pipeline.run(text_input) > 0.0
    ), "Expected positive sentiment score."


@pytest.mark.parametrize(
    "text_input",
    [
        "I hate this!",
        "This is terrible!",
        "I am so sad!",
        "This is the worst day ever!",
        "I am disappointed with the results!",
    ],
)
def test_sentiment_analysis_pipeline_negative(
    sentiment_pipeline: SentimentAnalysisPipeline, text_input: str
) -> None:
    """
    Test the sentiment analysis pipeline with a negative input.
    """
    assert (
        sentiment_pipeline.run(text_input) < 0.0
    ), "Expected negative sentiment score."
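
The tests above import a SentimentAnalysisPipeline class that the post does not show. A minimal sketch of what such a wrapper might look like follows. The constructor arguments and the run method mirror how the tests use the class, while the optional classifier argument is an assumption added here so the class can be exercised with a stand-in instead of downloading the model.

```python
from typing import Callable, Optional


class SentimentAnalysisPipeline:
    """Sketch of a wrapper that turns classifier scores into one value in [-1, 1]."""

    def __init__(
        self,
        model_name: str,
        label_mapping: dict[str, float],
        classifier: Optional[Callable] = None,  # assumption: injectable for tests
    ) -> None:
        self.label_mapping = label_mapping
        if classifier is None:
            # imported lazily so the class can be used with a stand-in classifier
            from transformers import pipeline

            classifier = pipeline(task="sentiment-analysis", model=model_name, top_k=3)
        self.sentiment_pipeline = classifier

    def run(self, text: str) -> float:
        """Return the weighted sum of the class scores for the given text."""
        # expects the nested list shape shown in the sample output of the post
        results = self.sentiment_pipeline(text)
        return sum(
            self.label_mapping.get(item["label"], 0.0) * item["score"]
            for item in results[0]
        )


def fake_classifier(text: str) -> list[list[dict]]:
    """Stand-in for the real model: always reports a mostly positive result."""
    return [[{"label": "LABEL_2", "score": 0.9}, {"label": "LABEL_0", "score": 0.1}]]


demo = SentimentAnalysisPipeline(
    model_name="unused-with-a-stand-in",
    label_mapping={"LABEL_0": -1.0, "LABEL_1": 0.0, "LABEL_2": 1.0},
    classifier=fake_classifier,
)
score = demo.run("I love this!")  # 0.9 * 1.0 + 0.1 * -1.0, roughly 0.8
```

Loading the model once in the constructor means the expensive setup happens a single time, and each call to run only pays for inference.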

Integration

The last remaining piece is simply to connect the text box to our sentiment analysis pipeline and update the displayed image with the matching smiley. We can add a trace to the text variable, which runs the sentiment analysis pipeline in a separate thread managed by a thread pool so that the UI does not freeze while the pipeline is running.

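from multiprocessing.pool import ThreadPool

import customtkinter

from sentiment_analysis.sentiment_pipeline import SentimentAnalysisPipeline
from sentiment_analysis.utils import create_sentiment_image


class App(customtkinter.CTk):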
    def __init__(self, sentiment_analysis_pipeline: SentimentAnalysisPipeline) -> None:
        super().__init__()
        self.sentiment_analysis_pipeline = sentiment_analysis_pipeline

        ...

        self.sentiment_image = None

        self.sentiment_text_var = customtkinter.StringVar(master=self, value="Love")
        self.sentiment_text_var.trace_add("write", lambda *_: self.on_sentiment_text_changed())

        ...

        self.update_sentiment_pool = ThreadPool(processes=1)

        self.on_sentiment_text_changed()

    def on_sentiment_text_changed(self) -> None:
        """
        Callback function to handle text changes in the textbox.
        """
        new_text = self.sentiment_text_var.get()

        self.update_sentiment_pool.apply_async(
            self._update_sentiment,
            (new_text,),
        )

    def _update_sentiment(self, new_text: str) -> None:
        """
        Update the sentiment image based on the new text input.
        This function is run in a separate thread to avoid blocking the main thread.

        Args:
            new_text: The new text input from the user.
        """
        positivity = self.sentiment_analysis_pipeline.run(new_text)

        self.sentiment_image = create_sentiment_image(
            positivity,
            self.image_display.display_size,
        )

        self.image_display.update_frame(self.sentiment_image)


def main() -> None:
    # Initialize the sentiment analysis pipeline
    sentiment_analysis = SentimentAnalysisPipeline(
        model_name="cardiffnlp/twitter-roberta-base-sentiment",
        label_mapping={"LABEL_0": -1, "LABEL_1": 0, "LABEL_2": 1},
    )

    app = App(sentiment_analysis)
    app.mainloop()
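
One consequence of using ThreadPool(processes=1) is worth spelling out: each text change queues one task, and the single worker handles them one at a time in submission order, so the last image produced corresponds to the last text typed. The sketch below illustrates this with a stand-in for the model call; analyse and processed are hypothetical names.

```python
import time
from multiprocessing.pool import ThreadPool

processed: list[str] = []


def analyse(text: str) -> None:
    """Stand-in for the slow model call; records the order in which texts are handled."""
    time.sleep(0.01)
    processed.append(text)


pool = ThreadPool(processes=1)  # one worker: tasks run one at a time, in order
for text in ("L", "Lo", "Lov", "Love"):
    pool.apply_async(analyse, (text,))
pool.close()
pool.join()  # wait until every queued task has finished

print(processed)  # ['L', 'Lo', 'Lov', 'Love']
```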

And finally, the smiley is shown in the app and changes dynamically with the sentiment of the input text!
