Perspective API with Python – mitigate toxicity and ensure healthy dialogue online

Introduction

To start your adventure with the Perspective API, you need an ordinary Google account. Then follow these simple instructions:
https://developers.perspectiveapi.com/s/docs-get-started?language=en_US
The guide is also available in Polish: https://developers.perspectiveapi.com/s/docs-get-started?language=pl

Once you have received the confirmation email, you can activate your API, connect it to the Google project and get the API key, which is essential to take advantage of this wonderful tool.
Don’t forget to install the google-api-python-client library (it is imported as googleapiclient):

pip install google-api-python-client

Now we are good to go.
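One optional precaution before running the examples: they place the key directly in the source, which is easy to leak by accident. A minimal sketch of reading it from an environment variable instead; the variable name PERSPECTIVE_API_KEY and the helper load_api_key are assumptions made for this example, not something the Perspective API defines.

```python
import os


def load_api_key(env_var="PERSPECTIVE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

With the variable set in your shell, any of the scripts below could use API_KEY = load_api_key() in place of the hard-coded value.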

First attempt to measure TOXICITY (in my city)

Let’s start with a simple script that measures the toxicity of the sentence "you're stick in the mud". The code is a slight modification of the sample from the Perspective API site, which is available in English and Polish.

from googleapiclient import discovery
import json

API_KEY = "Your_API_Key"  # replace with your own API key

# start perspectiveAPI
client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False,
)

# set sentence and attribute to be measured
sentence = "you're stick in the mud"
attribute = "TOXICITY"

# request analyze perspectiveapi
analyze_request = {
  'comment': { 'text': sentence},
  'requestedAttributes': {attribute: {}}
}

# get response from perspectiveAPI
response = client.comments().analyze(body=analyze_request).execute()

# print full JSON output 
print(json.dumps(response, indent=2))

The output is a JSON response containing a variety of information, including the overall toxicity score (0.3827457) and the detected language (en).

{
  "attributeScores": {
    "TOXICITY": {
      "spanScores": [
        {
          "begin": 0,
          "end": 23,
          "score": {
            "value": 0.3827457,
            "type": "PROBABILITY"
          }
        }
      ],
      "summaryScore": {
        "value": 0.3827457,
        "type": "PROBABILITY"
      }
    }
  },
  "languages": [
    "en"
  ],
  "detectedLanguages": [
    "en"
  ]
}
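The response is an ordinary nested dictionary, so individual values can be read by key. The sketch below works on a copy of the response shown above, so it runs without an API call. It assumes that begin and end are character offsets into the comment text, which is consistent with this sample (the sentence is 23 characters long).

```python
# Copy of the response printed above, written as a Python dict.
response = {
    "attributeScores": {
        "TOXICITY": {
            "spanScores": [
                {"begin": 0, "end": 23, "score": {"value": 0.3827457, "type": "PROBABILITY"}}
            ],
            "summaryScore": {"value": 0.3827457, "type": "PROBABILITY"},
        }
    },
    "languages": ["en"],
    "detectedLanguages": ["en"],
}

sentence = "you're stick in the mud"
toxicity = response["attributeScores"]["TOXICITY"]

# summaryScore is the score for the comment as a whole
summary = toxicity["summaryScore"]["value"]

# pair each span score with the slice of text it refers to
# (assumes begin/end are character offsets, see the note above)
spans = [
    (sentence[span["begin"]:span["end"]], span["score"]["value"])
    for span in toxicity["spanScores"]
]

print(summary)
print(spans)
```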

Now we will try the "yo mama" sentence and make the output more readable by extracting information from the JSON response.

from googleapiclient import discovery
import json

API_KEY = "Your_API_Key"  # replace with your own API key

# start perspectiveAPI
client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False,
)

# set sentence and attribute to be measured
sentence = "yo mama"
attribute = "TOXICITY"

# request analyze perspectiveAPI
analyze_request = {
  'comment': {'text': sentence},
  'requestedAttributes': {attribute: {}}
}

# get response from perspectiveAPI
response = client.comments().analyze(body=analyze_request).execute()

# get the summary score of the single attribute and the detected language, then print them
summary_score = response["attributeScores"][attribute]["summaryScore"]["value"]
language = response["detectedLanguages"]

print(f"For the sentence: {sentence}\n {attribute} score is: {summary_score}\n language is: {language}")

We get a toxicity score of 0.1571, and the detected language is Spanish (es). Pretty cool.

For the sentence: yo mama
 TOXICITY score is: 0.15711457
 language is: ['es']
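The es result above comes from automatic language detection. The request body also accepts a languages list, which lets you state the language yourself instead. A minimal sketch of building such a request; the helper name build_request is hypothetical, and whether the response still contains detectedLanguages when the language is set explicitly is not shown here, so reading that field with response.get("detectedLanguages") is the safer choice.

```python
def build_request(text, attributes, languages=None):
    """Build a request body for comments().analyze() (helper name is hypothetical)."""
    request = {
        "comment": {"text": text},
        "requestedAttributes": {attr: {} for attr in attributes},
    }
    if languages:
        # Explicit language codes, e.g. ["en"]; leave out to let the API detect.
        request["languages"] = list(languages)
    return request


analyze_request = build_request("yo mama", ["TOXICITY"], languages=["en"])
print(analyze_request)
```

The resulting dictionary can be passed as body=analyze_request exactly like the hand-written one in the script above.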

Let’s make one last change to get a percentage response and change the sentence to "I’d call you dumb as a rock, but at least a rock can hold a door open" (thank you Reddit for the inspiration).

from googleapiclient import discovery
import json

API_KEY = "Your_API_Key"  # replace with your own API key

# start perspectiveAPI
client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False,
)

# set sentence, attribute to be measured
sentence = "I'd call you dumb as a rock, but at least a rock can hold a door open"
attribute = "TOXICITY"

# request analyze perspectiveAPI
analyze_request = {
  'comment': {'text': sentence},
  'requestedAttributes': {attribute: {}}
}

# get response from perspectiveAPI
response = client.comments().analyze(body=analyze_request).execute()

# get the summary score of the single attribute as a percentage, plus the detected language, then print them
summary_score = response["attributeScores"][attribute]["summaryScore"]["value"]
percent_summary_score = round(100 * summary_score, 1)
language = response["detectedLanguages"]

print(f"For the sentence: {sentence}\n {attribute} score is: {percent_summary_score}%\n language is: {language}")

Output:

For the sentence: I'd call you dumb as a rock, but at least a rock can hold a door open
 TOXICITY score is: 68.6%
 language is: ['en']
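The three scripts so far differ only in the sentence and in how the result is printed, so the request and response handling can be folded into one function. The sketch below is one possible way to do that; summary_scores is a hypothetical helper, and FakeClient is a stand-in that always returns 0.5 so the example runs offline. In real use you would pass the client returned by discovery.build instead.

```python
def summary_scores(client, text, attributes):
    """Return {attribute: summary score in percent} for one comment."""
    body = {
        "comment": {"text": text},
        "requestedAttributes": {attr: {} for attr in attributes},
    }
    response = client.comments().analyze(body=body).execute()
    return {
        attr: round(100 * response["attributeScores"][attr]["summaryScore"]["value"], 1)
        for attr in attributes
    }


class FakeClient:
    """Offline stand-in mimicking the comments().analyze().execute() call chain.

    It returns a fixed score of 0.5 for every requested attribute.
    """

    def comments(self):
        return self

    def analyze(self, body):
        self._body = body
        return self

    def execute(self):
        scores = {
            attr: {"summaryScore": {"value": 0.5, "type": "PROBABILITY"}}
            for attr in self._body["requestedAttributes"]
        }
        return {"attributeScores": scores, "detectedLanguages": ["en"]}


print(summary_scores(FakeClient(), "hello", ["TOXICITY"]))
```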

More attributes at once

Not only toxicity can be measured. There is more!

Attribute name	Description	Available Languages
TOXICITY	A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.	Arabic (ar), Chinese (zh), Czech (cs), Dutch (nl), English (en), French (fr), German (de), Hindi (hi), Hinglish (hi-Latn), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Polish (pl), Portuguese (pt), Russian (ru), Spanish (es), Swedish (sv)
SEVERE_TOXICITY	A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.	ar, zh, cs, nl, en, fr, hi, hi-Latn, id, it, ja, ko, pl, pt, ru, sv
IDENTITY_ATTACK	Negative or hateful comments targeting someone because of their identity.	ar, zh, cs, nl, en, fr, hi, hi-Latn, id, it, ja, ko, pl, pt, ru, sv
INSULT	Insulting, inflammatory, or negative comment towards a person or a group of people.	ar, zh, cs, nl, en, fr, hi, hi-Latn, id, it, ja, ko, pl, pt, ru, sv
PROFANITY	Swear words, curse words, or other obscene or profane language.	ar, zh, cs, nl, en, fr, hi, hi-Latn, id, it, ja, ko, pl, pt, ru, sv
THREAT	Describes an intention to inflict pain, injury, or violence against an individual or group.	ar, zh, cs, nl, en, fr, hi, hi-Latn, id, it, ja, ko, pl, pt, ru, sv

Perspective API attributes

Unleash them all.

from googleapiclient import discovery
import json

API_KEY = "Your_API_Key"  # replace with your own API key

# start perspectiveAPI
client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False,
)

# set sentence, attribute or attributes to be measured
sentence = "I'd call you dumb as a rock, but at least a rock can hold a door open"
attributes = ["TOXICITY","SEVERE_TOXICITY", "IDENTITY_ATTACK", "INSULT", "PROFANITY", "THREAT"]

# request perspectiveAPI to analyze all attributes
analyze_request = {
  'comment': { 'text': sentence},
  'requestedAttributes': {attr: {} for attr in attributes}
}

# get response from perspectiveAPI
response = client.comments().analyze(body=analyze_request).execute()

language = response["detectedLanguages"]
print(f"For the sentence: {sentence}\nin language: {language}")

# get the percentage value of each attribute
for attribute in attributes:
  attribute_score = response["attributeScores"][attribute]["summaryScore"]["value"]
  percent_attribute_score = round(100 * attribute_score, 1)
  print(f"{attribute}: {percent_attribute_score}%")

And finally, we know that our sentence is pretty toxic and insulting, but not a threat or an attack on identity.

For the sentence: I'd call you dumb as a rock, but at least a rock can hold a door open
in language: ['en']
TOXICITY: 68.6%
SEVERE_TOXICITY: 3.1%
IDENTITY_ATTACK: 3.1%
INSULT: 72.2%
PROFANITY: 38.8%
THREAT: 1.3%
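Reading six numbers by eye works for one sentence, but for many comments it helps to pick out only the attributes that cross some cut-off. A small sketch using the percentages printed above; the helper name flagged and the 50% threshold are arbitrary choices for illustration, not values recommended by the Perspective API.

```python
# Percent scores copied from the output above.
scores = {
    "TOXICITY": 68.6,
    "SEVERE_TOXICITY": 3.1,
    "IDENTITY_ATTACK": 3.1,
    "INSULT": 72.2,
    "PROFANITY": 38.8,
    "THREAT": 1.3,
}


def flagged(scores, threshold=50.0):
    """Return (attribute, score) pairs at or above threshold, highest first."""
    hits = [(name, value) for name, value in scores.items() if value >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(flagged(scores))
# prints [('INSULT', 72.2), ('TOXICITY', 68.6)]
```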

The icing on the cake

Are you still with me? Good, because there are also experimental attributes that are only available in English. Let’s check out FLIRTATION.

from googleapiclient import discovery
import json

API_KEY = "Your_API_Key"  # replace with your own API key

# start perspectiveAPI
client = discovery.build(
  "commentanalyzer",
  "v1alpha1",
  developerKey=API_KEY,
  discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
  static_discovery=False,
)

# set sentence, attribute or attributes to be measured
sentence = "You're so charming when you smile that I find it hard to focus"
attribute = "FLIRTATION"

# request analyze perspectiveAPI
analyze_request = {
  'comment': { 'text': sentence},
  'requestedAttributes': {attribute: {}}
}

# get response from perspectiveAPI
response = client.comments().analyze(body=analyze_request).execute()

print(f"For the sentence: {sentence}")

# get the percentage value of the attribute
attribute_score = response["attributeScores"][attribute]["summaryScore"]["value"]
percent_attribute_score = round(100 * attribute_score, 1)
print(f"{attribute}: {percent_attribute_score}%")

Output:

For the sentence: You're so charming when you smile that I find it hard to focus
FLIRTATION: 90.4%

Summary

As you can see, using the Perspective API is quite simple and can certainly help to moderate online discussions or alert you to toxic comments on various platforms. Have fun with low toxicity.
