ChatGPT Bias Exploration Tool

As part of this week’s AI Digital Detox, I built a little tool to help explore some of the possible bias in ChatGPT 3.5 Turbo responses. You put in a sentence stem and it’ll populate a table on the right with 3 possible responses. The interaction is modeled after what Dawn Lu and Nina Rimsky did in Investigating Bias Representations in Llama 2 Chat via Activation Steering.

I find it pretty endlessly interesting to put different statements in there and see what it’ll spit back out. Having it do three responses makes it easier to see a bit more deeply into the model without having to put the same stem in three times. You can get some darkly serious things and you can get some stuff that’s a bit lighter.

I end up wondering about examples like the one below. It feels like OpenAI might be working really hard to obscure bias in the data. I think that’s good in some ways. In other ways I worry that obvious bias gets paved over and that enables more insidious bias to persist across a larger and less obvious frame of interactions. I’m also still not sure about whether AI should have a different level of restrictions than the Internet. I’m not sure what would happen if people expected search engines to be filtered in a similar way. I understand the desire. I’m just not sure where it leads. Luckily, I’m not in charge of that and so I don’t have to think too hard.

Technical stuff

I tweaked our previous chatbot to use GPT-3.5 Turbo rather than GPT-4. I did that for a couple of reasons. I think it has fewer guardrails, so you get a clearer idea of the data rather than blocks. This interaction has less need for sophistication than the persona example. It’s also the kind of thing I tend to keep throwing stuff at. So a combination of simpler requests and likely more requests means I should use a cheaper API. 3.5 is way cheaper than GPT-4 ($0.0015 vs. $0.06 per 1k tokens of output). I put a ton of prompts in there and maybe billed 7 cents.
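To put rough numbers on that, here is a small back-of-the-envelope sketch. It uses the two per-1k output prices quoted above (assumed to be US dollars) and the request settings from the next section (3 completions, 25 tokens each). It ignores input tokens, so treat it as a ceiling on output cost only, not a real bill. The function and constant names are made up for illustration.

```javascript
// Prices per 1k output tokens, copied from the figures quoted in the post (assumed USD).
const PRICE_PER_1K_OUTPUT = {
    'gpt-4': 0.06,
    'gpt-3.5-turbo': 0.0015,
};

// Worst-case output cost of one request: n completions, each capped at maxTokens.
// Input (prompt) tokens are deliberately left out of this estimate.
function estimateOutputCost(model, n, maxTokens) {
    const tokens = n * maxTokens;
    return (tokens / 1000) * PRICE_PER_1K_OUTPUT[model];
}

const cheap = estimateOutputCost('gpt-3.5-turbo', 3, 25);
const pricey = estimateOutputCost('gpt-4', 3, 25);

console.log('3.5 Turbo per request:', cheap.toFixed(6));
console.log('GPT-4 per request:', pricey.toFixed(6));
console.log('GPT-4 / 3.5 ratio:', Math.round(pricey / cheap));
```

At those prices the gap is a constant multiple, so it matters most for a tool like this one where the same kind of request gets fired over and over.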

API request

You can see the request I’m using below. Note I’ve also turned the temperature up to 1.4 to encourage more creative responses. The n setting gives me back that number of responses and I’m limiting the tokens to 25 to save money and space. We just want sentence completions, not paragraphs.

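// Assumes `openai` is an initialized client and `conversationArr`
// already holds the conversation so far, including the latest stem.
async function fetchReply(prompt) {
    const response = await openai.createChatCompletion({
        model: 'gpt-3.5-turbo',
        temperature: 1.4, // above the default of 1 for more varied completions
        messages: conversationArr,
        n: 3, // ask for three completions per request
        max_tokens: 25, // cap each completion; we only want a sentence ending
    })
    // Only the first completion is kept in the conversation history
    conversationArr.push(response.data.choices[0].message)
    // All three completions are written to the table
    biasDataWriter(prompt, response.data.choices)
    renderTypewriterText('Add another phrase stem?')
}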
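With n set to 3, response.data.choices comes back as an array of three entries, each with its own message. The original biasDataWriter isn't shown here, so the following is only a sketch of one way the text could be pulled out of those entries before it goes into a table. The helper name extractCompletions and the sample data are hypothetical.

```javascript
// Turn the `choices` array from a chat completion into plain strings.
// Entries with a missing or empty message are dropped.
function extractCompletions(choices) {
    return choices
        .map((choice) => (choice.message && choice.message.content) || '')
        .map((text) => text.trim())
        .filter((text) => text.length > 0);
}

// Made-up stand-in for response.data.choices, trimmed to the fields used here.
const sampleChoices = [
    { index: 0, message: { role: 'assistant', content: ' first completion ' } },
    { index: 1, message: { role: 'assistant', content: 'second completion' } },
    { index: 2, message: { role: 'assistant', content: 'third completion\n' } },
];

console.log(extractCompletions(sampleChoices));
```

Because max_tokens is 25, a completion can stop mid-sentence. Each choice also carries a finish_reason field, which reads 'length' when the cap cut it off.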

Writing the table data

I often chunk out different pieces of the project to CodePen for testing/exploration. That’s also convenient if I write a blog post about it or if I want to search CodePen for how I solved a particular problem in the distant past.¹

In this one, I’m just writing an array of data to a table as rows. I wanted the last column to contain a button and a dropdown rating option. That led me to use a template literal and the insertAdjacentHTML method. I find that simpler than trying to build the table node by node. This example also has a working delete button.
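The pen has the real version. As a rough sketch of the same idea, the code below builds one row as a template literal and appends it with insertAdjacentHTML. The function names, class names, and rating values are placeholders, not the ones from the pen. The escaping step is an addition of mine, since model output dropped straight into markup could break the table.

```javascript
// Escape text before putting it into markup; model output is untrusted.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Build one <tr>: the stem, one cell per response, then a cell holding
// a rating dropdown and a delete button.
function buildRowHtml(stem, responses) {
    const cells = responses.map((r) => `<td>${escapeHtml(r)}</td>`).join('');
    return `<tr>
        <td>${escapeHtml(stem)}</td>
        ${cells}
        <td>
            <select class="rating">
                <option value="">Rate</option>
                <option value="1">1</option>
                <option value="2">2</option>
                <option value="3">3</option>
            </select>
            <button type="button" class="delete-row">Delete</button>
        </td>
    </tr>`;
}

// Browser-only: append a row to an existing <tbody>.
function appendRow(tbody, stem, responses) {
    tbody.insertAdjacentHTML('beforeend', buildRowHtml(stem, responses));
}

// Browser-only: one delegated listener handles every delete button,
// including buttons in rows added later.
function enableDelete(tbody) {
    tbody.addEventListener('click', (event) => {
        if (event.target.matches('.delete-row')) {
            event.target.closest('tr').remove();
        }
    });
}
```

The delegated click listener is what keeps the string approach simple: rows inserted as HTML don't need their own listeners wired up one at a time.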

See the Pen “javascript table games” by Tom (@twwoodward) on CodePen.

Copying and cleaning the table

Now I wanted to make it relatively easy to get the sample data over to a form for submission as part of the activity. My first instinct was to pass all the stuff into a URL as variables. I decided not to do that because of concerns about the amount of possible data and because it would be too time consuming. I then decided to make a better button for copying the data for pasting into a rich text field. That took me a while to get right.

It was another thing I worked on in CodePen. You can see the copying portion of the JavaScript and a little cleaner to remove the delete button and write the rating data to the cell so that the copy/paste would be nice and neat.
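Again, the pen is the source of truth. The sketch below is a variation on the idea, not a copy of it: it reads the rows into plain data, rebuilds a clean table with the selected rating written as text and no buttons, and puts that HTML on the clipboard so it pastes into a rich text field as a table. All function names are hypothetical, and it assumes the rating dropdown and delete button live in the last cell of each row.

```javascript
// Escape text before putting it back into markup.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Browser-only: read each body row into plain data, taking the
// dropdown's current value as the rating.
function readRows(table) {
    return Array.from(table.querySelectorAll('tbody tr')).map((tr) => {
        const cells = Array.from(tr.cells);
        const last = cells.pop(); // cell with the dropdown and delete button
        const select = last.querySelector('select');
        return {
            texts: cells.map((td) => td.textContent.trim()),
            rating: select ? select.value : '',
        };
    });
}

// Pure: rebuild a clean table with the rating as text and no buttons.
function rowsToCleanHtml(rows) {
    const body = rows.map((row) => {
        const tds = [...row.texts, row.rating]
            .map((t) => `<td>${escapeHtml(t)}</td>`)
            .join('');
        return `<tr>${tds}</tr>`;
    }).join('');
    return `<table><tbody>${body}</tbody></table>`;
}

// Browser-only: write HTML plus a plain-text fallback to the clipboard.
// Needs a secure context and is normally called from a click handler.
async function copyHtml(html, plainText) {
    const item = new ClipboardItem({
        'text/html': new Blob([html], { type: 'text/html' }),
        'text/plain': new Blob([plainText], { type: 'text/plain' }),
    });
    await navigator.clipboard.write([item]);
}
```

Rebuilding the table from data means the live table never has to be modified just to get a tidy copy, which is one way to avoid the delete buttons ending up in the pasted result.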

See the Pen “AI table transfer” by Tom (@twwoodward) on CodePen.

¹ The distant past is often as much as a week or two ago.
