Improving model output
Improving-model-output.Rmd
ChatGPT is not perfect at extracting information and can make mistakes, but there are several routes to improve its output and to check it.
- Hand-Coding Data
- Changing Model Version
- Few-Shot Prompting
- Custom Prompts
- Post-Processing Output
- Combination of Approaches
Hand-Coding Data
It is generally good practice to hand-code a random sample of your data. This can help in several ways. First, it gives you an idea of what type of information you can extract from your data. With this information in hand, you can decide what to input to functions like get_bio(). Second, hand-coding gives you a dataset against which you can evaluate the performance of ChatGPT output. Third, you can use these data as inputs for few-shot prompting, which can help improve output from ChatGPT.
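For instance, here is a minimal sketch of an accuracy check against a hand-coded sample. It assumes a hypothetical data.frame bios with columns bio, bio_name, and a hand-coded gender_hand column; none of these names come from biographR itself.
# A minimal sketch: compare ChatGPT output to a hand-coded sample.
# `bios`, `gender_hand`, and the sample size are hypothetical/illustrative.
set.seed(123)
coded_sample <- bios[sample(nrow(bios), 50), ]

# Run get_bio() on each sampled biography and stack the one-row results
gpt_results <- do.call(rbind, Map(function(b, n) {
  get_bio(bio = b, bio_name = n, prompt_fields = "gender")
}, coded_sample$bio, coded_sample$bio_name))

# Share of cases where ChatGPT agrees with the hand-coded label
mean(gpt_results$gender == coded_sample$gender_hand, na.rm = TRUE)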
Changing Model Version
biographR defaults to gpt-3.5-turbo because it is less expensive than gpt-4 and generally performs quite well. However, other models like gpt-4 and gpt-4-1106-preview can give better performance. In particular, GPT-4 models perform better on function calling, according to OpenAI, and, anecdotally, appear to do a better job extracting subtler information like gender from pronouns. You can change which model you're using in biographR through the openai_model argument.
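As a minimal sketch, the same kind of get_bio() call can be routed to a different model by passing its name through openai_model (the bio text here is one of the exemplars used later in this vignette):
# Sketch: request GPT-4 for a single call via the openai_model argument
get_bio(bio = "Sally Smith went to Hogwarts where she studied magic and earned her BS in potions.",
        bio_name = "Sally Smith",
        prompt_fields = c("undergraduate_education", "gender"),
        openai_model = "gpt-4")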
Few-Shot Prompting
ChatGPT output can also be improved by providing ChatGPT with exemplars of the task you want it to complete – what is known as “few-shot” prompting. This is another way that hand-coded biographical texts can come in handy: they give you some examples with which to improve ChatGPT responses.
Few-shot prompting is supported in biographR using the prompt_fewshot argument in get_bio() and get_bio_function_call(). The prompt_fewshot argument accepts a data.frame or tibble containing the example bio, the example bio_name (if applicable), and the results for every field in your prompt_fields argument.
For example, we might have three exemplars of the type of information we want. We would pass this information to the prompt_fewshot argument in this form:
prompt_fewshot <- data.frame(
  bio = c("John Smith went to Nowhere University where he earned his B.A.",
          "Sally Smith went to Hogwarts where she studied magic and earned her BS in potions.",
          "Adam Driver got his BFA from Juilliard."),
  bio_name = c("John Smith", "Sally Smith", "Adam Driver"),
  gender = c("Male", "Female", "Male"),
  undergraduate_education = c("Nowhere University - Bachelor's",
                              "Hogwarts - Bachelor's",
                              "Juilliard - Bachelor's")
)
We can then pass this information to one of our functions like this (the bio text comes from Joe Biden's Wikipedia page):
get_bio(bio = "Joseph Robinette Biden Jr. is an American politician who is the 46th and current president of the United States. A member of the Democratic Party, he previously served as the 47th vice president from 2009 to 2017 under President Barack Obama and represented Delaware in the United States Senate from 1973 to 2009. Born in Scranton, Pennsylvania, Biden moved with his family to Delaware in 1953. He graduated from the University of Delaware before earning his law degree from Syracuse University.",
bio_name = "Joe Biden",
prompt_fields = c("undergraduate_education", "gender"),
prompt_fields_formats = list(undergraduate_education = "{UNDERGRADUATE SCHOOL} - {DEGREE}"),
prompt_fewshot = prompt_fewshot)
#> Input Tokens: 595
#> Output Tokens: 23
#> Total Tokens: 618
#> # A tibble: 1 × 2
#> undergraduate_education gender
#> <chr> <chr>
#> 1 University of Delaware - Bachelor's; Syracuse University - Law degree Male
Custom Prompts
It can be hard to design a perfect prompt, and different tasks might require different approaches. The get_bio() function supports adding unique prompts which override the default through the prompt argument. When a custom prompt is supplied, the output of the function will be unprocessed (i.e., get_bio() won't output a tibble), so that you can specify whichever form of output you prefer.
get_bio(bio = "Joseph Robinette Biden Jr. is an American politician who is the 46th and current president of the United States. A member of the Democratic Party, he previously served as the 47th vice president from 2009 to 2017 under President Barack Obama and represented Delaware in the United States Senate from 1973 to 2009. Born in Scranton, Pennsylvania, Biden moved with his family to Delaware in 1953. He graduated from the University of Delaware before earning his law degree from Syracuse University.",
bio_name = "Joe Biden",
prompt = "Return ONLY a CSV format containing the gender and highest level of education for Joe Biden with columns 'gender' and 'highest_level_of_education'"
)
#> Warning: Custom prompt provided, bio_name argument will be ignored.
#> Input Tokens: 166
#> Output Tokens: 12
#> Total Tokens: 178
#> [1] "gender,highest_level_of_education\nMale,Law degree"
Post-Processing Output
A fifth approach we can take to improve our output involves post-processing our results. For example, if we gather data from many biographies, we may end up with data that are not in a common format (e.g., calling a Bachelor's degree a B.A., B.S., B.F.A., Bachelor's, etc.). You can, of course, post-process these data using regex or even by hand, but you can also use ChatGPT to do it. The clean_columns() function implements post-processing of data.
example_data <- data.frame(highest_level_of_education = c("B.A.", "BA", "BFA", "Ph.D.", "Master's"),
                           state_of_birth = c("CA", "Calif.", "California", "NY", "New York"))
clean_columns(data = example_data,
              column_values = list(highest_level_of_education = c("High School or less", "College", "Graduate School"),
                                   state_of_birth = c("CA", "NY")))
#> highest_level_of_education
#> Input Tokens: 119
#> Output Tokens: 113
#> Total Tokens: 232
#> state_of_birth
#> Input Tokens: 106
#> Output Tokens: 25
#> Total Tokens: 131
#> # A tibble: 5 × 4
#> highest_level_of_education state_of_birth highest_level_of_education_gpt
#> <chr> <chr> <chr>
#> 1 B.A. CA College
#> 2 BA Calif. College
#> 3 BFA California College
#> 4 Ph.D. NY Graduate School
#> 5 Master's New York Graduate School
#> # ℹ 1 more variable: state_of_birth_gpt <chr>
Combination of Approaches
Finally, we can combine these approaches to produce better output: we can use more advanced GPT models together with few-shot prompting, custom prompts, and post-processing of data. Users should always check the output of these models, and hand-coding data allows you to assess the accuracy of ChatGPT output.
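As a minimal sketch of a combined workflow, reusing the prompt_fewshot exemplars and the Biden bio from above (biden_bio is a hypothetical object holding that excerpt, and the category lists passed to clean_columns() are illustrative choices, not defaults):
# Sketch: stronger model + few-shot exemplars + post-processing.
# Assumes `prompt_fewshot` was defined as above; `biden_bio` holds the
# Biden Wikipedia excerpt used earlier in this vignette.
combined_output <- get_bio(bio = biden_bio,
                           bio_name = "Joe Biden",
                           prompt_fields = c("undergraduate_education", "gender"),
                           prompt_fewshot = prompt_fewshot,
                           openai_model = "gpt-4")

# Standardize the extracted fields into small sets of categories
clean_columns(data = combined_output,
              column_values = list(undergraduate_education = c("Bachelor's", "Master's", "Doctorate", "Other"),
                                   gender = c("Male", "Female")))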