The last thing we need to do is split the text into several chunks if it is too long. Let's perform a simple calculation: GPT-3.5-Turbo-16k can accept up to 16,000 tokens as input, which is roughly equivalent to 12,000 English words (as stated on openai.com/pricing: 'For English text, 1 token is approximately 4 characters or 0.75 words').
Given that we need to generate summaries, which can sometimes be quite lengthy, we should aim for a maximum input of about 9,000 English words, which translates to roughly 18 pages of text. The GPT-4-8k model has a context window half that size, so we would limit its input to about 4,500 words, or 9 pages of text.
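As a quick sanity check, the arithmetic behind these limits can be written out as a short snippet. The 25% of the context window reserved for the generated summary and the 500-words-per-page figure are my own assumptions, chosen to reproduce the numbers above:

```python
TOKENS_PER_WORD = 1 / 0.75   # ~1.33 tokens per English word (openai.com/pricing)
WORDS_PER_PAGE = 500         # rough rule of thumb for a page of text

def word_budget(context_tokens: int, reserve_fraction: float = 0.25) -> int:
    """Approximate how many English words of input fit into the context window,
    leaving a share of it free for the generated summary."""
    usable_tokens = context_tokens * (1 - reserve_fraction)
    return int(usable_tokens / TOKENS_PER_WORD)

for model, context in [("GPT-3.5-Turbo-16k", 16_000), ("GPT-4-8k", 8_000)]:
    words = word_budget(context)
    print(f"{model}: ~{words} words, ~{words // WORDS_PER_PAGE} pages")
# GPT-3.5-Turbo-16k: ~9000 words, ~18 pages
# GPT-4-8k: ~4500 words, ~9 pages
```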
Typically, we have no more than 50 news articles per stock, so their summaries can be generated in a single API call. However, on a given day there may be between 230 and 260 news articles, requiring multiple calls.
For a weekly summary (comprising 2,200 to 2,400 news articles), this could require 15 to 20 calls to the GPT-3.5-Turbo-16k model. To prepare these inputs, we divide the text into 'chunks' that are processed sequentially, one per API call. Through empirical testing, I have found that a maximum of 6,000 words (approximately 100 news articles) works well for the GPT-3.5-Turbo-16k model, while 3,000 words is suitable for the GPT-4-8k model. These may not be the strictly optimal thresholds, but they have proven effective in my experience, and the code runs without crashing.
It's important to note that when you use the OpenAI API to generate summaries, the tokens in those summaries also count towards your token usage. Tokens are the fundamental units of text processing: both the input text and the generated output are measured in them.
For instance, when you submit a news article for summarization, the tokens of the original article count towards your usage, and the tokens of the generated summary are counted on top of that. In other words, you need to account for the length of the input text as well as the length of the generated summary when calculating token consumption.
Therefore, if you have a limited token budget or a specific usage quota, it's essential to factor in both the input text and the expected length of the summaries you wish to generate. This ensures you can effectively manage and optimise your token usage when working with the OpenAI API.
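If you want to measure this precisely rather than rely on the 0.75-words-per-token heuristic, the tiktoken library can count the tokens in a prompt before you send it. A minimal sketch is shown below; the model name and the 1,000-token allowance for the summary are illustrative assumptions:

```python
import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo-16k") -> int:
    """Count how many tokens the model's tokenizer produces for `text`."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Fall back to the encoding used by the GPT-3.5 / GPT-4 family
        encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))

prompt = "Summarise the following news articles: ..."
expected_summary_tokens = 1_000  # assumed allowance for the generated output
estimated_total = count_tokens(prompt) + expected_summary_tokens
print(f"Estimated tokens consumed by this call: ~{estimated_total}")
```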
Here is an example of producing chunks of input data for a one-week run; a minimal sketch of the chunking logic itself follows the example. The numbers in brackets mark the borders between adjacent chunks, and each chunk corresponds to one OpenAI API call that uses close to the maximum number of tokens:
Chunks for market summary last day: [0, 96, 222]
Chunks for market summary last day for GPT-4: [0, 38, 96, 155, 222]
Chunks for market summary last week: [0, 96, 222, 367, 483, 608, 681, 809, 931, 1034, 1145, 1292, 1402, 1504, 1625, 1730, 1851, 1988, 2107]
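Below is a minimal sketch of how such boundary lists can be produced: walk through the list of articles and start a new chunk whenever adding the next article would exceed the word limit. The 6,000-word limit matches the figure above; the function name and the plain-string representation of an article are my own assumptions:

```python
def chunk_boundaries(articles: list[str], max_words: int = 6_000) -> list[int]:
    """Return indices [0, i1, ..., len(articles)] marking the borders between
    chunks, where each chunk stays under `max_words` English words."""
    boundaries = [0]
    words_in_chunk = 0
    for i, article in enumerate(articles):
        n_words = len(article.split())
        # Close the current chunk if this article would push it over the limit
        # (but never create an empty chunk).
        if words_in_chunk + n_words > max_words and i > boundaries[-1]:
            boundaries.append(i)
            words_in_chunk = 0
        words_in_chunk += n_words
    boundaries.append(len(articles))  # final border after the last article
    return boundaries

# Each pair of adjacent boundaries defines one API call:
# articles[boundaries[k] : boundaries[k + 1]] is the text sent in call k.
```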