A Brief Introduction to Implementing a Typing Effect with ChatGPT Streaming Output

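I touched on ChatGPT streaming output briefly in an earlier article, "How Beginners Can Get Started with the ChatGPT API in Python: A Detailed Walkthrough". Lately, though, more and more people have been asking me about it, which is why I decided to write this article.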

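The official API actually supports this out of the box; it is just easy to overlook in the documentation. All we need to do is add the stream=True parameter to the request. The response then arrives as a sequence of small chunks, each carrying a fragment (a "delta") of the message, instead of as one complete reply.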

Example:

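# stream = False (the default): wait for the complete response

import time

import openai  # legacy SDK interface (openai < 1.0); set openai.api_key before running

start_time = time.time()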

response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[
        {'role': 'user', 'content': 'Count to 100, with a comma between each number and no newlines. E.g., 1, 2, 3, ...'}
    ],
    temperature=0,
)

response_time = time.time() - start_time
print(f"Full response received {response_time:.2f} seconds after request")
print(f"Full response received:\n{response}")

# stream = True: receive the response chunk by chunk

start_time = time.time()

# send a ChatCompletion request to count to 100
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[
        {'role': 'user', 'content': 'Count to 100, with a comma between each number and no newlines. E.g., 1, 2, 3, ...'}
    ],
    temperature=0,
    stream=True  # this is the key: enable streaming output
)

# create variables to collect the stream of chunks
collected_chunks = []
collected_messages = []
# iterate through the stream of events
for chunk in response:
    chunk_time = time.time() - start_time  # calculate the time delay of the chunk
    collected_chunks.append(chunk)  # save the event response
    chunk_message = chunk['choices'][0]['delta']  # extract the message
    collected_messages.append(chunk_message)  # save the message
    print(f"Message received {chunk_time:.2f} seconds after request: {chunk_message}")  # print the delay and text

# print the time delay and text received
print(f"Full response received {chunk_time:.2f} seconds after request")
full_reply_content = ''.join([m.get('content', '') for m in collected_messages])
print(f"Full conversation received: {full_reply_content}")
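The loop above prints one log line per chunk, which is handy for measuring latency but does not yet look like typing. To get the typing effect in a terminal, one option is to write each delta as it arrives and flush the output right away. The following is only a minimal sketch: the helper name type_out, the delay parameter, and the hand-written fake_chunks are assumptions made for illustration, so that the example runs without calling the API.

```python
import sys
import time


def type_out(chunks, delay=0.02, out=sys.stdout):
    """Write each streamed delta to `out` as it arrives; return the full text."""
    parts = []
    for chunk in chunks:
        delta = chunk['choices'][0]['delta']
        text = delta.get('content', '')  # the first and last chunks carry no content
        if not text:
            continue
        out.write(text)
        out.flush()  # flush so the text shows up immediately instead of sitting in a buffer
        parts.append(text)
        time.sleep(delay)  # optional pause (an assumption) to pace the effect
    out.write('\n')
    return ''.join(parts)


# Hand-written stand-in chunks, shaped like the deltas collected above
fake_chunks = [
    {'choices': [{'delta': {'role': 'assistant'}}]},
    {'choices': [{'delta': {'content': '1'}}]},
    {'choices': [{'delta': {'content': ','}}]},
    {'choices': [{'delta': {'content': ' 2'}}]},
    {'choices': [{'delta': {}}]},
]

if __name__ == '__main__':
    type_out(fake_chunks)
```

With a real request, the streaming response object from the example above could be passed in place of fake_chunks, since it is iterated in the same way.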

As for how to render this on the client, that ultimately depends on the language you are using, so I won't go into detail here.
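For readers who want a starting point anyway, one common way to push chunks to a browser is Server-Sent Events (SSE), where each message is sent as a "data: ..." line followed by a blank line. The sketch below only formats the frames; the function name sse_events, the JSON payload shape, and the "[DONE]" end marker are choices made for this example, and wiring the generator into a web framework response with the text/event-stream content type is left to whatever framework you use.

```python
import json


def sse_events(chunks):
    """Yield one Server-Sent Events frame per non-empty content delta."""
    for chunk in chunks:
        text = chunk['choices'][0]['delta'].get('content')
        if text:
            # json.dumps escapes newlines, so the text cannot break the SSE frame
            yield f"data: {json.dumps({'text': text})}\n\n"
    # end-of-stream marker chosen for this sketch; the client must agree on it
    yield "data: [DONE]\n\n"
```

On the client side, each frame can then be parsed and its text appended to the page, which produces the typing effect.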
