Customizing prompts
If you want to take control of the prompts Instructor sends to the LLM in different modes, you can use the prompt parameter of the request() or respond() methods. It overrides the default Instructor prompts, allowing you to fully customize how the LLM is instructed to process the input.
Prompting models with tool calling support
Mode::Tools is usually the most reliable way to get structured outputs that follow the provided response schema. Mode::Tools can make use of the $toolName and $toolDescription parameters to provide additional semantic context to the LLM, describing the tool to be used for processing the input. Mode::Json and Mode::MdJson ignore these parameters, as tools are not used in these modes.
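A sketch of passing tool metadata, assuming respond() accepts toolName and toolDescription as named arguments and reusing the illustrative User class from above:

```php
$user = (new Instructor)->respond(
    messages: "Our user Jason is 25 years old.",
    responseModel: User::class,
    mode: Mode::Tools,
    // Extra semantic context for the model; ignored by Mode::Json and Mode::MdJson.
    toolName: 'extract_user',
    toolDescription: 'Extracts user data from the provided text.',
);
```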
Prompting models supporting JSON output
Aside from tool calling, Instructor supports two other modes for getting structured outputs from the LLM: Mode::Json and Mode::MdJson.

Mode::Json uses the JSON mode offered by some models and API providers to get the LLM to respond in JSON format rather than plain text.

Note that various models and API providers have specific requirements on the input format - e.g. OpenAI's JSON mode requires the string "JSON" to appear somewhere in the prompt.
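For example, a sketch of satisfying OpenAI's requirement with a custom prompt (message and prompt text are illustrative):

```php
$user = (new Instructor)->respond(
    messages: "Our user Jason is 25 years old.",
    responseModel: User::class,
    // OpenAI's JSON mode rejects requests unless the prompt mentions "JSON".
    prompt: "Respond with JSON containing the extracted user data.",
    mode: Mode::Json,
);
```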
Including JSON Schema in the prompt
Instructor takes care of automatically setting the response_format parameter, but this may not be sufficient for some models or providers - some of them (e.g. OpenAI) require specifying the JSON response format as part of the prompt, rather than just as the response_format parameter of the request.

For this reason, when using Instructor’s Mode::Json and Mode::MdJson you should include the expected JSON Schema in the prompt. Otherwise, the response is unlikely to match your target model, making it impossible for Instructor to deserialize it correctly.
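A sketch with a hand-written schema (the schema matches the illustrative User class from earlier; yours will differ):

```php
$prompt = <<<PROMPT
Respond with JSON. The response must follow this JSONSchema:
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
PROMPT;

$user = (new Instructor)->respond(
    messages: "Our user Jason is 25 years old.",
    responseModel: User::class,
    prompt: $prompt,
    mode: Mode::Json,
);
```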
The example above demonstrates how to write the JSON Schema by hand, but with Instructor you do not have to build the schema manually - you can use the prompt template placeholder syntax to include the Instructor-generated JSON Schema.
Prompt as template
Instructor allows you to use a template string as a prompt. You can put <|variable|> placeholders in the template string, which will be replaced with the actual values during execution.

Currently, the following placeholders are supported:

<|json_schema|> - replaced with the JSON Schema for the current response model

The example below demonstrates how to use a template string as a prompt:
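This sketch reuses the illustrative User class; Instructor expands the placeholder into the generated schema before sending the request:

```php
$user = (new Instructor)->respond(
    messages: "Our user Jason is 25 years old.",
    responseModel: User::class,
    // <|json_schema|> is replaced with the JSON Schema generated
    // from the response model (User) before the prompt is sent.
    prompt: "Respond with JSON. The response must follow this JSONSchema:\n<|json_schema|>\n",
    mode: Mode::Json,
);
```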
Prompting models with no support for tool calling or JSON output
Mode::MdJson is the most basic (and least reliable) way to get structured outputs from the LLM. Still, you may want to use it with models that do not support tool calling or JSON output.

Mode::MdJson relies on prompting to get the LLM to respond with JSON-formatted data. Many models prompted in this mode will respond with a mixture of plain text and JSON data; Instructor will try to find the JSON fragment in the response and ignore the rest of the text.

This approach is the most prone to deserialization and validation errors, and it requires providing the JSON Schema in the prompt to increase the probability that the response is correctly structured and contains the expected data.
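For example, a sketch combining Mode::MdJson with the schema placeholder described above (again reusing the illustrative User class):

```php
$user = (new Instructor)->respond(
    messages: "Our user Jason is 25 years old.",
    responseModel: User::class,
    // Ask for JSON in a Markdown code block and include the generated schema.
    prompt: "Respond with JSON wrapped in a Markdown code block. The response must follow this JSONSchema:\n<|json_schema|>\n",
    mode: Mode::MdJson,
);
```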