Prompting
General Tips for Prompt Engineering
The overarching theme of using Instructor for function calling is to make the models self-descriptive, modular, and flexible, while maintaining data integrity and ease of use.
- Modularity: Design self-contained components for reuse.
- Self-Description: Use PHPDoc comments or the #[Description('')] attribute for clear field descriptions.
- Optionality: Use PHP’s nullable types (e.g. ?int) for optional fields and set sensible defaults.
- Standardization: Employ enumerations for fields with a fixed set of values; include a fallback option.
- Dynamic Data: Use key-value pairs for arbitrary properties and limit list lengths.
- Entity Relationships: Define explicit identifiers and relationship fields.
- Contextual Logic: Optionally add a “chain of thought” field in reusable components for extra context.
Utilize Nullable Attribute
Use PHP’s nullable types by prefixing the type name with a question mark (?), and set a default value (typically null) so that missing data does not end up as an undesired value such as an empty string.
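A minimal sketch of an optional field, assuming PHP 7.4 or newer for typed properties. The UserDetail class and its fields are illustrative, and the PHPDoc comment stands in for the field description.

```php
<?php

class UserDetail
{
    public int $age;
    public string $name;
    /** Role of the user, if stated in the text; null when not mentioned. */
    public ?string $role = null;
}

// Simulate a result where the source text did not mention a role.
$user = new UserDetail();
$user->age = 28;
$user->name = 'Jason';

var_dump($user->role); // NULL
```

With the null default, a missing role is distinguishable from a role that was extracted as an empty string.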
Handling Errors Within Function Calls
You can create a wrapper class to hold either the result of an operation or an error message. This allows you to remain within a function call even if an error occurs, facilitating better error handling without breaking the code flow.
With the MaybeUser class, you can either receive a UserDetail object in $result or get an error message in $errorMessage.
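One way such a wrapper might look is sketched here. The $error flag and the get() helper are assumptions added for illustration; only MaybeUser, UserDetail, the result field and errorMessage come from the text above.

```php
<?php

class UserDetail
{
    public int $age;
    public string $name;
    public ?string $role = null;
}

class MaybeUser
{
    /** Extracted user, or null if extraction failed. */
    public ?UserDetail $result = null;
    /** Explanation of the failure; null on success. */
    public ?string $errorMessage = null;
    /** True if the user could not be extracted (assumed helper flag). */
    public bool $error = false;

    /** Returns the user only when no error was reported. */
    public function get(): ?UserDetail
    {
        return $this->error ? null : $this->result;
    }
}

// Simulate a failed extraction.
$failed = new MaybeUser();
$failed->error = true;
$failed->errorMessage = 'No user mentioned in the text.';

var_dump($failed->get()); // NULL
```

The caller checks get() or $error instead of catching an exception, so the surrounding flow continues either way.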
The original Python implementation of Instructor provides a Maybe utility class that makes this pattern even easier. Such a mechanism is not yet available in the PHP version of Instructor.
Tips for Enumerations
To prevent data misalignment, use Enums for standardized fields. Always include an “Other” option as a fallback so the model can signal uncertainty.
If you’d like to improve the quality of the LLM’s output, try reiterating the requirements in the field descriptions (in the PHPDoc comments).
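A sketch of an enumerated field with a fallback, assuming PHP 8.1 or newer for native enums. The Role cases are illustrative. The description restates the requirement, as suggested above.

```php
<?php

enum Role: string
{
    case Principal = 'principal';
    case Teacher = 'teacher';
    case Student = 'student';
    case Other = 'other'; // fallback for uncertain or unlisted roles
}

class UserDetail
{
    public int $age;
    public string $name;
    /** Correctly assign one of the predefined roles to the user; use "other" when unsure. */
    public Role $role;
}

// Mapping a raw value defensively: unknown strings fall back to Role::Other.
$role = Role::tryFrom('janitor') ?? Role::Other;

echo $role->value, PHP_EOL; // other
```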
Reiterate Long Instructions
For complex attributes, it helps to reiterate the instructions in the field’s description.
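As an illustration, a component can carry a field whose only job is to make the model restate the rules before answering. The Role class and its $instructions field are assumptions made for this sketch. The reflection call only shows that the description is attached to the property.

```php
<?php

class Role
{
    /** Restate the instructions and rules to correctly determine the title. */
    public string $instructions = '';
    public string $title = '';
}

class UserDetail
{
    public int $age;
    public string $name;
    public Role $role;
}

// The PHPDoc comment stays attached to the property and can be read via reflection.
$doc = (new ReflectionProperty(Role::class, 'instructions'))->getDocComment();
echo $doc, PHP_EOL;
```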
Handle Arbitrary Properties
When you need to extract undefined attributes, use a list of key-value pairs.
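A minimal sketch of this structure follows. The Property class and the field names are illustrative, and the `@var Property[]` annotation is assumed to be how the element type of the list is declared.

```php
<?php

class Property
{
    public string $key = '';
    public string $value = '';
}

class UserDetail
{
    public int $age;
    public string $name;
    /** @var Property[] Extract any other properties that might be relevant. */
    public array $properties = [];
}

// Simulated result: one extra attribute captured as a key-value pair.
$hobby = new Property();
$hobby->key = 'hobby';
$hobby->value = 'chess';

$user = new UserDetail();
$user->age = 28;
$user->name = 'Jason';
$user->properties[] = $hobby;

echo count($user->properties), PHP_EOL; // 1
```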
Limiting the Length of Lists
When dealing with lists of attributes, especially arbitrary properties, it’s crucial to manage the length. You can use prompting and enumeration to limit the list length, ensuring a manageable set of properties.
To be 100% certain the list does not exceed the limit, add extra validation, e.g. using ValidationMixin (see: Validation).
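The sketch below combines both ideas: descriptions that state the limit, and an explicit length check. The propertyLimitError() helper is hypothetical and written in plain PHP; it is not the ValidationMixin API, which is documented separately. The limit of 3 is an arbitrary example value.

```php
<?php

class Property
{
    /** Monotonically increasing ID, not larger than 2. */
    public int $index = 0;
    public string $key = '';
    public string $value = '';
}

class UserDetail
{
    public const MAX_PROPERTIES = 3;

    public int $age;
    public string $name;
    /** @var Property[] Numbered list of arbitrary extracted properties, at most 3 items. */
    public array $properties = [];

    /** Hypothetical helper: error message if the list is too long, null otherwise. */
    public function propertyLimitError(): ?string
    {
        $count = count($this->properties);
        return $count > self::MAX_PROPERTIES
            ? "Too many properties: {$count} (max " . self::MAX_PROPERTIES . ")"
            : null;
    }
}

// Simulate a response that ignored the prompt and returned 4 properties.
$user = new UserDetail();
for ($i = 0; $i < 4; $i++) {
    $property = new Property();
    $property->index = $i;
    $user->properties[] = $property;
}

echo $user->propertyLimitError(), PHP_EOL; // Too many properties: 4 (max 3)
```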
Consistent Arbitrary Properties
For multiple records containing arbitrary properties, instruct the LLM to use consistent key names when extracting properties.
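A sketch of how the instruction might be placed on the collection field. The class names are illustrative, and distinctKeys() is a hypothetical helper for checking afterwards how many different key names were actually used.

```php
<?php

class Property
{
    public string $key = '';
    public string $value = '';
}

class UserDetail
{
    public int $id = 0;
    public string $name = '';
    /** @var Property[] Extract any other properties that might be relevant. */
    public array $properties = [];
}

class UserDetails
{
    /** @var UserDetail[] Extract information for multiple users. Use consistent key names for properties across users. */
    public array $users = [];
}

/** Hypothetical helper: lists every distinct property key used across all users. */
function distinctKeys(UserDetails $details): array
{
    $keys = [];
    foreach ($details->users as $user) {
        foreach ($user->properties as $property) {
            $keys[$property->key] = true;
        }
    }
    return array_keys($keys);
}
```

A short list of distinct keys after extraction suggests the naming stayed consistent; many near-duplicates (for example "hobby" and "hobbies") suggest the instruction needs reinforcing.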
Defining Relationships Between Entities
In cases where relationships exist between entities, it’s vital to define them explicitly in the model.
Relationships between users, for example, can be expressed by giving each user an $id field and a $coworkers field that lists the IDs of related users.
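A minimal sketch of users linked through $id and $coworkers. The UserRelationships wrapper and the unknownCoworkerIds() helper are assumptions added for illustration; the helper checks that every referenced ID belongs to a user in the collection.

```php
<?php

class UserDetail
{
    /** Unique identifier for each user. */
    public int $id = 0;
    public string $name = '';
    /** @var int[] Correct and complete list of coworker IDs, representing collaboration between users. */
    public array $coworkers = [];
}

class UserRelationships
{
    /** @var UserDetail[] Collection of users, correctly capturing the relationships among them. */
    public array $users = [];
}

/** Hypothetical helper: returns coworker IDs that do not match any user in the collection. */
function unknownCoworkerIds(UserRelationships $relationships): array
{
    $knownIds = array_map(fn (UserDetail $user) => $user->id, $relationships->users);
    $unknown = [];
    foreach ($relationships->users as $user) {
        foreach (array_diff($user->coworkers, $knownIds) as $id) {
            $unknown[] = $id;
        }
    }
    return array_values(array_unique($unknown));
}
```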
Modular Chain of Thought
Adding a “chain of thought” field improves data quality. Rather than applying one global CoT to the whole response, you can attach it to individual components where reasoning is needed.
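As a sketch, the reasoning field can live inside the one component that needs it. The Role class and the wording of the description are illustrative. Declaring the reasoning field before $title is a design choice made on the assumption that fields are generated in declaration order.

```php
<?php

class Role
{
    /** Think step by step to determine the correct title. */
    public string $chainOfThought = '';
    public string $title = '';
}

class UserDetail
{
    public int $age;
    public string $name;
    // Only this component carries reasoning; the rest of the model stays plain.
    public Role $role;
}
```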
Reusing Components with Different Contexts
You can reuse the same component for different contexts within a model. For example, a TimeRange component can be used for both $workTime and $leisureTime.
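A minimal sketch of this reuse. The field names inside TimeRange and the hour-based units are assumptions; the descriptions on $workTime and $leisureTime are what give each use of the component its context.

```php
<?php

class TimeRange
{
    /** The start time in hours. */
    public int $startTime = 0;
    /** The end time in hours. */
    public int $endTime = 0;
}

class UserDetail
{
    public string $name = '';
    /** Time range during which the user is working. */
    public TimeRange $workTime;
    /** Time range reserved for leisure activities. */
    public TimeRange $leisureTime;
}

// Simulated result: the same component type filled in for two contexts.
$user = new UserDetail();
$user->workTime = new TimeRange();
$user->workTime->startTime = 9;
$user->workTime->endTime = 17;
$user->leisureTime = new TimeRange();
$user->leisureTime->startTime = 18;
$user->leisureTime->endTime = 22;
```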
Adding Context to Components
Sometimes, a component like TimeRange may require some context or additional logic to be used effectively. Employing a “chain of thought” field within the component can help in understanding or optimizing the time range allocations.
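The same component with an added reasoning field might look like the sketch below. The $chainOfThought name and its description are illustrative, and the field is placed first on the assumption that the model fills in fields in declaration order.

```php
<?php

class TimeRange
{
    /** Step by step reasoning to get the correct time range. */
    public string $chainOfThought = '';
    /** The start time in hours. */
    public int $startTime = 0;
    /** The end time in hours. */
    public int $endTime = 0;
}

class UserDetail
{
    public string $name = '';
    /** Time range during which the user is working. */
    public TimeRange $workTime;
    /** Time range reserved for leisure activities. */
    public TimeRange $leisureTime;
}
```

Because the field belongs to TimeRange, every place that reuses the component gets its own reasoning step.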