(Generating the) Task Client Package
The idea of HQS Tasks is to run a task by simply calling a Python function (which we call a client function) in a regular Python script (which we call a user script). This function is isolated from the actual task implementation and does not run it directly; instead, it serves as a proxy for it. Technically speaking, the client function is just a wrapper (interface) that tells the task execution backend to invoke a task, waits until the backend reports that the task has finished executing, and finally returns the result.
This "bigger picture" is also described in the HQS Task User Documentation, specifically in the Architecture section.
What are Client Functions?
The client functions for tasks are auto-generated by a tool provided by HQS. For each task, an individual client function is generated. To give you a better idea, here is the (simplified) code of the generated client functions for two hypothetical tasks, followed by an explanation:
from hqs_tasks_execution import execute
from hqs_task_example import InputModel, OutputModel

async def hello(message: str) -> str:
    return await execute("hello", "1.2.3", message, str)

async def other_task(input: InputModel) -> OutputModel:
    return await execute("other_task", "1.2.3", input, OutputModel)
We observe:
- The shape of the body of the client functions is always the same: a general execute function is being invoked. This function is implemented in the general client package hqs_tasks_execution (completely independent of any concrete task).
- The task name is passed as a string to that function. The client function's name matches the task name.
- Furthermore, the version is supplied (here: 1.2.3). This corresponds to the task version.
- Then, the input (argument) is passed, i.e. forwarded exactly as it was provided by the caller.
- Not visualized here, but easily imaginable: The input (argument) and output (return) types match the ones in the task definition. These types are imported from that package (where the tasks have been defined).
- The output (return) type is passed as an argument to the general execute function. The reason for this is that after the task returned a result, we will create an instance of that type class in order to not only force type-safety but also to return an instance of the correct class at runtime, i.e. supporting isinstance checks instead of just duck-typing.
- The client function is asynchronous. This is due to the general client's internal logic which waits for the task to be completed. This shall not be blocking other code in case the client script also processes other things or runs multiple tasks concurrently.
Note that some (less relevant) technical details are omitted here to avoid confusing the reader.
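To illustrate how a user script might call such client functions, here is a minimal, self-contained sketch. The execute function below is a local stand-in and an assumption, not the real hqs_tasks_execution implementation: it only sleeps to simulate waiting for the backend and then builds an instance of the requested output type. The InputModel and OutputModel dataclasses and the faked results are likewise hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Type, TypeVar

T = TypeVar("T")


@dataclass
class InputModel:  # hypothetical stand-in for a task input model
    value: int


@dataclass
class OutputModel:  # hypothetical stand-in for a task output model
    doubled: int


async def execute(name: str, version: str, input: Any, output_type: Type[T]) -> T:
    """Stand-in for the real execute function: no backend is contacted."""
    await asyncio.sleep(0.01)  # simulates waiting for the task to finish
    # Faked raw results, roughly as they might look after deserialization.
    raw = f"Hello, {input}!" if name == "hello" else {"doubled": input.value * 2}
    # Build an instance of the requested type so that isinstance checks work.
    return output_type(**raw) if isinstance(raw, dict) else output_type(raw)


async def hello(message: str) -> str:
    return await execute("hello", "1.2.3", message, str)


async def other_task(input: InputModel) -> OutputModel:
    return await execute("other_task", "1.2.3", input, OutputModel)


async def main() -> list:
    # Both tasks are awaited concurrently; neither blocks the other.
    return await asyncio.gather(hello("world"), other_task(InputModel(value=21)))


greeting, result = asyncio.run(main())
```

Because the client functions are coroutines, the user script decides how to schedule them: a single task can simply be awaited, while asyncio.gather runs several at once.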
What is the Client Package?
Now, the client package is not much more than just several of these (auto-generated) client functions. The code generator takes a task registry file and generates a Python package. In its current implementation, for each task a separate module is created, containing the client function, required import statements as well as, in some cases, generated Python code which defines Pydantic models.
The latter happens for all (sub-)schemas of the input and output for which it is not possible to just import an existing model from a Python package, which is the case when one of the following is true:
- The schema was not generated from some code written in Python but instead in a different language (or without our provided task decorator).
- For tasks which have multiple arguments, these will be wrapped in a tuple for which a class is generated mostly for internal reasons.
- For types which are not supported by Pydantic out of the box, a generated code block will define a helper type which merely annotates the actual type with some information relevant for Pydantic to support validation and serialization.
- The generator can be explicitly told to not import a specific package and instead, when the task uses models from that package, generate code for these. Note that some features, such as constraints or extra logic, will get "lost in translation", so this should be used with caution. See also below under Adding and Black-Listing Sources.
How to Generate the Client Package
The generator is a separate tool installable via the Python package hqs_tasks_generator using hqstage:
hqstage install hqs_tasks_generator
Suppose you have a task registry file under the path tasks.json. Then you can generate the corresponding client package using the command line
hqs-tasks-generate --registry-file tasks.json \
    --python-package-name my_tasks_client
In this case, the generated Python package will be named my_tasks_client and will be located under the path ./generated/python/. The latter can be customized using the option --target. The package specifies its version explicitly. By default, this is taken from the version of the tasks (which is only possible if all tasks have the same version). Otherwise, it can be specified explicitly using --python-package-version. For more options, read the usage by executing hqs-tasks-generate --help.
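The default version rule described above can be sketched as follows. This only illustrates the stated rule and is not the generator's actual code; the function name resolve_package_version and its parameters are hypothetical.

```python
from typing import Iterable, Optional


def resolve_package_version(
    task_versions: Iterable[str], explicit: Optional[str] = None
) -> str:
    """Pick the package version (hypothetical helper, illustration only)."""
    if explicit is not None:
        # Corresponds to passing --python-package-version.
        return explicit
    unique = set(task_versions)
    if len(unique) == 1:
        # All tasks share one version, so it can serve as the default.
        return unique.pop()
    raise ValueError(
        "Tasks do not share a single version; specify the package version explicitly."
    )


default_version = resolve_package_version(["1.2.3", "1.2.3"])
explicit_version = resolve_package_version(["1.2.3", "2.0.0"], explicit="0.1.0")
```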
Adding and Black-Listing Sources
Normally, when using our task decorator, any model found in the signature of a task function (argument and return types) will be marked with a "source" in the generated task definition and registry. Hence, when using the generator to generate the client package, these types will be imported from the source package where they have been defined. As a consequence, the source package will be a dependency of the generated client package.
There might be cases where this is not wanted, most notably when the source package is not public. In that case, we want to black-list this package before running the client package code generator.
There might also be cases of the opposite: a source could not be determined or is simply not annotated in the task registry file, for example when the task was not implemented using our Python framework with the task decorator, but there is a Python package which defines a model class that is compatible with the JSON schema. In this case, we can add that information to the task registry prior to running the code generator.
For both cases there is a supplementary CLI tool, and both tools work the same way: you pass the original registry file (let's call it tasks.json), a target file name (tasks_modified.json), and a list of package names for which sources need to be added to (or removed from) the task definitions.
hqs-tasks-add-sources tasks.json tasks_modified.json \
    --add-package package_which_defines_models

hqs-tasks-blacklist-sources tasks.json tasks_modified.json \
    --remove-package package_which_defines_models
Note that in the first case (when adding source information), models are identified by matching the class name against the location of the (referenced) sub-schema in the root JSON schema, which requires the sub-schema to be located under a path like /$defs/MyModel. This method may be changed using the --identiy-by switch, which accepts one of the following values: name (default), title, title_type, title_type_properties, title_description. The identification then uses the mentioned fields of the JSON schema.
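As an illustration of the default name-based identification, the following sketch iterates over the sub-schemas under /$defs and matches each key against class names known from a package. It is a simplified assumption about the behavior described above, not the tool's implementation; match_sources_by_name, the example schema, the package name my_models, and the "package.ClassName" result format are all hypothetical.

```python
from typing import Any, Dict, Iterable


def match_sources_by_name(
    schema: Dict[str, Any], package: str, class_names: Iterable[str]
) -> Dict[str, str]:
    """Map paths under /$defs to '<package>.<ClassName>' (illustration only)."""
    known = set(class_names)
    matches: Dict[str, str] = {}
    for def_name in schema.get("$defs", {}):
        # The key under $defs is compared with the class name.
        if def_name in known:
            matches[f"/$defs/{def_name}"] = f"{package}.{def_name}"
    return matches


schema = {
    "type": "object",
    "properties": {"item": {"$ref": "#/$defs/MyModel"}},
    "$defs": {
        "MyModel": {"title": "MyModel", "type": "object"},
        "Unrelated": {"title": "Unrelated", "type": "object"},
    },
}

sources = match_sources_by_name(schema, "my_models", ["MyModel", "OtherModel"])
```

Here only MyModel is matched: Unrelated has no class of that name in the package, and OtherModel does not appear in the schema.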