BatchElements and Tasks
The components of CHEESE process data in the form of BatchElement objects. These are passed between components inside Task objects, each of which serves as a container for a BatchElement.
- class cheese.data.BatchElement(client_id: int = -1, trip: int = 0, trip_start: str = 'client', trip_max: int = 1, error: bool = False, start_time: float = -1.0, end_time: float = -1.0)
Abstract base class for BatchElements. Stores all kinds of data passed around CHEESE.
- Parameters
client_id (int) – The ID of the last client that touched this data
trip (int) – How many targets have touched/accessed this data so far
trip_start (str) – First target for data (“client” or “model”) after it is queued by pipeline. Defaults to “client”.
trip_max (int) – How many targets can touch/access this data before it goes back to pipeline to be posted
error (bool) – A flag the frontend can set to mark the data as erroneous (e.g. if it couldn’t be labelled properly). Pipeline.post() does not check this flag by default, but it is recommended that you account for errors there.
start_time (float) – Timestamp for when data was first given to a client
end_time (float) – Timestamp for when data was sent back to pipeline
To illustrate the trip attributes of BatchElement, consider the following cases. Suppose we want a user to label data read from a dataset and then write the labels to a new dataset. A trip_max of 1 sends the data to the user and then straight back to the pipeline. Now suppose we instead want the user to look at the data, prompt a generation from a generative model, and then label the generation along with the original data. We would set trip_max to 3, since the data visits the user, then the model, then the user again. Once trip reaches trip_max, the data is sent back to the pipeline. With a trip_max of 2, the data returns to the pipeline directly from the model.
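The routing described above can be sketched in a few lines of Python. This is only an illustration, not CHEESE's actual implementation: SketchElement is a hypothetical stand-in that mirrors the documented trip, trip_start and trip_max fields, simulate_route is a made-up helper, and the strict alternation between "client" and "model" is an assumption drawn from the user, model, user example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchElement:
    """Hypothetical stand-in mirroring the documented trip fields (not the CHEESE class)."""
    trip: int = 0
    trip_start: str = "client"
    trip_max: int = 1


def simulate_route(elem: SketchElement) -> List[str]:
    """Return the targets an element visits, ending with 'pipeline'.

    Assumption: targets alternate between 'client' and 'model',
    starting from trip_start.
    """
    visited = []
    target = elem.trip_start
    while elem.trip < elem.trip_max:
        visited.append(target)
        elem.trip += 1  # one more target has touched the data
        target = "model" if target == "client" else "client"
    visited.append("pipeline")  # trip == trip_max: back to the pipeline
    return visited


print(simulate_route(SketchElement(trip_max=1)))  # ['client', 'pipeline']
print(simulate_route(SketchElement(trip_max=2)))  # ['client', 'model', 'pipeline']
print(simulate_route(SketchElement(trip_max=3)))  # ['client', 'model', 'client', 'pipeline']
```

Under these assumptions, an odd trip_max with the default trip_start ends on the client, while an even trip_max ends on the model, matching the trip_max of 2 case described above.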
- class cheese.tasks.Task(data: Optional[cheese.data.BatchElement] = None, client_id: int = -1, terminate: bool = False)
Tasks are used to communicate between the components of CHEESE.
- Parameters
data (BatchElement) – The data contained in the task
client_id (int) – The ID of the client that is meant to receive this task
terminate (bool) – A flag to tell the client to terminate
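As a rough sketch of how a Task wraps a BatchElement, the example below uses stand-in dataclasses that copy only the documented constructor parameters. SketchElement, SketchTask and describe are hypothetical names introduced for illustration; how a real CHEESE client reacts to these fields is not specified here and is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchElement:
    """Hypothetical stand-in for BatchElement (subset of documented fields)."""
    client_id: int = -1
    trip: int = 0
    trip_max: int = 1
    error: bool = False


@dataclass
class SketchTask:
    """Hypothetical stand-in mirroring the documented Task parameters."""
    data: Optional[SketchElement] = None
    client_id: int = -1
    terminate: bool = False


def describe(task: SketchTask) -> str:
    """Assumed receiver-side logic: check the terminate flag before the payload."""
    if task.terminate:
        return f"client {task.client_id}: terminate"
    return f"client {task.client_id}: received data (trip {task.data.trip}/{task.data.trip_max})"


# A task carrying data to client 7, and a task telling that client to stop.
work = SketchTask(data=SketchElement(trip_max=3), client_id=7)
stop = SketchTask(client_id=7, terminate=True)

print(describe(work))  # client 7: received data (trip 0/3)
print(describe(stop))  # client 7: terminate
```

Note that data defaults to None, so a terminate task does not need to carry a BatchElement.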