Service¶
This page describes the service architecture and its specifications.
The service is a FastAPI application that is deployed on a Kubernetes cluster. It is a REST API that can be used to process data.
Architecture¶
For the general architecture of the project, see the global UML diagram.
This sequence diagram illustrates the interaction between a user and a service, without using the Core Engine.
sequenceDiagram
participant S as s - Service
participant C as c - Client
participant S3 as s3 - Storage
C->>+S3: file_keys = for file in data: upload(file)
S3-->>-C: return(200, file_key)
C->>+S: POST(s.url/process, callback_url: str, service_task: ServiceTask)
Note right of S: callback_url is the url where the service should send the response
Note right of S: service_task should match the model
S-->>-C: return(200, Task added to the queue)
S->>+S3: data = for key in service_task.task.data_in: get_file(service_task.s3_infos, key)
S3-->>-S: return(200, stream)
S->>S: result = process(data)
S->>+S3: data_out = for res in result: upload_file(service_task.s3_infos, res)
S3-->>-S: return(200, key)
S->>S: task_update = jsonable_encoder(TaskUpdate({status: finished, task.data_out: data_out}))
S->>+C: PATCH(callback_url, task_update)
C-->>-S: return(200, OK)
C->>+S3: GET(task_update.data_out)
S3-->>-C: return(200, stream)
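The service-side steps of the diagram (fetch inputs, process, upload results, build the task update) can be sketched as follows. This is a minimal illustration, not the project's implementation: `InMemoryStorage`, `run_task`, the output key naming scheme and the body of `process` are all hypothetical stand-ins, and the S3 storage is replaced by a dictionary so the example runs without any external service.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStorage:
    """Hypothetical stand-in for the S3 storage: maps object keys to bytes."""
    objects: dict = field(default_factory=dict)

    def upload(self, key: str, content: bytes) -> str:
        self.objects[key] = content
        return key

    def get_file(self, key: str) -> bytes:
        return self.objects[key]


def process(data: list) -> list:
    """Placeholder processing step (assumption): upper-case each input."""
    return [item.upper() for item in data]


def run_task(storage: InMemoryStorage, task_id: str, data_in: list) -> dict:
    """Follow the service-side steps of the diagram and return the task update."""
    # 1. Download every input object listed in data_in.
    data = [storage.get_file(key) for key in data_in]
    # 2. Process the data.
    results = process(data)
    # 3. Upload each result and collect the new keys (key scheme is an assumption).
    data_out = [
        storage.upload(f"{task_id}/out-{i}", res) for i, res in enumerate(results)
    ]
    # 4. Build the update that the service would PATCH to callback_url.
    return {"status": "finished", "data_out": data_out}


storage = InMemoryStorage()
key = storage.upload("in-0", b"hello")       # the client uploads its file first
update = run_task(storage, "task-1", [key])  # the service then handles the task
```

In a real service, step 4 would be followed by the PATCH request to `callback_url`, after which the client downloads the objects listed in `data_out`.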
Specifications¶
Within this project, the services are implemented in Python. However, because a service only has to expose a REST API, it can be implemented in any language.
Endpoints¶
To match the specifications, the service must implement the following endpoints:
- GET /status: returns the service availability. (Returns a string)
- GET /tasks/{task_id}/status: returns the status of a task. (Returns a string)
- POST /compute: computes the given task and returns the result. (Returns a string)
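Since the service may be written in any language or framework, the three endpoints can be illustrated with a framework-agnostic dispatcher. The sketch below is only an assumption about how such routing might look: the `handle` function, the in-memory `TASKS` registry, the `task_id` body field, the `"pending"` status value and the response strings for `/status` and for unknown routes are hypothetical; only the three routes and the "Task added to the queue" message come from this page.

```python
import json

# Hypothetical in-memory registry: task_id -> status string.
TASKS = {}


def handle(method: str, path: str, body: str = "") -> tuple:
    """Dispatch a request to one of the three required endpoints.

    Returns a (HTTP status code, response string) pair.
    """
    parts = path.strip("/").split("/")

    # GET /status: service availability.
    if method == "GET" and parts == ["status"]:
        return 200, "Service is available"

    # GET /tasks/{task_id}/status: status of one task.
    is_task_status = len(parts) == 3 and parts[0] == "tasks" and parts[2] == "status"
    if method == "GET" and is_task_status:
        task_id = parts[1]
        if task_id not in TASKS:
            return 404, "Task not found"
        return 200, TASKS[task_id]

    # POST /compute: accept a task for computation.
    if method == "POST" and parts == ["compute"]:
        payload = json.loads(body)
        task_id = str(payload["task_id"])  # field name is an assumption
        TASKS[task_id] = "pending"         # status value is an assumption
        return 200, "Task added to the queue"

    return 404, "Not found"


availability = handle("GET", "/status")
queued = handle("POST", "/compute", json.dumps({"task_id": 42}))
task_status = handle("GET", "/tasks/42/status")
```

In a FastAPI application, each branch would typically become its own route function; the dispatcher form is used here only to keep the example dependency-free.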
Models¶
The different models used in the pipeline are described below.
Task Input¶
The POST /compute endpoint must be able to receive a JSON body that matches the following model:
The data_in and data_out fields are lists of S3 object keys. The status field is a string that can be one of the following values:
The S3 settings are used to connect to the S3 storage where the input data is stored and where the result will be stored. The callback_url is the URL where the service should send the response.
A JSON representation would look like this:
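As a rough sketch of how a service might parse and check such a body, the fields named above can be modelled with plain dataclasses. The nesting (`task` inside the body), the `s3_infos` keys, the example values and the `"pending"` status string are assumptions made for illustration; only the field names `data_in`, `data_out`, `status`, `s3_infos` and `callback_url` appear on this page.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    data_in: list                                 # S3 object keys of the inputs
    status: str                                   # one of the allowed status strings
    data_out: list = field(default_factory=list)  # S3 object keys of the results


@dataclass
class ServiceTask:
    task: Task
    s3_infos: dict     # S3 connection settings (keys are assumptions)
    callback_url: str  # URL where the service sends its response


def parse_service_task(raw: str) -> ServiceTask:
    """Parse a JSON body and check that the key lists contain only strings."""
    body = json.loads(raw)
    task = Task(**body["task"])
    if not all(isinstance(key, str) for key in task.data_in + task.data_out):
        raise ValueError("data_in and data_out must be lists of S3 object keys")
    return ServiceTask(
        task=task,
        s3_infos=body["s3_infos"],
        callback_url=body["callback_url"],
    )


# Hypothetical example body; every value below is a placeholder.
EXAMPLE = json.dumps({
    "task": {"data_in": ["input-1.png"], "data_out": [], "status": "pending"},
    "s3_infos": {"bucket": "example-bucket"},
    "callback_url": "http://client.example/tasks/42",
})

service_task = parse_service_task(EXAMPLE)
```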
Task Output¶
Once the task is computed, the service must PATCH the task on /tasks/{task_id} with the following model:
The data_out field is a list of S3 object keys. The status field is a string that can be one of the following values:
A JSON representation would look like this:
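The PATCH request described above could be prepared with the standard library as in the sketch below. It is an illustration under assumptions: `build_task_update`, the `base_url` parameter and the example URL and keys are hypothetical, and the body carries only the two fields named in this section. The `finished` status is the value used in the sequence diagram.

```python
import json
import urllib.request


def build_task_update(
    base_url: str, task_id: str, data_out: list, status: str = "finished"
) -> urllib.request.Request:
    """Prepare (but do not send) the PATCH request for a computed task."""
    payload = json.dumps({"status": status, "data_out": data_out})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/tasks/{task_id}",
        data=payload.encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder URL and object key, for illustration only.
request = build_task_update("http://client.example/", "42", ["42/result-0.png"])
```

Passing `request` to `urllib.request.urlopen` would send it; that call is left out so the example has no network dependency.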
Register to the Core Engine¶
To register the service to the Core Engine, the service must send a POST request to the Core Engine /services endpoint with the following model:
The data_in_fields and data_out_fields fields are lists of FieldDescription models. A FieldDescription model is defined as follows:
The url field is the URL of the service.
A JSON representation would look like this:
After the service is registered, it will be available on the Core Engine's /service-slug endpoint.
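The registration request can be sketched in the same way. The helper `build_registration`, the example URLs and the `name` and `type` keys inside each field description are hypothetical; this page only confirms the `/services` endpoint and the `url`, `data_in_fields` and `data_out_fields` fields, so the real model may require more.

```python
import json
import urllib.request


def build_registration(
    core_engine_url: str,
    service_url: str,
    data_in_fields: list,
    data_out_fields: list,
) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that registers a service."""
    payload = {
        "url": service_url,
        "data_in_fields": data_in_fields,
        "data_out_fields": data_out_fields,
    }
    return urllib.request.Request(
        f"{core_engine_url.rstrip('/')}/services",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# The keys of each field description are assumptions, not the actual model.
registration = build_registration(
    "http://core-engine.example",
    "http://my-service.example",
    data_in_fields=[{"name": "image", "type": "image/png"}],
    data_out_fields=[{"name": "result", "type": "image/png"}],
)
```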