Cosmian AI

Cosmian AI runs queries against an AI runner (summarization, translation, question answering, RAG) in a secured workflow.
Deploy Cosmian VM AI
Follow the instructions from the deployment guide.
Hardware optimization

When running on Intel Xeon processors, the application can leverage AMX (Advanced Matrix Extensions) to significantly enhance performance, especially for the matrix-intensive operations common in machine learning workloads. To enable this feature, set the `use_amx` key to `true` in your configuration file.
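As an illustration only (the actual configuration file format is defined by your deployment), enabling AMX might look like:

```yaml
# Hypothetical configuration fragment: enable Intel AMX acceleration
use_amx: true
```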
HuggingFace authentication

To use certain Hugging Face models, such as the one required by the Translation endpoint, you must provide a Hugging Face access token. This token should be set in the application's configuration file.
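As an illustration only (the key name and file format below are hypothetical; refer to your deployment's configuration reference), declaring the token might look like:

```yaml
# Hypothetical configuration fragment: Hugging Face access token
# used to download gated models such as the translation model
hf_token: "hf_xxx"
```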
Usage

Cosmian AI Runner exposes the following endpoints:
Summarize text

Summaries are generated with the "facebook/bart-large-cnn" model.
- Endpoint: `/summarize`
- Method: `POST`
- Description: get the summary of a given text, using the model configured in the summary section of the configuration
- Request:
    - Headers: `Content-Type: multipart/form-data`
    - Body:
        - `doc`: text to summarize
- Response: the summary of the submitted text
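As a sketch, the request can be built with Python's standard library; the base URL and sample text below are placeholders to adapt to your deployment:

```python
import urllib.request
import uuid

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"


def multipart_body(fields):
    """Encode plain text form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        f"{value}\r\n"
        for name, value in fields.items()
    ]
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode(), f"multipart/form-data; boundary={boundary}"


body, content_type = multipart_body({"doc": "Long text to summarize"})
request = urllib.request.Request(
    f"{BASE_URL}/summarize",
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)
# urllib.request.urlopen(request) sends the call; the response body holds
# the summary produced by the configured model.
```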
Translate text

Translation uses one of the "Helsinki-NLP/opus-mt" models.
- Endpoint: `/translate`
- Method: `POST`
- Description: get the translation of a given text, using the model configured in the translation section of the configuration
- Request:
    - Headers: `Content-Type: multipart/form-data`
    - Body:
        - `doc`: text to translate
        - `src_lang`: source language
        - `tgt_lang`: target language
- Response: the translated text
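A translation request follows the same multipart shape; the base URL is a placeholder, and the `en`/`fr` codes are sample values whose accepted form depends on the configured opus-mt model:

```python
import urllib.request
import uuid

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"


def multipart_body(fields):
    """Encode plain text form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        f"{value}\r\n"
        for name, value in fields.items()
    ]
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode(), f"multipart/form-data; boundary={boundary}"


body, content_type = multipart_body(
    {"doc": "Hello world", "src_lang": "en", "tgt_lang": "fr"}
)
request = urllib.request.Request(
    f"{BASE_URL}/translate",
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)
# urllib.request.urlopen(request) sends the call; the response body holds
# the translated text.
```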
Predict using text as context
- Endpoint: `/context_predict`
- Method: `POST`
- Description: get a prediction from a model, using the provided text as context
- Request:
    - Headers: `Content-Type: multipart/form-data`
    - Body:
        - `context`: text to use as context for the prediction
        - `query`: query to answer
- Response: the answer to the query, based on the given context
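As a sketch, a context-based prediction request can be built the same way; the base URL, context, and query are placeholder values:

```python
import urllib.request
import uuid

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"


def multipart_body(fields):
    """Encode plain text form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        f"{value}\r\n"
        for name, value in fields.items()
    ]
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode(), f"multipart/form-data; boundary={boundary}"


body, content_type = multipart_body(
    {
        "context": "Cosmian AI runs queries in a secured workflow.",
        "query": "What does Cosmian AI do?",
    }
)
request = urllib.request.Request(
    f"{BASE_URL}/context_predict",
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)
# urllib.request.urlopen(request) sends the call; the response body holds
# the answer derived from the given context.
```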
Predict using RAG
- Endpoint: `/rag_predict`
- Method: `POST`
- Description: get a prediction from a model, using RAG and the configured documentary basis
- Request:
    - Headers: `Content-Type: multipart/form-data`
    - Body:
        - `db`: documentary basis to use for the prediction
        - `query`: query to answer
- Response: the answer to the query, based on the selected documentary basis
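A RAG prediction request follows the same shape; the base URL is a placeholder and `my_base` is a hypothetical documentary basis name from your configuration:

```python
import urllib.request
import uuid

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"


def multipart_body(fields):
    """Encode plain text form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        f"{value}\r\n"
        for name, value in fields.items()
    ]
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode(), f"multipart/form-data; boundary={boundary}"


# "my_base" is an illustrative documentary basis name.
body, content_type = multipart_body(
    {"db": "my_base", "query": "What does Cosmian AI do?"}
)
request = urllib.request.Request(
    f"{BASE_URL}/rag_predict",
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)
# urllib.request.urlopen(request) sends the call; the response body holds
# the answer built from the documentary basis.
```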
You can list the available documentary bases and their uploaded references from the current configuration using:

- Endpoint: `/documentary_bases`
- Method: `GET`
- Response: the configured documentary bases and their uploaded references
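This listing call is a plain GET; as a sketch, with a placeholder base URL:

```python
import urllib.request

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"

request = urllib.request.Request(f"{BASE_URL}/documentary_bases", method="GET")
# urllib.request.urlopen(request).read() returns the configured documentary
# bases together with their uploaded references.
```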
Manage references
You can add an `.epub` document, a `.docx` document, or a PDF to the vector DB of the RAG associated with a given database, using:

- Endpoint: `/add_reference`
- Method: `POST`
- Request:
    - File sent as multipart
    - Body:
        - `db`: database to insert the reference into
        - `reference`: reference to insert

So far, only `.epub`, `.pdf`, and `.docx` files can be handled.
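As a sketch of the multipart upload, assuming the file travels in the `reference` part (the exact field layout may differ in your deployment; `my_base`, `doc.pdf`, and the file bytes are illustrative):

```python
import urllib.request
import uuid

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"
boundary = uuid.uuid4().hex

# In practice, read the bytes of your .pdf/.epub/.docx file from disk.
file_bytes = b"%PDF-1.4 minimal"

# One plain field ("db") and one file part ("reference"); "my_base" and
# "doc.pdf" are illustrative values.
body = (
    (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="db"\r\n\r\n'
        "my_base\r\n"
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="reference"; filename="doc.pdf"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode()
    + file_bytes
    + f"\r\n--{boundary}--\r\n".encode()
)

request = urllib.request.Request(
    f"{BASE_URL}/add_reference",
    data=body,
    headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    method="POST",
)
# urllib.request.urlopen(request) uploads the document to the vector DB.
```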
You can remove a reference from the vector DB of the RAG associated with a given database, using:

- Endpoint: `/delete_reference`
- Method: `DELETE`
- Request:
    - Body:
        - `db`: database to remove the reference from
        - `reference`: reference to delete
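As a sketch, the deletion request carries the same form fields over a DELETE call; `my_base` and `doc.pdf` are illustrative values:

```python
import urllib.request
import uuid

# Hypothetical base URL; replace with the address of your deployed runner.
BASE_URL = "http://localhost:5000"


def multipart_body(fields):
    """Encode plain text form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        f"{value}\r\n"
        for name, value in fields.items()
    ]
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode(), f"multipart/form-data; boundary={boundary}"


body, content_type = multipart_body({"db": "my_base", "reference": "doc.pdf"})
request = urllib.request.Request(
    f"{BASE_URL}/delete_reference",
    data=body,
    headers={"Content-Type": content_type},
    method="DELETE",
)
# urllib.request.urlopen(request) removes the reference from the vector DB.
```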
Authentication

If your AI Runner is configured to handle authentication, users must be authenticated through your Identity Provider application (set up previously for Client-Side encryption) in order to make requests to Cosmian VM AI. The application should be of the Single Page or Web application type.