Integrate a service into ELG using the python SDK¶
Example with the integration of a NER model from HuggingFace: https://huggingface.co/elastic/distilbert-base-cased-finetuned-conll03-english
0. Set up the environment¶
Before starting, we will set up a new environment. Let’s create a new folder:
mkdir distilbert-ner-en
cd distilbert-ner-en
And now a new environment:
conda create -n python-services-distilbert-ner-en python=3.7
conda activate python-services-distilbert-ner-en
We will also install the packages needed:
pip install torch transformers
1. Have the model running locally¶
The first step is to have the model we want to integrate into ELG
running locally. In our case, we will firstly download the model using
git lfs
:
git lfs install
git clone https://huggingface.co/elastic/distilbert-base-cased-finetuned-conll03-english
And then create a simple python script that run the model (we can call
it use.py
):
from transformers import pipeline
class DistilbertNEREn:
nlp = pipeline("ner", "distilbert-base-cased-finetuned-conll03-english")
def run(self, input_str):
return self.nlp(input_str)
We can have a try using the Python Interpreter:
>>> from use import DistilbertNEREn
>>> model = DistilbertNEREn()
>>> model.run("Albert is from Germany.")
[{'word': 'Albert', 'score': 0.9995226263999939, 'entity': 'B-PER', 'index': 1}, {'word': 'Germany', 'score': 0.9997316598892212, 'entity': 'B-LOC', 'index': 4}]
The model is working well but is not yet ELG compatible.
2. Create an ELG compatible service¶
To create the ELG compatible service, we will use the ELG Python SDK installable through PIP:
pip install elg
Then we need to make our model inherits from the FlaskService
class
of elg
and uses ELG Request
and Response
object. Let’s
create a new python file called elg_service.py
:
from transformers import pipeline
from elg import FlaskService
from elg.model import AnnotationsResponse
class DistilbertNEREn(FlaskService):
nlp = pipeline("ner", "distilbert-base-cased-finetuned-conll03-english")
def convert_outputs(self, outputs, content):
annotations = {}
offset = 0
for output in outputs:
word = output["word"]
score = output["score"]
entity = output["entity"]
start = content.find(word) + offset
end = start + len(word)
content = content[end - offset :]
offset = end
if entity not in annotations.keys():
annotations[entity] = [
{
"start": start,
"end": end,
"features": {
"word": word,
"score": score,
},
}
]
else:
annotations[entity].append(
{
"start": start,
"end": end,
"features": {
"word": word,
"score": score,
},
}
)
return AnnotationsResponse(annotations=annotations)
def process_text(self, content):
outputs = self.nlp(content.content)
return self.convert_outputs(outputs, content.content)
flask_service = DistilbertNEREn("distilbert-ner-en")
app = flask_service.app
We also need to initialize the service at the end of the file and create
the app
variable.
We can now test our service from the Python Interpreter to make sure that everything is working:
>>> from elg_service import DistilbertNEREn
>>> from elg.model import TextRequest
>>> service = DistilbertNEREn("distilbert-ner-en")
>>> request = TextRequest(content="Albert is from Germany.")
>>> response = service.process_text(request)
>>> print(response)
type='annotations' warnings=None features=None annotations={'B-PER': [Annotation(start=0, end=6, sourceStart=None, sourceEnd=None, features={'word': 'Albert', 'score': 0.9995226263999939})], 'B-LOC': [Annotation(start=15, end=22, sourceStart=None, sourceEnd=None, features={'word': 'Germany', 'score': 0.9997316598892212})]}
>>> print(type(response))
<class 'elg.model.response.AnnotationsResponse.AnnotationsResponse'>
Our service is using an ELG Request
object as input and an ELG
Response
object as output. It is therefore ELG compatible.
3. Generate the Docker image¶
To generate the Docker image, the first step is to create a Dockerfile.
To simplify the process, you can use the elg
CLI and run:
elg docker create --path ./elg_service.py --classname DistilbertNEREn --required_folders distilbert-base-cased-finetuned-conll03-english --requirements torch --requirements transformers
It will generate the Dockerfile but also a file called
docker_entrypoint.sh
used to start the service inside the container,
and a requirements.txt
file.
We can now build the Docker image (here you need to replace with the name of your docker registry):
docker build -t airklizz/distilbert-ner-en:v1 .
It will generate a Docker image locally that we can test using the ELG Python SDK as follows:
>>> from elg import Service
>>> service = Service.from_docker_image("airklizz/distilbert-ner-en:v1", "http://localhost:8000/process", 8181)
# at that step, we will be asked to run the docker image locally using the command: `docker run -p 127.0.0.1:8181:8000 airklizz/distilbert-ner-en:v1`
>>> response = service("Albert is from Germany.", sync_mode=True)
>>> print(response)
type='annotations' warnings=None features=None annotations={'B-PER': [Annotation(start=0, end=6, sourceStart=None, sourceEnd=None, features={'word': 'Albert', 'score': 0.9995226263999939})], 'B-LOC': [Annotation(start=15, end=22, sourceStart=None, sourceEnd=None, features={'word': 'Germany', 'score': 0.9997316598892212})]}
The image is working and is compatible with ELG so we can push it on Docker Hub (or the Docker registry of you choice):
docker push airklizz/distilbert-ner-en:v1
4. Create a new service on ELG¶
You have to go to the ELG, have a provider account and then you can add a new ‘Service or Tool’.
You have to fill all the information needed.
And then save and create the item. You will be redirect to the landing page of the service.
When the landing page looks good to you, you can submit the service for publication through the ‘Actions’ button.
At that step, the ELG technical team will deploy the Docker image of your service into the ELG Kubernetes cluster. Once it done, the service is tested and published.
The service is now publicly available on ELG.
And can be run through the UI:
or through the Python SDK (you need to find the id of the service, here: 6366):
from elg import Service
service = Service.from_id(6366)
response = service('Albert is from Germany.')
print(response)
# type='annotations' warnings=None features=None annotations={'B-PER': [Annotation(start=0, end=6, sourceStart=None, sourceEnd=None, features={'word': 'Albert', 'score': 0.9995226263999939})], 'B-LOC': [Annotation(start=15, end=22, sourceStart=None, sourceEnd=None, features={'word': 'Germany', 'score': 0.9997316598892212})]}