# huggingface_inference
Run inference on Hugging Face models via the Inference API.
## Overview
This step provides access to thousands of pre-trained models hosted on Hugging Face for tasks such as text classification, named entity recognition, summarization, translation, question answering, and text generation. You can use any public model from the Hugging Face Hub without managing infrastructure: specify the model ID, provide your API token, and configure task-specific parameters. The step handles the API communication and returns parsed results, making it ideal for ML tasks that don't warrant deploying your own models.
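As a rough mental model of what the step does per event, the sketch below builds the request it would send. The function name and the endpoint layout are illustrative assumptions based on the configuration table further down, not the step's actual internals.

```python
# Hypothetical sketch of one Inference API request (illustrative names;
# the hosted-endpoint URL pattern is an assumption).

def build_request(model: str, token: str, payload, payload_field: str = "inputs"):
    """Return (url, headers, body) for a single inference call."""
    # Plain model IDs map to hosted endpoints; http(s) values are used as-is.
    if model.startswith("http"):
        url = model
    else:
        url = f"https://api-inference.huggingface.co/models/{model}"
    headers = {
        "Authorization": f"Bearer {token}",  # token sent as a Bearer header
        "Content-Type": "application/json",
    }
    body = {payload_field: payload}  # payload wrapped under 'inputs' by default
    return url, headers, body
```

The returned triple is what an HTTP client would post; the step itself performs the request and parses the response for you.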
Setup:

1. Create a Hugging Face account at https://huggingface.co/
2. Generate an access token at https://huggingface.co/settings/tokens
3. Choose a model from https://huggingface.co/models (ensure it has an Inference API)
4. Store your token securely (e.g., as an environment variable: HUGGINGFACE_TOKEN)
API Token: Required for private models. Get from https://huggingface.co/settings/tokens
## Examples
### Sentiment analysis

Classify text sentiment (positive/negative) using DistilBERT:

```yaml
type: huggingface_inference
model: distilbert-base-uncased-finetuned-sst-2-english
api_token: ${env:HUGGINGFACE_TOKEN}
input_from: review.text
output_to: review.sentiment
```
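The classification result stored under `review.sentiment` can then be post-processed downstream. The helper below is a hedged sketch; the nested-list response shape with `label`/`score` fields is an assumption about what a text-classification model returns for a single input.

```python
# Illustrative parsing of a text-classification response (shape assumed).

def top_sentiment(response):
    """Pick the highest-scoring label from a classification response."""
    # Single-input responses are assumed to arrive as a nested list.
    scores = response[0] if response and isinstance(response[0], list) else response
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]
```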
### Text summarization

Generate concise summaries of long documents:

```yaml
type: huggingface_inference
model: facebook/bart-large-cnn
api_token: ${env:HUGGINGFACE_TOKEN}
input_from: article.full_text
output_to: article.summary
task_params:
  max_length: 150
  min_length: 50
timeout: 20
```
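Per the configuration table below, `task_params` entries are merged into the request body alongside the payload field. A minimal sketch of that merge, with a hypothetical helper name:

```python
# Sketch of merging task_params top-level into the request body, as the
# configuration describes (illustrative helper, not the step's internals).

def build_body(payload, payload_field="inputs", task_params=None):
    body = {payload_field: payload}
    body.update(task_params or {})  # e.g. max_length, min_length
    return body
```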
### Named entity recognition

Extract people, organizations, and locations from text:

```yaml
type: huggingface_inference
model: dslim/bert-base-NER
api_token: ${env:HUGGINGFACE_TOKEN}
input_from: document.content
output_to: document.entities
timeout: 15
```
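Downstream steps can filter the extracted entities by type. The `entity_group`/`word` fields used here are an assumption about the NER model's output schema, so treat this as a sketch:

```python
# Illustrative post-processing of an NER response (field names assumed).

def entities_of_type(entities, group):
    """Return the words tagged with a given entity group (e.g. 'ORG')."""
    return [e["word"] for e in entities if e.get("entity_group") == group]
```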
### Question answering

Answer questions based on provided context:

```yaml
type: huggingface_inference
model: deepset/roberta-base-squad2
api_token: ${env:HUGGINGFACE_TOKEN}
input_from: qa_pair
output_to: answer
task_params:
  question: ${qa_pair.question}
  context: ${qa_pair.context}
```
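This example relies on dot paths: `input_from: qa_pair` selects a nested dictionary from the event, and `${qa_pair.question}` interpolates one of its fields. A minimal dot-path lookup, assuming paths resolve nested keys the way these examples suggest (hypothetical helper):

```python
# Minimal dot-path resolution sketch for input_from-style selectors.

def get_path(event: dict, path: str):
    """Resolve a path like 'qa_pair.question' against a nested event dict."""
    value = event
    for key in path.split("."):
        value = value[key]
    return value
```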
## Configuration
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (for example 'bert-base-uncased'). Values starting with http(s) are treated as full endpoint URLs. |
| api_token | string | Yes | Hugging Face API token, sent as a Bearer Authorization header. |
| input_from | string | No | Dot path selecting the payload to send. When omitted, the entire event dictionary is posted. |
| input_key | string | No | DEPRECATED: use 'input_from' instead. Dot path selecting the payload. |
| output_to | string | No | Event key that receives the parsed response payload. Default: "huggingface" |
| output_key | string | No | DEPRECATED: use 'output_to' instead. Event key for the response. |
| payload_field | string | No | JSON key used to wrap the payload (defaults to 'inputs' to match Hugging Face conventions). Default: "inputs" |
| raw_on_error | boolean | No | When true, store the raw response body under '<output_to>_raw' if JSON parsing fails. Default: true |
| swallow_on_error | boolean | No | If true, skip injecting error details and return the original event on failures. Default: false |
| timeout | integer | No | Request timeout in seconds for the inference call. Default: 10 |
| extra_headers | object | No | Additional HTTP headers merged with the defaults for each request (Authorization, Content-Type, Accept, User-Agent). |
| task_params | object | No | Additional top-level parameters (for example temperature) merged into the request body alongside the payload field. |
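The `raw_on_error` behavior can be pictured as a JSON-parse fallback. This is a hedged sketch of the logic the table describes; the function name is illustrative:

```python
# Sketch of the raw_on_error fallback: if the response body is not valid
# JSON, keep the raw text under '<output_to>_raw' (illustrative only).
import json

def store_response(event, body_text, output_to="huggingface", raw_on_error=True):
    try:
        event[output_to] = json.loads(body_text)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        if raw_on_error:
            event[f"{output_to}_raw"] = body_text
    return event
```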
## Base Configuration
These configuration options are available on all steps:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | string | null | Optional name for this step (for documentation and debugging) |
| description | string | null | Optional description of what this step does |
| retries | integer | 0 | Number of retry attempts (0-10) |
| backoff_seconds | number | 0 | Backoff (seconds) applied between retry attempts |
| retry_propagate | boolean | false | If true, raise the last exception after exhausting retries; otherwise swallow it. |
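The retry options above compose roughly as follows. This is a hedged sketch of the semantics the table describes (fixed backoff between attempts, optional propagation of the last error), not the framework's actual scheduler:

```python
# Illustrative retry loop matching retries / backoff_seconds / retry_propagate.
import time

def run_with_retries(fn, retries=0, backoff_seconds=0.0, retry_propagate=False):
    last_exc = None
    for attempt in range(retries + 1):  # initial attempt plus `retries` retries
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if attempt < retries and backoff_seconds:
                time.sleep(backoff_seconds)  # fixed pause between attempts
    if retry_propagate and last_exc is not None:
        raise last_exc
    return None  # swallow the failure after exhausting retries
```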