Running and Testing AutoML Services

This guide shows how to start each ALFIE service using uvicorn directly from the shell, and how to test them with curl.

  • This page lists the services that currently exist and the options available for each
  • Note that this may change in the future
  • IMPORTANT: The localhost URL may change in the future and is not loaded automatically from the .env file here, so if something doesn't work, check that first

Prerequisites

  • You have completed the setup
  • You have the Azure keys (required for the AutoML Plus tasks)
  • AutoDW is running (see AutoDW Setup)

Killing a Service on a Port

If a port is stuck, kill any process using it:

lsof -ti tcp:8000 | xargs kill -9

Replace 8000 with the relevant service port.
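If you do this often, a small wrapper saves some typing. This is a sketch assuming bash and lsof are available; the `free_port` name is made up:

```shell
# Free a TCP port before (re)starting a service on it.
# No-op if nothing is listening on the port.
free_port() {
  local pids
  pids=$(lsof -ti tcp:"$1" 2>/dev/null)
  if [ -n "$pids" ]; then
    echo "$pids" | xargs kill -9
  fi
}

free_port 8001   # e.g. before restarting the tabular service
```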


Services Overview (ports may change based on the .env file)

To run the services locally, start each service you want to test in its own shell, substituting the uvicorn target and port from the table below:

uv run uvicorn app.x.main:app --reload --host 0.0.0.0 --port 800x
| Service     | Port | Uvicorn Target              | Description                             |
| ----------- | ---- | --------------------------- | --------------------------------------- |
| webfromfile | 8003 | app.automlplus.main:app     | Website accessibility (HTML file input) |
| webfromurl  | 8003 | app.automlplus.main:app     | Website accessibility (URL input)       |
| im2web      | 8003 | app.automlplus.main:app     | Image-to-Website tool                   |
| tabular     | 8001 | app.tabular_automl.main:app | AutoML for tabular datasets             |
| vision      | 8002 | app.vision_automl.main:app  | AutoML for vision datasets              |
| AutoDW      | 8000 | autodw service              | AutoDW                                  |
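The table can also be turned into a quick launcher. This is a dry-run sketch (the `launch_cmd` helper is made up): it only prints the commands from the table; drop the `echo` to actually start the services, though one shell per service as described above is usually easier for watching logs.

```shell
# Dry run: print the launch command for each service from the table above.
# Remove the echo to actually start them in the background.
launch_cmd() {
  echo "uv run uvicorn $1 --reload --host 0.0.0.0 --port $2"
}

launch_cmd app.automlplus.main:app     8003
launch_cmd app.tabular_automl.main:app 8001
launch_cmd app.vision_automl.main:app  8002
```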

Testing Services with Curl

Open another shell and run the tests.

AutoML Plus

Bits and bobs that are not really "AutoML" but use AI models for a specific use case

Test: Website Accessibility (HTML file input)

  • Tests the accessibility of a website given an HTML file
  • The only option is the HTML file to upload

curl -sN -X POST http://localhost:8003/automlplus/web_access/analyze/ \
  -H "Content-Type: multipart/form-data" \
  -F "file=@./sample_data/test.html"

Test: Website Accessibility (URL input)

  • Tests the accessibility of a website given a URL (the service downloads the HTML/CSS)
  • The only option is the URL to analyze

curl -s -X POST http://localhost:8003/automlplus/web_access/analyze/ \
  -H "Content-Type: multipart/form-data" \
  -F "url=https://alfie-project.eu"
  # Optionally add: -F "extra_file_input=@./sample_data/wcag_guidelines.txt"

Test: Image-to-Website Tool

  • Upload an image of a website and ask the engine to recreate it; the service generates a matching website
  • Options are the prompt and the image file

curl -sN -X POST http://localhost:8003/automlplus/image_tools/run_on_image_stream/ \
  -H "Content-Type: multipart/form-data" \
  -F "prompt=Recreate this image into a website with HTML/CSS/JS and explain how to run it." \
  -F "image_file=@./sample_data/websample.png"

Test: AutoML Tabular

  • Note: This requires AutoDW to be running alongside; without it, nothing will work. This test demonstrates the workflow for running a tabular AutoML job using the FastAPI backend integrated with AutoDW.
  • Supports tabular datasets fetched directly from AutoDW. Requires:
      • user_id (AutoDW user identifier)
      • dataset_id (AutoDW dataset identifier)
      • dataset_version (AutoDW dataset version identifier)
      • target_column_name
      • task_type (classification, regression, time_series)
      • time_budget in seconds
      • task_id (AutoDW task identifier) [will eventually be deprecated]

Supported dataset formats: CSV, TSV, Parquet

The entire process is handled by a single endpoint: POST /automl_tabular/best_model/

Example cURL Command

curl -s -X POST "http://localhost:8001/automl_tabular/best_model/" \
  -H "Content-Type: multipart/form-data" \
  -F "user_id=101" \
  -F "dataset_id=55" \
  -F "task_id=1" \
  -F "target_column_name=signature" \
  -F "task_type=classification" \
  -F "time_stamp_column_name=" \
  -F "time_budget=30"

What This Does

  • Fetches dataset metadata from AutoDW
  • Downloads the dataset file
  • Validates:
      • Dataset structure
      • Target column
      • Timestamp column (if time-series)
      • Task type
  • Performs AutoML training within the time budget
  • Serializes and uploads:
      • The best model (the entire folder generated by AutoGluon, uploaded as a zip file)
      • Leaderboard as JSON + Markdown

Successful Response Example

{
  "message": "AutoML training completed successfully and model uploaded to AutoDW",
  "leaderboard": "| model | score | ... |"
}

Error Handling

  • 400 – Validation Errors
      • Target column missing
      • Unsupported file format
      • Invalid task type
      • Timestamp column missing for time-series
  • 502 – AutoDW Communication Failure
      • Metadata request fails
      • File download fails
  • 500 – Unexpected Failures
      • Training crashes
      • Serialization issues
      • Unexpected runtime errors
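In scripts, these status codes can be told apart by capturing curl's HTTP status with `-w "%{http_code}"`. A hedged sketch: the mapping mirrors the list above, and the `classify_status` helper name is made up:

```shell
# Map the tabular endpoint's documented status codes to a short diagnosis.
classify_status() {
  case "$1" in
    200) echo "success" ;;
    400) echo "validation error (check target column, file format, task type)" ;;
    502) echo "AutoDW communication failure" ;;
    500) echo "unexpected server failure" ;;
    *)   echo "unhandled status $1" ;;
  esac
}

# Usage with the tabular endpoint:
# status=$(curl -s -o out.json -w "%{http_code}" -X POST \
#   "http://localhost:8001/automl_tabular/best_model/" ... )
# classify_status "$status"
```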

Notes

  • Only csv, tsv, and parquet formats are supported.
  • All data is handled in a temporary directory isolated per request.
  • The AutoML leaderboard is uploaded to AutoDW as both JSON and Markdown.
  • The uploaded model includes metadata such as:
      • model_type
      • training_dataset
      • framework
      • description

AutoML Vision

The AutoML Vision pipeline supports end-to-end image classification training using a CSV file for labels and a ZIP archive containing images. The system automatically handles train/validation/test splitting, model selection, training, and best-model selection within a given time budget.


Key Features

  • Fully functional AutoML pipeline for vision classification
  • Automatic train / validation / test split
  • Session-based workflow (input collection → training)
  • Configurable time budget and model size
  • Designed to integrate with AutoDW once full dataset APIs are available

Input Requirements

1. Images ZIP (images_zip)

Images must be provided in a ZIP archive with the following structure:

images.zip
└── main_folder/
    ├── category1/
    │   ├── image1.png
    │   └── image2.jpg
    ├── category2/
    │   └── image3.jpeg
    └── metadata.csv
  • Folder names may correspond to labels, but labels are ultimately taken from the CSV
  • This format is currently required and may later be replaced by a standardized format (e.g. Croissant)

2. CSV File (csv_file)

The CSV must contain exactly two required columns:

| filename   | label |
| ---------- | ----- |
| image1.png | cat   |
| image2.png | dog   |
  • filename: image filename only (no paths)
  • label: class label
  • Filenames must match the image files in the ZIP
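The two requirements above (exactly two columns, filenames without paths) can be sanity-checked before uploading. A sketch of plausible checks, not an official validator; it assumes a header row `filename,label` and the made-up helper name `check_labels_csv`:

```shell
# Create a tiny example CSV matching the table above.
printf 'filename,label\nimage1.png,cat\nimage2.png,dog\n' > labels.csv

# Report rows with the wrong column count or a path in the filename column.
check_labels_csv() {
  awk -F, 'NR == 1  { next }                                         # skip header
           NF != 2  { print "bad column count on line " NR; bad = 1 }
           $1 ~ /\// { print "path in filename on line " NR ": " $1; bad = 1 }
           END { exit bad }' "$1"
}

check_labels_csv labels.csv && echo "labels.csv looks OK"
# → labels.csv looks OK
```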

3. Additional Parameters

| Parameter       | Description                        |
| --------------- | ---------------------------------- |
| filename_column | Name of the filename column in CSV |
| label_column    | Name of the label column in CSV    |
| task_type       | Currently supports classification  |
| time_budget     | Training time budget (seconds)     |
| model_size      | One of small, medium, large        |

Model Size Mapping

  • small: ≤ 50M parameters
  • medium: ≤ 200M parameters
  • large: > 200M parameters

API Workflow

The AutoML Vision pipeline uses a two-step session-based workflow.

Step 1: Start a Vision AutoML Session

Uploads data, validates inputs, and initializes a training session.

curl -s -X POST http://localhost:8002/automl_vision/get_user_input/ \
  -H "Content-Type: multipart/form-data" \
  -F "images_zip=@./sample_data/images.zip" \
  -F "csv_file=@./sample_data/labels.csv" \
  -F "filename_column=filename" \
  -F "label_column=label" \
  -F "task_type=classification" \
  -F "time_budget=10" \
  -F "model_size=small"

Step 2: Train and Find the Best Model

Triggers training using the previously created session.

curl -s -X POST http://localhost:8002/automl_vision/find_best_model/ \
  -H "Content-Type: application/json" \
  -d '{"session_id": "REPLACE_WITH_SESSION_ID"}'
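The two steps can be chained in a script by pulling the session id out of the Step 1 response. A sketch: the `session_id` field name in the response is an assumption inferred from the Step 2 request body, and `extract_session_id` is a made-up helper:

```shell
# Pull the session id out of a Step 1 JSON response (assumed field name).
extract_session_id() {
  sed -n 's/.*"session_id": *"\([^"]*\)".*/\1/p'
}

# Mocked Step 1 response for illustration:
resp='{"session_id": "abc-123"}'
sid=$(echo "$resp" | extract_session_id)
echo "{\"session_id\": \"$sid\"}"   # payload for Step 2

# Real usage:
# sid=$(curl -s -X POST http://localhost:8002/automl_vision/get_user_input/ ... | extract_session_id)
# curl -s -X POST http://localhost:8002/automl_vision/find_best_model/ \
#   -H "Content-Type: application/json" -d "{\"session_id\": \"$sid\"}"
```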

What Happens Internally

  • Dataset is split into train / validation / test
  • Candidate vision models are trained within the time budget
  • Best-performing model is selected automatically
  • Metrics and training artifacts are produced
  • Best model will be uploaded to AutoDW

Notes & Current Limitations

  • Image ZIP structure is currently strict