Running and Testing AutoML Services

This guide shows how to start each ALFIE service using uvicorn directly from the shell, and how to test them with curl.

  • This page lists the services that currently exist and the options available for each
  • Note that this may change in the future
  • IMPORTANT: The localhost URL may change in the future and is not loaded automatically from the .env file here, so if something doesn't work, check that first

Prerequisites

  • You have completed the setup
  • You have the Azure keys (required for the AutoML Plus tasks)
  • AutoDW is running (see AutoDW Setup)

Killing a Service on a Port

If a port is stuck, kill any process using it:

lsof -ti tcp:8000 | xargs kill -9

Replace 8000 with the relevant service port.
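If you do this often, a small wrapper saves some typing. This is a sketch assuming bash and lsof are available; the `free_port` name is made up:

```shell
# Free a TCP port before (re)starting a service on it.
# No-op if nothing is listening on the port.
free_port() {
  local pids
  pids=$(lsof -ti tcp:"$1" 2>/dev/null)
  if [ -n "$pids" ]; then
    echo "$pids" | xargs kill -9
  fi
}

free_port 8001   # e.g. before restarting the tabular service
```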


Services Overview (ports may change based on the .env file)

To run the services locally, start each service you want to test in its own shell, substituting the uvicorn target and port from the table below:

uv run uvicorn app.x.main:app --reload --host 0.0.0.0 --port 800x
| Service     | Port | Uvicorn Target              | Description                             |
| ----------- | ---- | --------------------------- | --------------------------------------- |
| webfromfile | 8003 | app.automlplus.main:app     | Website accessibility (HTML file input) |
| webfromurl  | 8003 | app.automlplus.main:app     | Website accessibility (URL input)       |
| im2web      | 8003 | app.automlplus.main:app     | Image-to-Website tool                   |
| tabular     | 8001 | app.tabular_automl.main:app | AutoML for tabular datasets             |
| vision      | 8002 | app.vision_automl.main:app  | AutoML for vision datasets              |
| AutoDW      | 8000 | autodw service              | AutoDW                                  |
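The table can also be turned into a quick launcher. This is a dry-run sketch (the `launch_cmd` helper is made up): it only prints the commands from the table; drop the `echo` to actually start the services, though one shell per service as described above is usually easier for watching logs.

```shell
# Dry run: print the launch command for each service from the table above.
# Remove the echo to actually start them in the background.
launch_cmd() {
  echo "uv run uvicorn $1 --reload --host 0.0.0.0 --port $2"
}

launch_cmd app.automlplus.main:app     8003
launch_cmd app.tabular_automl.main:app 8001
launch_cmd app.vision_automl.main:app  8002
```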

Testing Services with Curl

Open another shell and run the tests.

AutoML Plus

Bits and bobs that are not really "AutoML" but use AI models for a specific use case

Test: Website Accessibility (HTML file input)

  • Tests the accessibility of a website given an HTML file
  • The only option is the HTML file to upload

curl -sN -X POST http://localhost:8003/automlplus/web_access/analyze/ \
  -H "Content-Type: multipart/form-data" \
  -F "file=@./sample_data/test.html"

Test: Website Accessibility (URL input)

  • Tests the accessibility of a website given a URL (the service downloads the HTML/CSS)
  • The only option is the URL to analyze

curl -s -X POST http://localhost:8003/automlplus/web_access/analyze/ \
  -H "Content-Type: multipart/form-data" \
  -F "url=https://alfie-project.eu"
  # Optionally add: -F "extra_file_input=@./sample_data/wcag_guidelines.txt"

Test: Image-to-Website Tool

  • Upload an image of a website and ask the engine to recreate it; the service generates a matching website
  • Options are the prompt and the image file

curl -sN -X POST http://localhost:8003/automlplus/image_tools/run_on_image_stream/ \
  -H "Content-Type: multipart/form-data" \
  -F "prompt=Recreate this image into a website with HTML/CSS/JS and explain how to run it." \
  -F "image_file=@./sample_data/websample.png"

Test: AutoML Tabular

  • Note: This requires AutoDW to be running alongside; without it, nothing will work. This test demonstrates the workflow for running a tabular AutoML job using the FastAPI backend integrated with AutoDW.
  • Supports tabular datasets fetched directly from AutoDW. Requires:
      • user_id (AutoDW user identifier)
      • dataset_id (AutoDW dataset identifier)
      • dataset_version (AutoDW dataset version identifier)
      • target_column_name
      • task_type (classification, regression, time_series)
      • time_budget in seconds
      • task_id (AutoDW task identifier) [will eventually be deprecated]

Supported dataset formats: CSV, TSV, Parquet

The entire process is handled by a single endpoint: POST /automl_tabular/best_model/

Example cURL Command

curl -s -X POST "http://localhost:8001/automl_tabular/best_model/" \
  -H "Content-Type: multipart/form-data" \
  -F "user_id=101" \
  -F "dataset_id=55" \
  -F "task_id=1" \
  -F "target_column_name=signature" \
  -F "task_type=classification" \
  -F "time_stamp_column_name=" \
  -F "time_budget=30"

What This Does

  • Fetches dataset metadata from AutoDW
  • Downloads the dataset file
  • Validates:
      • Dataset structure
      • Target column
      • Timestamp column (if time-series)
      • Task type
  • Performs AutoML training within the time budget
  • Serializes and uploads:
      • The best model (the entire folder generated by AutoGluon, uploaded as a zip file)
      • Leaderboard as JSON + Markdown

Successful Response Example

{
  "message": "AutoML training completed successfully and model uploaded to AutoDW",
  "leaderboard": "| model | score | ... |"
}

Error Handling

  • 400 – Validation Errors
      • Target column missing
      • Unsupported file format
      • Invalid task type
      • Timestamp column missing for time-series
  • 502 – AutoDW Communication Failure
      • Metadata request fails
      • File download fails
  • 500 – Unexpected Failures
      • Training crashes
      • Serialization issues
      • Unexpected runtime errors
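In scripts, these status codes can be told apart by capturing curl's HTTP status with `-w "%{http_code}"`. A hedged sketch: the mapping mirrors the list above, and the `classify_status` helper name is made up:

```shell
# Map the tabular endpoint's documented status codes to a short diagnosis.
classify_status() {
  case "$1" in
    200) echo "success" ;;
    400) echo "validation error (check target column, file format, task type)" ;;
    502) echo "AutoDW communication failure" ;;
    500) echo "unexpected server failure" ;;
    *)   echo "unhandled status $1" ;;
  esac
}

# Usage with the tabular endpoint:
# status=$(curl -s -o out.json -w "%{http_code}" -X POST \
#   "http://localhost:8001/automl_tabular/best_model/" ... )
# classify_status "$status"
```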

Notes

  • Only csv, tsv, and parquet formats are supported.
  • All data is handled in a temporary directory isolated per request.
  • The AutoML leaderboard is uploaded to AutoDW as both JSON and Markdown.
  • The uploaded model includes metadata such as:
      • model_type
      • training_dataset
      • framework
      • description

AutoML Vision

The AutoML Vision pipeline supports end-to-end image classification training using a CSV file for labels and a ZIP archive containing images. The system automatically handles train/validation/test splitting, model selection, training, and best-model selection within a given time budget.


Key Features

  • Fully functional AutoML pipeline for vision classification
  • Automatic train / validation / test split
  • Session-based workflow (input collection → training)
  • Configurable time budget and model size
  • Designed to integrate with AutoDW once full dataset APIs are available

Input Requirements

1. Images ZIP (images_zip)

Images must be provided in a ZIP archive with the following structure:

images.zip
└── main_folder/
    ├── category1/
    │   ├── image1.png
    │   └── image2.jpg
    ├── category2/
    │   └── image3.jpeg
    └── metadata.csv
  • Folder names may correspond to labels, but labels are ultimately taken from the CSV
  • This format is currently required and may later be replaced by a standardized format (e.g. Croissant)

2. CSV File (csv_file)

The CSV must contain exactly two required columns:

| filename   | label |
| ---------- | ----- |
| image1.png | cat   |
| image2.png | dog   |
  • filename: image filename only (no paths)
  • label: class label
  • Filenames must match the image files in the ZIP
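The two requirements above (exactly two columns, filenames without paths) can be sanity-checked before uploading. A sketch of plausible checks, not an official validator; it assumes a header row `filename,label` and the made-up helper name `check_labels_csv`:

```shell
# Create a tiny example CSV matching the table above.
printf 'filename,label\nimage1.png,cat\nimage2.png,dog\n' > labels.csv

# Report rows with the wrong column count or a path in the filename column.
check_labels_csv() {
  awk -F, 'NR == 1  { next }                                         # skip header
           NF != 2  { print "bad column count on line " NR; bad = 1 }
           $1 ~ /\// { print "path in filename on line " NR ": " $1; bad = 1 }
           END { exit bad }' "$1"
}

check_labels_csv labels.csv && echo "labels.csv looks OK"
# → labels.csv looks OK
```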

3. Additional Parameters

| Parameter       | Description                        |
| --------------- | ---------------------------------- |
| filename_column | Name of the filename column in CSV |
| label_column    | Name of the label column in CSV    |
| task_type       | Currently supports classification  |
| time_budget     | Training time budget (seconds)     |
| model_size      | One of small, medium, large        |

Model Size Mapping

  • small: ≤ 50M parameters
  • medium: ≤ 200M parameters
  • large: > 200M parameters

API Workflow

The AutoML Vision pipeline uses a two-step session-based workflow.

Step 1: Start a Vision AutoML Session

Uploads data, validates inputs, and initializes a training session.

curl -s -X POST http://localhost:8002/automl_vision/get_user_input/ \
  -H "Content-Type: multipart/form-data" \
  -F "images_zip=@./sample_data/images.zip" \
  -F "csv_file=@./sample_data/labels.csv" \
  -F "filename_column=filename" \
  -F "label_column=label" \
  -F "task_type=classification" \
  -F "time_budget=10" \
  -F "model_size=small"

Step 2: Train and Find the Best Model

Triggers training using the previously created session.

curl -s -X POST http://localhost:8002/automl_vision/find_best_model/ \
  -H "Content-Type: application/json" \
  -d '{"session_id": "REPLACE_WITH_SESSION_ID"}'
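The two steps can be chained in a script by pulling the session id out of the Step 1 response. A sketch: the `session_id` field name in the response is an assumption inferred from the Step 2 request body, and `extract_session_id` is a made-up helper:

```shell
# Pull the session id out of a Step 1 JSON response (assumed field name).
extract_session_id() {
  sed -n 's/.*"session_id": *"\([^"]*\)".*/\1/p'
}

# Mocked Step 1 response for illustration:
resp='{"session_id": "abc-123"}'
sid=$(echo "$resp" | extract_session_id)
echo "{\"session_id\": \"$sid\"}"   # payload for Step 2

# Real usage:
# sid=$(curl -s -X POST http://localhost:8002/automl_vision/get_user_input/ ... | extract_session_id)
# curl -s -X POST http://localhost:8002/automl_vision/find_best_model/ \
#   -H "Content-Type: application/json" -d "{\"session_id\": \"$sid\"}"
```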

What Happens Internally

  • Dataset is split into train / validation / test
  • Candidate vision models are trained within the time budget
  • Best-performing model is selected automatically
  • Metrics and training artifacts are produced
  • Best model will be uploaded to AutoDW

Notes & Current Limitations

  • Image ZIP structure is currently strict