Running and Testing AutoML Services¶
This guide shows how to start each ALFIE service using uvicorn directly from the shell, and how to test them with curl.
- This page lists the services that currently exist and the options available for each.
- The list may change in the future.
- IMPORTANT: The localhost URL may change in the future and is not loaded automatically from the .env file here, so if something doesn't work, check that first.
Prerequisites¶
- You have completed the setup
- You have the Azure keys (for AutoML Plus tasks)
- AutoDW is running (see AutoDW Setup)
Killing a Service on a Port¶
If a port is stuck, kill any process that is using it. Port 8000 is the AutoDW port; substitute the port of whichever service is stuck.
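Before killing anything, it can help to confirm that the port is actually occupied. The following is a minimal sketch using only the Python standard library; the helper name `port_in_use` is an illustration and not part of ALFIE, and the port numbers are the ones listed on this page.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports taken from the services table on this page; adjust to your .env.
    for port in (8000, 8001, 8002, 8003):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

A port that reports "in use" when no service should be running is a candidate for the kill step.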
Services Overview (This might change based on the .env file)¶
To run the services locally, start each service you want to test in its own shell with uvicorn, substituting the Uvicorn target and port from the table:
| Service | Port | Uvicorn Target | Description |
|---|---|---|---|
| webfromfile | 8003 | app.automlplus.main:app | Website accessibility (HTML file input) |
| webfromurl | 8003 | app.automlplus.main:app | Website accessibility (URL input) |
| im2web | 8003 | app.automlplus.main:app | Image-to-Website tool |
| tabular | 8001 | app.tabular_automl.main:app | AutoML for tabular datasets |
| vision | 8002 | app.vision_automl.main:app | AutoML for vision datasets |
| AutoDW | 8000 | autodw service | AutoDW |
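As a rough sketch, the table can be turned into ready-to-paste uvicorn commands. The targets and ports are copied from the table; the dictionary keys, the helper name `uvicorn_command`, and the choice of `127.0.0.1` as host are assumptions made for illustration. `--host` and `--port` are standard uvicorn command-line options.

```python
import shlex

# Targets and ports copied from the services table; they may differ in your .env.
# The three AutoML Plus tools share one app, so they appear once here.
SERVICES = {
    "automlplus": ("app.automlplus.main:app", 8003),
    "tabular": ("app.tabular_automl.main:app", 8001),
    "vision": ("app.vision_automl.main:app", 8002),
}


def uvicorn_command(service: str, host: str = "127.0.0.1") -> list[str]:
    """Build the argument list that starts one service with uvicorn."""
    target, port = SERVICES[service]
    return ["uvicorn", target, "--host", host, "--port", str(port)]


if __name__ == "__main__":
    # Print one shell command per service; run each in its own shell.
    for name in SERVICES:
        print(shlex.join(uvicorn_command(name)))
```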
Testing Services with Curl¶
Open another shell and run the tests.
AutoML Plus¶
Miscellaneous tools that are not strictly "AutoML" but use AI models for specific use cases.
Test: Website Accessibility (HTML file input)¶
- This tests the accessibility of a website given its HTML file
- The only option is the HTML file to upload
Test: Website Accessibility (URL input)¶
- This tests the accessibility of a website given its URL (the service downloads the HTML/CSS)
- The only option is the URL
Test: Image-to-Website Tool¶
- Given an image of a website and a prompt asking the engine to recreate it, the tool generates a website from the image
- The options are the prompt and the image file
Test: AutoML Tabular¶
- Note: This requires AutoDW to be running alongside; without it nothing will work.
- This test demonstrates the workflow for running a tabular AutoML job using the FastAPI backend integrated with AutoDW.
- Supports tabular datasets fetched directly from AutoDW. Requires:
  - user_id (AutoDW user identifier)
  - dataset_id (AutoDW dataset identifier)
  - dataset_version (AutoDW dataset version)
  - target_column_name
  - task_type (classification, regression, time_series)
  - time_budget in seconds
  - task_id (AutoDW task identifier) [Will be deprecated eventually]
Supported dataset formats: CSV, TSV, Parquet
Trigger AutoML Training + Best Model Search¶
The entire process is handled by a single endpoint: POST /automl_tabular/best_model/
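A request to this endpoint could be assembled as in the sketch below. Only the endpoint path and the field names come from this page. The base URL `http://localhost:8001`, the form encoding, and every field value are placeholder assumptions, so check them against your .env and your AutoDW data before sending anything.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_best_model_request(base_url: str, **fields: object) -> Request:
    """Build (but do not send) a POST request for the tabular AutoML endpoint."""
    # Assumption: the service accepts form-encoded fields.
    body = urlencode(fields).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/automl_tabular/best_model/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    request = build_best_model_request(
        "http://localhost:8001",
        user_id="demo-user",
        dataset_id="demo-dataset",
        dataset_version="1",
        target_column_name="label",
        task_type="classification",
        time_budget=60,
        task_id="demo-task",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` sends it once AutoDW and the tabular service are both running.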
What This Does¶
- Fetches dataset metadata from AutoDW
- Downloads dataset file
- Validates:
  - Dataset structure
  - Target column
  - Timestamp column (if time-series)
  - Task type
- Performs AutoML training within the time budget
- Serializes and uploads:
  - The best model (the entire folder generated by AutoGluon is uploaded as a zip file)
  - Leaderboard as JSON + Markdown
Successful Response Example
{ "message": "AutoML training completed successfully and model uploaded to AutoDW", "leaderboard": "| model | score | ... |" }
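Because the `leaderboard` field arrives as a Markdown table inside a string, a client may want to turn it back into rows. The sketch below is one way to do that with the standard library. The sample table and its model names are invented for illustration, since the real column set is not shown on this page.

```python
def parse_markdown_table(table: str) -> list[dict[str, str]]:
    """Convert a Markdown pipe table into a list of {header: cell} dicts."""
    rows = []
    for line in table.strip().splitlines():
        if not line.strip():
            continue
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the separator row, e.g. |---|---|
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    # Hypothetical leaderboard; real responses may contain more columns.
    sample = "| model | score |\n|---|---|\n| ModelA | 0.91 |\n| ModelB | 0.88 |"
    for entry in parse_markdown_table(sample):
        print(entry["model"], entry["score"])
```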
Error Handling¶
- 400 – Validation Errors
  - Target column missing
  - Unsupported file format
  - Task type invalid
  - Timestamp column missing for time-series
- 502 – AutoDW Communication Failure
  - Metadata request fails
  - File download fails
- 500 – Unexpected Failures
  - Training crashes
  - Serialization issues
  - Unexpected runtime errors
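The 400-level cases can be checked on the client before a request is sent. The sketch below mirrors the four validation errors listed above. It is not the service's actual validation code, and the `timestamp_column` parameter is a hypothetical name, since this page does not say how the timestamp column is specified.

```python
from typing import Optional

SUPPORTED_FORMATS = {"csv", "tsv", "parquet"}
TASK_TYPES = {"classification", "regression", "time_series"}


def validation_errors(
    filename: str,
    columns: list[str],
    target_column_name: str,
    task_type: str,
    timestamp_column: Optional[str] = None,  # hypothetical parameter name
) -> list[str]:
    """Return the problems that would likely trigger a 400 response."""
    errors = []
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension not in SUPPORTED_FORMATS:
        errors.append(f"Unsupported file format: {extension or 'none'}")
    if task_type not in TASK_TYPES:
        errors.append(f"Task type invalid: {task_type}")
    if target_column_name not in columns:
        errors.append(f"Target column missing: {target_column_name}")
    if task_type == "time_series" and timestamp_column not in columns:
        errors.append("Timestamp column missing for time-series")
    return errors


if __name__ == "__main__":
    print(validation_errors("sales.csv", ["date", "units"], "units", "time_series", "date"))
    print(validation_errors("sales.xlsx", ["units"], "price", "ranking"))
```

An empty list means none of the listed 400 conditions apply; 502 and 500 responses depend on AutoDW and the training run, so they cannot be pre-checked this way.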
Notes¶
- Only csv, tsv, and parquet formats are supported.
- All data is handled in a temporary directory isolated per request.
- The AutoML leaderboard is uploaded to AutoDW as both JSON and Markdown.
- The uploaded model includes metadata such as:
  - model_type
  - training_dataset
  - framework
  - description
AutoML Vision¶
The AutoML Vision pipeline supports end-to-end image classification training using a CSV file for labels and a ZIP archive containing images. The system automatically handles train/validation/test splitting, model selection, training, and best-model selection within a given time budget.
Key Features¶
- Fully functional AutoML pipeline for vision classification
- Automatic train / validation / test split
- Session-based workflow (input collection → training)
- Configurable time budget and model size
- Designed to integrate with AutoDW once full dataset APIs are available
Input Requirements¶
1. Images ZIP (images_zip)¶
Images must be provided in a ZIP archive with a fixed folder structure.
- Folder names may correspond to labels, but labels are ultimately taken from the CSV
- This format is currently required and may later be replaced by a standardized format (e.g. Croissant)
2. CSV File (csv_file)¶
The CSV must contain exactly two required columns:
| filename | label |
|---|---|
| image1.png | cat |
| image2.png | dog |
- filename: image filename only (no paths)
- label: class label
- Filenames must match the image files in the ZIP
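A quick pre-upload check can catch CSV rows that point at images missing from the ZIP. The sketch below uses only the standard library and compares base names, because the CSV holds filenames without paths. The function name and the in-memory demo data are illustrative assumptions, not part of the service.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def missing_images(csv_text: str, zip_bytes: bytes, filename_column: str = "filename") -> list[str]:
    """Return CSV filenames that have no matching file inside the ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Ignore directory entries and keep only the base name of each file.
        in_zip = {
            PurePosixPath(name).name
            for name in archive.namelist()
            if not name.endswith("/")
        }
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[filename_column] for row in reader if row[filename_column] not in in_zip]


if __name__ == "__main__":
    # Build a tiny archive in memory so the example runs without real images.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("cat/image1.png", b"not a real image")
    labels = "filename,label\nimage1.png,cat\nimage2.png,dog\n"
    print(missing_images(labels, buffer.getvalue()))
```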
3. Additional Parameters¶
| Parameter | Description |
|---|---|
| filename_column | Name of the filename column in CSV |
| label_column | Name of the label column in CSV |
| task_type | Currently supports classification |
| time_budget | Training time budget (seconds) |
| model_size | One of small, medium, large |
Model Size Mapping¶
- small: ≤ 50M parameters
- medium: ≤ 200M parameters
- large: > 200M parameters
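Read as code, the thresholds amount to the small function below. The thresholds come from the mapping above; the function name is an illustration, and how the service actually counts parameters is not specified here.

```python
def model_size_bucket(num_parameters: int) -> str:
    """Map a parameter count to the model_size value it falls under."""
    if num_parameters <= 50_000_000:
        return "small"
    if num_parameters <= 200_000_000:
        return "medium"
    return "large"


if __name__ == "__main__":
    for count in (25_000_000, 86_000_000, 300_000_000):
        print(count, model_size_bucket(count))
```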
API Workflow¶
The AutoML Vision pipeline uses a two-step session-based workflow.
Step 1: Start a Vision AutoML Session¶
Uploads data, validates inputs, and initializes a training session.
Step 2: Train and Find the Best Model¶
Triggers training using the previously created session.
What Happens Internally¶
- Dataset is split into train / validation / test
- Candidate vision models are trained within the time budget
- Best-performing model is selected automatically
- Metrics and training artifacts are produced
- Best model will be uploaded to AutoDW
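To make the first step concrete, here is a minimal sketch of a shuffled train / validation / test split. The 80/10/10 proportions, the fixed seed, and the function name are assumptions chosen for illustration; this page does not state the ratios or the splitting strategy the pipeline actually uses.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items reproducibly and cut them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical filenames standing in for the rows of the labels CSV.
    filenames = [f"image{i}.png" for i in range(100)]
    train, val, test = split_dataset(filenames)
    print(len(train), len(val), len(test))
```

A plain shuffle like this ignores class balance; a stratified split would be the natural refinement for skewed label distributions.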
Notes & Current Limitations¶
- Image ZIP structure is currently strict