ML Functions
Overview
Machine learning (ML) functions enable trained models to be run directly within SQL queries.
Introduction to Machine Learning
Machine learning is a field of study in artificial intelligence that develops and applies methods for learning patterns from historical data and using those patterns to make predictions or decisions on new data.
Classification
Classification is a supervised learning technique that assigns each input to one of a predefined set of classes or labels.
The following are the types of classification:
- Binary Classification: Predicts one of two possible outcomes (for example, fraud vs. non-fraud).
- Multiclass Classification: Predicts one label from multiple possible categories (for example, product type A, B, or C).
- Multilabel Classification: Assigns multiple labels to a single data point (for example, tagging an image with "beach" and "sunset").
Examples:
- A credit-card transaction classified as “fraudulent” or “legitimate.”
- Customer support tickets categorized as “billing,” “technical issue,” or “account upgrade.”
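To make the difference between the three classification types concrete, the following minimal Python sketch turns model scores into predictions. The scores, label names, and the 0.5 threshold are made-up illustration values and are not the output of any SingleStore function.

```python
def predict_binary(score, threshold=0.5):
    """Binary: a single score decides between two outcomes."""
    return "fraud" if score >= threshold else "non-fraud"


def predict_multiclass(scores):
    """Multiclass: exactly one label, the class with the highest score."""
    return max(scores, key=scores.get)


def predict_multilabel(scores, threshold=0.5):
    """Multilabel: every label whose score clears the threshold."""
    return sorted(label for label, s in scores.items() if s >= threshold)


# Hypothetical scores, for illustration only.
binary_result = predict_binary(0.91)
multiclass_result = predict_multiclass({"A": 0.2, "B": 0.7, "C": 0.1})
multilabel_result = predict_multilabel({"beach": 0.8, "sunset": 0.6, "city": 0.1})

print(binary_result, multiclass_result, multilabel_result)
```

The multiclass case always returns a single label, while the multilabel case can return zero, one, or several labels for the same input.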
ML_CLASSIFY is a supervised machine learning function for classification tasks. It returns predicted class labels.
Anomaly Detection
Anomaly detection identifies data points that deviate significantly from expected patterns. Anomalies can indicate:
- Performance issues (for example, server overload)
- System faults (for example, failed jobs or memory leaks)
- Opportunities (for example, traffic spikes caused by a marketing campaign)
For example, if a cluster's CPU usage normally stays between 20–60% and suddenly rises to 95%, the spike is an anomaly.
Time-Series Anomaly Detection
Time-series anomaly detection analyzes data collected over time.
The following are the types of time-series anomaly detection:
- Supervised: Supervised models use labeled anomalies to learn failure patterns.
- Unsupervised: Unsupervised models learn normal behavior from historical data and flag deviations without labels.
ML_ANOMALY_DETECT is currently an unsupervised time-series anomaly detection function.
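The unsupervised idea of learning normal behavior from history and flagging deviations can be sketched with a trailing-window z-score. This is only an illustration of the concept and not the algorithm the SingleStore function uses. The window size, threshold, and CPU readings are assumed values.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return the indices of points that deviate from the preceding
    `window` points by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            is_anomaly = values[i] != mu
        else:
            is_anomaly = abs(values[i] - mu) / sigma > threshold
        if is_anomaly:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU usage (%) sampled at regular intervals.
cpu_usage = [35, 42, 38, 45, 40, 37, 44, 95, 41, 39]
flagged = detect_anomalies(cpu_usage)
print(flagged)
```

No labels are needed: each point is compared only with the recent history. One limitation of this simple version is that a spike inflates the standard deviation of the windows that follow it, which can mask nearby anomalies.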
Install ML Functions
To install ML Functions, navigate to AI > AI & ML Functions and select the deployment on which to install ML Functions.
Once the ML Functions are installed, query them in the SQL Editor or SingleStore Notebooks.
| Category | Function |
|---|---|
| Statistical and Predictive Functions | ML_CLASSIFY, ML_ANOMALY_DETECT |
Statistical and Predictive Functions
ML_CLASSIFY
Performs binary and multiclass classification on a dataset using standard machine learning algorithms:
- Logistic Regression
- Random Forest
- Gradient Boosting
Syntax
ML_CLASSIFY(model_name, TO_JSON(selected_data.*))
Arguments
- model_name: Name of the trained ML model to use.
- selected_data: A row or set of rows selected for prediction.
Return Type
string
Usage
| Basic usage | |
| Basic usage with | |
| Insert predictions into a table | |
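As a rough illustration of basic usage, the following Python sketch composes a query that follows the documented syntax `ML_CLASSIFY(model_name, TO_JSON(selected_data.*))`. The model name `fraud_model`, the table `transactions`, the alias `t`, and the output column `prediction` are hypothetical. Passing the model name as a quoted string literal is an assumption, so check it against a working query in your deployment.

```python
import re

# Only plain identifiers are accepted, because the names are
# interpolated directly into the SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_classify_query(model_name, table):
    """Compose a SELECT that applies ML_CLASSIFY to every row of `table`."""
    for name in (model_name, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    # Assumption: the model name is passed as a quoted string literal.
    return (
        f"SELECT t.*, ML_CLASSIFY('{model_name}', TO_JSON(t.*)) AS prediction "
        f"FROM {table} AS t"
    )


# Hypothetical model and table names.
query = build_classify_query("fraud_model", "transactions")
print(query)
```

The resulting string can be pasted into the SQL Editor or run from a notebook cell. According to the Return Type above, each row yields a string.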
ML_ANOMALY_DETECT
Detects outliers and anomalies in datasets using statistical or machine learning-based methods.
- Statistical: z-score, interquartile range (IQR)
- ML-based: Isolation Forest, One-Class SVM
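Of the statistical methods listed above, the interquartile range (IQR) rule is the simplest to show in a few lines. The sketch below is a generic textbook version in Python and is not necessarily how the function implements it. The multiplier 1.5 is the conventional default, and the readings are made up.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical readings with one obvious outlier.
readings = [35, 42, 38, 45, 40, 37, 44, 95, 41, 39]
outliers = iqr_outliers(readings)
print(outliers)
```

Because quartiles are barely affected by a single extreme value, the IQR rule is more robust than a z-score when the data already contains outliers.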
Syntax
ML_ANOMALY_DETECT(model_name, TO_JSON(selected_data.*))
Arguments
- model_name: Name of the trained ML model to use.
- selected_data: A row or set of rows selected for prediction.
Return Type
string
Usage
| Basic usage | |
| Basic usage with | |
| Insert predictions into a table | |
Train a New ML Model
To train a new ML model, follow these steps:
- Navigate to AI > Models.
- Select the ML Models tab and then select Train New ML Model.
- In the Select Function dialog, select one of the following ML functions:
  - ML_CLASSIFY
  - ML_ANOMALY_DETECT
  Select Next to configure the model.
- Configure Model
| Model Name | Enter the name of the ML model. |
| Training Description | Enter the training description. |
| Workspace | Select the SingleStore deployment (workspace) the notebook connects to. Specifying a workspace allows natively connecting the SingleStore databases referenced in the notebook. |
| Compute Size | Select one of the following compute sizes: |
| Run as | Run the notebook for training a model with or without personal credentials. |
Select Next.
Select Training Data
| Database | Select the database that contains the training data. |
| Table | Select the table from the selected database to train the machine learning model. |
| Target Column | Select the column that represents the prediction target for the model. |
| Feature Selection Mode | Specify how feature columns are selected. |
| Feature Column | Select one or more columns to be used as input features for training the model. |
Preview the data and select Next.
Review the Summary and generated Fusion SQL syntax in the Generated SQL Script.
- Creates and trains an ML model
- Uses data from the selected table in the selected database
- Predicts values of the target column status
- Runs on the selected compute instance
- Uses all available features by default
The following is the syntax of the Fusion SQL script:
%s2ml train <machine_learning_algorithm> --model <model_name> --db <database_name> --input_table <table_name> --target_column <target_column> --description <training_description> --runtime <compute_instance> --selected_features { \"mode\": <feature_selection_mode>, \"features\": <feature_column> }
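To show how the placeholders in the template above map to concrete values, the following Python sketch fills them in and prints the resulting command. Every value used here (`classification`, `fraud_model`, `sales`, `transactions`, `status`, `fraud_v1`, `small`, `manual`, and the feature names) is hypothetical. Passing the features as a JSON list is also an assumption, and the source does not describe quoting rules for values that contain spaces.

```python
import json


def build_train_command(algorithm, model, db, input_table, target_column,
                        description, runtime, feature_mode, features):
    """Fill the %s2ml train template with concrete values."""
    selected = json.dumps({"mode": feature_mode, "features": features})
    # Mirror the escaped quotes (\") shown in the template.
    selected = selected.replace('"', '\\"')
    return " ".join([
        f"%s2ml train {algorithm}",
        f"--model {model}",
        f"--db {db}",
        f"--input_table {input_table}",
        f"--target_column {target_column}",
        f"--description {description}",
        f"--runtime {runtime}",
        f"--selected_features {selected}",
    ])


# All argument values below are hypothetical placeholders.
command = build_train_command(
    algorithm="classification",
    model="fraud_model",
    db="sales",
    input_table="transactions",
    target_column="status",
    description="fraud_v1",
    runtime="small",
    feature_mode="manual",
    features=["amount", "merchant"],
)
print(command)
```

In practice the UI generates this script, so a helper like this is useful mainly for understanding which wizard field feeds which flag.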
Select Start Training to train the ML model.
Manage an Existing ML Model
Existing ML models can be managed by performing the following actions:
- View details
- Run prediction
- Share
- Delete
View Details of an Existing ML Model
To view details of an existing ML model, select the ellipsis under the Actions column of the trained ML model, and select View Details.
Run Prediction on an Existing ML Model
Run batch prediction on the existing ML model.
Run a Batch Prediction
To run a batch prediction on the existing ML model, select the ellipsis under the Actions column of the trained ML model, and select Run Prediction.
Select Prediction Data
| Database | Select the database. |
| Target Table | Select the target table on which the prediction will be run. |
| Target Column | Select the target column on which the prediction will focus. |
| Timestamp Column | Select the column that contains timestamp data. |
Preview the data and select Next.
Configure Destination
| Prediction Interval Width | Select the interval width of prediction. |
| Destination Table Name | Select the destination table in which the prediction results will be stored. |
| Destination Column | Select the destination column in which the prediction data will be saved. |
| Run as | Run the notebook for training a model with or without personal credentials. |
Review the Summary and generated Fusion SQL syntax in the Generated SQL Script.
View Predictions of an Existing ML Model
To view the predictions of the trained ML model, select the ML model in the Name column.
Share an Existing ML Model
To share an existing ML model, select the ellipsis under the Actions column of the trained ML model, and select Share.
Delete an Existing ML Model
To delete an existing ML model, select the ellipsis under the Actions column of the trained ML model, and select Delete.
Status of ML Models
| Status | Description |
|---|---|
| Pre-processing | The system is preparing data for ML model training (e. |
| Training | The ML model is currently being trained, and results are not yet available. |
| Done | The ML model has been successfully trained and is ready for use. |
| Error | The ML model training or processing failed due to an error. |
Last modified: November 21, 2025