Learn how to create, list, describe, update, and delete featurestores. A featurestore is a top-level container for entity types, features, and feature values.
Online and offline storage
Vertex AI Feature Store (Legacy) uses two storage methods classified as online storage and offline storage, which are priced differently. All featurestores have offline storage and optionally, online storage.
Online storage retains the latest timestamp values of your features to efficiently handle online serving requests. When you run an import job by using the API, you can control whether the job writes data to the online store. Skipping the online store prevents any load on the online serving nodes. For example, when you run backfill jobs, you can disable writes to the online store and write only to the offline store. For more information, see the disableOnlineServing flag in the API reference.
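As an illustrative sketch, a backfill import that skips the online store can set the disableOnlineServing flag in the importFeatureValues request body. The BigQuery URI, entity ID field, timestamp field, and feature ID below are placeholders, not values from this guide:

```json
{
  "bigquerySource": {
    "inputUri": "bq://my-project.my_dataset.feature_values"
  },
  "entityIdField": "entity_id",
  "featureTimeField": "feature_timestamp",
  "featureSpecs": [
    { "id": "my_feature" }
  ],
  "disableOnlineServing": true
}
```

With disableOnlineServing set to true, the job writes only to the offline store, so the backfill places no load on your online serving nodes.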
Vertex AI Feature Store (Legacy) uses offline storage to store data until the data reaches the retention limit or until you delete the data. You can store unlimited data in the offline store. You can control offline storage costs by managing how much data you keep. You can also override the default online store data retention limit for your featurestore and the offline data retention limit for an entity type. Learn more about Vertex AI Feature Store (Legacy) quotas and limits.
Use the Google Cloud console to view the amount of online and offline storage you are using. View your featurestore's Total online storage and Total offline storage monitoring metrics to determine your usage.
Online serving nodes
Online serving nodes provide the compute resources used to store and serve feature values for low-latency online serving. These nodes are always running even when they aren't serving data. You are charged for each node hour.
The storage limit for online serving nodes is 5 TB per node. Learn more about Vertex AI Feature Store (Legacy) quotas and limits.
The number of online serving nodes that you require depends on the following two factors:
- The number of online serving requests (queries per second) that the featurestore receives.
- The number of import jobs that write to online storage.
Both factors contribute to the CPU utilization and performance of the nodes. From the Google Cloud console, view the following metrics:
- Queries per second: Number of queries per second to your featurestore.
- Node count: Number of your online serving nodes.
- CPU utilization: CPU utilization of your nodes.
If CPU utilization is consistently high, consider increasing the number of online serving nodes for your featurestore.
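As a rough sizing sketch, you can work backward from observed peak traffic to a starting node count. The per-node QPS capacity and the 10% headroom below are illustrative assumptions, not published limits; always validate the result against the CPU utilization metric in the console:

```python
import math

def estimate_node_count(peak_qps: float,
                        qps_per_node: float = 1000.0,
                        headroom: float = 0.1,
                        min_nodes: int = 1) -> int:
    """Return a node count that covers peak QPS plus a safety margin.

    qps_per_node and headroom are hypothetical planning values; measure
    your own workload before relying on them.
    """
    required = peak_qps * (1 + headroom) / qps_per_node
    return max(min_nodes, math.ceil(required))

# A peak of 2,500 QPS with 10% headroom needs 3 nodes under these assumptions.
print(estimate_node_count(2500))
# Even very low traffic still needs at least one node to serve at all.
print(estimate_node_count(50))
```

Treat the output as a starting point for a fixed node count or for autoscaling bounds, then adjust based on the CPU utilization you observe.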
Test performance of online serving nodes
You can test the performance of online serving nodes for real-time feature serving. This lets you ensure that the featurestore has sufficient machine resources to perform within predetermined QPS or latency thresholds. You can perform these tests based on various benchmarking parameters, such as QPS, latency, and the API used. For guidelines and best practices to test the performance of online serving nodes, see Test the performance of online serving nodes for real-time serving in Best practices for Vertex AI Feature Store (Legacy).
Additionally, you can use the Vertex AI Benchmarker open source tool to load test the performance of your feature store resources. The Vertex AI Benchmarker open source tool consists of a Python command-line tool and a Java worker.
Scaling options
You can switch between the following options to configure your number of online serving nodes:
Autoscaling
If you choose autoscaling, the featurestore automatically changes the number of nodes based on CPU utilization. Autoscaling reviews traffic patterns to maintain performance and optimize your cost by adding nodes when the traffic increases and removing nodes when the traffic decreases.
Autoscaling performs well for traffic patterns that experience gradual growth and decline. If you use Vertex AI Feature Store (Legacy) extensively for traffic patterns that encounter frequent load fluctuations, use autoscaling to improve cost efficiency.
Allocating a fixed node count
If you allocate a fixed node count, Vertex AI Feature Store (Legacy) maintains a consistent number of nodes regardless of the traffic patterns. The fixed node count keeps costs predictable, and the nodes should perform well when there are enough nodes to handle the traffic. You can manually change the fixed node count to handle changes in traffic patterns.
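In the API, these two modes correspond to the scaling and fixedNodeCount fields of the featurestore's onlineServingConfig. As a sketch, autoscaling between one and five nodes might look like the following (the node counts are illustrative):

```json
{
  "onlineServingConfig": {
    "scaling": {
      "minNodeCount": 1,
      "maxNodeCount": 5
    }
  }
}
```

A fixed node count instead sets fixedNodeCount and omits scaling:

```json
{
  "onlineServingConfig": {
    "fixedNodeCount": 2
  }
}
```

The two fields are mutually exclusive: a featurestore either autoscales within the configured bounds or holds the fixed count.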
Additional considerations for autoscaling
If you choose autoscaling, consider the following additional points:
- After adding online serving nodes, the online store needs time to rebalance the data. It can take up to 20 minutes under load before you see a significant improvement in performance. As a result, scaling the number of nodes might not help for short bursts of traffic. This limitation applies to both manual scaling and autoscaling.
- If you submit online serving requests to a featurestore that has no online serving nodes, the operation returns an error.
Turn off online serving in your featurestore
If you don't require online serving and want to avoid incurring charges for online serving nodes, set the number of online serving nodes to zero. To turn off online serving in your featurestore, set the following configuration:
- If you're using autoscaling, remove the scaling parameter.
- Set the fixed number of online serving nodes to 0.
For more information about how to create a featurestore, see Create a featurestore. For more information about how to modify the configuration of an existing featurestore, see Update a featurestore.
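As an illustrative sketch, the equivalent REST update patches the featurestore's onlineServingConfig; the project and featurestore IDs below are placeholders:

```
PATCH https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/featurestores/FEATURESTORE_ID?updateMask=online_serving_config.fixed_node_count

{
  "onlineServingConfig": {
    "fixedNodeCount": 0
  }
}
```

The updateMask query parameter limits the update to the node count so that other featurestore settings are left unchanged.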
If you set the number of online serving nodes to 0, the entire online store, including its data, is deleted. If you want to temporarily turn off your online store and then restore it, you must reimport the deleted data.
For example, if you set the online serving node count for your featurestore to 0 and then provision online serving nodes by setting the node count to 1 or higher, Vertex AI Feature Store (Legacy) doesn't migrate the deleted feature data to the online store. To repopulate your online store, you must reimport your data. One way to reimport your data is to export the historical data before you disable online serving nodes, and then import the exported data after you provision the nodes.
When you provision online serving nodes, you must wait for the operation to complete before importing new data. In-progress import jobs resume only after the online serving node provisioning is complete.
If you submit an online serving request to the featurestore without online serving nodes, the request returns an error.
Create a featurestore
Create a featurestore resource to contain entity types and features. The
location of your featurestore must be in the same location as your source data.
For example, if your featurestore is in us-central1, you can import data from
files in Cloud Storage buckets that are in us-central1 or in the US
multi-region location, though source data from dual-region buckets
isn't supported. Similarly for BigQuery, you can import data from
tables that are in us-central1 or in the US multi-region location. For more information, see Source data
requirements.
Vertex AI Feature Store (Legacy) availability can vary by location. For more information, see Feature availability.
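As a sketch of the underlying REST call, creating a featurestore with one online serving node might look like the following; the project, region, and featurestore ID are placeholders:

```
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/featurestores?featurestoreId=FEATURESTORE_ID

{
  "onlineServingConfig": {
    "fixedNodeCount": 1
  }
}
```

Omit onlineServingConfig, or set fixedNodeCount to 0, to create the featurestore with online serving turned off.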
Web UI
You can create a featurestore using the Google Cloud console if a featurestore isn't already created in the Google Cloud project for the selected region. If a featurestore already exists for the project and region, use another method.
To create a featurestore using the Google Cloud console:
- In the Vertex AI section of the Google Cloud console, go to the Features page.
- Click Create featurestore.
- Specify a name for the featurestore.
- If you want to turn on online serving for the featurestore, click the Turn on online serving toggle and set the scaling options. For more information about online serving and scaling options, see Online serving nodes.
- Click Create.