Caching
Liteset has moved off flask-caching to a native, asyncio-friendly cache stack implemented in
superset.cache.manager (SyncRedisCacheAdapter, SimpleSyncCacheManager,
MetastoreSyncCacheManager and their async siblings). To keep configuration painless for existing
deployments, the cache configuration dictionaries still accept the Flask-Caching-style schema:
CACHE_TYPE values such as 'RedisCache', 'SimpleCache', 'NullCache', and
'SupersetMetastoreCache' are recognised and routed to the appropriate Liteset backend at startup.
Caching is configured by providing these dictionaries in superset_config.py. The recommended
backend is Redis; Memcached, in-memory, and the metastore fallback are also supported.
The following cache configurations can be customized in this way:
- Dashboard filter state (required): FILTER_STATE_CACHE_CONFIG
- Explore chart form data (required): EXPLORE_FORM_DATA_CACHE_CONFIG
- Metadata cache (optional): CACHE_CONFIG
- Charting data queried from datasets (optional): DATA_CACHE_CONFIG
For example, to configure the filter state cache using Redis:
FILTER_STATE_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 86400,
    'CACHE_KEY_PREFIX': 'superset_filter_cache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0'
}
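The other cache dictionaries follow the same shape. As a minimal sketch, a small helper in superset_config.py could build them from a single Redis URL while keeping each key prefix unique. The helper name redis_cache_config and the superset_explore_cache prefix are illustrative assumptions, not Liteset defaults; only the dictionary keys come from the schema shown above.

```python
from typing import Any, Dict

REDIS_URL = "redis://localhost:6379/0"  # same URL as in the Redis example


def redis_cache_config(prefix: str, timeout: int = 86400) -> Dict[str, Any]:
    """Build a Flask-Caching-style dictionary for a Redis-backed cache.

    Hypothetical helper for illustration only.
    """
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_DEFAULT_TIMEOUT": timeout,
        "CACHE_KEY_PREFIX": prefix,
        "CACHE_REDIS_URL": REDIS_URL,
    }


# Both required caches share one Redis instance but use distinct prefixes.
FILTER_STATE_CACHE_CONFIG = redis_cache_config("superset_filter_cache")
EXPLORE_FORM_DATA_CACHE_CONFIG = redis_cache_config("superset_explore_cache")  # assumed prefix
```

Distinct prefixes matter when several caches share one Redis database, since keys from different caches would otherwise collide.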
Dependencies
To use a dedicated cache store, additional Python libraries must be installed:
- For Redis: we recommend the redis Python package.
- For Memcached: we recommend the pylibmc client library, as
python-memcached does not handle storing binary data correctly.
These libraries can be installed using pip.
Fallback Metastore Cache
Note that some form of Filter State and Explore caching is required. If either of these caches is undefined, Liteset falls back to a built-in cache that stores data in the metadata database. While a dedicated cache is recommended, the built-in cache can also be used to cache other data.
For example, to use the built-in cache to store chart data, use the following config:
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_KEY_PREFIX": "superset_results",  # make sure this string is unique to avoid collisions
    "CACHE_DEFAULT_TIMEOUT": 86400,  # 60 seconds * 60 minutes * 24 hours
}
Chart Cache Timeout
The cache timeout for charts may be overridden by the settings for an individual chart, dataset, or
database. Each of these configurations will be checked in order before falling back to the default
value defined in DATA_CACHE_CONFIG.
Note that setting the cache timeout to -1 disables caching for charting data, either
per chart, dataset, or database, or by default if set in DATA_CACHE_CONFIG.
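The lookup order described above can be sketched as a small function. This is an illustration of the documented precedence (chart, then dataset, then database, then the DATA_CACHE_CONFIG default), not Liteset's actual implementation. The function name and the use of None to mean "not set at this level" are assumptions.

```python
from typing import Optional


def resolve_cache_timeout(
    chart_timeout: Optional[int],
    dataset_timeout: Optional[int],
    database_timeout: Optional[int],
    default_timeout: int,
) -> Optional[int]:
    """Return the effective timeout in seconds, or None if caching is disabled.

    None for an argument means that level has no override (an assumption).
    """
    effective = default_timeout
    # The first level that defines a value wins: chart, dataset, database.
    for candidate in (chart_timeout, dataset_timeout, database_timeout):
        if candidate is not None:
            effective = candidate
            break
    # A timeout of -1 disables caching at whichever level supplied it.
    return None if effective == -1 else effective


# The dataset override applies because the chart has no setting of its own.
dataset_wins = resolve_cache_timeout(None, 600, 3600, 86400)
# The chart explicitly disables caching, regardless of the other levels.
chart_disabled = resolve_cache_timeout(-1, 600, 3600, 86400)
```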
SQL Lab Query Results
Caching for SQL Lab query results is used when async queries are enabled and is configured using
RESULTS_BACKEND.
Note that this configuration does not use the Flask-Caching-style dictionary; instead it expects an
object that implements Liteset's SyncCacheProtocol (e.g. SyncRedisCacheAdapter). See
Async Queries via Celery for a worked example.
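To illustrate the difference between a configuration dictionary and a cache object, here is a rough sketch of a custom object assigned to RESULTS_BACKEND. The methods required by SyncCacheProtocol are not listed in this section, so the get/set/delete interface below is an assumption modelled on common cache clients; check the protocol definition before relying on it. The class name InMemoryResultsCache is hypothetical.

```python
import time
from typing import Any, Dict, Optional, Tuple


class InMemoryResultsCache:
    """Hypothetical dict-backed cache with per-key expiry (illustration only)."""

    def __init__(self, default_timeout: int = 86400) -> None:
        self._default_timeout = default_timeout
        # Maps key -> (expiry time on the monotonic clock, stored value).
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, timeout: Optional[int] = None) -> bool:
        ttl = self._default_timeout if timeout is None else timeout
        self._store[key] = (time.monotonic() + ttl, value)
        return True

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._store[key]
            return None
        return value

    def delete(self, key: str) -> bool:
        return self._store.pop(key, None) is not None


# An object, not a dictionary, is assigned to the setting.
RESULTS_BACKEND = InMemoryResultsCache()
```

An in-process store like this is not shared between the web server and Celery workers, so it only demonstrates the shape of the configuration; a real deployment would need a shared store such as Redis.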
Caching Thumbnails
This is an optional feature that can be turned on by enabling its feature flags in the config:
FEATURE_FLAGS = {
    "THUMBNAILS": True,
    "THUMBNAILS_SQLA_LISTENERS": True,
}
By default thumbnails are rendered per user, and will fall back to the Selenium user for anonymous users.
To always render thumbnails as a fixed user (admin in this example), use the following configuration:
from superset.tasks.types import FixedExecutor
THUMBNAIL_EXECUTORS = [FixedExecutor("admin")]
For this feature you will need a cache system and Celery workers. All thumbnails are stored in the cache and are processed asynchronously by the workers.
The following example config stores thumbnails in Redis (the recommended setup for Liteset) and also shows, commented out, how to plug in a custom S3 backend:
from s3cache.s3cache import S3Cache # third-party
...
class CeleryConfig(object):
    broker_url = "redis://localhost:6379/0"
    imports = (
        "superset.sql_lab",
        "superset.tasks.thumbnails",
    )
    result_backend = "redis://localhost:6379/0"
    worker_prefetch_multiplier = 10
    task_acks_late = True

CELERY_CONFIG = CeleryConfig
# Redis (recommended): uses Liteset's native SyncRedisCacheAdapter via the
# standard Flask-Caching-style schema.
THUMBNAIL_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 86400,
    "CACHE_KEY_PREFIX": "superset_thumbnail_",
    "CACHE_REDIS_URL": "redis://localhost:6379/0",
}
# S3 (custom adapter): assign an instance that satisfies SyncCacheProtocol.
# Liteset detects an object (vs. a dict) and uses it as-is, no Flask app required.
# THUMBNAIL_CACHE_CONFIG = S3Cache("bucket_name", "thumbs_cache/")
Using the above example, cache keys for dashboards will be superset_thumb__dashboard__{ID}. You can
override the base URL for Selenium using:
WEBDRIVER_BASEURL = "https://superset.company.com"
Additional Selenium web driver configuration can be set using WEBDRIVER_CONFIGURATION. You can
implement a custom function to authenticate Selenium. The default function decodes Flask-Login-format
session cookies natively, for compatibility with existing installations. Liteset does not run
Flask-Login at runtime, but the cookie format is preserved end-to-end so existing alerts/reports
infrastructure keeps working. Here's an example of a custom function signature:
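from selenium.webdriver.remote.webdriver import WebDriver

def auth_driver(driver: WebDriver, user: "User") -> WebDriver:
    # Authenticate the browser session for `user` here, then return the driver.
    pass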
Then on configuration:
WEBDRIVER_AUTH_FUNC = auth_driver
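As a rough sketch of what such a function might do, the factory below returns an auth_driver that loads the site and attaches a pre-issued session cookie. The factory name make_cookie_auth, the cookie name and value, and how the cookie is obtained are all assumptions. Only driver.get and driver.add_cookie are standard Selenium WebDriver calls. Types are left as Any so the sketch runs without Selenium installed.

```python
from typing import Any, Callable


def make_cookie_auth(
    base_url: str, cookie_name: str, cookie_value: str
) -> Callable[[Any, Any], Any]:
    """Build an auth function that logs in by attaching a session cookie.

    Hypothetical helper: the same cookie is used for every user, so `user`
    is ignored here. A real implementation would issue a cookie per user.
    """

    def auth_driver(driver: Any, user: Any) -> Any:
        # Selenium only accepts cookies for the domain currently loaded,
        # so navigate to the site before adding the cookie.
        driver.get(base_url)
        driver.add_cookie({"name": cookie_name, "value": cookie_value})
        return driver

    return auth_driver


# Placeholder values; replace with a real cookie name and value.
WEBDRIVER_AUTH_FUNC = make_cookie_auth(
    "https://superset.company.com", "session", "<cookie value>"
)
```

Returning the driver keeps the function compatible with the signature shown above, where the authenticated driver is handed back to the caller.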