# Configuration

All settings in Vaex can be configured in a uniform way, based on [Pydantic](https://pydantic-docs.helpmanual.io/usage/settings/). From a Python runtime, settings can be configured via the `vaex.settings` module.

```python
import vaex
vaex.settings.main.thread_count = 10
vaex.settings.display.max_columns = 50
```

Via environmental variables:

```
$ VAEX_NUM_THREADS=10 VAEX_DISPLAY_MAX_COLUMNS=50 python myservice.py
```

Otherwise, values are obtained from a `.env` [file using dotenv](https://saurabh-kumar.com/python-dotenv/#usages) in the current working directory.

```
VAEX_NUM_THREADS=22
VAEX_CHUNK_SIZE_MIN=2048
```

Lastly, a global yaml file from `$VAEX_PATH_HOME/.vaex/main.yaml` is loaded (with last priority).

```
thread_count: 33
display:
  max_columns: 44
  max_rows: 20
```

If we now run `vaex settings yaml`, we see the effective settings as yaml output:

```
$ VAEX_NUM_THREADS=10 VAEX_DISPLAY_MAX_COLUMNS=50 vaex settings yaml
...
chunk:
  size: null
  size_min: 2048
  size_max: 1048576
display:
  max_columns: 50
  max_rows: 20
thread_count: 10
...
```

## Developers

When updating `vaex/settings.py`, run `vaex settings watch` to regenerate the documentation below automatically whenever the file is saved.

## Schema

A JSON schema can be generated using

```
$ vaex settings schema > vaex-settings.schema.json
```

## Settings

General settings for vaex

### aliases

Aliases to be used for vaex.open

Environmental variable: `VAEX_ALIASES`

Python settings `vaex.settings.main.aliases`

### async

How to run async code in the local executor

Environmental variable: `VAEX_ASYNC`

Example use:

```
$ VAEX_ASYNC=nest python myscript.py
```

Python settings `vaex.settings.main.async_`

Example use: `vaex.settings.main.async_ = 'nest'`

### home

Home directory for vaex, which defaults to `$HOME/.vaex`. If neither `$VAEX_HOME` nor `$HOME` is defined, the current working directory is used. (Note that this setting cannot be configured from the vaex home directory itself).

Environmental variable: `VAEX_HOME`

Example use:

```
$ VAEX_HOME=/home/docs/.vaex python myscript.py
```

Python settings `vaex.settings.main.home`

Example use: `vaex.settings.main.home = '/home/docs/.vaex'`

### mmap

Turning this off is experimental; memory mapping is avoided when set to False

Environmental variable: `VAEX_MMAP`

Example use:

```
$ VAEX_MMAP=True python myscript.py
```

Python settings `vaex.settings.main.mmap`

Example use: `vaex.settings.main.mmap = True`

### process_count

Number of processes to use for multiprocessing (e.g. apply), defaults to the `thread_count` setting

Environmental variable: `VAEX_PROCESS_COUNT`

Example use:

```
$ VAEX_PROCESS_COUNT=2 python myscript.py
```

Python settings `vaex.settings.main.process_count`

Example use: `vaex.settings.main.process_count = 2`

### thread_count

Number of threads to use for computations, defaults to `multiprocessing.cpu_count()`

Environmental variable: `VAEX_NUM_THREADS`

Example use:

```
$ VAEX_NUM_THREADS=2 python myscript.py
```

Python settings `vaex.settings.main.thread_count`

Example use: `vaex.settings.main.thread_count = 2`

### thread_count_io

Number of threads to use for IO, defaults to `thread_count` + 1

Environmental variable: `VAEX_NUM_THREADS_IO`

Example use:

```
$ VAEX_NUM_THREADS_IO=2 python myscript.py
```

Python settings `vaex.settings.main.thread_count_io`

Example use: `vaex.settings.main.thread_count_io = 2`

### path_lock

Directory to store lock files for vaex, which defaults to `${VAEX_HOME}/lock/`. Due to possible race conditions, lock files cannot be removed while processes using Vaex are running (on Unix systems).

Environmental variable: `VAEX_LOCK`

Example use:

```
$ VAEX_LOCK=/home/docs/.vaex/lock python myscript.py
```

Python settings `vaex.settings.main.path_lock`

Example use: `vaex.settings.main.path_lock = '/home/docs/.vaex/lock'`

## Cache

Settings for caching of computation or task results, see the [API](api.html#module-vaex.cache) for more details.

### type

Type of cache, e.g.
'memory_infinite', 'memory', 'disk', 'redis', or a multilevel cache, e.g. 'memory,disk'

Environmental variable: `VAEX_CACHE`

Python settings `vaex.settings.cache.type`

### disk_size_limit

Maximum size for cache on disk, e.g. 10GB, 500MB

Environmental variable: `VAEX_CACHE_DISK_SIZE_LIMIT`

Example use:

```
$ VAEX_CACHE_DISK_SIZE_LIMIT=10GB python myscript.py
```

Python settings `vaex.settings.cache.disk_size_limit`

Example use: `vaex.settings.cache.disk_size_limit = '10GB'`

### memory_size_limit

Maximum size for cache in memory, e.g. 1GB, 500MB

Environmental variable: `VAEX_CACHE_MEMORY_SIZE_LIMIT`

Example use:

```
$ VAEX_CACHE_MEMORY_SIZE_LIMIT=1GB python myscript.py
```

Python settings `vaex.settings.cache.memory_size_limit`

Example use: `vaex.settings.cache.memory_size_limit = '1GB'`

### path

Storage location for cache results. Defaults to `${VAEX_HOME}/cache`

Environmental variable: `VAEX_CACHE_PATH`

Example use:

```
$ VAEX_CACHE_PATH=/home/docs/.vaex/cache python myscript.py
```

Python settings `vaex.settings.cache.path`

Example use: `vaex.settings.cache.path = '/home/docs/.vaex/cache'`

## Chunk

Configure how a dataset is broken down into smaller chunks. When `size` is not set, the executor dynamically adjusts the chunk size based on `size_min`, `size_max` and the number of threads.

### size

When set, fixes the number of chunks, i.e.
do not dynamically adjust between min and max

Environmental variable: `VAEX_CHUNK_SIZE`

Python settings `vaex.settings.main.chunk.size`

### size_min

Minimum chunk size

Environmental variable: `VAEX_CHUNK_SIZE_MIN`

Example use:

```
$ VAEX_CHUNK_SIZE_MIN=1024 python myscript.py
```

Python settings `vaex.settings.main.chunk.size_min`

Example use: `vaex.settings.main.chunk.size_min = 1024`

### size_max

Maximum chunk size

Environmental variable: `VAEX_CHUNK_SIZE_MAX`

Example use:

```
$ VAEX_CHUNK_SIZE_MAX=1048576 python myscript.py
```

Python settings `vaex.settings.main.chunk.size_max`

Example use: `vaex.settings.main.chunk.size_max = 1048576`

## Data

Data configuration

### path

Storage location for data files, like vaex.example(). Defaults to `${VAEX_HOME}/data/`

Environmental variable: `VAEX_DATA_PATH`

Example use:

```
$ VAEX_DATA_PATH=/home/docs/.vaex/data python myscript.py
```

Python settings `vaex.settings.data.path`

Example use: `vaex.settings.data.path = '/home/docs/.vaex/data'`

## Display

How a dataframe displays

### max_columns

How many columns to display when printing out a dataframe

Environmental variable: `VAEX_DISPLAY_MAX_COLUMNS`

Example use:

```
$ VAEX_DISPLAY_MAX_COLUMNS=200 python myscript.py
```

Python settings `vaex.settings.display.max_columns`

Example use: `vaex.settings.display.max_columns = 200`

### max_rows

How many rows to print out before showing only the first and last rows

Environmental variable: `VAEX_DISPLAY_MAX_ROWS`

Example use:

```
$ VAEX_DISPLAY_MAX_ROWS=10 python myscript.py
```

Python settings `vaex.settings.display.max_rows`

Example use: `vaex.settings.display.max_rows = 10`

## FileSystem

Filesystem configuration

### path

Storage location for caching files from remote file systems.
Defaults to `${VAEX_HOME}/file-cache/`

Environmental variable: `VAEX_FS_PATH`

Example use:

```
$ VAEX_FS_PATH=/home/docs/.vaex/file-cache python myscript.py
```

Python settings `vaex.settings.fs.path`

Example use: `vaex.settings.fs.path = '/home/docs/.vaex/file-cache'`

## MemoryTracker

Memory tracking/protection when using vaex in a service

### type

Which memory tracker to use when executing tasks

Environmental variable: `VAEX_MEMORY_TRACKER`

Example use:

```
$ VAEX_MEMORY_TRACKER=default python myscript.py
```

Python settings `vaex.settings.main.memory_tracker.type`

Example use: `vaex.settings.main.memory_tracker.type = 'default'`

### max

Maximum amount of memory the executor can use (only used for type='limit')

Environmental variable: `VAEX_MEMORY_TRACKER_MAX`

Python settings `vaex.settings.main.memory_tracker.max`

## TaskTracker

Task tracking/protection when using vaex in a service

### type

Comma separated string of trackers to run while executing tasks

Environmental variable: `VAEX_TASK_TRACKER`

Example use:

```
$ VAEX_TASK_TRACKER= python myscript.py
```

Python settings `vaex.settings.main.task_tracker.type`

## Logging

Configure logging for Vaex. By default Vaex sets up logging, which is useful when running a script. When Vaex is used in applications or services that already configure logging, set the environmental variable `VAEX_LOGGING_SETUP` to false.

See the [API docs](api.html#module-vaex.logging) for more details.

Note that setting `vaex.settings.main.logging.info` etc. at runtime has no direct effect, since logging is already configured. When needed, call `vaex.logging.reset()` and `vaex.logging.setup()` to reconfigure logging.

### setup

Set up logging for Vaex at import time.

Environmental variable: `VAEX_LOGGING_SETUP`

Example use:

```
$ VAEX_LOGGING_SETUP=True python myscript.py
```

Python settings `vaex.settings.main.logging.setup`

Example use: `vaex.settings.main.logging.setup = True`

### rich

Use rich logger (colored fancy output).

Environmental variable: `VAEX_LOGGING_RICH`

Example use:

```
$ VAEX_LOGGING_RICH=True python myscript.py
```

Python settings `vaex.settings.main.logging.rich`

Example use: `vaex.settings.main.logging.rich = True`

### debug

Comma separated list of loggers to set to the debug level (e.g. 'vaex.settings,vaex.cache'), or a '1' to set the root logger ('vaex')

Environmental variable: `VAEX_LOGGING_DEBUG`

Example use:

```
$ VAEX_LOGGING_DEBUG= python myscript.py
```

Python settings `vaex.settings.main.logging.debug`

### info

Comma separated list of loggers to set to the info level (e.g. 'vaex.settings,vaex.cache'), or a '1' to set the root logger ('vaex')

Environmental variable: `VAEX_LOGGING_INFO`

Example use:

```
$ VAEX_LOGGING_INFO= python myscript.py
```

Python settings `vaex.settings.main.logging.info`

### warning

Comma separated list of loggers to set to the warning level (e.g. 'vaex.settings,vaex.cache'), or a '1' to set the root logger ('vaex')

Environmental variable: `VAEX_LOGGING_WARNING`

Example use:

```
$ VAEX_LOGGING_WARNING=vaex python myscript.py
```

Python settings `vaex.settings.main.logging.warning`

Example use: `vaex.settings.main.logging.warning = 'vaex'`

### error

Comma separated list of loggers to set to the error level (e.g.
'vaex.settings,vaex.cache'), or a '1' to set the root logger ('vaex')

Environmental variable: `VAEX_LOGGING_ERROR`

Example use:

```
$ VAEX_LOGGING_ERROR= python myscript.py
```

Python settings `vaex.settings.main.logging.error`

## Progress

Progress bar configuration

### type

Default progress bar to show: 'simple', 'rich' or 'widget'

Environmental variable: `VAEX_PROGRESS_TYPE`

Example use:

```
$ VAEX_PROGRESS_TYPE=simple python myscript.py
```

Python settings `vaex.settings.main.progress.type`

Example use: `vaex.settings.main.progress.type = 'simple'`

### force

Force showing a progress bar of this type, even when no progress bar was requested from user code

Environmental variable: `VAEX_PROGRESS`

Python settings `vaex.settings.main.progress.force`

## Settings

Configuration options for the FastAPI server

### add_example

Add example dataset

Environmental variable: `VAEX_SERVER_ADD_EXAMPLE`

Example use:

```
$ VAEX_SERVER_ADD_EXAMPLE=True python myscript.py
```

Python settings `vaex.settings.server.add_example`

Example use: `vaex.settings.server.add_example = True`

### graphql

Add graphql endpoint

Environmental variable: `VAEX_SERVER_GRAPHQL`

Example use:

```
$ VAEX_SERVER_GRAPHQL=False python myscript.py
```

Python settings `vaex.settings.server.graphql`

Example use: `vaex.settings.server.graphql = False`

### files

Mapping of name to path

Environmental variable: `VAEX_SERVER_FILES`

Python settings `vaex.settings.server.files`
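
## Precedence sketch

The lookup order described in the Configuration introduction at the top of this page (environmental variables first, then the `.env` file, then the global yaml file) can be modelled with a small stand-alone sketch. It does not use Vaex or Pydantic and is not how Vaex implements the lookup; the `resolve` helper, the dotted key names and the three dictionaries are hypothetical and only illustrate the precedence, using the example values from that introduction.

```python
from collections import ChainMap

# Hypothetical stand-ins for the three configuration sources.
# The values are the ones used in the examples in the introduction.
env_vars = {"thread_count": 10, "display.max_columns": 50}  # VAEX_NUM_THREADS, VAEX_DISPLAY_MAX_COLUMNS
dotenv_file = {"thread_count": 22, "chunk.size_min": 2048}  # .env in the working directory
yaml_file = {                                               # global main.yaml (last priority)
    "thread_count": 33,
    "display.max_columns": 44,
    "display.max_rows": 20,
}

# ChainMap searches its mappings from left to right, so earlier sources win.
effective = ChainMap(env_vars, dotenv_file, yaml_file)


def resolve(key, default=None):
    """Return the value from the highest-priority source that defines `key`."""
    return effective.get(key, default)


for key in ("thread_count", "display.max_columns", "display.max_rows", "chunk.size_min"):
    print(key, "=", resolve(key))
```

The resolved values agree with the `vaex settings yaml` output shown in the introduction: `thread_count` and `display.max_columns` come from the environment, `chunk.size_min` from the `.env` file, and `display.max_rows` from the yaml file.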