Dataiku DSS
You are viewing the documentation for version 5.0 of DSS. A more recent version may be available.

Advanced topics
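
Short illustrative sketches of programmatic sampling, the formula language, and custom variables expansion follow the list below.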

  • Sampling methods
    • Generic sampling methods
    • Exploration / Visual data preparation
  • Formula language
    • Basic usage
    • Reading column values
    • Variables typing and autotyping
    • Boolean values
    • Operators
    • Array and object operations
    • Object notations
    • Array functions
    • Boolean functions
    • Date functions
    • Math functions
    • Object functions
    • String functions
    • Value access functions
    • Control structures
    • Tests
  • Custom variables expansion
    • Defining variables
    • Using variables in the code of a recipe
    • Using variables in configuration fields
    • Using override tables
    • Modifying the value of variables
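
As a concrete illustration of the sampling methods, here is a minimal Python sketch, assuming the dataiku package available in DSS recipes and notebooks, and assuming get_dataframe() accepts the sampling, limit and ratio parameters as described for this version; the dataset name mydataset is hypothetical.

    import dataiku

    # Hypothetical dataset name; replace with a dataset from your project.
    dataset = dataiku.Dataset("mydataset")

    # "head" sampling: read only the first 10,000 rows instead of the
    # whole dataset.
    df_head = dataset.get_dataframe(sampling="head", limit=10000)

    # "random" sampling: read an approximate 10% random sample of the rows.
    df_random = dataset.get_dataframe(sampling="random", ratio=0.1)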
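
To give a feel for the formula language, here is a minimal sketch of a formula as it could be entered in a preparation step, assuming the if(), numval() and strval() functions behave as described in the Formula language reference; the columns age and first_name are hypothetical.

    if(numval("age") >= 18,
       "adult: " + strval("first_name"),
       "minor")

numval() reads a column value as a number and strval() reads it as a string; the formula is evaluated per row and produces a new string value.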
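
Finally, a minimal sketch of custom variables expansion from a Python recipe, assuming a project-level variable named score_threshold has been defined and that dataiku.get_custom_variables() resolves variables as in this DSS version.

    import dataiku

    # Resolve the project's variables into a plain Python dict of strings.
    variables = dataiku.get_custom_variables()

    # "score_threshold" is a hypothetical project variable.
    threshold = float(variables["score_threshold"])

In a SQL recipe or in a configuration field, the same variable would be written ${score_threshold} and substituted textually before execution.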