

Vagrant Docker for Hadoop, Spark and Hive

Introduction

A Vagrant project to spin up a single virtual machine running:

Zeppelin 0.8.0 (with Spark/Scala, md, file and JDBC interpreters).

The versions of the above components that the VM is provisioned with are defined in the file scripts/versions.sh. The following version combinations are known to work.
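The exact contents of scripts/versions.sh are not reproduced here; as a purely hypothetical sketch, such a file usually just sets one shell variable per component version, along these lines (the variable names and the Hadoop, Spark and Hive numbers are illustrative guesses; only the Zeppelin version comes from above):

    #!/usr/bin/env bash
    # Hypothetical sketch of scripts/versions.sh -- variable names and the
    # Hadoop/Spark/Hive versions are placeholder assumptions, not the real file.
    HADOOP_VERSION=2.9.2
    SPARK_VERSION=2.4.0
    HIVE_VERSION=2.3.3
    ZEPPELIN_VERSION=0.8.0   # the Zeppelin version mentioned above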

With the Big Data Tools plugin, you can execute applications on Spark clusters. IntelliJ IDEA provides run/debug configurations to run the spark-submit script in Spark's bin directory. You can execute an application locally or using an SSH configuration. If you have an Amazon EMR cluster, you can also configure an EMR connection and use Amazon EMR steps to submit your application to Spark installed on the EMR cluster. For more details, see Manage Steps. Currently, IntelliJ IDEA does not support debugging Spark applications.

Run an application with the Spark Submit configurations

You can prepare an IDEA artifact to execute. From the main menu, select Run | Edit Configurations. Alternatively, press Alt+Shift+F10, then 0. Click the Add New Configuration button. Select the Spark Submit | Local or Spark Submit | SSH configuration from the list of the available configurations.
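Since the run configuration ultimately drives the spark-submit script, it can help to keep in mind what a minimal submission looks like on the command line. This is only a sketch: SPARK_HOME, the class name, the jar and the arguments are placeholders rather than values taken from this article.

    # Minimal local submission; class name, jar and arguments are placeholders.
    "$SPARK_HOME"/bin/spark-submit \
      --class com.example.MyApp \
      --master 'local[*]' \
      my-app.jar arg1 arg2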
Fill in the configuration parameters:

Name: a name to distinguish between run/debug configurations.
Allow parallel run: select to allow running multiple instances of this run configuration in parallel.
Store as project file: save the file with the run configuration settings to share it with other team members. However, if you do not want to share the .idea directory, you can save the configuration to any other directory within the project.
Spark home: a path to the Spark installation directory.
Application: a path to the executable file. You can select either a jar or py file, or an IDEA artifact.
Class: the name of the main class of the jar archive.
Run arguments: arguments of your application.
Cluster manager: select the management method to run an application on a cluster. The SparkContext can connect to several types of cluster managers (either Spark's own standalone cluster manager, Mesos, or YARN). See more details in the Cluster Mode Overview.
Master: the format of the master URL passed to Spark.
Before launch: in this area you can specify tasks that must be performed before starting the selected run/debug configuration. The tasks are performed in the order they appear in the list.
Show this page: select this checkbox to show the run/debug configuration settings prior to actually starting the run/debug configuration.
Activate tool window: by default this checkbox is selected and the Run tool window opens when you start the run/debug configuration.

You can click Add options and select an option to add to your configuration:

Spark Configuration: Spark configuration options available through a properties file or a list of properties.
Dependencies: files and archives (jars) that are required for the application to be executed. You can add repositories or exclude some packages from the execution context.
Driver: Spark Driver settings, such as the amount of memory to use for the driver process. For the cluster mode, it is also possible to specify the number of cores.
Executor: Executor settings, such as the amount of memory and the number of cores.
Spark Monitoring Integration: ability to monitor the execution of your application with Spark Monitoring.
Kerberos: settings for establishing a secured connection with Kerberos.
Shell options: select if you want to execute any scripts before the Spark submit. Enter the path to bash and specify the script to be executed. It is recommended to provide an absolute path to the script. Select the Interactive checkbox if you want to launch the script in the interactive mode. You can also specify environment variables, for example, USER=jetbrains.
Driver Java options, Driver library path, and Driver class path: add additional driver options. For more details, refer to Runtime Environment.
Proxy user: a username that is enabled for using proxy for the Spark connection.
Print additional debug output: run spark-submit with the --verbose option to print debugging information.
Archives: comma-separated list of archives to be extracted into the working directory of each executor.
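Most of these optional fields appear to correspond to standard spark-submit flags, so a command-line submission that exercises several of them might look roughly like the sketch below. The file names, package coordinate, class and values are placeholders, and the field-to-flag mapping is an assumption rather than something stated by the plugin documentation.

    # Illustrative only: paths, class name and values are placeholder assumptions.
    # Rough correspondence to the optional fields above:
    #   --properties-file     -> Spark Configuration
    #   --packages            -> Dependencies
    #   --driver-memory       -> Driver
    #   --executor-memory, --executor-cores -> Executor
    #   --proxy-user          -> Proxy user
    #   --driver-java-options -> Driver Java options
    #   --verbose             -> Print additional debug output
    #   --archives            -> Archives
    "$SPARK_HOME"/bin/spark-submit \
      --properties-file conf/my-app.properties \
      --packages org.postgresql:postgresql:42.2.5 \
      --driver-memory 2g \
      --executor-memory 4g \
      --executor-cores 2 \
      --proxy-user analyst1 \
      --driver-java-options "-Dlog4j.debug=true" \
      --verbose \
      --archives data.zip,models.tar.gz \
      --class com.example.MyApp \
      my-app.jar

Add --master and --deploy-mode as appropriate for the cluster manager selected above.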

For the Spark Submit | SSH configuration, also specify the connection settings. Specify the URL of the remote host with the Spark cluster and the user's credentials to access it. Then click Test Connection to ensure you can connect to the remote server. In Target upload directory, specify the directory on the remote host to upload the executable files. Click OK to save the configuration.

Then select the configuration from the list of the created configurations and click Run. Inspect the execution results in the Run tool window.
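Conceptually, the SSH-based run is close to uploading the artifact to the target upload directory yourself and then invoking spark-submit on the remote host. A rough manual equivalent, assuming Spark is installed under /opt/spark on the remote machine and using placeholder host, user and file names, would be:

    # Manual equivalent of the SSH-based submission; all values are placeholders.
    scp target/my-app.jar user@spark-host:/home/user/uploads/
    ssh user@spark-host /opt/spark/bin/spark-submit \
        --class com.example.MyApp /home/user/uploads/my-app.jar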
